# Intro

Earlier today I was reading the Kaspersky report on [Operation Triangulation](https://securelist.com/triangulation-validators-modules/110847/). It's interesting and worth the read (really bad play on the `MD5`, just saying, what year is it). Anyway, one of the identified modules is for audio-capture and uses the `Speex` codec; you can learn more about `Speex` [here](https://www.speex.org/).

Personally I have no experience with audio-capture (/programming). It's not really something that is useful in `Red Teaming` because client-objectives live on digital infrastructure. I thought it would be interesting though to see how hard it is to design a similar type of capability. It turns out that it is not very hard: in about an hour I wrote a cross-platform capability called `Speec`.

In this short post I just want to go through my thought-process for novel capability development using a practical example.

### Design

`Speec` is able to enumerate accessible audio capture devices on the OS.

![[1-Speec.png]]

You can see some sample output below. I am pulling only the most relevant information from the devices.

![[2-Speec.png]]

That part is pretty straightforward. Then of course comes the recording part, again not very hard, but I wanted to add some extra design requirements.

- I want to receive audio in small buffers (`1000ms`) because if anything interrupts the capture we can still salvage the recording up to that point
- I learned that raw audio (PCM) is just a data-stream which you can concatenate. That's a very practical property because it means we can manipulate the data freely
- I want to store raw PCM encrypted in memory while it is recording, so that if the process memory is captured the data remains inaccessible. Strictly speaking that is not guaranteed, since forensics would likely be able to recover the cryptographic material from memory. In the POC I `Gzip` compress and then `AES` encrypt
- I want to be able to flush encrypted PCM data to disk based on a timer or based on a threshold for the total size of encrypted data in memory
- I presumably want to exfiltrate the data, so size is important. One assumes this is the reason why the `Triangulation` implant uses `Speex`. I chose to use `Opus`, which seems to be the successor to `Speex`; you can find details [here](https://opus-codec.org/)

The design layout is shown below:

![[3-Speec.png]]

### Example

This is obviously just POC code but it illustrates the idea.

![[4-Speec.png]]

Some things to note here:

- You can actually stream the data over a web transport, which is quite interesting
- You cannot directly play the `Opus` file. You would have to reconvert it, `Opus -> PCM -> wav (or something else)`, but the big benefit is the size compression, as you can see

# Thoughts

It's an interesting developer experiment but, again, not actually useful from a `Red Team` perspective. I think this also illustrates a point often missed in the OST debate, where discussions are framed as if creating a capability is somehow hard and not just basic programming.

![[5-Speec.png]]

```
They walk unseen in lonely places where the Words have been spoken and the Rites howled.
They bend the forest and crush the city, yet may not forest or city behold the hand that smites.
Kadath in the cold waste hath known Them, and what man knows Kadath?
```
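
For reference, the buffering requirements from the Design section can be sketched roughly as below. This is not the `Speec` code, just a minimal illustration in Python using only the standard library. The names `ChunkStore` and `pcm_to_wav`, the `seal` hook, and the audio format constants (16 kHz, 16-bit, mono) are all assumptions made for the example. The standard library has no `AES`, so the encryption step is left as a pluggable callable whose default is a passthrough that provides no protection.

```python
import gzip
import io
import time
import wave

SAMPLE_RATE = 16000   # assumed: 16 kHz
SAMPLE_WIDTH = 2      # assumed: 16-bit samples
CHANNELS = 1          # assumed: mono


class ChunkStore:
    """Holds compressed PCM chunks and reports when a flush is due."""

    def __init__(self, max_bytes, max_age_s, seal=lambda b: b):
        # `seal` is a hypothetical hook for the encryption step (e.g. AES).
        # The default is a passthrough and provides NO confidentiality.
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.seal = seal
        self.chunks = []
        self.size = 0
        self.last_flush = time.monotonic()

    def add(self, pcm):
        """Compress, then seal, one capture buffer (e.g. 1000 ms of PCM)."""
        blob = self.seal(gzip.compress(pcm))
        self.chunks.append(blob)
        self.size += len(blob)

    def should_flush(self):
        """True once the size threshold or the timer has been reached."""
        aged = time.monotonic() - self.last_flush >= self.max_age_s
        return self.size >= self.max_bytes or aged

    def flush(self):
        """Return the stored chunks and reset the counters."""
        out, self.chunks, self.size = self.chunks, [], 0
        self.last_flush = time.monotonic()
        return out


def pcm_to_wav(pcm):
    """Wrap raw PCM in a WAV header so a normal player can open it."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()
```

Because each chunk is compressed and sealed independently, losing the process mid-capture only costs the chunk in flight. Since raw PCM concatenates cleanly, the recovered chunks can be joined and passed to `pcm_to_wav` in one go.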