A USB microphone for online meetings

Posted on Mar 13, 2021 in audio, hardware, stm32

Here in the UK, the new reality of working in the IT business over the past year has been that we’re all at home, working remotely over virtual desktop connections. For someone engaged in software development this is a setup that works well. Having to commute 90 minutes each way into London every day on the train is not something I’ll ever miss.

Team meetings are still an important part of the day though and that meant digging out and dusting off my old webcam, a Logitech something-or-other that works fine in every scenario except when I use it through my company’s Citrix-hosted virtual desktop. The video is fine but the audio frequency on the VDI is mismatched to the actual frequency on the physical device. I sound like Mickey Mouse on helium.

This is a well-known problem and the solution is to change the frequency on the VDI, which requires administrator level access. And that is never going to happen. I could try calling our support desk just to see how long it takes before they realise it’s not company-issued hardware and therefore the ticket has just been closed and is there anything else they can help me with today? No, not going to do that.

So I’ve been muddling through by using a rather useful Android USB microphone app called Wo-Mic. You install a PC server component, connect your phone by cable and voila your PC has a new USB microphone that’s actually your phone. It’s not a bad solution and the audio quality is very good but I’d really like a dedicated microphone that I can just plug in and place on the desk in front of my keyboard.

Any normal person would open Amazon and either buy the cheapest microphone available, or perhaps a Røde if they know decent audio when they hear it. Not me though. This sounds like a project if ever I heard one. A good chance to see if I can build a USB microphone and learn a thing or two along the way.

The design

I came up with two possible options for the design. One would be part-analogue and the other all-digital.

Both options are designed around a MEMS microphone for translating sound pressure levels into an electrical signal. Physically a MEMS microphone comes in a miniature metal ‘can’ that houses the required circuitry inside. The can serves both as physical protection for the sensitive receiver and as electrical protection from interference for the analogue circuitry. There is a small hole drilled in either the top or the bottom of the can to allow sound waves to enter.

This is a TDK ICS43432

If the port is on the bottom then you need to provide a hole of the same size in your PCB. This is the trickiest design for a hobbyist to work with because you absolutely must not get any gunk in that port, which means being super-careful where you put flux on the pads that will be very, very close by.

The tiny circuit board on the bottom with those difficult pads

In option 1, the part-analogue design, I would connect a MEMS microphone up to some analogue signal conditioning and amplification circuitry before feeding it to an AD converter and then into an MCU for digital signal processing and output over a USB interface using the USB audio device class.

The tricky parts in this design are the signal conditioning and the ADC. MEMS microphones output a very low AC signal, typically in the millivolt range. This would need to be amplified using an op-amp before sending to an analogue-to-digital converter. I’ve modelled this design in LTspice and am convinced that I could do it but the devil would be in the implementation details such as noise problems from the mixed-signal circuit board and the choice of op-amp and ADC.

Option 2 does away with those tricky analogue signal conditioning parts and uses a digital MEMS microphone. These devices have the analogue processing inside the metal can and their output is a digital I²S signal. From a noise-reduction and signal quality perspective keeping those sensitive analogue parts inside the shielding of the can is the best decision.

I decided to go with Option 2, the all-digital (to me) design.

Selecting a MEMS microphone

ST’s range of digital microphones provide a PDM output which needs some fairly involved but perfectly do-able software decoding in the MCU to get the PCM samples that I need. Best though are the microphones from TDK and Knowles that provide an I²S output. I²S is supported in hardware by many of the STM32 devices (it’s just a specific application of SPI) so the microphone can be hooked up directly to the MCU and we’ll be provided with a constant stream of PCM samples that we can work with directly.

I’ve had mixed success in other projects where I’ve used these microphones. The first build I did was a total write-off because, like a complete idiot, I washed the board after building it, immediately destroying the microphone by getting water in the port. Undeterred, I built another one and it did work but, being an analogue microphone, the audio quality wasn’t great because my first attempt at filtering and amplification didn’t hit the mark.

The next board I built was a different design that used a Knowles I²S microphone and I had big problems getting it to reflow to the board. I think the metal can caused problems with heat transfer to the pads underneath. And that’s the crux of the problem – the pads are completely underneath the microphone so it’s impossible to check for bad joints. I decided that going forward I’d simply buy a breakout board with the microphone already on it and design a carrier board around it.

This is the one I decided to use. It’s the INMP441 by TDK, Invensense or whatever they’re calling themselves this week. It’s actually gone NRND now but that doesn’t matter because I’m not going into mass production and I can get these breakout boards on ebay for less than £5.

MCU selection

It’s an STM32 and that’ll come as no surprise to anyone that follows this blog because they’re just so versatile. I need one that has I²S and USB peripherals and will be capable of translating the I²S format into the USB format in real time. Since this is a one-off build overkill is not an issue so I’ve selected the STM32F446RCT7 in the LQFP64 package.

This has 256kB of flash, 128kB of SRAM and can run at up to 180MHz. A lot of those resources are going to go unused but for the sake of a few pounds I’d rather have resources to spare than find I needed more to complete the project.

Detailed design

Here’s a more detailed view of the proposed design.

The STM32 sits at the center of the design and acts as an I²S master, providing the I²S clock and word select (WS) signals and receiving serial data as the response. Here’s a view of the protocol from the INMP441 datasheet.

The protocol in the middle is the one I’ll be using

The STM32 will provide a 48kHz WS signal and since there are 64 clocks per period then the clock will be 3.072MHz. Providing this clock accurately with zero error will require a dedicated external oscillator that you will see in the schematic. The data from the INMP441 is provided as 24-bit signed PCM samples, MSB first and left-justified into 32-bit words. The data bits are offset by one clock from the change in the WS signal and are available to read on the rising edge of the clock. This is known as the ‘Philips’ standard. Presumably the one clock offset was to allow early hardware implementations to use a clock cycle to reset their registers and prepare for a new sample.

The I²S peripheral will be connected to the DMA peripheral so it can operate in ‘hands free’ mode. The firmware will receive interrupts when buffers of data are ready to be processed and sent.

A buffer of data from the microphone will be presented as 64-bit stereo frames with 24 bits of data in the 32-bit left channel and zero in the right channel. ST provide a reference implementation of the USB audio device class that operates on 16-bit samples and I’ve decided not to try to change that, at least not in my first release. Therefore the first task is to reduce the sparse array of 64-bit frames to a new buffer of 16-bit PCM samples.
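As a rough illustration of that conversion, here is a minimal sketch. It assumes each 32-bit word has already been assembled in its logical order, with the 24-bit sample left-justified and the low 8 bits zero. The real memory layout depends on how the DMA is configured, so the layout and the helper names (`topSixteenBits`, `framesToMono16`) are illustrative assumptions and not the firmware's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keep the 16 most significant bits of a left-justified 32-bit I2S word.
// The XOR/subtract pair sign-extends without relying on implementation-defined casts.
inline int16_t topSixteenBits(uint32_t word) {
  const uint32_t top = word >> 16;  // 0..0xFFFF
  return static_cast<int16_t>(static_cast<int32_t>(top ^ 0x8000u) - 0x8000);
}

// Convert interleaved L/R 32-bit words to mono 16-bit samples using the left channel only.
inline std::vector<int16_t> framesToMono16(const std::vector<uint32_t> &frames) {
  assert(frames.size() % 2 == 0);  // each frame is one left word plus one right word
  std::vector<int16_t> out;
  out.reserve(frames.size() / 2);
  for (std::size_t i = 0; i < frames.size(); i += 2) {
    out.push_back(topSixteenBits(frames[i]));  // frames[i + 1] is the empty right channel
  }
  return out;
}
```

A full-scale 24-bit sample maps to a full-scale 16-bit sample here, so the conversion gives up resolution but not amplitude.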

The next task in the pipeline is to apply any ‘graphic equalizer’ filtering that might be required. For example I might need to suppress high or low frequency noise to clean the signal up. If I’m really lucky the signal will be perfect out-of-the-box but somehow I doubt it.

Thirdly I’ll need to adjust the signal volume (amplitude) according to my preference. The USB audio device class provides for a volume control and PC operating systems expose that in the form of a microphone volume slider in their ‘settings’ control panels.

Finally the fully transformed buffer of PCM signals can be sent to the USB firmware for transmission to the PC.

All of this has to be designed so that the number of samples that we gather in one DMA buffer is sufficiently large that we have enough time to do all the processing before the next buffer is available but cannot exceed the maximum size permitted to be sent in a single USB packet. This is why I’ve selected a 180MHz CPU with DSP instructions — finding that I’m CPU-bound would be a show-stopper for the project.

Schematic

Here’s the full schematic for this project.

Click for a PDF

The schematic is quite modular in design so let’s have a walk-through of each section.

The power supply and USB connection

5V is delivered to the board over the USB cable, filtered through the usual LC network that I use and connected to a Texas Instruments LP5907 3.3V ultra low noise regulator. The USB data signals and the 5V input are passed through an ST Micro USBLC6 ESD protection IC.

The INMP441 microphone

The footprint for the INMP441 is just two rows of 3 female pin headers spaced at 300mil. L/R is pulled down to GND with R5 which should cause the INMP441 to output data in the left channel. Rather than leave this hardwired I also decided to connect it up to the MCU just in case I needed to assert manual control over that line.

The INMP441 specification requires a 100k pull-down on the SD line and when I wrote this schematic my INMP441 was on the slow-boat from China so I decided to include the footprint for R2 and if it turned out that the board included it then I’d leave mine off the final build.

The STM32F446RC

The smallest available LQFP package has 64 pins so lots of them are going to be unused. On the left side we’ve got the I²S signals connected to the I2S3 peripheral. There’s also a simple GPIO input for a physical mute button on board. When muting is enabled the device will ignore DMA interrupts. USB I/O and the necessary SWD programming ports complete the left side of the picture.

Two LED outputs are provided. The blue link LED will light when the USB connection is active and running. If there’s a software crash in the form of a hard-fault then I’ll rapidly flash this LED and to that end I’m providing a reset button that can be used to get me out of this situation without having to unplug the USB cable. The red live LED will light up when data is actively being sent over the audio connection. When software or hardware muting is enabled then this light will go out.

Optimum USB audio quality requires accurate clocks. I2S_CK on PC9 is one such clock. It’s connected to an external 12.288MHz Microchip oscillator. Internally the STM32 can divide this by 256 to get exactly 48kHz and by 4 to get exactly 3.072MHz.
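The divider arithmetic is easy to check in code. This is a minimal sketch using only the figures quoted in the text (12.288MHz oscillator, 48kHz sample rate, 64 bit clocks per frame). The function names are illustrative and are not part of the firmware or the ST HAL.

```cpp
#include <cassert>
#include <cstdint>

// Bit clock needed to shift one whole frame out per sample period.
inline uint32_t bitClockHz(uint32_t sampleRateHz, uint32_t bitsPerFrame) {
  return sampleRateHz * bitsPerFrame;
}

// Integer divider from a source clock to a target clock,
// or 0 if the source is not an exact multiple of the target.
inline uint32_t exactDivider(uint32_t sourceHz, uint32_t targetHz) {
  assert(targetHz != 0);
  return (sourceHz % targetHz == 0) ? sourceHz / targetHz : 0;
}
```

With these helpers `bitClockHz(48000, 64)` gives 3072000, and `exactDivider(12288000, 48000)` and `exactDivider(12288000, 3072000)` give 256 and 4. An 8MHz source has no exact integer divider to 48kHz, which is why a dedicated audio oscillator is attractive.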

The STM32’s core clock will be derived from an external 8MHz crystal that also guarantees an accurate 48MHz clock for the USB peripheral.

Bill of materials

Here’s the full bill of materials for this project.

Identifiers	Value	Quantity	Description	Footprint
C1	10n	1	Ceramic capacitors	0603
C2, C7, C8, C9, C10, C11, C15, C16, C18	100n	9	Ceramic capacitors	0603
C3, C6, C12	4.7µ	3	Ceramic capacitors	0603
C4, C5	1µ	2	Ceramic capacitors	0603
C13, C14	22p	2	Ceramic capacitors	0603
C17	22µ	1	Electrolytic capacitor	2.5mm lead pitch
D1	Live LED	1	Red LED	2012
D2	Link LED	1	Blue LED	2012
FB1	BLM18PG221SN1D	1	Ferrite bead	0603
P1	USB connector	1	USB mini-B	custom
P2	INMP441	1	2x 3-pin headers	100mil headers, 300mil spacing between headers
P3	JST XHP-5	1	Female SWD header	5x 2.5mm
R1, R2, R5	100k	3	Chip SMD resistor	0603
R3	330	1	Chip SMD resistor	0805
R4	150	1	Chip SMD resistor	0805
SW1, SW2		2	PCB button	custom
U1	STM32F446RCT7	1	MCU	LQFP64
U2	LP5907	1	LDO regulator	SOT23-5
U3	USBLC6-2SC6	1	USB ESD protection	SOT23-6
Y1	Abracon ABM3B	1	8MHz crystal	custom
Y2	DSC6011CI2A-012.2880T	1	Microchip 12.288MHz oscillator	custom

PCB layout

Before starting the layout I had a quick look at the current JLCPCB prices and could scarcely believe my eyes when I saw that controlled impedance 4-layer boards are available up to 100x100mm for just $8 a pack of five. It wasn’t that long ago when 4 layer boards would run into hundreds of dollars!

Are 4-layer boards going to be the new normal? They are for me, that’s for sure. The benefits of being able to just drop a via when you need power or ground are hard to ignore. The only minor annoyance with JLCPCB’s implementation is that they won’t accept a negative gerber for an internal plane. You have to do everything with polygons and fills.

Here’s the PCB layout.

I shelved the polygon pours, including the two internal pours, for these screenshots, hence the rats-nest of apparently unconnected nets.

The first internal layer is GND and the second is VDD. The layout of the header pins for the INMP441 is designed so that the hole in its board that allows sound to pass through into the microphone can is facing upwards. I do have some concerns about dust getting into that hole over time so I may have to consider a housing for the microphone with some foam over that area.

The 5-pin SWD connector is a female JST XHP-5. These box connectors have a 2.5mm pitch and mate with a male connector fitted to a custom cable that I’ve made up. I now use these connectors for SWD connections when the board is small and won’t take the much bigger 10×2 100mil headers.

The 3D view is best for checking whether any of the silkscreen labels overlap parts that they should not and making minor changes to their position. Four M3 screw holes round off a very simple physical design measuring around 46mm square.

Building the board

I uploaded my design to the JLCPCB website, selected a black solder mask and used the 3 weeks it took to arrive to source the parts from Mouser.

JLCPCB were the first to offer a matte-black soldermask at a reasonable price and I do have one of their early boards, where the finish was very similar to a black chalkboard. Apparently that finish suffered from poor adhesion and so they’ve now changed it slightly and whilst it’s still matte there’s a very slight shine to it and it’s more black than grey. It does look very nice and not at all like the horrible glossy uneven black of old.

My process for building a board is quite slow but I find it reliable. I first tin the pads with solder, then I reapply flux to the pads, then I place the surface mount components on the tinned pads and then I reflow the board in my halogen reflow oven. When the reflow is complete I touch up any problems manually and then solder in the through-hole components by hand. Finally I wash the board using hot soapy water and a toothbrush, rinse it off with cold water and leave it out to dry for at least a day.

Looking good! The reflow was completely successful with no touch-up required. Now I can get on with the firmware development.

Firmware

I wrote the firmware in Ubuntu Linux using STM32CubeIDE. This is basically Eclipse with plugins developed by ST Micro as well as some other open-source plugins. To kick-start the project I used the Cube GUI to configure the peripherals and clocks and then write out a template project. Here’s the peripherals view:

It’s very helpful to be able to use the graphical clock tree configurator to set everything up and know in advance that it’s going to be perfect.

Once Cube has generated the template project then I take it from there. I edit the generated source code to remove the huge ugly (sorry ST) comments and reformat the source to make it more readable.

Debugging

The first time I hooked everything up I was initially pleased to find that the MCU was responding, firmware was being flashed and I was getting regular interrupts from the DMA controller handling the I²S peripheral. On the USB side my microphone was detected by Ubuntu and I was able to capture samples using the free Audacity software.

However, it sounded dreadful. My voice was audible but it was harsh and way too loud. If I got anywhere near the microphone then it would clip badly and harshly. Something was clearly wrong.

The difficulty with debugging audio is that you’re presented with a continuous stream of what you have to assume is valid PCM samples but what does a correct sample look like? You can’t just look at a buffer of data in the debugger and be able to tell good from bad. I’ve been known to hold up the microphone to my computer speakers whilst playing an online sine-wave generator to see if the captured data would exhibit a constant pattern!
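One way to turn the sine-wave trick into something measurable is to count zero crossings in a captured buffer and compare the estimated frequency with the tone being played. The following is a minimal sketch of that idea and not code from the firmware. `makeSine` only stands in for captured data, and both function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Generate a 16-bit sine wave; used here only to stand in for captured data.
inline std::vector<int16_t> makeSine(double freqHz, double sampleRateHz,
                                     std::size_t count, double amplitude) {
  const double kTwoPi = 6.283185307179586;
  std::vector<int16_t> samples(count);
  for (std::size_t i = 0; i < count; i++) {
    const double phase = kTwoPi * freqHz * static_cast<double>(i) / sampleRateHz;
    samples[i] = static_cast<int16_t>(std::lround(amplitude * std::sin(phase)));
  }
  return samples;
}

// Estimate the dominant frequency by counting negative-to-non-negative crossings.
inline double estimateFrequencyHz(const std::vector<int16_t> &samples, double sampleRateHz) {
  assert(sampleRateHz > 0.0);
  if (samples.empty()) {
    return 0.0;
  }
  std::size_t risingCrossings = 0;
  for (std::size_t i = 1; i < samples.size(); i++) {
    if (samples[i - 1] < 0 && samples[i] >= 0) {
      risingCrossings++;
    }
  }
  return static_cast<double>(risingCrossings) * sampleRateHz /
         static_cast<double>(samples.size());
}
```

The estimate is coarse, with a resolution of roughly one cycle per buffer length. That is still enough to tell a clean 1kHz tone from a buffer that has been byte-swapped or has lost its sign bit, because corrupted data tends to produce far more crossings than the tone should.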

The first thing I did was verify what could easily be probed. The voltages were fine. My oscilloscope showed exactly 48kHz on the WS line and exactly 3.072MHz on the CK line so no problems there.

To rule out the microphone I ordered another one online from a different seller and waited a few days. The new microphone showed exactly the same issue so it was apparent that the problem was elsewhere.

For the next step I decided that I needed to know exactly how the DMA peripheral was delivering the I²S data into memory. I could see from the data in-memory that the first 32-bit word had the data and the second was always zero but was the 24-bit sample in the 32-bit double-word being byte-swapped or swapped around 16-bit words? This was a nagging concern that had to be investigated because the DMA peripheral is programmed for 32-bit transfers but the I²S peripheral is inherently a 16-bit device.

I got out an old STM32F4 discovery board and wrote a simple firmware program to act as an I²S slave that would output fixed 24-bit samples. I removed the INMP441 from my board and hooked up the discovery board, splicing in my logic analyser into the middle so I could see what was going on.

This was very enlightening. Not only could I see exactly how a 24-bit value in memory is placed on to the wire by the I²S peripheral in the discovery board but I could also see how that would end up in the receiving board’s memory.

The issue, in the end, was that I was losing the sign bit on the PCM sample during my processing. To downsample from 24 to 16 bits I simply take the top 16-bits, discarding the lower 8. With the sign bit preserved correctly the audio started working.
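The sketch below models one memory layout that would explain why a plain 16-bit truncation can still return the top of the sample. It assumes the 24-bit sample arrives as two 16-bit half-words, most significant half first, and that a little-endian CPU then reads the pair as a single 32-bit word. That ordering is an assumption for illustration and has not been checked against this hardware, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Split a 24-bit two's complement sample into two half-words:
// the 16 MSBs first, then the 8 LSBs left-aligned in a second half-word.
inline void splitSample24(uint32_t sample24, uint16_t &first, uint16_t &second) {
  assert(sample24 <= 0xFFFFFFu);
  first = static_cast<uint16_t>(sample24 >> 8);
  second = static_cast<uint16_t>((sample24 & 0xFFu) << 8);
}

// How a little-endian CPU sees two consecutive half-words when read as one
// 32-bit word: the half-word stored first ends up in the low 16 bits.
inline uint32_t asLittleEndianWord(uint16_t first, uint16_t second) {
  return static_cast<uint32_t>(first) | (static_cast<uint32_t>(second) << 16);
}

// Truncate the word to its low 16 bits and interpret them as a signed sample.
inline int16_t truncateToInt16(uint32_t word) {
  const uint32_t low = word & 0xFFFFu;
  return static_cast<int16_t>(static_cast<int32_t>(low ^ 0x8000u) - 0x8000);
}
```

For the sample 0x8EAA33 the half-words are 0x8EAA and 0x3300, the word in memory reads as 0x33008EAA, and truncation returns 0x8EAA as a signed value, which is the top 16 bits of the original sample with its sign intact.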

Audio processing

ST provide a suite of audio effects expansion software called X-CUBE-AUDIO. This is implemented as a closed-source but freely available package that integrates easily into firmware using consistent APIs designed to be used as part of an audio-processing pipeline.

The first package I was interested in was the Graphic Equalizer (GREQ) package documented in UM1798. The audio that I was sampling was clear and noise-free but because of the location where I sit it tended to sound quite boomy and bass-heavy. If I could attenuate those frequencies then my speaking voice would sound more crisp and natural with less reverberation.

You can select the center frequencies for the equalizer configuration from 5, 8 or 10 preset bands with a maximum adjustment of ±12dB. I selected 10 bands and configured it to attenuate the 62, 115 and 215Hz bands by 6dB and to amplify the other bands by 6dB. The reason for the amplification of the higher frequencies is to compensate for an overall attenuation performed by the filter; see UM1798 for details.

This is the resource usage required by the GREQ filter.

To use the filters I simply copied the two header files and the binary library into my project, rebuilt and ran. Surprisingly and very happily it just worked first time. The audio was notably more clear with much less unwanted boomy reverberation.

I will have to leave these settings hardcoded into the firmware. The USB audio device class does provide for graphic equalizer control but I have not seen it implemented in PC operating system software, even though I declare in my audio device descriptor that I support it and I can see it there in the UsbTreeView PC software.

The next part would be volume control. Until now I’d used the normalisation filter in Audacity to bring up the input signal to a loud enough level to listen to critically. This stage in the pipeline is achieved by ST’s Smart Volume Control (SVC) filter documented in UM1642.

It is possible to do dumb amplification by simply multiplying the PCM signals by a constant value and if the levels are low enough you might get away with that without clipping but you’ll also amplify any noise in the signal because you’ll treat low levels where there is no useful sound the same as the higher levels. ST’s filter takes into account the dynamic range of the input signal to give more consideration to the peaks without touching the troughs.
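For comparison, this is roughly what the dumb approach looks like. It is a minimal sketch of fixed-gain amplification with saturation and has nothing to do with how ST's closed-source SVC filter works internally. The function names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Convert a gain in decibels to a linear amplitude multiplier.
inline double dbToLinear(double gainDb) {
  return std::pow(10.0, gainDb / 20.0);
}

// Multiply every sample by a constant gain, saturating instead of wrapping on overflow.
inline void applyFixedGain(std::vector<int16_t> &samples, double gainDb) {
  assert(std::isfinite(gainDb));
  const double gain = dbToLinear(gainDb);
  for (int16_t &s : samples) {
    const double scaled = std::round(s * gain);
    s = static_cast<int16_t>(std::max(-32768.0, std::min(32767.0, scaled)));
  }
}
```

Every sample is scaled by the same factor, so background noise rises by exactly as much as speech, and anything that exceeds the 16-bit range is flattened at the limits, which is heard as clipping.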

The configuration for this filter is a simple selection of the gain from -80 to +36dB in 0.5dB steps so the actual limits for the parameter are -160 to +72. This is the resource usage required by the SVC filter.

Once again this just worked and I was able to hook up the volume control to the USB control input so I could use the volume slider in the Ubuntu settings app to set it on-the-fly.
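The half-dB gain parameter mentioned above can be sketched as a small conversion helper. The limits come from the text. The name of the real SVC configuration field is not shown in this article, so `gainDbToHalfDbSteps` is a purely illustrative helper and not ST's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Limits from the text: -80dB to +36dB in 0.5dB steps.
const double kMinGainDb = -80.0;
const double kMaxGainDb = 36.0;

// Convert a gain in dB to a whole number of half-dB steps, clamped to the valid range.
inline int16_t gainDbToHalfDbSteps(double gainDb) {
  assert(!std::isnan(gainDb));
  const double clamped = std::fmax(kMinGainDb, std::fmin(kMaxGainDb, gainDb));
  return static_cast<int16_t>(std::lround(clamped * 2.0));
}
```

Clamping before the conversion means an out-of-range request from the host ends up at one of the limits, -160 or +72, and never reaches the filter as an invalid value.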

I have found that I need to apply the full +36dB amplification to get acceptable volume from the microphone. I used the Skype echo test call to check that I’m sounding good and, Skype’s obvious compression of the voice channel aside, it’s all good on the audibility front.

Here’s the important interrupt handler that processes the incoming samples.

/**
 * 1. Transform the I2S data into 16 bit PCM samples in a holding buffer
 * 2. Use the ST GREQ library to apply a graphic equaliser filter
 * 3. Use the ST SVC library to adjust the gain (volume)
 * 4. Transmit over USB to the host
 *
 * We've got 10ms to complete this method before the next DMA transfer will be ready.
 */

inline void Audio::sendData(volatile int32_t *data_in, int16_t *data_out) {

  // only do anything at all if we're not muted and we're connected

  if (!_muteButton.isMuted() && _running) {

    // transform the I2S samples from the 64 bit L/R (32 bits per side) of which we
    // only have data in the L side. Take the most significant 16 bits, being careful
    // to respect the sign bit.

    int16_t *dest = _processBuffer;

    for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {
      *dest++ = data_in[0];     // left channel has data
      *dest++ = data_in[0];     // right channel is duplicated from the left
      data_in += 2;
    }

    // apply the graphic equaliser filters using the ST GREQ library then
    // adjust the gain (volume) using the ST SVC library

    _graphicEqualiser.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);
    _volumeControl.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);

    // we only want the left channel from the processed buffer

    int16_t *src = _processBuffer;
    dest = data_out;

    for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {
      *dest++ = *src;
      src += 2;
    }

    // send the adjusted data to the host

    if (USBD_AUDIO_Data_Transfer(&hUsbDeviceFS, data_out, MIC_SAMPLES_PER_PACKET / 2) != USBD_OK) {
      Error_Handler();
    }
  }
}

/**
 * Override the I2S DMA half-complete HAL callback to process the first MIC_MS_PER_PACKET/2 milliseconds
 * of the data while the DMA device continues to run onward to fill the second half of the buffer.
 */

inline void Audio::I2S_halfComplete() {
  sendData(_sampleBuffer, _sendBuffer);
}

/**
 * Override the I2S DMA complete HAL callback to process the second MIC_MS_PER_PACKET/2 milliseconds
 * of the data while the DMA in circular mode wraps back to the start of the buffer
 */

inline void Audio::I2S_complete() {
  sendData(&_sampleBuffer[MIC_SAMPLES_PER_PACKET], &_sendBuffer[MIC_SAMPLES_PER_PACKET / 2]);
}

The DMA peripheral is configured to transfer 20ms of data into our buffer and to provide 'half-complete' and 'complete' interrupts. Therefore we have 10ms to decode, process and send 10ms of data before the next interrupt happens.

The first stage decodes the 64-bit mono samples from _sampleBuffer into 16-bit interleaved stereo signals in _processBuffer. Stereo is required because the GREQ filter will not operate on mono. To simulate stereo I simply duplicate the left channel into the right.

The second stage calls the GREQ filter to process the data in-place. Nice that these filters can work on data in-place.

The third stage calls the SVC filter to also process the data in-place.

The final stage takes the processed left channel from _processBuffer, copies it into _sendBuffer and calls USBD_AUDIO_Data_Transfer to transmit it. USBD_AUDIO_Data_Transfer has some constraints. It cannot be called more frequently than once per millisecond — check. You must pass it an amount of data that matches the calling frequency — I'm calling it once every 10ms with 480 mono samples which is exactly correct for a 48kHz stream. There may also be an upper cap of 1000 bytes on the packet to match a USB buffer size but that's not documented in the ST source code.
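The buffer and packet sizes quoted above follow from simple arithmetic, sketched here with illustrative function names that are not the constants used in the firmware. The only inputs are the figures from the text: a 48kHz stream, 16-bit mono output and two 32-bit words per incoming I2S frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of mono samples produced in a given number of milliseconds.
inline std::size_t samplesPerInterval(uint32_t sampleRateHz, uint32_t intervalMs) {
  const uint64_t scaled = static_cast<uint64_t>(sampleRateHz) * intervalMs;
  assert(scaled % 1000 == 0);  // this sketch only handles whole numbers of samples
  return static_cast<std::size_t>(scaled / 1000);
}

// Size in bytes of one packet of 16-bit mono PCM covering the interval.
inline std::size_t packetBytes(uint32_t sampleRateHz, uint32_t intervalMs) {
  return samplesPerInterval(sampleRateHz, intervalMs) * sizeof(int16_t);
}

// Number of 32-bit words the I2S DMA buffer needs for the same interval:
// every sample occupies a left word and an empty right word.
inline std::size_t dmaWords(uint32_t sampleRateHz, uint32_t intervalMs) {
  return samplesPerInterval(sampleRateHz, intervalMs) * 2;
}
```

For 10ms at 48kHz this gives 480 samples and a 960 byte packet, which stays under the possible 1000 byte cap, while the full 20ms DMA buffer needs 1920 words.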

Performance

To measure the performance of the audio transformation and processing pipeline I inserted some debug code to toggle a GPIO pin at the start of the sendData method and then again just after the call to USBD_AUDIO_Data_Transfer. The actual data transfer performed by USBD_AUDIO_Data_Transfer is interrupt-driven so that part is not included in the performance figures. I used my oscilloscope to probe the oscillating GPIO pin and measured the time between the rising and falling edges.

Recall that I have 10ms to do all the work in the interrupt handler. In debug mode processing takes 1.7ms. In release mode it takes 1.5ms. I'm pleasantly surprised by this performance and it does indicate that the opaque audio processing blocks provided by ST Micro perform very well.

Watch the video

I've made a short video that talks about this project and shows the microphone in operation with some sound samples so you can hear it.

It looks better when viewed directly from the YouTube website. Click here for that.

Free gerber files

If you'd like to build this board yourself then you can download the Gerber files from here. These can be uploaded to the JLCPCB website.

Free firmware

It's all available on Github. Click here to go to the repository.

Update Aug 2021

I've just posted an article that explains how I fixed an issue with the click of the mute button being audible. Click here to read it.

Rob Jordan

Ah the things we do for love. You know you can buy one of these for £6.49? https://bit.ly/2PVZ1EF. Yes, I know you do know that.

This is very interesting to me and a tremendous piece of work. I have a temporarily shelved project, a little similar to your analogue approach. For wildlife recording, I had the idea to build a device that connects electret capsule mic -> opamp preamp -> off-the-shelf I2S ADC -> ESP32 -> SD card. This would potentially replace a setup I have that uses Raspberry Pi and a Focusrite USB audio interface, that gives me good quality recordings but it’s a bit expensive and sometimes trashes SD cards if the RPi is not shut down cleanly. I think Linux is overkill for this application, hence the appeal of a microcontroller-based solution.

ESP32 because it has I2S and I have some experience with it. I can see STM32 may well be a better choice but it would be another thing to learn for me.

I used a TI reference design for the preamp, converted it for SMD components and designed a JLC PCB, which to my delight seemed to work first time (though not saying anything about audio quality). (https://bit.ly/3cuIXBh)

The ADC is off-the-shelf and seems to have quite good specs and reasonably good manufacturing quality (https://bit.ly/3cuIXBh).

One thing I noticed when observing the I2S bus with a scope, is that the pattern of zeros and ones acts like a pseudo VU meter, which is kind of nice for monitoring levels 🙂

I managed to decode the I2S on the ESP32, and write some data to the SD card, but sooner or later (usually sooner) the delay in writing to the card causes a chunk of lost data from I2S. I need to implement a two thread, buffered approach to overcome this, but another project demanded my attention for a while so I’ve let it drop. If you have any simplifying ideas for pulling audio data from I2S and persisting it inexpensively to permanent storage, I’d love to hear them.

Thanks for the article, it’s great.

Rob
- Andy Brown
  
  Hi Rob, an SD card attached to the ESP32 has to be the way to go if you can sort out the timing issues which I assume are intermittent because every now and then it has to do a file-system update operation that needs more time than usual? If so then maybe just increasing the I2S DMA buffer size to give you more time to process it before the next buffer is ready could help?
  
  I did play around with the ESP32 for a while and found its SDK to be intuitive and easy to use even if the documentation at the time was a little sparse. A nice system with loads of power on tap but like you with the ESP32, I’ve got so much STM32 knowledge that I don’t think I could take on another big one in the time available.
  - Rob Jordan
    
    Thanks for the suggestion. Time to get the project off the shelf again.
Daniel

I think the second sample sounds a lot better than the third which sounds super tinny and artificial. Both pale in comparison to the audio you recorded for the video (which is what it really should sound like in the end). I also don’t approve of the use of HASL (and leaded? no less) but hey, personal preferences, right… There’s a way of using USB signals to auto-trim the exact 48MHz needed, STM32f0x2 support that and can be used without external crystal. They also support I2S so it might be possible to use those; you’d probably not get any of those audio algorithms to run there unless STM implemented those in integer rather than using the FP capabilities of the F4.
- Andy Brown
  
  Hi Daniel, thanks for your comments. The audio for the video is recorded through a fairly ordinary lavalier microphone into an Olympus LS-12. I bought that because I wasn’t pleased at all with the built-in audio recording from the video camera. ST Micro lock the audio processing algorithms to the F4 and F7.
Arik

Hi Andy, thanks for sharing this interesting project. I learned a lot.

One thing that I am confused about is converting from 24bit to 16bit. When the I2S data goes into DMA, each sample is converted from 24bit to 32bit. Does it know to convert signed integers correctly or does it just pad the number with 8 zeros?

You mention in the code that you “Take the most significant 16 bits” but AFAICS when you do “*dest++ = data_in[0]” you are converting int32_t to int16_t which will (usually) take the 16 least significant bits and strange things can happen if the volume from the microphone is higher than 16bits (not sure what will happen). Don’t you get compiler warnings on those lines?

Regardless of the casting, you would want to take the least significant bits out of the 24 because, as you mentioned, the microphone output volume is low. If you take the 16 most significant bits you will lose valuable data from the stream. Taking the higher 16bits would be like dividing by 8, which you wouldn’t want to do if the result goes into amplification later on. I do believe that one needs to address clipping when taking the lower 16bits. Should not cast it in a way that could lead to overflow.

What do you think?
- Andy Brown
  
  Hi Arik, thank you for commenting and I’m glad you like the project.
  
  The 8 unused bits are zeroed by the I2S peripheral and the data is left-justified. The STM32 is little-endian and the DMA, running in 16-bit transfer mode, stores the data MSB first, so the conversion from 24 to 16 bits is correctly done by simple assignment: the lower 16 bits appear to be taken, but the ordering of the data delivered by the DMA means you actually get the top 16 bits of the sample. I verified this in testing by attaching a second STM32 as an I2S slave sending a fixed bit-pattern. If I hadn’t done that I’d just be guessing at the data layout, because real example code that does this is hard to find. It does look a bit weird in the code but it’s the method that uses the fewest CPU cycles.
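
The layout described above can be modelled on a desktop machine. This is only a minimal sketch, assuming (as the reply states) that each left-justified 24-bit sample arrives as two 16-bit half-words, upper half first, and that they are stored at increasing addresses. The function names `dma_store_sample`, `little_endian_word` and `top16_by_truncation` are hypothetical and are not taken from the project's source.

```c
#include <stdint.h>

/* Hypothetical model of the DMA storing one 32-bit I2S frame as two
 * 16-bit half-words, upper half of the frame first. */
static void dma_store_sample(int32_t sample24, uint16_t out[2])
{
    /* Left-justify the 24-bit sample in 32 bits; the low 8 bits are zero. */
    uint32_t frame = ((uint32_t)sample24 & 0x00FFFFFFu) << 8;
    out[0] = (uint16_t)(frame >> 16);     /* sign bit plus the next 15 bits */
    out[1] = (uint16_t)(frame & 0xFFFFu); /* low 8 data bits, then 8 zeros  */
}

/* What a little-endian CPU reads when those two half-words are treated
 * as a single 32-bit word: the first half-word lands in the low 16 bits. */
static uint32_t little_endian_word(const uint16_t halves[2])
{
    return (uint32_t)halves[0] | ((uint32_t)halves[1] << 16);
}

/* Truncating that word to 16 bits therefore yields the TOP 16 bits of the
 * original sample. The unsigned-to-signed conversion is implementation-
 * defined in C; gcc and clang wrap modulo 2^16. */
static int16_t top16_by_truncation(const uint16_t halves[2])
{
    uint32_t word = little_endian_word(halves);
    return (int16_t)(uint16_t)(word & 0xFFFFu);
}
```

With a sample of 0x123456 the two stored half-words are 0x1234 and 0x5600, the 32-bit word as the CPU sees it is 0x56001234, and truncating it gives 0x1234.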
  
  Yes it’s legal to assign a signed 32 bit to a signed 16 bit. According to the standard the results are “implementation defined (or an implementation-defined signal can be raised)”. gcc truncates without a warning, even with -Wall -Wextra -pedantic. In “real life” this risks losing the sign bit but since I know the PCM sign bit along with the most significant 16-bits are in the lower word as the STM32 sees it then it works.
  
  It would only be dividing by 8 if the target space was 24 bits. It’s not, it’s 16 bits. You need to fill the target space by taking the most significant data and discarding the least significant bits. You lose resolution but not amplitude. I too was unsure of the ‘correct’ technique when first researching this but could not find anything to contradict this answer:
  
  https://stackoverflow.com/questions/28291509/convert-24-bit-signed-int-signal-to-16-bit-signed-int-signal
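
The difference between the two approaches discussed in this thread can be seen with a couple of plain integers. This is a sketch in ordinary C, not the project's code, and the names `pcm24_to_pcm16_msb` and `pcm24_to_pcm16_lsb` are made up for illustration.

```c
#include <stdint.h>

/* Keep the 16 most significant bits of a signed 24-bit sample.
 * Right-shifting a negative value is implementation-defined in C;
 * gcc and clang perform an arithmetic (sign-preserving) shift. */
static int16_t pcm24_to_pcm16_msb(int32_t sample24)
{
    return (int16_t)(sample24 >> 8);
}

/* Keep the 16 least significant bits instead. Anything above 16 bits
 * is thrown away, so loud samples wrap around rather than clip.
 * The conversion to int16_t is implementation-defined; gcc wraps. */
static int16_t pcm24_to_pcm16_lsb(int32_t sample24)
{
    return (int16_t)(uint16_t)((uint32_t)sample24 & 0xFFFFu);
}
```

A half-scale 24-bit sample (0x400000) becomes 0x4000 with the first function, which is still half scale in 16 bits, but becomes 0 with the second. A very quiet sample such as 291 becomes 1 with the first function and stays 291 with the second, which is the resolution trade-off raised in the question.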
  
  This was a fun project and I’ve been using the microphone daily in meetings with great results. If you want to experiment you could recreate this with one of the F4 discovery/nucleo boards and the same INMP441 breakout board.
Arik

Andy, it looks like the HTTPS certificate for andybrown.me.uk expired a few days ago. FYI.
- Andy Brown
  
  Thanks for letting me know, it’s fixed now.
Everett Hoskins

Andy
Thanks for sharing this project. I found reading the DSP bit enlightening. Just wondering how the design enumerates over USB with Windows computers? Does it simply use the default USB audio drivers built into Windows, or do you need to install specific drivers like the STLink USB drivers? Thanks again.
- Gogon
  
  I’m also curious about this. Could you please show me a starting point for developing a driver (perhaps ASIO-compatible) so the I2S signals can be read by Windows, similar to a sound card? 🙏
alessandro boschi

Hi.
I have read the article many times and I have looked at the code on GitHub, but I didn’t understand how you compute some values, for example AUDIO_IN_PACKET or the values of the buffers used to transfer the data. Can you explain in more detail?
- Andy Brown
  
  Hi, AUDIO_IN_PACKET is one of ST’s internal library definitions and not one that I define. The buffer sizes that I define are all in the USB_DEVICE/App/usbd_audio_if.h file. I can explain any of those that might be confusing, but not ST’s code. ST’s code seems designed to be confusing at the best of times 🙁
  - alessandro boschi
    
    Hello. Thanks a lot for the answer.
    Analyzing more in detail AUDIO_IN_PACKET I believe that its value comes from the following operation:
    ((maximum frequency supported / 1000) + (maximum single channel size)) * 8 (supported channels) * 2 (halfword).
    
    Meanwhile, looking at your code on GitHub (thank you, because it helped me a lot), I wanted to try changing the operating frequency from 48kHz to 32kHz for a university project. Leaving aside the audio data from the microphone for the moment, I transmit a signal that I created using a sine wave and try to sample it at 32kHz. The problem I am experiencing seems to be in the USBD_AUDIO_Init_Microphone_Descriptor function: any sampling frequency other than 48kHz that I pass to it causes a problem and my device does not initialize correctly. Do you know if there could be a solution? I can’t find any kind of solution on the internet and many of the examples I find always use 48kHz as the frequency.
    Thanks again for your availability
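
As a rough aid for the arithmetic when changing sample rate: a full-speed USB isochronous endpoint sends one packet per 1 ms frame, so the nominal audio payload per packet is the number of samples per millisecond times the channel count times the bytes per sample. The sketch below only works that out. It does not reproduce ST's AUDIO_IN_PACKET macro, it does not explain the descriptor initialisation failure described above, and the name `audio_bytes_per_frame` is hypothetical. The mono, 16-bit figures in the note after it are assumptions for illustration.

```c
#include <stdint.h>

/* Nominal audio payload, in bytes, for one 1 ms full-speed USB frame.
 * Assumes the sample rate is a whole multiple of 1000 Hz. */
static uint32_t audio_bytes_per_frame(uint32_t sample_rate_hz,
                                      uint32_t channels,
                                      uint32_t bytes_per_sample)
{
    uint32_t samples_per_ms = sample_rate_hz / 1000u;
    return samples_per_ms * channels * bytes_per_sample;
}
```

Assuming one channel of 16-bit samples, 48kHz gives 96 bytes per frame and 32kHz gives 64 bytes per frame, so any buffer or packet size derived from the sample rate has to change along with it.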
Venkat Krishnan

Hi, I am trying to implement this using a NUCLEO-L476RG board connected to the INMP441 over the SAI interface and sending the data to my computer over UART, but am getting very muffled sounds. I have been trying to debug this for a while now without much luck and was wondering if it would be possible for me to show you my code/results over a Zoom call. Thank you!
ksasahu

Hello Andy, thanks for sharing.
I was able to get it running but I use a different I2S interface. When I change it in the CubeIDE config (CubeMX), the generated code deletes the usbd_audio_in files and replaces them with usbd_audio only. Do you know how the generated code can keep the audio in files? I tried googling but was not able to find a solution for this. Version 1.8.0 of CubeIDE is currently installed here.
Thank you for checking.
- Andy Brown
  
  Hi, good to hear that you’ve got it running. Basically you cannot trust the Cube IDE very much. What I did was use Cube to create an initial project, verify clocks, hardware configuration and so on, and then I more or less abandoned it. I would either keep a mirror “test” project where I played with Cube to see what it would do and then import the changes manually into my project, or I would examine the code in the sample projects available online and import that. Often the code in the sample projects that appeared to be generated with Cube did not actually match what the latest versions of Cube would generate.
