stm32plus 3.2.0: Supporting the STM32F0 Cortex M0

A few months ago I made the decision to start supporting the lower-priced, hobbyist-friendly STM32 devices in my stm32plus C++ library. These lower-end devices come in smaller packages with lower pin counts that are easier to work with, and they have reduced clock speeds that make for fewer PCB layout headaches.

The first low-end STM32 series to be supported was the medium density ‘value line’ F1 (Cortex M3) as exemplified by ST’s Value Line Discovery board. This was supported in stm32plus 3.1.1. Now, with release 3.2.0, stm32plus supports the STM32F0 (Cortex M0) series of MCUs.

About the STM32F0

The STM32F0 is an implementation of the 32-bit ARM Cortex M0 core. The Cortex M0 core is very similar to that of the M3 and M4 except that some of the instructions and addressing modes are not present. This is not something that you will notice as a C++ programmer but it does impose limitations if you like to dabble in assembly language now and then.

The device in the picture is the STM32F051C8T7. At the time of writing this MCU costs £2.60 plus tax from Farnell in single units. For that you get a 48MHz core with 64KB of flash and 8KB of SRAM in an LQFP package that does not require an external oscillator, which further reduces the total design cost. This compares very favourably with similarly resourced 8-bit MCUs from companies such as Atmel.

The same flat 32-bit address structure is present on the M0 with the flash memory, SRAM and peripheral registers all mapped in to the usual address regions. If you’ve programmed the Cortex M3 or M4 then you’ll be right at home with the M0.

ST provide a very low cost development board for the M0 in the ‘discovery’ range.

The STM32F0 Discovery comes with an STM32F051R8T6 on board as well as an ST-Link v2 USB debugger interface. It’s on sale from Farnell at the moment for £6.78 plus tax.

The diagram above shows the pinout of the LQFP64 package included on the F0 discovery board. The 16-bit GPIO ports A, B and C are included in their entirety with a few pins each from ports D and F.

There’s a very important warning buried deep inside reference manual RM0091 that applies to GPIO ports PC13 to 15. It’s well hidden in the power control (PWR) section and I’m going to quote it here because you need to know this.

Due to the fact that the analog switch can transfer only a limited amount of current (3 mA), the use of GPIOs PC13 to PC15 in output mode is restricted: the speed has to be limited to 2 MHz with a maximum load of 30 pF and these IOs must not be used as a current source (e.g. to drive an LED).

I produced a quick video that walks you through the STM32 F0 Discovery board. You can watch it in HD on YouTube at the link below.

http://youtu.be/TCXGuYDWuv0

stm32plus support

The M0 support in stm32plus is generic enough that it should work on all of the M0 devices; however, I did the development against the F0 discovery board so support is officially for the F051 series. All of the examples target the F051 at 48MHz with 64KB flash, 8KB SRAM and running off the internal 8MHz oscillator (HSI).

Assuming that you’ve downloaded and extracted the source code archive from GitHub, you can build the library and all the compatible examples from a terminal prompt:

scons mode=debug mcu=f051 hse=8000000 -j4

Some notes on the above command:

  • Even though the examples use the 8MHz HSI oscillator you still need to supply a value for the hse parameter. 8000000 is a suitable default value.
  • You need to have installed scons on your system. Consult your package management system for the exact installation syntax. On Ubuntu it would be sudo apt-get install scons.
  • Windows users must use a Unix-like subsystem such as Cygwin or msys. I use Cygwin on Windows 7 x64.
  • Where I have used the mode=debug option I could also have used small or fast mode options to build the library optimised for size or speed, respectively.
  • The parameter to the -j option should reflect the number of cores in your build system.

OpenOCD and the ST-Link V2 debugger

Interactive debugging using the Eclipse CDT edition does not need any additional hardware because there is an ST-Link chip built on to the board. It’s interesting to note that ST have implemented the ST-Link interface using an STM32 F103 MCU.

To drive the ST-Link you need to get a copy of the free, open source OpenOCD utility. At the time of writing the latest version is 0.7.0 and the source can be downloaded from SourceForge.

If you’re not interested in building from source then Linux users can install it using their package manager, e.g. for Ubuntu it would be sudo apt-get install openocd.

Windows users can download compiled binaries from this location.

OpenOCD runs in the foreground as a server process so you need to fire up a terminal window and run it with the appropriate options. I like to create trivial scripts that I can use to start OpenOCD on demand. For example, this is my script for Windows 7 x64 on Cygwin:

#!/bin/sh

cd openocd-0.7.0
bin-x64/openocd-x64-0.7.0.exe -f scripts/board/stm32f0discovery.cfg

If it’s all working then you should see output like this when you run OpenOCD:

Open On-Chip Debugger 0.7.0 (2013-05-05-10:44)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.sourceforge.net/doc/doxygen/bugs.html
srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v14 API v2 SWIM v0 VID 0x0483 PID 0x3748
Info : Target voltage: 2.888784
Info : stm32f0x.cpu: hardware has 4 breakpoints, 2 watchpoints

At this stage OpenOCD is running as a server and you can either telnet to it and issue direct commands or you can use Eclipse to flash the board and do some visual debugging. Let’s first look at directly sending commands to it using a telnet session.

Controlling OpenOCD with telnet

Here’s a log of a real telnet session to OpenOCD.

$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
> reset init
target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0xc1000000 pc: 0x08000aec msp: 0x20002000
> flash write_image erase p:/stm32plus-examples-blink.hex
auto erase enabled
device id = 0x20006440
flash size = 64kbytes
wrote 3072 bytes from file p:/stm32plus-examples-blink.hex in 0.531030s (5.649 KiB/s)
> reset

Let’s take a look at what’s going on here. Firstly I use telnet to establish a session with OpenOCD:

telnet localhost 4444

OpenOCD responds with a one-liner greeting and a simple > prompt. The first thing I need to do is to reset the board and halt it so that I can flash my program.

reset init

OpenOCD logs the fact that the MCU was reset and is now halted awaiting my next command. I will now flash the board with an ihex file that was produced when I built the stm32plus package and all its examples.

flash write_image erase p:/stm32plus-examples-blink.hex

OpenOCD logs its progress and eventually tells me that it’s written 3072 bytes to the MCU. The device is still halted so now I’m going to reset it and this time let it go so my program will run.

reset

The program is now running and the onboard LED is blinking at a rate of 1Hz.

Controlling OpenOCD with Eclipse

If you’re using Eclipse to do your development then you can have all the OpenOCD interaction automated by the gdb debugger and you’ll get visual debugging with breakpoints and variable/memory inspection. Pretty much everything you’d expect to get from a PC-based debugger.

Assuming that you’ve got a project that builds and produces a .elf file as its output, open the Run -> Debug Configurations form and create a new GDB Hardware Debugging configuration for your project.

All the important options are on the first three configuration tabs. Rather than list them all by rote I’ll show some screenshots of one of my configurations. It should be easy for you to adapt these details to your project:

Now when I launch this configuration the latest build of the project will be automatically flashed to the board and it will reset and halt. I can then use the Eclipse Resume command to start the program.

Finally

Support for the F0 is now a core feature of stm32plus and will be maintained accordingly in each release. I’ve got a project or two in mind that will use the F0 MCU and of course those projects will be written up here on this website as they come to fruition.

  • Thomas

    Hi Andy, I just want to thank you very much for your hard work that you put into this library – it's much appreciated!

    And I've got a question too: What do you think about adding support for the ILI9341 that is driving the F4 disco TFT? I would dig into this myself but I won't start if you're signalling that you're already looking into this. Are you?

    Many greetings and have a nice day, Thomas

  • Hi Thomas, thank you very much for your comment. I have noticed the F429 discovery board but have not yet bought one – though I almost certainly will do at some point soon. At first glance it looks like ST are driving the ILI9341 using the parallel pixel clock method. I would like to add this to the library but it will need some new levels of abstraction to support the SDRAM as a framebuffer and to provide double-buffering for flicker-free animation. In short, I will do it but it will take some time before I get to it.

    • Thomas

      Hi Andy, that's great to hear! Just send me your mail – I'll donate you a board via paypal 🙂 Many greetings, Thomas

      • You're very generous Thomas but it's OK I'll buy a board myself quite soon. I actually need to put together enough in my Farnell shopping cart to break through the GBP20 minimum to qualify for free delivery 🙂

        • Thomas

          😉 Ok Andy, but if you need something… just let me know! Maybe I could contribute with a bit of code once I start developing with the board – thanks for all your efforts – Thomas

  • Robert S

    Hi,
    I have noticed that only the R61523-example is ported to use the "GPIO access mode" which is compatible with the stm32F0-series.
    Can I port other display examples to use the GPIO access mode, or are there obstacles with the other displays?

    • Hi Robert,

      Yes you can use any of the display examples where that display has a 16-bit data bus. However… you do need to be sure that the resulting write cycle is not shorter than the minimum write cycle time allowed by the display controller (see your controller's datasheet for the details). For example, at 48MHz the optimised assembly language version of the GPIO access mode (Gpio16BitAccessMode_64K_48_42_42) results in a write cycle of about 84ns, which would be too fast for the 100ns limit of the ILI9325 but is fine for the 60ns limit of the R61523.

      – Andy

  • Robert S

    Ah, thank you, got it. To summarize (if someone is looking for this), these are the alternatives:

    • For 24 MHz MCU: ~160ns write cycle (Gpio16BitAccessMode_24_80_80 / Gpio16BitAccessMode_64K_24_80_80)
    • For 48 MHz MCU: ~84ns write cycle (Gpio16BitAccessMode_48_42_42 / Gpio16BitAccessMode_64K_48_42_42)
    • For 72 MHz MCU: ~100ns write cycle (Gpio16BitAccessMode_72_50_50 / Gpio16BitAccessMode_64K_72_50_50)

    A display controller that needs a write cycle of at least 100ns, such as the ILI9325, should then work with the 24MHz and 72MHz MCUs, i.e. any series except the F0 series (which is only 48MHz).
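
The comparison made in this thread can be sketched as a small lookup and check. This is only an illustration: the struct and function names are made up for the example and are not part of stm32plus, and the nanosecond figures are the nominal ones quoted in the list above, not measurements.

```cpp
#include <cassert>
#include <cstdint>

// Nominal figures from the thread: core clock and resulting write cycle.
struct DriverTiming {
  std::uint32_t mcuMHz;        // core clock the driver was written for
  std::uint32_t writeCycleNs;  // WR strobe high time + low time
};

const DriverTiming kTimings[] = {
  { 24, 160 },  // Gpio16BitAccessMode_24_80_80
  { 48,  84 },  // Gpio16BitAccessMode_48_42_42
  { 72, 100 },  // Gpio16BitAccessMode_72_50_50
};

// Returns the nominal write cycle for a core clock, or 0 if none is listed.
std::uint32_t nominalWriteCycleNs(std::uint32_t mcuMHz) {
  for (const DriverTiming& t : kTimings) {
    if (t.mcuMHz == mcuMHz) {
      return t.writeCycleNs;
    }
  }
  return 0;
}

// A driver is usable when its write cycle is at least as long as the
// minimum write cycle given in the display controller's datasheet.
bool canDrive(std::uint32_t mcuMHz, std::uint32_t controllerMinCycleNs) {
  assert(controllerMinCycleNs > 0);
  const std::uint32_t cycle = nominalWriteCycleNs(mcuMHz);
  return cycle != 0 && cycle >= controllerMinCycleNs;
}
```

With these figures canDrive(48, 60) holds for the R61523 while canDrive(48, 100) fails for the ILI9325, which matches the conclusion above.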

  • Yes that's right. If you wanted to use the accelerated GPIO driver with a 100ns limited controller on the F0 then a new specialisation could be written that adds the appropriate delays. For example taking Gpio16BitAccessMode_48_42_42 as a start and doubling up each 'str' instruction that writes to the [wr] line would double the write cycle to 84 high, 84 low = 168ns. OK, but we can do better…

    Another option to get a 100ns cycle is to vary the system clock. If you ran the MCU at 40MHz then using Gpio16BitAccessMode_48_42_42 would actually work out at 100ns spot on. To do this you'd need to change RCC_CFGR_PLLMULL12 in System.c:
    https://github.com/andysworkshop/stm32plus/blob/m

    to RCC_CFGR_PLLMULL10 to get (HSI / 2) * 10 = 40MHz (HSI = 8MHz) for the PLL which is then used as the core clock.
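
The arithmetic behind that suggestion can be checked with a short sketch. The function names are hypothetical, and the figure of 4 core cycles per write (2 high, 2 low) is an assumption inferred from the roughly 83ns cycle at 48MHz rather than something taken from the driver source.

```cpp
#include <cassert>
#include <cstdint>

// PLL output when the PLL is fed from HSI / 2: (HSI / 2) * multiplier.
std::uint32_t pllClockHz(std::uint32_t hsiHz, std::uint32_t multiplier) {
  assert(multiplier >= 2 && multiplier <= 16);  // range of the F0 PLL multiplier
  return (hsiHz / 2) * multiplier;
}

// Write-cycle time in picoseconds for a driver that spends `cycles`
// core clocks on each write (assumed to be 4 for the 48_42_42 driver).
std::uint64_t writeCyclePs(std::uint32_t coreHz, std::uint32_t cycles) {
  assert(coreHz > 0);
  return (static_cast<std::uint64_t>(cycles) * 1000000000000ULL) / coreHz;
}
```

Under that assumption, pllClockHz(8000000, 10) gives 40MHz and writeCyclePs(40000000, 4) gives 100000ps, i.e. the 100ns cycle described above.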

  • Robert S

    Cool! After my last post, I actually created a "Gpio16BitAccessMode_48_83_83" file with 83ns high / 83ns low (rounding put it closer to 83ns than 84 ns 🙂 ) by doubling the "str" instructions, basically copying the approach from the Gpio16BitAccessMode_64K_72_50_50 file.
    I put it here (http://pastebin.com/Eyqua9SU) if someone needs it. I haven't tested it yet but there is not so much that could go wrong (do a diff vs. the file Gpio16BitAccessMode_64K_48_42_42.h to see all the changes, there aren't many).

    But of course it is smarter to vary the system clock, didn't figure that out, now I have new stuff to try 🙂

  • Stephane

    I find your library really wonderful. Unfortunately, our project is using the STM32F427 and this chip does not seem to be supported. I tried just to change the #define STM32F40_41xxx for #define STM32F427_437xx in stm32plus.h and stm32f4xx.h, but this creates compilation errors as, for example, the FSMC is now called FMC for these newer chips, and there are other small but important differences. Do you plan to support this chip eventually? Thanks a lot!

    • Hi Stephane,

      I haven't evaluated the 427 yet so I can't say. I'm sure the differences are limited though as the core is basically the same.

      – Andy

  • Robert S

    Hi again!
    I have another question regarding the F0-series, hope you don't mind.
    I am looking at the Flash_spi_reader example and trying to understand why the F0 series is not listed as a compatible MCU. What is the deal-breaker in this case?

    – Robert

    • Hi Robert, that particular example reads a bitmap from flash and displays it on an attached LCD. It's the LCD and its attachment via the FSMC peripheral that makes the example incompatible with the F0.

      The SPI flash functionality absolutely is compatible with the F0. In fact, in an article to be published in a week or so I will show how I use the SPI DMA channel to achieve a 24 megabit/sec sustained transfer rate out of a SPI flash device attached to the F0 in order to achieve fast interactive graphics in a GUI.

  • Markus Reinhardt

    Hi Andy,

    What is necessary to get the DAC examples dac_noise and dac_triangle running on the STM32F0discovery?
    I have tried by adding the line f051 to the compat.txt file and copied the system files for the F051
    from other examples to the system directory. Compilation succeeds but I don’t see any output signals at the STM32F0Discovery. Any hint ?
    Thanks a lot.

    Markus

    • Hi Markus,

      The DAC on the F0 does not support triangle wave or noise generation. If you take a look at the DAC_CR register in RM0091 you’ll see that the WAVEx bits that are present in the F4 have been removed and replaced with “Reserved”.

      • Markus Reinhardt

        Hi Andy,

        o.k. thank you for the explanation.
        Do you know if it is possible then to send (based on a timer driven approach) periodically a single sample that is calculated during the timer period to the DAC of the F051 without involving the DMA ?
        Thank you.

        Best Regards,

        Markus

        • Timers can trigger interrupts so if your frequency is low enough that the calculations can be done within the period then you could use the timer interrupt to write the newly calculated data to the DAC.
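
Whether the frequency is "low enough" comes down to how many core cycles fit between two timer interrupts. The sketch below estimates that budget and shows the shape of such a handler. It is written to run on a PC: fakeDacRegister, nextSample and onTimerUpdate are hypothetical stand-ins, not stm32plus or ST library names, and the sawtooth is just a placeholder for the real calculation.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the DAC data register so the sketch runs off-target.
// On the MCU this would be the DAC's 12-bit data holding register.
volatile std::uint16_t fakeDacRegister = 0;

// Core clock cycles available between two timer interrupts. The sample
// calculation plus interrupt entry and exit must fit inside this budget.
std::uint32_t cyclesPerSample(std::uint32_t coreHz, std::uint32_t sampleRateHz) {
  assert(sampleRateHz > 0);
  return coreHz / sampleRateHz;
}

// Placeholder per-sample calculation: a 12-bit sawtooth in steps of 64.
std::uint16_t nextSample() {
  static std::uint16_t phase = 0;
  phase = static_cast<std::uint16_t>((phase + 64u) & 0x0FFFu);
  return phase;
}

// Body of a hypothetical timer update interrupt handler: compute one
// sample and hand it to the DAC, with no DMA involved.
void onTimerUpdate() {
  fakeDacRegister = nextSample();
}
```

At 48MHz a 48kHz sample rate leaves cyclesPerSample(48000000, 48000) = 1000 cycles per sample, so a simple calculation should fit comfortably, while a heavy one at a high sample rate may not.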

  • Jordan Tewell

    Hi Andy

    There doesn’t seem to be a Gpio8bitAccessMode template specialisation for the F0 for LCDs that support 8-bit data buses, and I can’t use the Fsmc8BitAccessMode because the F0 doesn’t support FSMC. How difficult would it be to change the Gpio16BitAccessMode_48_42_42.h example?

    • It’s possible but it’s going to be inefficient and slow. The 16-bit modes are fast because they write a whole word to 16 pins in one instruction. To create an 8-bit implementation that doesn’t interfere with the ‘other’ 8 pins in the port you’d need to mask off the 8-bits you want to write, then write the “ones” to BSRR (low 16) then invert your 8-bits to get the “zeros” and write to BSRR (high 16). That’s quite a lot of cycles per 8-bit write.
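
The masking steps described above can be sketched in plain C++. The function name is hypothetical and this is not code from stm32plus. It relies on the BSRR layout in RM0091, where the low 16 bits set pins and the high 16 bits reset them, and it combines both halves into one 32-bit value as a possible variation on the two writes described.

```cpp
#include <cassert>
#include <cstdint>

// Builds the 32-bit value to store to a port's BSRR register so that the
// 8 pins starting at bit `shift` take on `value`. Pins outside that
// 8-bit group have neither their set nor reset bit raised, so they keep
// their current state.
std::uint32_t bsrrWordFor8Bits(std::uint8_t value, unsigned shift) {
  assert(shift <= 8);  // the 8-bit group must fit inside a 16-bit port
  const std::uint32_t ones  = static_cast<std::uint32_t>(value) << shift;
  const std::uint32_t zeros = (~static_cast<std::uint32_t>(value) & 0xFFu) << shift;
  return ones | (zeros << 16);  // low half sets pins, high half resets pins
}
```

For example, writing 0xA5 to pins 0 to 7 needs the value 0x005A00A5. The extra shift, invert, mask and OR on every write are the additional cycles that make an 8-bit mode slower than the 16-bit one.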