stm32plus::net, a C++ TCP/IP stack for the STM32

Welcome to a landmark release, version 3.0.0, of my stm32plus C++ library for the STM32F1 and STM32F4 series of microcontrollers.

This release introduces support for the ethernet MAC peripheral in the form of an object-oriented TCP/IP stack as well as support for the STM32F107 connectivity line of MCUs. Furthermore, all the source code is now available on github.com to enable easy browsing and collaborative development.

Read on for all the details.

stm32plus::net

It’s been some time now since I published the designs for my ethernet PHY for the STM32F107 based on the Micrel KSZ8051MLL.

I was naturally very pleased with the success of that design and at the same time frustrated that the only usable TCP/IP stack was LwIP. Now there’s nothing functionally wrong with LwIP: it does exactly what it sets out to do and works on a wide range of processors. My problems with it are that it’s a bad fit for a C++ design, doesn’t gel well with modern programming techniques and, as a general solution, can never take full advantage of everything the STM32 MAC has to offer.

So I decided to dig out my nineteen-year-old hardback copy of TCP/IP Illustrated and put my twenty years of commercial experience of programming against the TCP/IP protocols to use by writing a completely new TCP/IP stack consisting of all original code and designed to be object-oriented, very fast and very efficient.

Several months’ worth of weekend and evening hacking later, it’s ready. Here’s a non-exhaustive list of some of the features.

  • ARP, IPv4, UDP, TCP, ICMP, DHCP, DNS, LLIP support.
  • KSZ8051MLL and DP83848C PHY support included in MII/RMII mode.
  • IP large packet fragmentation/reassembly support.
  • IP/TCP/UDP hardware checksum offload support.
  • Hardware MAC address filtering.
  • FTP and HTTP application layer examples included.
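
To give a feel for what the checksum offload item saves, here is a minimal software sketch of the 16-bit ones'-complement Internet checksum (RFC 1071) that the hardware computes when offload is enabled. The function name internetChecksum is hypothetical and is not part of stm32plus; it only illustrates the per-packet work that the CPU is spared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of stm32plus: RFC 1071 Internet checksum
// over a byte buffer, treating the data as big-endian 16-bit words.
// The 32-bit accumulator is ample for packet-sized buffers.
inline uint16_t internetChecksum(const uint8_t *data,std::size_t length) {

  assert(data != nullptr || length == 0);

  uint32_t sum = 0;

  // add up the whole 16-bit words
  while(length > 1) {
    sum += (static_cast<uint32_t>(data[0]) << 8) | data[1];
    data += 2;
    length -= 2;
  }

  // a trailing odd byte is padded with a zero byte on the right
  if(length == 1)
    sum += static_cast<uint32_t>(data[0]) << 8;

  // fold the carries back into the low 16 bits
  while(sum >> 16)
    sum = (sum & 0xFFFF) + (sum >> 16);

  // the checksum is the ones' complement of the folded sum
  return static_cast<uint16_t>(~sum & 0xFFFF);
}
```

For the eight bytes 00 01 f2 03 f4 f5 f6 f7, the worked example in RFC 1071, the folded sum is 0xddf2 and the function returns its complement, 0x220d.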

The design

A TCP/IP stack is a many-layered affair. Each layer builds on the capabilities of the one below it in order to present new functionality.

The picture shows the network stack as implemented by stm32plus. I say, as implemented by… because purists may notice that I’ve included ICMP in the transport layer and not the network layer. That’s because it requires the services of the network layer, specifically IP, to work so I’ve lifted it up into the transport layer. Really it straddles both transport and network layers.

Declarative implementation

A design goal was to emulate the operation of the layers as closely as possible and to allow the user to pick and choose the protocols required at compile-time. If there are any incompatibilities or missing components then the program should fail to compile.

I decided to use the C++11 variadic template feature to model the stack. Each layer has its own variadic template that you can configure with the components for that layer and have it link automatically to the layers below.

For example, the Transport layer template looks like this:

template<class TNetworkLayer,template<class> class... Features>
class TransportLayer : public virtual TNetworkLayer,
                       public Features<TNetworkLayer>... {

Don’t worry if the template syntax makes your head spin, that’s a normal reaction! What’s achieved here is that we pull in all the transport layer components and give them access to the network layer as well as linking the transport layer directly to the network layer.

All the layers follow this pattern and when put together we achieve a complete stack modelled in code just like the theoretical diagram.
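
To see the pattern working in isolation, here is a minimal, self-contained sketch with one stand-in network layer and two stand-in transport features. Every name beginning with Sketch is invented for this illustration and none of it is the real stm32plus code; the assumption that each feature inherits virtually from the layer below is made so that all the features share a single copy of that layer.

```cpp
#include <string>

// Stand-in network layer: the only service it offers is "send a packet".
class SketchNetworkLayer {
  public:
    std::string lastPacket;

    bool ipSend(const std::string& payload) {
      lastPacket = payload;
      return true;
    }
};

// Two transport "features". Each inherits virtually from the network layer
// so that all the features share one copy of it.
template<class TNetworkLayer>
class SketchUdp : public virtual TNetworkLayer {
  public:
    bool udpSend(const std::string& data) {
      return this->ipSend("UDP:" + data);
    }
};

template<class TNetworkLayer>
class SketchTcp : public virtual TNetworkLayer {
  public:
    bool tcpSend(const std::string& data) {
      return this->ipSend("TCP:" + data);
    }
};

// Same shape as the TransportLayer template: inherit from the layer below
// and from every feature, each instantiated with that layer.
template<class TNetworkLayer,template<class> class... Features>
class SketchTransportLayer : public virtual TNetworkLayer,
                             public Features<TNetworkLayer>... {
};

typedef SketchTransportLayer<SketchNetworkLayer,SketchUdp,SketchTcp> SketchTransport;
```

An object of type SketchTransport has udpSend, tcpSend and ipSend all available as ordinary members, and both features write to the same lastPacket because there is only one SketchNetworkLayer sub-object.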

From the user’s point of view, configuring a stack is as simple as selecting the components that you want and then creating types for them. Here’s an example taken from one of the many sample applications.

typedef PhysicalLayer<DP83848C> MyPhysicalLayer;
typedef DatalinkLayer<MyPhysicalLayer,DefaultRmiiInterface,Mac> MyDatalinkLayer;
typedef NetworkLayer<MyDatalinkLayer,DefaultIp,Arp> MyNetworkLayer;
typedef TransportLayer<MyNetworkLayer,Udp,Tcp> MyTransportLayer;
typedef ApplicationLayer<MyTransportLayer,DhcpClient> MyApplicationLayer;
typedef NetworkStack<MyApplicationLayer> MyNetworkStack;

No code is generated here, we’re just creating types that will define the stack. Actually declaring the stack is really easy:

MyNetworkStack::Parameters params;
MyNetworkStack stack;

if(!stack.initialise(params))
  error();

// start the ethernet MAC Tx/Rx DMA channels
// this will trigger the DHCP transaction

if(!stack.startup())
  error();

Note the declaration of MyNetworkStack::Parameters. Every module in the stack has the option of exposing configuration parameters that you can change at runtime if the defaults do not meet your requirements. The MyNetworkStack::Parameters structure is assembled at compile time, using inheritance, from the modules in your stack, so you never see options that are not relevant to you and memory usage is kept to a minimum.
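
One way such a Parameters structure can be composed is sketched below. The module names, the option names and their default values are all assumptions made for this illustration; the real stm32plus modules expose different options.

```cpp
#include <cstdint>

// Illustrative modules only. Each one publishes its own nested Parameters
// type whose constructor sets the defaults.
struct SketchIp {
  struct Parameters {
    uint8_t ip_initialTtl;
    Parameters() : ip_initialTtl(64) {}
  };
};

struct SketchTcp {
  struct Parameters {
    uint16_t tcp_receiveBufferSize;
    Parameters() : tcp_receiveBufferSize(1024) {}
  };
};

// The stack's Parameters type inherits from the Parameters of every
// configured module, so it contains exactly those options and no others.
template<class... Modules>
struct SketchStack {
  struct Parameters : Modules::Parameters... {
  };
};

typedef SketchStack<SketchIp,SketchTcp> TcpStack;   // has ip_ and tcp_ options
typedef SketchStack<SketchIp> IpOnlyStack;          // has only ip_ options
```

With these types, TcpStack::Parameters has both members and you can override either default before initialising the stack, whereas any reference to tcp_receiveBufferSize through IpOnlyStack::Parameters is a compile error because that member does not exist there.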

Inter-communication within the stack

Any stack module can expose protected or public methods and have them made available to higher layers. By convention I expose some such methods that can be considered well known and not subject to change.

Generally they will be prefixed with the name of the module. For example, the TCP module exposes some methods beginning with tcp that upper layers can use. If your application were to call the tcpConnect method without the TCP module configured into the stack then it would fail to compile. That’s what we want: compile-time errors are infinitely preferable to runtime errors.
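
The effect can be demonstrated with a small sketch. The Sketch types and the HasTcpConnect trait below are hypothetical and exist only to make the point visible with static_assert; in a real application you need no trait at all because the call to a missing method simply does not compile.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

struct SketchNetwork {};

template<class TNetwork>
struct SketchUdp : virtual TNetwork {
  bool udpSend(const void *,uint32_t) { return true; }
};

template<class TNetwork>
struct SketchTcp : virtual TNetwork {
  bool tcpConnect(uint32_t /* address */,uint16_t /* port */) { return true; }
};

template<class TNetwork,template<class> class... Features>
struct SketchTransport : virtual TNetwork, Features<TNetwork>... {};

// Detect at compile time whether T offers tcpConnect(uint32_t,uint16_t).
template<class T>
class HasTcpConnect {
    // chosen when the call expression is well-formed
    template<class U>
    static auto test(int)
      -> decltype(std::declval<U&>().tcpConnect(uint32_t(),uint16_t()),std::true_type());

    // fallback when substitution above fails
    template<class U>
    static std::false_type test(...);

  public:
    static const bool value = decltype(test<T>(0))::value;
};

typedef SketchTransport<SketchNetwork,SketchUdp> UdpOnly;
typedef SketchTransport<SketchNetwork,SketchUdp,SketchTcp> UdpAndTcp;

static_assert(HasTcpConnect<UdpAndTcp>::value,"TCP configured: tcpConnect is available");
static_assert(!HasTcpConnect<UdpOnly>::value,"TCP not configured: tcpConnect does not exist");
```

Writing UdpOnly().tcpConnect(address,port) in this sketch is rejected by the compiler with a "no member named tcpConnect" style of error, which is the behaviour described above.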

That’s all very well for downward communication but what about upward and lateral communication? For example when an ethernet frame arrives at the MAC we need to pass it up the stack through the layers. We do that using a signal/slot implementation based on Don Clugston’s Fastest Possible C++ Delegates. I was so impressed by these delegates that they have now completely replaced the Observable/Observer pattern that used virtual functions throughout stm32plus.

Many modules expose events that you can subscribe to. Some of the events are internal but some of them are quite useful, particularly the error notification and frame reception events. For example, the asynchronous UDP receiver example subscribes to errors and UDP packet notifications.

// subscribe to error events from the network stack

_net->NetworkErrorEventSender.insertSubscriber(
    NetworkErrorEventSourceSlot::bind(this,&NetUdpReceiveAsyncTest::onError)
  );

// subscribe to incoming datagrams from the UDP module

_net->UdpReceiveEventSender.insertSubscriber(
    UdpReceiveEventSourceSlot::bind(this,&NetUdpReceiveAsyncTest::onReceive)
  );

Buffer handling within the stack

The stack goes to great lengths to avoid wasting cycles by copying memory buffers around. Outgoing data is transmitted in-place from the buffer that you supply and incoming data is passed up the stack with zero copying. In the UDP subscriber example above, the delegate will be called with a data pointer that points directly into the MAC’s receive buffer.
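
The idea of handing out a pointer into the receive buffer rather than a copy can be sketched as follows. SketchDatagramView and SketchReceiveBuffer are hypothetical types invented for this illustration, and the fixed header size assumes an untagged Ethernet frame carrying IPv4 without options.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical view type: a pointer and a length, never an owning copy.
struct SketchDatagramView {
  const uint8_t *data;
  std::size_t length;
};

// Stand-in for a DMA receive buffer holding one Ethernet frame that
// carries a UDP datagram.
struct SketchReceiveBuffer {
  enum { HEADER_BYTES = 14 + 20 + 8 };    // Ethernet + IPv4 + UDP headers

  uint8_t bytes[128];
  std::size_t frameLength;

  // Hand out a view of the payload in place; nothing is copied.
  SketchDatagramView payload() const {
    assert(frameLength >= HEADER_BYTES && frameLength <= sizeof(bytes));
    SketchDatagramView view = { bytes + HEADER_BYTES, frameLength - HEADER_BYTES };
    return view;
  }
};
```

The practical consequence of this style is that the view is presumably only valid for the duration of the callback, so a subscriber that needs the data later should copy out the bytes it wants before returning.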

Observer out, slots and delegates in

Previous versions of stm32plus used the Observer pattern whenever the library had to call back to you. While this did work well it was not the most efficient design, did not have type-safe callback parameters and it forced your implementation class to have a vtable so you could implement the virtual onNotify callback method.

So, starting with version 3.0.0 of stm32plus the Observer pattern has been replaced by a type-safe, high-performance signal/slot implementation. As you can see in the above code sample you call the insertSubscriber method to add your callback method. There is a corresponding removeSubscriber call for de-registering your callback.
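
For readers curious about how a callback can be type-safe without virtual functions, here is a deliberately simplified sketch of the technique. It is not Don Clugston’s implementation and not the stm32plus classes: the SketchSlot and SketchEventSender names, the bind syntax (which takes the method as a template argument) and the use of std::vector are all assumptions made to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// A slot is an object pointer plus a stub function that knows how to call
// one particular member function on that object.
template<class TArg>
class SketchSlot {
  public:
    template<class T,void (T::*Method)(TArg)>
    static SketchSlot bind(T *object) {
      SketchSlot slot;
      slot._object = object;
      slot._stub = &invoke<T,Method>;
      return slot;
    }

    void operator()(TArg arg) const { _stub(_object,arg); }

    bool operator==(const SketchSlot& rhs) const {
      return _object == rhs._object && _stub == rhs._stub;
    }

  private:
    template<class T,void (T::*Method)(TArg)>
    static void invoke(void *object,TArg arg) {
      (static_cast<T *>(object)->*Method)(arg);
    }

    void *_object;
    void (*_stub)(void *,TArg);
};

template<class TArg>
class SketchEventSender {
  public:
    void insertSubscriber(const SketchSlot<TArg>& slot) { _slots.push_back(slot); }

    void removeSubscriber(const SketchSlot<TArg>& slot) {
      for(std::size_t i = 0; i < _slots.size(); i++) {
        if(_slots[i] == slot) {
          _slots.erase(_slots.begin() + i);
          return;
        }
      }
    }

    void raiseEvent(TArg arg) const {
      for(std::size_t i = 0; i < _slots.size(); i++)
        _slots[i](arg);
    }

  private:
    std::vector<SketchSlot<TArg> > _slots;
};

// A subscriber needs no base class and no virtual functions.
struct SketchCounter {
  int total;
  SketchCounter() : total(0) {}
  void onValue(int value) { total += value; }
};
```

A subscriber is registered with sender.insertSubscriber(SketchSlot<int>::bind<SketchCounter,&SketchCounter::onValue>(&counter)) and the compiler rejects any method whose parameter type does not match the event’s argument type.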

Error Handling

stm32plus::net follows the stm32plus convention of returning false from a method to indicate failure and setting the values in the global errorProvider instance of the ErrorProvider class to indicate the source and reason for failure.
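
The shape of that convention is sketched below with a stand-in class. The member names, the set method and the numeric codes are assumptions made for this illustration and should not be read as the real ErrorProvider interface.

```cpp
#include <cstdint>

// Stand-in for the global error holder.
struct SketchErrorProvider {
  uint32_t source;    // which module failed
  uint32_t reason;    // module-specific reason code

  SketchErrorProvider() : source(0), reason(0) {}

  // record the failure and return false so callers can "return set(...)"
  bool set(uint32_t failedSource,uint32_t failureReason) {
    source = failedSource;
    reason = failureReason;
    return false;
  }
};

static SketchErrorProvider sketchErrorProvider;

enum { SKETCH_SOURCE_UDP = 1 };
enum { SKETCH_REASON_DATAGRAM_TOO_BIG = 1 };

// Example method following the convention: true on success, false on
// failure with the details left in the global instance.
static bool sketchUdpSend(uint32_t length) {

  // 65507 is the largest UDP payload that fits in one IPv4 datagram
  if(length > 65507)
    return sketchErrorProvider.set(SKETCH_SOURCE_UDP,SKETCH_REASON_DATAGRAM_TOO_BIG);

  return true;
}
```

A caller tests the return value and, only when it is false, inspects the global instance to find out which module failed and why.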

In addition to this stm32plus::net will raise an error event that you can subscribe to in order to get asynchronous notification of failures. The main reason for this additional feature is that stm32plus::net does a considerable amount of work in the background for you and it needs a way to report a failure. For example, an automatically generated ICMP echo reply may fail to be sent and without the event reporting method this failure would be lost.

Here’s how to subscribe to failure events in the stack. Several of the examples do this:

// subscribe to error events from the network stack

_mystack->NetworkErrorEventSender.insertSubscriber(
  NetworkErrorEventSourceSlot::bind(
            this,
            &NetUdpSendTest::onError));

Your class method implementation of onError might look like this:

void onError(NetEventDescriptor& ned) {

  NetworkErrorEvent& errorEvent(
     static_cast<NetworkErrorEvent&>(ned)
   );

  // do something with the NetworkErrorEvent
}

You should be aware that, depending on the source of the error, your method may be called within the context of an IRQ. stm32plus provides a static method Nvic::isAnyIrqActive() that can be used to detect whether you are in an IRQ context.
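
One common way to cope with being called in an IRQ context is to record the error and deal with it later from the main loop. The sketch below assumes that approach. SketchErrorHandler is a hypothetical class, and the sketch::Nvic type is a host-side stand-in with a simulated flag so that the example is self-contained; on the target you would call the real Nvic::isAnyIrqActive() instead.

```cpp
#include <cstdint>

// Host-side stand-in for the stm32plus Nvic class, for illustration only.
namespace sketch {
  struct Nvic {
    static bool irqActive;                          // simulated state
    static bool isAnyIrqActive() { return irqActive; }
  };
  bool Nvic::irqActive = false;
}

// Do the minimum when called from an IRQ and leave the slow work
// (logging, reconnecting and so on) to the main loop.
class SketchErrorHandler {
  public:
    SketchErrorHandler() : _pending(false), _lastCode(0), _handled(0) {}

    void onError(uint32_t code) {
      if(sketch::Nvic::isAnyIrqActive()) {
        _lastCode = code;       // just remember it
        _pending = true;
      }
      else
        process(code);          // safe to handle immediately
    }

    // call regularly from the main loop
    void poll() {
      if(_pending) {
        _pending = false;
        process(_lastCode);
      }
    }

    uint32_t handledCount() const { return _handled; }

  private:
    void process(uint32_t /* code */) { _handled = _handled + 1; }

    volatile bool _pending;
    volatile uint32_t _lastCode;
    uint32_t _handled;
};
```

This sketch keeps only the most recent error, so a second error arriving before poll() runs overwrites the first. A real application that cares about every error would use a small queue instead.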

Module documentation

Each of the modules in the stack has its own options and methods. This section details them all.

  • Application layer: HTTP, FTP, DNS, DHCP, LLIP, Static IP
  • Transport layer: TCP, UDP, ICMP
  • Network layer: ARP, IPv4
  • Datalink layer: MAC
  • Physical layer: PHY

Example applications

stm32plus::net ships with examples that cover all aspects of using the network stack. Here’s a list, along with a link to the source code in github.

  • net_dhcp: Demonstrates the use of the DHCP client to fetch your IP address, subnet mask, default gateway and DNS servers.
  • net_dns: This example demonstrates the use of the DNS client to look up a host name on the internet. In this example we will look up “www.google.co.uk”. After obtaining an IP address and our DNS servers via DHCP this example will perform the DNS lookup.
  • net_ftp_server: This demo brings together a number of the stm32plus components, namely the network stack, the RTC, the SD card and the FAT16/32 filesystem to build a simple ftp server that listens on port 21.
  • net_llip: This example demonstrates the use of the Link Local IP client to automatically select an unused IP address from the “link local” class B network: 169.254/16. Link-local addresses can be used in a scenario where a DHCP server is not available, such as when a number of computers are directly connected to each other.
  • net_ping_client: This example demonstrates the ICMP transport by sending echo requests (pings) to a hardcoded IP address (change it to suit your network).
  • net_tcp_client: This example demonstrates a TCP ‘echo’ client. It will attempt to connect to a server on a remote computer and send it a line of text. The server will read that line of text and then send it back in reverse. An example server, written in perl, is included in this example code directory.
  • net_tcp_server: This example demonstrates a TCP ‘echo’ server. Telnet to this server and type lines of text at it to see them echo’d back. Maximum 100 characters per line, please. Multiple simultaneous connections are supported up to the configured maximum per server.
  • net_udp_receive: This example demonstrates how to receive UDP packets from a remote host. After obtaining an IP address via DHCP this example will wait for UDP datagrams to arrive on port 12345. When a datagram arrives it will print the first 10 bytes to USART #3.
  • net_udp_receive_async: This example demonstrates how to receive UDP packets from a remote host. After obtaining an IP address via DHCP this example will wait for UDP datagrams to arrive on port 12345. When a datagram arrives it will print the first 10 bytes to USART #3. The reception is done asynchronously via a subscription to an event provided by the network stack’s UDP module.
  • net_udp_send: This example demonstrates how to send UDP packets to a remote host. After obtaining an IP address via DHCP this example will send three 2Kb UDP packets to a remote host every 5 seconds. The target IP address is hardcoded into this example code and you can change it to fit your network configuration.
  • net_web_client: This example shows how to use the HttpClientConnection to retrieve an HTTP resource. In this example we will connect to http://www.st.com and ask for the root document. We will write the response to the USART.
  • net_web_pframe: This example demonstrates a cycling ‘picture frame’ of JPEG images downloaded from the internet and displayed on the attached LCD screen. The images are pre-sized to fit the QVGA screen and are located in a directory on my website.
  • net_web_server: This demo brings together a number of the stm32plus components, namely the network stack, the RTC, the SD card and the FAT16/32 filesystem to build a simple web server that listens on port 80.
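
As background to the net_llip example, the sketch below shows one way a candidate link-local address could be derived. It is not the stm32plus implementation: the function name and the hash used for mixing are assumptions. What comes from RFC 3927 is the usable range, which excludes the first and last 256 addresses of 169.254/16 and so runs from 169.254.1.0 to 169.254.254.255.

```cpp
#include <cstdint>

// Hypothetical helper: derive a candidate link-local address, in host byte
// order, from the MAC address and an attempt counter. Seeding from the MAC
// makes different hosts likely to pick different addresses.
inline uint32_t sketchLinkLocalCandidate(const uint8_t mac[6],uint32_t attempt) {

  // FNV-1a style mixing of the MAC bytes and the attempt number
  uint32_t hash = 2166136261u;
  for(int i = 0; i < 6; i++)
    hash = (hash ^ mac[i]) * 16777619u;
  hash = (hash ^ attempt) * 16777619u;

  // 169.254.1.0 is the first usable address; 65024 addresses follow it
  const uint32_t firstUsable = (169u << 24) | (254u << 16) | (1u << 8);
  return firstUsable + (hash % 65024u);
}
```

A candidate produced this way still has to be checked for conflicts with ARP probes before it is used, and a new candidate (for example with the next attempt number) chosen if another host already has it.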

Preferred toolchain

The CodeSourcery EABI arm-2012.09 release is the minimum supported toolchain. Other toolchain providers may work but I cannot provide any support for them. At a bare minimum the following requirements must be met by a toolchain:

  1. C++11 support to a level compatible with gcc 4.7.x.
  2. Support for a ‘locking’ callback for the malloc() libc call. All toolchains built around ‘newlib’ support this through __malloc_lock() and __malloc_unlock(). See the LibraryHacks.cpp file in any of the network examples. This is very important.
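
To illustrate the second requirement, here is a minimal sketch of what the two newlib hooks can look like. It is not the content of LibraryHacks.cpp: the sketchDisableIrq and sketchEnableIrq functions are hypothetical stand-ins that only set a flag so the example is self-contained, where a Cortex-M target would typically use the CMSIS __disable_irq() and __enable_irq() intrinsics.

```cpp
#include <cstdint>

struct _reent;    // declared by newlib; only used as an opaque pointer here

// Hypothetical stand-ins for "disable/enable interrupts".
static bool sketchIrqsDisabled = false;
static void sketchDisableIrq() { sketchIrqsDisabled = true; }
static void sketchEnableIrq() { sketchIrqsDisabled = false; }

static volatile uint32_t sketchLockDepth = 0;

// newlib calls these around every heap operation. The depth is counted
// because the lock may be taken recursively.
extern "C" void __malloc_lock(struct _reent *) {
  sketchDisableIrq();
  sketchLockDepth = sketchLockDepth + 1;
}

extern "C" void __malloc_unlock(struct _reent *) {
  sketchLockDepth = sketchLockDepth - 1;
  if(sketchLockDepth == 0)
    sketchEnableIrq();
}
```

The point of the hooks is that an interrupt handler which allocates memory can never run while the main code is halfway through a heap operation. Note that this sketch always re-enables interrupts on the final unlock; an implementation that may be called with interrupts already disabled would save and restore the previous state instead.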

Test systems

I use two test boards to verify the net code, both of which are pictured above.

The first is the WaveShare Port107V which is an STM32F107VCT6-based board. The PHY is a Micrel KSZ8051MLL mounted on a development board that I designed myself. The KSZ8051MLL operates in MII mode.

The second board is a daughter-board for the STM32F4DISCOVERY that I picked up on ebay. You slot the discovery board into it and immediately gain access to a number of additional peripherals including an SDIO slot, a USART port, a TI DP83848C ethernet PHY, and a QVGA LCD screen attached to the FSMC. The DP83848C runs in RMII mode.

If you’re thinking of buying one of those daughterboards for yourself then generally I would recommend it as most of the peripherals such as the QVGA screen (ST7783 driver) and the ethernet PHY ‘just work’. However there are some design decisions that I consider to be poor that you should be aware of:

  • The SDIO data and control lines are not pulled up to 3V3. This means that using the SDIO interface is impossible unless you add external pullups (I use 10K). The pins that you need to pull up are PC8,PC9,PC10,PC11 and PD2.
  • The USART is hardwired to USART 3 using PC10/11 as TX/RX. These clash with SDIO D2 and D3 so you can have SDIO or USART but not both at the same time. A jumper is provided to choose. SDIO cannot be remapped, but there are four USARTs and two UARTs with a myriad of remapping possibilities so I find it frankly bizarre that the designers chose a pin-pair that clashes with an unremappable peripheral.
  • The ST7783 QVGA LCD has an SPI resistive touchscreen. The touchscreen inputs are mapped to exactly none of the three available SPI peripherals on the F4. Doh!

Get the source code from github

As of version 3.0.0 you can now find all the source code on github.com. If you’re interested in extending the library or just curious as to how it works then please feel free to get involved.

If you don’t have or want to use the git client then you can download one of the releases as a zip or tar.gz file from github.com.

Watch the video

Networks are not exactly the most photogenic of subjects, after all they’re just a bunch of wires and connectors. I’ll spare you the dubious pleasure of a presentation showing Wireshark captures and instead regale you with a short video showing the net_web_pframe example running on the STM32F4 Discovery board.

The example streams JPEG images from my website direct to the LCD panel and then pauses for 10 seconds before getting the next one.

License change

This new release is now licensed under the terms of the Apache License, version 2. Previous versions used the BSD license and the reason for the change is primarily the migration of the source code to Github. The Apache license preserves all the rights that the BSD license conveyed as well as formally recognising and protecting the role of the contributor and it includes protection against patent abuse.

  • STM324ME

    I've gotten several of the example programs to run on my WaveShare Open407I-C board with minor pin changes. In particular the network code appears to work well with the DP83848 ethernet adapter in RMII mode.

    I've noticed that stm32plus does not have support for USART6.

    Are there any plans for USB support? I'm interested in USB host mode so I can use USB keyboards and flash drives.

    • Good to hear that it works on the Open407. The DP83848C is included on the F4 expansion board that I have so it is one of my test cases.

      USART6 is supported on the F4 and you'll get access to it by including "config/usart.h".

      Yes there are plans for USB support, I have a code branch created locally here to work on it. It won't be ready for some time though. ST did a thorough job on supporting the various modes and speeds so it's going to be a big piece of work, similar in size to the Ethernet support I think.

  • Michael

    Will the net_web_server example also run on the F1? The compat file says it runs only on F4.

    • Hi Michael,

      The web server (and ftp server) examples both require the SDIO peripheral to provide access to the sites that they host on the SD card therefore they won't work on the F107 because that device does not support SDIO.

  • Knut Jelitto

    Hi Andy!

    Great work so far. My question: are there any fundamental difficulties in extending the configuration to something like STM32PLUS_F1_MD? I'd like to use your framework for my Olimex H103 (STM32F103RB) but you placed guards all over the place in the stdperiph files. So far I only have a coarse overview of the implications.

    Thx
    Knut

    • Hi, I don't think there would be any real difficulty, just a lot of configuration updates. I'd start by basing it on the HD and seeing what doesn't compile due to missing peripherals (e.g. SDIO, FSMC). Also all the examples would need to have their compat.txt files updated to exclude examples for peripherals that don't exist and also each example would need to get an entry in the 'system' folder that has startup code and a linker script for the MD device.

  • dienbk

    Hi Andy,
    Please implement the ADC feature in your next release. Thank you!

    • Hi there, yes the ADC is high on the list of desirable features.

  • Frank

    Hi Andy,

    May I ask if you will include a mDNS device discovery service?
    Regards,
    Frank

    • Hi Frank, mDNS is not there at the moment but it looks so simple to implement the multicast query and answer services as application layer modules that I'll probably do it soon.

  • Henrik

    Wow, this library is very well designed and a new fresh approach. I like it.

    – Would it be possible to integrate with an RTOS (for example ARM's free CMSIS-RTOS RTX)? Looking at the net_tcp_client's run method, it would be great to be able to run multiple of those concurrently so we can connect to different servers.
    – Do you intend to migrate to ST's new HAL drivers rather the older Standard Peripheral Library?
    – Does the stack support delayed acknowledgement and retransmission?
    – Is there a forum for it where discussion can take place?

    • Hi Henrik,

      The library was always designed as a bare-metal solution rather than for RTOS integration but on the other hand it doesn't require a "main loop" and can do everything asynchronously so there's no clash of designs with respect to an RTOS although in one or two places I do a global interrupt disable/enable to ensure data structure consistency. There's no hard limit on the number of open TCP connections, you can have as many open as performance/resources allow.

      I haven't looked at ST's new drivers yet, that's something to add to the list. I'm always a bit wary of early versions of anything though.

      The stack supports retransmission, that's taken care of when you send data because we sync up with the peer's acks after we've pushed out the maximum amount of data we can. Delayed acks (a.k.a. Nagle) are not supported. In reality during testing the biggest problem was being swamped with data by a much faster peer and then ending up in silly window syndrome, which really killed the performance. I've designed it now to avoid, as much as it can, the situation where it advertises a zero receive window.

      I don't run a forum anywhere I'm afraid, sorry about that.

  • Mario

    Any plans to get the LAN8710A PHY chip supported? Or can you provide some guidelines on how to port the library to a different PHY chip?

    • Hi Mario,

      I don’t have plans to support it myself but it should be an easy addition if you wanted to do it. Just take a copy of one of the existing 3 phy implementations and edit where necessary. Since many of the registers are standardised you probably won’t have to change much. I’ll help you out if you get stuck.

  • Jan Mrázek

    Hi,
    I’m trying to use your library in CooCox or EmBlocks IDE. I use gcc-arm-none-eabi. Whenever I compile a project, I get error: “libincludestl/string: In member function ‘void std::_String_base::_M_throw_length_error() const’:
    libincludestl/string:242:42: error: there are no arguments to ‘length_error’ that depend on a template parameter, so a declaration of ‘length_error’ must be available [-fpermissive]”

    I also get a similar error with out_of_range. It seems like exception library wasn’t defined… Do you have any ideas how to fix it?

    • Hi Jan, C++ exceptions are not supported by stm32plus.

      • Jan Mrázek

        Thanks a lot! Adding -fno-exceptions helped.

  • George

    Hello,
    This lib is great work. My question: How can I change or preview GPIO pinout for RMII transmision between PHY and microcontroller (I am using STM32F4Discovery board, DP83848 phy, and trying to run udp_send example)

  • Isaque

    Can I use your library in F2 MCUs?

    • I don’t think so. There’s no official support for the F2 so I doubt you’d get much value from the library because I would expect the fundamentals such as GPIO and DMA would be different.

      • Carl Jacobs

        This may be a bit simplistic, but in my experience STM32F4 = STM32F2 + FPU. The behaviour of the peripherals and their registers is virtually identical – as far as I can tell.

        • In my experience the more recent STM32s (F2/3/4/0) are very similar in areas where the peripherals must implement a shared standard. So SPI, USART etc. are very close across the range. Oddly I2C is one where they can’t seem to stop tinkering, perhaps because of all the negative feedback they got from the F1 implementation.

          The major differences I find are in their own proprietary areas. AFIO on the GPIO pins and the DMA peripheral have always been the ones to give me the most work in creating a unified interface.

          I haven’t yet inspected the F2, it would be nice to find that it is indeed a reduced F4. If there’s enough demand I may yet revisit the F2. To be honest, nearly everyone that contacts me seems to be using the F4 and in particular the discovery board.

  • Benni

    Hey Andy, nice solution you got there. I do have a question though: in the examples listed, like net_udp_receive, I don’t find any subscription that binds the lower layers to the upper layers. Did I miss something? Is there a more detailed description for that? Thank you!

    • If you look inside the implementation, for example in Udp.h you will see an initialise() method where the UDP implementation (transport layer) subscribes to events sent by the IP layer (network layer). This works because each protocol implementation in a layer is templated with, and inherits from, the layer below it and since each layer template is a variadic template parameterised with, and inheriting from, its feature classes then all public/protected members in lower layers are in the base hierarchy of the upper layers.

      e.g. Udp.h inherits from your network layer implementation. Your network layer implementation is a variadic template parameterised with and inheriting from Ip.h. Therefore the event sources in Ip.h are available by inheritance to Udp.h. And so on throughout the layers.