Debugging AVR dynamic memory allocation
The avr-libc port of the standard C library for gcc supports dynamic memory allocation through malloc() and free(). These functions (particularly when used to implement the new and delete C++ operators) allow you to use more memory throughout the life of your program as it’s needed than you could possibly statically allocate upfront.
The caveat to this is that an AVR MCU does not have much RAM available for you to use. Memory leaks due to poor coding have always been bad. On a system with only a few kilobytes of RAM memory leaks can be catastrophic. Combine that with no visual debugger and you can see how the advice that is sometimes given is to avoid dynamic memory allocation entirely.
In my view the advice to avoid is too extreme. I would say that if you don’t understand how it works or you don’t trust your ability to write leak-free code then do stay away because you’ll just run into very hard to debug problems as your codebase grows. However, for those of us that have been doing this for longer than we care to remember and know exactly what’s going on, I’m going to present a few debug functions that we can use in lieu of a real debugger.
How dynamic memory allocation is implemented
It’s all in a couple of hundred lines of well commented code in avr-libc, the file location is libc/stdlib/malloc.c. I encourage you to read it, but here’s the gist.
The amount of memory available for dynamic allocation is the difference between __malloc_heap_end and __malloc_heap_start. You can customize these values, for example to put the heap in external RAM if you’re using an AVR that has an external SRAM chip mapped into the address space. As memory is free’d up it is accounted for in a data structure called the free list.
When you call malloc this is how it tries to satisfy your request:
- Very small size requests are rounded up to an internal minimum size.
- An exact match is sought from the free list, if one is found then that’s used.
- If the free list contains blocks larger than the request then the smallest of those blocks will be used and the remainder kept in the free list as a new, smaller entry.
- A fresh block is taken from the heap as long as there is space between the current heap end and the stack pointer.
- Memory is full, fail and return NULL.
It’s worth discussing what happens during a call to free() so we can see how the library reduces fragmentation of the heap.
- If there is no free list then one might be created with this deallocation on it, unless this is the last allocated entry in which case there is no need for a free list and the end-of-heap pointer is just reduced instead.
- The correct location in the free list is found and compaction occurs, that is adjacent free memory blocks are combined into one.
- If the compaction results in a new topmost chunk then the end-of-heap marker is reduced accordingly.
Debugging functions
Now that we know how memory allocation works we can provide some useful functions that you can use in your code to monitor how you are using memory.
memdebug.h
/* * memdebug.h * * Created on: 15 Dec 2010 * Author: Andy Brown * * Use without attribution is permitted provided that this * header remains intact and that these terms and conditions * are followed: * * http://andybrown.me.uk/ws/terms-and-conditions */ #ifdef DEBUG #ifdef __cplusplus extern "C" { #endif #include <stddef.h> extern size_t getMemoryUsed(); extern size_t getFreeMemory(); extern size_t getLargestAvailableMemoryBlock(); extern size_t getLargestBlockInFreeList(); extern int getNumberOfBlocksInFreeList(); extern size_t getFreeListSize(); extern size_t getLargestNonFreeListBlock(); #ifdef __cplusplus } #endif #endif // DEBUG
I’ve surrounded everything with a conditional test on a symbol called DEBUG. You’ll need to make sure that your project defines this if you want to use these functions and you can easily compile them out of your release builds.
memdebug.c
/* * memdebug.c * * Created on: 15 Dec 2010 * Author: Andy Brown * * Use without attribution is permitted provided that this * header remains intact and that these terms and conditions * are followed: * * http://andybrown.me.uk/ws/terms-and-conditions */ #ifdef DEBUG #include <stdlib.h> #include <avr/io.h> #include "memdebug.h" /** * This must match the definition in "stdlib_private.h" */ typedef struct __freelist { size_t sz; struct __freelist *nx; } FREELIST; extern FREELIST *__flp; extern char *__brkval; /** * Get the total memory used by your program. The total will * include accounting overhead internal to the library */ size_t getMemoryUsed() { size_t used; FREELIST *fp; // __brkval=0 if nothing has been allocated yet if(__brkval==0) return 0; // __brkval moves up from __malloc_heap_start to // __malloc_heap_end as memory is used used=__brkval-__malloc_heap_start; // memory free'd by you is collected in the free list and // compacted with adjacent blocks. This, combined with malloc's // intelligent picking of candidate blocks drastically reduces // heap fragmentation. Anyway, since blocks in the free list // are available to you at no cost we need to take them off. for(fp=__flp;fp;fp=fp->nx) used-=fp->sz+sizeof(size_t); return used; } /** * Get the total free bytes */ size_t getFreeMemory() { return (size_t)AVR_STACK_POINTER_REG- (size_t)__malloc_margin- (size_t)__malloc_heap_start- getMemoryUsed(); } /** * Get the largest available block that can be successfully * allocated by malloc() */ size_t getLargestAvailableMemoryBlock() { size_t a,b; a=getLargestBlockInFreeList(); b=getLargestNonFreeListBlock(); return a>b ? a : b; } /** * Get the largest block in the free list */ size_t getLargestBlockInFreeList() { FREELIST *fp; size_t maxsize=0; for(fp=__flp;fp;fp=fp->nx) if(fp->sz>maxsize) maxsize=fp->sz; return maxsize; } /** * Get the number of blocks in the free list */ int getNumberOfBlocksInFreeList() { FREELIST *fp; int i; for(i=0,fp=__flp;fp;fp=fp->nx,i++); return i; } /** * Get total size of free list (includes library overhead) */ size_t getFreeListSize() { FREELIST *fp; size_t size; for(size=0,fp=__flp;fp;fp=fp->nx,size+=fp->sz+sizeof(size_t)); return size; } /** * Get the largest block that can be successfully allocated * without reuse from the free list */ size_t getLargestNonFreeListBlock() { char *cp,*brkval; // this code is an adapted fragment from malloc() itself brkval=__brkval == 0 ? __malloc_heap_start : __brkval; if((cp=__malloc_heap_end)==NULL) cp=(char *)AVR_STACK_POINTER_REG-__malloc_margin; if(cp<=brkval) return 0; return cp-brkval; } #endif // DEBUG
size_t getMemoryUsed()
This returns the total number of bytes allocated so far. The total includes internal memory that the library uses for accounting purposes.
size_t getFreeMemory()
Returns the amount of free memory in the heap. Note that this does not mean that a call to malloc this amount in one chunk will succeed because some of that free memory may be tied up in smaller blocks in the free list.
size_t getLargestAvailableMemoryBlock()
The largest block of available memory for which a call to malloc() for that amount should succeed. Calling malloc for this amount would be a bad idea though because it would leave the end of the block perilously close to the current stack pointer.
Test Code
Here’s test.cpp, a one-file test program that will exercise the above functions to show how the library performs when memory allocations and deallocations are performed first sequentially and then in random order.
#include <wiring.h> #include <hardwareserial.h> #include "memdebug.h" void shuffle(void **array, int n); void showmem(); void dotest(bool shuffle); // Compatibility stub for undefined pure virtual extern "C" void __cxa_pure_virtual() { for(;;); } int main(void) { init(); Serial.begin(9600); dotest(false); dotest(true); // freeze the mcu until reset exit(0); } void dotest(bool shuffle_) { void *ptrs[20]; uint16_t i; Serial.println("allocating"); showmem(); for(i=0;i<sizeof(ptrs)/sizeof(ptrs[0]);i++) { ptrs[i]=malloc(20); showmem(); } if(shuffle_) shuffle(ptrs,sizeof(ptrs)/sizeof(ptrs[0])); Serial.println("freeing"); for(i=0;i<sizeof(ptrs)/sizeof(ptrs[0]);i++) { free(ptrs[i]); showmem(); } } void showmem() { char buffer[100]; sprintf(buffer,"%04u %04u %04u : used/free/large", getMemoryUsed(), getFreeMemory(), getLargestAvailableMemoryBlock() ); Serial.println(buffer); } int rand_int(int n) { int limit = RAND_MAX - RAND_MAX % n; int rnd; do { rnd = rand(); } while (rnd >= limit); return rnd % n; } void shuffle(void **array, int n) { int i, j; void *tmp; for (i = n - 1; i > 0; i--) { j = rand_int(i + 1); tmp = array[j]; array[j] = array[i]; array[i] = tmp; } }
When run on an ATmega328P this produces the following output.
allocating 0000 1307 1307 : used/free/large 0022 1285 1285 : used/free/large 0044 1263 1263 : used/free/large 0066 1241 1241 : used/free/large 0088 1219 1219 : used/free/large 0110 1197 1197 : used/free/large 0132 1175 1175 : used/free/large 0154 1153 1153 : used/free/large 0176 1131 1131 : used/free/large 0198 1109 1109 : used/free/large 0220 1087 1087 : used/free/large 0242 1065 1065 : used/free/large 0264 1043 1043 : used/free/large 0286 1021 1021 : used/free/large 0308 0999 0999 : used/free/large 0330 0977 0977 : used/free/large 0352 0955 0955 : used/free/large 0374 0933 0933 : used/free/large 0396 0911 0911 : used/free/large 0418 0889 0889 : used/free/large 0440 0867 0867 : used/free/large freeing 0418 0889 0867 : used/free/large 0396 0911 0867 : used/free/large 0374 0933 0867 : used/free/large 0352 0955 0867 : used/free/large 0330 0977 0867 : used/free/large 0308 0999 0867 : used/free/large 0286 1021 0867 : used/free/large 0264 1043 0867 : used/free/large 0242 1065 0867 : used/free/large 0220 1087 0867 : used/free/large 0198 1109 0867 : used/free/large 0176 1131 0867 : used/free/large 0154 1153 0867 : used/free/large 0132 1175 0867 : used/free/large 0110 1197 0867 : used/free/large 0088 1219 0867 : used/free/large 0066 1241 0867 : used/free/large 0044 1263 0867 : used/free/large 0022 1285 0867 : used/free/large 0000 1307 1307 : used/free/large allocating 0000 1307 1307 : used/free/large 0022 1285 1285 : used/free/large 0044 1263 1263 : used/free/large 0066 1241 1241 : used/free/large 0088 1219 1219 : used/free/large 0110 1197 1197 : used/free/large 0132 1175 1175 : used/free/large 0154 1153 1153 : used/free/large 0176 1131 1131 : used/free/large 0198 1109 1109 : used/free/large 0220 1087 1087 : used/free/large 0242 1065 1065 : used/free/large 0264 1043 1043 : used/free/large 0286 1021 1021 : used/free/large 0308 0999 0999 : used/free/large 0330 0977 0977 : used/free/large 0352 0955 0955 : used/free/large 0374 0933 0933 : used/free/large 0396 0911 0911 : used/free/large 0418 0889 0889 : used/free/large 0440 0867 0867 : used/free/large freeing 0418 0889 0867 : used/free/large 0396 0911 0867 : used/free/large 0374 0933 0889 : used/free/large 0352 0955 0889 : used/free/large 0330 0977 0889 : used/free/large 0308 0999 0889 : used/free/large 0286 1021 0889 : used/free/large 0264 1043 0889 : used/free/large 0242 1065 0889 : used/free/large 0220 1087 0889 : used/free/large 0198 1109 0889 : used/free/large 0176 1131 0889 : used/free/large 0154 1153 0889 : used/free/large 0132 1175 0889 : used/free/large 0110 1197 0889 : used/free/large 0088 1219 0889 : used/free/large 0066 1241 0955 : used/free/large 0044 1263 1131 : used/free/large 0022 1285 1131 : used/free/large 0000 1307 1307 : used/free/large
This shows that the library is performing as expected. Free/available memory changes in line with our actions and we can see that compactions occur during the free() calls. Obviously the second data set that deallocates randomly can achieve less compaction.
A note on re-entrancy
As it stands neither malloc() nor free() are re-entrant. That means you must not use them in interrupt handlers if they are also used in the main code and vice-versa.
The end
That’s it. I hope that you can leave with some more tools in your box that will help you to write more reliable code, and perhaps now that you understand what’s going on under the hood you’ll be more confident in your coding.