Volatile Delays

You want to write a simple delay loop on your embedded microcontroller. Let’s say you’re waiting for the crystal oscillator to stabilise before you use it.

static void delay( int loops )
{
    while(loops--)
        ;
}

You’re then very surprised to find that this takes a total time of 0ms when you call it. You go looking in the assembler output and find that not only is there no loop, the call to delay() isn’t even present.

The only time that won’t happen is if you have given -O0 to your compiler (i.e. disabled optimisation). The optimiser has noticed that this is a loop with no side effects and optimised it away, then it’s noticed that delay is a static function that does nothing, and it can optimise that away too.

A few moments poking around on the internet, or through most of the embedded code you might be shown, will lead you to the advice that you can trick the optimiser by using the volatile type qualifier.

static void delay( int loops )
{
    volatile int i;

    for( i = 0; i < loops; i++ )
        ;
}

volatile has special meaning to a C compiler. It tells the compiler that it may not assume the value of the variable has not changed between accesses. It is mostly used to indicate a global variable that is accessed in both normal context and interrupt context – when an ISR can alter that variable the compiler isn’t allowed to optimise accesses away because it thinks the value hasn’t changed.
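
For contrast, here’s roughly what that legitimate use looks like. Hosted C has no ISRs, so this sketch uses a signal handler as a stand-in for the interrupt; the names data_ready, on_signal and poll_until_ready are mine, purely for illustration.

```c
#include <signal.h>

/* Shared between the "interrupt" (here a signal handler) and normal code.
 * volatile makes the compiler re-read it on every pass round the loop. */
static volatile sig_atomic_t data_ready = 0;

/* Stand-in for an ISR: it changes the flag behind the main code's back. */
static void on_signal(int signum)
{
    (void)signum;
    data_ready = 1;
}

/* Polls the flag until the handler sets it; returns how many times the
 * loop body ran. The signal is raised on the third pass to simulate the
 * interrupt firing. */
static int poll_until_ready(void)
{
    int polls = 0;

    data_ready = 0;
    signal(SIGINT, on_signal);

    while (!data_ready) {
        polls++;
        if (polls == 3)
            raise(SIGINT);
    }
    return polls;
}
```

Without the volatile, the compiler would be entitled to read data_ready once and spin forever on the cached value. That’s a very different job from keeping a delay loop alive.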

The problem with using volatile for a delay loop is that it’s still dangerous. Don’t do it: volatile is not for this purpose and can’t be relied upon. volatile doesn’t say to the compiler “changing this variable has side effects”, it says “you may not assume you are the only one changing this variable”. That means a sophisticated optimiser is still perfectly within its rights to optimise the above loop away if it decides it has no effect (which it doesn’t: i is a local variable, so you don’t care what it ends up at, or what it passes through to get there).

There is one final argument against delay loops implemented like this: they are completely unpredictable. On one particular day, with one particular CPU and one particular version of one particular compiler, you might get consistent results (until you change the compiler command line), and feel that you know how much actual time this delay loop represents; mostly, however, you don’t. You’re going to have to resort to assembler, because that way you can count every single instruction. Here’s a version for AVR:

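#include <stdint.h>

static void delay( uint8_t loops )
{
    /* dec works on a single 8-bit register, so the counter is a uint8_t */
    __asm__ volatile (
            "1: dec %0" "\n\t"   /* decrement the counter */
            "brne 1b"            /* loop until it reaches zero */
            : "=r" (loops)       /* output: the counter register */
            : "0" (loops)        /* input: same register as %0 */
    );
}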

The advantage, though, is that we can get the datasheet out, count one standard instruction (dec) and one branch instruction (brne), and know that these two add up to three CPU cycles per iteration. The disadvantage is that it’s pretty obscure.
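
To make that cycle counting concrete, here’s a small host-side sketch of the arithmetic. The function names are my own, the three-cycles-per-iteration figure is the one counted above, and it ignores the call overhead and the slightly cheaper final iteration (where brne falls through), so treat the results as approximate.

```c
#include <stdint.h>

/* Cycles per iteration of the dec/brne loop, as counted above. */
#define CYCLES_PER_ITERATION 3UL

/* Approximate cycles spent in the loop. A count of 0 wraps round on the
 * first dec, so it runs for 256 iterations rather than none. */
static uint32_t delay_loop_cycles(uint8_t loops)
{
    uint32_t iterations = (loops == 0) ? 256UL : loops;
    return iterations * CYCLES_PER_ITERATION;
}

/* Convert a cycle count to microseconds for a given CPU clock in Hz. */
static double cycles_to_us(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (double)cycles * 1e6 / (double)f_cpu_hz;
}
```

At 1MHz, delay_loop_cycles(255) comes to 765 cycles, or about 765µs, which is as long as an 8-bit counter gets you short of passing 0.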

Fortunately, the nice people who write GCC compilers for embedded processors know that nobody likes assembler and provide a bit of help. Here’s the AVR libc version of a busy-wait loop (in fact, I pinched the above assembler straight from delay_basic.h’s _delay_loop_1(); the example here uses its 16-bit sibling, _delay_loop_2()):

#include <util/delay_basic.h>

int main( void )
{
    while(1) {
        LEDToggle();
        _delay_loop_2(0xffff);
    }
}

We don’t care about loops though; we care about time. Those nice people have got us covered.

#define F_CPU 1000000UL
#include <util/delay.h>

int main( void )
{
    while(1) {
        LEDToggle();
        _delay_ms(500);
    }
}

_delay_ms() is a clever inline function that uses knowledge of F_CPU to calculate the number of iterations that need to be passed to _delay_loop_2(). It relies on the optimiser to do that arithmetic at compile time, which is why the argument to _delay_ms() has to be a compile-time constant and why optimisation has to be switched on.
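
The calculation itself is simple enough to sketch on a host machine. This is not the library’s code: the function names are mine, and it assumes the four CPU cycles per iteration that the avr-libc documentation gives for _delay_loop_2().

```c
/* Assumed cost of one _delay_loop_2() iteration, in CPU cycles. */
#define LOOP2_CYCLES 4.0

/* Iterations needed to burn `ms` milliseconds at a clock of `f_cpu_hz`. */
static double loop2_iterations(double f_cpu_hz, double ms)
{
    return f_cpu_hz / (LOOP2_CYCLES * 1000.0) * ms;
}

/* Non-zero if the request fits the loop's 16-bit counter in one call. */
static int fits_single_call(double iterations)
{
    return iterations >= 1.0 && iterations <= 65535.0;
}
```

For 500ms at 1MHz this comes to 125000 iterations, which is more than a 16-bit counter can hold, so the library has to split a delay like that into several shorter loops. Exactly how it does so differs between avr-libc versions.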

Having said all that, don’t use busy-wait for your delays in anything but the simplest of programs. They tie up the CPU, they waste power, and inevitably you’ll find that you’re constantly chasing cycles over there because you needed a delay over here. As an example, try to write a program that flashes an LED every 500ms while applying a 70ms debounce time to a button using busy loops.
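
For comparison, here’s a minimal sketch of the same job done without busy loops. It assumes you have a free-running millisecond counter (on real hardware, typically bumped by a timer interrupt) and a raw button sample to hand; struct app_state, app_poll and the two constants are hypothetical names of mine, not part of any library.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLINK_PERIOD_MS 500u
#define DEBOUNCE_MS      70u

struct app_state {
    uint32_t last_toggle_ms;   /* when the LED last changed */
    bool     led_on;           /* current LED state */
    bool     raw_last;         /* most recent raw button sample */
    uint32_t raw_changed_ms;   /* when the raw sample last changed */
    bool     button_stable;    /* debounced button state */
};

/* Call this as often as you like from the main loop. It never waits;
 * it just compares timestamps against the current time. */
static void app_poll(struct app_state *s, uint32_t now_ms, bool raw_button)
{
    /* LED: toggle once a full period has elapsed. */
    if ((uint32_t)(now_ms - s->last_toggle_ms) >= BLINK_PERIOD_MS) {
        s->last_toggle_ms += BLINK_PERIOD_MS;
        s->led_on = !s->led_on;
    }

    /* Button: every raw change restarts the debounce window... */
    if (raw_button != s->raw_last) {
        s->raw_last = raw_button;
        s->raw_changed_ms = now_ms;
    } else if ((uint32_t)(now_ms - s->raw_changed_ms) >= DEBOUNCE_MS) {
        /* ...and the level is accepted once it has held for 70ms. */
        s->button_stable = raw_button;
    }
}
```

Because neither task ever blocks, the LED period and the debounce time don’t interfere with each other, and the CPU is free to sleep between polls.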
