{"id":737,"date":"2012-09-12T01:00:00","date_gmt":"2012-09-11T23:00:00","guid":{"rendered":"https:\/\/www.fussylogic.co.uk\/blog\/?p=737"},"modified":"2012-10-08T17:53:06","modified_gmt":"2012-10-08T16:53:06","slug":"volatile-delays","status":"publish","type":"post","link":"https:\/\/www.fussylogic.co.uk\/blog\/?p=737","title":{"rendered":"Volatile Delays"},"content":{"rendered":"<p>You want to write a simple delay loop on your embedded microcontroller. Let\u2019s say you\u2019re waiting for the crystal oscillator to stabilise before you use it.<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"dt\">static<\/span> <span class=\"dt\">void<\/span> delay( <span class=\"dt\">int<\/span> loops )\n{\n    <span class=\"kw\">while<\/span>(loops--)\n        ;\n}<\/code><\/pre>\n<p>You\u2019re then very surprised to find that this takes a total time of 0ms when you call it. You go looking in the assembler output and find that not only is there no loop, but the call to <code>delay()<\/code> isn\u2019t even present.<\/p>\n<p>The only time that won\u2019t happen is if you have given <code>-O0<\/code> to your compiler (i.e.\u00a0disabled optimisation). 
The optimiser has noticed that this is a loop with no side effects and optimised it away; then it\u2019s noticed that <code>delay()<\/code> is a <code>static<\/code> function that does nothing, and it can optimise that away too.<\/p>\n<p>A few moments poking around on the internet, or in the majority of embedded code you might be shown, will lead you to the advice that you can trick the optimiser by using the <code>volatile<\/code> type qualifier.<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"dt\">static<\/span> <span class=\"dt\">void<\/span> delay( <span class=\"dt\">int<\/span> loops )\n{\n    <span class=\"dt\">volatile<\/span> <span class=\"dt\">int<\/span> i;\n\n    <span class=\"kw\">for<\/span>( i = <span class=\"dv\">0<\/span>; i &lt; loops; i++ )\n        ;\n}<\/code><\/pre>\n<p><code>volatile<\/code> has special meaning to a C compiler. It tells the compiler that it may not assume the value of the variable has not changed between accesses. It is mostly used to mark a global variable that is accessed in both normal context and interrupt context \u2013 when an ISR can alter that variable, the compiler isn\u2019t allowed to optimise accesses away because it thinks the value hasn\u2019t changed.<\/p>\n<p>The problem with this approach is that it\u2019s still dangerous \u2013 don\u2019t do it: <code>volatile<\/code> is not for this purpose and can\u2019t be relied upon. <code>volatile<\/code> doesn\u2019t say to the compiler \u201cchanging this variable has side effects\u201d; it says \u201cyou may not assume you are the only one changing this variable\u201d. 
That means a sophisticated optimiser is <em>still<\/em> perfectly within its rights to optimise the above loop away if it decides it has no effect (which it doesn\u2019t \u2013 <code>i<\/code> is a local variable so you don\u2019t care what it ends up at, or what it passes through to get there).<\/p>\n<p>There is one final argument against delay loops implemented like this: they are completely unpredictable. On one particular day, with one particular CPU and one particular version of one particular compiler, you might get consistent results (until you change the compiler command line), and feel that you know how much actual time this delay loop represents; mostly, however, you don\u2019t. You\u2019re going to have to resort to assembler; that way you can count every single instruction. Here\u2019s a version for AVR:<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"co\">\/* dec works on a single 8-bit register, so the count is 8 bits wide *\/<\/span>\n<span class=\"dt\">static<\/span> <span class=\"dt\">void<\/span> delay( <span class=\"dt\">unsigned<\/span> <span class=\"dt\">char<\/span> loops )\n{\n    __asm__ <span class=\"dt\">volatile<\/span> (\n            <span class=\"st\">&quot;1: dec %0&quot;<\/span> <span class=\"st\">&quot;<\/span><span class=\"ch\">\\n\\t<\/span><span class=\"st\">&quot;<\/span>\n            <span class=\"st\">&quot;brne 1b&quot;<\/span>\n            : <span class=\"st\">&quot;=r&quot;<\/span> (loops)  <span class=\"co\">\/* output: same register as the input *\/<\/span>\n            : <span class=\"st\">&quot;0&quot;<\/span> (loops)\n    );\n}<\/code><\/pre>\n<p>The advantage is that we can get the datasheet out, count one single-cycle instruction (<code>dec<\/code>) and one taken branch (<code>brne<\/code>, two cycles), and know that these two add up to three CPU cycles per iteration. 
The disadvantage is that it\u2019s pretty <a href=\"http:\/\/www.ibiblio.org\/gferg\/ldp\/GCC-Inline-Assembly-HOWTO.html\">obscure<\/a>.<\/p>\n<p>Fortunately, the nice people who write GCC compilers for embedded processors know that nobody likes assembler and provide a bit of help. Here\u2019s the AVR libc version of a busy-wait loop (in fact, I pinched the above assembler straight from <code>delay_basic.h<\/code>\u2019s <code>_delay_loop_1()<\/code>):<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"ot\">#include &lt;util\/delay_basic.h&gt;<\/span>\n\n<span class=\"dt\">int<\/span> main( <span class=\"dt\">void<\/span> )\n{\n    <span class=\"kw\">while<\/span>(<span class=\"dv\">1<\/span>) {\n        LEDToggle();\n        _delay_loop_2(<span class=\"bn\">0xffff<\/span>);\n    }\n}<\/code><\/pre>\n<p>We don\u2019t care about loops though; we care about time. Those nice people have got us covered.<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"ot\">#define F_CPU 1000000UL<\/span>\n<span class=\"ot\">#include &lt;util\/delay.h&gt;<\/span>\n\n<span class=\"dt\">int<\/span> main( <span class=\"dt\">void<\/span> )\n{\n    <span class=\"kw\">while<\/span>(<span class=\"dv\">1<\/span>) {\n        LEDToggle();\n        _delay_ms(<span class=\"dv\">500<\/span>);\n    }\n}<\/code><\/pre>\n<p><code>_delay_ms()<\/code> is a clever inline function that uses knowledge of <code>F_CPU<\/code> to calculate the number of iterations that need to be passed to <code>_delay_loop_2()<\/code>. For that calculation to happen at compile time, the argument to <code>_delay_ms()<\/code> has to be a compile-time constant and optimisation has to be enabled.<\/p>\n<p>Having said all that \u2013 don\u2019t use busy-wait for your delays in anything but the simplest of programs. 
They tie up the CPU, they waste power and inevitably you\u2019ll find that you\u2019re constantly chasing cycles over there because you needed a delay over here. As an example, try and write a program that flashes an LED every 500ms while applying a 70ms debounce time to a button using busy loops.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You want to write a simple delay loop on your embedded microcontroller. Let\u2019s say you\u2019re waiting for the crystal oscillator to stabilise before you use it. static void delay( int loops ) { while(loops--) ; } You\u2019re then very surprised to find that this takes a total time of 0ms when you call it. You\u2026 <span class=\"read-more\"><a href=\"https:\/\/www.fussylogic.co.uk\/blog\/?p=737\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[39,37,38,6],"_links":{"self":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/737"}],"collection":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=737"}],"version-history":[{"count":9,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/737\/revisions"}],"predecessor-version":[{"id":815,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/737\/revisions\/815"}],"wp:attachment":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=73
7"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}