STM32F0 Discovery Development III

By | 2013-03-13

The ARM startup code we discussed last time was enough to get a program to compile, link and run, but it wasn’t enough to support a real C program. One primary feature is missing: pre-initialised variables need their initial values copied from flash to RAM.

The discovery-basic-template is full of useful information, and forms the basis for this article.

Here’s the section summary from an object file we’ll make later:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .isr_vector   000000c4  08000000  08000000  00008000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         00000024  080000c4  080000c4  000080c4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .data         00000004  20000000  080000e8  00010000  2**2
                  CONTENTS, ALLOC, LOAD, DATA

The thing to note here is the difference between the .text section and the .data section: the .text section has its LMA (load memory address) and VMA (virtual memory address) at the same place; the .data section does not. In essence, the linker has arranged things so that the variable memory that the rest of the program modifies is at 0x20000000 (absolutely correct for an STM32F0), but the section's initial contents are stored at 0x080000e8. We achieve this by making our linker script look like this:

.text : {
    /* ... removed for brevity ... */
    _end_of_text = .;

    /* Create a symbol for where the .data initial values will start */
    _start_of_data_init = _end_of_text;
} >FLASH

.data : AT (_start_of_data_init) {
    _start_of_data = .;
    /* ... removed for brevity ... */
    _end_of_data = .;
} >RAM

We have to understand the linker script syntax a little: the AT() tells the linker the load address, and the >RAM tells it the run-time memory block for this section. We can use the additional symbols that we’ve added to the linker script to do the run-time initialisation of .data. Specifically, .data is loaded at _start_of_data_init, defined in the FLASH region, but it is linked at _start_of_data, defined in the RAM region. Therefore we fill the RAM from _start_of_data to _end_of_data with the data in FLASH starting at _start_of_data_init.
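
As a sanity check on the section summary, the addresses can be reproduced with a little arithmetic: each section ends at its load address plus its size, and the .data load address should equal the end of .text. Here is a minimal sketch in C that uses only the numbers from the table. The names section_t and section_load_end are made up for this illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* One row of the section summary; the type and field names are
   illustrative only. */
typedef struct {
    uint32_t size;
    uint32_t vma;   /* address the program uses at run time */
    uint32_t lma;   /* address the contents are stored (loaded) at */
} section_t;

/* Values copied from the section summary above. */
static const section_t isr_vector = { 0x000000c4u, 0x08000000u, 0x08000000u };
static const section_t text       = { 0x00000024u, 0x080000c4u, 0x080000c4u };
static const section_t data       = { 0x00000004u, 0x20000000u, 0x080000e8u };

/* First load address after a section's contents. */
static uint32_t section_load_end(const section_t *s)
{
    return s->lma + s->size;
}
```

section_load_end(&text) comes out as 0x080000e8, which is exactly the LMA of .data: the initial values sit immediately after the code in flash, while the VMA of .data is up in RAM at 0x20000000.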

Here’s the assembly that does the copy; it should go before jumping to main and after establishing the stack.

    /* --- START: initialise .data */
    movs    r1, #0
    b       isDataInitCopyComplete
copyDataInitToData:
    /* here with current index in r1; _start_of_data in r0 */
    ldr     r3, =_start_of_data_init
    /* _start_of_data[r1] = _start_of_data_init[r1] */
    ldr     r3, [r3, r1]
    str     r3, [r0, r1]
    /* index += 4 */
    adds    r1, r1, #4
    /* fall through to check end condition */
isDataInitCopyComplete:
    /* here with current index in r1 */
    ldr     r0, =_start_of_data
    ldr     r3, =_end_of_data
    /* r2 = _start_of_data + index */
    adds    r2, r0, r1
    /* if( _start_of_data + index < _end_of_data ) */
    cmp     r2, r3
    bcc     copyDataInitToData
    /* --- END: initialise .data */
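
For readers more comfortable with C than Thumb assembly, the loop above is roughly equivalent to the following sketch. It is not the startup code itself: the function name copy_data_init is made up here, and the three addresses are passed as parameters so that the logic can be tried on a host PC. On the target they would be the addresses of the linker symbols _start_of_data_init, _start_of_data and _end_of_data.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy initial values word by word, mirroring the assembly loop.
     src      plays the role of _start_of_data_init (in FLASH)
     dst      plays the role of _start_of_data      (in RAM)
     dst_end  plays the role of _end_of_data        (in RAM)
   Returns the number of words copied. The assembly returns nothing;
   the count is only here to make the sketch easy to check. */
static size_t copy_data_init(const uint32_t *src, uint32_t *dst,
                             const uint32_t *dst_end)
{
    size_t index = 0;

    /* Same test as cmp/bcc: carry on while dst + index is below dst_end. */
    while (&dst[index] < dst_end) {
        dst[index] = src[index];
        index++;   /* the assembly adds 4 bytes; this index counts words */
    }
    return index;
}
```

Note that the loop condition is a strict "below" comparison, so an empty .data section (start equal to end) copies nothing at all.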

This is still not quite sufficient for C. The .bss section needs clearing too. You might expect that the bss section, being “uninitialised data”, can just be left. Not so: the compiler is allowed to put variables initialised to zero in bss and assume the startup code sets all of bss to zero. Let’s add that startup code too:

    /* --- START: initialise .bss */
    ldr     r2, =_start_of_bss
    b       isBSSFillComplete
fillZeroBSS:
    /* *ptr = 0, with ptr held in r2 */
    movs    r3, #0
    str     r3, [r2]
    /* ptr += 4 */
    adds    r2, r2, #4
isBSSFillComplete:
    /* here with r2 at current bss address */
    ldr     r3, =_end_of_bss
    /* if( r2 is below _end_of_bss ) */
    cmp     r2, r3
    bcc     fillZeroBSS
    /* --- END: initialise .bss */
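
The same fill can be sketched in C. As before, this is only an illustration for trying the logic on a host PC: zero_bss is a made-up name, and its two parameters stand in for the addresses of the linker symbols _start_of_bss and _end_of_bss.

```c
#include <stddef.h>
#include <stdint.h>

/* Clear every word from ptr up to, but not including, end.
     ptr  plays the role of _start_of_bss
     end  plays the role of _end_of_bss
   Returns the number of words cleared, purely for checking the sketch. */
static size_t zero_bss(uint32_t *ptr, const uint32_t *end)
{
    size_t words = 0;

    /* Same test as cmp/bcc: carry on while ptr is below end. */
    while (ptr < end) {
        *ptr++ = 0;
        words++;
    }
    return words;
}
```

Unlike the .data copy, there is no source pointer: nothing about .bss needs storing in flash, only its start and end addresses.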

szczys’s basic template adds an additional step: a call to a function, SystemInit(). I’m choosing not to add that to our startup code. In an embedded application, initialisation is main()’s job too; startup code should be initialising for C, not for the CPU.

We are done – a call to main() is already present, and then we’re running C code. Everything we talk about now will be about using the particular peripherals of the STM32F0. The above discussion is useful in a wider sense though, because we’ve now seen the technique for performing a startup for any CPU type at all; and we’re able to look at other linker scripts and appreciate their operation.

All that remains is to be able to cope with all the various sections the C compiler might throw out. Here then is my complete linker script.

ENTRY(_ISR_Reset)

MEMORY
{
    FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 64K
    RAM  (xrw) : ORIGIN = 0x20000000, LENGTH = 8K
}
_top_of_stack = 0x20002000;

SECTIONS
{
    /* interrupt vectors */
    .isr_vector : {
        . = ALIGN(4);
        KEEP(*(.isr_vector))
        . = ALIGN(4);
    } >FLASH

    /* program code -- in FLASH */
    .text : {
        . = ALIGN(4);
        *(.text)
        *(.text.*)
        *(.rodata)
        *(.rodata.*)
        *(.glue_7)
        *(.glue_7t)
        KEEP (*(.init))
        KEEP (*(.fini))

        . = ALIGN(4);
        _end_of_text = .;

        /* Create a symbol for where the .data initial values will start */
        _start_of_data_init = _end_of_text;
    } >FLASH

    /* initialised data -- in RAM */
    .data : AT (_start_of_data_init) {
        . = ALIGN(4);
        _start_of_data = .;

        *(.data)
        *(.data.*)
        *(.RAMtext)

        . = ALIGN(4);
        _end_of_data = .;
    } >RAM

    /* uninitialised data -- in RAM */
    .bss : {
        . = ALIGN(4);
        _start_of_bss = .;

        *(.bss)
        *(.bss.*)
        *(COMMON)

        . = ALIGN(4);
        _end_of_bss = .;
    } >RAM
}
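
To connect the script back to C, here is a sketch of where a GCC-style toolchain would typically place a few declarations. The variable and function names are invented for this illustration, and the exact input section names depend on the compiler and its flags. For example, -fdata-sections produces names like .data.boot_count, which is why the script also matches .data.* and .bss.*.

```c
#include <stdint.h>

/* Non-zero initialiser: typically .data, so the value is stored in FLASH
   and copied to RAM by the startup code. */
uint32_t boot_count = 42;

/* Zero or missing initialiser: typically .bss (or COMMON with some
   compilers and flags), cleared by the startup code. */
uint32_t error_count = 0;
uint32_t scratch[4];

/* const data: typically .rodata, which this script keeps in FLASH. */
const uint32_t lookup[3] = { 1, 2, 4 };

/* Only gives the expected answer if .data was copied and .bss was
   cleared before it is called. */
uint32_t startup_checksum(void)
{
    return boot_count + error_count + scratch[0] + lookup[2];
}
```

With both startup loops in place, startup_checksum() should return 46 (42 + 0 + 0 + 4). If either loop were skipped on the target, boot_count or the .bss variables would hold whatever happened to be in RAM after reset.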
