EmbeddedRelated.com
The 2026 Embedded Online Conference

GNU Linker Scripts. Part 1. .data, .bss, and the Startup Contract

Alexey Karelin, May 5, 2026

Before your first line of C executes, something must establish the contract the language assumes: RAM contents are undefined at power-on, and the CPU has no concept of an "initialized variable." The startup code bridges that gap, typically as a Reset_Handler() function that runs before main().

The classical picture of a memory map includes the well-known sections: .text, sometimes .rodata, .bss, and .data, plus a heap and a stack growing toward each other.
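As a rough illustration of how declarations end up in those sections, here is a minimal sketch. The placement noted in the comments is what GCC typically does by default; treat it as an assumption, since options such as -fno-zero-initialized-in-bss or explicit section attributes can change it. All variable and function names are hypothetical.

```c
#include <stdint.h>

/* Typical placement with GCC defaults (an assumption, not a guarantee). */

uint32_t tick_count;              /* no initializer       -> .bss (or COMMON) */
static uint8_t rx_buffer[64];     /* zero-initialized     -> .bss             */
uint32_t baud_rate = 115200u;     /* non-zero initializer -> .data            */
const char banner[] = "boot";     /* read-only data       -> .rodata          */

/* The function body itself -> .text */
uint8_t first_rx_byte(void)
{
    return rx_buffer[0];
}
```

On a hosted system the operating system's loader and C runtime set up these values; on a bare-metal target nobody does unless the startup code takes care of it.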

What the default startup does

The initialization procedure most commonly consists of two operations: zeroing .bss and copying .data. All global and static variables that must start with a value of 0 are placed in the BSS section (a historical acronym for Block Started by Symbol). After all translation units are compiled, the linker gathers the zero-initialized variables into a single contiguous .bss section in RAM, and the linker script exports two symbols, _sbss and _ebss, marking its boundaries. The startup code uses these symbols to know exactly what range to zero-fill.

Here is how this looks in a linker script itself (all examples are reduced for clarity):

/* Specify the memory areas, STM32F411RE */
MEMORY
{
RAM (xrw)      : ORIGIN = 0x20000000, LENGTH = 128K
FLASH (rx)      : ORIGIN = 0x8000000, LENGTH = 512K
}

SECTIONS
{
  . = ALIGN(4);
  .bss : 
  {
    /* Used by the startup code to zero-initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    /* Input sections: the compiler places zero-initialized variables in .bss */
    *(.bss) *(.bss*) *(COMMON)
    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
  } >RAM
}

MEMORY is the linker script command that describes physical memory. The simplest example consists of two distinct regions: RAM, where mutable data lives (global variables, heap, stack), and FLASH, where the compiled code lives. The letters in parentheses are region attributes: r for readable, w for writable, x for executable. The origin and length must match the real memory map of the device. The linker cannot check them against the hardware, so a wrong origin produces no error at link time - the image is simply linked for the wrong addresses.

SECTIONS is the command that defines output sections: it lists which input sections go into each one and maps the result into a memory region with the > operator.
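Since the linker takes the region bounds on faith, it can help to work out the end addresses by hand and compare them with the reference manual. A small host-side sketch of that arithmetic, using the values from the MEMORY block above; the macro and function names are hypothetical:

```c
#include <stdint.h>

/* Values copied from the MEMORY block above (STM32F411RE). */
#define RAM_ORIGIN    0x20000000u
#define RAM_LENGTH    (128u * 1024u)   /* 128K in linker-script notation */
#define FLASH_ORIGIN  0x08000000u
#define FLASH_LENGTH  (512u * 1024u)   /* 512K in linker-script notation */

/* First address past the end of each region. */
static uint32_t ram_end(void)   { return RAM_ORIGIN + RAM_LENGTH; }
static uint32_t flash_end(void) { return FLASH_ORIGIN + FLASH_LENGTH; }

/* Returns 1 if addr lies inside [origin, origin + length), else 0. */
static int in_region(uint32_t addr, uint32_t origin, uint32_t length)
{
    return addr >= origin && (addr - origin) < length;
}
```

With these numbers RAM spans 0x20000000 to 0x2001FFFF and FLASH spans 0x08000000 to 0x0807FFFF.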

And the startup code itself for zero-initialized objects:

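#include <stdint.h>

/* These symbols are not defined in any source file; the linker script
   creates them. Only their addresses are meaningful. */
extern uint32_t _sbss;
extern uint32_t _ebss;

void zero_bss(void)
{
  uint32_t *dst = &_sbss;
  /* '<' rather than '!=' so the loop still ends if the bounds are misaligned */
  while (dst < &_ebss)
    *dst++ = 0;
}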
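The linker symbols do not exist on a host machine, so to experiment with the loop there the boundaries can be passed in as parameters. The following is only a sketch under that assumption; zero_range, fake_ram and zero_fake_bss are hypothetical names, not part of any vendor startup file.

```c
#include <stdint.h>

/* Zero-fill the word range [start, end). */
static void zero_range(uint32_t *start, const uint32_t *end)
{
    for (uint32_t *dst = start; dst < end; ++dst) {
        *dst = 0u;
    }
}

/* Hypothetical stand-in for RAM holding power-on garbage. */
static uint32_t fake_ram[8] = {
    0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu,
    0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu
};

/* Pretend .bss occupies words 2..5, zero it, and return how many
 * words of fake_ram are zero afterwards. */
static int zero_fake_bss(void)
{
    int zeros = 0;

    zero_range(&fake_ram[2], &fake_ram[6]);
    for (int i = 0; i < 8; ++i) {
        zeros += (fake_ram[i] == 0u);
    }
    return zeros;
}
```

Only the four words inside the simulated section change; everything outside the two boundaries keeps its garbage, which is exactly why the boundary symbols must be right.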

Global variables initialized to a non-zero value need their initial values stored in non-volatile memory (e.g. built-in Flash) and copied to their RAM locations at startup. The linker uses two distinct terms for these locations.

VMA - Virtual Memory Address - where the variable lives at runtime. Every pointer, every array index, every function call uses a VMA. The running program never refers to the LMA.

LMA - Load Memory Address - where the initial values are stored in Flash after programming. The startup code reads from LMA, writes to VMA, and from that point the runtime sees only VMA.

For .bss the LMA does not matter: the section occupies no space in the Flash image, because zeros cost nothing to store.

Here is how this looks in the linker script for the same STM32F411:

SECTIONS
{
  /* The program code and other data goes into FLASH. Simplified version. */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    . = ALIGN(4);
    _etext = .;        /* define a global symbol at end of code */
  } >FLASH

  /* used by the startup to initialize data */
  _sidata = LOADADDR(.data);

  /* Initialized data sections go into RAM; the load (LMA) copy goes after the code */
  .data : 
  {
    . = ALIGN(4);
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */
    . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM AT> FLASH
}

>RAM sets the VMA - the section lives in RAM at runtime. AT> FLASH sets the LMA - the initial values are stored in Flash, which in this reduced script means immediately after .text. LOADADDR(.data) retrieves that Flash address so the startup code knows where to copy from.
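The relationship between the two addresses is a fixed offset: an object sits as far from the start of the Flash copy as it does from the start of .data in RAM. A minimal sketch of that arithmetic, with the boundary addresses passed in as plain numbers; lma_of is a hypothetical helper, not something the linker provides.

```c
#include <stdint.h>

/* Compute where in Flash the initial value of a .data object is stored.
 *   vma    - runtime address of the object (inside .data)
 *   sdata  - address of _sdata  (start of .data in RAM)
 *   sidata - address of _sidata (start of the Flash copy)
 * The caller must ensure vma really lies inside .data; this sketch
 * does not check it. */
static uint32_t lma_of(uint32_t vma, uint32_t sdata, uint32_t sidata)
{
    return sidata + (vma - sdata);
}
```

For example, with made-up addresses _sdata = 0x20000000 and _sidata = 0x08001234, a variable at 0x20000010 has its initial value stored at 0x08001244.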

And the initialization in C:

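#include <stdint.h>

/* Created by the linker script; only their addresses are meaningful. */
extern uint32_t _sdata;   /* start of .data in RAM (VMA) */
extern uint32_t _edata;   /* end of .data in RAM (VMA) */
extern uint32_t _sidata;  /* start of the initial values in Flash (LMA) */

void copy_data(void)
{
  uint32_t *src = &_sidata;
  uint32_t *dst = &_sdata;
  /* '<' rather than '!=' so the loop still ends if the bounds are misaligned */
  while (dst < &_edata)
    *dst++ = *src++;
}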
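Both steps can be tried together on a host machine by replacing the real memories with arrays. This is a sketch under that assumption, not a real Reset_Handler: fake_flash, fake_ram, copy_range, zero_range and fake_reset are hypothetical names, and the boundaries that the linker symbols would normally supply are hard-coded array offsets.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two memories. */
static const uint32_t fake_flash[3] = { 10u, 20u, 30u }; /* LMA: initial values of .data   */
static uint32_t fake_ram[6];                             /* VMA: 3 words .data, 3 words .bss */

/* Copy words from src into [dst, dst_end). */
static void copy_range(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Zero-fill the word range [dst, dst_end). */
static void zero_range(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}

/* Mimics the two initialization steps performed before main() is called. */
static void fake_reset(void)
{
    memset(fake_ram, 0xA5, sizeof fake_ram);             /* simulate undefined power-on RAM */
    copy_range(fake_flash, &fake_ram[0], &fake_ram[3]);  /* .data: Flash -> RAM */
    zero_range(&fake_ram[3], &fake_ram[6]);              /* .bss: zero-fill     */
}
```

The two ranges do not overlap, so the order of the copy and the zero-fill does not matter; what matters is that both finish before any C code relies on a global.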

Symbol names are not standardized. You will meet different names across vendors and linker scripts, e.g. instead of _sbss you may see __bss_start__, _szero, etc. The only requirement is that the symbols declared in the startup file match the ones the linker script defines. This is also why copying startup code verbatim from one vendor's project into another often fails: with an undefined-reference error at link time if you are lucky, or silently if a symbol with the same name marks a different boundary.

If you're interested in a full startup sequence, feel free to attend my talk:
Beyond Main: Deconstructing the Cortex-M Startup Sequence from The Very First Cycle (Embedded Online Conference 2026).

