BSS Data

by: burt rosenberg
at: university of miami
date: sep 2024

Goals

This page came about during a discussion of virtual memory layout. It was presented in class that, in order of ascending virtual address, virtual memory is organized into these segments:
  1. text: The machine code of the compiled program.
  2. bss: All the static variables, that is, those variables whose lifetime is that of the containing process.
  3. heap: The dynamically allocated memory. In the C library this is managed by malloc.
  4. stack: Used for all variables with function lifetime. It is placed high and grows down.
  5. kernel: This is particular to Linux 32-bit kernels. It is the highest 1G of virtual address space.
FIGURE 1

#include <stdio.h>
int main(int argc, char * argv[]) {
	printf("hello world!\n.") ;
	return 0 ;
}
FIGURE 2

# cc -o hello hello.c
# objdump -d hello

...


000000000000076c <main>:
 76c:	a9be7bfd 	stp	x29, x30, [sp, #-32]!
 770:	910003fd 	mov	x29, sp
 774:	b9001fe0 	str	w0, [sp, #28]
 778:	f9000be1 	str	x1, [sp, #16]
 77c:	90000000 	adrp	x0, 0 <_init-0x5d0>
 780:	9120e000 	add	x0, x0, #0x838
 784:	97ffffb3 	bl	650 <printf@plt>
 788:	52800000 	mov	w0, #0x0                   	// #0
 78c:	a8c27bfd 	ldp	x29, x30, [sp], #32
 790:	d65f03c0 	ret
 794:	d503201f 	nop

...

FIGURE 3

# objdump -s hello

hello:     file format elf64-littleaarch64

Contents of section .interp:
 0238 2f6c6962 2f6c642d 6c696e75 782d6161  /lib/ld-linux-aa
 0248 72636836 342e736f 2e3100             rch64.so.1.  
 
...

 07f0 e003162a 60003fd6 9f0213eb 21ffff54  ...*`.?.....!..T
 0800 f35341a9 f55b42a9 f76343a9 fd7bc4a8  .SA..[B..cC..{..
 0810 c0035fd6 1f2003d5 c0035fd6           .._.. ...._.    
Contents of section .fini:
 081c fd7bbfa9 fd030091 fd7bc1a8 c0035fd6  .{.......{...._.
Contents of section .rodata:
 0830 01000200 00000000 68656c6c 6f20776f  ........hello wo
 0840 726c6421 0a2e00                      rld!...         
Contents of section .eh_frame_hdr:
 0848 011b033b 44000000 07000000 68feffff  ...;D.......h...
 0858 5c000000 98feffff 70000000 d8feffff  \.......p.......
 
...
To illustrate, the hello world program of Figure 1 was considered. The question was: where does the data for the constant hello world string go?

The ELF

To investigate this we look at hello, the output of the linking-loader, ld, which is invoked after compiling.

The format of this file is called ELF, the Executable and Linkable Format. It is almost a database of elements, each called a section. The tool objdump is used to investigate this file.

In Figure 2 I show the disassembly of the main function of hello.c, obtained using objdump -d. It looks like lines 77c and 780 prepare for the call to printf by placing an address, offset 0x838 from the start of the page, into register x0, which carries the first argument.

I then do a full dump of all the sections and look for the hello string. I show the relevant section in Figure 3. It is in section .rodata. So the question is, where do the contents of an ELF file's .rodata section go when loading the program into virtual memory?

Does it go into BSS? Well, not quite.

The linker script

To answer that, ld --verbose prints the linker script that lays out the sections in virtual memory. To talk of the BSS is a simplification. The .rodata is placed after the .text and before the .bss. But this linker script also places other things in between, for instance the code for object constructors and destructors, and the GOT (Global Offset Table, more about that later).

Caution

The linker has a deep connection to the hardware and the operating system. While the common concepts are applicable everywhere, in the details your mileage may vary.
FIGURE 4

# ld --verbose 
GNU ld (GNU Binutils for Ubuntu) 2.34
...
SECTIONS
{
  /* Read-only sections, merged into text segment: */
...

  .plt            : ALIGN(16) { *(.plt) *(.iplt) }
  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(SORT(.text.sorted.*))
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf.em.  */
    *(.gnu.warning)
  } =0x1f2003d5
  .fini           :
  {
    KEEP (*(SORT_NONE(.fini)))
  } =0x1f2003d5
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);
  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
  .rodata1        : { *(.rodata1) }


...
 
  .got            : { *(.got) *(.igot) }
  . = DATA_SEGMENT_RELRO_END (24, .);
  .got.plt        : { *(.got.plt) *(.igot.plt) }
  .data           :
  {
    PROVIDE (__data_start = .);
    *(.data .data.* .gnu.linkonce.d.*)
    SORT(CONSTRUCTORS)
  }
  .data1          : { *(.data1) }
  _edata = .; PROVIDE (edata = .);
  . = .;
  __bss_start = .;
  __bss_start__ = .;
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* Align here to ensure that the .bss section occupies space up to
      _end.  Align after .bss to ensure correct alignment even if the
      .bss section disappears because there are no input sections.
      FIXME: Why do we need it? When there is no .bss section, we do not
      pad the .data section.  */
   . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  _bss_end__ = .; __bss_end__ = .;

...

}


==================================================

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 16 sep 2024
update: 16 sep 2024