Introduction
Compiling and executing a C program is a multi-stage process. In this
post I’ll walk through each stage of compiling and executing the
following C program, saved as test.c:
#include <stdio.h>
#define LOOP_TIMES 10
int main(int argc, char *argv[])
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}
Our tests were performed on Debian Bullseye (AMD64); intermediate results may vary depending on the OS and hardware.
Preprocessing
The first stage of compilation is called preprocessing. In this stage,
the C preprocessor handles preprocessor directives (lines starting
with a # character). These directives form a simple macro language
with its own syntax and semantics, used to reduce repetition in source
code. For example, lines with #include are replaced by the contents of
the referenced file (with different search rules for names in quotes
versus those in angle brackets), names introduced with #define are
systematically replaced with their definitions throughout the program,
and #if and its relatives are processed to conditionally omit code.
To get the result of the preprocessing stage, we can pass the -E
option to gcc:
gcc -E -o test.i test.c
The output of the preprocessing stage on my machine looks like the following:
// ... omitted for brevity
# 873 "/usr/include/stdio.h" 3 4
# 2 "test.c" 2
# 5 "test.c"
int main(int argc, char *argv[])
{
for (int i = 0; i < 10; i++)
{
printf("Hello World #%i!\n", i);
}
return 0;
}
Compilation
In this stage, the actual compiler translates the preprocessed source
into assembly language, an intermediate human-readable form. The
existence of this step allows C code to contain inline assembly
instructions and allows different assemblers to be used.
To get the result of the compilation stage, pass the -S option to gcc:
gcc -S -o test.s test.i
The output of the compilation stage on my machine looks like the following:
.file "test.c"
.text
.section .rodata
.LC0:
.string "Hello World #%i!\n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
movl -4(%rbp), %eax
movl %eax, %esi
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
addl $1, -4(%rbp)
.L2:
cmpl $9, -4(%rbp)
jle .L3
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 10.2.1-6) 10.2.1 20210110"
.section .note.GNU-stack,"",@progbits
Assembly
During this stage, the assembler converts the assembly language source
into an unlinked relocatable object file in ELF format. The output
contains the actual instructions to be run by the target processor.
However, an unlinked relocatable object file is not executable yet: it
may require definitions from other files, including libraries. To get
the result of the assembly stage, pass the -c option to gcc:
gcc -c -o test.o test.s
Or we can invoke as manually:
as -o test.o test.s
Either command produces an unlinked relocatable object file in ELF
format named test.o. We can inspect its ELF headers and sections with
readelf -a test.o | less, and dump the contents of a specific section
with readelf -x .text test.o.
Linking
The object files generated in the assembly stage are composed of
machine instructions that the processor understands, but some pieces
of the program are out of order or missing. The linker resolves all
the references in a set of object files or archives so that functions
in some pieces can successfully call functions in other ones, and then
produces an executable. To get the final executable, use the following
command; the -v option gives us detailed information about the linking
process.
gcc -v -o test.elf test.o
We can also invoke the linker separately using ld to get the final
executable:
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
STARTFILES="$GLIBC_LIB_DIR/crt1.o $GLIBC_LIB_DIR/crti.o"
ENDFILES="$GLIBC_LIB_DIR/crtn.o"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 $STARTFILES test.o $GLIBC_LIB_DIR/libc.so $ENDFILES
The final executable is also an ELF file.
ELF
Executable and Linkable Format (ELF) is a common standard file format used on UNIX-like systems for executable files, object code, shared libraries, and core dumps.
Execution
At first glance, it seems that an executing program starts at
int main(int argc, char *argv[]). However, that is not quite true.
Load Executable with Interpreter
First, when we try to run a program, an execve system call is made to
the kernel. The kernel allocates a linux_binprm structure for the new
process, opens the executable file from disk, and finds the
corresponding interpreter (binary format handler) for the executable.
Our C program is an executable in ELF format, so it is then executed
with the ELF loader.
Load Dynamic Linker
The ELF loader
read program headers table of executable which contains a field
INTERP
. For dynamically linked program INTERP
is the path to
dynamic linker. We can use
readelf --program-headers test.elf
to see the program
headers table and use readelf -x .interp test.elf
to see
the value of INTERP
, its value is
/lib64/ld-linux-x86-64.so.2
in my machine. The kernel
opens
and
reads the dynamic linker executable in ELF format.
Auxiliary Vector
The kernel uses a special structure called the auxiliary vector, or
auxv, to communicate with the dynamic linker. The kernel prepares auxv
and passes it by placing it on the stack of the newly created process.
Thus, when the dynamic linker starts, it can use its stack pointer to
find all the startup information it requires. The auxv contains
system-specific information, such as the default size of a virtual
memory page on the system or the hardware capabilities. We can ask the
dynamic linker to show some debugging output of the auxv by setting
the environment variable LD_SHOW_AUXV=1.
Call Dynamic Linker with Program Entry Point
The kernel reads the e_entry field from the ELF header of our program
executable, which contains the entry point address; by default this is
the address of the symbol _start. We can examine the entry point with
objdump -f test.elf, and we can use the --entry=<symbol name> option
of ld to change the entry point to another symbol. The kernel adds the
value of e_entry to auxv (as the AT_ENTRY entry) and then starts
execution from the entry point address specified in the dynamic
linker's own ELF header.
Dynamic Linker
Investigating the dynamic linker with the commands
objdump -f /lib64/ld-linux-x86-64.so.2
and
objdump --disassemble --section=.text /lib64/ld-linux-x86-64.so.2,
we find that the entry point of the dynamic linker is labeled as the
function _dl_rtld_di_serinfo. The dynamic linker performs linking on
the fly by loading the libraries listed in the dynamic section of the
program executable, and then continues execution from our program's
entry point address, which was passed in.
Kernel Library
Making a system call by triggering a trap in the processor is slow. To
avoid this overhead, the kernel loads a shared library (ref: #1, #2,
#3) into the address space of every newly created process; it contains
functions that make system calls for you. When the kernel starts the
dynamic linker, it adds an AT_SYSINFO_EHDR entry to the auxv structure
(ref: #1, #2), which holds the memory address at which the special
kernel library lives. When the dynamic linker starts, it can look for
the AT_SYSINFO_EHDR pointer and, if found, load that library for the
program. The program has no idea this library exists; this is a
private arrangement between the dynamic linker and the kernel.
Programmers make system calls indirectly by calling functions in the standard C library. The standard C library can check whether the special kernel library is loaded and, if so, use the functions within it to make system calls. If the kernel determines the hardware is capable, this will use the fast system call method.
The role of the _start function
As you might have already noticed, in the linking section we had to
include some extra files. This is because the symbol _start is defined
in crt1.o (some systems use crt0.o, some use crt1.o, and a few even
use crt2.o or higher). It takes care of bootstrapping the initial
execution of the program, e.g. setting up arguments and preparing
environment variables for program execution. What exactly that entails
is highly libc implementation dependent. These objects are provided by
different implementations of libc and cannot be mixed with other ones.
The following is the disassembly of _start, obtained with
objdump --disassemble=_start test.elf:
0000000000401040 <_start>:
401040: 31 ed xor %ebp,%ebp
401042: 49 89 d1 mov %rdx,%r9
401045: 5e pop %rsi
401046: 48 89 e2 mov %rsp,%rdx
401049: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40104d: 50 push %rax
40104e: 54 push %rsp
40104f: 49 c7 c0 10 11 40 00 mov $0x401110,%r8 # __libc_csu_fini
401056: 48 c7 c1 b0 10 40 00 mov $0x4010b0,%rcx # __libc_csu_init
40105d: 48 c7 c7 71 10 40 00 mov $0x401071,%rdi # our main function
401064: ff 15 86 2f 00 00 callq *0x2f86(%rip) # 403ff0 <__libc_start_main@GLIBC_2.2.5>
40106a: f4 hlt
On glibc 2.31, _start initializes very early ABI requirements (like
the stack or frame pointer), sets up the argc/argv/env values, and
then passes pointers to __libc_csu_init, __libc_csu_fini and the main
function to __libc_start_main, which in turn does more general
bootstrapping before finally calling the real main function.
The implementation of __libc_start_main is quite complicated, as it
needs to be portable across the very wide range of systems and
architectures that glibc can run on. It does a number of specific
things related to setting up the C library which most programmers
don’t need to worry about.
Initialization and Termination Routines
init and fini are two special pieces of code in shared libraries that
may need to be called when the library is loaded and before the
library is unloaded, respectively. This can be useful for library
programmers to set up variables when the library is started, or to
clean up at the end. __libc_start_main calls __libc_csu_init before
calling our main function and registers __libc_csu_fini with
__cxa_atexit as a callback to be called before the program exits. What
__libc_csu_init/__libc_csu_fini do is simply loop over the list of
init/fini functions and invoke them.
In order to traverse the list of init functions, two symbols,
__init_array_start and __init_array_end, are defined during the
linking process and exported as part of the ELF symbol table .symtab.
We can use __attribute__((constructor)) and
__attribute__((destructor)) (ref: #1) to add initialization and
termination routines to our program, e.g.
void __attribute__((constructor)) program_init(void)
{
    printf("init\n");
}
void __attribute__((destructor)) program_fini(void)
{
    printf("fini\n");
}
In newer releases of glibc, the fini process was changed as part of
this commit.
Call Main Function
Once __libc_start_main
has completed with the
initialization it finally
calls the main function! Remember that it had the stack setup initially with the arguments
and environment pointers from the kernel; this is how main gets its
argc
, argv[]
,
envp[]
arguments.
Exit
When the main function returns, __libc_start_main calls
void exit(int exit_code) with the return value of the main function as
the exit code. The implementation of exit, after running the
registered exit handlers, triggers the exit_group syscall (ref: #1,
#2, #3, #4, #5, #6) to immediately stop the current process.
Writing program without startfiles
Now we know how the call to main is made. We can override the _start
function to make it call our main().
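#include <stdio.h>
#include <stdlib.h>
#define LOOP_TIMES 10
/* Declare main first so that _start can call it. */
int main(void);
/* Our own entry point, replacing the one from crt1.o. */
void _start(void)
{
    exit(main());
}
int main(void)
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}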
Now we have to force gcc to use our implementation of _start():
gcc -nostartfiles -o test.elf test.c
We can also manually invoke ld:
gcc -c -o test.o test.c
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 test.o $GLIBC_LIB_DIR/libc.so