Introduction
Compiling and executing a C program is a multi-stage process. In this post I’ll walk through each stage of compiling and executing the following C program, saved as test.c:
#include <stdio.h>
#define LOOP_TIMES 10
int main(int argc, char *argv[])
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}
These tests were performed on Debian Bullseye (AMD64); intermediate results may vary depending on the OS and hardware.
Preprocessing
The first stage of compilation is called preprocessing. In this stage, the C preprocessor handles preprocessor directives (lines starting with a # character). These directives form a simple macro language with its own syntax and semantics, which is used to reduce repetition in source code. For example, lines with #include are replaced by the contents of the referenced file (with different search rules for names in quotes versus those in angle brackets), names introduced with #define are systematically replaced with their definitions throughout the program, and #if and its relatives are processed to conditionally omit code.
To get the result of the preprocessing stage, pass the -E option to gcc:
gcc -E -o test.i test.c
On my machine, the output of the preprocessing stage looks like this:
// ... omitted for brevity
# 873 "/usr/include/stdio.h" 3 4
# 2 "test.c" 2
# 5 "test.c"
int main(int argc, char *argv[])
{
    for (int i = 0; i < 10; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}
Compilation
In this stage, the compiler proper translates the preprocessed source into assembly language, an intermediate human-readable form. The existence of this step allows C code to contain inline assembly instructions and allows different assemblers to be used. To get the result of the compilation stage, pass the -S option to gcc:
gcc -S -o test.s test.i
On my machine, the output of the compilation stage looks like this:
.file "test.c"
.text
.section .rodata
.LC0:
.string "Hello World #%i!\n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
movl -4(%rbp), %eax
movl %eax, %esi
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
addl $1, -4(%rbp)
.L2:
cmpl $9, -4(%rbp)
jle .L3
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 10.2.1-6) 10.2.1 20210110"
.section .note.GNU-stack,"",@progbits
Assembly
During this stage, the assembler converts the assembly language source into an unlinked relocatable object file in ELF format. The output contains the actual instructions to be run by the target processor. However, an unlinked relocatable object file is not yet executable: it may require definitions from other files, including libraries. To get the result of the assembly stage, pass the -c option to gcc:
gcc -c -o test.o test.s
Alternatively, we can invoke the assembler, as, directly:
as -o test.o test.s
Either command produces an unlinked relocatable object file in ELF format named test.o. We can inspect its ELF sections with readelf -a test.o | less, and dump the contents of a specific section with readelf -x .text test.o.
Linking
The object files generated in the assembly stage are composed of machine instructions that the processor understands, but some pieces of the program are out of order or missing. The linker resolves all the references in a set of object files or archives so that functions in one piece can successfully call functions in another, and then produces an executable. To get the final executable, use the following command; the -v option prints detailed information about the linking process:
gcc -v -o test.elf test.o
We can also invoke the linker, ld, directly to get the final executable:
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
STARTFILES="$GLIBC_LIB_DIR/crt1.o $GLIBC_LIB_DIR/crti.o"
ENDFILES="$GLIBC_LIB_DIR/crtn.o"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 $STARTFILES test.o $GLIBC_LIB_DIR/libc.so $ENDFILES
The final executable is also an ELF file.
ELF
Executable and Linkable Format (ELF) is a common standard file format used on Unix-like systems for executable files, object code, shared libraries, and core dumps.
Execution
At first glance, it seems that a program starts executing at int main(int argc, char *argv[]). However, that is not quite true.
Load Executable with Interpreter
First, when we run a program, an execve system call is issued to the kernel. The kernel allocates a linux_binprm structure for the program being executed, opens the executable file from disk, and finds the corresponding interpreter for the executable. Since our C program is an executable in ELF format, it is handled by the ELF loader.
Load Dynamic Linker
The ELF loader reads the program header table of the executable, which contains an INTERP entry. For a dynamically linked program, INTERP holds the path to the dynamic linker. We can use readelf --program-headers test.elf to see the program header table and readelf -x .interp test.elf to see the value of INTERP, which is /lib64/ld-linux-x86-64.so.2 on my machine. The kernel then opens and reads the dynamic linker, which is itself a file in ELF format.
Auxiliary Vector
The kernel uses a special structure called the auxiliary vector, or auxv, to communicate with the dynamic linker. The kernel prepares auxv and passes it by placing it on the stack of the newly created program. Thus, when the dynamic linker starts, it can use its stack pointer to find all the startup information it requires. The auxv contains system-specific information that may be needed, such as the default size of a virtual memory page on the system or the hardware capabilities. We can ask the dynamic linker to print the auxv as debugging output by setting the environment variable LD_SHOW_AUXV=1.
Call Dynamic Linker with Program Entry Point
The kernel reads the e_entry field from the ELF header of our program executable. This field contains the entry point address, which by default is the address of the symbol _start. We can examine the entry point with objdump -f test.elf, and we can use the --entry=<symbol name> option of ld to change the entry point to another symbol.
The kernel adds the value of e_entry to auxv. It then starts execution at the entry point address specified by the dynamic linker's own ELF header.
Dynamic Linker
Inspecting the dynamic linker with objdump -f /lib64/ld-linux-x86-64.so.2 and objdump --disassemble --section=.text /lib64/ld-linux-x86-64.so.2, we find that objdump lists its entry point under the symbol _dl_rtld_di_serinfo. The dynamic linker performs linking on the fly: it loads the libraries listed in the dynamic section of our ELF executable, and then continues execution at the entry point address of our program, which was passed in.
Kernel Library
Making a system call by triggering a trap in the processor is slow. To avoid this overhead, the kernel loads a shared library (ref: #1, #2, #3) into the address space of every newly created process; it contains functions that make system calls for you. When the kernel starts the dynamic linker, it adds an AT_SYSINFO_EHDR entry to the auxv structure (ref: #1, #2), which holds the memory address at which this special kernel library lives. When the dynamic linker starts, it can look for the AT_SYSINFO_EHDR pointer and, if it is found, load that library for the program. The program has no idea this library exists; this is a private arrangement between the dynamic linker and the kernel.
Programmers make system calls indirectly by calling functions in the standard C library. The standard C library can check whether the special kernel library is loaded and, if so, use the functions within it to make system calls. If the kernel determines that the hardware is capable, this will use the fast system call method.
The role of the _start function
As you might have noticed, in the linking section we had to include some extra files. This is because the symbol _start is defined in crt1.o (some systems use crt0.o, some use crt1.o, and a few even use crt2.o or higher). It takes care of bootstrapping the initial execution of the program, e.g. setting up arguments and preparing environment variables. What exactly that entails is highly dependent on the libc implementation. These objects are provided by each libc implementation and cannot be mixed with those of another.
The following is the disassembly of _start, obtained with objdump --disassemble=_start test.elf:
0000000000401040 <_start>:
401040: 31 ed xor %ebp,%ebp
401042: 49 89 d1 mov %rdx,%r9
401045: 5e pop %rsi
401046: 48 89 e2 mov %rsp,%rdx
401049: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40104d: 50 push %rax
40104e: 54 push %rsp
40104f: 49 c7 c0 10 11 40 00 mov $0x401110,%r8 # __libc_csu_fini
401056: 48 c7 c1 b0 10 40 00 mov $0x4010b0,%rcx # __libc_csu_init
40105d: 48 c7 c7 71 10 40 00 mov $0x401071,%rdi # our main function
401064: ff 15 86 2f 00 00 callq *0x2f86(%rip) # 403ff0 <__libc_start_main@GLIBC_2.2.5>
40106a: f4 hlt
On glibc 2.31, _start sets up the very early ABI requirements (such as the stack and the frame pointer), prepares the argc/argv/env values, and then passes pointers to __libc_csu_init, __libc_csu_fini, and main to __libc_start_main, which in turn does more general bootstrapping before finally calling the real main function.
The implementation of __libc_start_main is quite complicated, as it needs to be portable across the very wide range of systems and architectures that glibc can run on. It does a number of specific things related to setting up the C library that most programmers don’t need to worry about.
Initialization and Termination Routines
init and fini are two special pieces of code in shared libraries that may need to be called when the library is loaded and before it is unloaded, respectively. This is useful for library programmers who need to set up variables when the library starts or to clean up at the end. __libc_start_main calls __libc_csu_init before calling our main function and registers __libc_csu_fini with __cxa_atexit as a callback to run before the program exits. __libc_csu_init and __libc_csu_fini simply loop over the list of init/fini functions and invoke them.
To traverse the list of init functions, two symbols, __init_array_start and __init_array_end, are defined during linking and exported as part of the ELF symbol table .symtab.
We can use __attribute__((constructor)) and __attribute__((destructor)) (ref: #1) to add initialization and termination routines to our program, e.g.
void __attribute__((constructor)) program_init(void)
{
    printf("init\n");
}
void __attribute__((destructor)) program_fini(void)
{
    printf("fini\n");
}
In newer releases of glibc, the handling of fini was changed as part of this commit.
Call Main Function
Once __libc_start_main has completed the initialization, it finally calls the main function! Remember that the stack was initially set up with the argument and environment pointers from the kernel; this is how main gets its argc, argv[], and envp[] arguments.
Exit
When the main function returns, __libc_start_main calls void exit(int exit_code) with the return value of main as the exit code. After running the registered exit handlers, exit issues the exit_group system call (ref: #1, #2, #3, #4, #5, #6) to terminate the current process immediately.
Writing program without startfiles
Now we know how the call to main is made. We can provide our own _start function and have it call our main().
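#include <stdio.h>
#include <stdlib.h>
#define LOOP_TIMES 10
/* Declare main before _start uses it; implicit declarations are invalid since C99. */
int main(void);
/* Custom entry point: with -nostartfiles there is no crt1.o to call main for us. */
void _start(void)
{
    /* _start must never return, so hand the result of main to exit(). */
    exit(main());
}
int main(void)
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}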
Now we have to tell gcc to use our implementation of _start() instead of the standard startup files:
gcc -nostartfiles -o test.elf test.c
We can also invoke ld manually:
gcc -c -o test.o test.c
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 test.o $GLIBC_LIB_DIR/libc.so
Reference
- The Four Stages of Compiling a C Program
- Computer Science from the Bottom Up - Chapter 8. Behind the process - Starting a process
- What happens when you compile?
- GAS: Explanation of .cfi_def_cfa_offset
- segfault when linking with ld
- How to build a C program using a custom version of glibc and static linking?
- Linking a C program directly with ld fails with undefined reference to __libc_csu_fini
- Linking a dynamically linked executable with ld
- What is the difference between crtbegin.o, crtbeginT.o and crtbeginS.o?
- ld(1) - Linux man page
- readelf(1) — Linux manual page
- GCC - Options for Linking: -nostartfiles
- GCC Options for Code Generation Conventions
- gcc(1) — Linux manual page
- When is the gcc flag -nostartfiles used?
- How do I tell GCC not to link with the runtime library and the standard library?
- Executing main() in C/C++ – behind the scene
- objdump(1) - Linux man page
- Objcopy elf to bin file
- objcopy(1) - Linux man page
- BFD
- Wikipedia: Executable and Linkable Format
- elf(5) — Linux manual page
- elf.h
- How can I examine contents of a data section of an ELF file on Linux?
- exec(3) - Linux man page
- execve(2) — Linux manual page
- Executing a flat binary file under Linux
- load_flat_binary
- What is the difference between exit and return?
- What is the use of _start() in C?
- Syscall implementation of exit()