Introduction
Compiling and executing a C program is a multi-stage process. In
this post I’ll walk through each stage of compiling and executing
the following C program, saved as test.c:
#include <stdio.h>
#define LOOP_TIMES 10
int main(int argc, char *argv[])
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}
Our tests were performed on Debian Bullseye (AMD64); intermediate results may vary depending on the OS and hardware.
Preprocessing
The first stage of compilation is called preprocessing. In this
stage, the C pre-processor handles pre-processor directives (lines
starting with a # character). These directives form a simple macro
language with its own syntax and semantics. This language is used
to reduce repetition in source code: lines with #include are
replaced by the contents of the referenced file (with different
search rules for names in quotes versus those in angle brackets),
names introduced with #define are systematically replaced with
their definitions throughout the program, #if and its relatives are
processed to conditionally omit code, etc.
To get the result of the preprocessing stage, we can pass the -E
option to gcc:
gcc -E -o test.i test.c
On my machine, the output of the preprocessing stage looks like this:
// ... omitted for brevity
# 873 "/usr/include/stdio.h" 3 4
# 2 "test.c" 2
# 5 "test.c"
int main(int argc, char *argv[])
{
for (int i = 0; i < 10; i++)
{
printf("Hello World #%i!\n", i);
}
return 0;
}
Compilation
In this stage, the actual compiler translates the pre-processed
source into assembly language, an intermediate human-readable
language. The existence of this step allows C code to contain
inline assembly instructions and allows different assemblers to be
used. To get the result of the compilation stage, pass the -S
option to gcc:
gcc -S -o test.s test.i
On my machine, the output of the compilation stage looks like this:
.file "test.c"
.text
.section .rodata
.LC0:
.string "Hello World #%i!\n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
movl -4(%rbp), %eax
movl %eax, %esi
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
addl $1, -4(%rbp)
.L2:
cmpl $9, -4(%rbp)
jle .L3
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 10.2.1-6) 10.2.1 20210110"
.section .note.GNU-stack,"",@progbits
Assembly
During this stage, the assembler converts the assembly language
source to an unlinked relocatable object file in ELF format. The
output contains the actual instructions to be run by the target
processor. However, an unlinked relocatable object file is not
executable yet: it may require definitions from other files,
including libraries. To get the result of the assembly stage, pass
the -c option to gcc:
gcc -c -o test.o test.s
Alternatively, we can invoke the assembler, as, directly:
as -o test.o test.s
Running the above command will produce an unlinked relocatable
object file in ELF format named test.o. We can inspect the ELF
sections with readelf -a test.o | less, and to see the content of
a specific section we can use readelf -x .text test.o.
Linking
The object files generated in the assembly stage are composed of
machine instructions that the processor understands, but some
pieces of the program are out of order or missing. The linker
resolves all the references in a set of object files or archives so
that functions in some pieces can successfully call functions in
other ones, and then produces an executable. To get the final
executable, use the following command; the -v option gives us
detailed information about the linking process:
gcc -v -o test.elf test.o
We can also invoke the linker, ld, manually to get the final executable:
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
STARTFILES="$GLIBC_LIB_DIR/crt1.o $GLIBC_LIB_DIR/crti.o"
ENDFILES="$GLIBC_LIB_DIR/crtn.o"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 $STARTFILES test.o $GLIBC_LIB_DIR/libc.so $ENDFILES
The final executable is also an ELF file.
ELF
Executable and Linkable Format (ELF) is a common standard file format used on Unix-like systems for executable files, object code, shared libraries, and core dumps.
Execution
At first, it seems that when a program is executed, it starts with
int main(int argc, char *argv[]); however, this is not quite true.
Load Executable with Interpreter
First, when we try to run a program, an execve system call is made
to the kernel. The kernel allocates a linux_binprm structure for a
new process, opens the executable file from disk, and finds the
corresponding interpreter for the executable. Our C program is an
executable in ELF format, so it is executed with the ELF loader.
Load Dynamic Linker
The ELF loader reads the program header table of the executable,
which contains an INTERP entry. For a dynamically linked program,
INTERP is the path to the dynamic linker. We can use
readelf --program-headers test.elf to see the program header table
and readelf -x .interp test.elf to see the value of INTERP; on my
machine it is /lib64/ld-linux-x86-64.so.2. The kernel then opens
and reads the dynamic linker, which is itself an executable in ELF
format.
Auxiliary Vector
The kernel uses a special structure called the auxiliary vector,
or auxv, to communicate with the dynamic linker. The kernel
prepares auxv and passes it by putting it on the stack of the newly
created program. Thus, when the dynamic linker starts, it can use
its stack pointer to find all the startup information required.
auxv contains system-specific information that may be required,
such as the default size of a virtual memory page on the system or
hardware capabilities. We can ask the dynamic linker to show some
debugging output of the auxv by setting the environment variable
LD_SHOW_AUXV=1.
Call Dynamic Linker with Program Entry Point
The kernel reads the e_entry field from the ELF header of our
program executable. It contains the entry point address, which by
default is the address of the symbol _start. We can examine the
entry point with objdump -f test.elf, and we can use the
--entry=<symbol name> option of ld to change the entry point to
another symbol.
The kernel adds the value of e_entry to auxv. It then starts
execution at the entry point address specified in the dynamic
linker's own ELF header.
Dynamic Linker
Investigating the dynamic linker with
objdump -f /lib64/ld-linux-x86-64.so.2 and
objdump --disassemble --section=.text /lib64/ld-linux-x86-64.so.2,
we find that the entry point of the dynamic linker is the function
_dl_rtld_di_serinfo. The dynamic linker does some linking on the
fly by loading any libraries specified in the dynamic section of
the program executable, and then continues execution from the
program entry point address that was passed in.
Kernel Library
Making a system call by triggering a trap to the processor is
slow. To avoid this overhead, the kernel loads a shared library
(ref: #1, #2, #3) into the address space of every newly created
process, which contains a function that makes system calls for
you. When the kernel starts the dynamic linker, it adds an
AT_SYSINFO_EHDR entry to the auxv structure (ref: #1, #2), which
is the address in memory where the special kernel library lives.
When the dynamic linker starts, it can look for the
AT_SYSINFO_EHDR pointer and, if found, load that library for the
program. The program has no idea this library exists; this is a
private arrangement between the dynamic linker and the kernel.
Programmers make system calls indirectly by calling functions in the standard C library. The standard C library can check whether the special kernel library is loaded and, if so, use the functions within it to make system calls. If the kernel determines that the hardware is capable, this will use the fast system call method.
The role of the _start function
As you might have already noticed, in the linking section we had
to include some extra files. This is because the symbol _start is
defined in crt1.o (some systems use crt0.o, some use crt1.o, and a
few even use crt2.o or higher). It takes care of bootstrapping the
initial execution of the program, e.g. setting up arguments and
preparing environment variables for program execution. What
exactly that entails is highly libc implementation dependent.
These objects are provided by each libc implementation, and
objects from different implementations cannot be mixed.
The following is the disassembly of _start, obtained with
objdump --disassemble=_start test.elf:
0000000000401040 <_start>:
401040: 31 ed xor %ebp,%ebp
401042: 49 89 d1 mov %rdx,%r9
401045: 5e pop %rsi
401046: 48 89 e2 mov %rsp,%rdx
401049: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40104d: 50 push %rax
40104e: 54 push %rsp
40104f: 49 c7 c0 10 11 40 00 mov $0x401110,%r8 # __libc_csu_fini
401056: 48 c7 c1 b0 10 40 00 mov $0x4010b0,%rcx # __libc_csu_init
40105d: 48 c7 c7 71 10 40 00 mov $0x401071,%rdi # our main function
401064: ff 15 86 2f 00 00 callq *0x2f86(%rip) # 403ff0 <__libc_start_main@GLIBC_2.2.5>
40106a: f4 hlt
On glibc 2.31, _start initializes very early ABI requirements
(like the stack or frame pointer), sets up the argc/argv/env
values, and then passes pointers to __libc_csu_init,
__libc_csu_fini and the main function to __libc_start_main, which
in turn does more general bootstrapping before finally calling the
real main function.
The implementation of __libc_start_main is quite complicated, as
it needs to be portable across the very wide range of systems and
architectures that glibc can run on. It does a number of specific
things related to setting up the C library which most programmers
don’t need to worry about.
Initialization and Termination Routines
init and fini are two special pieces of code in shared libraries
that may need to be called when the library is loaded and before
the library is unloaded, respectively. This is useful for library
programmers to set up variables when the library is started, or to
clean up at the end.
__libc_start_main calls __libc_csu_init before calling our main
function and registers __libc_csu_fini with __cxa_atexit as a
callback to be called before the program exits. What
__libc_csu_init and __libc_csu_fini do is simply loop over the
list of init/fini functions and invoke them.
In order to traverse the list of init functions, two symbols,
__init_array_start and __init_array_end, are defined during the
linking process and exported as part of the ELF symbol table
.symtab.
We can use __attribute__((constructor)) and
__attribute__((destructor)) (ref: #1) to add initialization and
termination routines to our program, e.g.
void __attribute__((constructor)) program_init(void)
{
    printf("init\n");
}
void __attribute__((destructor)) program_fini(void)
{
    printf("fini\n");
}
In newer releases of glibc, the fini process was changed as part
of this commit.
Call Main Function
Once __libc_start_main has completed the initialization, it
finally calls the main function! Remember that the stack was
initially set up with the argument and environment pointers from
the kernel; this is how main gets its argc, argv[] and envp[]
arguments.
Exit
When the main function returns, __libc_start_main calls
void exit(int exit_code) with the return value of the main
function as the exit code. The implementation of exit eventually
triggers the exit_group syscall (ref: #1, #2, #3, #4, #5, #6),
which immediately stops the current process.
Writing program without startfiles
Now we know how the call to main is made. We can override the
_start function to make it call our main().
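#include <stdio.h>
#include <stdlib.h>
#define LOOP_TIMES 10
/* Declare main before _start calls it. */
int main(void);
void _start(void)
{
    /* Pass the return value of main to exit, as the start files would. */
    exit(main());
}
int main(void)
{
    for (int i = 0; i < LOOP_TIMES; i++)
    {
        printf("Hello World #%i!\n", i);
    }
    return 0;
}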
Now we have to force gcc to use our implementation of _start():
gcc -nostartfiles -o test.elf test.c
We can also manually invoke ld:
gcc -c -o test.o test.c
GLIBC_LIB_DIR="/usr/lib/x86_64-linux-gnu"
GCC_LIB_DIR="/usr/lib/gcc/x86_64-linux-gnu/10"
ld -o test.elf -dynamic-linker /lib64/ld-linux-x86-64.so.2 test.o $GLIBC_LIB_DIR/libc.so
Reference
- The Four Stages of Compiling a C Program
- Computer Science from the Bottom Up - Chapter 8. Behind the process - Starting a process
- What happens when you compile?
- GAS: Explanation of .cfi_def_cfa_offset
- segfault when linking with ld
- How to build a C program using a custom version of glibc and static linking?
- Linking a C program directly with ld fails with undefined reference to __libc_csu_fini
- Linking a dynamically linked executable with ld
- What is the difference between crtbegin.o, crtbeginT.o and crtbeginS.o?
- ld(1) - Linux man page
- readelf(1) — Linux manual page
- GCC - Options for Linking: -nostartfiles
- GCC Options for Code Generation Conventions
- gcc(1) — Linux manual page
- When is the gcc flag -nostartfiles used?
- How do I tell GCC not to link with the runtime library and the standard library?
- Executing main() in C/C++ – behind the scene
- objdump(1) - Linux man page
- Objcopy elf to bin file
- objcopy(1) - Linux man page
- BFD
- Wikipedia: Executable and Linkable Format
- elf(5) — Linux manual page
- elf.h
- How can I examine contents of a data section of an ELF file on Linux?
- exec(3) - Linux man page
- execve(2) — Linux manual page
- Executing a flat binary file under Linux
- load_flat_binary
- What is the difference between exit and return?
- What is the use of _start() in C?
- Syscall implementation of exit()