From Source Code to Executable
Now that you’ve seen how map works, let’s take a dive into how we went from high-level C code to an executable.
There are 10 written questions for this section, and you must submit your responses to these questions in the file results/answers.md.
Before we start, note that we’ll be using a few compiler flags which are likely new to you. Here’s a summary of the flags we’ll be using.
-Wall - Enables a broad set of commonly useful compiler warnings (despite the name, not every warning).
-m32 - Compiles the code for the i386 (32-bit x86) architecture.
-S - Invokes the COMPILER only, stopping before the assembler.
-c - Invokes the COMPILER and ASSEMBLER only, stopping before the linker.
-o - Specifies the name of the generated output file.
-g - Adds debug symbol information to the generated object and executable files. This is necessary to be able to debug your executable.
Let’s start by invoking the compiler. The compiler takes high-level C code and produces assembly for 32-bit x86, an architecture also known as i386 or IA-32.
To compile map.c, run:
gcc -m32 -S -o map.S map.c
This will only invoke the compiler for map.c and output the assembly code in map.S.
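Before reading the assembly for the assignment files, it can help to practice on something smaller. The file below is a hypothetical warm-up (the name tiny.c and both functions are made up for illustration and are not part of the assignment). It is only a sketch: one function makes no calls and the other calls the first, so comparing their assembly shows what a function call adds.

```c
/* tiny.c: hypothetical practice file, not part of the assignment. */

/* A leaf function: it performs arithmetic and makes no calls. */
int add_one(int x) {
    return x + 1;
}

/* A non-leaf function: it calls add_one twice, so its assembly
 * contains everything needed to pass an argument and make a call. */
int add_two(int x) {
    return add_one(add_one(x));
}
```

Running gcc -m32 -S -o tiny.S tiny.c on it uses the same flags as above. Since the file has no main, it can be compiled and assembled but not linked into an executable on its own.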
Q1. Generate recurse.S and find which instruction(s) correspond to the recursive call of recur(i - 1). Copy those into your answer.
Now we will assemble our compiled code into an object file. To assemble our code, we can run:
gcc -m32 -c map.S -o map.o
This turns our raw x86 assembly code (map.S) into machine code stored in an object file (map.o).
We can also combine these steps by just running gcc -m32 -c on our C file directly. We can run:
gcc -m32 -c recurse.c -o recurse.o
The assembler converts the raw assembly code into an object file that contains code as well as other data and metadata necessary for execution. Different operating systems use different types of object files. In this class, we will be using ELF (Executable and Linkable Format), the object format used by Linux. Let’s start by taking a look at map.o and recurse.o. These are object files, so we will use the objdump program to read them.
objdump -D map.o
objdump -D recurse.o
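One concrete property of the ELF format is that every ELF file, object file or executable, begins with the same four bytes: 0x7f followed by the letters E, L, and F. As a minimal sketch of what a tool like objdump has to check first, the code below tests for that magic number. The function names has_elf_magic and file_is_elf are made up for this illustration and do not come from any library.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if buf starts with the 4-byte ELF magic number, else 0.
 * (Hypothetical helper written for this example.) */
int has_elf_magic(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 if the file at path starts with the ELF magic number.
 * Returns 0 otherwise, including when the file cannot be opened
 * or is shorter than four bytes. */
int file_is_elf(const char *path) {
    unsigned char head[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return 0;
    }
    size_t n = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_elf_magic(head, n);
}
```

Under these assumptions, calling file_is_elf("map.o") after the assembly step would be expected to return 1, while calling it on map.c or map.S would return 0, since those are plain text files.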
Q2. What do the .text and .data sections contain?
The assembler generates a symbol table which is part of the object file. The symbol table contains all the symbols that can be globally referenced (referenced outside the object file) from another object file (for example, global variables and non-static functions), along with the symbols this file uses but does not define.
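To make the idea of a symbol more concrete, here is a small sketch of a source file that mixes several kinds of declarations. It is a hypothetical example (the file name symbols.c and every identifier in it except malloc are invented here, and it is not one of the assignment files). The comments describe the C linkage of each name, which is what determines whether another object file can refer to it.

```c
/* symbols.c: hypothetical example, not one of the assignment files. */
#include <stdlib.h>

/* External linkage: other object files can refer to this variable. */
int counter = 7;

/* Internal linkage: only code in this file can refer to this variable. */
static int calls = 0;

/* Internal linkage: a helper that other object files cannot call. */
static int bump(void) {
    calls++;
    return calls;
}

/* External linkage: defined here and callable from other object files.
 * malloc is only declared here (via stdlib.h). Its definition lives in
 * the C library, so this file refers to it without defining it. */
int *make_counter(void) {
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = counter + bump();
    }
    return p;
}
```

Compiling this file with -c and inspecting the resulting object file is one way to see how each kind of declaration shows up in a symbol table, and which names are left for the linker to resolve.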
Q3. What command do we use to view the symbols in an ELF file? (Hint: We can use objdump again; search for objdump to find the right flag.)
Here’s an excerpt from the map.o symbol table:
00000000 g O .data 00000004 stuff
00000000 g F .text 00000060 main
...
00000000 *UND* 00000000 malloc
00000000 *UND* 00000000 recur
Q4. What do the g, O, F, and *UND* flags mean?
Q5. Where else can we find a symbol for recur? Which file is this in? Copy and paste the relevant portion of the symbol table.
Finally, let’s link our two object files to create an executable.
gcc -m32 map.o recurse.o -o map
Note that we could’ve just called gcc -m32 map.c recurse.c -o map on the C files to do this entire process in a single command. Build systems often run these steps separately in order to speed up builds, since only the files that changed need to be recompiled before linking again.
Q6. Examine the symbol table of the entire map program now. What has changed?
objdump can be used to look at more than just the symbol table; it can also show us the structure of the executable. Run objdump -x -d map. You will see that your program has several segments. Names of functions and variables in your program correspond to labels with addresses or values, and the guts of the program are the chunks of bytes stored within these segments.
In the objdump output, these segments are under the section heading. There’s actually a slight nuance between these two terms, which you can read more about online.
Using the output of objdump, answer the following questions:
Q7. What segment(s)/section(s) contain recur (the function)? (The address of recur in objdump will not be exactly the same as what you saw in gdb. An optional stretch exercise is to think about why. Hint: See the Wikipedia article on relocation.)
Q8. What segment(s)/section(s) contain global variables? Hint: look for the variables foo and stuff.
Q9. Do you see the stack segment anywhere? What about the heap? Explain.
Q10. Based on the output of map, in which direction does the stack grow? Explain. Run ./map again if you don’t remember.