80x86 Calling Convention

This section summarizes important points of the convention used for normal function calls on 32-bit 80x86 implementations of Unix. Some details are omitted for brevity. For a description of the x86 registers, please see here.

Calling a Function

The calling convention works like this:

  1. The caller pushes each of the function’s arguments on the stack one by one, normally using the push x86 instruction. Arguments are pushed in right-to-left order. The stack grows downward: each push decrements the stack pointer, then stores into the location it now points to, like the C expression *--esp = value.
  2. The caller pushes the address of its next instruction (the return address) on the stack and jumps to the first instruction of the callee. A single 80x86 instruction, call, does both.
  3. The callee executes. When it takes control, the stack pointer points to the return address, the first argument is just above it (at the next higher address), the second argument is just above the first argument, and so on.
  4. If the callee has a return value, it stores it into register eax.
  5. The callee returns by popping the return address from the stack and jumping to the location it specifies, using the x86 ret instruction.
  6. The caller pops the arguments off the stack, normally using the x86 pop instruction. A pop is the exact opposite of a push: it reads the value from the location the stack pointer refers to, then increments the stack pointer, like the C expression value = *esp++.
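
The six steps can be mimicked in plain C on a simulated stack. This is only a sketch: struct machine, push, pop, callee_f, call_f, and the placeholder return address 0xdeadbeef are hypothetical names and values made up for illustration, and a real compiler emits push, call, and ret instructions rather than anything like this. The sample callee computes a*100 + b*10 + c so that the order of the arguments is visible in the result.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical simulated machine: a small stack of 32-bit words plus the
   two registers this convention relies on. */
struct machine {
    uint32_t stack[STACK_WORDS];
    uint32_t *esp;   /* simulated stack pointer */
    uint32_t eax;    /* simulated return-value register */
};

static void machine_init(struct machine *m) {
    m->esp = m->stack + STACK_WORDS;  /* empty stack: esp is just past the end */
    m->eax = 0;
}

/* push: decrement the stack pointer first, then store (*--esp = value). */
static void push(struct machine *m, uint32_t value) {
    assert(m->esp > m->stack);        /* room left on the simulated stack */
    *--m->esp = value;
}

/* pop: load first, then increment the stack pointer (value = *esp++). */
static uint32_t pop(struct machine *m) {
    assert(m->esp < m->stack + STACK_WORDS);
    return *m->esp++;
}

/* Hypothetical callee f(a, b, c). On entry esp[0] holds the return address,
   esp[1] the first argument, esp[2] the second, and esp[3] the third. */
static void callee_f(struct machine *m) {
    m->eax = m->esp[1] * 100 + m->esp[2] * 10 + m->esp[3];  /* step 4 */
}

/* Steps 1-6 for the call f(a, b, c). */
static uint32_t call_f(struct machine *m, uint32_t a, uint32_t b, uint32_t c) {
    push(m, c);                 /* step 1: arguments, right to left */
    push(m, b);
    push(m, a);
    push(m, 0xdeadbeefu);       /* step 2: placeholder return address */
    callee_f(m);                /* steps 3-4: callee runs, result in eax */
    (void) pop(m);              /* step 5: ret pops the return address */
    (void) pop(m);              /* step 6: caller pops the three arguments */
    (void) pop(m);
    (void) pop(m);
    return m->eax;
}
```

After machine_init, calling call_f(&m, 1, 2, 3) leaves the simulated stack pointer where it started, which is the property the caller's cleanup in step 6 is meant to guarantee.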

Consider a function f that takes three int arguments. This diagram shows a sample stack frame as seen by the callee at the beginning of step 3 above, supposing that f is invoked as f(1, 2, 3). The initial stack address is arbitrary:

                             +----------------+
                  0xbffffe7c |        3       |
                  0xbffffe78 |        2       |
                  0xbffffe74 |        1       |
stack pointer --> 0xbffffe70 | return address |
                             +----------------+

The Called Function

Based on the above, the called function follows a standard structure of its own, which gives it a stable way to access the arguments passed by the caller. A called function typically does the following:

  1. The prolog code saves the caller's %ebp by pushing it onto the stack with the x86 instruction push %ebp.
  2. The current value of the stack pointer %esp is copied into %ebp, the stack base pointer. All subsequent accesses to the function's stack frame are made relative to %ebp.
  3. The stack pointer %esp is adjusted as needed by the function's own operations (allocating local variables, calling other functions, etc.).
  4. Other operations specific to the function are performed.
  5. Just before returning, the function executes the x86 leave instruction, which restores the stack pointer %esp from %ebp and then pops the previous value of %ebp from the stack.
  6. The function executes the ret instruction to return to its caller.
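
The effect of these steps on %esp and %ebp can be traced with a small C model. This is a sketch under stated assumptions, not how any real function is implemented: struct cpu, word_at, prologue, leave, and ret are hypothetical helpers, and STACK_TOP is chosen as 0xbffffe80 only so that the addresses line up with the diagrams in this section.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0xbffffe80u  /* assumed address just above the first pushed word */
#define STACK_WORDS 32u

/* Hypothetical model: esp and ebp hold simulated addresses, and mem[i]
   stands for the 32-bit word at address STACK_TOP - 4 * (i + 1). */
struct cpu {
    uint32_t mem[STACK_WORDS];
    uint32_t esp, ebp;
};

/* Translate a simulated stack address into a pointer into mem. */
static uint32_t *word_at(struct cpu *c, uint32_t addr) {
    assert(addr % 4 == 0);
    assert(addr < STACK_TOP && addr >= STACK_TOP - 4 * STACK_WORDS);
    return &c->mem[(STACK_TOP - addr) / 4 - 1];
}

static void push(struct cpu *c, uint32_t value) {
    c->esp -= 4;                      /* decrement, then store */
    *word_at(c, c->esp) = value;
}

static uint32_t pop(struct cpu *c) {
    uint32_t value = *word_at(c, c->esp);
    c->esp += 4;                      /* load, then increment */
    return value;
}

/* Steps 1-3: save the old ebp, set up the new one, reserve local space. */
static void prologue(struct cpu *c, uint32_t local_bytes) {
    push(c, c->ebp);                  /* push %ebp */
    c->ebp = c->esp;                  /* mov %esp, %ebp */
    c->esp -= local_bytes;            /* sub $local_bytes, %esp */
}

/* Step 5: leave behaves like mov %ebp, %esp followed by pop %ebp. */
static void leave(struct cpu *c) {
    c->esp = c->ebp;                  /* discards the local variables */
    c->ebp = pop(c);                  /* restores the caller's ebp */
}

/* Step 6: ret pops the return address; the model just hands it back. */
static uint32_t ret(struct cpu *c) {
    return pop(c);
}
```

Starting from esp equal to STACK_TOP, pushing 3, 2, 1, and a return address, then calling prologue, leaves ebp at 0xbffffe6c. A following leave and ret bring esp back to 0xbffffe74, the address of the first argument, whatever amount of local space the prologue reserved.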

This diagram shows the stack layout immediately after steps 1 and 2 of the prolog, before any space has been allocated for local variables:

                             +----------------+
                  0xbffffe7c |        3       |
                  0xbffffe78 |        2       |
                  0xbffffe74 |        1       |
                  0xbffffe70 | return address |
                             +----------------+
stack pointer --> 0xbffffe6c |  previous ebp  | <-- ebp points here as well
                             +----------------+

Such a layout allows the function arguments to be accessed relative to %ebp, which has the value 0xbffffe6c in the case shown above. For instance, loading the second argument (the 2) into %eax looks like: mov 0xc(%ebp),%eax (add 12 to %ebp and use the result, 0xbffffe78, as the address of the memory location the value should be loaded from).
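
The offsets follow directly from the layout: the saved %ebp and the return address occupy the first 8 bytes starting at %ebp, so the n-th argument sits at offset 8 + 4 * (n - 1). The sketch below expresses this in C. It assumes every argument occupies exactly one 4-byte stack slot, and struct frame, arg_offset, and load_arg are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical overlay of the memory starting at the address held in %ebp,
   lowest address first, for a function taking three int arguments. */
struct frame {
    uint32_t saved_ebp;       /*  0(%ebp) */
    uint32_t return_address;  /*  4(%ebp) */
    uint32_t args[3];         /*  8(%ebp), 12(%ebp), 16(%ebp) */
};

/* Byte offset from %ebp of the n-th argument, counting from 1. */
static size_t arg_offset(unsigned n) {
    assert(n >= 1 && n <= 3);
    return offsetof(struct frame, args) + (n - 1) * sizeof(uint32_t);
}

/* Models mov OFFSET(%ebp), %eax against a frame held in ordinary memory:
   add the offset to the base address and load one 32-bit word from there. */
static uint32_t load_arg(const struct frame *ebp, unsigned n) {
    const unsigned char *base = (const unsigned char *) ebp;
    uint32_t value;
    memcpy(&value, base + arg_offset(n), sizeof value);
    return value;
}
```

With a frame filled in as { saved ebp, return address, { 1, 2, 3 } }, arg_offset(2) is 12 (0xc) and load_arg(&frame, 2) yields 2, matching the mov instruction above.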