One of the main features of any debugger is the ability to show the value of local variables and to show the list of active function calls, i.e. to show how we reached a certain statement in a function other than the main() function.

On this page we present the challenges that the debugger has to solve in order to report the values of local variables and the return address of each active function.

Stack Frames and the Frame Pointer

Most Application Binary Interfaces (ABIs) use the concept of a "frame" to specify the layout of certain regions in the stack area of a process. The frame concept serves two main purposes:

  •  The compiler can store the address of the beginning of the frame in a register (hence called the "Frame Pointer", or FP) and then use the based addressing mode to access memory locations in the procedure's frame. Typically, local variables and actual parameters are accessed through the frame pointer, as in the following example:

    ----- C code -----    --- Pseudo assembly ---

    f1(int a)
    {                     push   FP
                          FP = SP
                          SP -= #size_of_locals
       ...                ...
       int b = 1;         FP[-b_offset] = #1
       ...                ...
       return a + b;      R0  = FP[+a_offset]
                          R0 += FP[-b_offset]
       ...                ...
    }                     SP = FP
                          pop    FP
                          ret

    As one can see, local variables are allocated at addresses below the frame pointer, while parameters are accessed at addresses above the frame pointer.
  •  The debugger can use the Frame Pointer to locate the return address of the current function, as well as the saved Frame Pointer of its caller. Repeating this step frame by frame allows the debugger to report to the user which function called which other function. Using symbolic information provided by the compiler, the debugger can also compute the location of local variables, parameters and saved registers for each active function.

The following picture shows the data organization of a typical stack:

[Figure: Stack Frames]

When the compiler follows this convention for all functions, it's very easy for the debugger to show local variables and the chain of calls.
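
The walk along the chain of saved frame pointers can be sketched as follows. This is only an illustration, not the code of any particular debugger: it assumes 32-bit stack slots, a simulated copy of the stack in place of real target memory, and a saved FP of 0 marking the outermost frame. The names target_stack, read_word and backtrace_fp are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated target memory: a stack of 32-bit words, addressed by byte
 * offset from the start of the array (a stand-in for reading the
 * memory of the debugged process). */
typedef struct {
    const uint32_t *words;
    size_t          size_bytes;
} target_stack;

static uint32_t read_word(const target_stack *s, uint32_t addr)
{
    assert(addr % 4 == 0 && (size_t)addr + 4 <= s->size_bytes);
    return s->words[addr / 4];
}

/* Walk the chain of saved frame pointers starting at 'fp'.
 * Stores up to 'max' return addresses in 'out' and returns how many
 * were found. Assumption of this sketch: a saved FP of 0 marks the
 * outermost frame. */
size_t backtrace_fp(const target_stack *s, uint32_t fp,
                    uint32_t *out, size_t max)
{
    size_t n = 0;
    while (fp != 0 && n < max) {
        out[n++] = read_word(s, fp + 4); /* return address, just above the saved FP */
        fp = read_word(s, fp);           /* saved FP of the caller */
    }
    return n;
}
```

Each return address found this way can then be mapped to a function name through the symbol table, which gives the list of active calls.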

Frame-less Functions and the Virtual Frame Pointer

Setting up the frame at the entry point of a function and destroying the frame at the exit of the function may generate inefficient code.

In fact, the compiler knows exactly where to find local variables and parameters at any given instruction in the code, even when the stack pointer changes value within a function. The following code is an alternative to the code generated for the f1() function above:

----- C code -----    --- Pseudo assembly ---

f1(int a)
{                     SP -= #size_of_locals
   ...                ...
   int b = 1;         SP[+b_offset] = #1
   ...                ...
   return a + b;      R0  = SP[+a_offset]
                      R0 += SP[+b_offset]
   ...                ...
}                     SP += #size_of_locals
                      ret

As one can see, this approach requires fewer instructions than using the frame pointer; moreover, it frees up the frame pointer register for other uses, such as holding an additional local variable. Unfortunately this causes a problem for the debugger, because the debugger now has to know "size_of_locals" to be able to compute the location of the return address.

Another difference is that local variables and actual parameters are now both accessed using a positive offset from the value of the stack pointer. This would not be a big problem if the compiler always generated detailed information on the location of both local variables and actual parameters, but it does not. Some object formats (such as stabs) have no way to tell the debugger that there is no frame pointer; in this case, the debugger has to compute the location that the frame pointer would have had if it had been used by the compiler, and then apply the offset indicated in the object format to this virtual frame pointer.

In the case of the function f1() above, the debugger can analyze the entry code of the function and see that no frame pointer is used and record the value of #size_of_locals in the function's representation. We can then apply the following formula to access the location of variable 'b':

      location(b) = register(SP) + #size_of_locals - adjust(#b_offset)

This approach works only if the stack pointer is not changed within the body of the function. If the compiler changes the value of the stack pointer in other places beyond the entry code of the function, the debugger needs to become smarter and add one more term to the formula above: the additional space that has been allocated on the stack at the current PC.
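
As a rough sketch, the formula can be written in C as shown below. The text does not define adjust(), so this example assumes that it subtracts the size of the saved-FP slot that a prologue with a frame pointer would have pushed; that assumption, and the names frameless_info, return_address_location and local_location, are ours and purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* What the debugger recorded by analyzing the function's entry code. */
typedef struct {
    uint32_t size_of_locals; /* bytes subtracted from SP in the prologue */
    uint32_t fp_slot_bytes;  /* assumed adjust(): size of the saved-FP slot
                                the debug offsets were computed against */
} frameless_info;

/* The return address sits just above the locals. */
uint32_t return_address_location(const frameless_info *f, uint32_t sp)
{
    return sp + f->size_of_locals;
}

/* location(b) = register(SP) + #size_of_locals - adjust(#b_offset)
 * 'fp_offset' is the (positive) distance below the virtual frame
 * pointer reported by the object format. */
uint32_t local_location(const frameless_info *f, uint32_t sp,
                        uint32_t fp_offset)
{
    assert(fp_offset >= f->fp_slot_bytes);
    uint32_t adjusted = fp_offset - f->fp_slot_bytes;
    assert(adjusted <= f->size_of_locals);
    return sp + f->size_of_locals - adjusted;
}
```

With size_of_locals = 16, a 4-byte slot and SP = 0x1000, the return address is read from 0x1010 and a variable recorded at offset 8 is read from 0x100C. Both results are valid only while SP still has the value it had right after the entry code.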

Stack manipulation in optimized code

Consider the following code sequence:
 

----- C code -----    --- Pseudo assembly ---

f1(int a)
{                     SP -= #size_of_locals
   ...                ...
   f2(b);             push  SP[+b_offset]
                      call  f2
   f3(a);             push  SP[+a_offset+4]
                      call  f3
                      SP += 8
   ...                ...
}                     SP += #size_of_locals
                      ret

In this case, the compiler changes the stack pointer when passing parameters to other functions. In addition, the compiler optimizes the de-allocation of parameters across sequences of calls. This creates a difficult problem for the debugger, because now the debugger has to add the size of the additional space on the stack when computing the location of variable a at the point of call of f3().
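
One way a debugger might keep track of this is a per-function table that records, for each point where SP changes after the entry code, how many extra bytes are on the stack from that PC onward. The sketch below assumes such a table has already been built (from symbolic information or by analyzing the code); the type sp_adjust and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per point where SP changes after the prologue:
 * starting at 'pc', 'extra' bytes lie between SP and the locals. */
typedef struct {
    uint32_t pc;
    uint32_t extra;
} sp_adjust;

/* Return the extra stack bytes in effect at 'pc'.
 * 'table' must be sorted by ascending pc; PCs before the first
 * entry have no extra bytes. */
uint32_t extra_at_pc(const sp_adjust *table, size_t n, uint32_t pc)
{
    uint32_t extra = 0;
    for (size_t i = 0; i < n; i++) {
        assert(i == 0 || table[i - 1].pc < table[i].pc);
        if (table[i].pc > pc)
            break;
        extra = table[i].extra;
    }
    return extra;
}

/* Address of a variable whose offset from SP was recorded for the
 * state right after the prologue. */
uint32_t variable_location(uint32_t sp, uint32_t entry_offset,
                           uint32_t extra)
{
    return sp + entry_offset + extra;
}
```

For the sequence above, with made-up instruction addresses, the table would hold 4 extra bytes from the instruction after the first push, 8 from the instruction after the second push, and 0 again after "SP += 8". At the second push the lookup yields 4, which is exactly the "+4" the compiler added to a_offset.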

Alternative calling conventions

More cases have to be considered depending on the complexity of the calling conventions available to the compiler.

Here is an example of code that is very commonly found in Windows applications:

----- C code -----    --- Pseudo assembly ---

f1(int a)
{                     ESP -= #size_of_locals
   ...                ...
   f4(b);             push  ESP[+b_offset]
                      call  f4@4
   f5(a);             push  ESP[+a_offset]
                      call  f5
                      ESP += 4
   ...                ...
}                     ESP += #size_of_locals
                      ret

In this case, the user specified the Microsoft __stdcall calling convention for f4(), but kept the __cdecl calling convention for f5(). This means that the 'b' value pushed on the stack before calling f4() will be removed by f4() itself, whereas the 'a' value pushed on the stack before calling f5() must be removed by f1().

Here the debugger can only rely on symbolic information provided by the compiler; without it, the debugger has no way to determine which calling convention is being used for which call.
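
The effect of the two conventions on the stack can be sketched as follows, assuming the debugger already knows the convention and the size of the arguments for each call site. The types call_conv and call_site and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { CONV_CDECL, CONV_STDCALL } call_conv;

typedef struct {
    call_conv conv;
    uint32_t  arg_bytes; /* bytes of arguments pushed by the caller */
} call_site;

/* Bytes of pushed arguments still on the stack when the call returns. */
uint32_t args_left_after_call(call_conv conv, uint32_t arg_bytes)
{
    switch (conv) {
    case CONV_STDCALL: return 0;         /* the callee removed them */
    case CONV_CDECL:   return arg_bytes; /* the caller still has to pop them */
    }
    assert(0 && "unknown calling convention");
    return 0;
}

/* Bytes the caller has to pop after a run of calls whose clean-up
 * was deferred to the end of the run. */
uint32_t pending_cleanup(const call_site *calls, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += args_left_after_call(calls[i].conv, calls[i].arg_bytes);
    return total;
}
```

For the calls to f4() and f5() above this gives 4 bytes, matching the "ESP += 4" emitted after the call to f5(). If both calls were __cdecl it would give 8.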

Functions with variable number of arguments

The last case we consider is that of a function receiving a variable (or unknown) number of arguments.

On processors that only pass arguments on the stack, the debugger doesn't have to do anything special.

The case is different for processors (or calling conventions) that pass one or more arguments in registers. In this case, the compiler generates special code in the prologue of the function, or uses a special layout for the frame. Here is an example of what might happen:

----- C code -----    --- Pseudo assembly ---

f6(int a, ...)
                      PUSH   R4     // last param. reg
                      PUSH   R3
                      PUSH   R2
                      PUSH   R1     // first param: 'a'
{                     SP -= #size_of_locals
    va_list ap;
    int     x;

    va_start(ap, a);  SP[+offset_of_ap] = #4 // sizeof(a)
    ...
    x = va_arg(ap, int);
                      R1 = SP[+offset_of_ap] // byte offset of next param.
                      cmp    R1,#16 // size of saved regs
                      jge    from_stack
                      x = SP[+size_of_locals+R1]   // saved register
                      jmp    done
                 from_stack:
                      x = SP[+size_of_locals+R1+4] // skip return address
                 done:
                      SP[+offset_of_ap] += #4
    ...               ...
}                     SP += #size_of_locals+16
                      ret

In this example, up to 4 parameters are passed in registers (R1 - R4) and the rest are passed on the stack. However, since function f6() does not know how many parameters were actually passed, it must keep track of where to find the next parameter in the va_list variable "ap".

This means that while the "ap" counter is less than 16 (4 registers * 4 bytes each), f6() gets the next parameter from its own frame; once the "ap" counter reaches 16 or more, it gets the next parameter from the caller's frame (hence the additional 4 when accessing the stack, to skip over the return address).
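
A debugger that wants to display the variable arguments has to repeat the same computation. The sketch below simulates it on a copy of the stack. It assumes 4-byte parameters, "ap" held as a byte count, and the layout described above (locals, then the four saved registers, then the return address, then the caller's arguments). The type va_frame and the function next_int_arg are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SAVED_REG_BYTES = 16, RETURN_ADDR_BYTES = 4 };

typedef struct {
    const uint32_t *stack;          /* copy of the stack; word 0 is at SP */
    size_t          size_bytes;
    uint32_t        size_of_locals; /* bytes of locals above SP */
    uint32_t        ap;             /* bytes of parameters consumed so far */
} va_frame;

/* Fetch the next 4-byte parameter and advance the "ap" counter. */
uint32_t next_int_arg(va_frame *f)
{
    uint32_t off = f->size_of_locals + f->ap;
    if (f->ap >= SAVED_REG_BYTES)
        off += RETURN_ADDR_BYTES;   /* step over the return address into
                                       the caller's frame */
    assert(off % 4 == 0 && (size_t)off + 4 <= f->size_bytes);
    f->ap += 4;
    return f->stack[off / 4];
}
```

Starting with ap = 4 (that is, just past 'a'), the first three calls return the values saved from R2, R3 and R4, and the following calls return the values that the caller pushed on the stack.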

 


© 2007 Giampiero Caprino, Backer Street Software