buffer overflows (week 2) Flashcards
what are some reasons you would get a segmentation fault?
a segfault happens when the program tries to access a memory address in a way that is not allowed (an invalid read, write, or instruction fetch)
- you try to access a memory address that has not been allocated, i.e. that is not mapped
- you try to write to a page that is read-only
- you try to load instructions for execution from a page that is non-executable
summarize the basic premise of control hijacking via buffer overflow
- incorrect input parsing can overwrite the program’s memory
- the overwritten memory can include the stack’s control data (the saved return address)
- when the function returns, it returns to a bogus address (the program counter, %eip, now points to it)
- this leads to a segfault when the CPU tries to load an instruction from invalid memory
- if the return address is changed to a VALID MEMORY ADDRESS that contains valid x86 instructions, the CPU will not trigger a segfault
- instead, the CPU will simply keep executing those instructions as if nothing had gone wrong
this is how an attacker can hijack and take control of a program
memory
stack, heap, code
3 important registers
- %eip = instruction pointer (attackers want this to point to their own code)
- %ebp = frame or base pointer (keeps track of the current stack frame, i.e. the currently executing function)
- %esp = stack pointer
What happens to the stack during a function call?
remember the stack grows by decrementing, i.e., starts at a higher address and grows lower
so, we push a and d onto the stack.
the base pointer points to the “bottom” of the current stack frame. %ebp points to where a is. then d gets pushed on.
then, when example calls square, a copy of the argument passed to square is pushed onto the stack, followed by the return address. %esp (the stack pointer) now points at the return address; it keeps getting decremented because it always points to the top of the stack.
what is the function prologue?
set of three assembly instructions that run every time you call a function
- push %ebp: saves the caller’s %ebp (the caller’s frame pointer) on the stack, so we know how to reconstruct the caller’s frame after the function returns
- mov %esp, %ebp: sets %ebp to where the stack pointer is; this creates a new frame
- sub $0x4, %esp: grows the stack for the current function, making room in the frame for the called function’s local values (in this case 4 bytes)
what is the function epilogue?
leave
shrink stack + restore caller’s frame:
1: mov %ebp, %esp
2: pop %ebp
ret
returns control to caller
1: pop %eip (pops the value at the top of the stack, i.e. where the stack
pointer points, into the instruction pointer, so the code resumes
executing where we left off before we made the function call)
return address vs. return value
return address: the address that the caller places on the stack to tell the callee where to return once the callee has finished
return value: whatever the callee returns to the caller, e.g., if the callee needs to return an integer, where is that integer actually placed?
%eax
general purpose register
where the compiler often places the return value
what happens right before the function epilogue?
mov -0x4(%ebp), %eax //set return value
recap
return value is passed in the %eax register
• if value fits in 32 bits (%eax is a 32 bit register)
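• int, short, char, bool, pointers (float also fits in 32 bits, but the standard 32-bit x86 calling convention returns floating-point values in the x87 register %st(0) rather than %eax)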
what might happen if the return value is larger than 32 bits?
for a 64-bit value, the upper 32 bits can be stored in %edx, another general purpose register
OR, for larger values such as structs:
- option 1: callee allocates the value on the heap and returns it by reference
- option 2: caller allocates space in its stack frame and the callee sets the values
option 1: return reference
we note that we malloc space on the heap for the book.
and we return not the VALUE of the book but a pointer to the book (return b)
since pointers are 32 bits on 32-bit x86, this address is returned in %eax
option 2: “return” the value
here space is allocated on the stack itself
remember we can’t return a pointer to the callee’s local copy, because the stack shrinks back past it when the callee returns and the pointer would be left dangling
but in C, returning the struct by value works. how? the caller reserves space for the book in its own stack frame and the book gets copied there, so when the callee returns, the caller still has a copy.
summary
if value is small enough to fit in register it is returned in %eax
if value is big, caller allocates space for it in its stack frame
–> THIS IS WHY TYPE AND SIZE MUST BE KNOWN AT COMPILE TIME