5. Memory Corruption - Control Questions Flashcards
Which programming languages are most affected by the buffer overflow
problem?
Assembly and C/C++ are popular programming languages that are vulnerable to
buffer overflow, in part because they allow direct access to memory and are not
strongly typed. C provides no built-in protection against accessing or overwriting data
in any part of memory; more specifically, it does not check that data written to a
buffer is within the boundaries of that buffer.
The standard C++ libraries provide many ways of safely buffering data, and C++’s
Standard Template Library (STL) provides containers that can optionally perform
bounds checking if the programmer explicitly calls for checks while accessing data.
For example, a vector’s member function at() performs a bounds check and throws an
out_of_range exception if the bounds check fails. However, C++ behaves just like C
if the bounds check is not explicitly called. Techniques to avoid buffer overflows also
exist for C.
// Buffer_overflow#Choice_of_programming_language
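The difference between checked and unchecked access can be sketched as follows. This is a minimal illustration; the helper name checked_read_is_rejected is hypothetical and not part of any library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether a read through at() is rejected
// with std::out_of_range for the given index.
bool checked_read_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);   // at() verifies i < v.size() before accessing
        return false;    // index was in range
    } catch (const std::out_of_range&) {
        return true;     // bounds check failed, nothing was read
    }
}

// By contrast, v[i] performs no check: with i >= v.size() the behavior is
// undefined, exactly like indexing past the end of a plain C array.
```

For a vector of three elements, index 2 is accepted and index 3 is rejected; the same indices used with operator[] would compile without any diagnostic.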
What is a stack frame?
Where on the stack are function parameters and local variables placed?
A stack frame is the region of the call stack that belongs to one invocation of a function. It holds the parameters passed to the function, the return address, the saved frame pointer, and the function’s local variables. In a typical layout where arguments are passed on the stack (for example 32-bit x86), the caller places the parameters and the return address on the stack first and the callee then allocates its local variables, so parameters and locals end up on opposite sides of the return address.
When stack frame sizes can differ, such as between different functions or between invocations of a particular function, popping a frame off the stack does not constitute a fixed decrement of the stack pointer. At function return, the stack pointer is instead restored to the frame pointer, the value of the stack pointer just before the function was called. Each stack frame contains a stack pointer to the top of the frame immediately below.
The stack pointer is a mutable register shared between all invocations. A frame pointer of a given invocation of a function is a copy of the stack pointer as it was before the function was invoked. The locations of all other fields in the frame can be defined relative either to the top of the frame, as negative offsets of the stack pointer, or relative to the top of the frame below, as positive offsets of the frame pointer. The location of the frame pointer itself must inherently be defined as a negative offset of the stack pointer.
// Call_stack#Stack_and_frame_pointers
What is the main idea of stack overflow?
○ special form of buffer overflow
○ it occurs when a procedure copies user-controlled data to a local buffer on the
stack without verifying its size
○ user-controlled data overwrites other values on the stack, including potentially
the return address
○ when the procedure returns, the program counter is set to the address residing
at the location of the return address
○ control flow is thereby changed: if the attacker has placed code at the
address that now sits in place of the return address, that code is executed
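The steps above can be modelled without actually corrupting memory. The sketch below treats one stack frame as a flat byte array with an 8-byte buffer followed directly by a 32-bit return address. The layout, the SimFrame type and its member names are assumptions made for illustration; real layouts depend on the ABI and the compiler.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified, hypothetical model of one stack frame:
// [ 8-byte local buffer ][ 4-byte return address ]
struct SimFrame {
    static constexpr std::size_t kBufSize = 8;
    static constexpr std::size_t kRetOffset = kBufSize;
    std::array<std::uint8_t, kBufSize + sizeof(std::uint32_t)> bytes{};

    void set_return_address(std::uint32_t addr) {
        std::memcpy(bytes.data() + kRetOffset, &addr, sizeof addr);
    }

    std::uint32_t return_address() const {
        std::uint32_t addr;
        std::memcpy(&addr, bytes.data() + kRetOffset, sizeof addr);
        return addr;
    }

    // Models a copy that ignores kBufSize, as strcpy() would.
    // The copy is clamped to the simulated frame so that the demo itself
    // stays well-defined; a real overflow has no such limit.
    void unchecked_copy(const std::uint8_t* src, std::size_t len) {
        std::size_t n = len < bytes.size() ? len : bytes.size();
        std::memcpy(bytes.data(), src, n);
    }
};
```

Copying 8 bytes or fewer leaves return_address() unchanged. Copying 12 bytes replaces it with whatever the last 4 input bytes encode, which is the value the program counter would take when the procedure returns.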
Where can the attacker’s code be injected in a stack overflow attack?
○ SetUID/SetGID programs (injected code may run as root)
○ network servers (may allow for remote access to the server)
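Besides the return address, what else can be overwritten in a stack overflow attack?
non-control data (e.g., local variables that influence program decisions), as well as
other control data stored on the stack, such as the saved frame pointer or function pointers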
Besides stack overflow, what other memory corruption attacks do you know?
○ Heap Corruption : Heap memory is allocated at run time and usually
contains data from the running program. Heap corruption occurs when data
is manipulated so that it overwrites the linked list of heap memory
pointers that the allocator uses to manage the heap.
○ Integer Overflow : These overflows occur when an application tries to create
a numeric value that can’t be contained within its allocated storage space.
○ Format String : When a program accepts user input and formats it without
checking it, memory locations can be revealed or overwritten, depending on
the format tokens that are used.
// Memory Corruption / types-of-attacks
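The integer overflow case can be illustrated with a size calculation. The sketch below is an assumption-laden example: both function names are hypothetical, and the 32-bit width is chosen only to keep the numbers small.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical allocation-size calculation in 32-bit unsigned arithmetic.
// If the true product does not fit, the result wraps modulo 2^32.
std::uint32_t unchecked_alloc_size(std::uint32_t count, std::uint32_t elem_size) {
    return count * elem_size;
}

// Same calculation with an explicit range check before multiplying.
// Returns false and leaves `out` untouched if the product would not fit.
bool checked_alloc_size(std::uint32_t count, std::uint32_t elem_size,
                        std::uint32_t& out) {
    const std::uint32_t max = std::numeric_limits<std::uint32_t>::max();
    if (elem_size != 0 && count > max / elem_size) {
        return false;
    }
    out = count * elem_size;
    return true;
}
```

With count = 0x40000001 and elem_size = 4, the unchecked version yields 4 instead of roughly 4 GiB. A buffer allocated with that size and then filled with count elements would be overflowed, which is how an integer overflow often turns into a buffer overflow.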
What is shellcode?
In hacking, a shellcode is a small piece of code used as the payload in the
exploitation of a software vulnerability. It is called “shellcode” because it
typically starts a command shell from which the attacker can control the
compromised machine, but any piece of code that performs a similar task can
be called shellcode. Because the function of a payload is not limited to merely
spawning a shell, some have suggested that the name shellcode is insufficient.
However, attempts at replacing the term have not gained wide acceptance.
Shellcode is commonly written in machine code.
What is a NOP sled? Why is it used?
○ the attacker usually cannot predict the exact address of the shellcode; to
cope with slightly wrong guesses, we put a NOP sled at the beginning of
our shellcode
○ In computer security, a NOP slide, NOP sled or NOP ramp is a sequence of
NOP (no-operation) instructions meant to “slide” the CPU’s instruction
execution flow to its final, desired destination whenever the program branches
to a memory address anywhere on the slide.
The technique sees common usage in software exploits, where it is used to
direct program execution when a branch instruction target is not known
precisely. Other notable applications include defensive programming
strategies such as EMC-aware programming.
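The effect of the sled can be shown with a toy model. The sketch below only builds and scans a byte vector; nothing is executed. The function names and the placeholder payload are hypothetical, and 0x90 is the single-byte NOP opcode on x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // x86 single-byte NOP

// Builds sled_len NOP bytes followed by the payload bytes.
std::vector<std::uint8_t> build_sled(std::size_t sled_len,
                                     const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint8_t> out(sled_len, kNop);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Toy model of execution: starting at offset `entry`, step over NOPs and
// return the offset of the first non-NOP byte (or mem.size() if none).
std::size_t slide(const std::vector<std::uint8_t>& mem, std::size_t entry) {
    std::size_t pc = entry;
    while (pc < mem.size() && mem[pc] == kNop) {
        ++pc;
    }
    return pc;
}
```

With a 16-byte sled, any entry offset from 0 to 15 ends at offset 16, the start of the payload. The longer the sled, the larger the range of guessed addresses that still reach the shellcode.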
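Why should 0x00 bytes be avoided in shellcode? How can they be avoided?
If str contains our shellcode, then strcpy() stops copying it into the buffer
as soon as a 0x00 byte is encountered, because 0x00 terminates a C string;
everything after that byte never reaches the stack.
To avoid them, choose instructions whose encoding contains no zero bytes
(e.g., xor eax, eax instead of mov eax, 0), or encode the payload and
prepend a small zero-free decoder.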
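The truncation can be observed safely by copying into a destination that is large enough. In the sketch below the function name is hypothetical; the byte sequences used in the usage note are the standard x86 encodings of mov eax, 0 (B8 00 00 00 00) and xor eax, eax (31 C0).

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Returns how many payload bytes strcpy() actually transfers.
// The destination is sized for the whole payload, so the copy itself is
// safe; only the truncation at the first 0x00 byte is being demonstrated.
std::size_t bytes_surviving_strcpy(const std::vector<unsigned char>& payload) {
    std::vector<char> src(payload.begin(), payload.end());
    src.push_back('\0');                         // terminator after the full payload
    std::vector<char> dst(src.size(), '\x7f');   // filler that is never 0x00
    std::strcpy(dst.data(), src.data());         // stops at the first 0x00 in src
    return std::strlen(dst.data());
}
```

For the payload B8 00 00 00 00 only 1 byte survives, while 31 C0 is copied in full (2 bytes), although both instructions set eax to zero.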
What countermeasures do you know against stack overflow attacks?
How do they make the task of an attacker harder?
○ the best defense is proper bounds checking
■ but there are many C/C++ programmers and some are bound to forget
■ try to use secure programming methods and tools
○ Defenses:
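■ a stack canary is a value (32 bits on a 32-bit system) inserted between
the return address and local variables by the function prologue; you
can enable this in your compiler
● the function epilogue checks if the canary has been altered; an
overflow that reaches the return address also overwrites the
canary, so it is detected before the function returns
● needs recompilation of existing programs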
■ DEP – Data Execution Prevention
● the idea is to separate executable memory locations from
writeable ones
○ e.g., the stack should be writeable, but non-executable
○ program code is usually executable, but should not be
writeable
■ except for dynamic loading of modules!
● usually implemented at the memory management level
○ memory pages that hold data can be set to be
non-executable (NX bit)
● or the computer architecture is designed with this in mind;
example: Harvard architecture
○ physically separated memory for data and code
(different buses)
○ CPU can fetch instructions and data at the same time
○ memory for data is non-executable
■ Address Space Layout Randomization (ASLR)
● the OS chooses the position in memory of the key data and
program areas randomly (stack, .data, .text, shared libraries, … ),
so the attacker cannot predict the addresses to jump to
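The canary check described above can be sketched on a simulated frame. The layout, the GuardedFrame type and the stored reference copy of the canary are assumptions made for illustration; real implementations (for example the -fstack-protector family of options in GCC and Clang) emit the prologue and epilogue code automatically and keep the reference value outside the frame.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical frame model:
// [ 8-byte buffer ][ 4-byte canary ][ 4-byte return address ]
struct GuardedFrame {
    static constexpr std::size_t kBufSize = 8;
    static constexpr std::size_t kCanaryOff = kBufSize;
    static constexpr std::size_t kRetOff = kCanaryOff + sizeof(std::uint32_t);
    std::array<std::uint8_t, kRetOff + sizeof(std::uint32_t)> bytes{};
    std::uint32_t expected_canary;  // reference copy kept outside the frame

    // "Prologue": place the canary between the buffer and the return address.
    GuardedFrame(std::uint32_t canary, std::uint32_t ret)
        : expected_canary(canary) {
        std::memcpy(bytes.data() + kCanaryOff, &canary, sizeof canary);
        std::memcpy(bytes.data() + kRetOff, &ret, sizeof ret);
    }

    // Models a copy that ignores kBufSize; clamped to the simulated frame
    // so that the demo itself stays well-defined.
    void unchecked_copy(const std::uint8_t* src, std::size_t len) {
        std::size_t n = len < bytes.size() ? len : bytes.size();
        std::memcpy(bytes.data(), src, n);
    }

    // "Epilogue": compare the canary in the frame with the reference copy.
    bool canary_intact() const {
        std::uint32_t current;
        std::memcpy(&current, bytes.data() + kCanaryOff, sizeof current);
        return current == expected_canary;
    }
};
```

A copy of up to 8 bytes leaves canary_intact() true. A 16-byte copy of 'A' bytes that would reach the return address also changes the canary, so the check fails and the program can abort instead of returning to an attacker-chosen address.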
What is a return-to-libc attack?
○ a way to bypass DEP: no injected code is executed; control is redirected to
code that is already in executable memory
○ 3 assumptions:
■ we can still write to the stack
■ we can overwrite a return address on the stack by overflowing a buffer
■ there are suitable functions in memory to “return to”
● e.g., system() in libc
○ we just need to prepare a valid stack frame for the function we “return to”
■ e.g., by placing appropriate arguments on the stack
What does Return Oriented Programming (ROP) mean?
How does it differ from simple return-to-libc?
○ instead of injecting code, the attacker chains gadgets: short instruction
sequences that already exist in executable memory and each end with a ret
○ chaining gadgets requires only the manipulation of the stack!
■ we need to create a properly constructed stack by overflowing a buffer
■ when we return from the vulnerable function, the SP will point to the
address of the first gadget → we start executing the first gadget
■ when we reach the ret in the first gadget, the SP will point to the
address of the 2nd gadget → we start executing the 2nd gadget
■ when we reach the ret in the 2nd gadget, the SP will point to the
address of the 3rd gadget → we start executing the 3rd gadget
○ difference from return-to-libc: return-to-libc reuses whole library
functions, while ROP composes the desired behavior from many small gadgets
○ both are hindered by ASLR:
■ return-to-libc : the attacker can’t predict the address of the library function
■ ROP: the attacker can’t predict the address of the gadgets
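The gadget chaining described above can be imitated with a toy interpreter. Everything in the sketch is hypothetical: the Machine type, the gadget addresses 0x1000, 0x2000 and 0x3000, and the three gadgets themselves. It only mirrors the mechanism that each ret pops the next address from the attacker-prepared stack.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// Toy machine: a stack of 32-bit words, a stack pointer and one register.
struct Machine {
    std::vector<std::uint32_t> stack;  // stack[sp] is the word SP points at
    std::size_t sp = 0;
    std::int32_t acc = 0;
};

using Gadget = std::function<void(Machine&)>;

// Hypothetical gadget table, keyed by made-up addresses.
std::map<std::uint32_t, Gadget> make_toy_gadgets() {
    std::map<std::uint32_t, Gadget> g;
    g[0x1000] = [](Machine& m) {       // models "pop acc; ret"
        if (m.sp < m.stack.size()) {
            m.acc = static_cast<std::int32_t>(m.stack[m.sp++]);
        }
    };
    g[0x2000] = [](Machine& m) { m.acc *= 2; };  // models "add acc, acc; ret"
    g[0x3000] = [](Machine& m) { m.acc += 1; };  // models "inc acc; ret"
    return g;
}

// Each iteration models one ret: pop the next address from the stack and
// run the gadget found there. Stops when the stack is exhausted or the
// address is not a known gadget. Returns the number of gadgets executed.
int run_chain(Machine& m, const std::map<std::uint32_t, Gadget>& gadgets) {
    int executed = 0;
    while (m.sp < m.stack.size()) {
        std::uint32_t target = m.stack[m.sp++];
        auto it = gadgets.find(target);
        if (it == gadgets.end()) {
            break;
        }
        it->second(m);
        ++executed;
    }
    return executed;
}
```

With the stack contents 0x1000, 20, 0x2000, 0x3000, 0x3000 the chain loads 20, doubles it and increments twice, leaving 42 in acc after four gadgets. No new code was supplied; only the data on the stack chose which existing pieces ran, and in which order.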