Week 2 Flashcards

1
Q

When is buffer overflow possible? (general answer)

A

When working with memory-unsafe languages such as C and C++, which do not check buffer bounds automatically.

2
Q

In short, what is a buffer overflow attack?

A

A buffer overflow attack happens when an application written in a memory-unsafe language (C/C++) has certain vulnerabilities, and an adversary passes a crafted input that overflows a buffer and allows the adversary to take over the machine that is running the code.

3
Q

What is a Process?

A

A program in execution. When a program runs, the OS needs to keep the state of the program. It needs to keep the program’s contents in memory and on disk.

It also ensures that the process runs sequentially. If multiple processes share a single cpu, the OS must be able to start and stop them to effectively handle them.

4
Q

What happens when a process is stopped?

A

Its data, memory utilization and execution context are saved out so that the CPU can resume this process from the same place later on.

5
Q

How many processes are used for a single program?

A

It depends. A single program may run one or multiple processes.

If the application is opened twice, two processes are opened, one for each instance.

If the application requires multiple processes on one instance, then opening another instance will require the same number of processes for itself.

6
Q

Why does the OS run apps with the process abstraction?

A

It has to do with multiprogramming. At any point in time, the OS has to manage many applications at the same time.

The concept of a process packages all of a process’ info up nicely so that starting and stopping is easier.

7
Q

How large is a process’ address space?

A

From 0 to 2^64 - 1 bytes

8
Q

Where are apps run?

A

Their instructions are loaded into the code or “text” section of the process’ address space and run from there. It’s read only.

9
Q

What is the Program Counter?

A

It is a CPU register that holds the address of the next instruction to execute. That address points into the code segment.

10
Q

What happens as a program runs?

A

Temporary data gets pushed onto the stack. The stack grows from the top downwards.

When the function calls end, the stack stops tracking the data associated with the given functions.

We track the bottom of the stack (really the top) so we can ensure we don’t grow into the heap.

11
Q

What is the Data Segment?

A

The part of memory that holds the global and static variables we will need. These are declared at file scope or with the static keyword, and their sizes are determined at compile time. They don’t change in size/length during the running of the program.
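
As a minimal sketch of the kind of variable this card describes, the C functions below use one global and one static local. The names request_count, record_request, and next_id are hypothetical and chosen only for illustration. Both variables have a size fixed at compile time and keep their values between calls, unlike a local on the stack.

```c
/* Hypothetical names for illustration; not from the course material. */

/* A global: exists for the whole run of the program, size fixed at compile time. */
int request_count = 0;

/* Increments the global and returns the new total. */
int record_request(void) {
    request_count += 1;
    return request_count;
}

/* A static local: visible only inside next_id(), but stored with the
   program's global data rather than on the stack, so it keeps its value
   between calls. */
int next_id(void) {
    static int last_id = 100;
    last_id += 1;
    return last_id;
}
```

Calling next_id() twice in a row would be expected to return 101 and then 102, because last_id is not recreated on each call.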

12
Q

What is the Heap?

A

Stores dynamically allocated memory. It grows at runtime. The heap can grow and shrink.
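
One way to picture heap memory growing and shrinking at runtime is a growable array. The sketch below is an assumption-laden illustration: the type int_vec and its functions are hypothetical, and the doubling growth policy is just one common choice.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints, for illustration only. */
typedef struct {
    int *items;   /* heap block holding the elements */
    size_t len;   /* number of elements in use */
    size_t cap;   /* number of elements the block can hold */
} int_vec;

/* Appends value, growing the heap block when it is full.
   Returns 0 on success, -1 if the allocation failed. */
int int_vec_push(int_vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = (v->cap == 0) ? 4 : v->cap * 2;
        int *grown = realloc(v->items, new_cap * sizeof *grown);
        if (grown == NULL) {
            return -1;   /* the old block is still valid and still owned by v */
        }
        v->items = grown;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

/* Returns the heap block to the allocator and resets the vector. */
void int_vec_free(int_vec *v) {
    free(v->items);
    v->items = NULL;
    v->len = 0;
    v->cap = 0;
}
```

A caller would start from an empty vector (items set to NULL, len and cap set to 0), push values as needed, and call int_vec_free() when done so the memory goes back to the heap.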

13
Q

What happens when not all of an application’s memory can fit into main memory?

A

While processes are running, the CPU does not hold the context of every process at once. It stores only the PC, stack pointer, and registers for the currently executing process.

14
Q

What is a Process Control Block?

A

In the OS, each process gets a unique PID or process id. The OS maintains a table or array of process control blocks.

Each entry points to a process control block or PCB. The PCB stores the context of a process.

The PCB holds many values, including stack pointer, PC and registers. When P1 is executing, only the hardware registers are updated.

When P1 pauses, all of this information is stored in the PCB. The PCB is not updated along with the CPU. It just gets the values when the process stops running.
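
The save-on-pause behavior can be sketched in C. Everything here is hypothetical and simplified: the struct layouts, the register count, and the function names save_context and restore_context are assumptions for illustration, not how any particular OS defines its PCB.

```c
#include <stdint.h>

#define NUM_REGS 8   /* assumed register count, for illustration */

/* Hypothetical CPU state: what the hardware updates while a process runs. */
typedef struct {
    uint32_t pc;               /* program counter */
    uint32_t sp;               /* stack pointer */
    uint32_t regs[NUM_REGS];   /* general-purpose registers */
} cpu_state;

/* Hypothetical process control block: one per process, found via its PID. */
typedef struct {
    int pid;
    cpu_state saved;   /* context captured the last time the process paused */
} pcb;

/* Called when a process is paused: copy the live CPU state into its PCB. */
void save_context(pcb *p, const cpu_state *cpu) {
    p->saved = *cpu;
}

/* Called when a process is resumed: load its saved state back into the CPU. */
void restore_context(const pcb *p, cpu_state *cpu) {
    *cpu = p->saved;
}
```

Between a save and the next restore, the PCB copy stays frozen while the CPU state changes freely, which matches the card's point that the PCB only receives values when the process pauses.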

15
Q

QUIZ:

Say we have a website that runs on some server. Might be like an http server or a database like mysql.

We also have clients that run on the web browser.

The apache web server will parse the http requests and respond with the appropriate information/content/site. The browser takes all the information, and renders the website.

If you’re the owner of the site, what assets do you need to worry about?

A
  1. The web page – this includes the content. We want it unaltered and protected.
  2. The web server – if an attacker can compromise this, they can compromise the content of our pages.
  3. The database – we want to protect our data from hackers.
  4. The operating system – We need to protect this because if the attacker can get into it, other applications can be hijacked. They can take email, etc.
16
Q

What are the security policies we need for the quiz example?

A
  1. Availability: Web server shouldn’t crash.
  2. Integrity: Web server shouldn’t display the wrong pages.
  3. Confidentiality: The attacker should not access raw database state.
17
Q

What C function is notoriously unsafe because of its vulnerability to buffer overflow attacks?

A

gets()

18
Q

What are the reasons why we might get a segfault?

A
  1. The most common reason is trying to access a memory address that has not been mapped.
  2. Can also happen when you try to write into a page that is read only.
  3. Can also happen if you try to load instructions for executing into a page that is non-executable.
19
Q

How does a buffer overflow happen? General steps.

A
  1. A user enters an input that is too large for the allocated space on the stack.
  2. The buffer begins to fill up with their input, but since the input is too large, the buffer runs out of space and the input is written to the addresses above the space we had allocated.
  3. The input reaches the RA. By placing an address of their choosing in the RA, the user decides where the IP will go next, allowing them to redirect the control flow to a new location for unexpected code.
  4. When the RA is overwritten properly, a segfault does not occur since a valid memory address now exists in the RA.
  5. The attack occurs when the function returns.
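
The steps above depend on a copy that never checks the size of the destination. As a rough sketch of the missing check, the hypothetical helper copy_bounded below (the name and return convention are assumptions) refuses to write past the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration: copies src into dst without ever
   writing more than dst_size bytes (including the terminating '\0').
   Returns 0 if the whole string fit, -1 if it had to be truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return -1;   /* no room even for the terminator */
    }
    size_t src_len = strlen(src);
    size_t n = (src_len < dst_size) ? src_len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return (src_len < dst_size) ? 0 : -1;
}
```

With a 12-byte buffer, a 24-character input is cut to 11 characters plus the terminator, so nothing above the buffer is touched.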
20
Q

What is Control Hijacking?

A

This is when an attacker overwrites the RA of a function with a valid address during a buffer overflow attack. For the hijack to work, this address must point to valid code instructions.

It does not trigger a segfault. The CPU will keep executing the instructions it was redirected to.

21
Q

Where is data held in our computer? There are two locations.

A
  1. Memory (stack, heap, code).
  2. Registers.
22
Q

What are the three registers we are primarily concerned with in regards to Buffer Overflow Attacks?

A
  1. The instruction pointer - stores the memory address of the next instruction. The CPU goes all the way to memory, loads that instruction from memory, decodes it, and executes it. This is what an attacker wants to compromise.
  2. The frame or base pointer - marks the start of a function’s stack frame.
  3. The stack pointer - marks the last item on the stack.
23
Q

What is the frame?

A

The frame is a block on the stack that a given function has access to for storing data.

24
Q

How does the stack grow?

A

The stack starts at a high address in memory and grows downward. The lower the address, the more recently the object was pushed onto the stack.

If we call pop, we could get the latest object in the stack and remove it.
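
A toy model can make the downward growth concrete. This is only a sketch: toy_stack, its size, and the use of array indices in place of real addresses are all assumptions for illustration.

```c
#include <stdint.h>

#define STACK_WORDS 16   /* assumed size of the toy stack */

/* Toy stack for illustration: index STACK_WORDS stands for the high
   address, and sp moves toward index 0 as items are pushed. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;   /* index of the most recently pushed item; STACK_WORDS when empty */
} toy_stack;

/* Starts with an empty stack: sp sits at the high end. */
void toy_init(toy_stack *s) {
    s->sp = STACK_WORDS;
}

/* Push: move sp to the next lower slot, then store. Returns -1 when full. */
int toy_push(toy_stack *s, uint32_t value) {
    if (s->sp == 0) {
        return -1;
    }
    s->sp -= 1;
    s->mem[s->sp] = value;
    return 0;
}

/* Pop: read the item at sp, then move sp back up. Returns -1 when empty. */
int toy_pop(toy_stack *s, uint32_t *out) {
    if (s->sp == STACK_WORDS) {
        return -1;
    }
    *out = s->mem[s->sp];
    s->sp += 1;
    return 0;
}
```

Pushing 10 and then 20 leaves 20 at the lower index, and the first pop returns 20.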

25
Q

How does a program keep track of the frames of the currently executing function?

A

Via two registers: the stack pointer and the base pointer.

%ebp: records the top (really the bottom) of the stack. Remains stationary.

%esp: records the bottom (really the top) of the stack. Grows downward as the stack grows.

26
Q

What happens to the stack during a function call?

A
  1. The calling function pushes the arguments for the callee onto the stack before anything else happens. When this happens:
    1. The stack pointer is decremented. We grew the stack, and the pointer moves to track this.
    2. The calling function needs to tell the callee where to return to, so the call instruction pushes a return address onto the stack (the address of the instruction right after the call in the calling function).
  2. The stack is prepared. The prologue for the callee is executed. This includes pushing the ebp register onto the stack, moving the esp register into the ebp register, and then subtracting some space from the stack pointer.
    1. Pushing the ebp saves the caller’s ebp register onto the stack. This allows us to find where the caller’s frame was when we finish.
    2. Moving the esp into the ebp makes the ebp register point to where our stack pointer is currently pointing to.
    3. Then we grow the stack for the function we are in as needed.
  3. When the function completes, we run the epilogue which has two instructions: leave & ret.
    1. Leave first shrinks the stack by moving ebp into esp, which discards the callee's frame.
    2. It then pops into ebp. This takes the value at the top of the stack (whatever esp points at, which is the caller's saved ebp) and puts it into the ebp register.
    3. Ret then returns control to the caller. It pops the value esp now points at (the return address) and puts it into %eip. We then have the instruction pointer for the next instruction, and we have fully reconstructed the frame of the function we need to return to.
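
The call, prologue, and epilogue steps can be traced on a toy 32-bit machine. This is a simplified simulation, not real x86: the toy_cpu struct, the memory size, and the do_* function names are all hypothetical, and argument passing is left out.

```c
#include <stdint.h>

#define MEM_WORDS 64   /* assumed size of the toy memory, in 4-byte words */

/* Toy 32-bit machine, for illustration only. Addresses are byte addresses;
   mem[a / 4] holds the word at address a. */
typedef struct {
    uint32_t mem[MEM_WORDS];
    uint32_t esp, ebp, eip;
} toy_cpu;

/* push: decrement esp, then store (the stack grows toward lower addresses). */
static void push32(toy_cpu *c, uint32_t v) {
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

/* pop: load from esp, then increment esp. */
static uint32_t pop32(toy_cpu *c) {
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* call: push the return address, then jump to the callee. */
void do_call(toy_cpu *c, uint32_t return_addr, uint32_t callee_addr) {
    push32(c, return_addr);
    c->eip = callee_addr;
}

/* prologue: push %ebp; mov %esp,%ebp; sub $local_bytes,%esp */
void do_prologue(toy_cpu *c, uint32_t local_bytes) {
    push32(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= local_bytes;
}

/* leave: mov %ebp,%esp; pop %ebp */
void do_leave(toy_cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop32(c);
}

/* ret: pop the return address into %eip */
void do_ret(toy_cpu *c) {
    c->eip = pop32(c);
}
```

Running do_call, do_prologue, do_leave, and do_ret in that order should leave esp and ebp exactly where they started, with eip set to the return address that the call pushed.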
27
Q

What is the Return Address?

A

The address that the caller places in the stack to inform the callee where to return to after executing.

28
Q

What is the %eax register?

A

The %eax register is a general-purpose register. The compiler uses it for all kinds of purposes like intermediate states, sometimes return values, etc.

29
Q

What does the below instruction do?

mov -0x4(%ebp), %eax

A

Take the address held in %ebp and apply an offset of -4: go to the address ebp points at and subtract 4 bytes. In this example that is the slot %esp points at, which holds “c”.

We then take the value at that address, c, and copy it into the eax register. So this instruction copies c into eax, and the function epilogue runs right after it.

So this is how the function square is going to return the value/result to the caller (example).
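
For reference, a C function along these lines might look like the sketch below. This is a hypothetical reconstruction: the source names only the function square and the local c, so the parameter name and the body are assumptions.

```c
/* Hypothetical reconstruction of the "square" example; the parameter name
   and the exact body are assumptions. */
int square(int x) {
    int c = x * x;   /* local variable; in an unoptimized 32-bit build it
                        would typically sit at -0x4(%ebp) */
    return c;        /* the value of c is placed in %eax for the caller */
}
```

The exact stack offset depends on the compiler and its flags, so disassembling the actual binary is the only way to confirm it.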

30
Q

What is the difference between Return by Reference and Return by Value?

A

Return by Reference: returns a pointer to a value somewhere in memory.

Return by Value: returns the actual value or object. This only works though if the return value fits in one register. To get around this, the callee can make a copy of the object (assuming it’s an object since it’s larger than a register) and it can copy all of the values from the object over to the caller’s version of the object. The callee’s values will be wiped at the end of the function, but the caller will get to keep them. For this to work, the type and size of the return value must be known at compile time.
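
The two styles can be compared side by side in C. The type point3 and the functions make_point and find_max are hypothetical names used only for this sketch.

```c
#include <stddef.h>

/* Hypothetical types and functions, for illustration only. */
typedef struct {
    int x;
    int y;
    int z;
} point3;   /* larger than one 32-bit register */

/* Return by value: the caller receives its own copy of the struct. */
point3 make_point(int x, int y, int z) {
    point3 p = { x, y, z };   /* local to the callee's frame */
    return p;                 /* contents are copied out to the caller */
}

/* Return by reference: the caller receives a pointer into memory that
   outlives the call (here, the caller's own array). */
int *find_max(int *values, size_t count) {
    if (count == 0) {
        return NULL;
    }
    int *best = &values[0];
    for (size_t i = 1; i < count; i++) {
        if (values[i] > *best) {
            best = &values[i];
        }
    }
    return best;
}
```

Writing through the pointer returned by find_max changes the caller's array, while changing the struct returned by make_point affects only the caller's copy.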

31
Q

Overview of Buffer Overflow Attack (complete).

A

Say we have a vulnerable function vulnerable() which allocates a 12-character buffer on the stack and fills it with gets(). gets() does NO bounds checking. Our main function calls vulnerable().

We see the frame of main at the top of the stack. Main pushes the RA of main onto the stack. Then we run the prologue which saves the ebp of main onto the stack. Then it allocates enough space on the stack to hold our local variables.

In some cases, there may be a gap between local variables in vulnerable()’s stack frame and the main() stack frame. To understand the memory layout of a program, you need to compile it and disassemble it.

gets() gets called in vulnerable() and it writes data upwards (from low address to high address) which is different from how the stack fills. This allows the text to run up the stack into the caller’s provided RA. Gets() applies a null terminating byte at the end.

What the attacker will do is input text to run up the stack into the RA, and will inject a valid address into the RA so that he can jump to some piece of code that is not intended by the program. The attacker can jump to library code, other code in the program, or inject his own code into the gets() buffer in the form of opcodes, which he can then jump back to from the RA. The CPU will automatically run up the instructions in the buffer, executing them. The CPU runs up the buffer because the buffer goes from low address to high address so it just follows up.
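
The overwrite can be modeled safely with a toy frame held in an ordinary array. This is a simulation under stated assumptions, not an exploit: the layout (12-byte buffer, then saved ebp, then RA, with no gap), the toy_frame type, and every function name are hypothetical, and the addresses used are arbitrary numbers.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   12                   /* size of the vulnerable buffer */
#define FRAME_SIZE (BUF_SIZE + 4 + 4)   /* buffer + saved ebp + RA */

/* Toy stack frame, for illustration only. Index 0 is the lowest address:
   bytes [0, 12) are the buffer, [12, 16) the saved %ebp, [16, 20) the RA. */
typedef struct {
    uint8_t bytes[FRAME_SIZE];
} toy_frame;

/* Stores a 32-bit value at the given offset. */
static void put32(toy_frame *f, size_t off, uint32_t v) {
    memcpy(&f->bytes[off], &v, sizeof v);
}

/* Reads the 32-bit value at the given offset. */
static uint32_t get32(const toy_frame *f, size_t off) {
    uint32_t v;
    memcpy(&v, &f->bytes[off], sizeof v);
    return v;
}

/* Sets up the frame as the call and prologue would leave it. */
void frame_init(toy_frame *f, uint32_t saved_ebp, uint32_t return_addr) {
    memset(f->bytes, 0, sizeof f->bytes);
    put32(f, BUF_SIZE, saved_ebp);
    put32(f, BUF_SIZE + 4, return_addr);
}

/* Models gets(): copies input upward from the start of the buffer with no
   check against BUF_SIZE. The length is clamped to FRAME_SIZE only so that
   this simulation never writes outside its own array. */
void unchecked_read(toy_frame *f, const uint8_t *input, size_t len) {
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(f->bytes, input, len);
}

/* Models ret: whatever sits in the RA slot becomes the next %eip. */
uint32_t frame_return(const toy_frame *f) {
    return get32(f, BUF_SIZE + 4);
}
```

A short input leaves frame_return() reporting the original address. A 20-byte input made of 16 filler bytes followed by a 4-byte address makes frame_return() report the attacker's address instead.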

32
Q

Why can buffer overflow attacks exist?

A
  1. A lot of systems software is written in unsafe languages. We use these languages because they are fast, but they don’t perform garbage collection or automatic memory management.
  2. Raw memory addresses are accessible to hackers, meaning that they can manipulate them how they’d like.
  3. Bounds checking is not conducted automatically when copying data, so the programmer is responsible for that and it can be missed.
  4. Integer overflows can be used to alter buffer sizes and cause other mayhem.
  5. Format strings can be used to print out the contents of the stack and alter program behavior.
33
Q

What is the danger of format strings?

A

When using things like printf, sprintf, etc., one of the arguments is a format specifier: %d, %f, etc. It indicates to the function at run time how it should interpret the data. So, it allows us to change the behavior of a function at run time.

This matters because if the attacker can use it, he can actually supply the format string and then also change the behavior of the function that uses the format string.

They can literally pass a format string like “%p %p %p %p” and this is going to print the stuff in our stack. The function has no matching arguments, so it treats whatever sits above it on the stack (such as the caller's “i”, “a”, and “&a”) as its arguments and works its way up the stack, printing them. They could even input %s, which forces the program to treat a value on the stack as a pointer to a string and read from it until it encounters a terminating byte. It might hit an invalid memory location before that and crash the program.

We can also execute code this way.
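
The usual defense is to keep the format string constant and pass untrusted text only as an argument. A minimal sketch, assuming a hypothetical helper named format_untrusted:

```c
#include <stdio.h>

/* Hypothetical helper, for illustration: formats untrusted text into out.
   The format string is the constant "%s", so any '%' characters inside
   user_text are copied as data and never interpreted as specifiers.
   The risky pattern would be passing user_text itself as the format. */
int format_untrusted(char *out, size_t out_size, const char *user_text) {
    return snprintf(out, out_size, "%s", user_text);
}
```

Passing the text “%p %p %p %p” to this helper produces those same characters in out rather than values read from the stack.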

34
Q

What is integer overflow?

A

Another type of vulnerability. If you add two 32-bit integers together and the result fits in 32 bits, you’re fine, but if it needs more, you only get to keep the least significant 32 bits.

For example, 2 * 2147483652 = 8 because of this.

A developer might have length checks for inputs, but NOT the final output/result. The inputs are not constants and can be a product of a bunch of operations. If the attacker can cause an overflow, they might be able to copy more or less data than the developer intended.
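
The wraparound in the example, and one way to catch it, can be sketched with unsigned 32-bit arithmetic. The function names wrapping_mul32 and checked_mul32 are hypothetical, and widening to 64 bits is just one of several ways to detect the overflow.

```c
#include <stdint.h>

/* Unsigned 32-bit multiply: the product is reduced modulo 2^32, so only
   the low 32 bits survive. */
uint32_t wrapping_mul32(uint32_t a, uint32_t b) {
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}

/* Checked variant: reports overflow instead of silently wrapping.
   Returns 0 and stores the product on success, -1 if it would not fit. */
int checked_mul32(uint32_t a, uint32_t b, uint32_t *out) {
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    if (wide > UINT32_MAX) {
        return -1;
    }
    *out = (uint32_t)wide;
    return 0;
}
```

Here 2 * 2147483652 equals 2^32 + 8, so the wrapping version yields 8 while the checked version reports failure. A size computed with the checked version can be rejected before it is used for an allocation or a copy.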

35
Q

What do the following GDB commands do?

  • list
  • b function_name
  • disas function_name
  • si
  • p
  • info
A
  • list: Prints out chunks of source code. Repeating list makes gdb show subsequent blocks of code.
  • b function_name: sets a break point at the start of a function so that it breaks once the function is called.
  • disas function_name: disassembles a function into its assembly code.
  • si: Step instruction - it allows us to go from instruction to instruction.
  • p: print command - prints the value of all sorts of things, like p $eax to print the contents of the eax register.
  • info: gets us info on things like registers - i.e. info reg