OS, CPU, Memory, Buffer Overflow... oh my! Flashcards
Give a basic overview of how an OS works.
BASIC: OS kernel is intermediary between hardware and user. It “talks” to CPU, memory, disk storage, display, network.
Process Management: It also manages the processes (allocates CPU time to each process so multiple processes can run at once).
Memory Management: Ensures each process’s memory doesn’t interfere with any other’s (maps virtual memory onto physical RAM).
File System Management: Organizes data, manages disk space, controls access to files.
AKA: The master communicator/intermediator between all of these processes/components on a computer. High potential for security vulnerabilities.
Give a high-level overview of how the CPU works.
Comparable to the “brain” of the computer to execute program instructions + perform tasks. CPU contains registers that store important memory addresses during the runtime of a program.
EIP Register
Very important register that contains the memory address of the next instruction to execute. The CPU repeats: 1) fetch the instruction at the address in EIP, 2) execute that instruction, 3) increment EIP to the next instruction (or update it on a jump, call, or return)
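A minimal sketch of that fetch/execute/update loop, using a made-up three-instruction toy machine (toy_op, toy_insn, and toy_run are hypothetical names, not real x86):

```c
#include <stddef.h>

/* Hypothetical toy machine for illustration only, not real x86. */
enum toy_op { TOY_ADD, TOY_JMP, TOY_HALT };

struct toy_insn {
    enum toy_op op;
    int operand;   /* value to add, or index to jump to */
};

/* Runs the program and returns the final accumulator value. */
int toy_run(const struct toy_insn *program, size_t length)
{
    size_t ip = 0;   /* plays the role of EIP */
    int acc = 0;

    while (ip < length) {
        struct toy_insn insn = program[ip];   /* 1) fetch */
        switch (insn.op) {                    /* 2) execute */
        case TOY_ADD:
            acc += insn.operand;
            ip += 1;                          /* 3) increment */
            break;
        case TOY_JMP:
            ip = (size_t)insn.operand;        /* 3) update (jump) */
            break;
        case TOY_HALT:
            return acc;
        }
    }
    return acc;
}
```

A buffer overflow attack boils down to getting an attacker-chosen value into the real machine's equivalent of ip.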
What does the MMU do? Why is it important?
Each memory access a program wants to make goes through the MMU. It translates virtual->physical memory addresses and ensures each process has its own individual memory space
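A rough sketch of the translation step, assuming a single-level page table with 4 KiB pages (real MMUs use multi-level tables and also check permissions; toy_pte and toy_translate are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_NUM_PAGES 4u

/* One entry per virtual page of a (tiny, hypothetical) process. */
struct toy_pte {
    int present;      /* 0 = unmapped: an access would fault */
    uint32_t frame;   /* physical frame number */
};

/* Returns 0 and stores the physical address, or -1 to signal a fault. */
int toy_translate(const struct toy_pte *table, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t page = vaddr / TOY_PAGE_SIZE;
    uint32_t offset = vaddr % TOY_PAGE_SIZE;

    if (page >= TOY_NUM_PAGES || !table[page].present)
        return -1;
    *paddr = table[page].frame * TOY_PAGE_SIZE + offset;
    return 0;
}
```

Giving each process its own table is what keeps one process's virtual address 0x1000 separate from another's.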
What is the CPL?
= Current Privilege Level on a scale of 0-3. 0 = kernel, will execute any instructions, 3 = will only execute a subset of instructions.
Permissions: Kernel vs Non-Kernel
CPL=0 (kernel permissions) allows: direct access to any address, changes to any register, changes to the MMU configuration, and changing the CPL itself
CPL=3 permissions: no direct access to the MMU, no changes to privileged registers, no changes to the internal state of the MMU, no changing the CPL
What is a process?
Data structure managed by the kernel (with its own virtual address space set up in the MMU and its own saved register values). When a process begins, the kernel loads those values, sets CPL=3, and loads EIP with the process’s start address to turn CPU control over to the process.
Explain how the memory, Kernel and MMU interact
A few key points:
- Kernel allocates a virtual address space to each process
- Kernel marks virtual addresses with permissions such as “read only” or “do not execute” (aka the EIP is not allowed to point here)
- When the CPU tries an access that those permissions forbid, the MMU raises a fault and the process gets a segfault
- Kernel configures memory permissions: R/W/E memory to process 1, R/E to libc
How does a process with CPL=3 permissions execute certain privileged actions?
Via syscalls! They are specific instructions that will switch CPL=0 and then execute certain actions, then switch CPL=3 after.
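For a concrete feel, here is a small sketch that asks the kernel for the process ID through the raw system-call interface. It assumes Linux with glibc (syscall() and SYS_getpid come from there); raw_getpid is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Traps into the kernel (CPL=0), which looks up the PID and then
 * returns to this CPL=3 code with the result. Linux-specific. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result should match the ordinary getpid() wrapper from libc, which performs the same trap under the hood.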
What is an Access Control System?
This is a type of policy. It defines subjects, objects, and verbs, then assigns a corresponding yes/no to each subject-object-verb (SOV) combination
Access Control Matrix, Access Control List (ACL), Capability
Access Control Matrix: Objects x Subjects; each cell holds the set of actions that subject may perform on that object. An ACL is the slice of the matrix for one object: it quickly answers “who can access this object?”. A capability list is the slice along the other dimension, for one subject: it answers “what objects can this subject access?”
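A small sketch of such a matrix as a 2-D array of verb bitmasks. The orientation (rows = subjects, columns = objects), the subjects, and the names acm and is_allowed are all assumptions for illustration. With this orientation, one column is an object's ACL and one row is a subject's capability list.

```c
#include <stddef.h>

enum { VERB_READ = 1, VERB_WRITE = 2, VERB_EXEC = 4 };
enum { NUM_SUBJECTS = 2, NUM_OBJECTS = 3 };

/* Hypothetical policy: rows are subjects, columns are objects. */
static const unsigned char acm[NUM_SUBJECTS][NUM_OBJECTS] = {
    /* subject 0 */ { VERB_READ | VERB_WRITE, VERB_READ, 0 },
    /* subject 1 */ { VERB_READ,              0,         VERB_READ | VERB_EXEC },
};

/* Returns 1 if the subject may perform every verb in the mask on the object. */
int is_allowed(size_t subject, size_t object, unsigned char verbs)
{
    if (subject >= NUM_SUBJECTS || object >= NUM_OBJECTS)
        return 0;   /* unknown subject or object: deny by default */
    return (acm[subject][object] & verbs) == verbs;
}
```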
What is a reference monitor? What is the flow of using a reference monitor
1) subject requests permissions of an object
2) reference monitor checks the request against the list of policies
3) Subject either is permitted or not permitted access
What are the 3 main requirements of a reference monitor? Are they all feasible?
1) tamper-proof
2) always invoked (also not circumventable)
3) verifiable- aka the system should be as simple and small as possible in order to be easily analyzed and prove correctness
It is very difficult to meet all three in practice
What is the UNIX system known for? When was it created?
An OS created in the early 1970s (development started at Bell Labs in 1969). Designed to be SIMPLE!! (small programs)
How is SOV defined in Unix?
Subjects: Users (UID), Processes (PID)
Objects: Files, directories, memory segments (also: Access control information, processes, users?)
Verbs: For files- Read, write, execute; for processes- kill, debug; for users- delete, change groups
How are users defined in UNIX?
Identified by UID, each user belongs to groups = GID
How do you change user/group owner of a file in UNIX?
chown = change owner, chgrp = change group
Explain the UGO Model
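This is a simple set of 12 bits to represent file permissions in Unix: 3 bits (read, write, execute) for each of user/group/other, plus 3 special bits (setuid, setgid, sticky/t-bit). Whether the file is a directory is recorded separately in the file-type part of the mode (the leading “d” in ls -l), not in these 12 bits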
What does the hierarchy look like when a user u tries to access a file?
1) If u is the owner, use owner permissions
2) If u is not the owner but is in the file’s group, use group permissions
3) Else, use “other” permissions
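The three-step check above can be sketched as follows. This is a simplification that assumes a single group per user and ignores root and supplementary groups; ugo_allows and the WANT_* constants are hypothetical names.

```c
#include <sys/types.h>

enum { WANT_R = 4, WANT_W = 2, WANT_X = 1 };

/* mode holds the 9 permission bits, e.g. 0640.
 * Returns 1 if every requested permission is granted, else 0. */
int ugo_allows(unsigned mode, uid_t file_uid, gid_t file_gid,
               uid_t uid, gid_t gid, unsigned want)
{
    unsigned bits;

    if (uid == file_uid)
        bits = (mode >> 6) & 7;   /* owner class */
    else if (gid == file_gid)
        bits = (mode >> 3) & 7;   /* group class */
    else
        bits = mode & 7;          /* other class */
    return (bits & want) == want;
}
```

Only one class is ever consulted, so an owner of a file with mode 0077 is denied even though everyone else is allowed.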
What is the “root” user?
- they are the administrator account, UID = 0
- Allowed to have all permissions for all files, processes, etc
How do you access the root powers? What design principle(s) does this follow?
Root is not logged in by default. Use sudo to access root permissions. This follows the least privilege and isolated compartments design principles
How do process ownership and permissions work in UNIX? (3 UIDS) and how do you change these permissions?
- Real UID = UID of the user who started the process
- Effective UID = the UID the kernel uses for permission checks (differs from the real UID when running a setuid program)
- Saved UID = similar to a “temp” variable: stores the previous effective UID so the process can drop privileges and later restore them
Use the setuid syscall (or seteuid to change only the effective UID) to change the UID.
What is the basic overview of memory layout for a process in Linux? (6 main components)
Stack and heap are most important here. Others not super important but be familiar with them
.text: the machine executable code
.data: initialized global and static variables
.bss: UNinitialized global and static variables
heap: dynamically allocated memory (home to malloc)
stack: stores all local variables + function call info
env: environment variables
What does the EIP register do?
the instruction pointer; contains address in memory for what code will be executed next
What does the ESP register do?
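ESP is the stack pointer; points to the TOP of the stack (the most recently pushed item, which is the lowest address in use because the stack grows down)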
What does the EBP register do?
EBP is the base pointer; points to the base of the current function’s stack frame
Explain what happens in memory with this basic function call:
int foo(int a, int b) {
    int d = 1;
    return a + b + d;
}
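1) Arguments b and a are pushed onto the stack by the caller (right to left under the usual x86 cdecl convention)
2) The call instruction pushes the return address (the saved EIP) and moves EIP to the code for the foo() function
3) A new stack frame is set up: the old EBP is pushed, EBP and ESP are adjusted, and space is reserved for the local variable d
4) On return, the result goes in EAX, the frame is torn down, and the saved EIP is popped back into EIP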
Explain a basic buffer overflow attack via this function:
void bad(char *s) {
    char buf[64];
    strcpy(buf, s);
}
The components will be laid out in memory like so, from lower to higher addresses: local variables (buf), saved EBP, saved EIP (the return address), function arguments. If s is LONGER than 64 chars, strcpy will keep writing past the end of buf, which sits just below the saved EBP and saved EIP. All we have to do is craft our input so that it exactly overwrites the saved EIP with an address of our choosing. When bad() returns, that address is loaded into EIP, so it can point to malicious code to run (either in our buffer or somewhere else- an existing library function maybe).
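For contrast, a minimal sketch of a bounds-checked copy that refuses oversized input instead of overflowing. bounded_copy and COPY_BUF_SIZE are hypothetical names, and this is just one of several ways to fix bad().

```c
#include <stddef.h>
#include <string.h>

#define COPY_BUF_SIZE 64

/* Copies s into out (COPY_BUF_SIZE bytes).
 * Returns 0 on success, -1 if s (plus its terminator) would not fit. */
int bounded_copy(char out[COPY_BUF_SIZE], const char *s)
{
    size_t len = strlen(s);

    if (len >= COPY_BUF_SIZE)   /* need room for the trailing '\0' */
        return -1;
    memcpy(out, s, len + 1);    /* len + 1 also copies the '\0' */
    return 0;
}
```

Because the length is checked before any byte is written, nothing past the buffer (saved EBP, saved EIP) can be touched.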
What do NOP Sleds work around? What are they?
Due to ASLR and other memory protections, we may not know the exact address where our payload starts, so we put a long run of NOP instructions (a “NOP sled”) right before our malicious code in the payload. NOPs (0x90 on x86) are instructions that do nothing, so if EIP lands anywhere in the sled, execution slides down the NOPs into the malicious code and then executes it.
What is a technique workaround for not knowing where the saved EIP lies in memory?
We can simply repeat our chosen address many times in the payload, hoping that one of the copies lands exactly on the saved EIP (return address) slot on the stack
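A layout-only sketch of how the two tricks combine in one buffer: NOP sled, then a body, then the guessed address repeated to the end. The body here is just placeholder bytes, build_layout is a hypothetical name, and the sketch assumes 32-bit little-endian addresses. In a real attack the repeated copies would also have to line up with the 4-byte slot holding the saved EIP.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills out[0..total) as: NOP sled | body | guessed address, repeated.
 * Returns 0 on success, -1 if the sled and body do not fit. */
int build_layout(unsigned char *out, size_t total,
                 const unsigned char *body, size_t body_len,
                 size_t sled_len, uint32_t guess_addr)
{
    size_t pos;

    if (sled_len > total || body_len > total - sled_len)
        return -1;
    memset(out, 0x90, sled_len);                /* x86 NOP bytes */
    memcpy(out + sled_len, body, body_len);

    /* Repeat the guess as 4-byte little-endian words. */
    for (pos = sled_len + body_len; pos + 4 <= total; pos += 4) {
        out[pos]     = (unsigned char)(guess_addr & 0xff);
        out[pos + 1] = (unsigned char)((guess_addr >> 8) & 0xff);
        out[pos + 2] = (unsigned char)((guess_addr >> 16) & 0xff);
        out[pos + 3] = (unsigned char)((guess_addr >> 24) & 0xff);
    }
    /* Pad any leftover tail (fewer than 4 bytes) with NOPs. */
    for (; pos < total; pos++)
        out[pos] = 0x90;
    return 0;
}
```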
What are Heap Attacks? What are some of the vulnerabilities/bugs that can allow for this?
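Heap attacks are similar to stack attacks, but the heap holds no saved EIP. Instead you corrupt data stored on the heap that later steers control flow (e.g. a function pointer or the allocator’s bookkeeping data) so that execution is redirected to your malicious code. Vulnerabilities include: heap buffer overflow, use-after-free, double free, etc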
What are the main defenses on memory attacks? (4 ways)
- Stack canaries
- ASLR
- W ^ X (Write XOR execute)
- Fuzzing + memory safe languages
Explain how Stack canaries work and what they protect against. Next, explain how canaries are chosen.
Stack canaries aim to detect if data has been corrupted BEFORE the function returns.
1) compiler inserts extra code into each function: pushes a “canary” value onto the stack in between the local variables and ebp/eip
2) Before returning, the inserted code checks the canary: if it has changed, we know the stack has been tampered with: abort!
Attackers can try to GUESS the canary and include it in their payload to bypass the check! To counter this, the compiler can choose one of these special canaries:
- NULL: 0x00…00 hard to write onto the stack with string functions, since a copy like strcpy stops at the first null byte
- Terminator: (ex: 0x000d0aff) made of bytes (NUL, CR, LF, 0xff) that make common string functions stop copying
- Random: a random canary chosen each time the program starts, so that the attacker cannot guess it.
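A hand-rolled emulation of the idea, assuming a struct stands in for the stack frame (real canaries are inserted by the compiler, not written by hand; toy_frame and toy_call are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a stack frame, lowest address first. */
struct toy_frame {
    char buf[8];          /* local buffer */
    uint32_t canary;      /* sits between buf and the saved return address */
    uint32_t saved_eip;   /* stand-in for the saved return address */
};

/* Copies n bytes of input over the frame starting at buf, then checks
 * the canary. Returns 0 if intact, -1 if smashing was detected. */
int toy_call(const unsigned char *input, size_t n, uint32_t canary)
{
    struct toy_frame frame;

    memset(&frame, 0, sizeof frame);
    frame.canary = canary;                    /* "prologue": place canary */
    if (n > sizeof frame)
        n = sizeof frame;                     /* keep the demo itself memory safe */
    memcpy(&frame, input, n);                 /* stand-in for an unchecked copy */
    return frame.canary == canary ? 0 : -1;   /* "epilogue": verify canary */
}
```

Any write long enough to reach saved_eip has to pass through canary first, which is why the check catches it.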
What is the difference between -fstack-protector and -fstack-protector-strong in gcc?
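-fstack-protector only instruments functions that contain a character array of 8 bytes or more (or that call alloca); -fstack-protector-strong also instruments functions with any local array or any local variable whose address is taken. In numbers: the first protects 2.5% of kernel functions and grows the binary by .33%. The second protects 20.5% of kernel functions and grows the binary by 2.4%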
What does it mean to “read the stack”
“Reading the stack” is a technique to bypass canaries when a server forks a new process per connection, so every child has the same canary. You overflow one byte at a time: the program crashes if you guessed that byte wrong and keeps running if you guessed it right. Keep doing this until you have guessed the whole canary. Each byte takes at most 256 tries.
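A simulation of the byte-by-byte guessing, assuming a 4-byte canary. survives() is a stand-in for "the forked child did not crash" and read_canary is a hypothetical name; no real process is attacked here.

```c
#include <stddef.h>
#include <string.h>

#define CANARY_LEN 4

/* Stand-in for the crash oracle: the child survives only if the first
 * n overwritten bytes equal the first n bytes of the real canary. */
static int survives(const unsigned char *secret,
                    const unsigned char *guess, size_t n)
{
    return memcmp(secret, guess, n) == 0;
}

/* Recovers the canary one byte at a time; returns the number of tries. */
unsigned read_canary(const unsigned char *secret, unsigned char *out)
{
    unsigned tries = 0;

    for (size_t i = 0; i < CANARY_LEN; i++) {
        for (unsigned v = 0; v < 256; v++) {
            out[i] = (unsigned char)v;
            tries++;
            if (survives(secret, out, i + 1))
                break;   /* byte i is now known, move to the next one */
        }
    }
    return tries;
}
```

The worst case is 4 x 256 = 1024 tries instead of 2^32 for guessing the whole canary at once.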
Explain what Shadow Stacks are
The compiler (or hardware) saves each return address (saved EIP) on a separate “shadow stack” and, before returning, checks that the saved EIP on the regular stack still matches the shadow copy (similar to a canary)
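A toy version of that bookkeeping, assuming the compiler would call one hook on function entry and one before return (shadow_push and shadow_check are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

static uint32_t shadow[SHADOW_DEPTH];
static size_t shadow_top;

/* On call: record the return address.
 * Returns 0 on success, -1 if the shadow stack is full. */
int shadow_push(uint32_t ret_addr)
{
    if (shadow_top == SHADOW_DEPTH)
        return -1;
    shadow[shadow_top++] = ret_addr;
    return 0;
}

/* On return: pop the recorded address and compare it with the one found
 * on the regular stack. Returns 0 on match, -1 on mismatch or underflow. */
int shadow_check(uint32_t ret_addr_on_stack)
{
    if (shadow_top == 0)
        return -1;
    return shadow[--shadow_top] == ret_addr_on_stack ? 0 : -1;
}
```

The protection only holds if the attacker cannot also write to the shadow copy, which is why it is kept apart from the regular stack.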
What is ASLR and what are possible further attacks against this defense?
Stands for Address-Space Layout Randomization. It makes it extremely difficult to guess the addresses an attack needs (stack, heap, libraries)! Basically the OS places the various parts of memory at randomized offsets (16 or 24 random bits).
Attacks:
- A ton of NOP sleds + copying shellcode in many places
- Brute force via many forks
What is W ^ X?
Stands for Write XOR Execute. Idea that code is not writeable and data is not executable, so the stack is set to not executable. The OS marks each portion of memory as either writeable or executable, NOT both.
What is a return-to-libc attack?
When the stack is non-executable, engineer your payload to overwrite the saved EIP so it points to a libc function instead of injected code.
1) Overwrite saved EIP -> point to the system() function in libc.
2) The payload must overflow the saved EIP with the address of system(), then one extra word (where system() expects to find its own return address), and after that the arguments to system().
What is ROP attack?
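ROP = Return-oriented programming. This is an attack that is similar to return-to-libc, but instead of calling whole functions it jumps to short instruction sequences at the very end of existing functions, each ending in a return (“gadgets”). You can chain these returns to continuously execute code of your choosing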
Explain what program fuzzing is and its 2 types (+ their cons)
To test code, developers run it on a ton of random inputs to search for bugs/crashes.
mutation-based: (dumb)- take a bunch of real example inputs and “mutate” them. These are not as thorough and you could theoretically do this forever. (too many test cases, need a strong server)
generative: (smart)- create test inputs based on the CODE (or input format) and where there could possibly be bugs. (too few test cases, way too much human work)
General issues: Need to know which of them are automatic, some can trigger the same bug, how do we prioritize bugs?
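A bare-bones mutation-based fuzzer, to make the "dumb" approach concrete. The target and its planted bug are made up, the PRNG is a small fixed generator so runs are reproducible, and target_crashes, next_rand, and fuzz are hypothetical names. Real fuzzers like AFL add coverage feedback on top of this loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Made-up target with a planted bug: "crashes" (returns 1) when the
 * input starts with the byte 'X'. */
static int target_crashes(const unsigned char *data, size_t len)
{
    return len > 0 && data[0] == 'X';
}

/* Small deterministic PRNG (linear congruential generator). */
static uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

/* Each round: copy the seed, overwrite one random byte with a random
 * value, run the target. Returns the first crashing round (1-based),
 * or 0 if no round crashed. */
unsigned fuzz(const unsigned char *seed, size_t len,
              unsigned rounds, uint32_t rng)
{
    unsigned char input[64];

    if (len == 0 || len > sizeof input)
        return 0;
    for (unsigned r = 1; r <= rounds; r++) {
        size_t pos = next_rand(&rng) % len;
        unsigned char value = (unsigned char)next_rand(&rng);

        memcpy(input, seed, len);
        input[pos] = value;
        if (target_crashes(input, len))
            return r;
    }
    return 0;
}
```

Even this one-byte bug can take many rounds to find by blind mutation, which illustrates the "too many test cases" con.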
What are some of the most industry-standard widely used fuzzers?
AFL (American Fuzzy Lop): open-source fuzzer developed at Google
OneFuzz: a self-hosted fuzzing-as-a-service platform from Microsoft
Explain which languages are memory safe/not-memory safe and their pros and cons
Not memory safe: C, C++, Assembly
Memory safe: Java, Python, JavaScript, and most other high-level languages
Memory safe cons- they are typically slower (runtime checks, garbage collection)!