02 - Reverse Engineering Flashcards
What is Reverse Engineering and what does the process look like?
RE is the analysis of an unknown, often malicious, application (or any other object) by investigating its code, mainly with a disassembler (IDA Pro) or a debugger (OllyDbg), with the goal of understanding its functions, structures and operations.
Main uses are: Malware Analysis, Vulnerability Research, Patching Files, Cracking, Fixing Legacy Applications.
SOURCE CODE -> Compilation -> OBJECT FILES -> Linking -> EXECUTABLE
EXECUTABLE -> Disassembly -> DISASSEMBLY -> Decompilation -> SOURCE CODE
What is a Compiler?
The compiler takes the source code text and turns it into an object file. This is basically a file containing machine language and data that can later be read by a linker. Each function and data item in an object file has a corresponding symbol name by which it is referenced, and all of these symbol names are stored in the object file's symbol table.
What is a Linker?
Its job is to take all of the project's object files, plus any libraries that need to be statically linked (embedded in the executable, as opposed to dynamically linked at run time), and combine them into an executable, DLL or SYS file. At this stage the application is only binary code, which humans cannot read directly (unless long lines of 0s and 1s make sense to you).
What is a Disassembler?
A disassembler takes the binary and translates it into assembly language. Conceptually, disassembly is a simple process. Binary files are 0s and 1s, but these bits make up instructions that tell the CPU what to do (e.g. add two numbers, move some memory around). A disassembler decodes those instructions and lists them in readable form.
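To make the idea concrete, here is a minimal sketch of a disassembler for a tiny, hand-picked subset of x86 opcodes (nop, ret, int3 and mov eax, imm32). The function name disassemble and the opcode table are illustrative assumptions, not how IDA Pro or any real tool is implemented; real disassemblers handle hundreds of variable-length encodings.

```python
import struct

# One-byte x86 opcodes that take no operands (a deliberately tiny subset).
SIMPLE_OPCODES = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}


def disassemble(code: bytes) -> list:
    """Decode raw bytes into a list of assembly-like text lines."""
    lines = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode in SIMPLE_OPCODES:
            lines.append(SIMPLE_OPCODES[opcode])
            offset += 1
        elif opcode == 0xB8 and offset + 5 <= len(code):
            # B8 encodes "mov eax, imm32": the opcode byte is followed
            # by a 4-byte little-endian immediate value.
            (imm,) = struct.unpack_from("<I", code, offset + 1)
            lines.append(f"mov eax, 0x{imm:08x}")
            offset += 5
        else:
            # Anything this sketch does not know is listed as a raw data byte.
            lines.append(f"db 0x{opcode:02x}")
            offset += 1
    return lines


print(disassemble(bytes.fromhex("b87856341290c3")))
# ['mov eax, 0x12345678', 'nop', 'ret']
```

Note how the immediate bytes 78 56 34 12 are read back as 0x12345678, because x86 stores multi-byte values in little-endian order.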
Name five main types of tools used in RE?
- Hex editor (HxD, Hiew)
- Disassembler (IDA Pro)
- Search Engines (Google, MSDN)
- Debugger (OllyDbg, Immunity Debugger) - similar to a disassembler, except that it also lets us step through the code while it runs
- Scripting Language (Ruby, Perl, Python)
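As a small illustration of two of the tool types above working together, the following sketch uses a scripting language to produce the kind of view a hex editor shows: offset, hex bytes, and printable ASCII. The helper name hexdump and the sample bytes are assumptions made for the example; it does not reproduce the exact layout of HxD or Hiew.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Return a hex-editor style dump: offset, hex bytes, ASCII column."""
    pad = width * 3 - 1  # width of a full row of "xx " groups, minus the last space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as hex editors usually do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


# Sample input: a few arbitrary bytes followed by readable text.
print(hexdump(b"MZ\x90\x00\x03\x00\x00\x00Hello"))
```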
Explain Byte, Word, Dword, Qword?
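Byte = 8 bits, Word = 16 bits, Dword (double word) = 32 bits, Qword (quad word) = 64 bits. These are the sizes used in x86 and Windows terminology.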
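The sizes can be checked with Python's struct module, whose format codes B, H, I and Q correspond to 1-, 2-, 4- and 8-byte unsigned integers when a byte-order prefix such as "<" is used. The names UNITS and unit_bits are made up for this sketch.

```python
import struct

# struct format codes for unsigned integers of each size.
UNITS = {"byte": "B", "word": "H", "dword": "I", "qword": "Q"}


def unit_bits() -> dict:
    """Map each unit name to its width in bits."""
    # The "<" prefix selects standard sizes, independent of the host platform.
    return {name: struct.calcsize("<" + fmt) * 8 for name, fmt in UNITS.items()}


for name, bits in unit_bits().items():
    print(f"{name}: {bits} bits, max value 0x{(1 << bits) - 1:X}")
```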
What is Endianness?
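Endianness is the order in which the bytes of a multi-byte value are stored in memory: least significant byte first (little endian) or most significant byte first (big endian). Intel x86 uses little endian, so the Dword 0x12345678 is stored as the bytes 78 56 34 12.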
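A short sketch of the same 32-bit value laid out in both byte orders, using Python's built-in int.to_bytes and int.from_bytes. The value 0x12345678 is just an example.

```python
value = 0x12345678

little = value.to_bytes(4, "little")  # byte order used by Intel x86
big = value.to_bytes(4, "big")

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading the bytes back with the wrong byte order gives a different number.
print(hex(int.from_bytes(little, "big")))  # 0x78563412
```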
How large is an ASCII character in bytes and in hex?
1 byte, which is written as a 2-digit hex code.
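For instance, each character of a short ASCII string encodes to exactly one byte, which prints as two hex digits. The helper name ascii_hex is an assumption for this sketch.

```python
def ascii_hex(text: str) -> list:
    """Return the 2-digit hex code of each ASCII character in text."""
    # str.encode("ascii") raises UnicodeEncodeError for non-ASCII input.
    return [f"{byte:02x}" for byte in text.encode("ascii")]


print(ascii_hex("RE!"))               # ['52', '45', '21']
print(len("RE!".encode("ascii")))     # 3 bytes for 3 characters
```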
What is extended ASCII?
As people gradually required computers to handle additional characters and non-printing characters, the original ASCII set (128 characters) became too restrictive. Extended ASCII uses the remaining values 128 to 255 of the byte for these extra characters. As with most technology, it took a while to agree on a single standard, so there are several differing 'extended' sets.
What is Unicode?
ASCII uses only 1 byte per character, so even the extended sets allow just 256 characters in total. Unicode emerged as an alternative character standard. In the form commonly met on Windows (UTF-16), it uses 2 bytes (1 Word) per character, allowing 65,536 characters; characters beyond that range are encoded as a pair of 2-byte units. Strings of Unicode text may start with a BOM (Byte Order Mark), placed at the start of the stream. It indicates whether the 2 bytes representing each character are in little-endian or big-endian format.
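The BOM check can be sketched with the constants from Python's codecs module. The function name detect_utf16_order is hypothetical, and the sketch only distinguishes the two UTF-16 byte orders; it does not try to recognise UTF-8 or UTF-32 marks.

```python
import codecs


def detect_utf16_order(data: bytes) -> str:
    """Guess the UTF-16 byte order from a leading BOM, if there is one."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    return "unknown"


# The letter "A" (code 0x41) as 2 bytes in each order, prefixed with a BOM.
le = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")
be = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")

print(le.hex(), detect_utf16_order(le))  # fffe4100 little
print(be.hex(), detect_utf16_order(be))  # feff0041 big
```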
What is CISC?
CISC stands for Complex Instruction Set Computer. Each instruction can carry out several low-level operations, e.g. load from memory, copy to a register and do arithmetic, all in one instruction.
What is a Control Unit?
Fetches and decodes instructions. It also retrieves their parameters and stores results in memory.
What is an Execution Unit?
This is the main part of the CPU – it performs all of the instructions e.g. Adding, Subtracting, moving values in memory etc.
What are Registers?
To carry out its operations, the Execution Unit often needs some local storage to keep information. This is what a register is. Think of it as a small piece of memory (32 bits wide on 32-bit x86) in the CPU itself, where the CPU can store information. In that sense it is much like a variable in a programming language. Because the registers are in the CPU itself, they are much faster to access than RAM.
What are Flags?
Lastly we have flags, held in the status register (called EFLAGS on x86), which indicate various events taking place in the CPU. For example, if a subtraction results in 0, the Zero Flag is set.
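The Zero Flag example can be modelled with a small sketch of a 32-bit subtraction. The function sub32 is hypothetical, and the Sign Flag and Carry Flag are added here for illustration beyond what the card mentions; a real EFLAGS register holds more flags than these three.

```python
MASK32 = 0xFFFFFFFF


def sub32(a: int, b: int):
    """Subtract b from a as a 32-bit register would; return (result, flags)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32  # wrap around like a 32-bit register
    flags = {
        "ZF": result == 0,                # Zero Flag: the result is 0
        "SF": bool(result & 0x80000000),  # Sign Flag: top bit of the result
        "CF": a < b,                      # Carry Flag: unsigned borrow occurred
    }
    return result, flags


print(sub32(5, 5))  # (0, {'ZF': True, 'SF': False, 'CF': False})
print(sub32(3, 5))  # (4294967294, {'ZF': False, 'SF': True, 'CF': True})
```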