02 - Reverse Engineering Flashcards

1
Q

What is Reverse Engineering and what does the process look like?

A

RE is the analysis of an unknown malicious application (or any other object) by investigating its code, mainly with a disassembler (IDA Pro) or a debugger (OllyDbg), with the goal of understanding its functions, structures and operations.

Main uses are: Malware Analysis, Vulnerability Research, Patching Files, Cracking, Fixing Legacy Applications.

SOURCE CODE -> Compilation -> OBJECT FILES -> Linking -> EXECUTABLE

EXECUTABLE -> Disassembly -> DISASSEMBLY -> Decompilation -> SOURCE CODE

2
Q

What is a Compiler?

A

The compiler takes the source code text and turns it into an object file. This is basically a file containing machine language and data that can later be read by a linker. Each function and data item in an object file has a corresponding symbol name by which it is referenced, and all of these symbol names are stored in the object file's symbol table.

3
Q

What is a Linker?

A

Its job is to take all of the project's object files, plus any libraries that need to be statically linked (embedded in the EXE, as opposed to dynamically linked at run time), and combine them into an executable, DLL, or SYS file. At this stage the application is only binary code, and humans can't read it (unless long lines of 0s and 1s make sense to you).

4
Q

What is a Disassembler?

A

A disassembler takes the binary and translates it into assembly language. Conceptually, disassembly is a simple process. Binary files are 0s and 1s, but these bits encode instructions that tell the CPU what to do (e.g. add two numbers, move some memory around). A disassembler simply lists all of those instructions.
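
The idea of "listing the instructions" can be sketched with a toy lookup table. This is only an illustration, not how IDA Pro or any real disassembler works: the `OPCODES` table and `toy_disassemble` function are hypothetical names, and the sketch covers just a handful of single-byte x86 opcodes while ignoring prefixes, operands and variable-length encodings.

```python
# Toy lookup table for a few single-byte x86 opcodes.
# Real disassemblers decode variable-length instructions with prefixes,
# ModR/M bytes and immediates; this sketch deliberately ignores all of that.
OPCODES = {
    0x50: "push eax",
    0x55: "push ebp",
    0x58: "pop eax",
    0x5D: "pop ebp",
    0x90: "nop",
    0xC3: "ret",
    0xCC: "int3",
}


def toy_disassemble(code):
    """Return one listing line per byte; unknown bytes are shown as raw data."""
    listing = []
    for offset, byte in enumerate(code):
        mnemonic = OPCODES.get(byte, "db 0x{:02x}".format(byte))
        listing.append("{:04x}: {}".format(offset, mnemonic))
    return listing


if __name__ == "__main__":
    print("\n".join(toy_disassemble(bytes([0x55, 0x90, 0x5D, 0xC3]))))
```

Bytes that are not in the table are emitted as `db` data lines, which is roughly what a listing looks like when a tool cannot decode a byte as code.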

5
Q

Name five main types of tools used in RE?

A
  • Hex editor (HxD, Hiew)
  • Disassembler (IDA Pro)
  • Search Engines (Google, MSDN)
  • Debugger (OllyDbg, Immunity Debugger) - similar to a disassembler, except that it also lets us step through the code as it runs
  • Scripting Language (Ruby, Perl, Python)
6
Q

Explain Byte, Word, Dword, Qword?

A

8 bits, 16 bits, 32 bits, 64 bits
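
As a quick check of these widths, the sketch below uses Python's `struct` module to report the bit width and the largest unsigned value for each unit. The `SIZES` table and `describe` helper are names made up for this example.

```python
import struct

# struct format codes for unsigned integers; the "<" prefix selects
# standard (platform-independent) sizes.
SIZES = {"byte": "<B", "word": "<H", "dword": "<I", "qword": "<Q"}


def describe(name):
    """Return (bits, largest unsigned value) for the named unit."""
    n_bytes = struct.calcsize(SIZES[name])
    bits = n_bytes * 8
    return bits, (1 << bits) - 1


if __name__ == "__main__":
    for unit in SIZES:
        bits, max_value = describe(unit)
        print("{:<5} {:>2} bits  max 0x{:X}".format(unit, bits, max_value))
```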

7
Q

What is Endianness?

A

Endianness is the order in which the bytes of a multi-byte value are stored in memory: least significant byte first (little endian) or most significant byte first (big endian). Intel x86 uses little endian.
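
A small sketch of the difference, using Python's `struct` module to lay out the same dword both ways. The value `0x12345678` is an arbitrary example.

```python
import struct

value = 0x12345678

# "<" packs little endian (least significant byte first), as x86 stores it.
little = struct.pack("<I", value)
# ">" packs big endian (most significant byte first).
big = struct.pack(">I", value)

if __name__ == "__main__":
    print("little endian:", little.hex())
    print("big endian:   ", big.hex())
```

This is why a dword such as an address appears "backwards" when viewed byte by byte in a hex editor on x86.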

8
Q

How large in bytes and hex is an ASCII character?

A

1 byte, written as a 2-digit hex code
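
For illustration, the sketch below encodes a short string as ASCII and shows each character as one byte and one 2-digit hex pair. The string "MZ" is just an example input.

```python
text = "MZ"

# Each ASCII character becomes exactly one byte.
encoded = text.encode("ascii")

# Each byte is written as two hex digits.
hex_pairs = ["{:02x}".format(b) for b in encoded]

if __name__ == "__main__":
    print(len(encoded), "bytes:", " ".join(hex_pairs))
```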

9
Q

What is extended ASCII?

A

As people gradually required computers to understand additional characters and non-printing characters, the ASCII set became restrictive. As with most technology, it took a while to agree on a single standard for these extra characters, and hence there are a few varying 'extended' sets.

10
Q

What is Unicode?

A

Standard ASCII defines only 128 characters (7 bits), and even the 1-byte extended sets top out at 256 characters in total. Unicode emerged as an alternative character standard. The encoding most often met when reversing Windows binaries is UTF-16, which uses 2 bytes (1 word) per character for the first 65,536 code points. Strings of Unicode text may start with a BOM (Byte Order Mark), placed at the start of a stream to indicate whether the 2 bytes representing each character are in little-endian or big-endian format.
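
The 2-byte layout and the BOM check can be sketched as follows. The `detect_utf16_order` helper is a hypothetical name for this example; the BOM constants come from Python's standard `codecs` module.

```python
import codecs

text = "Hi"

# Two bytes per character; the byte order differs between the two encodings.
le = text.encode("utf-16-le")
be = text.encode("utf-16-be")


def detect_utf16_order(data):
    """Guess the byte order of UTF-16 data from a leading BOM, if present."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big"
    return "unknown"


if __name__ == "__main__":
    print(le.hex(), be.hex())
    print(detect_utf16_order(codecs.BOM_UTF16_LE + le))
```

Without a BOM the byte order cannot be read from the stream itself, which is why the helper returns "unknown" in that case.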

11
Q

What is CISC?

A

This means Complex Instruction Set Computer (each instruction can carry out several low-level ops) e.g. load from memory, copy to register, arithmetic all in one instruction.

12
Q

What is a Control Unit?

A

Fetches and decodes instructions. It also retrieves operands and stores results in memory.

13
Q

What is an Execution Unit?

A

This is the main part of the CPU – it performs all of the instructions e.g. Adding, Subtracting, moving values in memory etc.

14
Q

What are Registers?

A

To carry out its operations, the Execution Unit often needs some local storage to keep information. This is what a register is. Think of it as a small piece of memory in the CPU itself (32 bits wide on 32-bit x86) where the CPU can store information. In that sense it is much like a variable in a programming language. Because the registers are in the CPU itself, they are much faster to access than RAM.

15
Q

What are Flags?

A

Flags are individual bits in the EFLAGS status register that indicate various events taking place in the CPU. For example, if a subtraction results in an answer of 0, the Zero Flag is set.
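
The subtraction example can be modelled with a minimal sketch. `sub32_flags` is a hypothetical helper that mimics how a 32-bit SUB would set three of the flags (ZF, SF, CF); it ignores the other flags and is not a full CPU model.

```python
MASK32 = 0xFFFFFFFF


def sub32_flags(a, b):
    """Model a 32-bit subtraction a - b and three of the resulting flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "result": result,
        "ZF": int(result == 0),  # Zero Flag: the result is zero
        "SF": result >> 31,      # Sign Flag: top bit of the result
        "CF": int(a < b),        # Carry Flag: an unsigned borrow occurred
    }


if __name__ == "__main__":
    print(sub32_flags(5, 5))
    print(sub32_flags(1, 2))
```

Conditional jumps such as JZ and JNZ test these bits, so in a listing a CMP or SUB followed by a jump is the usual shape of an `if` statement.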

16
Q

Name the four main categories of registers?

A
  • General Purpose Registers
  • Segment Registers
  • Instruction Pointers
  • Control Registers
17
Q

What are GPRs for?

A

These are the most commonly used registers from a reverse engineering perspective.
EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI

18
Q

What are Segment Registers for?

A

When systems had only 16 bit registers, this only allowed the OS to address up to 65536 bytes of memory space. To get around this Intel came up with Segment Registers, and divided memory space into different chunks of memory – each 65536 bytes in size, and then used the Segment register as an index. Nowadays this is no longer the case, but some of these registers are still used.

  • CS is used to point to the code section of memory
  • DS is used to point to the data section of memory
  • FS is used to point to the Thread Environment Block.
19
Q

What are Instruction Pointers for?

A

EIP. This is the Instruction Pointer (aka Program Counter). On Intel-based processors it always contains the address of the next instruction to be executed. In 64-bit mode it is called RIP.

20
Q

What are Control Registers for?

A

Again, we will rarely use these in reversing, but they are here for completeness. They contain settings that alter the behaviour of the CPU, e.g. bit 16 of CR0 is the Write Protect bit: when it is cleared to 0, kernel-mode (ring 0) code can write to pages marked read-only.

CR0, CR1, CR2, CR3, CR4

21
Q

Name some of the main flags?

A
CF = Carry Flag
PF = Parity Flag
ZF = Zero Flag
SF = Sign Flag
TF = Trap Flag
DF = Direction Flag
OF = Overflow Flag
22
Q

What is the difference between 32bit and 64bit registers?

A

The registers now begin with R instead of E, and 64-bit mode has sixteen GPRs: RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8 - R15. To access all 64 bits use RAX, to access the bottom 32 use EAX, the bottom 16 AX, and so on. The 8 new general purpose registers can be accessed in a similar manner using R8 (qword), R8D (lower dword), R8W (lowest word), R8B (lowest byte). There is no R8H. RFLAGS is basically the same as EFLAGS, but 64 bits wide; the top 32 bits are reserved.
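
The way the smaller names alias parts of the same register can be shown with bit masks. The `sub_registers` helper below is a hypothetical illustration, and the sample value is arbitrary.

```python
def sub_registers(rax):
    """Return the views of a 64-bit RAX value seen through each alias."""
    return {
        "RAX": rax & 0xFFFFFFFFFFFFFFFF,  # all 64 bits
        "EAX": rax & 0xFFFFFFFF,          # bottom 32 bits
        "AX": rax & 0xFFFF,               # bottom 16 bits
        "AH": (rax >> 8) & 0xFF,          # bits 8-15
        "AL": rax & 0xFF,                 # bottom 8 bits
    }


if __name__ == "__main__":
    for name, value in sub_registers(0x1122334455667788).items():
        print("{:<3} = 0x{:X}".format(name, value))
```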

23
Q

What is the stack?

A

The stack is a Last-In, First-Out (LIFO) area of memory that every thread has, normally used to store local variables, function arguments and return addresses created at run time. PUSH and POP write data to and read data from the stack. The stack actually grows downwards, towards lower addresses.

  • EBP normally holds the base address of the current stack frame
  • ESP moves as data is pushed and popped and always points to the top of the stack
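
A toy model of a 32-bit stack makes the "grows downwards" behaviour concrete. The `ToyStack` class and the starting address `0x1000` are assumptions made for this sketch; it tracks only ESP and 4-byte slots.

```python
class ToyStack:
    """Minimal model of a 32-bit stack: ESP falls on PUSH and rises on POP."""

    def __init__(self, top=0x1000):
        self.esp = top      # address of the current top of the stack
        self.memory = {}    # address -> stored dword

    def push(self, value):
        self.esp -= 4                            # grow towards lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.esp)        # read the dword at the top
        self.esp += 4                            # shrink back upwards
        return value


if __name__ == "__main__":
    stack = ToyStack()
    stack.push(0xAAAA)
    stack.push(0xBBBB)
    print(hex(stack.esp))    # lower than the starting address
    print(hex(stack.pop()))  # last value pushed comes out first
```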
24
Q

Explain stdcall & cdecl?

A
  • __stdcall: Standard calling convention for the Windows API. The callee cleans the stack inside the function, e.g. RET 8
  • __cdecl: Most common C calling convention. The caller cleans the stack after the function returns, e.g. ADD ESP, 8
  • __fastcall: Passes the first arguments in registers (ECX and EDX with Microsoft compilers) instead of on the stack
  • __thiscall: C++ member functions. Passes the 'this' pointer in ECX
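
The difference between the stdcall and cdecl cleanup can be traced with a small simulation of ESP. `simulate_call` is a hypothetical helper, and the starting ESP of `0x1000` and the 8 bytes of arguments are assumed values chosen to match the RET 8 and ADD ESP, 8 examples.

```python
def simulate_call(convention, esp=0x1000, arg_bytes=8):
    """Trace ESP through a 32-bit call; return a list of (step, esp) pairs."""
    trace = []
    esp -= arg_bytes
    trace.append(("caller: push args", esp))
    esp -= 4
    trace.append(("call: push return address", esp))
    if convention == "stdcall":
        # RET n pops the return address and the arguments in one step.
        esp += 4 + arg_bytes
        trace.append(("callee: ret {}".format(arg_bytes), esp))
    elif convention == "cdecl":
        # Plain RET pops only the return address; the caller fixes ESP.
        esp += 4
        trace.append(("callee: ret", esp))
        esp += arg_bytes
        trace.append(("caller: add esp, {}".format(arg_bytes), esp))
    else:
        raise ValueError("unsupported convention: " + convention)
    return trace


if __name__ == "__main__":
    for name in ("stdcall", "cdecl"):
        print(name)
        for step, esp in simulate_call(name):
            print("  {:<28} esp=0x{:X}".format(step, esp))
```

Both conventions leave ESP where it started; what differs is which side of the call contains the cleanup instruction, and that is the detail to look for in a listing.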
25
Q

What are structures?

A
  • Can have different types of data (compare to Arrays where everything is the same).
  • Keyword STRUCT tells the compiler we are creating a structure.
  • Do not occupy any memory until associated with a variable, then memory is allocated for all members combined.
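
A structure with mixed member types can be sketched with Python's `ctypes` module, which mirrors C layout rules. The `Record` type and its fields are invented for this example. Note that the compiler may add padding between members, so the total size can be larger than the plain sum of the members.

```python
import ctypes


class Record(ctypes.Structure):
    """Hypothetical structure mixing three different member types."""
    _fields_ = [
        ("tag", ctypes.c_char),
        ("count", ctypes.c_int),
        ("ratio", ctypes.c_double),
    ]


# Plain sum of the member sizes, ignoring any alignment padding.
members_total = sum(ctypes.sizeof(ctype) for _, ctype in Record._fields_)

if __name__ == "__main__":
    print("sum of members:", members_total)
    print("sizeof(Record):", ctypes.sizeof(Record))
    print("offset of count:", Record.count.offset)
```

Defining the class allocates nothing for the data; memory is reserved only when an instance such as `Record()` is created.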
26
Q

What are unions?

A

An alternative to a structure is a union. In a structure we store several items of different types. A union stores only a single value at a time, but that value can be of several types. Memory is allocated for the largest member.
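
The shared storage of a union can be sketched with `ctypes.Union`. The `Value` type and its member names are made up for this example.

```python
import ctypes


class Value(ctypes.Union):
    """Hypothetical union: three views of the same storage."""
    _fields_ = [
        ("as_byte", ctypes.c_uint8),
        ("as_dword", ctypes.c_uint32),
        ("as_qword", ctypes.c_uint64),
    ]


if __name__ == "__main__":
    v = Value()
    v.as_qword = 0
    v.as_byte = 0xFF  # writes into the same bytes that as_qword reads
    print("sizeof(Value):", ctypes.sizeof(Value))
    print("as_qword after writing as_byte:", hex(v.as_qword))
```

The size equals that of the largest member (8 bytes here), and writing through one member changes what the others read back.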

27
Q

How many calling conventions are common for 64 bit systems?

A

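Only one per platform (on Windows, the Microsoft x64 convention). As there are a lot of extra registers, a fastcall-style convention makes sense: on Windows the first four arguments are passed in RCX, RDX, R8 and R9. It is the caller's responsibility to clean the stack after the call.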

28
Q

What are pointers?

A

CHAR_PTR for example, is a POINTER to a CHAR in memory. Although the CHAR value is only 1 byte in size, the POINTER contains the memory address of that character – so it will be 32 bits (4 bytes) or 64 bits (8 bytes) depending on the machine.
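
The size difference between a value and a pointer to it can be checked with `ctypes`. The variable names below are illustrative only.

```python
import ctypes

# A 1-byte character value and a pointer that holds its address.
ch = ctypes.c_char(b"A")
ptr = ctypes.pointer(ch)

char_size = ctypes.sizeof(ch)        # size of the value itself
pointer_size = ctypes.sizeof(ptr)    # size of the address: 4 or 8 bytes

if __name__ == "__main__":
    print("char size:   ", char_size)
    print("pointer size:", pointer_size)
    print("pointed-to value:", ptr.contents.value)
```

The pointer size reported depends on whether the Python interpreter is a 32-bit or 64-bit build.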

29
Q

How big are:

  • char
  • short
  • int
  • long
  • float
  • double
  • pointers
A
  • char: 1 byte
  • short: 2 bytes
  • int: 4 bytes
  • long: 4 bytes (on Windows; 8 bytes on 64-bit Linux and macOS)
  • float: 4 bytes floating point
  • double: 8 bytes floating point
  • pointers: 4 bytes on 32-bit, 8 bytes on 64-bit
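
These sizes can be checked on the current machine with `ctypes`, which reports the sizes used by the platform's C compiler. The `C_TYPES` and `SIZES` tables are names invented for this sketch. The results for `long` and pointers vary by platform, so no fixed value is assumed for them.

```python
import ctypes

# Map each C type name from the card to its ctypes equivalent.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "pointer": ctypes.c_void_p,
}

# Size in bytes of each type on the platform running this script.
SIZES = {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}

if __name__ == "__main__":
    for name, size in SIZES.items():
        print("{:<8} {} bytes".format(name, size))
```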
30
Q

What are the main stages of malware analysis?

A

0 - Extracting malware
1 - Static Analysis
2 - Blackboxing
3 - Internet Search
4 - Whiteboxing
5 - Result Presentation