Obfuscation Flashcards
What is an informal definition of obfuscation?
To obfuscate a program P means to transform it into an executable program P’ from which it is harder to extract information than from P.
What is an informal definition of reverse engineering?
The process of extracting data or a model of a system by inspecting its lower-level description and/or its behavior.
Name 2 attack scenarios addressed by obfuscation
Stealing intellectual property, stealing secrets embedded in the program
Name the two main types of obfuscation and their respective properties
Static obfuscation:
- obfuscated program remains fixed at runtime
- raises bar against static analysis
- can be attacked through dynamic techniques
Dynamic obfuscation:
- program keeps changing at runtime -> self modifying code
- raises the bar against static analysis
What are different ‘Points of insertion’ for obfuscation?
Source code, Intermediate representation, machine code
What are the different Transformation targets?
- layout -> scramble identifiers and code layout
- data -> obfuscate data embedded in code
- control flow -> obfuscate secret algorithms
Name 9 different static obfuscation techniques
Confuse Code Reader:
- Scramble identifiers
- Instruction substitution
- Garbage code insertion
- Merging and splitting functions
- Control-flow flattening
Confuse Code Reader and Compiler:
- Opaque predicates
- Virtualization obfuscation
- Opaque expressions
- White-box cryptography
What is Scrambling identifiers?
Identifier names are replaced with random strings
What is instruction substitution?
Replace a binary operation with a functionally equivalent but more complicated computation
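Example (a minimal sketch, not taken from the deck): addition can be replaced by XOR, AND, and a shift, using the identity x + y = (x ^ y) + 2 * (x & y). The function names and the 32-bit mask are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # assumption: emulate 32-bit unsigned machine arithmetic


def add_plain(x: int, y: int) -> int:
    """The original binary operation: x + y."""
    return (x + y) & MASK


def add_substituted(x: int, y: int) -> int:
    """Same result via the identity x + y == (x ^ y) + 2 * (x & y)."""
    carry_free_sum = x ^ y    # bitwise sum without carries
    carries = (x & y) << 1    # carry bits, shifted into position
    return (carry_free_sum + carries) & MASK
```

Both functions return the same value for every pair of non-negative inputs, but the second one no longer contains an obvious "add two operands" step.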
What is garbage code insertion?
Dead code is added
What are opaque predicates?
Opaque predicates are branch conditions that always evaluate to the same value, so the same branch is always taken, although this is hard for an attacker to see
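Example (a minimal sketch): the product of two consecutive integers is always even, so the condition below is always true and the else branch is bogus. This arithmetic predicate is deliberately simple and would be easy to spot in practice; the function names are hypothetical.

```python
def opaque_true(x: int) -> bool:
    """Always True: x * (x + 1) is a product of consecutive integers, hence even."""
    return (x * (x + 1)) % 2 == 0


def apply_discount(price: int, x: int) -> int:
    """x is any unrelated program value; it only feeds the predicate."""
    if opaque_true(x):
        return price - price // 10   # real code: always executed
    else:
        return price * 3 + x         # bogus code: never executed
```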
What is control-flow flattening?
1. Put each basic block in a case of a switch statement
2. Wrap the switch statement in an infinite loop
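Example (a sketch under assumptions): Euclid's algorithm before and after flattening. An if/elif chain stands in for the switch statement, and the variable name next_block is chosen for illustration.

```python
def gcd_plain(a: int, b: int) -> int:
    """Original control flow: one loop, easy to read."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    """Same computation with every basic block behind one dispatcher."""
    next_block = 0
    while True:                       # infinite loop around the dispatcher
        if next_block == 0:           # block 0: loop condition
            next_block = 1 if b != 0 else 2
        elif next_block == 1:         # block 1: loop body
            a, b = b, a % b
            next_block = 0
        elif next_block == 2:         # block 2: exit
            return a
```

After flattening, all blocks sit at the same nesting level, so the original loop structure is no longer visible from the layout of the code.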
What is a possible attack on control-flow flattening and how could it be countered?
- Find next blocks of every basic block
- Rebuild original CFG
Mitigation: assign opaque expression to next
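Example (a sketch of the mitigation, with assumptions): in a flattened Euclid's algorithm, the successor of each block is computed from opaque expressions instead of literal constants, so an attacker cannot simply read off the next block. The helpers opaque_zero and opaque_one are hypothetical and intentionally simple; a real obfuscator would use expressions that are much harder to analyze.

```python
def opaque_zero(v: int) -> int:
    """Always 0: v * (v + 1) is even."""
    return (v * (v + 1)) % 2


def opaque_one(v: int) -> int:
    """Always 1: v * (v + 1) + 1 is odd."""
    return (v * (v + 1) + 1) % 2


def gcd_flattened_opaque(a: int, b: int) -> int:
    next_block = opaque_zero(a + b)               # start at block 0
    while True:
        if next_block == 0:                       # loop condition
            if b != 0:
                next_block = opaque_one(a)        # go to block 1
            else:
                next_block = 2 * opaque_one(b)    # go to block 2
        elif next_block == 1:                     # loop body
            a, b = b, a % b
            next_block = opaque_zero(a + b)       # back to block 0
        else:                                     # block 2: exit
            return a
```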
What is an opaque expression?
An opaque expression is an expression that will always evaluate to the same value in a way not obvious for an attacker.
How do opaque expressions from array aliasing work?
- A statically initialized array with seemingly random values
- The values are generated such that some invariant holds
- Update array cells with values that respect invariants
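Example (a minimal sketch): the invariants chosen here are assumptions for illustration. Every cell at an even index is congruent to 1 modulo 5, and every cell at an odd index is congruent to 2 modulo 7. Updates pick new values that keep both invariants, so the opaque expressions stay constant while the array contents keep changing.

```python
import random

# Statically initialized array that looks random but satisfies the invariants:
# even index -> value % 5 == 1, odd index -> value % 7 == 2.
ARR = [36, 58, 11, 23, 46, 9, 71, 65, 6, 100, 91, 37]


def update_random_cell(arr, rng):
    """Overwrite one cell with a fresh value that respects its invariant."""
    i = rng.randrange(len(arr))
    if i % 2 == 0:
        arr[i] = 5 * rng.randrange(1, 1000) + 1
    else:
        arr[i] = 7 * rng.randrange(1, 1000) + 2


def opaque_one(arr, rng):
    """Always 1: reads a random even-index cell modulo 5."""
    return arr[rng.randrange(0, len(arr), 2)] % 5


def opaque_true(arr, rng):
    """Always True: compares 1 (even cell mod 5) with 2 (odd cell mod 7)."""
    even_cell = arr[rng.randrange(0, len(arr), 2)]
    odd_cell = arr[rng.randrange(1, len(arr), 2)]
    return even_cell % 5 != odd_cell % 7
```

An analysis tool would have to track every write to the array and reason about the indices to prove that these expressions are constant.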
How does virtualization obfuscation work?
- Generate random bytecode instruction set architecture (ISA) L covering all instructions of P
- Translate P to L bytecode program
- Generate emulator to interpret L bytecode on machine
Output: P’ consisting of bytecode and emulator
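Example (a toy sketch, assuming P is the function f(x) = x * x + 3): a random opcode assignment is generated, P is translated to bytecode in that instruction set, and a matching stack-based emulator is produced. The opcode names, the stack design, and the function names are assumptions for illustration.

```python
import random

OPS = ["PUSH", "LOAD", "ADD", "MUL", "RET"]  # instructions needed by P


def generate_isa(rng):
    """Assign a random, distinct byte value to every instruction."""
    return dict(zip(OPS, rng.sample(range(256), len(OPS))))


def translate(isa):
    """Translate P (f(x) = x * x + 3) into bytecode for the given ISA."""
    return [isa["LOAD"], isa["LOAD"], isa["MUL"],
            isa["PUSH"], 3, isa["ADD"], isa["RET"]]


def make_emulator(isa):
    """Generate an interpreter that understands only this ISA."""
    decode = {code: name for name, code in isa.items()}

    def run(bytecode, x):
        stack, pc = [], 0
        while True:
            op = decode[bytecode[pc]]
            pc += 1
            if op == "PUSH":               # push the inline operand
                stack.append(bytecode[pc])
                pc += 1
            elif op == "LOAD":             # push the program input
                stack.append(x)
            elif op == "ADD":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == "MUL":
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
            elif op == "RET":
                return stack.pop()

    return run


def virtualize(seed):
    """Return P' as a pair: the bytecode and the emulator for it."""
    isa = generate_isa(random.Random(seed))
    return translate(isa), make_emulator(isa)
```

Each seed yields its own opcode assignment, so an attacker first has to recover the instruction set before the bytecode can be read.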
What is the goal and the idea behind White-Box cryptography?
Goal: Hide encryption/decryption key
Idea: Embed the key within the cipher
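Example (a toy sketch, not a secure construction): one cipher step, an S-box lookup after XOR with a key byte, is precomputed into a single lookup table so that the key never appears as a value in the shipped code. The S-box, the function names, and the single-byte key are assumptions. In this toy version the key is trivially recoverable by anyone who knows the S-box, which is why real white-box designs additionally apply secret encodings to the tables.

```python
import random


def make_sbox(seed=0):
    """A fixed, public byte permutation standing in for a real S-box."""
    rng = random.Random(seed)
    sbox = list(range(256))
    rng.shuffle(sbox)
    return sbox


SBOX = make_sbox()


def encrypt_byte_reference(x: int, key: int) -> int:
    """Ordinary implementation: the key is an explicit input."""
    return SBOX[x ^ key]


def build_whitebox_table(key: int) -> list:
    """Done once by the vendor: fold the key into the table."""
    return [SBOX[x ^ key] for x in range(256)]


def encrypt_byte_whitebox(x: int, table: list) -> int:
    """Shipped implementation: a key-dependent table, no explicit key."""
    return table[x]
```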
What are some issues with software diversity?
- analysis of crash dumps
- incremental updates
- digitally signing all versions
Name two types of software diversity
- Pre-distribution Software Diversity
- Post-distribution Software Diversity
In which phases does dynamic obfuscation run?
- At compile time
  - initial program configuration is generated
  - runtime code-transformer is added
- At runtime
  - interleave execution of the program with calls to the code-transformer T
  - T changes the code at runtime
  - ideally a non-repeating series of configurations, in practice they repeat
How does replacing instructions work?
- Replace real instructions with bogus instructions
- Just before execution replace bogus instructions with real instructions
- After execution replace real instructions with bogus instructions
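Example (a simulation, not real self-modifying machine code): code memory is modeled as a Python list of (operation, argument) pairs. Where the real instructions are kept while they are not in code memory is left open here; the REAL list is an assumption of the sketch.

```python
REAL = [("add", 5), ("mul", 3), ("sub", 4)]   # the protected instructions
BOGUS = ("mul", 0)                            # placeholder stored at rest


def apply_op(op, arg, x):
    if op == "add":
        return x + arg
    if op == "sub":
        return x - arg
    if op == "mul":
        return x * arg
    raise ValueError(op)


def run_protected(x):
    """Execute REAL while code memory holds at most one real instruction."""
    code = [BOGUS] * len(REAL)        # what a static attacker sees
    snapshots = []
    for pc in range(len(REAL)):
        code[pc] = REAL[pc]           # just before execution: patch in real
        op, arg = code[pc]
        x = apply_op(op, arg, x)
        code[pc] = BOGUS              # after execution: patch bogus back
        snapshots.append(list(code))  # memory image between instructions
    return x, snapshots
```

Every memory image taken between two instructions contains only bogus entries, although the program computes the real result.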
How does dynamic code merging work?
- Have two or more functions share the same location in memory
- Create templates for functions that share the same location
- Before function is called, patch memory using edit script to load it
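Example (a simulation with hypothetical functions f and g): both functions live in the same memory region. The template holds the instructions they have in common, and an edit script per function fills in the slots that differ right before the call.

```python
# Shared template: None marks slots that differ between the functions.
TEMPLATE = [("add", 1), None, ("mul", 2), None]

# Edit scripts: slot index -> instruction to patch in.
EDIT_SCRIPTS = {
    "f": {1: ("mul", 3), 3: ("sub", 1)},   # f(x) = (x + 1) * 3 * 2 - 1
    "g": {1: ("sub", 2), 3: ("add", 7)},   # g(x) = (x + 1 - 2) * 2 + 7
}

region = list(TEMPLATE)   # the single memory region shared by f and g


def apply_op(op, arg, x):
    if op == "add":
        return x + arg
    if op == "sub":
        return x - arg
    if op == "mul":
        return x * arg
    raise ValueError(op)


def call(name, x):
    """Patch the shared region with the edit script, then execute it."""
    for slot, instr in EDIT_SCRIPTS[name].items():
        region[slot] = instr
    for op, arg in region:
        x = apply_op(op, arg, x)
    return x
```

At any moment the region holds at most one of the two functions, so a single memory snapshot never shows both.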
How does dynamic decryption and re-encryption work?
1. Execute current basic block
2. At some point the current block decrypts the next basic block
   - Decryption key could be hash of some other basic block
3. Jump to decrypted block
4. Encrypt the previously executed basic block
5. Go to step 1
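Example (a simulation under assumptions): basic blocks are short byte strings such as b"add 5", encrypted with an XOR keystream. The key is derived from the SHA-256 hash of a stub block that is never encrypted, which is one possible reading of "hash of some other basic block". The straight-line block order and all names are chosen for illustration, and the XOR scheme is not meant as a secure cipher.

```python
import hashlib

STUB = b"hypothetical unencrypted stub block"        # hashed to derive keys
PLAINTEXT_BLOCKS = [b"add 5", b"mul 3", b"sub 4"]    # straight-line program


def keystream(index, length):
    """Key material for block `index`, derived from the hash of STUB."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(STUB + bytes([index, counter])).digest()
        counter += 1
    return out[:length]


def xor_crypt(data, index):
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(index, len(data))))


def execute(block, x):
    op, arg = block.decode().split()
    arg = int(arg)
    return {"add": x + arg, "mul": x * arg, "sub": x - arg}[op]


def run(x):
    memory = [xor_crypt(b, i) for i, b in enumerate(PLAINTEXT_BLOCKS)]
    memory[0] = xor_crypt(memory[0], 0)           # loader decrypts entry block
    pc = 0
    while True:
        x = execute(memory[pc], x)                # execute current block
        nxt = pc + 1
        if nxt == len(memory):                    # end of program
            memory[pc] = xor_crypt(memory[pc], pc)
            return x, memory
        memory[nxt] = xor_crypt(memory[nxt], nxt)     # decrypt next block
        prev, pc = pc, nxt                            # jump to it
        memory[prev] = xor_crypt(memory[prev], prev)  # re-encrypt previous
```

Because the key depends on the hash of other code, tampering with the stub changes the key and the next block no longer decrypts correctly.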
What is a non-obvious but annoying problem with self-modifying code?
Virus scanners will complain
Name 3 dynamic obfuscation techniques
replacing instructions, dynamic code merging, dynamic decryption and re-encryption
What are the 4 dimensions of Collberg’s obfuscation taxonomy?
Potency: comprehensibility of code by humans
Stealth: identifiability of obfuscated code
Resilience: resistance against automatic deobfuscation
Cost: performance and resource overhead of obfuscation
Describe methodological steps to characterize and predict the strength of obfuscation
- Model MATE (man-at-the-end) attacks as attack nets
- Model transitions as search problems
- Identify features to generate programs
- Obfuscate programs
- Attack obfuscated programs
- Feature extraction
- Predict average effort of attack