CH13 Security Testing Flashcards
Four Bounds of Security Testing
Dynamic Testing
Static Testing
Automatic Testing
Manual Testing
Manual Dynamic Approach
Penetration Testing
Automatic Dynamic Approaches
DAST, IAST, Vulnerability Scanner
Automatic Static Approach
SAST
Manual Static Approach
Manual Code Review
Security Testing light definition
a systematic process for revealing flaws in information systems
Dynamic Testing: The Core Idea
The running application is being tested
Unusual or malformed inputs are supplied
The outputs are returned and the program behaviour is observed and reported
Dynamic Testing operates similarly to black-box testing, in that you’re comparing outputs against inputs and not necessarily assessing the code itself.
- this means that hard-coded aspects of the program may easily be overlooked
Static Security Testing Core Idea
Parse all source code and config files
Analyze (an abstract representation of) the parsed files
Report any problems found
Static Testing is similar in execution to program compilation
Vulnerability existence against report spectrum
Weakness exists and reported = True Positive
Weakness exists and not reported = False Negative (DANGEROUS)
Weakness doesn’t exist and reported = False Positive (ANNOYING TIME WASTE)
Weakness doesn’t exist and not reported = True Negative
For a fully automated security testing tool, we must make compromises regarding False Negatives and False Positives:
In case of doubt, report a potential vulnerability:
- we might “annoy” users with many findings that are not real issues
- we risk the “boy who cried wolf” phenomenon
In case of doubt, we stay silent:
- we might miss severe issues
Reasons and Recommendations for False Negatives
Fundamental: under-approximation by the tool, e.g.,
- missing language features (may break data-flow analysis)
- missing support for the complete syntax (parsing errors)
Therefore: report to the tool vendor
Configuration: lacking knowledge of insecure frameworks, e.g.,
- insecure sinks (output) and sources (input)
Therefore: improve the configuration (see the sketch after this card)
Unknown security threats, e.g.,
- XML verb tampering
Therefore: develop a new analysis for the tool (may require support from the tool vendor)
Security expert: “I want a tool with 0 false negatives! False negatives increase the overall security risk”
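A minimal Java sketch of such a configuration-related false negative; the framework types (CustomRequest, LegacyDb) and the header name are invented for illustration. If the tool's configuration does not list the source and the sink, the flow below produces no finding.
Java
// Hypothetical framework types, invented for illustration only.
interface CustomRequest { String getHeader(String name); }
interface LegacyDb { void runQuery(String sql); }

public class FalseNegativeExample {
    // Without a configuration entry marking CustomRequest.getHeader as a taint
    // source and LegacyDb.runQuery as a SQL sink, this real SQL injection
    // yields no report: a false negative.
    public void handle(CustomRequest req, LegacyDb db) {
        String name = req.getHeader("X-User");                        // unknown source
        db.runQuery("SELECT * FROM users WHERE name='" + name + "'"); // unknown sink
    }
}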
Reasons and Recommendations for False Positives
Fundamental: over-approximation by the tool, e.g.,
- pointer analysis
- call stack
- control-flow analysis
Therefore: report to the tool vendor
Configuration: lacking knowledge of security framework, e.g.,
- sanitization functions
- secure APIs
Therefore: improve the configuration (see the sketch after this card)
Mitigated by attack surface: strictly speaking a true finding, e.g.,
- No external communication due to firewall
- SQL injections in a database admin tool
Therefore: should be fixed; in practice it is often mitigated during an audit or via local analysis configuration
Developer: “I want a tool with 0 false positives!” False positives create unnecessary effort
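Conversely, a minimal Java sketch of a configuration-related false positive; the Request/Response types and the project-specific escapeHtml sanitizer are invented for illustration. Unless the configuration declares escapeHtml as a sanitization function, the tool still reports an XSS flow here.
Java
// Hypothetical types; the sanitizer is project-specific.
interface Request  { String getParameter(String name); }
interface Response { void write(String html); }

public class FalsePositiveExample {
    // A tool that does not know this is a sanitizer assumes the data stays tainted.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public void render(Request req, Response resp) {
        String name = escapeHtml(req.getParameter("name"));
        resp.write("<p>Hello " + name + "</p>"); // reported as XSS -> false positive
    }
}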
Prioritization of Findings
A pragmatic solution for too many findings
Classification with clear instructions for each vulnerability has proven to be the easiest to understand.
Can clearly see:
- What needs to be audited
- What needs to be fixed
  - as a security issue
  - as a quality issue
- Different rules for
  - old code
  - new code
Two main patterns cause security vulnerabilities (see the sketch after this list)
Local issues
- insecure functions
- secrets stored in the source code
Data-flow related issues
- XSS
- Secrets stored in the source code
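A small Java sketch contrasting the two patterns (all names and the key value are made up): the local issue is visible on a single line, while the data-flow issue requires tracking a value from its source to its sink.
Java
// Illustrative types only.
interface HttpReq  { String getParameter(String name); }
interface HttpResp { void write(String html); }

public class TwoPatterns {
    // Local issue: flagged by looking at one line, no flow analysis needed.
    static final String API_KEY = "sk_live_1234567890"; // hard-coded secret (placeholder value)

    // Data-flow issue: the tool must track the parameter (source)
    // into the response (sink) to report reflected XSS.
    public void greet(HttpReq req, HttpResp resp) {
        String name = req.getParameter("name");
        resp.write("<h1>Hello " + name + "</h1>");
    }
}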
generic defects visible in the code
Static analysis sweet spot: built-in rules make it easy for tools to find these without programmer guidance
e.g. buffer overflows
generic defects visible only in the design
Most likely to be found through architectural analysis.
e.g. the program executes code downloaded as an email attachment
context specific defects visible in the code
Possible to find with static analysis, but customisation may be required
e.g. mishandling of credit card information
context specific defects visible only in design
Requires both understanding of general security principles along with domain-specific expertise.
e.g. cryptographic keys kept in use for an unsafe duration.
Static Application Security Testing (SAST): pragmatic static analysis is based on
successful developments from the research community
- type checking
- property checking (model checking, SMT solving, etc.)
- Abstract interpretation
- …
techniques from the software engineering community
- style checking
- program comprehension
- security reviews
- …
Type checkers are useful, but
may suffer from false positives/negatives
identifying which computations are harmful is undecidable
Why will the Java compiler flag this as an error?
short s = 0;
int i = s;
short r = i;
Java is a statically typed, compiled language: types are checked at compile time. The last line is rejected as a possible lossy conversion, because a short is being assigned a value of the incompatible (wider) type int, even though the value in i actually fits into a short. The type checker thus reports a false positive.
Why will the Java compiler NOT flag this as an error?
Object[] objs = new String[1];
objs[0] = new Object();
Java arrays are covariant, so this code type-checks at compile time; it only fails at run time with an ArrayStoreException. The defect slips past the type checker: a false negative.
Style Checkers
Enforce more superficial rules than type checkers
- like x == 0 vs 0 == x
Style checkers are often extensible
- PMD
- JSHint
Simple, but very successful in practice (see the example after this card)
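As an illustration (not tied to a specific rule set), the kind of superficial issue a style checker such as PMD typically reports: an empty catch block that silently swallows a failure.
Java
import java.io.FileInputStream;
import java.io.IOException;

public class StyleExample {
    public void load(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            in.read();
        } catch (IOException e) {
            // Empty catch block: a typical style-checker finding,
            // because the failure is silently ignored.
        }
    }
}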
Program Understanding
Tools can help with
- understanding large code bases
- reverse engineering abstractions
- finding declarations and uses
- Analysing dependencies
- …
Useful for manual code/architectural reviews
Fuzzing (DAST) core idea
Create large random strings
Pipe input into Unix utilities
Check if they crash
e.g.
Bash
$ fuzz 100000 -o outfile | deqn
Started in 1988 by Barton Miller at the University of Wisconsin (see the sketch after this card)
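A minimal Java sketch of the same idea, assuming a locally installed command-line target (/usr/bin/bc is used here as a stand-in for deqn): feed random bytes to the program's stdin and treat an abnormal exit as a potential crash.
Java
import java.io.IOException;
import java.util.Random;

public class RandomFuzz {
    public static void main(String[] args) throws Exception {
        Random rnd = new Random();
        for (int run = 0; run < 100; run++) {
            byte[] input = new byte[10_000];
            rnd.nextBytes(input);                         // large random input

            Process p = new ProcessBuilder("/usr/bin/bc") // assumed target path
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            try {
                p.getOutputStream().write(input);         // pipe the input into the utility
                p.getOutputStream().close();
            } catch (IOException ignored) {
                // Target closed its stdin early; still check how it exited.
            }
            int exit = p.waitFor();
            if (exit != 0) {                              // non-zero exit: worth a closer look
                System.out.println("run " + run + " exited with " + exit);
            }
        }
    }
}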
Industrial Case Study: Fuzzing Chrome
Chrome’s Fuzzing Infrastructure
Automatically grab the most current Chrome LKGR (Last Known Good Revision)
Fuzz and test Chrome, starting with multi-million test cases
Thousands of Chrome instances on hundreds of virtual machines
AddressSanitizer
Compiler instrumentation module
Run-time library that replaces malloc(), free(), etc
custom malloc() allocates more bytes than requested and “poisons” the redzones around the region returned to the caller
Heap buffer overrun/underrun (out-of-bounds access)
Use after free
Stack buffer overrun/underrun
AddressSanitizer: Result
10 months of testing the tool with Chromium (May 2011)
300 previously unknown bugs in the Chromium code and in third-party libraries
SyzyASan
AddressSanitizer works only on Linux and Mac
SyzyASan uses a different instrumentation (built on MS Visual Studio)
ThreadSanitizer
Runtime data race detector based on binary translation
Also supports compile-time instrumentation
- Greater speed and accuracy
Data races in C++ and Go code
Synchronization issues
libFuzzer
Engine for in-process, coverage-guided, whitebox fuzzing (typically used together with sanitizers such as AddressSanitizer)
Cluster Fuzzing: ClusterFuzz
A fuzzing tool making use of the following memory debugging tools with libFuzzer-based fuzzers:
- AddressSanitizer (ASan): 500 GCE VMs
- MemorySanitizer (MSan): 100 GCE VMs
- UndefinedBehaviorSanitizer (UBSan): 100 GCE VMs
Fuzzing Challenges
Detecting the input channel
Input generation
Deciding if the response is a bug or not
How to get a test setup that is safe to use?
When did we test enough? (coverage)
Random Fuzzing
All randomly generated input data
Very Simple
Inefficient
- random input is often rejected, as a specific format is required
- probability of causing a crash is very low
It is unlikely that randomly generated HTML documents will trigger edge cases
Mutation-based Fuzzing
mutate existing data samples to create new test data
- little or no knowledge of the structure of the inputs is assumed
- anomalies are added to existing valid inputs
- anomalies may be completely random or follow some heuristics
- requires little to no set up time
- dependent on the inputs being modified
- may fail for protocols with checksums, those which depend on challenge response, etc.
- examples include Taof, Peach Fuzzer, ProxyFuzz (see the sketch after this list)
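A minimal Java sketch of the mutation step (invoking the target is left out): start from a valid seed and flip a few random bytes to produce each new test case.
Java
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class MutationFuzz {
    private static final Random RND = new Random();

    // Flip a handful of random bytes in a copy of the seed.
    static byte[] mutate(byte[] seed, int flips) {
        byte[] data = seed.clone();
        for (int i = 0; i < flips; i++) {
            data[RND.nextInt(data.length)] = (byte) RND.nextInt(256);
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] seed = "<html><body><h1>hello</h1></body></html>"
                .getBytes(StandardCharsets.UTF_8);   // existing valid sample
        for (int i = 0; i < 5; i++) {
            byte[] testCase = mutate(seed, 3);       // anomalies added to valid input
            System.out.println(new String(testCase, StandardCharsets.UTF_8));
            // In a real setup, feed testCase to the target and watch for crashes.
        }
    }
}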
Mutation-Based Fuzzing pros and cons
+easy to set up and implement
+requires little to no knowledge of the input format/protocol
-effectiveness limited by selection of initial data set
-has problems with file formats/protocols that require valid checksums
Generation Based Fuzzing
define new tests based on models of the input format
Generate random inputs with the input specification in mind (RFC, documentation, etc.)
Add anomalies to each possible spot
Knowledge of the input format makes it possible to prune inputs that would be rejected by the application
Input can be specified by a grammar-based fuzzing tool
examples include Peach Fuzzer and SPIKE (see the sketch after this list)
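A small Java sketch of the generation-based idea, assuming a toy grammar for arithmetic expressions: inputs are produced from a model of the format rather than mutated from samples, so they pass the target's parser while still carrying deliberate anomalies.
Java
import java.util.Random;

public class GenerationFuzz {
    private static final Random RND = new Random();

    // Toy grammar: expr -> number | "(" expr op expr ")"
    static String expr(int depth) {
        if (depth == 0 || RND.nextInt(3) == 0) {
            // Occasionally emit an extreme value as a deliberate anomaly.
            return RND.nextInt(10) == 0 ? "999999999999999999"
                                        : Integer.toString(RND.nextInt(100));
        }
        String[] ops = {"+", "-", "*", "/"};
        return "(" + expr(depth - 1) + ops[RND.nextInt(ops.length)] + expr(depth - 1) + ")";
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            // Structurally valid input for any expression parser under test.
            System.out.println(expr(3));
        }
    }
}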
Generation-Based Fuzzing Pros and Cons
+ completeness (you can measure how much of the specification has been covered)
+ can handle complex inputs (e.g., that require matching checksums)
- building a generator can be a complex problem
- specification needs to be available
Greybox Fuzzing (Concolic Testing)
- uses symbolic execution to trigger unexplored paths
- invented by Microsoft and used for fuzzing file input routines
Autodafé
- fuzzing by weighting attacks with markers
- open source