L4 Flashcards
Korat
whitebox, systematic testing tool for linked data structures
Randoop
feedback-directed random testing for classes and libraries
creates new test guided by feedback from previous tests
problem with automated testing
- inf tests
- even a finite subset can be huge
- need to be concise (avoid illegal and redundant tests) and diverse (give good coverage)
small test case hypothesis
- if there is any test that cause the program to fail, there is small such test.
how to generate inputs
- use the types
- class declaration shows what values can fill each field
- enumerate all possible deviation deviations from a fix set of objects
Korat algorithm
- generates all possible input up to user-defined k
- discards inputs where pre-condition is false
- runs the program on remaining inputs
- check results using post-condition
The total number of candidate vectors/shapes for a at-most 3 node binary tree in Korat?
7 fields, 4 choices each -> 4^7, but we are only interested in 9 of them
The total number of candidate vectors/shapes for an at-most k node binary tree in Korat?
(k+1)^(2k+1)
steps Korat follows
- initiate candidate vectors (all null)
- expand the last field accessed by the pre-condition
- backtrack if all possibilities for a field are exhausted
Korat’s strength
- when we can enumerate all possibilities
- for linked data structures, small, easily specified procedures, unit testing
Korat’s weakness
- when enumeration is weak
- only as good as the pre and post conditions
Randoop’s recipe
randomly create new test guided by feedback from previously created tests
- build new sequences incrementally
- as soon as a sequence is created, execute it
- use results to guide test generation toward sequences that create new obj states
Randoop’s input
- classes under test
- time limit
- set of contract
condition for violating test cases to exist in the output of Randoop
- no contract should be violated up the the end of the sequence
- the final assertion should fail
Randoop’s algorithm
Create new sequence:
- randomly picks a method m returning T
- for each Ti, pick Si from components that construct object Vi
- assemble n randomly picked sequences to create Snew
Classify new sequence: discard/ output as test/ add to components
illegal sequence
- sequences that crash before the contract is checked
- violates some precondition during execution
redundant sequences
- randoop maintains a set of all objects created in execution of each sequence
- new seq is redundant if obj created in its execution belongs to the above set.
- could also use more sophisticated state equivalent methods
use type information to guide test generation is a characteristic of
both Korat and Randoop
each test is fully independent of the past tests is a characteristic of
neither
generate test deterministically is a characteristic of
Korat
suited to test method seq is a characteristic of
Randoop
Avoid generating redundant test is a characteristic of
both Korat and Randoop
if a list function works for lists of length 0 to 3, what conclusion can we draw?
it probably works for all lists
korat’s precondition technique
- add code to observe its actions
- record fields accessed by the pre-condition
- if the pre-condition doesn’t access a field, it doesn’t depend on the field.
korat’s enumerating tests
- shapes are enumerated by associated vectors
+ initial: all field null
+ next shape generated by expanding last field accessed in pre-condition & backtracking if all possibilities for a field are exhausted
key idea for enumerating tests
- never expand parts of input not examined by pre-condition
- cleverly checks for and discard shapes isomorphic to previously generated shapes
Randoop stands for
Random tester for object-oriented programs
problem with uniform random testing
creates too many illegal and redundant tests
Randoop’s output
contact-violating test cases
randoop redundant sequence technique’s drawbacks
- might miss bugs that can only be trigger by extending a sequence
- might also mistakenly redeem redundant sequences as not redundant
why didn’t automated test generation become popular earlier?
- earlier program languages has weak type systems while automated test generation relies heavily on type information.
- contemporary languages fit better for test generation