Constraint based analysis Flashcards
Benefits of constraint based analysis
Separates analysis specs from implementation. Hence analysis writer focuses on what rather than on how.
Another benefit of constraint-based analysis is that it yields natural program
specifications: just like types in a type system, constraints are usually defined locally,
and solving their conjunction captures global properties about the program.
Finally, the modularization of the program analysis task into a specification and an
implementation sub-problem allows the specification to be agnostic of the
implementation. In other words, we can leverage powerful, off-the-shelf constraint
solvers in a “plug-and-play” manner, giving us flexibility that would otherwise not be
available.
Datalog
a constraint language.
Datalog is a declarative logic programming language.
It is not a Turing-complete language: it can be viewed as a subset of Prolog, or as
SQL with recursion. Efficient algorithms exist to evaluate programs in these
languages, so there exist efficient algorithms to evaluate Datalog programs.
Datalog originated as a query language for deductive databases. It was later applied in
many other domains, including software analysis, data mining, networking, security,
knowledge representation, and cloud computing among others.
Two static analyses in Datalog
an intra-procedural analysis in Datalog, that is, an
analysis that is restricted to a single procedure. How to
specify computing reaching definitions.
an inter-procedural analysis in Datalog, that is, an
analysis of a program involving multiple procedures. How
to specify computing points-to information.
intra-procedural dataflow analysis
The specification of reaching definitions analysis is as follows:
OUT[n] = (IN[n] - KILL[n]) ∪ GEN[n]
IN[n] = U n’ ∈ predecessors[n] OUT[n’]
KILL[n]
set of definitions killed at program point n
GEN[n]
the set of
definitions generated at program point n
predecessors(n)
the set of program
points that immediately precede program point n in the input procedure’s control-flow
graph.
define the relation kill(n:N, d:D)
the definition d is in the KILL set
of program point n …
the relation gen(n:N, d:D) to mean that the definition d is in the GEN set of program
point n …
the relation gen(n:N, d:D)
the definition d is in the GEN set of program
point n …
next(n:N, m:N)
that program point m is an immediate
successor of program point n, or equivalently, that program point n is an immediate
predecessor of program point m.
in(n:P, d:D)
the relation that asserts that the
definition d is a member of the IN set of program point n—that is, definition d may
reach the program point just before n …
out(n:P, d:D)
definition d is a member of the OUT
set of program point n—that is, definition d may reach the program point just after n.
rules of inference to compute the IN and OUT sets.
out(n, d) :- gen(n, d).
out(n, d) :- in(n, d), !kill(n, d).
in(m, d) :- out(n, d), next(n, m).