Pointer Analysis Flashcards
New Assignment
v = new …
Object copy statement
v1=v2
Object copied from v2 to v1
Field reads
v1 = v2.f
dot operator to access fields from an object
Field writes
v1.f = v2
Read with pointers
v1 = v2[*]
Read writes with pointers
v1[*] = v2
May alias analysis
Pointer Analysis. An analysis that is dedicated to proving facts of this form (not true that x and z may alias)
Must alias
The same circle. X and Z are aliased.
Rules for object copy
First, let’s add alias relationships created through “object copy.”
For the statement v1 = v2, we create a variable node for v1 (if it doesn’t already
exist), and then add a blue arrow from the variable node for v1 to all nodes pointed to
by the variable node for v2.
Again note that we do not remove or replace any existing arrows from v1, such as this
one (gesture). v1 merely accumulates another arrow to B.
In words, v1 now additionally points to what v2 points to.
Rule fields for writes
Field-write
statements can be one of the following forms (gesture to the two forms).
Although the two statements shown here are syntactically different, they will have the
same affect on a points-to graph.
Prior to the field write statement, suppose v1 points to A and v2 points to B.
Additionally, A has a link between its allocation site and the allocation site for the
field f, or equivalently, the array member denoted by [*]. This represents the portion
of memory specifically reserved for the field or array member.
The field write statement modifies the points-to-graph by adding a field relationship,
denoted by a red arrow, from A to the value stored in v2. That is, we add a red arrow
from A to B. Again, we perform a “weak update” and do not remove the field
relationship between A and C.
Now, there are some edge cases we must consider. For instance, what should we do if
v1 and v2 point to the same node? This would be equivalent to the statement v1.f =
v1. As you might have guessed, we will add a red arrow from A to itself.
As another edge case, what should we do if there isn’t already a node for v1 or v2? In
this case, we should skip this statement temporarily and handle it in the next iteration.
Rule for field reads
Now, let’s see how field-read relationships update a points-to-graph. As in fieldwrites,
field-reads also have an equivalent rule update for reads from array members.
Suppose we have the setup shown here prior to the field-read statement. That is, v1
points to the allocation site A, V2 points to the allocation site B, and B has a link to C
denoting field f, or, equivalently, array access [].
Field-reads, in themselves, do not modify any points-to relationships. However, an
assignment to a field read, can indeed modify relationships and thus will have an
affect on our points-to-graph.
Following the shown field-read assignment, the variable v1 now points to the
allocation site for field f (or array member []) of the object pointed to by v2. So, we
add a arrow from v1 to C by following the arrows representing field relationships.
As in all update rules, we perform a “weak update” and keep the points to relationship
between v1 and A. Note that object B may in fact be pointing to many other objects
via arrows labeled by the field in question. In this case we would need to add an arrow
from v1 to each of these nodes in order to reflect the fact that the field f, and therefore
v1, may point to any one of these objects.
Lastly, we must consider the case in which the variable nodes for v1 or v2 do not
exist. As you might guess, we skip the statement temporarily and try to handle it in the next iteration. Likewise, if there is no field relationship from B via f or [*], we skip
the statement for this iteration.
Flow sensitive analysis
Flow-sensitivity concerns how a pointer analysis algorithm models control-flow
within a procedure or function. As we are only concerned with control flow within
each function, we call this intra-procedural control flow.
Flow-insensitive pointer analysis algorithms, like the one we just discussed, ignore
control-flow entirely, viewing the program as an unordered set of statements. A
hallmark of flow-insensitive analyses is that these analyses only generate new facts as
they progress; they never kill any previously generated facts. We observed this in the
case of the pointer analysis algorithm we just saw, wherein the points-to graph only
grew in size as each statement of the program was considered. We say that such
algorithms perform weak updates. Such algorithms typically suffice for may-alias
analysis, that is, it is practical for a may-alias analysis to have a low false positive rate
despite being flow-insensitive.
Flow-sensitive pointer analysis algorithms, on the other hand, are capable of killing
facts in addition to generating facts. We say that such algorithms perform strong
updates. This is possible as we have a distinct program order in which facts that come
later take precedence over facts generated earlier in the program order. Such
algorithms are typically required for must-alias analysis, that is, it is impractical for a
must-alias analysis to have a low false positive rate by being flow-insensitive.
Context
sensitivity
Another common dimension for classifying pointer analysis algorithms is context
sensitivity, which concerns how to handle control-flow across procedures. This
property is called inter-procedural control-flow. We can classify pointer analysis
algorithms as context-insensitive or context-sensitive, based on how they handle interprocedural
control-flow.
Context-insensitive pointer analysis algorithms analyze each procedure once,
regardless of how many different parts of the program call that procedure. These
algorithms are relatively imprecise, as they conflate together aliasing facts that arise
from different calling contexts. But they are very efficient, since they analyze each
procedure only once.
Context-sensitive pointer analysis algorithms, on the other hand, potentially analyze
each procedure multiple times, once per abstract calling context. These algorithms are
relatively precise but expensive. They differ primarily in the manner in which they
abstract the calling context. Like many choices in analysis, the choice of the scheme is
dictated by the desired tradeoff between precision and efficiency.
Heap abstraction
A heap abstraction concerns abstracting
program data, in particular, dynamically allocated objects on the heap.
Heap abstraction. Allocation-site based scheme
This scheme abstracts concrete
objects based on the site in the program where they are created, called the allocation
site. In other words, each abstract object under this scheme corresponds to a separate
allocation site.