Garbage Collection (L13) Flashcards
Which type of memory allocation must be explicitly handled by the programmer (or by GC)?
Dynamic.
What is a memory leak?
Failure to reclaim unused, previously dynamically
allocated memory.
What is the most fundamental, two phase garbage collection technique called?
Mark and sweep.
What are tracing collectors?
These are a class of garbage collectors that identify/reclaim unreachable objects by tracing the program’s execution.
What is the root set?
It is the initial set of objects that are directly accessible and serve as the starting point for marking all other reachable objects.
What types of variables make up the root set?
Any global variables, and variables/references on the current call stack.
Describe the mark phase of the mark and sweep algorithm.
Starting from the root set, all accessible objects in memory are looked at and marked as “in use” by setting a flag (if in use).
When are in use flags cleared?
At the start of a GC cycle (prior to mark phase).
Describe the sweep phase of the mark and sweep algorithm.
The GC cycles through all objects in the heap and, if their flags aren’t set, reclaims their memory.
Does GC run continuously?
No, since it steals processing cycles away from the application.
What is the most common strategy for choosing to invoke the garbage collector?
When memory is low.
Can the GC be invoked on a predefined schedule?
Yes.
In practice, when do (naive) mark and sweep algorithms begin to fail?
In large applications.
What is the tri colour approach?
GC in which candidates for reclamation are iteratively moved between three sets until it is certain that they are no longer required.
What is the generational technique?
GC in which the age of objects impacts the frequency of testing - older objects are likely long term and thus considered less.
Is it possible to provide GC for languages like C?
Yes, using an external collector that the compiler does not need to directly support.
Name an external GC that can be used by C?
Boehm GC.
What algorithm does the Boehm GC use?
Mark and sweep.
The Boehm GC is considered a black box, meaning what?
It is not integrated into the C compiler.
The Boehm GC is conservative, meaning what?
It may have false negatives (items that could be reclaimed, but aren’t), but never has false positives (never reclaims items that shouldn’t be).
Which function call does the BGC replace with its own version?
Stadard malloc calls - functionality remains the same, but BGC maintains additional information about all dynamically allocated memory.
What memory regions are monitored by the BGC?
The stack frame and static region.
BGC knows a word sized value on the stack isn’t a pointer value if it cannot be interpreted as what kind of address?
As a valid heap address.
How do transient memory leaks occur?
Transient memory leaks occur when BGC interprets non-address values on the current stack frame as valid pointers.
Why are transient memory leaks considered transient?
Transient as in temporary, and temporary because the memory leak is rectified once the stack frame ends.
Why are transient memory leaks not common on modern 64 bit systems?
Because the heap represents a tiny portion of the total process space - it is thus unlikely for random sequences of bits to present a valid pointer.
Why do performance sensitive systems use manual collection over GC?
Manual collection is faster and more predictable.
What GC method does Python implement?
Reference counting.
Describe reference counting.
References/assignments are kept track of, the memory they point to is reclaimed when none are left.
Why are the issues with reference counting?
Cyclic references are difficult to detect, and lots of overhead due to constant count updates.
Does reference counting fall under the umbrella of tracing collectors?
No.