JVM Memory Management and Garbage Collection Flashcards
What is meant by generational collectors?
Most objects in an application are short-lived. To avoid scanning the whole heap on every collection, the heap is divided into sections by object age (generations): Young and Old.
Young Gen sections?
Eden space, where objects are first allocated. Survivor spaces, S0 and S1.
What is old gen also called?
Tenured area. Objects that have survived some number of GC cycles are likely to be long-lived, so they are tenured (promoted) into the old gen area.
What is GC that occurs in Young gen called?
Minor GC
What is GC that occurs in Old gen called?
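Major GC. When the whole heap is collected it is usually called a Full GC. It is typically a long stop-the-world pause, although minor GCs also stop the world.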
Heap structure
Young Gen (Eden + Survivor spaces), Old Gen, Perm Gen (Method area)
When does Minor GC occur?
When Eden space gets full.
What does Minor GC do?
Collects garbage from Young gen. All objects that are referenced are moved to one of the survivor spaces. Minor GC also checks for survivor objects in survivor spaces and moves them to other survivor space. So at any given point in time, one survivor space is empty.
When are objects moved to Old gen?
When objects have survived many minor GC cycles, they are moved to old gen space. This is triggered either by threshold or age of objects.
When does Major GC occur?
When old gen is getting full. It is triggered using some ratio of old gen. Say trigger major GC if 60% gets full.
Which is faster Minor GC or Major GC?
Minor is faster.
What is keep area?
It contains most recently allocated objects and is not garbage collected until the next GC cycle. This is done to prevent objects from being promoted just because they were allocated just before young collection started.
What is metaspace?
In Java 8, there is no perm gen. Unlike Perm gen which is part of heap, Metaspace is not part of heap. Allocation for class metadata is done out of native memory. Metaspace grows in size while PermGen was fixed in size.
PermGen vs Metaspace?
- PermGen till Java 7 and Metaspace from Java 8.
- PermGen is fixed in size while Metaspace can grow
- PermGen is part of heap while Metaspace is not part of heap, it is allocated from native memory.
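The heap versus non-heap split can be inspected from inside a running JVM with the standard java.lang.management API. The following is a minimal sketch; the class name MemoryPools is made up for illustration, and the pool names printed depend on the JVM version and the selected collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {
    /** Returns the names of all memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Pool names are JVM specific; class metadata pools are reported as NON_HEAP.
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```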
What is code cache?
Code cache is the area where JIT compiled native code is cached. This area is flushed if its size exceeds given threshold.
-Xms flag
Set the initial size of heap when JVM starts
-Xmx flag
Set the maximum size of heap
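A rough way to see the effect of -Xms and -Xmx from inside a program is the standard Runtime API. This is only a sketch; the class name HeapLimits is hypothetical, and the reported maximum can differ slightly from the -Xmx value depending on the JVM and collector.

```java
final class HeapLimits {
    /** Upper bound the heap may grow to (roughly the -Xmx value). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM (starts near the -Xms value). */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the committed heap that is currently free. */
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max       = " + maxHeapBytes() / mb + " MB");
        System.out.println("committed = " + committedHeapBytes() / mb + " MB");
        System.out.println("free      = " + freeHeapBytes() / mb + " MB");
    }
}
```

Running it with, for example, `java -Xms64m -Xmx256m HeapLimits` should show the committed size starting near the initial value and the maximum near the configured limit.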
-Xmn flag
For setting size of Young gen, rest of the space goes to old gen
-XX:PermSize flag
Initial size of Perm Gen
-XX:MaxPermSize flag
Maximum size of Perm Gen
-XX:SurvivorRatio flag
Ratio of Eden space to one survivor space. E.g. if the young gen is 10m and the survivor ratio is 2, then 5m is allocated to Eden and 2.5m to each survivor space. Default value is 8.
-XX:NewRatio
Ratio of old/new gen size. Default value is 2
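The arithmetic behind -XX:NewRatio and -XX:SurvivorRatio can be sketched as follows. This is an illustration of the ratios only; the class and method names are made up, and a real JVM also aligns and may adaptively resize these areas.

```java
final class GenerationSizing {
    /** Young gen size when old/young == newRatio, so heap = young * (newRatio + 1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** One survivor space when eden/survivor == survivorRatio; young = eden + 2 survivors. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is whatever remains of the young gen after the two survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        long young = youngBytes(30 * mb, 2);               // NewRatio=2 on a 30m heap
        System.out.println("young    = " + young / mb + "m");
        System.out.println("eden     = " + edenBytes(young, 2) / mb + "m");
        System.out.println("survivor = " + survivorBytes(young, 2) / (double) mb + "m each");
    }
}
```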
Which are the stages of garbage collection?
- Marking phase
- Sweeping phase
- Compaction phase (optional)
What is mark phase?
Mark phase is the process of finding which objects are reachable and marking them as live. Marking starts from GC roots such as Java thread stacks, native (JNI) handles, etc.
What is sweep phase?
In sweep phase, the heap is traversed to find the gaps between live objects, and the gaps are recorded in a free list. That space is then available for new object allocation.
What is heap compaction?
It is similar to disk defragmentation. When there are small holes between objects, the performance of object allocation suffers. So during compaction all objects are moved together to create contiguous section of used and free memory.
What is Serial GC?
Uses simple mark-sweep-compact approach for young and old generation. Uses only single thread.
What is Parallel GC?
Uses N parallel threads to do Young gen collection and serially in Old generation. N is equal to number of CPU cores in system.
What is Parallel Old GC?
Same as Parallel new GC but it uses multiple threads for both young and old collection.
What is Concurrent Mark and Sweep collector?
It is referred to as concurrent low pause collector. CMS works on Old generation and algorithm for new generation is same as parallel collector. Suitable for responsive applications which cannot afford high pause times.
G1 garbage collector or “Garbage first” collector?
Available from Java 7 and intended to replace CMS in the long run. It is a parallel, concurrent and incremental low-pause collector. The heap is not split into contiguous young and old areas; it is divided into many equal-sized regions, and each region acts as eden, survivor or old space as needed. It first collects the regions with the least live data, hence “Garbage first”.
- Meant for multiprocessor machines with large memories
Why is G1 called Garbage first?
Because it divides heap into equal sized regions. When GC is invoked it collects region with lesser live objects, so it is called garbage first.
-XX:+UseSerialGC
Activate serial GC
-XX:+UseParallelGC
Activate parallel new GC
-XX:+UseParallelOldGC
Activate old parallel GC which uses parallel threads for both old and new collection.
-XX:+UseConcMarkSweepGC
Activate CMS collector
-XX:ParallelGCThreads=N
Number of threads used by Parallel GC
-XX:+UseG1GC
Activate G1 GC
-XX:ParallelCMSThreads=n
Number of CMS threads
How to detect memory leak?
Overall memory utilization keeps increasing and does not return to its baseline level even after garbage collection.
What is churn rate?
Rate at which the application is allocating new objects. Number of young generation collections provides information on it. The higher the number, the more the churn rate.
What issue arises with high churn rate?
It negatively impacts response time because minor GC is triggered frequently. Also the old gen fills quickly because young generation cannot cope with quantity of objects.
What is GC pressure?
GC pressure occurs when the churn rate is high and objects are prematurely tenured. This indicates a heap sizing issue or a churn rate that is too high.
What indicates there is GC pressure or Heap sizing issue?
If old generation usage fluctuates greatly, objects are being copied unnecessarily from the young generation. Either the young generation is too small or the churn rate is too high.
jstat utility
For diagnosing performance issues with GC, heap sizing. Does not need any flags to be enabled. Included in JDK by default.
jstat -gc
jmap -heap
Provides heap information
- Information specific to GC algorithm, threads
- Heap configuration provided at command line
- Heap usage, capacity, free. Region wise details are provided.
jmap -histo
TBA
jcmd
Introduced in JDK 7 and should be preferred over jmap. Used to send diagnostic command requests to the JVM, which are useful for diagnosing an application.
jcmd <pid> GC.heap_dump filename=<file>
Takes a heap dump (hprof)
jmap -dump:format=b,file=<file> <pid>
Takes a heap dump in hprof format.
jhat
Heap analysis tool that provides a convenient way to browse the object topology in a heap snapshot. It parses a binary heap dump created using jcmd. Useful for finding memory leaks.
Common cases of memory leaks
- A global map holds a reference to an object, and the object is not removed from the map when it is no longer used
- An object has registered an anonymously created listener and did not unregister that listener. Even though the main object is no longer referenced elsewhere, the listener still holds a reference to it.
- Classloader leak
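The listener case above can be sketched in a few lines. EventBus and its methods are hypothetical names used only for illustration; the point is that a long-lived collection keeps every registered listener strongly reachable until it is explicitly removed.

```java
import java.util.ArrayList;
import java.util.List;

final class EventBus {
    interface Listener {
        void onEvent(String event);
    }

    // Long-lived list: everything in here is reachable for as long as the bus lives.
    private final List<Listener> listeners = new ArrayList<>();

    void register(Listener listener) {
        listeners.add(listener);
    }

    /** Forgetting to call this is the leak: the listener can never be collected. */
    boolean unregister(Listener listener) {
        return listeners.remove(listener);
    }

    int listenerCount() {
        return listeners.size();
    }
}

final class ListenerLeakDemo {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        EventBus.Listener listener = event -> System.out.println("got " + event);
        bus.register(listener);
        System.out.println("registered:   " + bus.listenerCount());
        bus.unregister(listener);   // without this line the bus keeps the listener alive
        System.out.println("after remove: " + bus.listenerCount());
    }
}
```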
jhat
Analyses the heap profile and starts a web server on which various queries can be performed using OQL (Object query language)
HPROF
Is a tool for heap and CPU profiling shipped with JDK. It is useful for analysing performance, lock contention, memory leaks and other issues.
VisualVM
GUI tool that can do CPU sampling, Memory sampling, run garbage collections, analyze heap errors, take snapshots and more.
JMX remote ports
- -Dcom.sun.management.jmxremote
- -Dcom.sun.management.jmxremote.port=
We need to enable JMX remote ports to connect to remote machine and view CPU utilization. It is also possible to generate thread dump and heap dump on remote machine.
-XX:+HeapDumpOnOutOfMemoryError
Takes heap dump if out of memory error occurs while running application. This will create .hprof file
How will you find method that is taking too much CPU?
Using CPU sampling of VisualVM. It will show which methods are taking most CPU.
Explain mark and sweep garbage collectors
Mark phase marks the objects that are still alive. Sweep phase adds the space of all dead objects to the free list. An optional compact phase then compacts memory after the unused objects have been removed.
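A toy model of the mark and sweep phases, assuming a heap that is just a list of objects with outgoing references. This is a sketch for intuition, not how a JVM implements it; the class ToyMarkSweep and its members are invented for the example, and compaction is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyMarkSweep {
    /** A heap object that only knows which other objects it references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    /** Mark phase: everything reachable from the roots is live. */
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> live = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (live.add(current)) {          // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return live;
    }

    /** Sweep phase: remove every unmarked object from the heap and return what was freed. */
    static List<Obj> sweep(List<Obj> heap, Set<Obj> live) {
        List<Obj> freed = new ArrayList<>();
        for (Obj candidate : heap) {
            if (!live.contains(candidate)) {
                freed.add(candidate);
            }
        }
        heap.removeAll(freed);
        return freed;
    }
}
```

Note that an unreachable cycle (two dead objects pointing at each other) is still swept, because marking only follows references that start at a root.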
Explain copying garbage collector?
There are two spaces, “from” and “to”. Marking occurs in the from space. All live objects are then copied to the to space, packed next to each other, so no separate compaction phase is needed.
When are objects moved to old gen?
- Object survives certain number of garbage collections
- Survivor space gets full
-XX:+AlwaysTenure flag
TBA
-XX:PretenureSizeThreshold=n
Objects larger than n bytes are allocated directly in Old gen
TLABs (Thread Local Allocation Buffers)
In a multi-threaded environment many threads allocate memory, and with a single shared allocation pointer every allocation would need synchronization, which is slow. To avoid this, each thread gets its own buffer in Eden space in which it can allocate. No locking is required, so allocation is very fast.
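The idea can be sketched with a toy bump-pointer allocator. Everything here (ToyTlab, the 1024-byte buffer size, addresses as plain longs) is an assumption made for illustration and not the JVM's actual implementation: the shared pointer is only touched when a buffer is handed out, and allocation inside the buffer is a plain addition.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ToyTlab {
    static final long TLAB_SIZE = 1024;   // hypothetical buffer size in bytes

    // Shared "Eden" top pointer: only touched when a thread needs a fresh buffer.
    private static final AtomicLong EDEN_TOP = new AtomicLong(0);

    private long top;   // next free address inside this buffer
    private long end;   // first address past this buffer

    ToyTlab() {
        refill();
    }

    /** Grabs a new buffer from the shared space; the only synchronized step. */
    private void refill() {
        top = EDEN_TOP.getAndAdd(TLAB_SIZE);
        end = top + TLAB_SIZE;
    }

    /** Bump-pointer allocation: no locking on the fast path. */
    long allocate(long size) {
        if (size <= 0 || size > TLAB_SIZE) {
            throw new IllegalArgumentException("size must be in 1.." + TLAB_SIZE);
        }
        if (top + size > end) {
            refill();   // buffer exhausted; the unused tail is simply abandoned
        }
        long address = top;
        top += size;
        return address;
    }
}
```

In a threaded program one instance per thread could be held in a `ThreadLocal.withInitial(ToyTlab::new)`, so that two threads never share a buffer.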
What are live roots?
A reference to any object from
- stack frame (represents running functions. So any objects being referenced from stack frame must be live references)
- static variables (statics are global and so any objects that are referenced by statics will also be global and so must be kept live)
- JNI, synchronization
Garbage collection uses these live roots to mark other live objects that are being referenced by these roots.
Why do we need a card table?
When an object in old generation is referencing an object in new generation, the minor garbage collection will not know that object in young is being referred to from old gen. To do that GC would have to scan the old gen. And that defeats the purpose of generational collector. So to overcome that limitation we need card table.
What is CardTable?
- When a reference is written into an object field, the store goes through a write barrier
- The write barrier is a small piece of code executed by the JVM around the store
- That code marks an entry as dirty in a table called the card table
- One entry (card) per 512 bytes of memory
- Minor GC scans the table looking for dirty entries
- It then loads the corresponding old gen memory, follows the references found there, and marks the referenced young gen objects as in use
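The bookkeeping described above can be modelled with a byte array. This is a toy sketch under stated assumptions: ToyCardTable is an invented class, addresses are plain offsets into the old generation, and the dirty value 1 is arbitrary (a real JVM uses its own encoding).

```java
import java.util.Arrays;

final class ToyCardTable {
    static final int CARD_SHIFT = 9;   // 2^9 = 512 bytes of old gen per card

    private final byte[] cards;

    ToyCardTable(long oldGenBytes) {
        // Round up so a partially filled last card still gets an entry.
        cards = new byte[(int) ((oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT)];
    }

    /** Maps an old gen offset to the index of the card that covers it. */
    static int cardIndex(long oldGenOffset) {
        return (int) (oldGenOffset >>> CARD_SHIFT);
    }

    /** Write barrier: called when a reference field at this old gen offset is written. */
    void onReferenceWrite(long oldGenOffset) {
        cards[cardIndex(oldGenOffset)] = 1;
    }

    boolean isDirty(int card) {
        return cards[card] != 0;
    }

    int cardCount() {
        return cards.length;
    }

    /** After a minor GC has scanned the dirty cards, they are marked clean again. */
    void clear() {
        Arrays.fill(cards, (byte) 0);
    }
}
```

A minor GC then only needs to scan the 512-byte areas whose cards are dirty instead of the whole old generation.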
Which GC cycle uses CardTable?
Minor GC cycle uses CardTable to take care of objects which are referred to from objects in old gen.
Is Serial collector stop the world?
Yes, it is stop the world and a mark and sweep collector.
CMS collector runs on which region?
Old gen
Does CMS cause heap fragmentation?
Yes
CMS stages
- Initial Mark (stop the world) follow root references
- Concurrent Mark (concurrent) traverse graph looking for live objects. Any new allocations made are considered alive
- Remark (stop the world) Find objects that were created or whose references changed while the concurrent mark was running
- Concurrent Sweep (concurrent) collect objects
- Resetting (concurrent) reset for next run
G1 Garbage collector is which type of garbage collector?
Compacting collector
Which utility can be used to keep track of the number of young and old garbage collections?
jstat -gc prints YGC and FGC which is young garbage collection count and full garbage collection count.
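Similar counters are also available from inside the application through the standard GarbageCollectorMXBean API. The sketch below is one possible way to read them; the class name GcCounts is made up, and the collector names printed depend on the JVM and the selected GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCounts {
    /** Sum of collection counts over all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // One bean per collector, typically one for young and one for old collections.
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections so far: " + totalCollections());
    }
}
```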
Which plugin is useful for GC information in VisualVM?
VisualGC is a plugin for VisualVM which provides in-depth information about garbage collection.
What type of references are in Java by default?
Java references are by default strong references
What are the other type of references available in Java?
SoftReference,
WeakReference,
PhantomReference
When is soft reference collected?
A softly referenced object is not garbage collected under normal conditions, but will be collected if there is memory pressure.
When is weak reference collected?
A weak reference never keeps an object alive. When GC runs and there is no strong or soft reference pointing to the object, only a weak reference, the object is garbage collected.
References in terms of strength
Strong > Soft Reference > Weak Reference > Phantom Reference
Which type of reference can be used for caching?
Soft Reference, because they are only collected when there is GC pressure.
Which type of reference can be used to associate meta data with other type?
Weak reference. WeakHashMap, whose keys are held through weak references, can also be used for this.
What is typical use of Phantom reference?
To interact with Garbage collector.
Is it recommended to use Soft reference for caching?
No. It is entirely up to the garbage collector which objects it collects, so caching strategies such as LRU or LFU cannot be enforced.
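When an explicit eviction policy is needed, one common alternative is a bounded map with strong references. The following minimal LRU sketch builds on the standard LinkedHashMap access-order mode; the class name LruCache and the capacity handling are illustrative choices, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the least recently used entry at that point.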
What is ReferenceQueue in java?
The Soft, Weak and Phantom reference types accept a ReferenceQueue as a constructor argument. When the referent is no longer strongly reachable and the garbage collector clears the reference, the reference object is added to the queue. This is useful for triggering cleanup work.
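A small sketch of a WeakReference registered with a ReferenceQueue. WeakRefDemo and isAlive are names invented for the example. System.gc() is only a hint, so whether and when the reference is enqueued is not guaranteed, which is why the program waits with a timeout and prints what it observes rather than assuming a result.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class WeakRefDemo {
    /** True while the referent has not been cleared by the collector. */
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload, queue);

        // The strong reference 'payload' still keeps the object reachable here.
        System.out.println("alive while strongly held: " + isAlive(ref));

        payload = null;   // drop the only strong reference
        System.gc();      // a request, not a guarantee

        // Wait up to one second for the collector to enqueue the cleared reference.
        Reference<?> cleared = queue.remove(1000);
        System.out.println("enqueued: " + (cleared == ref));
        System.out.println("alive after GC request: " + isAlive(ref));
    }
}
```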
What is relation between live objects and GC performance?
The more live objects there are, the slower GC cycle will be. The more objects die, the faster garbage collection is.
Which memory problem occurs when some objects are freed and some remain alive?
Memory fragmentation issue. If there are randomly sized holes in memory, allocation of new objects takes more time because the allocator needs to find a suitably sized hole for each object.
What is the solution for memory fragmentation?
Compaction. Most GC algorithms do compaction. This eradicates the need for free lists. But downside is that compaction is not concurrent in most GCs and so application is suspended during that time, causing throughput to go down.
Parallel collector is useful for which type of applications?
Throughput applications. Like batch processing applications where response time is not as important.
Concurrent collector is useful for which types of applications?
Response time sensitive applications. Which need as low pause times as possible.
Disadvantages of concurrent collector?
- Memory fragmentation
- More complicated algorithm and more cpu cycles wasted
- Much fine tuning is required and more flags to work with.
How to reduce impact of compaction?
- Don’t run compaction on each GC cycle
- Set a threshold for compaction, e.g. compact only when 50% of memory is fragmented
- Also don’t compact the whole memory; compact only until the desired threshold (e.g. 50%) is reached
Which GC strategy is used in young generation?
Copying strategy is used. Copying alive objects and moving them to other area and declaring old area as empty. This is much faster and simpler than sweeping and compaction. But it counts on most of the objects dying in young. The advantage is no fragmentation.
When does Young generation copying strategy cause problem?
When the application is executing a high number of transactions. If the young generation is too small, objects are tenured prematurely to old. If the young generation is too large, many objects stay alive and the GC cycle takes too long.
Are minor GCs stop the world events?
Yes. Minor GC is a full stop-the-world event. It can be done in parallel using the parallel collector. Too frequent minor GCs can therefore also bring the application to its knees.
Which is most important JVM flag for garbage collection logging?
-verbose:gc
Reasons for old generation to fluctuate greatly without rising after GC?
- The young gen is too small
- there’s high churn rate
- too much transactional memory usage
While young generation tuning what can be done to deal with middle aged objects that are tenured in old gen?
Use a concurrent GC, and let it handle spillover. But spill over should not be too great. In that case we need to reduce/optimize transactional memory usage of application.
Tuning old gen for response time sensitive applications
Need to tune both concurrent GC thresholds and the size of old generation so that average fill rate is not more than 75%. 25% is needed by concurrent GC. If old is too full, concurrent GC will not be able to free enough memory and lead to real Full GC. Stop the world!!
Source of Memory leaks?
- Mutable static fields and collections
- Thread local variables
- Forgetting to unregister a listener or to unsubscribe
- Bi directional references (Like Node in an XML document internally holds reference to container document object)
What is transactional memory usage?
Transactional memory usage describes how much memory a transaction keeps alive. Too many transactions and temporary objects will be tenured into old gen.
Transactional memory and throughput tradeoff
The more concurrency you expect in production, the lower transactional memory application should use.
Which are the measures of performance?
- Response time
- Throughput
- Availability
Is average a good way to calculate performance?
No. Averaging over a long period hides fluctuations and peak values, while averages taken over a small number of measurements are imprecise.
With average response time what should also be maintained?
Along with the average, the peak response time should also be maintained, e.g. the peak over 1 minute, 10 minutes and 1 hour.
Is Median better than Mean?
Yes. Median does not fabricate the value artificially. It is the middle value. So is close to actual reality.
Average vs Median vs Percentile which is best way to measure performance?
Percentile is most precise. If 95th percentile of application response time is 2 ms, then 95% of requests are served in 2ms or less. But it is difficult to calculate. More data is required as compared to average.
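The difference between mean and percentiles is easy to see on a small sample with one outlier. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions; the class name LatencyStats and the sample values are made up for illustration.

```java
import java.util.Arrays;

final class LatencyStats {
    /** Arithmetic mean; NaN for an empty sample. */
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile: smallest sample with at least p percent of samples <= it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Response times in ms: nine fast requests and one very slow one.
        double[] ms = {1, 2, 2, 2, 3, 2, 2, 2, 2, 100};
        System.out.println("mean = " + mean(ms));            // pulled up by the outlier
        System.out.println("p50  = " + percentile(ms, 50));
        System.out.println("p90  = " + percentile(ms, 90));
        System.out.println("p100 = " + percentile(ms, 100)); // the peak
    }
}
```

For this sample the mean is 11.8 ms although nine out of ten requests took 3 ms or less, which is what the median and the 90th percentile show.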
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
JVM starts logging GC times and details in the following format
0.291: [GC (Allocation Failure) [PSYoungGen: 33280K->5088K(38400K)] 33280K->24360K(125952K), 0.0365286 secs] [Times: user=0.11 sys=0.02, real=0.04 secs]
Explain this output of GC log for Par New collection
[GC[ParNew: 6528K->702K(6528K), 0.0130227 secs] 469764K->465500K(522488K), 0.0130578 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
6528K is the space in the young generation occupied by objects at the start of the ParNew collection. Not all those objects are necessarily alive.
702K is the space occupied by live objects at the end of the ParNew collection.
6528K is the total space in the young generation.
0.0130227 is the pause duration for the ParNew collection.
469764K is the space occupied by objects in the young generation and the old (CMS) generation before the collection starts.
465500K is the space occupied by live objects in the young generation and all objects in the old (CMS) generation. For a ParNew collection, only the liveness of the objects in the young generation is known so the objects in the old (CMS) generation may be live or dead.
522488K is the total space in the heap.
[Times: user=0.05 sys=0.00, real=0.01 secs] is like the output of the time(1) command. The ratio user / real gives you an approximation of the speed-up you're getting from the parallel execution of the ParNew collection. The sys time can be an indicator of system activity that is slowing down the collection. For example if paging is occurring, sys will be high.
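The fields explained above can be pulled out of such a line with two regular expressions. This is a sketch that assumes exactly the ParNew line format shown in this card; the class name ParNewLogLine and its methods are invented for the example, and other collectors or JVM versions log in different formats.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParNewLogLine {
    // "[ParNew: 6528K->702K(6528K)": young gen before, after, and capacity.
    private static final Pattern YOUNG =
            Pattern.compile("\\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\)");
    // "user=0.05 sys=0.00, real=0.01 secs": CPU and wall clock times.
    private static final Pattern TIMES =
            Pattern.compile("user=([\\d.]+) sys=([\\d.]+), real=([\\d.]+) secs");

    /** Kilobytes by which young gen occupancy dropped in this collection. */
    static long youngReclaimedKb(String line) {
        Matcher m = YOUNG.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a ParNew line");
        }
        return Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
    }

    /** user / real: rough parallel speed-up (infinite if real is logged as 0.00). */
    static double parallelSpeedup(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no Times section");
        }
        return Double.parseDouble(m.group(1)) / Double.parseDouble(m.group(3));
    }
}
```

For the line in this card, the young generation drops by 6528K - 702K = 5826K and user / real is about 5.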
Explain GC log for CMS initial mark phase
2015-05-26T16:23:07.321-0200: 64.42: [GC (CMS Initial Mark) [CMS-initial-mark: 10812086K(11901376K)] 10887844K(12514816K), 0.0001997 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
- 2015-05-26T16:23:07.321-0200: 64.42 – Time the GC event started, both clock time and relative to the time from the JVM start. For the following phases the same notation is used throughout the event and is thus skipped for brevity.
- CMS Initial Mark – Phase of the collection – “Initial Mark” in this occasion – that is collecting all GC Roots.
- 10812086K – Currently used Old Generation.
- (11901376K) – Total available memory in the Old Generation.
- 10887844K – Currently used heap
- (12514816K) – Total available heap
- 0.0001997 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] – Duration of the phase, measured also in user, system and real time.
Explain GC logs of CMS Concurrent Mark Phase
2015-05-26T16:23:07.321-0200: 64.425: [CMS-concurrent-mark-start]
2015-05-26T16:23:07.357-0200: 64.460: [CMS-concurrent-mark: 0.035/0.035 secs] [Times: user=0.07 sys=0.00, real=0.03 secs]
- CMS-concurrent-mark – Phase of the collection – “Concurrent Mark” in this occasion – that is traversing the Old Generation and marking all live objects.
- 0.035/0.035 secs – Duration of the phase, showing elapsed time and wall clock time correspondingly.
- [Times: user=0.07 sys=0.00, real=0.03 secs] – “Times” section is less meaningful for concurrent phases as it is measured from the start of the concurrent marking and includes more than just the work done for the concurrent marking.
Explain GC logs of CMS Concurrent Pre clean Phase
2015-05-26T16:23:07.357-0200: 64.460: [CMS-concurrent-preclean-start]
2015-05-26T16:23:07.373-0200: 64.476: [CMS-concurrent-preclean: 0.016/0.016 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
- CMS-concurrent-preclean – Phase of the collection – “Concurrent Preclean” in this occasion – accounting for references being changed during previous marking phase.
- 0.016/0.016 secs – Duration of the phase, showing elapsed time and wall clock time correspondingly.
- [Times: user=0.02 sys=0.00, real=0.02 secs] – The “Times” section is less meaningful for concurrent phases as it is measured from the start of the concurrent marking and includes more than just the work done for the concurrent marking.
Explain GC logs of CMS Concurrent Abortable Pre Clean Phase
2015-05-26T16:23:07.373-0200: 64.476: [CMS-concurrent-abortable-preclean-start]
2015-05-26T16:23:08.446-0200: 65.550: [CMS-concurrent-abortable-preclean: 0.167/1.074 secs] [Times: user=0.20 sys=0.00, real=1.07 secs]
- CMS-concurrent-abortable-preclean – Phase of the collection, “Concurrent Abortable Preclean” in this occasion.
- 0.167/1.074 secs – Duration of the phase, showing elapsed and wall clock time respectively. It is interesting to note that the user time reported is a lot smaller than clock time. Usually we have seen that real time is less than user time, meaning that some work was done in parallel and so elapsed clock time is less than used CPU time. Here only a small amount of work was done, 0.167 seconds of CPU time, and the garbage collector threads just waited for something for almost a second, not doing any work.
- [Times: user=0.20 sys=0.00, real=1.07 secs] – The “Times” section is less meaningful for concurrent phases, as it is measured from the start of the concurrent marking and includes more than just the work done for the concurrent marking.
Explain CMS Concurrent Abortable Pre Clean Phase
A concurrent phase that is not stopping the application’s threads. This one attempts to take as much work off the shoulders of the stop-the-world Final Remark as possible. The exact duration of this phase depends on a number of factors, since it iterates doing the same thing until one of the abortion conditions (such as the number of iterations, amount of useful work done, elapsed wall clock time, etc) is met.
Explain GC Logs of Final Remark Phase of CMS
2015-05-26T16:23:08.447-0200: 65.550: [GC (CMS Final Remark) [YG occupancy: 387920 K (613440 K)]65.550: [Rescan (parallel) , 0.0085125 secs] 65.559: [weak refs processing, 0.0000243 secs]65.559: [class unloading, 0.0013120 secs]65.560: [scrub string table, 0.0001759 secs][CMS-remark: 10812086K(11901376K)] 11200006K(12514816K) , 0.0110730 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
- 2015-05-26T16:23:08.447-0200: 65.550 – Time the GC event started, both clock time and relative to the time from the JVM start.
- CMS Final Remark – Phase of the collection – “Final Remark” in this occasion – that is marking all live objects in the Old Generation, including the references that were created/modified during previous concurrent marking phases.
- YG occupancy: 387920 K (613440 K) – Current occupancy and capacity of the Young Generation.
- [Rescan (parallel) , 0.0085125 secs] – The “Rescan” completes the marking of live objects while the application is stopped. In this case the rescan was done in parallel and took 0.0085125 seconds.
- weak refs processing, 0.0000243 secs]65.559 – First of the sub-phases that is processing weak references along with the duration and timestamp of the phase.
- class unloading, 0.0013120 secs]65.560 – Next sub-phase that is unloading the unused classes, with the duration and timestamp of the phase.
- scrub string table, 0.0001759 secs – Final sub-phase that is cleaning up symbol and string tables which hold class-level metadata and internalized string respectively. Clock time of the pause is also included.
- 10812086K(11901376K) – Occupancy and the capacity of the Old Generation after the phase.
- 11200006K(12514816K) – Usage and the capacity of the total heap after the phase.
- 0.0110730 secs – Duration of the phase.
- [Times: user=0.06 sys=0.00, real=0.01 secs] – Duration of the pause, measured in user, system and real time categories.
Explain the Final Remark Phase of CMS
This is the second and last stop-the-world phase during the event. The goal of this stop-the-world phase is to finalize marking all live objects in the Old Generation. This means traversing the Old Generation starting from the roots determined in the same way as during the Initial Mark plus the so-called “dirty” objects, i.e. the ones that had modifications to their fields during the concurrent phases. Usually CMS tries to run final remark phase when Young Generation is as empty as possible in order to eliminate the possibility of several stop-the-world phases happening back-to-back.
Explain GC Logs of Concurrent Sweep Phase of CMS
2015-05-26T16:23:08.458-0200: 65.56: [CMS-concurrent-sweep-start]
2015-05-26T16:23:08.485-0200: 65.588: [CMS-concurrent-sweep: 0.027/0.027 secs] [Times: user=0.03 sys=0.00, real=0.03 secs]
- CMS-concurrent-sweep – Phase of the collection “Concurrent Sweep” in this occasion, sweeping unmarked and thus unused objects to reclaim space.
- 0.027/0.027 secs – Duration of the phase showing elapsed time and wall clock time correspondingly.
- [Times: user=0.03 sys=0.00, real=0.03 secs] – “Times” section is less meaningful on concurrent phases, as it is measured from the start of the concurrent marking and includes more than just the work done for the concurrent marking.
Explain GC Logs of Concurrent Reset phase of CMS
2015-05-26T16:23:08.485-0200: 65.589: [CMS-concurrent-reset-start]
2015-05-26T16:23:08.497-0200: 65.60: [CMS-concurrent-reset: 0.012/0.012 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
- CMS-concurrent-reset – The phase of the collection – “Concurrent Reset” in this occasion – that is resetting inner data structures of the CMS algorithm and preparing for the next collection.
- 0.012/0.012 secs – Duration of the phase, measuring elapsed and wall clock time respectively.
- [Times: user=0.01 sys=0.00, real=0.01 secs] – The “Times” section is less meaningful on concurrent phases, as it is measured from the start of the concurrent marking and includes more than just the work done for the concurrent marking.