low latency Flashcards
How to create immutable objects in Java
Do not provide setters for the fields
and
declare the fields final (which means they must be initialized in the constructor or given a default value at declaration).
Take care with the field types: if Employee has an Address, Address should also be immutable.
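A minimal sketch of the rules above. Employee and Address come from the card; the skills list, the accessor names and withAddress are hypothetical additions used only to show a defensive copy and a "change" that returns a new object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative classes only; field and method names are assumptions.
final class Address {
    private final String city;

    Address(String city) {
        this.city = city;
    }

    String getCity() {
        return city;
    }
}

final class Employee {
    private final String name;
    private final Address address;       // Address is itself immutable
    private final List<String> skills;   // List is mutable, so copy defensively

    Employee(String name, Address address, List<String> skills) {
        this.name = name;
        this.address = address;
        // Copy the caller's list so later changes to it cannot leak in.
        this.skills = Collections.unmodifiableList(new ArrayList<>(skills));
    }

    String getName() { return name; }
    Address getAddress() { return address; }
    List<String> getSkills() { return skills; }   // unmodifiable view

    // No setter: "changing" the address returns a new Employee.
    Employee withAddress(Address newAddress) {
        return new Employee(name, newAddress, skills);
    }
}
```

Declaring the classes final is an extra precaution so that a subclass cannot add mutable state.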
What benefits do immutable objects provide?
They are side-effect free and can be passed freely across multiple threads, because no thread can change their state.
Why is the String object immutable in Java?
String is statistically the most used data structure in Java, so it needed optimization.
a) String pools reuse String literals, so fewer objects with the same value are created. This saves memory, but it requires strings to be immutable so that multiple references can point to the same object and be sure its value does not change.
b) Anything immutable can be cached safely.
c) String also becomes thread-safe.
d) Security: if another thread tries to change the value of a string you hold, it cannot affect you.
void criticalMethod(String userName) {
    // perform security checks
    if (!isAlphaNumeric(userName)) {
        throw new SecurityException();
    }
    // this security check would be useless if someone could change the value of userName after the check is done
    // do some secondary tasks
    initializeDatabase();
    // critical task
    connection.executeUpdate("UPDATE Customers SET Status = 'Active' " + " WHERE UserName = '" + userName + "'");
}
e) Performance: sharing strings reduces the amount of heap used. Java 9 added compact strings, which store a string in a byte[] instead of a char[] when it contains only Latin-1 characters.
f) The hash code of a String object is computed only once and then cached. Subsequent calls to hashCode() return the cached value. This in turn makes collections like HashMap, Hashtable and HashSet faster.
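The pooling behaviour can be seen with reference comparisons. This is a small sketch; the class and method names are made up for illustration.

```java
class StringPoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsShareOneObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) always creates a separate object with equal content.
    static boolean newStringIsSeparateObject() {
        String a = "hello";
        String b = new String("hello");
        return a != b && a.equals(b);
    }

    // intern() returns the pooled instance for that content.
    static boolean internReturnsPooledObject() {
        String a = "hello";
        String b = new String("hello").intern();
        return a == b;
    }
}
```

All three methods return true. Sharing one object between unrelated callers is only safe because none of them can modify it.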
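Java 17 new features
Pattern matching for switch and instanceof
Pattern matching for instanceof is final since Java 16. Pattern matching for switch is a preview feature in Java 17 (it needs --enable-preview) and became final in Java 21.
A case label doesn't have to be a constant value: switch can match on the type of any object and also initialise a variable (i and s in this case).
public static String getInfo(Object object) {
    return switch (object) {
        case null -> "Null!";
        case Integer i -> (i < 0 ? "Negative" : "Positive") + " integer";
        case String s -> "String of length " + s.length();
        // default is kept last so that it cannot shadow the pattern labels
        default -> "Unknown";
    };
}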
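The instanceof half of the feature is not shown on this card, so here is a short sketch of it (Java 16 or later). The class and method names are illustrative.

```java
class InstanceofDemo {
    static String describe(Object object) {
        // Old style: test the type, then cast by hand.
        if (object instanceof Integer) {
            Integer i = (Integer) object;
            return (i < 0 ? "Negative" : "Positive") + " integer";
        }
        // Pattern matching: test and bind in one step.
        // s is only in scope where the instanceof test succeeded.
        if (object instanceof String s && !s.isEmpty()) {
            return "String of length " + s.length();
        }
        return "Unknown";
    }
}
```

A null argument fails every instanceof test, so describe(null) falls through to "Unknown" without a NullPointerException.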
What is a race condition?
Any time the behaviour of a multi-threaded application depends on the timing or interleaving of its threads, so that runs can give inconsistent results, it can be called a race condition.
This is especially true for all check-then-act and read-modify-write operations.
For example c++
is equal to
read the value of c
increment that value by 1
assign the new value to c
If two threads run c++ on c = 0 without synchronization, they may get 1 and 1 (one update is lost) or 1 and 2.
Lazy initialisation also has this issue:
check if the object exists and, if not, create it.
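A sketch of the c++ example with real threads. The unsynchronized counter may lose updates, while the synchronized one always ends at the expected total. CounterDemo and its methods are hypothetical names.

```java
class CounterDemo {
    private int unsafeCount = 0;
    private int safeCount = 0;

    // Read, add, write: three steps that threads can interleave.
    void unsafeIncrement() {
        unsafeCount++;
    }

    // Only one thread at a time may run this method on the same object.
    synchronized void safeIncrement() {
        safeCount++;
    }

    // Returns {unsafeCount, safeCount} after all threads have finished.
    static int[] run(int threads, int perThread) {
        CounterDemo c = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.unsafeIncrement();
                    c.safeIncrement();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();   // join makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { c.unsafeCount, c.safeCount };
    }
}
```

Calling CounterDemo.run(4, 100_000) always gives 400000 for the safe counter. The unsafe counter varies from run to run, which is exactly the timing dependence the card describes.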
What problem do volatile variables address?
It has to do with the Java memory model: the idea of threads working with their own cached copies of shared variables.
Essentially there is a visibility problem, which means other threads don't see the changes that were made by one thread.
The thread-cached value and the main memory value might differ.
A write to a volatile variable is made visible to all threads, and every read of it returns the latest written value instead of a stale cached copy.
Visibility extends beyond the volatile variable itself: everything a thread wrote before writing the volatile variable is visible to any thread that subsequently reads that variable (the happens-before rule).
volatile does not make compound operations such as c++ atomic.
Atomic variables also address the visibility issue (similar to volatile).
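The classic use of volatile is a stop flag shared between two threads. In this sketch the names StoppableWorker and startAndStop are assumptions, and the counter stands in for real work.

```java
class StoppableWorker implements Runnable {
    // Without volatile, the worker might never observe the update.
    private volatile boolean running = true;
    private long iterations = 0;   // touched only by the worker thread

    @Override
    public void run() {
        while (running) {
            iterations++;   // stand-in for real work
        }
    }

    void stop() {
        running = false;    // visible to the worker on its next read
    }

    // Starts a worker, asks it to stop, and reports whether it ended in time.
    static boolean startAndStop(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.setDaemon(true);
        t.start();
        worker.stop();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

The flag is written by one thread and only read by the other, so visibility is the only guarantee needed and volatile is enough.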
Write about atomic integers and the non-blocking concurrency setup
AtomicInteger and AtomicReference expose the compare-and-swap (CAS) operation through compareAndSet.
Typical use
AtomicInteger i = new AtomicInteger(0);
int oldValue;
int newValue;
do {
    oldValue = i.get();       // read the current value
    newValue = oldValue + 1;  // compute the new value from it
} while (!i.compareAndSet(oldValue, newValue)); // retry if another thread changed i in between
compareAndSet returns true or false depending on whether it was able to do the swap, so each thread retries until its update is applied and no update is lost.
AtomicInteger has methods like
getAndIncrement() and incrementAndGet() that work on the above principle and hence can be used in multi-threaded applications without locks.
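A self-contained sketch of the same CAS loop under contention from several threads. CasCounter and runThreads are hypothetical names. In real code incrementAndGet() would replace the hand-written loop.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Hand-written CAS loop with the same effect as value.incrementAndGet().
    int increment() {
        int oldValue;
        int newValue;
        do {
            oldValue = value.get();
            newValue = oldValue + 1;
        } while (!value.compareAndSet(oldValue, newValue));
        return newValue;
    }

    int get() {
        return value.get();
    }

    // Increments one shared counter from several threads; returns the final value.
    static int runThreads(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

No thread ever blocks. A thread that loses the race simply re-reads the value and tries again.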
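-XX:+UseStringDeduplication
Originally a G1 collector option only; newer JDKs (18 and later) also support it with other collectors.
The JVM will try to eliminate duplicate strings as part of the garbage collection process.
Works only on long-lived objects (strings that have survived several GC cycles).
To see string deduplication statistics, such as how much time it took to run, how many duplicate strings were removed and how much memory you saved, you may pass -XX:+PrintStringDeduplicationStatistics (a JDK 8 flag; JDK 9 and later use unified logging through -Xlog instead).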
Advantages of keeping -Xms and -Xmx the same
The system starts up faster.
The JVM does not keep allocating memory from and releasing memory to the OS as the application's memory usage changes.
The maximum memory is reserved for the process at startup, so new heavy neighbours on the same host won't affect you.
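One rough way to check the effect from inside the application is to compare the committed heap with the maximum heap. This is a sketch; HeapSettingsCheck is a made-up name, and the values reported by Runtime are approximations of -Xms and -Xmx rather than exact copies of the flags.

```java
class HeapSettingsCheck {
    // Upper bound the heap may grow to (roughly -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently committed by the JVM (starts near -Xms).
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    static String report() {
        long mb = 1024L * 1024L;
        return "committed=" + (committedHeapBytes() / mb) + " MB, max="
                + (maxHeapBytes() / mb) + " MB";
    }
}
```

When the process is started with equal -Xms and -Xmx, the two numbers should be close from the first call onward; with a small -Xms the committed value grows over time.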
Heap dump tool
jmap
Load the heap dump into Eclipse MAT, JProfiler or YourKit.
Usual causes of high memory usage
A considerable amount of memory is wasted because of inefficient programming practices such as duplicate object creation, suboptimal data type definitions (declaring double and assigning only float values), over-allocation and under-utilization of data structures, and several other practices.
Boxed wrappers take far more memory than primitives, and duplicate strings add to the waste.
You can consider interning strings in your code.
You can consider using primitives as much as possible.
There are specialised collection libraries that store primitive values directly, and they are much faster than boxed collections.
fastutil is one such library.
Get rid of boxed numbers.
GC logging
Enabling GC logging on your application adds almost zero overhead
The GC log contains a rich set of information, such as timestamps of GC events, types of events (Young GC, Full GC, Mixed GC), duration of events, sizes of internal JVM memory regions (Young Gen, Old Gen, Metaspace) before and after the event, time spent in the JVM and kernel during GC events, number of GC threads used, how long each GC thread took to complete, string deduplication details…
How do you choose a GC?
If the heap is larger than 32 GB
then
a) if GC throughput is important, go for Parallel GC
b) if low pause time is a requirement, go for Shenandoah or ZGC
If the heap is smaller than 32 GB
a) if GC throughput is important, go for Parallel GC
b) else go for G1 GC.
What do you get by doing GC tuning? /
What are the benefits of GC tuning?
By reducing GC pauses you can improve response times.
You can improve throughput.
You can right-size the memory, and hence the cost.
You can right-size the CPU by reducing the CPU used by GC, and hence the cost.
You can unearth memory problems earlier by monitoring GC logs.
Look at the GC thread count in a thread dump
-XX:ParallelGCThreads=n: sets the number of threads used in the parallel phases of the garbage collectors.
-XX:ConcGCThreads=n: controls the number of threads used in the concurrent phases of the garbage collectors.
So if your JVM is running on a server with 32 processors, then by default
ParallelGCThreads is going to be 23, i.e. 8 + (32 - 8) * (5/8)
ConcGCThreads is going to be 6, i.e. max((23 + 2) / 4, 1) = max(25/4, 1) with integer division
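The arithmetic above can be checked with a few lines of code. This is a sketch of the default heuristics as described on the card. The branch for machines with 8 or fewer processors (one GC thread per processor) and the (n + 2) / 4 form of the concurrent-thread rule are assumptions based on HotSpot's usual behaviour, and the class name is made up.

```java
class GcThreadDefaults {
    // Default ParallelGCThreads for a machine with the given processor count.
    static int parallelGcThreads(int cpus) {
        // Assumption: up to 8 processors, one GC thread per processor.
        if (cpus <= 8) {
            return cpus;
        }
        return 8 + (cpus - 8) * 5 / 8;   // integer arithmetic
    }

    // Default ConcGCThreads derived from ParallelGCThreads.
    static int concGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }
}
```

For 32 processors this gives 23 parallel threads and 6 concurrent threads, matching the numbers on the card.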
How to tune GC
Start from scratch: old JVM arguments added in an earlier iteration may be obsolete or counterproductive with newer algorithms, or may not meet the current system requirements. There are about 600 JVM arguments related to memory and GC.
1. Tune GC algorithm settings
2. Adjust the sizes of the internal memory regions
a. Young Generation
b. Old Generation
c. Metaspace
d. Others
3. Reduce the object creation rate: this is done at the application level
Tuning G1 GC: what do you choose?
First and foremost
1) Set a pause time goal: -XX:MaxGCPauseMillis
2) Avoid setting the young gen size (an explicit size overrides the pause time goal)
3) Remove old arguments
4) Add the -XX:+UseStringDeduplication argument
Secondly
number of parallel GC threads
number of concurrent GC threads
the threshold at which GC is triggered
the size of the region itself
overall heap size
https://blog.gceasy.io/simple-effective-g1-gc-tuning-tips/#:~:text=G1%20GC%20is%20an%20adaptive%20garbage
Memory analysis toolkit
jps: finds the process IDs of running Java processes
jmap -histo <PID>: generates an object histogram, a snapshot of the objects in memory for a PID
jmap is also used to get a heap dump:
jmap -dump:format=b,file=<filelocation> <PID>
We can also set -XX:+HeapDumpOnOutOfMemoryError for the JVM.
Eclipse MAT is great for analysing the .hprof file generated by a heap dump, and it is lightweight.
VisualVM (jvisualvm) is not shipped with the JDK from version 9 onwards, but it can be downloaded separately and used to profile memory and take thread or heap dumps.
Java Mission Control is great for profiling apps. It started as a commercial Oracle JDK feature and is not bundled with OpenJDK; it is now a separate download.
What is a memory leak?
A memory leak is when an object that is allocated in memory but no longer needed is not garbage collected, because something still references it.
In an OOM / heap dump scenario you look at the retained heap size.
Shallow heap size is the size of the object itself.
Retained heap size is the size of the object plus everything that is reachable only through it, i.e. the memory that would be freed if the object were collected. So you want to target the objects with the highest retained heap size.
In memory profiling you look at the generation count of objects:
a count of 0: the object has not gone through any round of GC
1: one round
If you notice that the generation count is increasing constantly, that can be an indicator of an object not getting garbage collected at all. Worth investigating.
VisualVM also lets you view stack traces to see where in the code an object was created.
most common causes of memory leak
1) Memory Leak Through static Fields
In Java, static fields have a life that usually matches the entire lifetime of the running application
If collections or large objects are declared as static, then they remain in the memory throughout the lifetime of the application.
2) Unclosed resources
Connections, input streams and sessions should be closed carefully, thinking about scenarios where an exception might prevent them from being closed properly.
Close them in a finally block or use try-with-resources.
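A sketch of try-with-resources closing a resource even when the body throws. TrackedResource is a hypothetical stand-in for a connection or stream; it only records whether close() ran.

```java
// Hypothetical resource used only to observe the close() call.
class TrackedResource implements AutoCloseable {
    boolean closed = false;

    void use(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated failure");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

class ResourceDemo {
    // Returns true if the resource was closed although use() threw.
    static boolean closedAfterFailure() {
        TrackedResource seen = null;
        try (TrackedResource r = new TrackedResource()) {
            seen = r;
            r.use(true);   // throws inside the try block
        } catch (IllegalStateException e) {
            // close() has already run by the time this handler executes
        }
        return seen != null && seen.closed;
    }
}
```

The same shape works for any class that implements AutoCloseable, which covers JDBC connections and the java.io streams.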
How can improper equals() and hashCode() lead to a memory leak?
Improper equals() and hashCode() implementations
Something like this adds 100 entries for what is logically the same key: the map size is 100 instead of 1.
@Test
public void givenMap_whenEqualsAndHashCodeNotOverridden_thenMemoryLeak() {
    Map<Person, Integer> map = new HashMap<>();
    for (int i = 0; i < 100; i++) {
        map.put(new Person("jon"), 1);
    }
    Assert.assertFalse(map.size() == 1);
}
An ORM tool like Hibernate uses the equals() and hashCode() methods to compare objects and save them in its cache, so it may end up saving duplicates in the cache.
When defining new entities, always override the equals() and hashCode() methods.
It's not enough to just override them: these methods must be overridden in an optimal way as well.
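For contrast, a sketch of a Person that does override both methods. The single name field is an assumption, since the card does not show the class. With this version the same loop leaves one entry in the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Assumed shape of Person: one field, used for equality and hashing.
final class Person {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Person)) {
            return false;
        }
        return Objects.equals(name, ((Person) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);   // equal objects give equal hash codes
    }
}

class PersonMapDemo {
    static int sizeAfterHundredPuts() {
        Map<Person, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put(new Person("jon"), 1);   // each put replaces the same key
        }
        return map.size();
    }
}
```

Both methods must use the same fields; overriding only one of them brings the duplicates back.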
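Other memory leaks
Inner classes that reference outer classes
Every non-static inner class (including anonymous classes) keeps an implicit reference to its containing class.
If an object of this inner class stays in use in our application, then even after the containing class' object goes out of scope, it won't be garbage collected.
Solution: make the inner class static if it does not need access to the outer instance. Switching the collector (for example to ZGC) does not help, because the outer object is still reachable through the inner one.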
Finalize
The finalize() method is called the finalizer. It is defined just like any other method; in it you try to close resources, and the GC has to invoke this method before it can actually reclaim the object.
This usually hurts rather than helps GC, because it has to wait for the finalizer to run first. finalize() has been deprecated since Java 9.
public void finalize() {
    try {
        reader.close();
        System.out.println("Closed BufferedReader in the finalizer");
    } catch (IOException e) {
        // ...
    }
}
ThreadLocals and memory leaks
ThreadLocals are values/references associated with a thread.
In applications and frameworks that use thread pools, threads once created serve many requests and stay alive for a long time.
Hence ThreadLocal values can stay alive in memory even though they are no longer being used.
Solution: once you are done with the value obtained from threadLocal.get(), call threadLocal.remove(), ideally in a finally block.
Imagine you had a ThreadLocal<ArrayList<String>> localList that stays in memory even though it is no longer relevant!
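A sketch of the remove-in-finally pattern for the list example above. RequestContext and handle are hypothetical names standing in for a request handler that runs on a pooled thread.

```java
import java.util.ArrayList;
import java.util.List;

class RequestContext {
    // One list per thread, created lazily on first get().
    private static final ThreadLocal<List<String>> LOCAL_LIST =
            ThreadLocal.withInitial(ArrayList::new);

    // Handles one "request" and returns how many items it collected.
    static int handle(String... items) {
        try {
            List<String> list = LOCAL_LIST.get();
            for (String item : items) {
                list.add(item);
            }
            return list.size();
        } finally {
            // Drop the value so the long-lived pooled thread does not retain it.
            LOCAL_LIST.remove();
        }
    }
}
```

Because the value is removed after every call, a second call on the same thread starts with a fresh, empty list instead of data left over from the previous request.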
define latency and throughput
Latency is the time required to perform some action or to produce some result. Latency is measured in units of time – hours, minutes, seconds, nanoseconds or clock periods.
Throughput is the number of such actions executed or results produced per unit of time. This is measured in units of whatever is being produced (cars, motorcycles, I/O samples, memory words, iterations) per unit of time
Generally, you should aim for maximal throughput with acceptable latency.
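The two definitions can be turned into a small calculation. This sketch assumes operations run one after another on a single thread, so total time is simply the sum of the individual latencies. The class and method names are illustrative.

```java
class ThroughputLatency {
    // Average time per operation, in nanoseconds.
    static double averageLatencyNanos(long totalNanos, long operations) {
        return (double) totalNanos / operations;
    }

    // Operations completed per second.
    static double throughputPerSecond(long totalNanos, long operations) {
        double seconds = totalNanos / 1_000_000_000.0;
        return operations / seconds;
    }
}
```

For example, 1000 operations taking 2 seconds in total have an average latency of 2 ms and a throughput of 500 operations per second. With several threads working in parallel, throughput can rise while the latency of each operation stays the same, which is why the two are measured separately.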