Low Latency Flashcards

1
Q

How do you create immutable objects in Java?

A

Provide no setters for the fields
and
declare the fields final (which means they must be initialized in the constructor or defaulted to a value).
Take care with the field types: if Employee has an Address, Address should also be immutable.
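
A minimal sketch of the rules above, using the Employee/Address example from the card. Marking the classes final and copying the incoming list are extra precautions assumed here, not something the card states; the skills field is hypothetical.

```java
import java.util.List;

// Address is immutable: final field, no setters.
final class Address {
    private final String city;

    Address(String city) { this.city = city; }

    String getCity() { return city; }
}

// Employee only holds immutable state, so it is immutable too.
final class Employee {
    private final String name;
    private final Address address;       // safe to share: Address cannot change
    private final List<String> skills;   // hypothetical extra field

    Employee(String name, Address address, List<String> skills) {
        this.name = name;
        this.address = address;
        this.skills = List.copyOf(skills); // unmodifiable copy, the caller's list is not kept
    }

    String getName() { return name; }
    Address getAddress() { return address; }
    List<String> getSkills() { return skills; }
}
```

Changing the caller's list after construction does not affect the Employee, and getSkills() rejects modification.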

2
Q

What benefits do immutable objects provide?

A

They are side-effect free and can be passed freely across multiple threads without being affected.

3
Q

Why is the String object immutable in Java?

A

String is statistically the most used data structure in Java, so it needed optimization.

a) String pool: string literals are pooled and reused, so fewer objects with the same value are created. This saves memory, but it requires strings to be immutable so that multiple references can point to the same object and be sure its value does not change.

b) Caching: anything immutable can be cached safely.
c) Thread safety: an immutable String is automatically thread-safe.
d) Security: if another thread tries to change the value of a string you hold, that does not affect you.

void criticalMethod(String userName) {
    // perform security checks
    if (!isAlphaNumeric(userName)) {
        throw new SecurityException();
    }
    // this check would be useless if someone could change the value of userName after it is done

    // do some secondary tasks
    initializeDatabase();

    // critical task
    connection.executeUpdate("UPDATE Customers SET Status = 'Active' " +
        " WHERE UserName = '" + userName + "'");
}

e) Performance: sharing pooled strings decreases the size of the heap in use. Java 9 also added compact strings, which store the characters in a byte[] instead of a char[] when the string contains only Latin-1 characters.

f) The hash code of a String is computed only once and then cached. Subsequent calls to hashCode() return the cached value. This in turn helps make collections like HashMap, Hashtable and HashSet faster.
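
The string pool behaviour from point a) can be observed with a small sketch. The class and method names are made up for illustration; the behaviour shown is the standard rule that literals are pooled and new String(...) creates a separate object.

```java
class StringPoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsShareOneObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) always creates a distinct object with an equal value.
    static boolean newStringIsSeparateObject() {
        String a = "hello";
        String b = new String("hello");
        return a != b && a.equals(b);
    }

    // intern() returns the pooled instance for that value.
    static boolean internReturnsPooledObject() {
        return new String("hello").intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());
        System.out.println(newStringIsSeparateObject());
        System.out.println(internReturnsPooledObject());
    }
}
```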

4
Q

Java 17 new features

A

Pattern matching for switch (a preview feature in Java 17, finalized in Java 21) and for instanceof (final since Java 16).
A case label no longer has to be a constant value: switch can match on the type of an object and bind the matched value to a variable (i and s in this case).

public static String getInfo(Object object) {
    return switch (object) {
        case null -> "Null!";
        case Integer i -> (i < 0 ? "Negative" : "Positive") + " integer";
        case String s -> "String of length " + s.length();
        default -> "Unknown";
    };
}

4
Q

What is a race condition?

A

A race condition occurs whenever the behavior of a multi-threaded application depends on the timing or interleaving of its threads, so that results are inconsistent from run to run.
This is especially true of check-then-act and read-modify-write operations.

For example, c++
is equivalent to
read the value of c
increment that value by 1
assign the new value to c

If two threads each run c++ starting from c = 0, they may observe 1 and 1 (one update is lost) or 1 and 2.

Lazy initialisation has the same issue:
check whether the object exists and, if not, create it.
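
One way to close both races is to make the compound operation a single critical section. This is only a sketch: the class names are hypothetical and synchronized is just one of several possible fixes.

```java
class SafeCounter {
    private int c = 0;

    // synchronized turns read, increment and write into one indivisible step
    synchronized void increment() { c++; }

    synchronized int get() { return c; }
}

class LazyService {
    private static LazyService instance;

    private LazyService() { }

    // the check and the creation happen under the class lock,
    // so two threads can never both see null and create two objects
    static synchronized LazyService getInstance() {
        if (instance == null) {
            instance = new LazyService();
        }
        return instance;
    }
}

class RaceDemo {
    // Runs the same increment loop on two threads and returns the final count.
    static int incrementFromTwoThreads(int perThread) {
        SafeCounter counter = new SafeCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

With the synchronized keyword removed from increment(), the returned count can fall short of 2 * perThread because updates get lost.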

5
Q

What problem do volatile variables address?

A

It has to do with the Java memory model: threads may work with their own cached copies of shared variables.

Essentially there is a visibility problem: other threads may not see the changes that were made by one thread.

The thread-cached value and the main-memory value might differ.

A write to a volatile variable is made visible to other threads, and every read of it sees the most recent write, so no thread keeps working with a stale cached copy.

A volatile write also publishes the writes made before it: a thread that then reads the volatile variable sees those earlier writes too (the happens-before guarantee).

volatile gives visibility only, not atomicity: a compound operation such as c++ on a volatile field is still a race.

Atomic variables also address the visibility issue (similar to volatile).
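
A typical use is a stop flag written by one thread and read by another. This is a minimal sketch with hypothetical class names; without volatile on the flag the worker loop is not guaranteed to ever see the update.

```java
class Worker implements Runnable {
    // volatile: the write in stop() becomes visible to the thread running run()
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Thread.onSpinWait(); // busy-wait until another thread clears the flag
        }
    }

    void stop() { running = false; }
}

class VolatileDemo {
    // Starts a worker, asks it to stop, and reports whether it terminated.
    static boolean stopsPromptly() {
        Worker worker = new Worker();
        Thread t = new Thread(worker);
        t.start();
        worker.stop();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```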

6
Q

Write about atomic integers and the non-blocking concurrency setup

A

AtomicInteger and AtomicReference expose the compare-and-swap (CAS) operation through compareAndSet.
Typical use

AtomicInteger i = new AtomicInteger(0);

int oldValue;
int newValue;
do {
    oldValue = i.get();        // read the current value
    newValue = oldValue + 1;   // compute the new value from it
} while (!i.compareAndSet(oldValue, newValue)); // retry if another thread changed i in between

CAS returns true or false depending on whether it was able to do the swap, so each thread retries until its own update goes through and no update is lost.

AtomicInteger has methods like
getAndIncrement() and incrementAndGet() that work on the above principle and hence can be used in multi-threaded applications without locks.

7
Q

-XX:+UseStringDeduplication
G1 collector option only

A

The JVM tries to eliminate duplicate strings as part of the garbage collection process.
It works only on long-lived objects.
To see string deduplication statistics, such as how long it took to run, how many duplicate strings were eliminated and how much memory was saved, on Java 8 you may pass ‘-XX:+PrintStringDeduplicationStatistics’ (from Java 9 onwards this information comes through unified logging, -Xlog, instead).

8
Q

Advantages of keeping -Xms and -Xmx the same

A

The system starts up faster.
The heap is never grown or shrunk at runtime, so there is no allocation and deallocation of memory from and to the OS as the application's memory usage changes.
The maximum memory is claimed for the process at startup, so new heavy neighbours on the same host won't affect you.
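
To check what a running JVM actually got, the Runtime API reports the committed and maximum heap. A small sketch, assuming a launch such as java -Xms512m -Xmx512m HeapSizes (the 512m value and the class name are only examples):

```java
class HeapSizes {
    // Heap currently committed by the JVM, in bytes.
    static long committedBytes() { return Runtime.getRuntime().totalMemory(); }

    // Upper bound the heap may grow to, in bytes.
    static long maxBytes() { return Runtime.getRuntime().maxMemory(); }

    public static void main(String[] args) {
        System.out.println("committed heap (bytes): " + committedBytes());
        System.out.println("max heap (bytes):       " + maxBytes());
    }
}
```

When -Xms and -Xmx are equal, the two numbers should be close to each other right from startup.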

9
Q

Heap dump tools

A

jmap
Load the heap dump into Eclipse MAT or JProfiler/YourKit.

10
Q

Usual causes of high memory usage

A

A considerable amount of memory is wasted because of inefficient programming practices such as duplicate object creation, suboptimal data type definitions (declaring ‘double’ when only ‘float’ values are assigned), over-allocation and underutilization of data structures, and several other practices.

Boxed wrappers take far more memory than primitives, and duplicate strings add to the waste.

You can consider interning strings in your code.
Use primitives as much as possible and get rid of boxed numbers.
There are specialised collection libraries that store primitive values directly and are much faster; fastutil is one such.
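
The boxed versus primitive difference, sketched with plain JDK types (names are illustrative). Both methods compute the same sum, but the first stores one Integer reference per element while the second stores raw ints in a single array.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Boxed: every element is an Integer referenced from the list.
    static long sumBoxed(int n) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(i); // autoboxing int -> Integer
        }
        long sum = 0;
        for (Integer v : values) {
            sum += v;      // unboxing Integer -> int
        }
        return sum;
    }

    // Primitive: one contiguous int[] with no per-element objects.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}
```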

11
Q

GC logging

A

Enabling GC logging on your application adds almost zero overhead.

The GC log contains a rich set of information, such as timestamps of GC events, types of events (Young GC, Full GC, Mixed GC), duration of events, sizes of internal JVM memory regions (Young Gen, Old Gen, Metaspace) before and after the event, time spent in the JVM and kernel during GC events, number of GC threads used, duration each GC thread took to complete, string deduplication details…

12
Q

How do you choose a GC?

A

If memory (heap size) is above 32 GB:
a) if GC throughput is important, go for Parallel GC
b) if low pause time is a requirement, go for Shenandoah or ZGC

If memory is below 32 GB:
a) if GC throughput is important, go for Parallel GC
b) else go for G1 GC.
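
The rule of thumb above, written as a small decision helper. The class and method are hypothetical; the returned strings are the standard HotSpot flags for each collector. ZGC stands in for "Shenandoah or ZGC", and a heap of exactly 32 GB is treated as the smaller case, which is an assumption.

```java
class GcChooser {
    // heapGb: planned heap size in GB; throughputFirst: true when throughput matters more than pauses
    static String chooseGcFlag(int heapGb, boolean throughputFirst) {
        if (throughputFirst) {
            return "-XX:+UseParallelGC";     // same choice for small and large heaps
        }
        if (heapGb > 32) {
            return "-XX:+UseZGC";            // low pause on large heaps (or -XX:+UseShenandoahGC)
        }
        return "-XX:+UseG1GC";               // default low-pause choice for smaller heaps
    }
}
```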

13
Q

What do you get by doing GC tuning?
What are the benefits of GC tuning?

A

Reducing GC pauses improves response times.
It can improve throughput.
It lets you right-size the memory, and hence the cost.
It lets you right-size the CPU by reducing the CPU used by GC, and hence the cost.
Monitoring GC logs unearths memory problems earlier.

Look at the GC thread count in a thread dump.

-XX:ParallelGCThreads=n: sets the number of threads used in the parallel phases of the garbage collectors.
-XX:ConcGCThreads=n: controls the number of threads used in the concurrent phases of the garbage collectors.

So if your JVM is running on a server with 32 processors, then by default
ParallelGCThreads is going to be 23, i.e. 8 + (32 - 8) * (5/8)
ConcGCThreads is going to be 6, i.e. max((23 + 2) / 4, 1) = max(25/4, 1), with integer division
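
The arithmetic above as a sketch. The formulas are taken from the worked example on this card; treating machines with 8 or fewer processors as "one thread per processor" is an assumption about the default, and the class name is made up.

```java
class GcThreadDefaults {
    // 8 + (cpus - 8) * 5/8 for larger machines; assumed to be one thread per CPU up to 8
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    // max((ParallelGCThreads + 2) / 4, 1), using integer division
    static int concGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        int parallel = parallelGcThreads(32);
        System.out.println("ParallelGCThreads = " + parallel);               // 23
        System.out.println("ConcGCThreads     = " + concGcThreads(parallel)); // 6
    }
}
```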

14
Q

How to tune GC

A

Start from scratch: old JVM arguments added in an earlier iteration may be obsolete or counterproductive with newer algorithms, or may not meet the current system requirements. There are around 600 JVM arguments related to memory and GC.
Tune GC algorithm settings
Adjust internal memory region sizes
a. Young Generation

b. Old Generation

c. Metaspace

d. Others

Reduce the object creation rate: this is done at the application level

15
Q

What do you choose when tuning G1 GC?

A

1) First and foremost, the pause time goal: ‘-XX:MaxGCPauseMillis’
2) Avoid setting the young generation size
3) Remove old arguments
4) Add the ‘-XX:+UseStringDeduplication’ argument

Secondary settings:
number of parallel GC threads
number of concurrent GC threads
the threshold at which GC is triggered
the size of the region itself
overall heap size

https://blog.gceasy.io/simple-effective-g1-gc-tuning-tips/#:~:text=G1%20GC%20is%20an%20adaptive%20garbage

16
Q

Memory Analysis toolkit

A

jps: finds the process IDs of running Java processes
jmap -histo <PID>: generates an object histogram, a snapshot of the objects in memory for that PID

jmap is also used to get a heap dump
jmap -dump:file=<filelocation> <PID>

We can also set the -XX:+HeapDumpOnOutOfMemoryError flag on the JVM

Eclipse MAT is great for analysing the hprof file generated by a heap dump, and it is lightweight

VisualVM (JVisualVM), which is no longer shipped with the JDK from version 9 onwards, can be used to profile memory and to take thread dumps, heap dumps, etc.

Java Mission Control was originally a commercial Oracle tool and is great for profiling apps; it has since been open-sourced as JDK Mission Control and is a separate download rather than part of the OpenJDK distribution

17
Q

What is a memory leak?

A

A memory leak is when an object that is allocated in memory but no longer needed does not get garbage collected, because something still references it.

In an OOM / heap dump scenario you look at the retained heap size.
Shallow heap size is the size of the object itself.
Retained heap size is the size of the object plus everything that would be freed if it were collected. So you want to target the objects with the highest retained heap size.

In memory profiling you look at the generation count of objects:
a count of 0: the object has not gone through any round of GC
1: one round
If you notice that the generation count keeps increasing, that can indicate objects that never get garbage collected. It is worth investigating.

VisualVM also lets you view stack traces to see where in the code an object was created.

18
Q

Most common causes of memory leaks

A

1) Memory leak through static fields
In Java, static fields have a life that usually matches the entire lifetime of the running application.
If collections or large objects are declared as static, they remain in memory throughout the lifetime of the application.
2) Unclosed resources
Connections, input streams and sessions should be closed carefully, thinking about scenarios where an exception might prevent them from being closed properly.

Close them in a finally block or use try-with-resources.
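
A sketch of try-with-resources closing a resource even when its use throws. TrackedResource is a made-up stand-in for a connection or stream.

```java
class TrackedResource implements AutoCloseable {
    boolean closed = false;

    void use(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated failure");
        }
    }

    @Override
    public void close() { closed = true; }
}

class ResourceDemo {
    // Returns true if the resource was closed although use() threw.
    static boolean closedAfterFailure() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource r = resource) {
            r.use(true);
        } catch (IllegalStateException e) {
            // close() has already run by the time the exception is handled here
        }
        return resource.closed;
    }
}
```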

19
Q

How can improper equals() and hashCode() lead to a memory leak?

A

Improper equals() and hashCode() implementations
Something like this adds 100 entries for what is logically the same key: the map size is 100 instead of 1.

@Test
public void givenMap_whenEqualsAndHashCodeNotOverridden_thenMemoryLeak() {
    // Person does not override equals()/hashCode(), so every new Person is a distinct key
    Map<Person, Integer> map = new HashMap<>();
    for (int i = 0; i < 100; i++) {
        map.put(new Person("jon"), 1);
    }
    Assert.assertFalse(map.size() == 1);
}

An ORM tool like Hibernate uses the equals() and hashCode() methods to analyze objects and save them in its cache, so it may end up saving duplicates in the cache.

When defining new entities, always override the equals() and hashCode() methods.
It’s not enough to just override them: the two methods must be consistent with each other and based on the same fields.
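
For contrast, a sketch of Person with both methods overridden on the same field. The single name field is an assumption carried over from the test above; the demo class is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class Person {
    private final String name;

    Person(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        return Objects.equals(name, ((Person) o).name);
    }

    // Same field as equals(), so equal persons always land in the same bucket.
    @Override
    public int hashCode() { return Objects.hash(name); }
}

class PersonMapDemo {
    // Puts 100 equal keys; each put replaces the previous entry.
    static int sizeAfterHundredPuts() {
        Map<Person, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put(new Person("jon"), 1);
        }
        return map.size();
    }
}
```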

20
Q

Other memory leaks

A

Inner classes that reference outer classes

Non-static inner classes (including anonymous classes) keep an implicit reference to their containing class.

If an inner class object stays reachable in our application, then the containing class object is not garbage collected even after it goes out of scope.

Solution: make the inner class static if it does not need the outer instance. Switching collectors (e.g. to ZGC) does not help, because the outer object is still reachable.
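
The difference between the two kinds of nested class, as a sketch with hypothetical names. InnerTask needs its Outer (it reads bigState), so it keeps the whole Outer alive; StaticTask holds no such reference.

```java
class Outer {
    private final byte[] bigState = new byte[1024 * 1024];

    // Non-static: every InnerTask carries a hidden reference to its Outer.
    class InnerTask {
        int run() { return bigState.length; }
    }

    // Static: independent of any Outer instance.
    static class StaticTask {
        int run() { return 42; }
    }

    InnerTask newInnerTask() { return new InnerTask(); }
}
```

A StaticTask can be created with new Outer.StaticTask() without any Outer object existing, whereas an InnerTask can only be created from an Outer instance.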

21
Q

Finalize

A

The finalize() method is called the finalizer. It is defined like any other method, typically to close resources, and the GC has to invoke it before it can reclaim the object.

This usually does not help GC, since the object has to wait for the finalizer to run before its memory can be freed. finalize() has been deprecated since Java 9.

public void finalize() {
    try {
        reader.close();
        System.out.println("Closed BufferedReader in the finalizer");
    } catch (IOException e) {
        // …
    }
}

22
Q

ThreadLocals and memory leaks?

A

ThreadLocals are values/references associated with a thread.
In applications/frameworks that use thread pools, threads once created serve multiple requests and stay alive for a very long time.

Hence ThreadLocal values can stay alive in memory even though they are no longer being used.

Solution: remove the ThreadLocal value once you are done with it, by calling threadLocal.remove() in a finally block.

Imagine you had a ThreadLocal<ArrayList<String>> localList that stays in memory even though it is no longer relevant!
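
The set / use / remove pattern as a sketch. RequestContext and handleRequest are hypothetical names standing in for whatever per-request code runs on a pooled thread.

```java
import java.util.ArrayList;
import java.util.List;

class RequestContext {
    private static final ThreadLocal<List<String>> LOCAL_LIST = new ThreadLocal<>();

    // Simulates one request handled on a (possibly pooled) thread.
    static int handleRequest(String... items) {
        LOCAL_LIST.set(new ArrayList<>());
        try {
            for (String item : items) {
                LOCAL_LIST.get().add(item);
            }
            return LOCAL_LIST.get().size();
        } finally {
            LOCAL_LIST.remove(); // drop the value so the long-lived thread does not retain it
        }
    }

    // True when the current thread holds no value any more.
    static boolean isCleared() {
        return LOCAL_LIST.get() == null;
    }
}
```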

23
Q

define latency and throughput

A

Latency is the time required to perform some action or to produce some result. Latency is measured in units of time – hours, minutes, seconds, nanoseconds or clock periods.

Throughput is the number of such actions executed or results produced per unit of time. This is measured in units of whatever is being produced (cars, motorcycles, I/O samples, memory words, iterations) per unit of time

Generally, you should aim for maximal throughput with acceptable latency.

24
Q

What is the CAP theorem?

A

http://ksat.me/a-plain-english-introduction-to-cap-theorem

Consider a distributed system that serves reads and writes, made up of several heterogeneous nodes separated by a network. The network is asynchronous, so nodes can get disconnected from each other (partitioned). When that happens you always have to choose between consistency and availability.

Consistency: a read always returns the value of the last write (an atomic read). If the system can't provide that data it responds with an error (which contradicts availability).

Availability: the system is always available and you always get a response. The response may not contain the most recent data, because that data is on another node and, due to the partition (that node being disconnected from you), has not reached you. You choose to respond in all cases.

Systems Prioritizing Consistency (C > A)

Banking and Financial Systems
Explanation: Banking systems must ensure that all transactions are consistently recorded. For instance, if a bank transaction is processed (e.g., a transfer of funds), it’s critical that the most recent account balance is reflected in all subsequent reads across the system.

Partition Handling: During a network partition, such a system would rather deny access or delay responses than return outdated data, ensuring that the integrity of data is maintained.
Drawback: This approach may impact availability if the system needs to enforce consistency during network partitions, potentially blocking operations until partition issues are resolved

Another example
Inventory Management in E-commerce
Explanation: Real-time inventory management systems need to ensure that products are not oversold. When an item is sold, it’s important that the inventory count is consistent across the entire system to prevent selling more items than available.

Systems Prioritizing Availability (A > C)
When uninterrupted service is critical, even at the cost of possibly returning slightly stale data, systems prioritize Availability over Consistency.

Social media platforms like Twitter, Instagram, or Facebook focus on providing fast, real-time access to updates and notifications, allowing users to continue interacting with the platform even if some data might not reflect the latest update.

Why Availability?: In this context, it’s better for users to have some data available (even if slightly stale) than to block access until the latest updates are fully synchronized.

Partition Handling: During a network partition, each partition may operate independently, allowing users to continue reading and posting content with the expectation that the system will eventually achieve consistency.

Another availability example
Caching Systems (e.g., CDN for Video Streaming)
Explanation: Content delivery networks (CDNs) prioritize availability, as their primary goal is to serve content quickly to users worldwide. Data is often cached across distributed servers to minimize latency.
Why Availability?: Video streaming platforms like Netflix, YouTube, or Hulu cache popular content at various edge locations to ensure fast access, even if the data is not always the absolute latest (e.g., metadata like the number of views might lag slightly).