Terminology Flashcards
Bandwidth
Bandwidth is a measure of how much data a communication link can carry per unit of time, i.e. its capacity. This is typically measured in kbps (kilobits – thousands of bits per second), Mbps (megabits – millions of bits per second) or Gbps (gigabits – billions of bits per second).
Think of lanes on a road. A mile of eight-lane freeway has more capacity for cars than a mile of two-lane road.
Latency
Latency is the time it takes for a packet to get across the network, from source to destination. It is measured in units of time — ms (millisecond, or 1/1,000 of a second).
Following the road analogy, it’s how long it takes you to get to work. Longer is not better.
For a software system, latency is defined as:
Latency = (timestamp at which the response arrived) - (timestamp at which the request was sent)
Thus, latency is simply the time taken by the application to receive the request, process it, and return the response to the caller.
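As a minimal sketch of measuring this, using only Python's standard library (the URL below is a placeholder): timestamp the send, wait for the full response, and take the difference.

```python
import time
import urllib.request

def measure_latency_ms(url: str) -> float:
    """Return the time from sending a request to receiving its response, in ms."""
    start = time.perf_counter()          # timestamp at which the request is sent
    with urllib.request.urlopen(url) as response:
        response.read()                  # block until the full response arrives
    end = time.perf_counter()            # timestamp at which the response arrived
    return (end - start) * 1000.0

print(f"Latency: {measure_latency_ms('https://example.com'):.1f} ms")
```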
Throughput
Throughput is the actual amount of data that is successfully sent/received over the communication link per unit of time. Throughput is expressed in kbps, Mbps or Gbps, and can fall short of bandwidth due to a range of technical issues, including latency, packet loss, jitter and more.
Rush hour traffic, speed limits, potholes and stalled vehicles prevent you and your fellow travelers from zipping along.
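A rough way to see the difference in practice is to measure how many bits actually arrive per second, regardless of the link's nominal bandwidth. A minimal sketch, again using only Python's standard library and a placeholder URL:

```python
import time
import urllib.request

def measure_throughput_mbps(url: str) -> float:
    """Download a resource and return the achieved throughput in Mbps."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        total_bytes = len(response.read())
    elapsed = time.perf_counter() - start
    return (total_bytes * 8) / elapsed / 1_000_000  # bits per second -> Mbps

print(f"Throughput: {measure_throughput_mbps('https://example.com'):.2f} Mbps")
```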
Bandwidth, Latency and Throughput Relationship
If you have high bandwidth and low latency, you get greater throughput, because more data can be in flight at once and each round trip completes sooner.
E.g. if you are in India watching a movie online and the CDN that serves the movie is in New York, throughput will be lower because of the increased latency. On the other hand, if the CDN server is located in Mumbai, throughput will be higher because of the lower latency. This example assumes that the Internet bandwidth is decent in both cases.
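One concrete mechanism behind this: a single TCP connection can only have one window of unacknowledged data in flight per round trip, so its throughput is capped at roughly window size / RTT, no matter how fat the link is. A back-of-the-envelope sketch (the 64 KB window and the RTT values are illustrative assumptions, not measurements):

```python
# Per-connection TCP throughput is bounded by window_size / round_trip_time,
# regardless of how much bandwidth the link itself offers.
WINDOW_BYTES = 64 * 1024  # assume a 64 KB TCP window (illustrative)

for route, rtt_ms in [("India -> New York CDN", 250), ("India -> Mumbai CDN", 20)]:
    max_mbps = (WINDOW_BYTES * 8) / (rtt_ms / 1000) / 1_000_000
    print(f"{route}: RTT {rtt_ms} ms -> at most {max_mbps:.1f} Mbps per connection")
```

With these numbers, the New York route tops out near 2 Mbps while the Mumbai route allows over 26 Mbps, purely because of the latency difference.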
Desired Qualities of a Software System
1) Low Latency
2) High Throughput
Scalability
Scalability is the capability of a system to handle increased load: every piece of the infrastructure on which the application runs can be expanded as the system grows (in data volume, traffic volume, or complexity).
There are four general areas of a system where scalability applies:
1) Disk I/O
2) Memory
3) Network I/O
4) CPU
When any of these four areas becomes a bottleneck, the system tends to slow down or even crash, resulting in a bad user experience. In such scenarios, the system can be scaled using:
1) Vertical Scaling (scaling up: moving to a more powerful machine)
2) Horizontal Scaling (scaling out: adding more machines and distributing the load across them)
Elasticity
Elasticity is the capability of a system to scale up or scale down dynamically. This is achieved when the system is able to detect additional load/traffic and add resources on the fly; as the load fades away, the resources are removed automatically.
Systems deployed on the cloud can have elasticity built into them inherently. On AWS, we can build elastic systems using Auto Scaling groups, combined with other AWS services, like Load Balancers, EC2 and ECS.
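A hedged sketch of that AWS setup using boto3 (the group name, launch template name, subnet IDs and target value are all placeholders; this assumes credentials and the referenced resources already exist):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Create an Auto Scaling group that can grow and shrink with load.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",                      # placeholder name
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    LaunchTemplate={"LaunchTemplateName": "web-template"},  # assumed to exist
    VPCZoneIdentifier="subnet-aaa,subnet-bbb",              # placeholder subnets
)

# Target-tracking policy: add/remove instances to keep average CPU near 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```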
Reliability, Fault Tolerance and Resilience
The system should continue to work correctly, at the desired level of performance, even in the face of adversity (hardware or software faults, and even human error).
Typical expectations from a software system include:
- The application performs the function that the user expected.
- It can tolerate the user making mistakes or using the software in unexpected ways.
- Its performance is good enough for the required use case, under the expected load and data volume.
- The system prevents any unauthorized access and abuse.
If all those things together mean “working correctly,” then we can understand reliability as meaning, roughly, “continuing to work correctly, even when things go wrong.” The things that can go wrong are called ‘faults’, and systems that anticipate faults and can cope with them are called ‘fault-tolerant’ or ‘resilient’ systems.
Fault vs Failure
A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user.
It is impossible to reduce the probability of a fault to zero; therefore it is usually best to design fault-tolerance mechanisms that prevent faults from causing failures.
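A classic example of such a mechanism is retrying a transient fault with exponential backoff, so that a single failed call (a fault) is absorbed instead of surfacing to the user as a failure. A minimal sketch, assuming a hypothetical call_service function that raises ConnectionError on transient faults:

```python
import random
import time

def call_with_retries(call_service, max_attempts=5):
    """Retry a flaky call with exponential backoff plus jitter.

    Individual failed attempts (faults) are absorbed here rather than
    propagating to the caller as a failure of the whole operation.
    """
    for attempt in range(max_attempts):
        try:
            return call_service()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of retries: the fault becomes a failure
            # Backoff 0.1s, 0.2s, 0.4s, ... plus jitter to avoid thundering herds
            time.sleep((2 ** attempt) * 0.1 + random.uniform(0, 0.05))
```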
Fault Tolerance vs Fault Resilience
Fault Tolerance: the user does not see any impact except for some delay while failover occurs.
Fault Resilience: the failure is observed by the user, but the rest of the system continues to function normally.
Types of Load on a Software System (Load Parameters)
- Requests per second to the server
- Ratio of reads vs writes to the DB
- Number of active users in a chat application
- Hit rate on a cache
Latency and Response Time
Latency and response time are often used synonymously, but they are not the same. The response time is what the client sees: besides the actual time to process the request (the service time), it includes network delays and queueing delays. Latency is the duration that a request is waiting to be handled, during which it is latent, awaiting service.
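The distinction can be made concrete with a toy single-server queue: each request's response time is its queueing delay (how long it sat waiting, i.e. latency in this strict sense) plus its service time. A minimal simulation sketch, where the arrival and service times are made-up numbers:

```python
# Toy single-server queue: response time = queueing delay + service time.
requests = [  # (arrival_time_s, service_time_s) -- illustrative values
    (0.0, 0.30),
    (0.1, 0.30),
    (0.2, 0.30),
]

server_free_at = 0.0
for i, (arrival, service) in enumerate(requests, start=1):
    start = max(arrival, server_free_at)      # waits if the server is busy
    queueing_delay = start - arrival          # "latent, awaiting service"
    response_time = queueing_delay + service  # what the client experiences
    server_free_at = start + service
    print(f"request {i}: queued {queueing_delay:.2f}s, "
          f"service {service:.2f}s, response {response_time:.2f}s")
```

Even though every request takes the same 0.30s of service time, the later requests see longer response times because they queue behind the earlier ones.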