Chapter 1 - Reliable, Scalable & Maintainable Applications Flashcards
What is reliability?
The system should continue to perform the correct functions at the desired level of performance even in the face of adversity, such as hardware faults, software faults or human error.
What is scalability?
As the system grows in data volume, traffic volume or complexity, there should be reasonable ways of dealing with that growth.
What is maintainability?
The ability for future engineers to both maintain current behaviour of a system and adapt it for new use cases
What is the difference between a fault and a failure?
A fault is one component of the system deviating from its spec. A failure is when the system as a whole stops providing the required service to the user. Fault tolerance mechanisms prevent faults from causing failures, which is what makes a system reliable.
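To make the fault/failure distinction concrete, here is a minimal Python sketch of one possible fault-tolerance mechanism: failing over between redundant replicas. The replica functions and the `read_with_failover` helper are hypothetical names invented for illustration; they are not from the book or any library.

```python
def read_with_failover(replicas, key):
    """Try each replica in turn; one faulty replica is not a system failure."""
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError as err:
            # A fault: this one component deviated from its spec.
            last_error = err
    # A failure: every replica faulted, so the service as a whole is down.
    raise RuntimeError("all replicas failed") from last_error


def faulty_replica(key):
    # Hypothetical replica that is always unreachable.
    raise ConnectionError("replica unreachable")


def healthy_replica(key):
    # Hypothetical replica backed by an in-memory dict.
    return {"user:1": "alice"}[key]


value = read_with_failover([faulty_replica, healthy_replica], "user:1")
```

Here the first replica faults, but the caller still gets an answer, so the fault never becomes a failure.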
What is data redundancy?
The practice of keeping data in two or more places within a database or data storage system.
What is RAID?
RAID stands for Redundant Array of Independent Disks. It virtually combines multiple physical disk drives into one or more logical units for the purpose of data redundancy and/or performance improvement at the hardware level.
What are load parameters?
Load parameters are numbers that describe the current load on the system. Which ones matter depends on the architecture: they may be requests per second to a web server, the ratio of reads to writes in a database, the hit rate on a cache, etc.
What is the difference between latency and response time?
Response time is what the client experiences: the time to process the request plus network delays and queueing delays. Latency is the duration a request spends waiting to be handled (e.g., network latency is how long a request takes to reach the server).
What are p50, p95, p99, p999 etc.?
They are percentiles of the response time distribution. 50% of requests are expected to have a response time less than or equal to p50 (the 50th percentile, or median). 95% of requests are expected to have a response time less than or equal to p95. Likewise, p99 is the 99th percentile and p999 is the 99.9th percentile.
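As a rough illustration of the card above, the Python sketch below computes percentiles from a list of measured response times using the nearest-rank method. The sample values and the `percentile` helper are made up for the example; monitoring systems often use other estimators, such as interpolation or streaming approximations.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds for ten requests.
response_times_ms = [12, 15, 11, 250, 14, 13, 16, 18, 900, 17]

p50 = percentile(response_times_ms, 50)  # 15: the typical request is fast
p95 = percentile(response_times_ms, 95)  # 900: the tail is far slower
```

The gap between p50 and p95 in this sample shows why a median alone can hide slow outliers.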
Why may p95 be viewed as more important than p50?
High percentiles are generally seen as more important because the slowest response times are typically experienced by the users with the most data. Those tend to be the most frequent users of the platform, and therefore the most valuable. Response time requirements should be defined in terms of high percentiles (p95, for example).
What does it mean for a system to be elastic?
An elastic system automatically adds computing resources when it detects an increase in load. Elasticity is well suited to systems with unpredictable load.
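One way to picture elasticity is as a rule that maps measured load to a desired instance count. The Python sketch below is only an illustration under assumed numbers: the function name, the per-instance target and the instance limits are all hypothetical, and real autoscalers add smoothing and cooldown periods.

```python
import math


def desired_instances(current_instances, load_per_instance,
                      target_per_instance, min_instances=1, max_instances=20):
    """Return how many instances would bring per-instance load to the target."""
    total_load = current_instances * load_per_instance
    needed = math.ceil(total_load / target_per_instance)
    # Clamp to the allowed range so the system never scales to zero
    # or grows without bound.
    return max(min_instances, min(max_instances, needed))


# 4 instances each handling 150 requests/s, with a target of 100 each.
scale_up = desired_instances(4, 150, 100)   # 6

# Load drops to 20 requests/s per instance.
scale_down = desired_instances(4, 20, 100)  # 1
```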