Terms Flashcards
Scalability and how to handle
Scalability refers to the system’s ability to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.
How to handle: Prefer horizontal scaling (adding more instances) and, in some cases, vertical scaling (adding CPU or memory to an existing machine). Use autoscaling and microservices to distribute the compute.
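The autoscaling idea can be sketched as a proportional rule: scale the instance count by the ratio of observed load to target load. This is a minimal illustration, not any particular platform's algorithm; the function name, the load units, and the `min_replicas`/`max_replicas` bounds are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas: int, current_load: float, target_load: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return an instance count that brings per-instance load back toward target.

    current_load and target_load are per-instance figures in the same unit
    (for example, CPU percent). All names and bounds here are hypothetical.
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    raw = math.ceil(current_replicas * current_load / target_load)
    # Clamp so a load spike or an idle period cannot scale without limit.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 instances running at 90% against a 60% target would be scaled out to 6 under this rule.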
Availability
Availability - Availability is the proportion of time a system is operational and accessible when needed.
How to handle: Aim for high availability through redundancy (e.g., active-passive clustering), failover mechanisms, and careful maintenance scheduling.
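The "proportion of time" definition and the effect of redundancy can be worked through with a little arithmetic. The sketch below assumes component failures are independent, which real systems only approximate; the function names are illustrative.

```python
import math


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of total time the system was operational."""
    return uptime_hours / (uptime_hours + downtime_hours)


def in_series(*components: float) -> float:
    """All components must be up, so availabilities multiply."""
    return math.prod(components)


def in_parallel(*components: float) -> float:
    """Redundant components: the system is down only if every one is down."""
    return 1.0 - math.prod(1.0 - a for a in components)


def downtime_hours_per_year(a: float) -> float:
    """Expected yearly downtime for availability a (365-day year)."""
    return (1.0 - a) * 365 * 24
```

Under these assumptions, two redundant 99% components give roughly 99.99%, while chaining the same two in series drops to roughly 98.01%. An availability of 99.9% corresponds to about 8.76 hours of downtime per year.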
Fault-tolerance
Fault-tolerance - Fault tolerance is the system’s ability to continue operating properly in the event of the failure of one or more of its components.
How to handle: Use replication (for data and services), redundancy, and self-healing architectures. Create a design with no single point of failure.
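One way to picture replication with no single point of failure is a client that tries each replica in turn and succeeds as long as any one of them answers. This is a minimal sketch; `call_with_failover` and the zero-argument replica callables are hypothetical stand-ins for real service clients.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result.

    Raises RuntimeError only if every replica fails.
    """
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")
```

A production version would usually add timeouts and health checks so that a slow replica is skipped rather than waited on.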
Reliability
Reliability refers to the probability that a system will function without failure over a specified period of time.
How to handle: Focus on designing fault-tolerant systems, implementing robust error handling, regular monitoring, and automated recovery processes. Use reliable communication channels and persistent storage.
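The "probability of functioning without failure over a period" can be made concrete under a common simplifying assumption: a constant failure rate, in which case reliability decays exponentially with time. The constant-rate model and the names below are assumptions for illustration, not a claim about any specific system.

```python
import math


def reliability(hours: float, mtbf_hours: float) -> float:
    """Probability of running `hours` without failure.

    Assumes a constant failure rate of 1 / mtbf_hours (exponential model).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return math.exp(-hours / mtbf_hours)
```

Under this model, a system run for a period equal to its mean time between failures has only about a 37% chance of completing that period without a failure, which is one reason automated recovery matters.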
Security
Security - Security refers to the system’s ability to protect against unauthorized access, vulnerabilities, and attacks.
How to handle: Implement authentication, authorization, encryption (both at rest and in transit), auditing, and regular security patches. Use secure development practices (e.g., OWASP), security reviews, and penetration testing.
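As one small piece of the authentication item, passwords are normally stored as salted, slow hashes rather than in plain text. The sketch below uses only the Python standard library; the function names are illustrative and the default iteration count is an assumption that should be checked against current guidance.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 600_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256.

    The iteration count is an assumed default, not a recommendation.
    """
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 600_000) -> bool:
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The constant-time comparison avoids leaking how many leading bytes matched through timing differences.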
Latency
Latency - Latency is the time delay between a request and the corresponding response. Low latency is crucial for real-time or near real-time applications.
How to handle: Minimize latency with caching (e.g., in-memory caches such as Redis), shorter network paths (e.g., Content Delivery Networks for global distribution), and optimized algorithms and queries (e.g., database indexing).
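The caching idea can be sketched with a small in-process cache that serves repeated reads from memory and only calls the slow backend on a miss or after expiry. This stands in for a shared cache such as Redis; the `TTLCache` class and its `loader` callback are hypothetical names for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache values for ttl_seconds; reload from the slow source after expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no call to the slow backend
        value = loader(key)  # cache miss or expired: pay the full latency once
        self._entries[key] = (now + self._ttl, value)
        return value
```

The time-to-live is the trade-off knob: a longer value means fewer slow lookups but staler data.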
Consistency
Consistency refers to the system ensuring that all users see the same data at the same time, regardless of which node serves the request.
How to handle: In distributed systems, choose between strong consistency (e.g., relational databases with ACID transactions) and eventual consistency (e.g., systems like Cassandra or DynamoDB, which can trade immediate consistency for availability, per the CAP theorem). Ensure data replication strategies are well defined.
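For replicated stores with tunable consistency, the strong-versus-eventual choice is often expressed through quorum sizes: if the number of replicas that acknowledge a write plus the number consulted on a read exceeds the replication factor, every read overlaps at least one replica holding the latest write. The function below is a minimal sketch of that rule; the parameter names are illustrative.

```python
def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """Quorum overlap rule: read and write sets intersect when W + R > N."""
    if not (1 <= write_acks <= replication_factor and 1 <= read_acks <= replication_factor):
        raise ValueError("acks must be between 1 and the replication factor")
    return write_acks + read_acks > replication_factor
```

With three replicas, writing to two and reading from two satisfies the rule, while writing to one and reading from one does not and yields eventual consistency only.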
Durability
Durability - Durability ensures that committed data is not lost and remains intact, even in case of system crashes or other failures.
How to handle: Use persistent storage with backups and replication strategies. Implement transactional systems where data is committed before being acknowledged as written.
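"Committed before being acknowledged" can be illustrated with an append-only log that forces each record to disk before returning to the caller. This is a minimal single-machine sketch; `durable_append` and the newline-delimited record format are assumptions for the example, and it does not replace backups or replication.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append a record and return only after it has been flushed to disk.

    Returning normally is the acknowledgement; a crash before that point
    means the caller never saw the write confirmed.
    """
    with open(path, "ab") as log:
        log.write(record + b"\n")
        log.flush()             # push Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the OS to push its cache to the device
```

Syncing on every write is slow, so real systems often batch several records into one sync.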
Fairness
Fairness - Fairness ensures that resources are allocated equitably among users or requests, preventing any one user or process from monopolizing system resources.
How to handle: Implement fair scheduling algorithms (e.g., fair queueing or round-robin) and rate limiting. Consider service level agreements (SLAs) to guarantee equitable access to resources for all users.
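Round-robin scheduling can be sketched as taking one pending request from each user in turn, so a user with a long backlog cannot starve the others. The `round_robin` function and the per-user request lists are hypothetical names for the example.

```python
from collections import deque
from typing import Deque, Dict, Iterator, List, Tuple


def round_robin(queues: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (user, request) pairs, serving one request per user per turn."""
    pending: Deque[Tuple[str, Deque[str]]] = deque(
        (user, deque(items)) for user, items in queues.items() if items
    )
    while pending:
        user, items = pending.popleft()
        yield user, items.popleft()
        if items:
            pending.append((user, items))  # back of the line for the next turn
```

Given user "a" with three requests and user "b" with one, the order served is a, b, a, a: user "b" waits for only one of "a"'s requests instead of all three.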