1.0 Scalability, Availability, Stability Patterns (Fun reading) Flashcards
How do you manage overload?
You don’t get to 10 million users without having interesting stories.
What is Immutability as the default in the context of system scalability, and why is it important?
Immutability as the default means that once data is created, it cannot be modified - only new versions can be created. In system scalability, this principle is crucial for several reasons:
When data is immutable, multiple processes or servers can read it simultaneously without worrying about concurrent modifications or race conditions. This significantly simplifies distributed systems as you don’t need complex locking mechanisms or worry about data consistency across nodes.
In distributed databases, immutability helps with versioning and audit trails. Instead of updating records in place, new versions are created with timestamps. This makes it easier to track changes, roll back to previous states, and maintain data consistency across distributed systems.
Immutable data structures also help with caching strategies. Since immutable data can’t change, cache invalidation becomes simpler - you don’t need to worry about cached data becoming stale due to modifications. You can cache aggressively and only update when new versions are created.
In event-sourcing architectures, immutability is fundamental. Events are stored as an immutable log, and the current state is derived from processing these events. This approach provides better scalability as events can be processed in parallel and distributed across multiple nodes.
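A minimal Java sketch of the idea (the UserProfile record and its fields are illustrative, not from the text): instead of mutating a record in place, you derive a new version and leave the old one untouched.

```java
import java.time.Instant;

// Immutability as the default: "updating" a record never mutates it;
// it returns a new version with a bumped version number and timestamp.
public record UserProfile(long userId, String displayName, long version, Instant createdAt) {

    // Instead of a setter, derive a new immutable version.
    public UserProfile withDisplayName(String newName) {
        return new UserProfile(userId, newName, version + 1, Instant.now());
    }

    public static void main(String[] args) {
        UserProfile v1 = new UserProfile(42L, "Alice", 1, Instant.now());
        UserProfile v2 = v1.withDisplayName("Alice B.");
        // v1 is unchanged: readers holding it never observe a concurrent
        // modification, and both versions stay available for audit or rollback.
        System.out.println(v1);
        System.out.println(v2);
    }
}
```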
What is Referential Transparency in scalable systems, and how does it benefit system design?
Referential Transparency means that an expression can be replaced with its value without changing the program’s behavior. In scalable systems, this property has several important implications:
Referential transparency makes systems more predictable and easier to reason about. When a function always produces the same output for the same input, regardless of where or when it’s called, it becomes easier to distribute computation across different servers or processes. This predictability is crucial for horizontal scaling.
In distributed systems, referentially transparent functions can be easily cached and memoized. If you know a function will always return the same result for the same input, you can cache these results across your system with confidence. This improves performance and reduces computational load.
For microservices architectures, referential transparency helps with service isolation and reliability. Services that maintain referential transparency are easier to test, debug, and deploy independently. They’re also easier to replicate across different instances since their behavior is consistent and predictable.
It also enables better fault tolerance strategies. Since operations are predictable, you can more easily implement retry mechanisms, fallbacks, and circuit breakers without worrying about side effects or inconsistent states.
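Since a referentially transparent function can always be replaced by its cached result, memoization is safe by construction. A minimal Java sketch (the Memoizer helper is illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class Memoizer {

    // Wrap a pure function so each result is computed once per input.
    // Safe only because the function has no side effects: swapping the
    // call for a cached value cannot change program behavior.
    public static <T, R> Function<T, R> memoize(Function<T, R> pure) {
        Map<T, R> cache = new ConcurrentHashMap<>();
        return input -> cache.computeIfAbsent(input, pure);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> square = memoize(x -> x * x);
        System.out.println(square.apply(12)); // computed
        System.out.println(square.apply(12)); // served from the cache
    }
}
```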
How does Laziness benefit scalable system design, and what are its implementation considerations?
Laziness in system design refers to delaying computation or data loading until absolutely necessary. This principle has significant implications for scalability:
Lazy evaluation helps conserve system resources by only computing what’s needed, when it’s needed. In distributed systems, this means you can handle larger datasets and more concurrent users because you’re not wasting resources on unused computations or unnecessary data loading.
For database systems, lazy loading helps with performance by only fetching related data when explicitly requested. This reduces initial query time, network bandwidth usage, and memory consumption. Instead of loading an entire object graph, you load only the immediate data and defer loading related entities until they’re accessed.
In microservices architectures, laziness can be implemented through patterns like the Virtual Proxy pattern, where expensive operations or remote service calls are deferred until necessary. This improves response times and reduces system load.
However, lazy evaluation requires careful implementation to avoid performance pitfalls. You need to balance the benefits of deferred computation against the potential cost of multiple smaller operations. You also need to handle edge cases where lazy loading might fail due to network issues or service unavailability.
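A minimal Java sketch of the Virtual Proxy idea (the Lazy wrapper and the "expensive load" are illustrative stand-ins for a database query or remote call): the load runs only on first access and the result is cached afterwards.

```java
import java.util.function.Supplier;

// Defers an expensive computation until first use (assumes a non-null result).
public class Lazy<T> {
    private final Supplier<T> loader;
    private volatile T value; // null until first access

    public Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    public T get() {
        T result = value;
        if (result == null) {                      // fast path: already loaded
            synchronized (this) {
                result = value;
                if (result == null) {              // double-check under the lock
                    value = result = loader.get(); // load on first use only
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Lazy<String> orders = new Lazy<>(() -> {
            System.out.println("expensive load happens now");
            return "order history";
        });
        System.out.println("proxy created, nothing loaded yet");
        System.out.println(orders.get()); // triggers the load
        System.out.println(orders.get()); // served from the cached value
    }
}
```

A production version would also decide what happens when the deferred load fails, for example because the backing service is unavailable.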
What does “Think about your data: Different data need different guarantees” mean in scalable systems?
This principle emphasizes that not all data in a system requires the same level of consistency, availability, or durability guarantees. Understanding these differences is crucial for efficient system design:
Different types of data have different consistency requirements:
Strong consistency might be necessary for financial transactions or user authentication
Eventual consistency might be acceptable for social media likes or view counts
Some data might require ACID properties while others can work with BASE properties
Data access patterns also influence the guarantees needed:
Frequently read, rarely written data might benefit from aggressive caching
Write-heavy data might need special consideration for consistency and durability
Time-sensitive data might require different caching strategies than static content
Storage solutions should match data requirements:
Critical data might need multi-region replication
Temporary data might be fine in-memory only
Some data might require full audit trails while others don’t
Different storage engines (document stores, key-value stores, relational databases) might be appropriate for different types of data
Understanding these distinctions helps in:
Choosing appropriate storage solutions
Implementing efficient caching strategies
Designing appropriate backup and recovery procedures
Balancing system resources effectively
Making appropriate trade-offs between consistency and availability
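As a toy illustration (all names here are hypothetical, not from the text), a system might tag each kind of data with the weakest guarantee it can tolerate and let the storage layer pick a strategy from that:

```java
// Different data need different guarantees: map each data kind to the
// weakest consistency level it can tolerate.
public class DataGuarantees {

    enum Guarantee { STRONGLY_CONSISTENT, EVENTUALLY_CONSISTENT, BEST_EFFORT }

    static Guarantee guaranteeFor(String dataKind) {
        return switch (dataKind) {
            case "payment", "auth-token"    -> Guarantee.STRONGLY_CONSISTENT;   // ACID territory
            case "like-count", "view-count" -> Guarantee.EVENTUALLY_CONSISTENT; // BASE is fine
            default                         -> Guarantee.BEST_EFFORT;           // e.g. ephemeral hints
        };
    }

    public static void main(String[] args) {
        for (String kind : new String[]{"payment", "like-count", "recently-viewed"}) {
            System.out.println(kind + " -> " + guaranteeFor(kind));
        }
    }
}
```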
Latency vs Throughput
Availability vs Consistency
In a centralized system (an RDBMS, etc.) there are no network partitions, i.e. no P in CAP.
So you get both:
Availability
Consistency
What will we choose in a distributed system?
BASE: Basically Available, Soft State, Eventually Consistent
In a distributed system we will have network partitions, i.e. the P in CAP.
So you can only pick one:
Availability
Consistency
Failover
Fail-back
Network Failover
Replication
Active Replication - Push
Passive Replication - Pull
Master-Slave Replication
Tree Replication
Master-Master Replication
Buddy Replication
Scalability Patterns: State
Partitioning
HTTP Caching: CDN, Akamai
HTTP Caching: General Static Content
HTTP Caching: First Request
HTTP Caching: Subsequent Request
Service of Record (SoR)
How to scale out an RDBMS?
Sharding: Partitioning
Sharding: Replication
ORM + rich domain model anti-pattern
Think about your data
Think again
When is an RDBMS not good enough?
Scaling writes to an RDBMS is impossible
NOSQL
(Not Only SQL)
Who’s BASE?
NOSQL in the Wild
Chord & Pastry
Node ring with Consistent Hashing
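A minimal sketch of a consistent-hashing node ring in Java (virtual-node count, hash choice, and node names are illustrative): nodes and keys hash onto the same ring, each key is owned by the first node clockwise from its hash, and adding or removing a node only remaps the keys between it and its predecessor.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private static final int VNODES = 100; // virtual nodes smooth the key distribution

    void addNode(String node) {
        for (int i = 0; i < VNODES; i++) ring.put(hash(node + "#" + i), node);
    }

    String nodeFor(String key) {
        // First node clockwise from the key's position; wrap around at the end.
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(key));
        return owner != null ? owner.getValue() : ring.firstEntry().getValue();
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff); // first 8 digest bytes
            return h;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addNode("node-a");
        ring.addNode("node-b");
        ring.addNode("node-c");
        System.out.println("user:42 lives on " + ring.nodeFor("user:42"));
    }
}
```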
Bigtable
Dynamo
Types of NOSQL stores
Distributed Caching
Write-Through
Write-Behind
Eviction Policies
Peer-To-Peer
Products: Memcached
Data Grids/Clustering: Parallel data storage
Data Grids/Clustering: More Products
Concurrency
What is Concurrency?
Shared-State Concurrency
Problems with locks:
Please use java.util.concurrent.*
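For instance (a minimal sketch, not from the source), a counter shared across many threads needs no explicit locks at all when built on java.util.concurrent primitives:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class SharedCounter {
    public static void main(String[] args) throws InterruptedException {
        LongAdder hits = new LongAdder();       // contention-friendly counter
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 10_000; i++) {
            pool.execute(hits::increment);      // no synchronized, no wait/notify
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("hits = " + hits.sum()); // 10000
    }
}
```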
Message-Passing Concurrency
More Actors
Dataflow Concurrency
Software Transactional Memory
STM: Overview
STM: restrictions
STM libs for the JVM
Scalability Patterns: Behaviour
Event-Driven Architecture
Domain
Domain Events
Event Sourcing
Command and Query Responsibility Segregation (CQRS) pattern
CQRS in a nutshell
CQRS: Benefits
Event Stream Processing
Products
Messaging
Publish-Subscribe
Point-to-Point
Store-and-Forward
Request-Reply
Messaging
ESB
ESB Products
Compute Grids
Parallel execution
Reverse Proxies: Apache mod_proxy (OSS), HAProxy, Squid, Nginx (OSS)
Load Balancing
Parallel Computing
SPMD Pattern
Master/Worker
Loop Parallelism
What if task creation can’t be handled by:
parallelizing loops (Loop Parallelism)
putting them on work queues (Master/Worker)
Fork/Join
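A minimal Java Fork/Join sketch (threshold and array size are illustrative): a task recursively splits itself until chunks are small enough to compute directly, and the pool's work-stealing scheduler balances the pieces across cores.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {            // small enough: sum directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;             // otherwise: split in half
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                             // run left half asynchronously
        return right.compute() + left.join();    // compute right, then join left
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        Arrays.fill(data, 1L);
        long sum = ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum); // 1000000
    }
}
```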
Scalability Patterns
Circuit Breaker
Fail Fast
Bulkheads
Steady State
Throttling
Client-side Consistency
Server-side Consistency