1.0 Scalability, Availability, Stability, Patterns (Fun Reading) Flashcards

1
Q

How do you manage overload?

A

You don’t get to 10 million users without having interesting stories.

2
Q

What is Immutability as the default in the context of system scalability, and why is it important?

A

Immutability as the default means that once data is created, it cannot be modified - only new versions can be created. In system scalability, this principle is crucial for several reasons:
When data is immutable, multiple processes or servers can read it simultaneously without worrying about concurrent modifications or race conditions. This significantly simplifies distributed systems: readers need no locking, and there are no in-place updates whose consistency must be coordinated across nodes.
In distributed databases, immutability helps with versioning and audit trails. Instead of updating records in place, new versions are created with timestamps. This makes it easier to track changes, roll back to previous states, and maintain data consistency across distributed systems.
Immutable data structures also help with caching strategies. Since immutable data can’t change, cache invalidation becomes simpler - you don’t need to worry about cached data becoming stale due to modifications. You can cache aggressively and only update when new versions are created.
In event-sourcing architectures, immutability is fundamental. Events are stored as an immutable log, and the current state is derived from processing these events. This approach provides better scalability as events can be processed in parallel and distributed across multiple nodes.
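A minimal Java sketch of the append-only idea (the Deposited event and account names are illustrative, not from the source): state is derived by replaying immutable events rather than updating records in place.

```java
import java.util.List;

// Minimal event-sourcing sketch: events are immutable records,
// and state is derived by replaying them, never by in-place updates.
public class EventSourcingSketch {

    // Immutable event: once created, a Deposited event never changes.
    record Deposited(String accountId, long amountCents) {}

    // Derive the current balance by folding over the immutable event log.
    static long balanceOf(String accountId, List<Deposited> log) {
        return log.stream()
                  .filter(e -> e.accountId().equals(accountId))
                  .mapToLong(Deposited::amountCents)
                  .sum();
    }

    public static void main(String[] args) {
        // "Updating" a balance means appending a new event, not mutating old data.
        List<Deposited> log = List.of(
            new Deposited("acct-1", 10_00),
            new Deposited("acct-1", 5_00)
        );
        System.out.println(balanceOf("acct-1", log)); // 1500
    }
}
```

Because the log entries never change, they can be cached, replicated, and processed in parallel without coordination.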

3
Q

What is Referential Transparency in scalable systems, and how does it benefit system design?

A

Referential Transparency means that an expression can be replaced with its value without changing the program’s behavior. In scalable systems, this property has several important implications:
Referential transparency makes systems more predictable and easier to reason about. When a function always produces the same output for the same input, regardless of where or when it’s called, it becomes easier to distribute computation across different servers or processes. This predictability is crucial for horizontal scaling.
In distributed systems, referentially transparent functions can be easily cached and memoized. If you know a function will always return the same result for the same input, you can cache these results across your system with confidence. This improves performance and reduces computational load.
For microservices architectures, referential transparency helps with service isolation and reliability. Services that maintain referential transparency are easier to test, debug, and deploy independently. They’re also easier to replicate across different instances since their behavior is consistent and predictable.
It also enables better fault tolerance strategies. Since operations are predictable, you can more easily implement retry mechanisms, fallbacks, and circuit breakers without worrying about side effects or inconsistent states.
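A small Java sketch of the caching benefit (the memoize helper is a hypothetical illustration, not a library API): because the wrapped function is referentially transparent, cached results can never be stale.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// A referentially transparent function always returns the same output
// for the same input, so its results can be cached with confidence.
public class MemoSketch {

    // Wrap any pure function with a thread-safe memoizing cache.
    static <A, B> Function<A, B> memoize(Function<A, B> pure) {
        Map<A, B> cache = new ConcurrentHashMap<>();
        return a -> cache.computeIfAbsent(a, pure);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> square = memoize(x -> x * x); // pure, so safe to cache
        System.out.println(square.apply(7)); // computed once
        System.out.println(square.apply(7)); // served from the cache
    }
}
```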

4
Q

How does Laziness benefit scalable system design, and what are its implementation considerations?

A

Laziness in system design refers to delaying computation or data loading until absolutely necessary. This principle has significant implications for scalability:
Lazy evaluation helps conserve system resources by only computing what’s needed, when it’s needed. In distributed systems, this means you can handle larger datasets and more concurrent users because you’re not wasting resources on unused computations or unnecessary data loading.
For database systems, lazy loading helps with performance by only fetching related data when explicitly requested. This reduces initial query time, network bandwidth usage, and memory consumption. Instead of loading an entire object graph, you load only the immediate data and defer loading related entities until they’re accessed.
In microservices architectures, laziness can be implemented through patterns like the Virtual Proxy pattern, where expensive operations or remote service calls are deferred until necessary. This improves response times and reduces system load.
However, lazy evaluation requires careful implementation to avoid performance pitfalls. You need to balance the benefits of deferred computation against the potential cost of multiple smaller operations. You also need to handle edge cases where lazy loading might fail due to network issues or service unavailability.
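As a minimal sketch of lazy evaluation in Java (a simplified, deliberately non-thread-safe Virtual Proxy; the names are illustrative): the expensive work is deferred until first access and then reused.

```java
import java.util.function.Supplier;

// Laziness sketch: defer an expensive computation until first use,
// then reuse the result (a simple, non-thread-safe virtual proxy).
public class LazySketch {

    static <T> Supplier<T> lazy(Supplier<T> expensive) {
        return new Supplier<T>() {
            private T value;
            private boolean computed = false;
            public T get() {
                if (!computed) {            // pay the cost only on first access
                    value = expensive.get();
                    computed = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<String> profile = lazy(() -> {
            System.out.println("loading related entities...");
            return "full object graph";
        });
        // Nothing is loaded until the data is actually needed:
        System.out.println(profile.get()); // triggers the load
        System.out.println(profile.get()); // reuses the result
    }
}
```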

5
Q

What does “Think about your data: Different data need different guarantees” mean in scalable systems?

A

This principle emphasizes that not all data in a system requires the same level of consistency, availability, or durability guarantees. Understanding these differences is crucial for efficient system design:
Different types of data have different consistency requirements:

Strong consistency might be necessary for financial transactions or user authentication
Eventual consistency might be acceptable for social media likes or view counts
Some data might require ACID properties while others can work with BASE properties

Data access patterns also influence the guarantees needed:

Frequently read, rarely written data might benefit from aggressive caching
Write-heavy data might need special consideration for consistency and durability
Time-sensitive data might require different caching strategies than static content

Storage solutions should match data requirements:

Critical data might need multi-region replication
Temporary data might be fine in-memory only
Some data might require full audit trails while others don’t
Different storage engines (document stores, key-value stores, relational databases) might be appropriate for different types of data

Understanding these distinctions helps in:

Choosing appropriate storage solutions
Implementing efficient caching strategies
Designing appropriate backup and recovery procedures
Balancing system resources effectively
Making appropriate trade-offs between consistency and availability
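A toy Java sketch of this decision (the data categories and the Consistency enum are illustrative assumptions, not from the deck): the point is simply that the required guarantee is chosen per kind of data, not as one global setting.

```java
// Sketch: different data need different guarantees.
public class GuaranteeSketch {

    enum Consistency { STRONG, EVENTUAL }

    // Pick the guarantee per kind of data, not one global setting.
    static Consistency requiredFor(String dataKind) {
        return switch (dataKind) {
            case "payment", "auth-session"  -> Consistency.STRONG;   // correctness first
            case "like-count", "view-count" -> Consistency.EVENTUAL; // availability first
            default                         -> Consistency.EVENTUAL;
        };
    }

    public static void main(String[] args) {
        System.out.println(requiredFor("payment"));    // STRONG
        System.out.println(requiredFor("like-count")); // EVENTUAL
    }
}
```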

8
Q

Latency vs Throughput

A

Latency is the time it takes to complete a single operation; throughput is the number of operations completed per unit of time. You should strive for maximal throughput with acceptable latency. For example, if each request takes 100 ms and the server handles 50 requests concurrently, throughput is roughly 50 / 0.1 s = 500 requests per second.
9
Q

Availability vs Consistency

A

In a centralized system (an RDBMS, etc.) we don’t have network partitions, i.e., no P in CAP.

So you get both:

Availability
Consistency

10
Q

What will we choose in a distributed system?

Basically Available, Soft State, Eventually Consistent

BASE

A

In a distributed system we will have network partitions, i.e., the P in CAP.
So you can only pick one of:
Availability
Consistency

13
Q

Failover

A

Switching to a redundant or standby component (server, network path, database, etc.) when the primary fails, so the service keeps running with minimal interruption.
14
Q

Fail-back

A

Restoring operation to the original primary component once it has recovered after a failover.
15
Q

Network Failover

A

Failover applied at the network level: traffic is rerouted around a failed link or node, for example via redundant network paths, IP address takeover, or load-balancer/DNS rerouting.
16
Q

Replication

A

Keeping copies of the same data on multiple nodes, so the system can tolerate node failures and serve more read traffic.
17
Q

Replication
Active Replication - Push
Passive replication - Pull

A

Active replication (push): the node that accepts a write pushes the update out to the replicas. Passive replication (pull): replicas pull changes from the primary, e.g., by polling for updates.
18
Q

Replication
Master-slave replication
Tree Replication

A

Master-slave replication: one master receives all writes and replicates them to its slaves. Tree replication: replicas form a tree in which each node replicates to its children, spreading the replication fan-out across levels.
19
Q

Master Slave Replication

A

All writes go to a single master, which propagates changes to read-only slaves; reads scale out across the slaves, but the master remains a single point of failure for writes.
20
Q

Master Master Replication

A

Two or more masters each accept writes and replicate to one another. This improves write availability, but concurrent updates to the same data can conflict and must be resolved.
21
Q

Buddy Replication

A

Each node replicates its state to one or a few designated “buddy” nodes instead of to the whole cluster, reducing replication overhead while still surviving the loss of a single node.
22
Q

Scalability Patterns: State

A

The state-focused scalability patterns covered in this deck: partitioning, HTTP caching, RDBMS sharding, NOSQL, distributed caching, data grids, and concurrency.
23
Q

Partitioning

A

Splitting data or workload across multiple nodes so that each node is responsible for only a subset, letting the system scale beyond the capacity of a single machine.
24
Q

HTTP Caching
CDN, Akamai

A

Cache HTTP responses, especially static content, as close to the user as possible. A CDN such as Akamai serves cached copies from geographically distributed edge servers, reducing latency and offloading the origin servers.
Remaining cards in this deck (titles only):

24. HTTP Caching
25. General Static Content
26. HTTP Caching: First Request
27. HTTP Caching: Subsequent Request
28. Service of Record (SoR)
29. How to scale out an RDBMS?
30. Sharding: Partitioning
31. Sharding: Replication
32. ORM + rich domain model anti-pattern
33. Think about your data. Think again.
34. When is an RDBMS not good enough? Scaling writes to an RDBMS is impossible
36. NOSQL (Not Only SQL)
38. Who’s BASE?
39. NOSQL in the Wild
40. Chord & Pastry
41. Node ring with Consistent Hashing
42. Bigtable
43. Dynamo
44. Types of NOSQL stores
45. Distributed Caching
46. Write-through
47. Write-behind
48. Eviction Policies
50. Peer-To-Peer
51. Distributed Caching Products
52. Memcached
53. Data Grids/Clustering: Parallel data storage
54. Data Grids/Clustering: More Products
55. Concurrency: What is Concurrency?
56. Shared-State Concurrency
57. Shared-State Concurrency: Problems with locks
58. Shared-State Concurrency: Please use java.util.concurrent.*
59. Message-Passing Concurrency
60. Message-Passing Concurrency: More Actors
61. More Actors
62. Dataflow Concurrency
62. Software Transactional Memory
63. STM: Overview
64. STM: Restrictions
65. STM libs for the JVM
66. Scalability Patterns: Behaviour
67. Event-Driven Architecture
68. Event-Driven Architecture
69. Domain
70. Domain Events
71. Domain Events
72. Event Sourcing
73. Event Sourcing
74. Command and Query Responsibility Segregation (CQRS) pattern
75. CQRS in a nutshell
76. CQRS
77. CQRS: Benefits
78. Event Stream Processing
79. Event Stream Processing Products
80. Messaging
81. Publish-Subscribe
82. Point-to-Point
83. Store-Forward
84. Request-Reply
85. Messaging
86. ESB
87. ESB Products
88. Compute Grids: Parallel execution
89. Load Balancing with Reverse Proxies: Apache mod_proxy (OSS), HAProxy, Squid, Nginx (OSS)
91. Parallel Computing
92. SPMD Pattern
93. Master/Worker
94. Loop Parallelism
95. What if task creation can’t be handled by parallelizing loops (Loop Parallelism) or putting tasks on work queues (Master/Worker)?
96. Fork/Join
97. Scalability Patterns
98. Circuit Breaker
99. Fail Fast
100. Bulkheads
101. Steady State
102. Throttling
103. Client-side Consistency
104. Server-side Consistency