1.0 Scalability, Availability, Stability, Patterns (Fun reading) Flashcards

1
Q

How do you manage overload?

A

You don’t get to 10 million users without having interesting stories

2
Q

What is Immutability as the default in the context of system scalability, and why is it important?

A

Immutability as the default means that once data is created, it cannot be modified - only new versions can be created. In system scalability, this principle is crucial for several reasons:
When data is immutable, multiple processes or servers can read it simultaneously without worrying about concurrent modifications or race conditions. This significantly simplifies distributed systems as you don’t need complex locking mechanisms or worry about data consistency across nodes.
In distributed databases, immutability helps with versioning and audit trails. Instead of updating records in place, new versions are created with timestamps. This makes it easier to track changes, roll back to previous states, and maintain data consistency across distributed systems.
Immutable data structures also help with caching strategies. Since immutable data can’t change, cache invalidation becomes simpler - you don’t need to worry about cached data becoming stale due to modifications. You can cache aggressively and only update when new versions are created.
In event-sourcing architectures, immutability is fundamental. Events are stored as an immutable log, and the current state is derived from processing these events. This approach provides better scalability as events can be processed in parallel and distributed across multiple nodes.
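The versioning idea above can be sketched in a few lines. This is an illustrative Python sketch, not from the source deck; the store and record names are made up. Updates append a new timestamped version instead of mutating in place, so older versions stay readable for audit trails and rollback:

```python
from dataclasses import dataclass, field
from time import time

@dataclass(frozen=True)  # frozen=True makes instances immutable
class Record:
    key: str
    value: str
    version: int
    created_at: float = field(default_factory=time)

class VersionedStore:
    """Append-only store: updates create new versions, never mutate old ones."""
    def __init__(self):
        self._versions = {}  # key -> list of Record, oldest first

    def put(self, key, value):
        history = self._versions.setdefault(key, [])
        record = Record(key, value, version=len(history) + 1)
        history.append(record)
        return record

    def get(self, key, version=None):
        """Latest version by default; any past version on request."""
        history = self._versions[key]
        return history[-1] if version is None else history[version - 1]

store = VersionedStore()
store.put("user:1", "Alice")
store.put("user:1", "Alicia")
assert store.get("user:1").value == "Alicia"            # latest version
assert store.get("user:1", version=1).value == "Alice"  # audit trail intact
```

Because a `Record` can never change after creation, readers on any node can hold a reference to it without locks or invalidation.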

3
Q

What is Referential Transparency in scalable systems, and how does it benefit system design?

A

Referential Transparency means that an expression can be replaced with its value without changing the program’s behavior. In scalable systems, this property has several important implications:
Referential transparency makes systems more predictable and easier to reason about. When a function always produces the same output for the same input, regardless of where or when it’s called, it becomes easier to distribute computation across different servers or processes. This predictability is crucial for horizontal scaling.
In distributed systems, referentially transparent functions can be easily cached and memoized. If you know a function will always return the same result for the same input, you can cache these results across your system with confidence. This improves performance and reduces computational load.
For microservices architectures, referential transparency helps with service isolation and reliability. Services that maintain referential transparency are easier to test, debug, and deploy independently. They’re also easier to replicate across different instances since their behavior is consistent and predictable.
It also enables better fault tolerance strategies. Since operations are predictable, you can more easily implement retry mechanisms, fallbacks, and circuit breakers without worrying about side effects or inconsistent states.
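Since a referentially transparent function always returns the same output for the same input, its results can be memoized safely on any node. A minimal Python sketch; the function name and pricing formula are hypothetical, not from the source:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def shipping_cost(weight_kg: float, zone: int) -> float:
    # Pure function: output depends only on the arguments,
    # so caching the result anywhere in the system is safe.
    return round(2.5 + weight_kg * 0.8 * zone, 2)

first = shipping_cost(3.0, 2)   # computed
second = shipping_cost(3.0, 2)  # served from the cache
assert first == second == 7.3
assert shipping_cost.cache_info().hits == 1  # second call was a cache hit
```

The same property is what makes retries harmless: re-invoking the function after a failure cannot produce a different or inconsistent result.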

4
Q

How does Laziness benefit scalable system design, and what are its implementation considerations?

A

Laziness in system design refers to delaying computation or data loading until absolutely necessary. This principle has significant implications for scalability:
Lazy evaluation helps conserve system resources by only computing what’s needed, when it’s needed. In distributed systems, this means you can handle larger datasets and more concurrent users because you’re not wasting resources on unused computations or unnecessary data loading.
For database systems, lazy loading helps with performance by only fetching related data when explicitly requested. This reduces initial query time, network bandwidth usage, and memory consumption. Instead of loading an entire object graph, you load only the immediate data and defer loading related entities until they’re accessed.
In microservices architectures, laziness can be implemented through patterns like the Virtual Proxy pattern, where expensive operations or remote service calls are deferred until necessary. This improves response times and reduces system load.
However, lazy evaluation requires careful implementation to avoid performance pitfalls. You need to balance the benefits of deferred computation against the potential cost of multiple smaller operations. You also need to handle edge cases where lazy loading might fail due to network issues or service unavailability.
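The Virtual Proxy pattern mentioned above can be sketched as follows; the class and the stand-in data source are illustrative, not from the source:

```python
class LazyOrders:
    """Virtual Proxy sketch: the expensive fetch is deferred until first access."""
    def __init__(self, user_id, fetch):
        self._user_id = user_id
        self._fetch = fetch        # e.g. a database query or remote call
        self._orders = None
        self.fetch_count = 0       # exposed here only to show laziness

    def get(self):
        if self._orders is None:   # load only when actually needed
            self.fetch_count += 1
            self._orders = self._fetch(self._user_id)
        return self._orders

# Stand-in for a slow data source (hypothetical):
proxy = LazyOrders(42, lambda uid: [f"order-{uid}-1", f"order-{uid}-2"])
assert proxy.fetch_count == 0                          # nothing loaded yet
assert proxy.get() == ["order-42-1", "order-42-2"]     # triggers the fetch
proxy.get()
assert proxy.fetch_count == 1                          # cached, not re-fetched
```

Note the failure-handling caveat from the answer above: in a real system `_fetch` can fail at access time, long after the proxy was created, so callers must be prepared for errors where they would not expect them with eager loading.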

5
Q

What does “Think about your data: Different data need different guarantees” mean in scalable systems?

A

This principle emphasizes that not all data in a system requires the same level of consistency, availability, or durability guarantees. Understanding these differences is crucial for efficient system design:
Different types of data have different consistency requirements:

Strong consistency might be necessary for financial transactions or user authentication
Eventual consistency might be acceptable for social media likes or view counts
Some data might require ACID properties while others can work with BASE properties

Data access patterns also influence the guarantees needed:

Frequently read, rarely written data might benefit from aggressive caching
Write-heavy data might need special consideration for consistency and durability
Time-sensitive data might require different caching strategies than static content

Storage solutions should match data requirements:

Critical data might need multi-region replication
Temporary data might be fine in-memory only
Some data might require full audit trails while others don’t
Different storage engines (document stores, key-value stores, relational databases) might be appropriate for different types of data

Understanding these distinctions helps in:

Choosing appropriate storage solutions
Implementing efficient caching strategies
Designing appropriate backup and recovery procedures
Balancing system resources effectively
Making appropriate trade-offs between consistency and availability
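One way to make these distinctions explicit is a per-dataset policy table that the rest of the system consults. A minimal sketch, with made-up dataset names and guarantee levels:

```python
# Illustrative policy table: dataset names and levels are invented here,
# but the shape mirrors the distinctions above (consistency vs. storage needs).
POLICIES = {
    "payments":    {"consistency": "strong",   "storage": "multi-region RDBMS"},
    "like_counts": {"consistency": "eventual", "storage": "replicated KV store"},
    "sessions":    {"consistency": "eventual", "storage": "in-memory cache"},
}

def guarantees_for(dataset: str) -> dict:
    """Look up the guarantees a dataset was designed for."""
    return POLICIES[dataset]

assert guarantees_for("payments")["consistency"] == "strong"
assert guarantees_for("sessions")["storage"] == "in-memory cache"
```

Writing the policy down this way forces the trade-off conversation per dataset instead of applying one blanket guarantee (and its cost) to everything.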

6
Q
A
7
Q
A
8
Q

Latency vs Throughput

A
9
Q

Availability vs Consistency

A

In a centralized system (an RDBMS, etc.) we don’t have network partitions, i.e. no P in CAP.

So you get both:

Availability
Consistency

10
Q

What will we choose in a distributed system?

Basically Available, Soft state, Eventually consistent

BASE

A

In a distributed system we will have network partitions, i.e. the P in CAP.
So you can only pick one of:
Availability
Consistency

11
Q
A
12
Q
A
13
Q

Failover

A
14
Q

Fail-back

A
15
Q

Network Failover

A
16
Q

Replication

A
17
Q

Replication
Active Replication - Push
Passive Replication - Pull

A
18
Q

Replication
Master-slave replication
Tree Replication

A
19
Q

Master-Slave Replication

A
20
Q

Master-Master Replication

A
21
Q

Buddy Replication

A
22
Q

Scalability Patterns: State

A
23
Q

Partitioning

A
24
Q

HTTP Caching
CDN, Akamai

A
24
Q

HTTP Caching

A
25
Q

General Static Content

A
26
Q

HTTP Caching
First Request

A
27
Q

HTTP Caching
Subsequent Request

A
28
Q

Service of Record
SoR

A
29
Q

How to scale out RDBMS?

A
30
Q

Sharding: Partitioning

A
31
Q

Sharding: Replication

A
32
Q

ORM + rich domain model anti-pattern

A
33
Q

Think about your data
Think again

A
34
Q

When is an RDBMS not good enough?
Scaling writes to an RDBMS is impossible

A
35
Q
Yes, but many times we don't. Why?
A
36
Q

NOSQL
(Not Only SQL)

A
37
Q
A
38
Q

Who’s BASE?

A
39
Q

NOSQL in the Wild

A
40
Q

Chord & Pastry

A
41
Q

Node ring with Consistent Hashing

A
42
Q

Bigtable

43
Q

Dynamo

44
Q

Types of NOSQL stores

45
Q

Distributed Caching

46
Q

Write through

47
Q

Write-behind

48
Q

Eviction Policies

50
Q

Peer-To-Peer

51
Q

Distributed Caching
Products

52
Q

Memcached

53
Q

Data Grids/Clustering
Parallel data storage

54
Q

Data Grids/Clustering
More Products

55
Q

Concurrency
What is Concurrency?

56
Q

Shared-state Concurrency

57
Q

Shared-State Concurrency

Problems with locks:

58
Q

Shared-State Concurrency
Please use java.util.concurrent.*

59
Q

Message-Passing Concurrency

60
Q

Message-Passing
More Actors

61
Q

More Actors

62
Q

Dataflow Concurrency

62
Q

Software Transactional Memory

63
Q

STM: Overview

64
Q

STM: restrictions

65
Q

STM libs for the JVM

66
Q

Scalability Patterns: Behaviour

67
Q

Event-Driven Architecture

68
Q

Event-Driven Architecture

69
Q

Domain

70
Q

Domain Events

71
Q

Domain Events

72
Q

Event Sourcing

73
Q

Event Sourcing

74
Q

Command and Query Responsibility Segregation (CQRS) pattern

75
Q

CQRS in a nutshell

77
Q

CQRS: Benefits

78
Q

Event Stream Processing

79
Q

Event Stream Processing
Products

80
Q

Messaging

81
Q

Publish-Subscribe

82
Q

Point-to-Point

83
Q

Store-and-Forward

84
Q

Request-Reply

85
Q

Messaging

87
Q

ESB Products

88
Q

Compute Grids
Parallel execution

89
Q

Reverse Proxies: Apache mod_proxy (OSS), HAProxy, Squid, Nginx (OSS)

Load Balancing

91
Q

Parallel Computing

92
Q

SPMD Pattern

93
Q

Master/Worker

94
Q

Loop Parallelism

95
Q

What if task creation can’t be handled by:
parallelizing loops (Loop Parallelism)
putting them on work queues (Master/Worker)

96
Q

Fork/Join

97
Q

Scalability Patterns

98
Q

Circuit Breaker

99
Q

Fail Fast

100
Q

Bulkheads

101
Q

Steady State

102
Q

Throttling

103
Q

Client-side Consistency

104
Q

Server-side Consistency