System Design Question Flashcards
What is scaling? What are vertical scaling and horizontal scaling, what is the difference between them, and what are their advantages and disadvantages?
[1] Scalability
Scalability is the property of a system to handle a growing amount of work by adding resources to the system.
[2] Vertical scaling vs horizontal scaling
Vertical scaling (aka scaling up) describes adding additional resources to a system so that it meets demand. How is this different from horizontal scaling? While horizontal scaling refers to adding additional nodes, vertical scaling describes adding more power to your current machines.
One of the fundamental differences between the two is that horizontal scaling requires breaking a sequential piece of logic into smaller pieces so that they can be executed in parallel across multiple machines. In many respects, vertical scaling is easier because the logic really doesn’t need to change. Rather, you’re just running the same code on higher-spec machines. However, there are many other factors to consider when determining the appropriate approach.
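A toy Python sketch of that difference, using a local process pool as a stand-in for separate machines (all names here are illustrative, not from any particular framework): the sequential sum is restructured into independent chunks that can run in parallel, and the partial results have to be merged at the end.

```python
# Horizontal vs. vertical scaling in miniature: the sequential sum is
# split into chunks that run in parallel, with a local process pool
# standing in for separate machines.
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk):
    # Each "node" handles only its own slice of the data.
    return sum(x * x for x in chunk)

def run_horizontally(data, workers=4):
    chunk_size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))  # merge partial results

def run_vertically(data):
    # The "scaled up" version: identical sequential code, just run on a
    # machine with a faster CPU and more RAM.
    return sum(x * x for x in data)

if __name__ == "__main__":
    data = list(range(1_000_000))
    assert run_horizontally(data) == run_vertically(data)
```

Note that only the horizontal version required restructuring the logic; the vertical version runs unchanged on a higher-spec machine.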
How to scale a database horizontally and vertically?
[1] Horizontal scaling of DB
In a database world, horizontal scaling is usually based on the partitioning of data (each node only contains part of the data).
[2] Vertical Scaling of DB
In vertical scaling, the data lives on a single node and scaling is done through multi-core, e.g. spreading the load between the CPU and RAM resources of the machine.
How to measure scalability?
The scalability of an application can be measured by the number of requests it can effectively support simultaneously. The point at which an application can no longer handle additional requests effectively is the limit of its scalability. This limit is reached when a critical hardware resource runs out, requiring different or more machines. Scaling these resources can include any combination of adjustments to CPU and physical memory (different or more machines), hard disk (bigger hard drives, less “live” data, solid state drives), and/or the network bandwidth (multiple network interface controllers, bigger NICs, fiber, etc.).
Scaling horizontally and scaling vertically are similar in that they both involve adding computing resources to your infrastructure. There are distinct differences between the two in terms of implementation and performance.
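A rough sketch of how that limit could be probed, assuming a hypothetical health endpoint at http://localhost:8080/health: ramp up the number of concurrent requests and watch where the success rate or latency starts to degrade.

```python
# Crude load probe: send increasing levels of concurrent requests and
# report success rate and average latency at each level.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def timed_request(url):
    start = time.perf_counter()
    try:
        urllib.request.urlopen(url, timeout=5).read()
        return time.perf_counter() - start
    except Exception:
        return None  # treat any failure or timeout as a dropped request

def measure(url, concurrency):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_request, [url] * concurrency))
    ok = [r for r in results if r is not None]
    success_rate = len(ok) / concurrency
    avg_latency = sum(ok) / len(ok) if ok else float("inf")
    return success_rate, avg_latency

if __name__ == "__main__":
    for level in (10, 50, 100, 200):
        rate, latency = measure("http://localhost:8080/health", level)
        print(f"{level:>4} concurrent: {rate:.0%} ok, avg {latency:.3f}s")
```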
Where do servers come from?
[1] Provisioned within your own company’s data center
[2] Cloud services (Amazon EC2, Google Compute Engine, Azure VMs)
[3] Fully managed ‘serverless’ services (Lambda, Kinesis, Athena)
What is a ‘failover’?
Failover is a backup operational mode in which the functions of a system component are assumed by a secondary component when the primary component becomes unavailable – either through failure or scheduled downtime.
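A toy sketch of the mechanism: a monitor checks heartbeats from the primary and promotes the standby after several consecutive misses. Real cluster managers add quorum, fencing, and client redirection; all names below are made up for the example.

```python
# Heartbeat-based failover in miniature: promote the standby after
# max_missed consecutive failed health checks on the primary.
import time

class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def heartbeat_ok(self):
        # In a real system this would be a network health check.
        return self.healthy

def monitor(primary, standby, max_missed=3, interval=1.0):
    missed = 0
    while missed < max_missed:
        missed = 0 if primary.heartbeat_ok() else missed + 1
        time.sleep(interval)
    print(f"failover: promoting {standby.name} to primary")
    return standby

active = monitor(Node("db-primary", healthy=False), Node("db-standby"))
```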
What is cold standby failover?
[1] Secondary node behavior
Cold standby is a redundancy method where you have a standby node (as a backup) for the primary one. This node is only used during a primary failure; the rest of the time the cold standby node is shut down and is only used to load a backup when needed. A cold failover is therefore a manual, and consequently delayed, switch-over to the secondary system: the delay comes from the time it takes to “spin up” the cold system, and some data can be lost in the meantime. This type of configuration is far less resource-intensive than a hot configuration, but it does leave potential information gaps, and these can be critically important in a security event.
[2] Data protection
Data from a primary system can be backed up on a storage system and restored on a secondary system when it is required.
[3] Failover time
A few hours
What is ‘Warm Standby’ failover?
[1] Secondary node behavior
The software component is installed and available on the secondary server, which is up and running. If a failure occurs on the primary node, the software components are started on the secondary node. This process is automated by using a cluster manager.
[2] Data protection
Data is regularly replicated to the secondary system or stored on a shared disk.
[3] Failover time
A few minutes
What is ‘Hot Standby’ failover?
[1] Secondary node behavior
The software component is installed and available on both the primary node and the secondary node. The secondary system is up and running, but it does not process data until the primary node fails.
[2] Data protection
Data is replicated and both systems contain identical data. Data replication is performed based on the software capabilities.
[3] Failover time
A few seconds
What is sharding?
Sharding is a database architecture pattern related to horizontal partitioning — the practice of separating one table’s rows into multiple different tables, known as partitions. Each partition has the same schema and columns, but entirely different rows. Likewise, the data held in each is unique and independent of the data held in other partitions.
Sharding involves breaking up one’s data into two or more smaller chunks, called logical shards. The logical shards are then distributed across separate database nodes, referred to as physical shards, which can hold multiple logical shards. Despite this, the data held within all the shards collectively represent an entire logical dataset.
What is sharding?
medium.com/@jeeyoungk/how-sharding-works-b4dec46b3f6
Sharding is a method of splitting and storing a single logical dataset in multiple databases. By distributing the data among multiple machines, a cluster of database systems can store a larger dataset and handle additional requests. Sharding is necessary if a dataset is too large to be stored in a single database. Moreover, many sharding strategies allow additional machines to be added. Sharding allows a database cluster to scale along with its data and traffic growth.
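A minimal sketch of hash-based shard routing: each key is mapped to one of N physical shards, so every shard holds only part of the logical dataset. Plain dicts stand in for separate database nodes; the key names are made up.

```python
# Hash-based shard routing: a stable hash decides which shard owns a key.
import hashlib

NUM_SHARDS = 4
shards = [{} for _ in range(NUM_SHARDS)]  # stand-ins for separate DB nodes

def shard_for(key: str) -> int:
    # md5 gives a stable mapping across runs (Python's built-in hash()
    # is randomized between processes).
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Ada"})
print(shard_for("user:42"), get("user:42"))
```

One caveat worth remembering: simple modulo routing reshuffles most keys whenever NUM_SHARDS changes, which is why production systems tend to prefer consistent hashing or range-based shard maps.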
What is denormalization and how does it work? Pros and cons of denormalization
Denormalization is a strategy used on a previously-normalized database to increase performance. The idea behind it is to add redundant data where we think it will help us the most. We can use extra attributes in an existing table, add new tables, or even create instances of existing tables (see the sketch after the pros and cons below).
[1] Denormalization pros
- Faster reads for denormalized data
- Simpler queries for application developers
- Less compute on read operations
[2] Denormalization cons
- Slower write operations
- Additional database complexity
- Potential for data inconsistency
- Additional storage required for redundant tables
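A small sqlite3 sketch of the trade-off, with a made-up schema: the orders table redundantly stores customer_name, so the common read skips a join, but a customer rename now has to touch two tables or the copies drift apart.

```python
# Denormalization in miniature: a redundant column trades write cost
# and consistency risk for cheaper reads.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    customer_name TEXT,  -- redundant copy added for faster reads
    total REAL
);
INSERT INTO customers VALUES (1, 'Ada');
INSERT INTO orders VALUES (100, 1, 'Ada', 9.99);
""")

# Normalized read: a join on every request.
print(db.execute(
    "SELECT o.id, c.name FROM orders o "
    "JOIN customers c ON c.id = o.customer_id").fetchall())

# Denormalized read: one table, no join -- but every customer rename
# must now also update orders, or the data becomes inconsistent.
print(db.execute("SELECT id, customer_name FROM orders").fetchall())
```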
What is CAP theorem?
https://www.bmc.com/blogs/cap-theorem/
In theoretical computer science, the CAP theorem, also named Brewer’s theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees:
CONSISTENCY - Every node provides the most recent state or does not provide a state at all. No two nodes will ever return different states at the same time.
AVAILABILITY - All reads return data, but it might not be the most recent.
PARTITION TOLERANCE - The system continues to operate despite network failures (i.e., dropped connections, slow network connections, or unavailable network connections between nodes).
When a network partition failure happens, it must be decided whether to:
- cancel the operation and thus decrease the availability but ensure consistency or to
- proceed with the operation and thus provide availability but risk inconsistency.
Thus, if there is a network partition, one has to choose between consistency and availability. Note that consistency, as defined in the CAP theorem, is quite different from the consistency guaranteed in ACID database transactions.[4]
Eric Brewer argues that the often-used “two out of three” concept can be somewhat misleading because system designers need only to sacrifice consistency or availability in the presence of partitions, but that in many systems partitions are rare.
EXPLANATION:
No distributed system is safe from network failures, thus network partitioning generally has to be tolerated. In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system will return an error or a time out if particular information cannot be guaranteed to be up to date due to network partitioning. When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning.
In the absence of a partition, both availability and consistency can be satisfied.[9]
Database systems designed with traditional ACID guarantees in mind such as RDBMS choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency.
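A toy model of the choice a single replica faces during a partition: refuse the read (consistency) or answer from its possibly stale local copy (availability). Purely illustrative; no real database API is modeled here.

```python
# CP vs. AP behavior of one partitioned replica, in miniature.
class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.local_value = "v1"   # last value seen before the partition
        self.partitioned = True   # link to the other replicas is down

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise TimeoutError("cannot confirm latest value during partition")
        return self.local_value   # AP: answer, possibly stale

print(Replica("AP").read())       # 'v1' -- available, maybe out of date
try:
    Replica("CP").read()
except TimeoutError as err:
    print("CP read refused:", err)
```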
What is ACID?
ATOMICITY - either the entire transaction succeeds or the entire thing fails
CONSISTENCY - All database rules are enforced, or the entire transaction is rolled back
ISOLATION - No transaction is affected by any other transaction that is still in progress
DURABILITY - Once a transaction is committed, it stays even if the system crashes immediately after.
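A sketch of atomicity and consistency with sqlite3, using made-up accounts: a money transfer either commits both updates or neither; the CHECK rule rejects the overdraft and the whole transaction rolls back.

```python
# Atomicity + consistency: both updates commit together or not at all.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts ("
           "id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100.0), (2, 0.0)])
db.commit()

try:
    with db:  # the whole block is one transaction
        db.execute("UPDATE accounts SET balance = balance + 150 WHERE id = 2")
        db.execute("UPDATE accounts SET balance = balance - 150 WHERE id = 1")
except sqlite3.IntegrityError:
    pass  # CHECK violated -> the first update is rolled back too

print(db.execute("SELECT id, balance FROM accounts").fetchall())
# [(1, 100.0), (2, 0.0)] -- unchanged: all-or-nothing
```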
Give some real-world scenarios for when to choose consistency and when to choose availability. What types of DB would you use?
[1] CONSISTENCY
Consistent databases should be used when the value of the information returned needs to be accurate.
Financial data is a good example. When a user logs in to their banking institution, they do not want to see an error that no data is returned, or that the value is higher or lower than it actually is. Banking apps should return the exact value of a user’s account information. In this case, banks would rely on consistent databases.
Examples of a consistent database include:
- Bank account balances
- Text messages
Database options for consistency:
- MongoDB
- Redis
- HBase
[2] AVAILABILITY
Availability databases should be used when the service is more important than the information.
An example of having a highly available database can be seen in e-commerce businesses. Online stores want to make their store and the functions of the shopping cart available 24/7 so shoppers can make purchases exactly when they need to.
Database options for availability:
- Cassandra
- DynamoDB
- Cosmos DB
Some database options, like Cosmos and Cassandra, allow a user to turn a knob on which guarantee they prefer—consistency or availability.
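In Dynamo-style systems such as Cassandra, that knob is often expressed as quorum sizes. A tiny sketch of the arithmetic (the function name is illustrative): with N replicas, a write acknowledged by W nodes and a read consulting R nodes are guaranteed to overlap on at least one up-to-date replica whenever R + W > N; smaller R or W favors availability and latency instead.

```python
# Quorum overlap condition behind tunable consistency.
def read_is_strongly_consistent(n: int, w: int, r: int) -> bool:
    return r + w > n

print(read_is_strongly_consistent(n=3, w=2, r=2))  # True: quorum overlap
print(read_is_strongly_consistent(n=3, w=1, r=1))  # False: maybe stale
```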
List problems that can occur when using caching
[1] EXPIRATION POLICY
The “expiration policy” dictates how long data is cached. Too long and your data might go stale; too short and your cache won’t do much good (see the TTL cache sketch after this list).
[2] HOTSPOT PROBLEM (celebrity problem)
Frequent access of the same key in a partition is known as a hotspot key.
[3] COLD-START problem
There is an analogy with the cold and warm engine of a car. A cold cache doesn’t have any values and can’t give you any speedup because, well, it’s empty. A warm cache has some values and can give you that speedup.
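A minimal TTL cache sketch of the expiration policy above (class and key names are made up): each entry remembers when it was stored and is treated as a miss once it is older than ttl_seconds. Tune the TTL too long and reads go stale; too short and most reads miss, as with a cold cache.

```python
# Time-to-live cache: entries expire ttl_seconds after being stored.
import time

class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, stored_at)

    def set(self, key, value):
        self.store[key] = (value, time.monotonic())

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None  # miss
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self.store[key]  # expired: evict and report a miss
            return None
        return value

cache = TTLCache(ttl_seconds=60)
cache.set("user:42", {"name": "Ada"})
print(cache.get("user:42"))
```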