Lecture 7 Flashcards
What are the two types of scaling?
Vertical and Horizontal scaling
Horizontal scaling
Adding more servers to your application to spread the load.
- Can make use of on-demand cloud server architectures.
- Facilitates redundancy - having each layer running on multiple servers means that if any single machine fails, your application keeps running.
- Requires more complicated software and architecture.
Vertical scaling
Add more RAM, processors, bandwidth, or storage to a machine.
- Quick and easy way to get your application’s level of service back up to standard, but it will only get you so far.
- Upgrading a single server beyond a certain level can become very expensive, often involves downtime, and comes with an upper limit.
CAP theorem meanings:
- Consistency
- Availability
- Partition tolerance
Consistency
A consistent view of data on all nodes of the distributed system.
Availability
Demands that the system eventually answer every request, even in case of failures.
Partition tolerance
The system is resilient to message losses between nodes.
A partition is an arbitrary split between the nodes of a system, resulting in complete message loss between the resulting groups.
Partitioning (Definition)
Separating one table’s rows into multiple different tables.
Types of partitioning
- Range based
- Key based
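A minimal sketch of how the two partitioning types could pick a partition for a row. The boundary values, the partition count, and the function names are assumptions made for illustration; real database engines define partitions in their own DDL.

```python
import bisect
import zlib

# Hypothetical exclusive upper bounds for range partitions on an integer id.
# Partition 0 holds ids below 1000, partition 3 holds ids of 3000 and above.
RANGE_BOUNDS = [1000, 2000, 3000]


def range_partition(row_id: int) -> int:
    """Range based: pick the partition whose value range contains row_id."""
    return bisect.bisect_right(RANGE_BOUNDS, row_id)


def key_partition(key: str, num_partitions: int = 4) -> int:
    """Key based: hash the key and take it modulo the partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

Range based partitioning keeps neighbouring values together, which suits range scans. Key based partitioning spreads rows without regard to order.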
Partitioning explanation
Partitions may be stored in different tablespaces, which can be on different storage tiers (RAM/SSD/HDD).
- Partitions can be compressed using different compression schemes.
- Local indexes can be dropped for some partitions.
- Table statistics can be frozen on some partitions, while being periodically refreshed on others.
Distributed partitions are also known as...
Sharding
Distributed Partitions / Sharding
- Storage and bandwidth constraints
- Scale read and write capacity
- Geolocation: Proximity, Privacy and data protection laws.
- Higher availability (losing a single shard vs. losing all connectivity)
Distributed Partitions / Sharding
A sharding key is needed: the value used to determine which database to connect to.
- Smaller reference tables may need to be replicated to all shards; a strategy is needed for how these tables are modified and how changes are propagated to all shards.
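A minimal in-memory sketch of both points on this card, assuming a hypothetical users table sharded by user_id and a hypothetical countries reference table. The class names and the modulo routing rule are illustrative assumptions, not a particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    name: str
    users: dict = field(default_factory=dict)      # sharded table
    countries: dict = field(default_factory=dict)  # replicated reference table


class ShardedStore:
    def __init__(self, shard_names):
        self.shards = [Shard(name) for name in shard_names]

    def shard_for(self, user_id: int) -> Shard:
        # The sharding key (user_id here) decides which database to use.
        return self.shards[user_id % len(self.shards)]

    def put_user(self, user_id: int, record: dict) -> None:
        self.shard_for(user_id).users[user_id] = record

    def get_user(self, user_id: int):
        return self.shard_for(user_id).users.get(user_id)

    def set_country(self, code: str, name: str) -> None:
        # A reference-table change is propagated to every shard.
        for shard in self.shards:
            shard.countries[code] = name
```

Writing the reference table to every shard in one loop is the simplest possible propagation strategy; a real system also has to handle a shard being unreachable partway through the update.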
Distributed DBMS - Parallel Database
- Nodes are physically close to each other.
- Nodes are connected via high-speed LAN.
- The communication cost between nodes is assumed to be small. As such, one does not need to worry about nodes crashing or packets getting dropped when designing internal protocols.
Distributed DBMS - Distributed Database
- Nodes can be far from each other.
- Nodes are potentially connected via a public network, which can be slow and unreliable.
- The communication cost and connection problems cannot be ignored.
NoSQL Hashed Sharding
Since records have no relations and cannot be joined, there are no transactions.
We can use a hash function to distribute the data; this spreads the data evenly across the cluster.
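A minimal sketch of hashed sharding, assuming SHA-256 as the hash function, four shards, and made-up keys of the form user-N. It counts how many keys land on each shard to illustrate the even spread.

```python
import hashlib
from collections import Counter


def shard_of(key: str, num_shards: int) -> int:
    """Map a record key to a shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys, num_shards: int) -> Counter:
    """Count how many keys are assigned to each shard."""
    return Counter(shard_of(key, num_shards) for key in keys)


# Hypothetical keys; each of the 4 shards should receive roughly a quarter.
counts = distribution((f"user-{i}" for i in range(10_000)), 4)
```

A stable hash such as SHA-256 is used here instead of Python's built-in hash(), whose string hashing is randomized per process and would send the same key to different shards after a restart.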
Leader-Follower Replication
Check online.
- Replicas of a database in which follower replicas can execute only read-only queries.
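A minimal in-memory sketch of the read-only follower rule, assuming hypothetical Leader and Follower classes. Copying each write to the followers inside write() is a simplification chosen for the sketch, not a statement about how any particular database replicates.

```python
class Follower:
    """Replica that serves reads and rejects client writes."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)

    def write(self, key, value):
        # Followers execute read-only queries only.
        raise PermissionError("followers are read-only")

    def apply(self, key, value):
        # Called by the leader's replication step, never by clients.
        self.data[key] = value


class Leader:
    """Replica that accepts writes and forwards them to its followers."""

    def __init__(self, followers):
        self.data = {}
        self.followers = list(followers)

    def read(self, key):
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value
        for follower in self.followers:
            follower.apply(key, value)
```

Clients would send every write to the leader and may spread reads across the followers.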
Replication Synchronous
Check online.
Distributed databases: Linearizability
Check online.