01 - Databases in the Real World Flashcards
List
Design Considerations
3
- Performance
- High Availability
- Backup and Recovery
List
Performance features required for the workload
4
- Required latency
- IOPS
- Read/write throughput
- Concurrency
Define
latency
- how quickly users need a response
- amount of time needed to complete an activity
Define
IOPS
Input/output operations per second
How often users are reading and writing data to the database
(e.g. 10 reads per second)
Define
concurrency
- How many users are active at the same time
- How many active users are accessing the same data at the same time
How to improve latency and read/write throughput?
provision more IOPS when configuring the database
Differentiate
IOPS vs Throughput
- Throughput - measurement of bits or bytes per second that can be processed by a storage device
- IOPS - number of read/write operations per second.
Both IOPS and throughput can be used together to describe performance.
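The link between the two measures can be sketched with simple arithmetic: if every operation moves the same amount of data, throughput is roughly IOPS multiplied by the I/O size. The workload numbers below (3,000 operations per second, 16 KiB per operation) and the function name are illustrative assumptions, not values from the course material.

```python
def throughput_bytes_per_sec(iops: float, io_size_bytes: int) -> float:
    """Throughput implied by an IOPS rate when every I/O has the same size."""
    return iops * io_size_bytes


# Hypothetical workload: 3,000 operations per second at 16 KiB each.
iops = 3000
io_size = 16 * 1024

mib_per_sec = throughput_bytes_per_sec(iops, io_size) / (1024 * 1024)
print(f"{iops} IOPS at {io_size} bytes per I/O is about {mib_per_sec:.1f} MiB/s")
```

The same IOPS figure gives very different throughput with small or large I/O sizes, which is why the two are usually quoted together.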
Define
High Availability
Whenever you try to access the database, it returns the data you need (a non-error response)
List
High Availability features required for the workload
3
- Read replicas
- Clustering
- Geo-distributed deployments
Define
Read replicas
Read-only copies of a source database
* Updates made to the source database are asynchronously copied to read replicas
* provides scalability
* can be promoted to a standalone database instance
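A toy in-memory model can make the asynchronous behavior concrete: a read replica may briefly return stale data until it catches up, and a promoted replica stops following the source. This is only a sketch; the `Primary` and `ReadReplica` classes and the manual `sync()` call are hypothetical stand-ins for what a managed database service does in the background.

```python
class Primary:
    """Writable source database that records every change in an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class ReadReplica:
    """Read-only copy that applies the source's changes some time later."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries already applied

    def sync(self):
        # Apply only the changes logged since the last sync (asynchronous copy).
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key):
        return self.data.get(key)

    def promote(self):
        # Become a standalone, writable instance that no longer follows the source.
        standalone = Primary()
        standalone.data = dict(self.data)
        return standalone


primary = Primary()
replica = ReadReplica(primary)

primary.write("user:1", "Ana")
stale = replica.read("user:1")   # replica has not caught up yet
replica.sync()
fresh = replica.read("user:1")   # replica now reflects the write
print(stale, fresh)
```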
Define
Clustering
- A group of nodes that can accept writes and process queries
- Suited to write- and process-heavy workloads
- 1 cluster = 1 or more compute nodes replicated across multiple Availability Zones
- Provides increased read scalability and failover protection
Geo-distributed deployments
Databases deployed across different geographic locations
* If you deploy data in the US, you are bound by US laws
* Ex. if there is a threat to national security, authorities can look into data servers in the US
* Other locations can read and write, or can be read-only
Explain
Backup and Recovery
- When disaster happens, you need to make sure your data is still accessible and you do not lose data
- Keep multiple copies of the data, so that when one copy is ruined, other copies are still available
Define
RPO
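Recovery Point Objective
* the point in time you must be able to recover data to after a disaster
* sets the maximum acceptable amount of data loss, measured in time
* Best RPO: no data loss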
Define
RTO
Recovery Time Objective
* maximum acceptable time to restore the database to its recovery point after a disaster
* Best RTO: as soon as possible
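A small worked check can show how the two objectives are measured, treating RPO as the maximum tolerable window of lost data and RTO as the maximum tolerable downtime. The timestamps, the one-hour targets, and the `meets_objectives` helper are made-up assumptions for illustration.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, failure, restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = failure - last_backup  # changes made after the last backup are lost
    downtime = restored - failure             # how long the database was unavailable
    return data_loss_window <= rpo, downtime <= rto


# Hypothetical incident timeline.
last_backup = datetime(2024, 1, 1, 12, 0)
failure = datetime(2024, 1, 1, 12, 45)
restored = datetime(2024, 1, 1, 14, 15)

rpo_met, rto_met = meets_objectives(
    last_backup, failure, restored,
    rpo=timedelta(hours=1),  # assumed target: lose at most 1 hour of data
    rto=timedelta(hours=1),  # assumed target: be back within 1 hour
)
print(f"RPO met: {rpo_met}, RTO met: {rto_met}")
```

In this timeline 45 minutes of data is lost, which is within the assumed RPO, but the 90-minute outage exceeds the assumed RTO.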
List
Workload requirements
3
- Data Storage
- Data volume, velocity, and variety
- Data Usage
Data Storage Types
4
- File system
- Object store
- Relational database
- Nonrelational database
Differentiate
Data volume, velocity, and variety
Data volume: size of individual items being written into workload AND total size of all items within workload.
Data velocity: how fast data is written and read
* can cause data bottlenecks if your system is not properly tuned.
Data variety: indicator of the type of database or databases you may need for your workload.
* Before: structured data in relational databases; semistructured data in nonrelational databases; unstructured data in file systems.
* Now: the lines are more blurred
How will the data in your workload be used?
5
SQL data organization
* OLTP or OLAP
* DSS
* Data warehouse
NoSQL access patterns
* IoT
* Session state
T/F
In an ideal scenario, the application server communicates with the database server over a public network.
F
private network