DB Flashcards
Row Store
Stores data in rows
Good for Transactional DB. OLTP
This is important if the Data will be queried (CRUD) for all rows at once at all times. Eg, Sales record.
Example AWS: ** MySQL
Column Store
Columns are stored together. Ideal for reporting or when all values for a specific attribute (size) are required.
OLAP
- This is good for Inventories and reporting.
- Data warehouse
- Analytics
You can take data from an OLTP into a Column store for reporting and analytics
eg AWS: REDSHIFT
Graph DB
Data, alongside their relationships, are stored in the DB, hence querying the data returns results really fast.
Best for systems with complex sophisticated Relationships
Justifications for DB on EC2
- Access to the DB Instance OS
- Advanced DB Option tuning … (DBROOT)
- … Vendor demands..
- DB or DB Version AWS don’t provide..
- Specific OS/DB Combination AWS don’t provide
- Architecture (AWS don’t provide replication/resilience)
- Decision makers who ‘just want it’
DB on EC2 CONS
- Admin overhead - managing EC2 and DBHost
- Backup / DR Management
- EC2 is single AZ
- Features - some of AWS DB products are amazing
- EC2 is ON or OFF - no serverless, no easy scaling
- Replication - skills, setup time, monitoring & Effectiveness
- Performance….AWS invests time into optimisation & features
Command to migrate a mysQl DB into a file
$ mysqldump -u root -p alwordpress > a41wordpress.sql
RDS
RDS is a VPC service. It is deployed into a subnet
RDSMulti-AZ Deployment
Without indication, in a Multi-AZ setup, RDS will randomly select any two subnets in two AZs for its deployment; one for its Primary, and the other, standby DB, all within its Subnet Group
RDS Instance Architecture
An RDS Instance can have more than one Database on them
Every RDS Instance (Master and Standby) has its own dedicated EBS Storage
AWS
Note that RDS for SQL Server has a limit of up to 100 databases on a single DB instance.
RDS Replication
Synchronous: Data is replicated to standby as soon as it arrives.
Read Replicas are Asynchronously replicated. This can be same AZ, Multi AZ, or Multi Region.
Read replicas are used to create resilience, HA , and to scale read workloads.
RDS Backup
In Multi Az mode, backup happens on the standby instance, hence, extremely short downtime
RDS Free Tier
Single AZ
Functions of the RDS Standby Instance
For Failover Only
And then Backups to S3
RDS Failover notable events
DB CNAME changes from the Primary DB Endpoint to the Standby Endpoint Url.
Failover to a Standby can take up to 60s-120s. This is to enable DNS Endpoint swapping
RDS Multi AZ Cluster vs Multi Az instance
- Multi AZ instance deploys a Primary DB, and a standby in another AZ.
- **No operation is allowed on the standby apart from Automatic Failover.
while
- Multi AZ Cluster deploys a Primary Master and two or more RR is different AZs.
- Multi AZ Cluster runs on faster Hardware (Graviton + NVME SSD storage).
- Read replicas can be promoted to Primary
- All RRs are read-active
- Faster Failover (35s)
RDS Consistency model
RDS data is considered Confirmed after Main and Standby/RR are committed
RDS Backup
- Manual Snapshots
- Automated Backup
both are stored on AWS Managed S3 Bucket; Bucket is not available for user Access.
RDS Snapshot
- Causes time-out on a Single AZ deployment
- Do not expire
- More frequent snapshots=Faster RTO
- After initial Snapshot, subsequent snapshots are incremental.
- Snapshots dont Expire. They Have to be deleted Manually or via some external process
Automated Backup content
Automated snapshots contains DB data, and Transaction logs(5min granularity)
Automated backups takes place once a day, during the backup window.
Backup retention
0 days - No Backup
35days - Max
RDS Restore (Snapshot/Backup)
During RDS Restore, RDS Deploys a brand new DB, hence Applications need to be updated with the new endpoint.
- Snapshots are restored to its Backup point in time
- Automated Backups are restored to any 5mins Point in time. Backup is restored and transaction logs are ‘replayed’ to bring DB to desired point in time (GOOD RPO)
RDS Restore RTO
- RDS Restore time is not short. It takes time
Depending on the amount of data on the DB
RDS Read Replica
They are eventually consistent. and have their own separate endpoint.
Without Application support, RR are useless.
RR do not perform Automatic failover.
Once Cross-Region is selected, AWS Handles the Networking.
Note: Synchronous - Multi AZ
Asynchronous: Read Replica
RDS Encryption
- Authentication
- Authorization
- Encryption In Transit
- Encryption at rest
Is it good to store images in relational database?
Storing binary files in a relational database is not very efficient. The process of saving and retrieving the files from a database is generally much slower than using file storage.
RDS as a managed service
RDS is managed, hence, ssh is not allowed