Databases Flashcards
Is RDS OLTP or OLAP?
RDS is OLTP as it’s row-based and not made for data warehousing.
What OLTP Databases are available on AWS?
MySQL MariaDB Aurora PostgreSQL Oracle SQL Server
What OLAP service is there on AWS?
Redshift (dataware housing)
What’s the difference between OLAP and OLTP?
OLAP is for analytics, OLTP is for transaction processing.
OLAP uses columnar architecture to achieve high IO performance (same data type in a data column) as well as very good compression. Queries can span multiple tables and process/join multiple sources of data which can be very intensive.
OLTP uses row-based architecture which can be easily queried without spanning too many databases.
What is AWS Redshift?
Redshift is a data warehousing service for AWS. It has fast performance and high storage capacity. It is a columnar type architecture with multi-parallel processing and advanced data compression.
Is AWS Redshift encrypted?
At rest encryption: AES-256
In transit encryption: SSL/TLS
What is Multi-AZ in an RDS Instance?
Multi-AZ is used to synchronize your database writes to a copy in another AZ so that it can act as a failover replica in the case your primary database fails.
Used for High Availability
What is Read-Replicas in an RDS instance?
Read-Replicas are read-only replicas of your primary database that you can use to gain performance/separate reads from your primary database.
This is used to gain a performance advantage.
What are the available database services in AWS?
OLAP: Redshift
OLTP: RDS – MySQL, SQL Server, PostgreSQL, Oracle, Aurora, MariaDB
In-Memory: Elasticache (redis, memcached)
No-SQL: DynamoDB
How does Aurora Scaling/Failover work?
Two copies of your database in each AZ. Each Aurora instance has a minimum of three AZs (6 copies of physical storage media)
Designed to transparently handle the loss of 2 copies of data without affecting database write availability and up to three copies without affecting read availability
Self-healing, data blocks and disks are continuously scanned for errors and repaired automatically.
How many copies of data can Aurora handle?
Aurora is able to handle up to 2 copies of lost data without affecting write operations and up to 3 copies of data without affecting read availability
What stops an Aurora instance from suffering data corruption/drive corruption?
Self-healing. Aurora constantly scans for errors and repairs them automatically
Difference between scaling DynamoDB vs scaling RDS?
DynamoDB can scale with “push button scaling” aka you click a button and you can increase the performance (WCU,RCU) units and size. No downtime required.
With an RDS instance you have to use read-replicas to scale out and eventually hit a limit or use a bigger instance type.
What are the storage consistencies of Dynamo DB?
Eventual Read Consistency (default) : Can take 1 second to reach consistency after a write. (.5 WCU)
Strongly Consistent Reads: Less than a second to reach consistency after a write (1 WCU)
How many data centers are DynamoDB instances spread across?
DynamoDB instances are spread across 3 geographically distinct data centers
What type of storage are DynamoDB instances stored on?
SSD Storage
What types of Redshift Configurations are there?
Single node (164GB): Used for small/med business
Multi-Node
Leader Node: Manages client connections and receives queries
Compute Nodes (up to 128 nodes): store and perform queries and computations
What is Elasticache?
A web service that caches critical data that requires high performance reads. Increases performance by storing data in-memory instead of on disk.
What are the two Elasticache caching engines?
Memcached
redis
When would you use Elasticache?
When you need to offload reads or higher read performance from data.
When would you use Redshift?
When you’ve got queries that span too many tables, databases, taking up tons of I/O
How many copies of your data are stored by default with Aurora? (physically)
6 copies (2 copies across 3 different AZs)
What is Amazon Athena?
Athena is a query service that makes it easy to analyze data in S3 using standard SQL commands.
Supported data formats: JSON, Apache Parquet, Apache ORC
What happens to the I/O of a single-AZ RDS instance during a DB snapshot or backup?
You may experience I/O being suspended briefly as the backup process initializes (under a few seconds). You may also experience higher latency for a period.
Default MySQL port number?
3306
With new RDS DB instances are automated backups enabled by default?
Yes, automated backups are enabled by default
What is the maximum retention interval for RDS backups?
35 days
Do you have to pay a transfer charge when replicating data from your primary RDS instance to your secondary RDS instance?
Nope, there is no charge associated with the action.
Are you charged if you have an RDS database that isn’t being used?
Yes, RDS databases are charged based on running time