Database Services Flashcards
Which DBs can’t be autoscaled?
MS SQL Server and Oracle DBs
How are read replicas accessed?
Each read replica has its own endpoint, and you connect to it the same way as to any other DB instance in RDS.
When can RDS be encrypted?
You can only enable encryption for an Amazon RDS DB instance when you create it, not after the DB instance is created.
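A common way to work within this limit is to snapshot the unencrypted instance, copy the snapshot with a KMS key, and restore a new instance from the encrypted copy. The sketch below assumes `rds` behaves like a boto3 RDS client; the function name and the derived identifiers (`-plain`, `-encrypted`, `-enc`) are hypothetical, and the result is a new instance, not an in-place change.

```python
def encrypt_existing_instance(rds, source_id, kms_key_id):
    """Create an encrypted copy of an unencrypted RDS DB instance.

    `rds` is assumed to behave like boto3.client("rds").
    The identifiers derived below are hypothetical naming choices.
    """
    plain_snap = f"{source_id}-plain"
    enc_snap = f"{source_id}-encrypted"
    new_id = f"{source_id}-enc"

    # 1. Snapshot the unencrypted instance and wait until it is available.
    rds.create_db_snapshot(
        DBSnapshotIdentifier=plain_snap,
        DBInstanceIdentifier=source_id,
    )
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=plain_snap)

    # 2. Copy the snapshot; supplying a KMS key makes the copy encrypted.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snap,
        TargetDBSnapshotIdentifier=enc_snap,
        KmsKeyId=kms_key_id,
    )
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=enc_snap)

    # 3. Restore a new, encrypted instance from the encrypted snapshot.
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_id,
        DBSnapshotIdentifier=enc_snap,
    )
    return new_id
```

After the restore, applications would need to be pointed at the new instance's endpoint.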
List Aurora Single Master Cluster types
Aurora Serverless, parallel query, and Global Database clusters are all single-master clusters
Aurora single-master clusters
A single DB instance performs all write operations, and any other DB instances are read-only.
Aurora multi-master DB cluster
All DB instances can perform write operations. There isn’t any failover when a writer DB instance becomes unavailable, because another writer DB instance is immediately available to take over the work of the failed instance.
Aurora DB cluster
An Aurora DB cluster consists of one or more DB instances and a cluster volume that manages the data for those DB instances. An Aurora cluster volume is a virtual database storage volume that spans multiple Availability Zones, with each Availability Zone having a copy of the DB cluster data. Two types of DB instances make up an Aurora DB cluster:
Primary DB instance – Supports read and write operations, and performs all of the data modifications to the cluster volume. Each Aurora DB cluster has one primary DB instance.
Aurora Replica – Connects to the same storage volume as the primary DB instance and supports only read operations. Each Aurora DB cluster can have up to 15 Aurora Replicas in addition to the primary DB instance. Maintain high availability by locating Aurora Replicas in separate Availability Zones. Aurora automatically fails over to an Aurora Replica in case the primary DB instance becomes unavailable. Aurora Replicas can also offload read workloads from the primary DB instance.
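The roles described above can be pictured with a small toy model. This is only an illustrative sketch, not how Aurora is implemented: the class name `AuroraClusterModel`, the round-robin read routing, and the "promote the first replica" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field

MAX_REPLICAS = 15  # limit stated in the card above


@dataclass
class AuroraClusterModel:
    """Toy model: one primary (read/write) plus read-only replicas."""
    primary: str
    replicas: list = field(default_factory=list)
    _next: int = 0  # round-robin cursor for reads

    def add_replica(self, name):
        if len(self.replicas) >= MAX_REPLICAS:
            raise ValueError("at most 15 Aurora Replicas per cluster")
        self.replicas.append(name)

    def route(self, is_write):
        """Writes go to the primary; reads are offloaded to replicas."""
        if is_write or not self.replicas:
            return self.primary
        target = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return target

    def fail_over(self):
        """Promote a replica when the primary becomes unavailable.
        (Assumption: the toy simply promotes the first replica.)"""
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.primary = self.replicas.pop(0)
        self._next = 0
        return self.primary


cluster = AuroraClusterModel("writer-1")
cluster.add_replica("replica-a")
cluster.add_replica("replica-b")
print(cluster.route(is_write=True))   # the primary handles writes
print(cluster.route(is_write=False))  # a replica handles reads
print(cluster.fail_over())            # a replica becomes the new primary
```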
Amazon RDS Read Replicas
Amazon RDS Read Replicas provide enhanced performance and durability for RDS database (DB) instances. They make it easy to elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.
Amazon RDS replicates all databases in the source DB instance.
Amazon RDS Read Replicas - Where can they be deployed?
Read replicas can be within an Availability Zone, Cross-AZ, or Cross-Region.
Redshift
Redshift is a columnar data warehouse DB that is ideal for running long, complex queries. Redshift can also improve performance for repeat queries by caching the result and returning the cached result when the queries are re-run. Dashboard, visualization, and business intelligence (BI) tools that execute repeat queries see a significant boost in performance due to result caching.
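The idea of result caching can be sketched in a few lines. This is a toy illustration only, not Redshift's actual mechanism: the class `CachingWarehouse`, the exact-text cache key, and the manual `invalidate` call are assumptions for the example.

```python
class CachingWarehouse:
    """Toy wrapper that returns a cached result for a repeated query."""

    def __init__(self, run_query):
        self._run_query = run_query  # the slow function that really executes SQL
        self._cache = {}
        self.hits = 0

    def execute(self, sql):
        # Assumption: the cache key is the exact query text.
        if sql in self._cache:
            self.hits += 1
            return self._cache[sql]
        result = self._run_query(sql)
        self._cache[sql] = result
        return result

    def invalidate(self):
        # A real warehouse must discard cached results when the data changes.
        self._cache.clear()


executions = []

def slow_backend(sql):
    executions.append(sql)  # record each real execution
    return [("total", 42)]

wh = CachingWarehouse(slow_backend)
wh.execute("SELECT SUM(units) FROM sales")
wh.execute("SELECT SUM(units) FROM sales")  # served from the cache
print(len(executions), wh.hits)
```

A dashboard that re-issues the same query on every refresh would hit the backend once and the cache afterwards, which is the effect the card describes.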
What are foundational technologies used in Redshift?
OLAP (online analytic processing)
SQL
Columnar data storage
Massively Parallel Processing (MPP) by distributing data and queries across all nodes
PostgreSQL compatible with JDBC and ODBC drivers available
EC2
HDD or SSD storage options (160 GB+ per node)
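To see why columnar storage suits analytic queries, compare a row layout with a column layout. The following is a minimal sketch with made-up sample data; the helper `to_columns` is hypothetical and says nothing about Redshift's on-disk format.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"region": "eu", "units": 3, "price": 10.0},
    {"region": "us", "units": 5, "price": 12.5},
    {"region": "eu", "units": 2, "price": 9.0},
]


def to_columns(records):
    """Pivot row records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column reads only that column's values,
# instead of scanning every field of every row.
total_units = sum(columns["units"])
print(columns["units"], total_units)
```

Values in a single column also share a type and often repeat, which is one reason columnar stores compress well.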
What are features of Redshift?
Analyze all your data using standard SQL and existing Business Intelligence (BI) tools
Clustered petabyte-scale data warehouse
Query directly from data files on S3 via Redshift Spectrum
Provides advanced compression - compression scheme selected automatically
Amazon Redshift Spectrum
A feature of Amazon Redshift that enables you to run queries against exabytes of unstructured data in Amazon S3, with no loading or ETL required.
Redshift Leader node
Manages client connections and receives queries. Simple SQL end-point. Stores metadata. Optimizes query plan. Coordinates query execution.
Redshift Compute Node
Stores data and performs queries and computations.
Local columnar storage.
Parallel/distributed execution of all queries, loads, backups, restores, resizes.
Up to 128 compute nodes.
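The split between leader and compute nodes is a scatter-gather pattern, which can be sketched as follows. This is a toy model under stated assumptions: threads stand in for compute nodes, rows are spread round-robin, and the function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(values, node_count):
    """Spread values across node slices in round-robin order."""
    slices = [[] for _ in range(node_count)]
    for i, value in enumerate(values):
        slices[i % node_count].append(value)
    return slices


def compute_node_sum(node_slice):
    """Work done by one compute node: aggregate its local slice."""
    return sum(node_slice)


def leader_sum(values, node_count=4):
    """Leader role: fan the work out, then combine the partial results."""
    slices = distribute(values, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials)


print(leader_sum(list(range(101))))
```

Each node only touches its own slice, so adding nodes reduces the work per node; the leader's job is limited to planning and merging.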
Explain how Redshift offers 1. Durability and 2. High Availability
1. Replication and continuous backups
2. Automatically recover from component and node failures
What is the scope of Redshift?
Redshift is a single-AZ service: each cluster runs in one Availability Zone. To run across multiple AZs, load data from the same set of Amazon S3 input files into two Amazon Redshift data warehouse clusters in separate AZs.
How can you stand up Redshift in another AZ when your cluster runs in a single AZ?
Restore a Redshift snapshot in the second AZ.
How does Redshift keep copies of data?
- Stores the original
- A replica on compute nodes (within the cluster).
- A backup copy on S3.