Database Technologies Flashcards
What situations do you use read replicas for? What form of consistency do you get with read replicas?
Read heavy workloads. Updates to the replicas are asynchronous and are therefore eventually consistent.
Do you need to make changes to your application to make use of read replicas?
Yes, the connection strings in the application will need to be updated to point at the replica endpoints.
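A minimal sketch of what that application change might look like, assuming one primary endpoint and a list of replica endpoints. The hostnames and the `pick_endpoint` helper are hypothetical, and the SELECT check is deliberately naive.

```python
import itertools

# Hypothetical endpoints; real ones come from the RDS console or API.
PRIMARY = "mydb.example.us-east-1.rds.amazonaws.com"
REPLICAS = [
    "mydb-replica-1.example.us-east-1.rds.amazonaws.com",
    "mydb-replica-2.example.us-east-1.rds.amazonaws.com",
]

# Round-robin iterator over the replica endpoints.
_replica_cycle = itertools.cycle(REPLICAS)


def pick_endpoint(sql: str) -> str:
    """Send plain SELECTs to a replica, everything else to the primary."""
    is_read = sql.lstrip().lower().startswith("select")
    return next(_replica_cycle) if is_read else PRIMARY
```

Because replicas lag behind the primary, a read that must see a write made moments earlier is usually safer against the primary.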
Are updates to replicas synchronous or asynchronous?
Asynchronous - reads from the replicas are therefore eventually consistent
How many read replicas can an RDS instance have for MySQL and Aurora?
5 for MySQL and 15 for Aurora
How do you control actions that your systems and users can take on specific RDS instances?
IAM will control access to RDS resources. Security groups can be used to control network traffic to the RDS instance
What mechanism is used to control network access with RDS?
Security groups are used for RDS network security
Is RDS encrypted at rest by default? If not, when do you specify encryption?
Not by default. You specify encryption when you create the DB instance. Alternatively, take a snapshot of the unencrypted DB, copy it with KMS encryption enabled, and then use the encrypted copy to restore a new encrypted instance.
In a multi AZ RDS deployment, what happens if the primary fails?
The CNAME record is switched from the primary to the secondary as a part of automatic fail over.
If you are using IAM authentication to connect to a database rather than a database username and password, what must be done on the database to enable this?
IAM database authentication must be enabled on the instance, and SSL must be used when connecting to the DB. This allows us to use IAM roles to connect to the DB
Can you run any form of auditing on the underlying infrastructure of an RDS database?
No. RDS is a managed service
What options are available for RDS maintenance windows?
No preference, or a scheduled maintenance window
What is an Aurora global database and what is the region configuration for this?
Aurora Global DBs span multiple regions and enable robust DR with lower latency reads. They have 1 primary and 1 secondary region. Recommended over Cross Region Replication.
What 2 database engines is Aurora compatible with?
MySQL and PostgreSQL
How long are DynamoDB streams retained for?
24 Hours
If you need to create a DynamoDB global table what do you need to enable in Dynamo first to allow this?
Need to enable DynamoDB streams as these enable Dynamo to generate a change log to replicate data across regions
In DynamoDB, what two attributes make a composite key and what is the maximum size of EACH of the attributes?
A composite key is made up of:
The Partition Key: 2KB
The SORT key: 1KB
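A small sketch of how an application could check those limits before writing an item. The `validate_composite_key` helper is hypothetical, and it assumes string keys measured as UTF-8 bytes.

```python
MAX_PARTITION_KEY_BYTES = 2048  # 2KB limit from the card above
MAX_SORT_KEY_BYTES = 1024       # 1KB limit from the card above


def validate_composite_key(partition_key: str, sort_key: str) -> None:
    """Raise ValueError if either key value exceeds its size limit."""
    pk_len = len(partition_key.encode("utf-8"))
    sk_len = len(sort_key.encode("utf-8"))
    if pk_len > MAX_PARTITION_KEY_BYTES:
        raise ValueError(f"partition key is {pk_len} bytes, limit is {MAX_PARTITION_KEY_BYTES}")
    if sk_len > MAX_SORT_KEY_BYTES:
        raise ValueError(f"sort key is {sk_len} bytes, limit is {MAX_SORT_KEY_BYTES}")
```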
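In Redshift, how do EVEN, ALL, KEY distributions distribute data across compute nodes?
Even: Rows are spread evenly across the nodes in round-robin fashion, regardless of their values
All: A full copy of the table is stored on every node
Key: Rows are placed according to the value of a single column, so matching values land on the same node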
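The three styles can be illustrated with a toy simulation. This is only a sketch: the `distribute` function is hypothetical, it places rows on nodes rather than slices, and CRC32 stands in for whatever hash Redshift uses internally.

```python
import zlib
from collections import defaultdict


def distribute(rows, style, nodes, key=None):
    """Return {node_index: [rows]} for the EVEN, ALL, or KEY style."""
    placement = defaultdict(list)
    for i, row in enumerate(rows):
        if style == "EVEN":
            # Round-robin, ignoring the row contents.
            placement[i % nodes].append(row)
        elif style == "ALL":
            # Every node receives a full copy of the table.
            for n in range(nodes):
                placement[n].append(row)
        elif style == "KEY":
            # Hash one column so equal values share a node.
            h = zlib.crc32(str(row[key]).encode("utf-8"))
            placement[h % nodes].append(row)
        else:
            raise ValueError(f"unknown distribution style: {style}")
    return dict(placement)
```

KEY distribution helps joins on the chosen column because matching rows are already co-located, at the risk of skew if a few values dominate.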
For objects cached in DAX, what is the TTL?
5 minutes (300 secs)
Does DynamoDB have a VPC endpoint?
Yes
In DynamoDB you received a provisioned throughput exception during peak load several months ago. You are anticipating a spike in load next week. On analysis you see that several keys are getting read repeatedly. What technology could you use to alleviate the problem?
DAX will cache reads from DynamoDB. It is API-compatible with DynamoDB, so the application only needs to switch to the DAX client and point at the DAX cluster endpoint
Where can data be loaded into RedShift from?
S3, DMS, other DBs, almost anywhere.
What is the use case for RedShift Spectrum?
OLAP Queries on data held directly in S3.
In RedShift, what is the purpose of a leader node and a compute node? How many compute nodes can there be for a leader node and what are the maximum sizes?
Leader: manages client connections and queries
Compute: Executes queries, stores data and performs compute operations.
1 Leader: 128 Compute Nodes. Max size of 160GB
Can you authenticate to Aurora using IAM?
Yes. SSL must be enabled to support this.
What 3 actions can trigger an RDS failover?
Loss of the AZ for the primary, loss of network on the primary, failure of storage on the primary.
What mechanism is used to control who can manage an RDS instance?
IAM Policies.
Is replication for a Multi-AZ deployment synchronous or asynchronous, and what form of consistency do you get?
Synchronous and you get read after write consistency.
With respect to DynamoDB where would you use exponential back-off?
If you start getting provisioned throughput exception errors
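A minimal sketch of exponential back-off with jitter. The `ThroughputExceeded` exception and the `call_with_backoff` helper are hypothetical stand-ins, not SDK names; the AWS SDKs already retry throttled requests on their own, so a wrapper like this is only an illustration of the idea.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for a provisioned throughput exception."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Retry operation(), doubling the delay ceiling after each throttle."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time up to the current ceiling.
            sleep(random.uniform(0, ceiling))
```

The random jitter spreads retries from many clients over time so they do not all hit the table again at the same instant.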
What is a graph data structure and what type of AWS service is best suited for this?
Data with highly complex relationships - such as social media posts. Amazon Neptune supports graph data structures.
Does DynamoDB support bursting throughput?
Yes, through the use of burst credits. If these credits are exceeded, you’ll get a provisioned throughput exception.
When setting up a DynamoDB table, do we need to provision throughput?
Yes, you need to provision read and write capacity units.
1 WCU = 1 write per second of an item up to 1KB.
1 RCU = 1 strongly consistent or 2 eventually consistent reads per second of an item up to 4KB
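The capacity arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes item sizes round up to the next 1KB (writes) or 4KB (reads) boundary.

```python
import math


def wcu_needed(item_size_bytes: int, writes_per_second: int) -> int:
    """Each write costs one WCU per 1KB of item size, rounded up."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def rcu_needed(item_size_bytes: int, reads_per_second: int,
               strongly_consistent: bool = True) -> int:
    """Each strongly consistent read costs one RCU per 4KB, rounded up.

    Eventually consistent reads cost half as much.
    """
    total = math.ceil(item_size_bytes / 4096) * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)
```

For example, 10 writes per second of a 2.5KB item round up to 3KB each, so `wcu_needed(2560, 10)` gives 30.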
What are the 3 NoSQL Databases and what data structures do they support? (One is also a caching mechanism)
DynamoDB - JSON based data structure
Neptune - Graph based data structures
ElastiCache - Key Value Pairs
What’s the maximum SIZE of a compute node in Redshift?
160GB
What feature does Amazon have to support load balancing across read replicas?
A Reader Endpoint. This allows for load balancing across read replicas as read replicas auto-scale horizontally. This happens at the connection level.
Does Aurora support Cross Region Replication?
Yes - but it’s recommended to use Aurora Global DB instead
How many master instances do you get by default with Aurora? What happens if a master instance fails?
ONE. If this fails, a read replica is promoted to master
How long does an automated RDS snapshot take to complete and what are Recovery Point Implications of this?
Approximately 5 minutes. If this is being used as a recovery option then the RPO implication is that we will lose up to 5 mins of data.
In Aurora, how many of the 6 copies of your data are needed for Write and Read Operations?
4 for writes
3 for reads
How many copies of your data are maintained for Aurora and over how many AZs?
6 Copies, 3 AZs
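The quorum numbers on these cards (6 copies over 3 AZs, 4 needed for writes, 3 for reads) can be checked with a small sketch. The `availability` function is hypothetical and assumes the copies are spread evenly, two per AZ.

```python
AZS = 3
COPIES_PER_AZ = 2   # assumption: 6 copies spread evenly over 3 AZs
WRITE_QUORUM = 4    # copies needed for a write
READ_QUORUM = 3     # copies needed for a read


def availability(failed_azs: int = 0, extra_failed_copies: int = 0) -> dict:
    """Report whether reads and writes still reach quorum after failures."""
    total = AZS * COPIES_PER_AZ
    healthy = max(total - failed_azs * COPIES_PER_AZ - extra_failed_copies, 0)
    return {
        "can_write": healthy >= WRITE_QUORUM,
        "can_read": healthy >= READ_QUORUM,
    }
```

Losing a whole AZ leaves 4 copies, so writes continue; losing an AZ plus one more copy leaves 3, so reads continue but writes do not.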
What configs need to be made to PostgreSQL and MySQL to enable in-flight SSL encryption and where are these made?
Postgres: In the RDS console, set rds.force_ssl=1 in the DB parameter group
MySQL: Needs to be done at the DB level via a GRANT statement with REQUIRE SSL
In RDS, under which activities would we see downtime:
- Maintenance
- Scaling in read replicas
- Changing Instance types
All of these will result in some downtime
In Aurora, which instance does the writer endpoint point to?
The master
Which RDS databases can use IAM authentication?
PostgreSQL and MySQL - and Aurora when using its MySQL or PostgreSQL compatible editions
Can data TO RDS be encrypted inflight?
Yes, via SSL.
What is the default retention period for an RDS automated backup? When a backup is restored, is it restored as a new instance or an existing instance?
7 Days, configurable up to 35. Restores are to a new instance.
If your application is making updates to DynamoDB and you also have DAX acting as a cache, do these writes go to DAX or DynamoDB?
DAX is a write-through cache - the write will go to DAX and then to DynamoDB
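A toy sketch of the write-through pattern, with a plain dict standing in for the DynamoDB table and another for the cache. The `WriteThroughCache` class is hypothetical and omits TTL expiry and eviction.

```python
class WriteThroughCache:
    """Toy write-through cache in front of a backing store."""

    def __init__(self, backing_store: dict):
        self.store = backing_store  # stands in for the DynamoDB table
        self.cache = {}             # stands in for the cache layer

    def put(self, key, value):
        # The caller writes to the cache layer, which persists the item
        # to the backing store and keeps a copy for later reads.
        self.store[key] = value
        self.cache[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit
        value = self.store.get(key)     # cache miss: read through
        if value is not None:
            self.cache[key] = value
        return value
```

Writes that bypass the cache and go straight to the table are not reflected in the cache until the cached entry expires, which is why the sketch routes every write through `put`.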
Is DAX MultiAZ? How many nodes can you have per DAX cluster
DAX is multi-AZ and the recommended config is spreading it across 3 AZs. You can have 10 DAX nodes per cluster.
What needs to be done to enable failover in Aurora?
Nothing, it is natively highly available
What frequency are automated RDS backups made? How do you restore to a point in time?
Daily. Point in time backups are enabled using the transaction logs
Which databases does RDS support transparent data encryption for?
SQL Server and Oracle
What happens to automatically created and manually created RDS snapshots if the RDS instance is deleted?
Automatic snaps are deleted. Manually created snaps persist. You will get an option to create a final snap on deletion
When using AWS Database Migration Service for a heterogeneous migration, what other service needs to be used? (Hint: “Tool”)
The AWS Schema Conversion Tool (SCT)
Can you set WCU and RCU values independently for DynamoDB?
Yes. Both WCU and RCU need to be set and they do not need to be the same value.
What is a DynamoDB Stream? What can it interact with?
A DynamoDB stream logs all changes made to DynamoDB data. This can be used to trigger a lambda function that can react to those changes.
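A minimal sketch of a Lambda function reacting to stream records. The `handler` name and the returned summary are illustrative choices; the record fields used (`eventName`, `dynamodb.Keys`) follow the DynamoDB Streams event shape.

```python
def handler(event, context=None):
    """Summarise each change record as (event_name, keys)."""
    changes = []
    for record in event.get("Records", []):
        name = record["eventName"]         # INSERT, MODIFY or REMOVE
        keys = record["dynamodb"]["Keys"]  # e.g. {"id": {"S": "42"}}
        changes.append((name, keys))
    return changes
```

A real function would act on each change (for example, update a search index) instead of only returning a summary.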
What’s the role of a database option group?
An option group is used to manage database features across many instances
For DynamoDB, what time range does a point in time recovery allow you to restore to?
5 Minutes to 35 Days
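The restore window can be expressed as simple date arithmetic. The helper names are hypothetical, and the sketch assumes point in time recovery has been enabled for at least 35 days; otherwise the earliest point is when it was switched on.

```python
from datetime import datetime, timedelta, timezone


def restore_window(now=None):
    """Return (earliest, latest) restorable timestamps."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=35), now - timedelta(minutes=5)


def is_restorable(target, now=None):
    """True if target falls inside the restore window."""
    earliest, latest = restore_window(now)
    return earliest <= target <= latest
```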
What are the start, incremental and maximum sizes for Aurora storage?
10GB Start
10GB increments
64TB max.
Storage is auto provisioned
What is a parallel Aurora Query?
Parallel Query (PQ) allows distributed processing of a single query using thousands of CPUs on the storage layer. It offers faster processing of analytical queries at the expense of higher IO
What is the cost model for Dynamo DB?
You pay for provisioned read and write capacity plus storage usage (or for used capacity if auto-scaling).
We can deploy read replicas to help with high volumes of read queries. Aside from increasing IOPS and instance sizes for the RDS instance types, what can we do to help improve performance of write heavy queries?
For high volume writes, sharding can provide a performance gain - although application logic may need updating.
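A minimal sketch of the application logic that sharding tends to require: a stable hash of the key picks one of several database instances. The shard hostnames and the `shard_for` helper are hypothetical.

```python
import hashlib

# Hypothetical shard endpoints, one RDS instance per shard.
SHARDS = [
    "shard-0.example.rds.amazonaws.com",
    "shard-1.example.rds.amazonaws.com",
    "shard-2.example.rds.amazonaws.com",
    "shard-3.example.rds.amazonaws.com",
]


def shard_for(key: str) -> str:
    """Map a key to a shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]
```

Simple modulo hashing like this remaps most keys when the number of shards changes, so adding a shard later means moving data; that is part of the application cost the card mentions.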