Databases Flashcards
Do NoSQL databases perform aggregations such as SUM() or AVG()?
No
Do NoSQL databases scale horizontally?
Yes
Is DynamoDB Highly Available across Multiple AZs out of the box?
Yes
Does DynamoDB integrate with IAM?
Yes
DynamoDB is made of ____________?
Tables
DynamoDB tables must have a _________?
Partition key
Each DynamoDB item has ________?
Attributes
What is the maximum size of an item in DynamoDB?
400 KB
What is a partition key in DynamoDB?
It acts as the primary key and, when used without a sort key, must be unique for each item. It must also be diverse enough that data is distributed evenly across partitions.
What two primary key options are available in DynamoDB?
Partition Key
Partition Key + Sort Key
When using a partition key and sort key in DynamoDB, what is the limitation?
The partition and sort key combination must be unique.
What are the two table classes in DynamoDB?
DynamoDB Standard
DynamoDB Standard-IA
What does the sort key really do in the UI?
Allows you to sort on a column.
What are the capacity modes in DynamoDB?
Provisioned Mode
On-Demand Mode
How often can you switch between capacity modes in DynamoDB?
Every 24 hours
If you exceed your RCU or WCU in DynamoDB, what error is returned?
ProvisionedThroughputExceededException
How can you deal with ProvisionedThroughputExceededException without increasing WCU or RCU?
Exponential Backoff
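As a rough illustration of exponential backoff, the sketch below retries a throttled call with a delay ceiling that doubles on each attempt. `ThrottledError`, `call_with_backoff`, and the delay values are hypothetical names and numbers chosen for this example; the AWS SDKs already apply a similar retry strategy on your behalf.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Retry `operation` with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # The delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting.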
What is the measurement for one WCU?
One write per second for an item up to 1 KB in size.
If you have an item that is 4.5 KB in size, how many WCU are required?
5. The item size is rounded up to the next whole KB.
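The WCU arithmetic can be sketched as a small helper. `wcu_needed` is a hypothetical name, and the sketch only covers standard (non-transactional) writes.

```python
import math


def wcu_needed(item_size_kb, writes_per_second=1):
    """WCUs for standard writes: 1 WCU per 1 KB (rounded up) per write per second."""
    return math.ceil(item_size_kb) * writes_per_second
```

For example, `wcu_needed(4.5)` gives 5, and ten writes per second of a 2.5 KB item need `wcu_needed(2.5, 10)`, which is 30.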
What are the two kinds of reads in DynamoDB?
Strongly Consistent
Eventually Consistent
What is the problem with Eventually consistent reads?
Old data may be returned if there is replication latency.
What does a consistent read in DynamoDB do?
It returns the most recent data, reflecting every write that succeeded before the read.
How much more RCU do strongly consistent reads (ConsistentRead = true) use?
Twice as much as eventually consistent reads
A single RCU is how much for a strongly consistent read?
One read per second for an item up to 4 KB in size
A single RCU is how much for an eventually consistent read?
Two reads per second for an item up to 4 KB in size
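The RCU rules above can be sketched the same way. `rcu_needed` is a hypothetical helper; it rounds the final figure up because capacity is provisioned in whole units.

```python
import math


def rcu_needed(item_size_kb, reads_per_second=1, strongly_consistent=True):
    """RCUs for reads: item size is rounded up to 4 KB blocks."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return total if strongly_consistent else math.ceil(total / 2)
```

For example, ten eventually consistent reads per second of a 4 KB item need `rcu_needed(4, 10, strongly_consistent=False)`, which is 5.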
Are WCU and RCU spread evenly amongst partitions?
Yes. That means if I have 5 partitions and 5 WCU, each partition gets 1 WCU.
In DynamoDB on-Demand mode, do you need to provision RCU/WCU?
No. This scales up and down automatically.
If you want to change an item’s attributes in DynamoDB, what API call is used?
UpdateItem
If you want to create an item in DynamoDB, what API call is used?
PutItem
If you want to fully replace an item in DynamoDB, what API call is used?
PutItem
If you want to read an item in DynamoDB, what API call is used?
GetItem
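The difference between PutItem, UpdateItem, and GetItem can be sketched with a toy in-memory model. `TinyTable` is a hypothetical class written for this example, not the DynamoDB API; it only mimics the replace-versus-merge behavior described in the cards above, with items keyed by partition key plus sort key.

```python
class TinyTable:
    """Toy in-memory model of a table with a partition key and a sort key."""

    def __init__(self):
        # (partition_key, sort_key) -> dict of attributes; the pair must be unique.
        self._items = {}

    def put_item(self, pk, sk, attributes):
        # PutItem semantics: create the item, or fully replace an existing one.
        self._items[(pk, sk)] = dict(attributes)

    def update_item(self, pk, sk, changes):
        # UpdateItem semantics: change only the named attributes,
        # creating the item if it does not exist yet.
        self._items.setdefault((pk, sk), {}).update(changes)

    def get_item(self, pk, sk):
        item = self._items.get((pk, sk))
        return dict(item) if item is not None else None
```

Calling `put_item` twice with the same key pair leaves only the second set of attributes, while `update_item` keeps attributes it was not asked to change.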
What does a ProjectionExpression in DynamoDB do?
It can be specified to retrieve only certain attributes from an item.
What is a FilterExpression in DynamoDB?
Additional filtering applied after the Query operation has completed and before results are returned. It only works with non-key attributes; partition (hash) and sort (range) key attributes are not allowed.
How much data will Scan return in DynamoDB?
Up to 1 MB per call; you must use pagination to keep reading.
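Pagination can be sketched as a loop that feeds each response's `LastEvaluatedKey` back in as `ExclusiveStartKey`. `scan_all` and `scan_page` are hypothetical names; `scan_page` is assumed to be any callable that takes Scan request parameters as keyword arguments and returns a dict shaped like a Scan response.

```python
def scan_all(scan_page):
    """Collect every item by following LastEvaluatedKey across pages."""
    items = []
    kwargs = {}
    while True:
        page = scan_page(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items  # no LastEvaluatedKey means this was the final page
        # Resume the next call where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With boto3 one would presumably pass a table resource's `scan` method as `scan_page`; that wiring is an assumption and is left out so the sketch runs without AWS access.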
Can you use Scan to filter out data?
Yes, but Scan still reads every item in the table before the filter is applied, so it consumes RCU for all of them. This is inefficient.
What can be done in DynamoDB to make scans faster?
Use Parallel Scan.
How many PutItem API calls can be included in a BatchWriteItem call in DynamoDB?
25 PutItem and/or DeleteItem requests in one call.
Up to 16 MB of data written and up to 400 KB per item.
Can BatchWriteItem in DynamoDB be used with UpdateItem?
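Staying under the 25-request limit usually means splitting a larger list of writes into batches. The helper below is a minimal sketch with a hypothetical name, `chunk_requests`. It does not call DynamoDB, and it does not handle the unprocessed items a real BatchWriteItem response can return, which need to be retried.

```python
def chunk_requests(requests, batch_size=25):
    """Yield consecutive lists of at most `batch_size` requests."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(requests), batch_size):
        yield requests[start:start + batch_size]
```

Sixty pending writes would be sent as three calls of 25, 25, and 10 requests.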
No
How many GetItem API calls can be included in a BatchGetItem call in DynamoDB?
100 items, up to 16 MB of data
In DynamoDB, are items being retrieved from a BatchGetItem API call retrieved in parallel?
Yes
What is PartiQL?
A SQL-compatible query language for DynamoDB
What are the two types of indexes in DynamoDB?
LSI and GSI
Does an LSI use the same partition key of the base table?
Yes
What data types can be used for a sort key in an LSI?
String, Number, Binary
How many LSI can you have per table?
5
When are LSIs defined?
Creation of the table
Can LSI only include certain attributes?
Yes, using attribute projections.
Does a GSI use the same partition key of the base table?
No
Do you have to provision WCU and RCU for a GSI?
Yes
Can a GSI be added or modified after a table is created?
Yes
When a GSI is throttled, what happens to the main table?
It is also throttled.
Where do LSIs get their WCU and RCU?
From the main table
What problem does DynamoDB Accelerator (DAX) solve?
The problem of too many reads.
What is the default TTL for DynamoDB DAX caches?
5 minutes
How many nodes can be in a DAX Cluster?
10
What are DynamoDB Streams?
An ordered stream of item-level modifications (create, update, delete) made to a table.
Where can DynamoDB Streams stream to?
Kinesis Data Streams
Lambda
KCL
How long is data in a DynamoDB Streams retained?
24 hours
What does KEYS_ONLY show in DynamoDB Streams?
Only the key attributes of the modified items.
What does NEW_IMAGE show in DynamoDB Streams?
The entire item as it appears after it was modified
What does OLD_IMAGE show in DynamoDB Streams?
The entire item as it appears before it was modified
What does NEW_AND_OLD_IMAGES show in DynamoDB Streams?
Both the new and old images of the item.
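The four stream view types can be compared with a small sketch that builds a simplified record for each. `stream_record` is a hypothetical function; real stream records carry more metadata than the three fields shown here.

```python
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_record(view_type, keys, old_item, new_item):
    """Return a simplified stream record for one item modification."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown view type: {view_type}")
    record = {"Keys": keys}  # key attributes are always present
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["NewImage"] = new_item  # item as it appears after the change
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["OldImage"] = old_item  # item as it appeared before the change
    return record
```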
Can DynamoDB streams be used retroactively?
No
If you want Lambda to Poll DynamoDB Streams, how would you configure it?
Using Triggers. Use the DynamoDB trigger.
What does TTL do in DynamoDB?
It allows you to automatically delete items after an expiry timestamp.
Does TTL use WCU?
No
How long can it take for expired items to be deleted?
Up to 48 hours after expiry
What data type should the TTL use?
Number with Unix Epoch timestamp
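A TTL value is just a number of seconds since the Unix epoch. The sketch below computes one and checks expiry locally; `ttl_timestamp`, `is_expired`, and the attribute name `expires_at` are hypothetical choices for this example. Because deletion is not immediate, readers of a table often filter out expired items themselves in a similar way.

```python
import time

SECONDS_PER_DAY = 86400


def ttl_timestamp(days_from_now, now=None):
    """Return a Unix epoch timestamp (whole seconds) `days_from_now` days ahead."""
    now = time.time() if now is None else now
    return int(now + days_from_now * SECONDS_PER_DAY)


def is_expired(item, ttl_attribute="expires_at", now=None):
    """True if the item carries a TTL attribute that is already in the past."""
    now = time.time() if now is None else now
    return ttl_attribute in item and item[ttl_attribute] <= now
```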
What is the best way to store images in DynamoDB?
Store the object in S3 and store its S3 key in DynamoDB.
What are Global Tables in DynamoDB?
Multi-Region, Multi-Active, Fully replicated tables.
Does DynamoDB support DMS?
Yes
Do all RDS databases support ACID transactions?
Yes
In ACID transactions, what happens to a transaction that partially fails?
The entire transaction fails
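The all-or-nothing behavior can be seen with Python's built-in `sqlite3` module, used here only as a convenient local relational database rather than RDS. The table, account names, and the `transfer` and `make_db` helpers are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` between accounts atomically; returns False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Both updates above are undone, not just the debit.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False
```

A transfer that would overdraw the source account leaves both balances exactly as they were.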
How many read replicas can you have in Aurora?
15
Does Aurora have continuous backup to S3?
Yes
What are the two types of locks?
Shared
Exclusive
What does a shared lock do?
Allows reads, but prevents writes
What does an exclusive lock do?
Prevents all reads and writes to a resource. Only one transaction can hold the lock.
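The compatibility of the two lock types can be written as a small check. `can_grant` is a hypothetical function that models only the rule described above, not any particular database engine's lock manager.

```python
def can_grant(requested, held_by_others):
    """Decide whether a lock request is compatible with locks other transactions hold.

    requested: "shared" or "exclusive"
    held_by_others: list of lock modes currently held on the same resource
    """
    if requested not in ("shared", "exclusive"):
        raise ValueError(f"unknown lock mode: {requested}")
    if not held_by_others:
        return True  # nothing held, any lock can be granted
    if requested == "shared":
        # Many readers may share a resource, as long as nobody holds it exclusively.
        return all(mode == "shared" for mode in held_by_others)
    # An exclusive lock needs the resource entirely to itself.
    return False
```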
Do relational databases manage locks automatically?
Yes
What is Document DB?
A NoSQL database based on MongoDB
What is MemoryDB for Redis?
A Redis-compatible in-memory database service.
What is Amazon Keyspaces?
Managed Apache Cassandra NoSQL database
How do you query Amazon Keyspaces?
Cassandra Query Language (CQL)
What is Amazon Neptune?
A fully managed graph database.
What languages does Amazon Neptune support?
Gremlin, OpenCypher, and SPARQL
What is Amazon Timestream?
A managed time series database
What is Amazon Redshift?
A petabyte scale data warehouse
Is Redshift OLAP or OLTP?
OLAP
What nodes are in a Redshift cluster?
A leader node and one or more compute nodes.
What does the leader node do?
It communicates with the client and also develops an execution plan.
What do the compute nodes do?
They execute the execution plan generated by the leader node.
What does each compute node have in regards to resources?
CPUs, Memory, and attached disk storage.
What are node slices in Redshift?
They process a portion of the workload that is assigned to that node.
Where do the resources from node slices come from?
They come from the resources assigned to the compute node.
What compression does Redshift Spectrum support?
GZIP and Snappy
What block size does Redshift use?
1 MB
What happens when you turn the backup retention period to zero?
It disables automatic backup.
What cluster types must be used for Multi-AZ Redshift?
RA3
Does Redshift scale horizontally or vertically?
Both
What is the process for Redshift scaling on the backend?
Your old cluster remains available for reads
A new cluster is created
The CNAME is flipped to the new cluster
Data is moved in parallel to the new compute nodes
What are the Redshift distribution Styles?
Auto
Even
Key
All
What does the Redshift distribution style Auto do?
The default. It bases distribution on what it thinks is best.
What does the Redshift distribution style Even do?
It distributes data in a round robin fashion.
What does the Redshift distribution style Key do?
It distributes data based on the value in a column.
What does the Redshift distribution style All do?
It distributes the entire dataset to every node.
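The Even, Key, and All styles can be illustrated with a toy placement function. `distribute` is a hypothetical helper, and `zlib.crc32` is only a stand-in hash chosen so the example is deterministic; Redshift's actual hashing is not shown in these notes. Auto is omitted because it simply picks one of the other styles.

```python
import zlib


def distribute(rows, style, num_targets, key=None):
    """Place rows (dicts) onto `num_targets` buckets according to a distribution style."""
    targets = [[] for _ in range(num_targets)]
    for index, row in enumerate(rows):
        if style == "EVEN":
            # Round robin: row 0 -> target 0, row 1 -> target 1, and so on.
            targets[index % num_targets].append(row)
        elif style == "KEY":
            # Rows with the same value in the key column land on the same target.
            digest = zlib.crc32(str(row[key]).encode("utf-8"))
            targets[digest % num_targets].append(row)
        elif style == "ALL":
            # Every target receives a full copy of the data.
            for target in targets:
                target.append(row)
        else:
            raise ValueError(f"unknown distribution style: {style}")
    return targets
```

Key distribution helps joins on the key column because matching rows are already co-located, while All trades storage for the same benefit on small tables.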
How do you get data into Redshift?
Using the COPY command.
Where can I use the COPY command from?
S3, EMR, DynamoDB, and remote hosts
When using the COPY command from S3, what is required?
A manifest file and an IAM role.
How do you get data out of Redshift?
Use the UNLOAD command
What does Enhanced VPC routing do?
Ensures routing happens through the AWS backbone and not the internet.
What does Redshift auto-copy from S3 do?
It automatically loads new data from S3 into Redshift.
What does Amazon Aurora Zero ETL do?
It replicates data from Aurora to Redshift
What does Redshift Streaming ingestion do?
It loads data from Kinesis data streams or Managed Service for Kafka
Is the COPY command for Redshift meant for internal or external use?
external
Can the COPY command decrypt data as it is loaded from S3?
Yes
What is a narrow table?
A table with lots of rows, but few columns
What is the best way to load a narrow table into Redshift?
Using a single COPY command.
What is the process for copying a KMS encrypted snapshot to another region?
In the destination region you will create a KMS key
Create a copy grant
Specify the KMS key ID for which you are creating the copy grant
In the source region, enable snapshot copying and specify the copy grant you created.
What does DBLink in Redshift do?
It allows you to connect to a PostgreSQL database.
What does Redshift Workload Management (WLM) do?
It prioritizes short fast queries over long slow ones.
How do you create different query queues in Redshift?
Using Workload Management (WLM)
What is Redshift concurrency scaling?
It automatically adds cluster capacity to handle an increase in concurrent read queries.
How many queues can you have with automatic workload management?
8
What are query monitoring rules in Workload Management (WLM)?
They monitor queries and perform an action when a defined threshold is exceeded. For example, a query that runs longer than 60 seconds can be logged, aborted, or hopped to a queue for long-running queries.
How many queues can you have with manual workload management?
8. The default queue has a default concurrency level of 5.
What does Short Query Acceleration (SQA) do?
It prioritizes short-running queries over longer-running ones.
Where are queries run when using Short Query Acceleration (SQA) ?
In a dedicated space so they don’t sit behind long queries.
Does SQA work with CTAS?
Yes
Are SQA and WLM similar?
Yes
When to use SQA over WLM?
When all you need to do is prioritize short running queries.
What does VACUUM do in Redshift?
It recovers space from deleted rows and restores sort order.
What are the four types of VACUUM commands?
VACUUM FULL
VACUUM DELETE ONLY
VACUUM SORT ONLY
VACUUM REINDEX
How does VACUUM FULL work?
Default
It resorts the rows and reclaims space from deleted rows.
How does VACUUM DELETE ONLY work?
It only reclaims deleted row space
How does VACUUM SORT ONLY work?
It resorts the rows, but does not reclaim disk space.
How does VACUUM REINDEX work?
It is used with interleaved sort keys: it re-analyzes the distribution of values in the sort key columns, then performs a full vacuum.
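The first three VACUUM variants differ only in which of two steps they perform, which a toy model makes easy to see. `vacuum` is a hypothetical function operating on a list of dicts with a boolean `deleted` marker; it illustrates the semantics described above and is not how Redshift stores rows. REINDEX is left out because it concerns interleaved sort keys.

```python
def vacuum(rows, mode="FULL", sort_key="id"):
    """Return the rows as they would look after a given VACUUM mode."""
    if mode not in ("FULL", "DELETE ONLY", "SORT ONLY"):
        raise ValueError(f"unsupported mode: {mode}")
    if mode in ("FULL", "DELETE ONLY"):
        # Reclaim space: drop rows that were only marked as deleted.
        rows = [row for row in rows if not row["deleted"]]
    if mode in ("FULL", "SORT ONLY"):
        # Restore sort order on the sort key column.
        rows = sorted(rows, key=lambda row: row[sort_key])
    return rows
```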
How does Elastic Resize work in Redshift?
You can add or remove nodes of the same type.
Cluster is down for a few minutes.
How does Classic Resize work in Redshift?
You can change node types and number of nodes.
Could be read-only for hours or days.
How do you keep your Redshift cluster available while performing a classic resize?
Take a snapshot, restore it to a new cluster, resize the new cluster, and then cut over to it when the copy is complete.
What Redshift node type allows independent scaling of compute and storage?
RA3
What does Redshift DataLake Export do?
Allows you to Unload Redshift Query to S3 in Parquet format.
Does Redshift support spatial data types?
Yes
Can you share live data across Redshift clusters between regions?
Yes, using cross-region data sharing.
What is the requirement to use Redshift cross-region data sharing?
RA3 node types
If you want to use your HSM with Redshift, what certificates are required?
Both client and server
Does Redshift come in serverless?
Yes
What is a good use case for Redshift serverless?
Test environments
Ad hoc business analysis
What needs to be set up manually to use Redshift Serverless?
An IAM role with an action redshift-serverless:*
How is Redshift Serverless billed?
By the Redshift Processing Unit (RPU) hours consumed
Can you adjust the base capacity of Redshift Serverless RPU?
Yes
Can you set the maximum RPU in Redshift Serverless?
Yes
What can’t Redshift Serverless do that Redshift can?
Parameter Groups
Workload Management
Maintenance Windows / Version tracks
Is Redshift Serverless externally available?
No. It must be called from within the VPC.
What are some Redshift views for Monitoring performance?
SYS_QUERY_HISTORY
SYS_LOAD_HISTORY
SYS_SERVERLESS_USAGE
What is a materialized view in Redshift?
It precomputes the query and stores the results.
A traditional view just stores the query.
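The difference can be demonstrated locally with `sqlite3`. SQLite has no materialized views, so the sketch emulates one with a table built from the query; the table and view names are hypothetical, and Redshift's own `CREATE MATERIALIZED VIEW` is not used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("eu", 10), ("eu", 5), ("us", 7)])

TOTALS_QUERY = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# A traditional view stores only the query; it is re-run on every read.
conn.execute(f"CREATE VIEW totals_view AS {TOTALS_QUERY}")


def refresh_materialized():
    """Emulate a materialized view: run the query once and store its results."""
    conn.execute("DROP TABLE IF EXISTS totals_mv")
    conn.execute(f"CREATE TABLE totals_mv AS {TOTALS_QUERY}")


def totals(name):
    """Read region totals from the view or the emulated materialized view."""
    return dict(conn.execute(f"SELECT region, total FROM {name}"))


refresh_materialized()
# New data arrives after the materialized results were computed.
conn.execute("INSERT INTO sales VALUES ('us', 3)")
```

After the insert, `totals("totals_view")` already reflects the new row, while `totals("totals_mv")` stays stale until `refresh_materialized()` runs again. That staleness is the price of the faster reads.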
What are materialized views good for?
Performance Optimization
How do you keep materialized views in sync automatically?
Set AUTO REFRESH
What does Redshift data sharing do?
It shares the data with other clusters for read purposes.
What are the encryption requirements for Redshift Data Sharing?
Both clusters must be encrypted.
What is a Redshift Lambda UDF?
It allows you to call a Lambda function in your SQL queries.
How do you register a Redshift External Function?
CREATE EXTERNAL FUNCTION
What are Redshift federated queries?
Allows you to access live data in Aurora and RDS using Redshift.
How do you connect to an Aurora / RDS instance for federated queries?
CREATE EXTERNAL SCHEMA
In Redshift, what are SYS views for?
To monitor query and workload usage.
In Redshift, what are STV views for?
They monitor Transient data containing snapshots of current system data.
In Redshift, what are SVV views for?
Metadata about DB objects that reference STV tables.
In Redshift, what are STL views for?
Generated from logs persisted to disk.
In Redshift, what are SVCS views for?
Details about queries on main and concurrency scaling clusters.
In Redshift, what are SVL views for?
Details about queries on main clusters
What is the Redshift Data API?
Secure HTTP endpoint for SQL Statements to Redshift clusters.
Is the Redshift Data API asynchronous?
Yes
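Because the Data API is asynchronous, a caller submits a statement and then polls for its status. The sketch below assumes a client object exposing `execute_statement` and `describe_statement` in the shape of the boto3 `redshift-data` client; `run_statement` and its parameters are hypothetical, and the client is passed in so the sketch runs without AWS access.

```python
import time


def run_statement(client, sql, poll_seconds=0.5, sleep=time.sleep, **target):
    """Submit SQL, poll until it finishes, and return the statement id.

    `target` carries connection details (for example Database or
    ClusterIdentifier) straight through to execute_statement.
    """
    statement_id = client.execute_statement(Sql=sql, **target)["Id"]
    while True:
        status = client.describe_statement(Id=statement_id)["Status"]
        if status == "FINISHED":
            return statement_id
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"statement {statement_id} ended with status {status}")
        # Still submitted, picked up, or running: wait and ask again.
        sleep(poll_seconds)
```

Once the statement has finished, its rows would be fetched in a separate call using the returned id.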