Databases and Analytics Flashcards
RDS is for Online Transaction Processing (OLTP) workloads that _____ lots of _____ transactions
RDS is for Online Transaction Processing (OLTP) workloads that process lots of small transactions
Use cases for Online Transaction Processing (OLTP) workloads would be:
_______ orders
_______ transactions
Payments
_______ systems
Use cases for Online Transaction Processing (OLTP) workloads would be:
Customer orders
Banking transactions
Payments
Booking systems
Redshift is a ________ database that is used for Online ________ Processing (OLAP) or data ___________
Redshift is a relational database that is used for Online Analytical Processing (OLAP) or data warehousing
Online Analytical Processing (OLAP) is ideal for tasks like analyzing large amounts of data, _________, and sales ___________
Online Analytical Processing (OLAP) is ideal for tasks like analyzing large amounts of data, reporting, and sales forecasting
Amazon RDS Read Replicas enable you to create one or more ____-____ _______ of your _______ instance within the _____ AWS Region or in a _________ AWS Region
Amazon RDS Read Replicas enable you to create one or more read-only copies of your database instance within the same AWS Region or in a different AWS Region
Read replicas are primarily used for _______ and improving performance, not ________ ________
Read replicas are primarily used for scaling and improving performance, not disaster recovery
Automatic ______ must be enabled in order to deploy a read ______
Automatic backups must be enabled in order to deploy a read replica
With regard to databases, Multi-AZ is used only for disaster recovery. In the event of a failure, RDS will automatically ________ to the _______ instance
With regard to databases, Multi-AZ is used only for disaster recovery. In the event of a failure, RDS will automatically fail over to the standby instance
Scenario question: Your database is bottlenecking. How can you get around it?
Create Read Replicas
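Sketch: how an application might use read replicas to relieve a bottleneck, by sending writes to the primary and spreading reads across the replicas. This is a minimal illustration only. The class name, the endpoint strings, and the "starts with SELECT" rule are assumptions made for the example, not part of RDS.

```python
from itertools import cycle


class ReplicaRouter:
    """Route writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary for reads when no replicas exist.
        self._read_targets = cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Hypothetical rule: only statements starting with SELECT count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_targets) if is_read else self.primary


# Hypothetical endpoint names, for illustration only.
router = ReplicaRouter("primary.example", ["replica-1.example", "replica-2.example"])
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Reads rotate through the replicas in round-robin order, so adding a replica adds read capacity without touching the write path.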
Aurora is Amazon’s proprietary database that is compatible with ______, as well as __________.
Aurora is Amazon’s proprietary database that is compatible with MySQL, as well as PostgreSQL
Amazon Aurora is very redundant. You always have __ copies of your data in each ___, with a minimum of __ AZs, giving you a total of __ copies
Amazon Aurora is very redundant. You always have 2 copies of your data in each AZ, with a minimum of 3 AZs, giving you a total of 6 copies
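Sketch: the arithmetic behind the Aurora card, written out as a tiny helper. The constant and function names are made up for this example.

```python
COPIES_PER_AZ = 2  # copies of the data kept in each Availability Zone
MIN_AZS = 3        # minimum number of Availability Zones


def total_copies(copies_per_az=COPIES_PER_AZ, azs=MIN_AZS):
    """Total copies = copies per AZ multiplied by the number of AZs."""
    return copies_per_az * azs


print(total_copies())  # 2 * 3 = 6
```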
You can take ________ with Aurora and share them with other AWS ________
You can take snapshots with Aurora and share them with other AWS accounts
DynamoDB is spread across __ geographically ______ data centers to ensure _________
DynamoDB is spread across 3 geographically distinct data centers to ensure resiliency
With DynamoDB you get eventually ________ reads by default
With DynamoDB you get eventually consistent reads by default
What are the three read options you get with DynamoDB?
eventual consistency,
strong consistency,
transactional.
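Sketch: what the three read options could look like as request parameters. The helper below only builds plain dictionaries shaped like the DynamoDB GetItem and TransactGetItems requests; it does not call AWS. The function name, the table name "Orders", and the key attribute "OrderId" are assumptions for illustration.

```python
def read_request(table, key, mode="eventual"):
    """Build request parameters for a DynamoDB read in the given mode."""
    if mode == "eventual":
        # Eventually consistent reads are the default for GetItem.
        return {"TableName": table, "Key": key, "ConsistentRead": False}
    if mode == "strong":
        # ConsistentRead=True asks for a strongly consistent read.
        return {"TableName": table, "Key": key, "ConsistentRead": True}
    if mode == "transactional":
        # Shape of a TransactGetItems request holding a single Get.
        return {"TransactItems": [{"Get": {"TableName": table, "Key": key}}]}
    raise ValueError(f"unknown read mode: {mode}")


key = {"OrderId": {"S": "order-1"}}  # hypothetical key
print(read_request("Orders", key, "strong"))
```

With an SDK client, the first two dictionaries would typically be passed to a GetItem call and the third to a TransactGetItems call.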
The acronym ACID refers to the four key properties of a transaction:
Atomicity,
Consistency,
Isolation,
Durability
Scenario questions that mention ACID requirements should make you think of _________ transactions
Scenario questions that mention ACID requirements should make you think of DynamoDB transactions
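Sketch: what atomicity (the "A" in ACID) means in practice. This is an in-memory simulation, not DynamoDB code: every operation in a batch is applied to a staged copy, and the real data changes only if all of them succeed. All names here are hypothetical.

```python
import copy


def apply_atomically(table, operations):
    """Apply every operation to a copy; commit only if all of them succeed."""
    staged = copy.deepcopy(table)
    for op in operations:
        op(staged)  # any exception aborts the whole batch before the commit
    table.clear()
    table.update(staged)


def debit(account, amount):
    def op(t):
        if t[account] < amount:
            raise ValueError("insufficient funds")
        t[account] -= amount
    return op


def credit(account, amount):
    def op(t):
        t[account] += amount
    return op


balances = {"alice": 100, "bob": 20}
try:
    # The credit succeeds on the staged copy, the debit fails, nothing is committed.
    apply_atomically(balances, [credit("bob", 500), debit("alice", 500)])
except ValueError:
    pass
print(balances)  # unchanged: {'alice': 100, 'bob': 20}
```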
What effect does DynamoDB On-Demand Backup have on the performance or availability of your tables?
____ impact on performance while offering ____ backups
Zero impact on performance while offering full backups
In which region are DynamoDB On-Demand Backups retained?
Same region as the source table
DynamoDB Point-in-Time Recovery protects against accidental ______ or _______
DynamoDB Point-in-Time Recovery protects against accidental writes or deletes
When using DynamoDB Point-in-Time Recovery what is the last restorable point in the past?
5 minutes in the past.
Streams are time-ordered sequences of ____ level changes in a ____.
Streams are time-ordered sequences of item level changes in a table.
Every shard in a stream is stored for how long?
24 hours
A database on an EC2 instance is ideal if you need either of these (2):
____ _______ over instance and database or have a _____-_____ database engine (not available in RDS)
full control over instance and database or have a third-party database engine (not available in RDS)
Amazon RDS is ideal if you have data that is well-______ and ________
Amazon RDS is ideal if you have data that is well-formed and structured
DynamoDB 4 main features:
NoSQL
High I/O needs
Dynamic Scaling
In-memory performance
Data Warehouse for large volumes of aggregated data
Amazon Redshift
Amazon ElastiCache is fast _________ storage for _____ amounts of data.
Amazon ElastiCache is an in-_______ store.
Amazon ElastiCache is fast temporary storage for small amounts of data.
Amazon ElastiCache is an in-memory store.
Amazon EMR is for ________ workloads using the ________ framework
Amazon EMR is for analytics workloads using the Hadoop framework
What are the 6 Database engines that Amazon RDS supports?
_____ Server
______ SQL
___ Server
_____
_____
______
MySQL Server
PostgreSQL
SQL Server
Aurora
Oracle
MariaDB
Amazon RDS has a ______ maintenance schedule by default but you can choose your own. OS & DB ________ is what happens during the weekly maintenance window
Amazon RDS has a weekly maintenance schedule by default but you can choose your own. OS & DB patching is what happens during the weekly maintenance window
When is the only time you can enable encryption on an AWS RDS DB instance?
When you create it
Once encryption is enabled on a DB instance, it cannot be disabled
You cannot have an ________ read replica of an _________ DB instance
You cannot have an encrypted read replica of an unencrypted DB instance
You cannot restore an unencrypted _______ or _______ to an encrypted __ instance
You cannot restore an unencrypted backup or snapshot to an encrypted DB instance
Read replicas of encrypted primary instances are ________
Read replicas of encrypted primary instances are encrypted
Amazon Aurora is up to five times faster than standard _____ databases and three times faster than standard _________ databases
Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases
Aurora Serverless Use Cases (5)
New or ___________ used applications
_____-tenant applications
___________ workloads
_______ workloads
___________ and test databases
New or infrequently used applications
Multi-tenant applications
Unpredictable workloads
Variable workloads
Development and test databases
Amazon ElastiCache is a fully managed caching service for _____ and _________
Amazon ElastiCache is a fully managed caching service for Redis and Memcached
Amazon ElastiCache is a ____/____ store
Amazon ElastiCache is a key/value store
Amazon ElastiCache can be put in front of databases such as ____ and _________
Amazon ElastiCache can be put in front of databases such as RDS and DynamoDB
ElastiCache nodes run on ___ _________ so you must choose an ________ ______ type
ElastiCache nodes run on EC2 instances so you must choose an instance family type
Between Memcached and Redis, which offers Data Persistence?
Redis offers Data Persistence
Memcached does not
Memcached will place nodes in multiple AZs, but you will not get ________ or _________
Memcached will place nodes in multiple AZs, but you will not get failover or replication
You can use ElastiCache for caching, which accelerates ___________ and ________ performance
You can use ElastiCache for caching, which accelerates application and database performance
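Sketch: the cache-aside pattern that usually sits behind this card. Check the cache first, and only on a miss read from the database and store the result. Plain dictionaries stand in for both the ElastiCache key/value store and the database, and the class name is an assumption for this example.

```python
class CacheAside:
    """Check the cache first; on a miss, read the database and cache the result."""

    def __init__(self, database):
        self.database = database  # stand-in for RDS or DynamoDB
        self.cache = {}           # stand-in for an ElastiCache key/value store
        self.db_reads = 0         # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value


store = CacheAside({"user:1": "Ada"})
store.get("user:1")
store.get("user:1")
print(store.db_reads)  # 1: the second read is served from the cache
```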
Amazon ElastiCache can also be a primary data store for use cases that don’t require durability like gaming ________, ________, and _______
Amazon ElastiCache can also be a primary data store for use cases that don’t require durability like gaming leaderboards, streaming, and analytics.
You can restore your DynamoDB database backup to any point in the last ___ days. The backups are __________. This feature is not _________ by _________.
You can restore your DynamoDB database backup to any point in the last 35 days. The backups are incremental. This feature is not enabled by default.
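Sketch: the restorable window implied by the Point-in-Time Recovery cards, from 35 days ago up to 5 minutes ago. It assumes Point-in-Time Recovery has been enabled for at least the full 35 days. The function and constant names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=35)     # oldest restorable point
LATEST_LAG = timedelta(minutes=5)  # newest restorable point is 5 minutes ago


def restorable_window(now):
    """Return (earliest, latest) restorable timestamps relative to now."""
    return now - RETENTION, now - LATEST_LAG


def can_restore_to(target, now):
    earliest, latest = restorable_window(now)
    return earliest <= target <= latest


now = datetime.now(timezone.utc)
print(can_restore_to(now - timedelta(hours=1), now))    # inside the window
print(can_restore_to(now - timedelta(minutes=1), now))  # too recent
```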
DynamoDB Accelerator is an __-_______ cache that increases __________ (microsecond latency)
DynamoDB Accelerator (DAX) is an in-memory cache that increases performance (microsecond latency)
What type of backups and copies of your data does Redshift offer?
Redshift offers continuous _________ backups
And always keeps ______ copies of your data
continuous incremental backups
Always keeps three copies of your data
Amazon EMR can be used for ___________ and ______ large amounts of ____
Amazon EMR can be used for transforming and moving large amounts of data
Kinesis Data Streams enables real-time processing of ________ ___ ____
Kinesis Data Streams enables real-time processing of streaming big data
The Kinesis Client Library helps you ______ and ______ data from a Kinesis data stream
The Kinesis Client Library helps you consume and process data from a Kinesis data stream
Kinesis Data Firehose _____, _____, and loads streaming data
Kinesis Data Firehose captures, transforms, and loads streaming data
With Kinesis Data Firehose there are no ____, everything is ______
With Kinesis Data Firehose there are no Shards, everything is automated
Between Memcached and Redis, which offers encryption?
Redis offers encryption
Memcached does not
Kinesis Data Firehose enables ____ real-time _______ with existing business intelligence tools and dashboards
Kinesis Data Firehose enables near real-time analytics with existing business intelligence tools and dashboards
Kinesis Data Firehose possible destinations:
S3
_____
Data___
HTTP ______
Mongo___
S3
Splunk
Datadog
HTTP Endpoint
MongoDB
Kinesis Data Analytics provides ____-____ SQL processing for streaming data
Kinesis Data Analytics provides real-time SQL processing for streaming data
Kinesis Data Analytics destination can be
Kinesis ________
Kinesis _________
Lambda
Kinesis Data Streams
Kinesis Data Firehose
Lambda
Amazon Athena is used for _______ ____ in S3 using SQL
Amazon Athena is used for querying data in S3 using SQL
You would connect Amazon Athena to data sources other than S3 by using _____.
You would connect Amazon Athena to data sources other than S3 by using Lambda
The methods that can be used to optimize Amazon Athena include:
_______ your data
______ your data
________ your data
Partition your data
Bucket your data
Compress your data
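Sketch: one common way to partition data for Athena is to lay out S3 objects under Hive-style key=value prefixes, so a query that filters on those keys can skip the other prefixes. The bucket name, the year/month/day scheme, and the function name are assumptions for illustration.

```python
from datetime import date


def partition_prefix(base, day):
    """Return a Hive-style partition prefix (key=value path segments)."""
    return f"{base.rstrip('/')}/year={day.year}/month={day.month:02d}/day={day.day:02d}/"


# Hypothetical bucket and table path.
print(partition_prefix("s3://example-bucket/logs", date(2024, 1, 5)))
# s3://example-bucket/logs/year=2024/month=01/day=05/
```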
AWS Glue is a fully managed extract, _______, and ____ service that is used for preparing data for analytics.
AWS Glue is a fully managed extract, transform and load service that is used for preparing data for analytics.
AWS Glue discovers data and stores the associated metadata in the AWS ____ ____ Catalog
AWS Glue discovers data and stores the associated metadata in the AWS Glue Data Catalog
You can use a _______ to populate the AWS Glue Data Catalog with _______
You can use a crawler to populate the AWS Glue Data Catalog with tables
Between Memcached and Redis, which offers Multithreading?
Memcached offers Multithreading
Redis does not
A crawler can crawl multiple data stores in a ______ ___
A crawler can crawl multiple data stores in a single run
A real-time solution to process or move data is called _____.
A real-time solution to process or move data is called Kinesis
Simple Queue Service (SQS) and Kinesis can both be queues, but each has its pros and cons:
SQS is _____ and _____
Kinesis is _____ and can store data for up to a ____
SQS is easier and simpler
Kinesis is faster and can store data for up to a year
Anytime Serverless SQL comes up on the test think of _______ _______.
Anytime Serverless SQL comes up on the test think of Amazon Athena
QuickSight is a service that is used for _______ the data in a dashboard or _____
QuickSight is a service that is used for visualizing the data in a dashboard or graph
The acronym DAX represents __________ ___________
The acronym DAX represents DynamoDB Accelerator
An Aurora global database consists of ___ primary AWS Region where your data is mastered and up to ____ read-only, _________ AWS Regions
An Aurora global database consists of one primary AWS Region where your data is mastered and up to five read-only, secondary AWS Regions
Amazon Athena is an interactive _____ service that makes it easy to _______ data in Amazon __ using standard SQL.
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
Athena is serverless, so there is no infrastructure to manage, and you pay only for the ______ that you run.
Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.