6_DynamoDB, Redshift, Elasticache, Aurora Flashcards

1
Q

DynamoDB vs RDS

DynamoDB offers “push button” scaling, meaning that you can scale your database on the fly, without any downtime.

RDS is not so easy and you usually have to use a bigger instance size (scale up) or to add a read replica.

A

DynamoDB vs RDS

DynamoDB offers “push button” scaling, meaning that you can scale your database on the fly, without any downtime.

RDS is not so easy and you usually have to use a bigger instance size (scale up) or to add a read replica.
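
As a rough illustration of "push button" scaling, the boto3 sketch below bumps a DynamoDB table's provisioned throughput in place; the table name and capacity values are placeholders.

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Scale an existing table's provisioned throughput on the fly;
# no downtime and no instance resizing is involved.
dynamodb.update_table(
    TableName="my-example-table",  # placeholder table name
    ProvisionedThroughput={
        "ReadCapacityUnits": 100,
        "WriteCapacityUnits": 50,
    },
)
```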

2
Q

DynamoDB

  • Stored exclusively on SSD storage to provide high I/O performance
  • Spread across 3 geographically distinct data centres
  • Eventually Consistent Reads (default)
    • Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data (Best read performance)
  • Strongly Consistent Reads
    • A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read
A

DynamoDB

  • Stored on SSD storage
  • Spread across 3 geographically distinct data centres
  • Eventually Consistent Reads (default)
    • Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data (Best read performance)
  • Strongly Consistent Reads
    • A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read
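
A minimal boto3 sketch of the two read modes, assuming a hypothetical table keyed on `id`; the only difference is the `ConsistentRead` flag.

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")
key = {"id": {"S": "item-123"}}  # placeholder key for a hypothetical table

# Eventually consistent read (default): best read performance,
# but may briefly return stale data after a write.
eventual = dynamodb.get_item(TableName="my-example-table", Key=key)

# Strongly consistent read: reflects all writes acknowledged before the read,
# at the cost of higher read capacity usage.
strong = dynamodb.get_item(
    TableName="my-example-table", Key=key, ConsistentRead=True
)
```
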
3
Q

DynamoDB Accelerator (DAX) [SAA-C02]

  • Fully managed, highly available, in-memory cache
  • 10x performance improvement
  • Reduces request time from milliseconds to microseconds - even under load
  • No need for developers to manage caching logic
  • Compatible with DynamoDB API calls
A
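
Because DAX is compatible with the DynamoDB API, swapping the client is usually the only code change. A rough sketch assuming the `amazon-dax-client` Python package and a placeholder cluster endpoint:

```python
import botocore.session
from amazondax import AmazonDaxClient

session = botocore.session.get_session()

# Point the client at the DAX cluster endpoint (placeholder) instead of
# DynamoDB itself; reads are served from the in-memory cache when possible.
dax = AmazonDaxClient(
    session,
    region_name="us-east-1",
    endpoints=["my-dax-cluster.xxxx.dax-clusters.us-east-1.amazonaws.com:8111"],
)

# Same DynamoDB API call as before - no separate caching logic to manage.
item = dax.get_item(
    TableName="my-example-table",
    Key={"id": {"S": "item-123"}},
)
```
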
4
Q

DynamoDB Backup and Restore [SAA-C02]

Point-in-Time Recovery (PITR)

  • Protects against accidental writes or deletes
  • Restore to any point in the last 35 days
  • Incremental backups
  • Not enabled by default
  • Latest restorable: five minutes in the past
A
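
Since PITR is not enabled by default, it has to be switched on per table. A minimal boto3 sketch with placeholder table names:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Enable point-in-time recovery on an existing table (off by default).
dynamodb.update_continuous_backups(
    TableName="my-example-table",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# Later, restore to the latest restorable time (roughly five minutes ago)
# into a new table; you can also pass RestoreDateTime for an exact point.
dynamodb.restore_table_to_point_in_time(
    SourceTableName="my-example-table",
    TargetTableName="my-example-table-restored",
    UseLatestRestorableTime=True,
)
```
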
5
Q

DynamoDB Streams [SAA-C02]

A DynamoDB stream is an ordered flow of information about changes to items in an Amazon DynamoDB table. When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table.

When you enable DynamoDB Streams on a table, you can associate the stream ARN with a Lambda function that you write. Immediately after an item in the table is modified, a new record appears in the table’s stream. AWS Lambda polls the stream and invokes your Lambda function synchronously when it detects new stream records.

  • Time-ordered sequence of item-level changes in a table
  • Stored for 24 hours
  • Inserts, updates and deletes
A
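
A sketch of the Lambda side: AWS Lambda polls the stream and hands batches of records to a handler like the one below (table and attribute names are hypothetical).

```python
# Lambda handler invoked with batches of DynamoDB stream records.
def handler(event, context):
    for record in event["Records"]:
        event_name = record["eventName"]                 # INSERT, MODIFY or REMOVE
        keys = record["dynamodb"]["Keys"]                # item's primary key
        new_image = record["dynamodb"].get("NewImage")   # absent on REMOVE

        if event_name == "INSERT":
            print(f"New item: {keys} -> {new_image}")
        elif event_name == "MODIFY":
            print(f"Updated item: {keys}")
        elif event_name == "REMOVE":
            print(f"Deleted item: {keys}")
```
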
6
Q

DynamoDB - Global Tables [SAA-C02]

  • Globally distributed applications
  • Based on DynamoDB streams
  • Multi-region redundancy for DR or HA
  • No application rewrites
  • Replication latency under one second
A
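
Because global tables are built on DynamoDB Streams, each regional table needs streams enabled before it can join the group. A rough boto3 sketch using the original `create_global_table` API, with placeholder names and regions:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Assumes identically named tables with streams (NEW_AND_OLD_IMAGES)
# already exist in both regions.
dynamodb.create_global_table(
    GlobalTableName="my-example-table",
    ReplicationGroup=[
        {"RegionName": "us-east-1"},
        {"RegionName": "eu-west-1"},
    ],
)
```
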
7
Q

Redshift

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of the cost of most other data warehousing solutions.

  • Redshift is used for Business Intelligence
  • Available in only 1 AZ
A

Redshift

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of the cost of most other data warehousing solutions.

8
Q

Redshift Configuration

  • Single Node (up to 160 GB)
  • Multi-Node
    • Leader Node (manages client connections and receives queries)
    • Compute Nodes (store data and perform queries and computations). Up to 128 Compute Nodes
A

Redshift Configuration

  • Single Node (up to 160 GB)
  • Multi-Node
    • Leader Node (manages client connections and receives queries)
    • Compute Nodes (store data and perform queries and computations). Up to 128 Compute Nodes
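
A rough boto3 sketch of provisioning a multi-node cluster (one leader node plus the requested compute nodes); identifiers and credentials are placeholders.

```python
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Multi-node cluster: Redshift adds a leader node automatically and
# provisions the requested number of compute nodes.
redshift.create_cluster(
    ClusterIdentifier="my-example-cluster",
    ClusterType="multi-node",
    NodeType="dc2.large",
    NumberOfNodes=4,               # compute nodes (up to 128)
    MasterUsername="awsuser",
    MasterUserPassword="ExamplePassw0rd!",  # placeholder
    DBName="analytics",
)
```
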
9
Q

Columnar Data Storage

Instead of storing data as a series of rows, Amazon Redshift organizes the data by column. Unlike row-based systems, which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets. Since only the columns involved in the queries are processed and columnar data is stored sequentially on the storage media, column-based systems require far fewer I/Os, greatly improving query performance.

A

Columnar Data Storage

Instead of storing data as a series of rows, Amazon Redshift organizes the data by column. Unlike row-based systems, which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets. Since only the columns involved in the queries are processed and columnar data is stored sequentially on the storage media, column-based systems require far fewer I/Os, greatly improving query performance.
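
A toy, pure-Python illustration of the idea (not how Redshift is implemented): with a column-oriented layout, an aggregate only touches the one column it needs.

```python
# Row-oriented layout: every row must be read to reach one field.
rows = [
    {"order_id": 1, "region": "EU", "amount": 10.0},
    {"order_id": 2, "region": "US", "amount": 25.0},
    {"order_id": 3, "region": "EU", "amount": 40.0},
]
total_row_store = sum(row["amount"] for row in rows)

# Column-oriented layout: the aggregate scans only the "amount" column,
# which is stored sequentially - far fewer I/Os for warehouse-style queries.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [10.0, 25.0, 40.0],
}
total_column_store = sum(columns["amount"])

assert total_row_store == total_column_store == 75.0
```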

10
Q

Advanced Compression

Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk. Amazon Redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores. In addition, Amazon Redshift doesn’t require indexes or materialized views and so uses less space than traditional relational database systems.

When loading data into an empty table, Amazon Redshift automatically samples your data and selects the most appropriate compression scheme.

A

Advanced Compression

Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk. Amazon Redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores. In addition, Amazon Redshift doesn’t require indexes or materialized views and so uses less space than traditional relational database systems.

When loading data into an empty table, Amazon Redshift automatically samples your data and selects the most appropriate compression scheme.

11
Q

Massively Parallel Processing (MPP)

Amazon Redshift automatically distributes data and query load across all nodes.
Amazon Redshift makes it easy to add nodes to your data warehouse and enables you to maintain fast query performance as your data warehouse grows.

A

Massively Parallel Processing (MPP)

Amazon Redshift automatically distributes data and query load across all nodes.
Amazon Redshift makes it easy to add nodes to your data warehouse and enables you to maintain fast query performance as your data warehouse grows.

12
Q

Redshift - Backups

  • Enabled by default with a 1 day retention period.
  • Maximum retention period is 35 days.
  • Redshift always attempts to maintain at least three copies of your data (the original and replica on the compute nodes and a backup in Amazon S3).
  • Redshift can also asynchronously replicate your snapshots to S3 in another region for disaster recovery.
A
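
A minimal boto3 sketch of two backup-related actions mentioned above: taking a manual snapshot and enabling asynchronous cross-region snapshot copy (identifiers and region are placeholders).

```python
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Manual snapshot on top of the automated (default 1-day retention) backups.
redshift.create_cluster_snapshot(
    SnapshotIdentifier="my-example-cluster-manual-snap",
    ClusterIdentifier="my-example-cluster",
)

# Asynchronously copy snapshots to another region for disaster recovery.
redshift.enable_snapshot_copy(
    ClusterIdentifier="my-example-cluster",
    DestinationRegion="eu-west-1",
    RetentionPeriod=7,  # days to keep copied automated snapshots
)
```
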
13
Q

Aurora Scaling

  • Starts with 10 GB and scales in 10 GB increments up to 64 TB (Storage Autoscaling)
  • Compute resources can scale up to 32 vCPUs and 244 GB of memory
  • 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability Zones, giving 6 copies of your data
  • You can share Aurora Snapshots with other AWS accounts.
  • Use Aurora Serverless if you want a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.
A

Aurora Scaling

  • Starts with 10 GB and scales in 10 GB increments up to 64 TB (Storage Autoscaling)
  • Compute resources can scale up to 32 vCPUs and 244 GB of memory
  • 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability Zones, giving 6 copies of your data
  • Aurora is designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability
  • Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically
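
As a sketch of the Aurora Serverless option mentioned above (placeholder identifiers; Serverless v1-style scaling configuration assumed):

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Aurora Serverless cluster: capacity scales automatically between the
# configured ACU bounds and can pause when idle - suited to infrequent,
# intermittent or unpredictable workloads.
rds.create_db_cluster(
    DBClusterIdentifier="my-example-serverless-cluster",
    Engine="aurora-mysql",
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="ExamplePassw0rd!",  # placeholder
    ScalingConfiguration={
        "MinCapacity": 1,
        "MaxCapacity": 8,
        "AutoPause": True,
        "SecondsUntilAutoPause": 300,
    },
)
```
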
14
Q

Aurora Replicas

3 Types of Replicas are available:

  • Aurora Replicas (currently 15)
  • MySQL Read Replicas (currently 5)
  • PostgreSQL Replicas

Automated failover is only available with Aurora Replicas.

A

Aurora Replicas

3 Types of Replicas are available:

  • Aurora Replicas (currently 15)
  • MySQL Read Replicas (currently 5)
  • PostgreSQL Replicas

Automated failover is only available with Aurora Replicas.
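
Aurora Replicas are added by creating additional DB instances inside the existing cluster; a rough boto3 sketch with placeholder identifiers:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Adding an instance to an existing Aurora cluster that already has a
# writer creates an Aurora Replica (up to 15 per cluster); these are the
# replicas that support automated failover.
rds.create_db_instance(
    DBInstanceIdentifier="my-example-aurora-replica-1",
    DBInstanceClass="db.r5.large",
    Engine="aurora-mysql",
    DBClusterIdentifier="my-example-aurora-cluster",
)
```
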
15
Q

Aurora - Additional Tips

  • Aurora has automated backup turned on by default. You can also take Snapshots with Aurora.
  • You can share Aurora Snapshots with other AWS accounts.
A
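
Sharing a manual cluster snapshot with another account is a one-call change to the snapshot's restore attribute; a boto3 sketch with placeholder identifiers and account ID:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Take a manual snapshot of the Aurora cluster.
rds.create_db_cluster_snapshot(
    DBClusterSnapshotIdentifier="my-example-cluster-snap",
    DBClusterIdentifier="my-example-aurora-cluster",
)

# Share it with another AWS account (placeholder account ID); that account
# can then restore its own cluster from the snapshot.
rds.modify_db_cluster_snapshot_attribute(
    DBClusterSnapshotIdentifier="my-example-cluster-snap",
    AttributeName="restore",
    ValuesToAdd=["123456789012"],
)
```
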
16
Q

Elasticache

Elasticache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.

Caching improves application performance by storing critical pieces of data in memory for low-latency access. Cached information may include the results of I/O-intensive database queries or the results of computationally-intensive calculations.

Use Elasticache to increase database and web application performance.

A

Elasticache

Elasticache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.

Caching improves application performance by storing critical pieces of data in memory for low-latency access. Cached information may include the results of I/O-intensive database queries or the results of computationally-intensive calculations.
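
A common pattern here is lazy loading: check the cache first and only hit the database on a miss. A sketch using the redis-py client; the endpoint, TTL and `query_database` helper are hypothetical.

```python
import json
import redis

# Placeholder ElastiCache (Redis) endpoint.
cache = redis.Redis(host="my-cache.xxxxxx.0001.use1.cache.amazonaws.com", port=6379)


def query_database(user_id):
    """Hypothetical stand-in for a slow, disk-based database query."""
    return {"id": user_id, "name": "example"}


def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit: low-latency, in-memory

    user = query_database(user_id)           # cache miss: fall back to the DB
    cache.setex(key, 300, json.dumps(user))  # store with a 5-minute TTL
    return user
```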

17
Q

Types of Elasticache

  • Memcached
    • A widely adopted memory object caching system. Elasticache is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service.
  • Redis
    • A popular open-source in-memory key-value store that supports data structures such as sorted sets and lists. Elasticache supports Master/Slave replication and Multi-AZ which can be used to achieve cross AZ redundancy.
      • Redis is multi-AZ
      • You can do backups and restores of Redis
A

Types of Elasticache

  • Memcached
    • A widely adopted memory object caching system. Elasticache is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service.
  • Redis
    • A popular open-source in-memory key-value store that supports data structures such as sorted sets and lists. Elasticache supports Master/Slave replication and Multi-AZ which can be used to achieve cross AZ redundancy.
18
Q

Elasticache Exam Tips

Typically you will be given a scenario where a particular database is under a lot of stress/load. You may be asked which service you should use to alleviate this.

Elasticache is a good choice if your database is particularly read-heavy and not prone to frequent changes (use Read Replicas instead when the main DB receives more writes/updates).

Redshift is a good answer if the reason your database is under stress is that management keeps running OLAP transactions (analytics queries) on it.

A

Elasticache Exam Tips

Typically you will be given a scenario where a particular database is under a lot of stress/load. You may be asked which service you should use to alleviate this.

Elasticache is a good choice if your database is particularly read-heavy and not prone to frequent changes (use Read Replicas instead when the main DB receives more writes/updates).

Redshift is a good answer if the reason your database is under stress is that management keeps running OLAP transactions (analytics queries) on it.

19
Q

DMS [SAA-C02]

  • DMS allows you to migrate databases from one source to AWS
  • The source can either be on-premises, inside AWS itself, or another cloud provider such as Azure
  • You can do homogeneous migrations (same DB engines) or heterogeneous migrations (different DB engines)
  • If you do a heterogeneous migration, you will need the AWS Schema Conversion Tool (SCT)
A
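
A rough boto3 sketch of kicking off a migration task once the source endpoint, target endpoint and replication instance already exist (all ARNs below are placeholders):

```python
import json

import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Full load plus ongoing change data capture from source to target.
dms.create_replication_task(
    ReplicationTaskIdentifier="my-example-migration",
    SourceEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:SRC",       # placeholder
    TargetEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:TGT",       # placeholder
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:INSTANCE",  # placeholder
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)
```
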
20
Q

Caching Strategies on AWS [SAA-C02]

Caching is a balancing act between up-to-date, accurate information and latency. We can use the following services to cache on AWS:

  • CloudFront
  • API Gateway
  • ElastiCache - Memcached and Redis
  • DynamoDB Accelerator (DAX)
A
21
Q

EMR [SAA-C02]

  • EMR is used for big data processing
  • Consists of a master node, a core node, and optionally a task node
  • By default, log data is stored on the master node
  • You can configure replication to S3 at five-minute intervals for all log data from the master node; however, this can only be configured when creating the cluster for the first time
A
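
Because S3 log replication can only be set up at cluster creation, the LogUri has to be passed when the cluster is launched; a rough boto3 sketch with placeholder names and bucket:

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# LogUri must be supplied here - log shipping to S3 (roughly every five
# minutes from the master node) cannot be enabled after creation.
emr.run_job_flow(
    Name="my-example-cluster",
    ReleaseLabel="emr-6.9.0",
    LogUri="s3://my-example-bucket/emr-logs/",   # placeholder bucket
    Instances={
        "MasterInstanceType": "m5.xlarge",       # master node
        "SlaveInstanceType": "m5.xlarge",        # core (and task) nodes
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```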