AWS Database Services Flashcards by Jaja Apostol

When to use Database on EC2

Full control over instance and database

Preferred DB not available under RDS

How well did you know this?

Not at all

Perfectly

When to use Amazon RDS

Need traditional relational database for OLTP

Your data is well-formed and structured

Existing applications requiring RDBMS

How well did you know this?

Not at all

Perfectly

When to use Amazon DynamoDB

Name/value pair data

Unpredictable data structure

In-memory performance with persistence

High I/O needs

Require dynamic scaling

How well did you know this?

Not at all

Perfectly

When to use Amazon RedShift

Data warehouse for large volumes of aggregated data

Primarily OLAP workloads

How well did you know this?

Not at all

Perfectly

When to use Amazon Neptune

Relationships between objects are of high value

How well did you know this?

Not at all

Perfectly

When to use Amazon ElastiCache

Fast temporary storage for small amounts of data

Highly volatile data (non-persistent)

How well did you know this?

Not at all

Perfectly

When to use Amazon S3

Binary large objects (BLOBs)

Static websites

How well did you know this?

Not at all

Perfectly

A managed service that makes it easy to set up, operate, and scale a relational database in the cloud

Amazon Relational Database Service (RDS)

How well did you know this?

Not at all

Perfectly

RDS facts

Relational databases are known as Structured Query Language (SQL) databases.

Non-relational databases are known as NoSQL databases.

RDS is an Online Transaction Processing (OLTP) type of database.

Aurora is Amazon’s proprietary database.

RDS is a fully managed service and you do not have access to the underlying EC2 instance (no root access)

How well did you know this?

Not at all

Perfectly

Amazon Relational Database Service (RDS) features and benefits

SQL type of database.

Can be used to perform complex queries and joins.

Easy to setup, highly available, fault tolerant, and scalable.

Used when data is clearly defined.

Common use cases include online stores and banking systems.

How well did you know this?

Not at all

Perfectly

Database engines supported by Amazon RDS

SQL Server.

Oracle.

MySQL Server.

PostgreSQL.

Aurora.

MariaDB.

How well did you know this?

Not at all

Perfectly

Amazon RDS includes:

Security and patching of the DB instances.

Automated backup for the DB instances.

Software updates for the DB engine.

Easy scaling for storage and compute.

Multi-AZ option with synchronous replication.

Automatic failover for Multi-AZ option.

Read replicas option for read heavy workloads.

How well did you know this?

Not at all

Perfectly

A database environment in the cloud with the compute and storage resources you specify

Database Instance

How well did you know this?

Not at all

Perfectly

Amazon RDS encryption

You can encrypt your Amazon RDS instances and snapshots at rest by enabling the encryption option for your Amazon RDS DB instance.

Encryption at rest is supported for all DB types and uses AWS KMS.

You cannot encrypt an existing DB, you need to create a snapshot, copy it, encrypt the copy, then build an encrypted DB from the snapshot.

How well did you know this?

Not at all

Perfectly

A collection of subnets (typically private) that you create in a VPC and that you then designate for your DB instances

Database Subnet Group

How well did you know this?

Not at all

Perfectly

Database Subnet Group facts:

Each DB subnet group should have subnets in at least two Availability Zones in each region.

It is recommended to configure a subnet group with subnets in each AZ (even for standalone instances).

How well did you know this?

Not at all

Perfectly

AWS Charges for RDS:

DB instance hours (partial hours are charged as full hours).

Storage GB/month.
I/O requests/month – for magnetic storage.

Provisioned IOPS/month – for RDS provisioned IOPS SSD.

Egress data transfer.

Backup storage (DB backups and manual snapshots).

How well did you know this?

Not at all

Perfectly

RDS Scalability:

You can only scale RDS up (compute and storage).

You cannot decrease the allocated storage for an RDS instance.

You can scale storage and change the storage type for all DB engines except MS SQL.

How well did you know this?

Not at all

Perfectly

How RDS provides fault tolerance and disaster recovery

Study These Flashcards

RDS provides multi-AZ for disaster recovery which provides fault tolerance across availability zones:

Multi-AZ RDS creates a replica in another AZ and synchronously replicates to it (DR only).

There is an option to choose multi-AZ during the launch wizard.

AWS recommends the use of provisioned IOPS storage for multi-AZ RDS DB instances.

Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable.

You cannot choose which AZ in the region will be chosen to create the standby DB instance.

Read Replicas

Study These Flashcards

Read Replicas – provide improved performance for reads:

Read replicas are used for read heavy DBs and replication is asynchronous.

Read replicas are for workload sharing and offloading.

Read replicas provide read-only DR.

Read replicas are created from a snapshot of the master instance.

Must have automated backups enabled on the primary (retention period > 0).

A fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

Study These Flashcards

Amazon DynamoDB

Dynamo DB features and benefits

Study These Flashcards

NoSQL type of database (non-relational).

Fast, highly available, and fully managed.

Used when data is fluid and can change.

Common use cases include social networks and web analytics.

DynamoDB Facts

Study These Flashcards

Push button scaling means that you can scale the DB at any time without incurring downtime.

SSD based and uses limited indexing on attributes for performance.

DynamoDB is a Web service that uses HTTP over SSL (HTTPS) as a transport and JSON as a message serialization format.

Amazon DynamoDB stores three geographically distributed replicas of each table to enable high availability and data durability.

Data is synchronously replicated across 3 facilities (AZs) in a region.

Provides low read and write latency.

Scale storage and throughput up or down as needed without code changes or downtime.

DynamoDB is schema-less.

DynamoDB can be used for storing session state.

Amazon DynamoDB cross-region replication

Study These Flashcards

Cross-region replication allows you to replicate across regions:

Amazon DynamoDB global tables provides a fully managed solution for deploying a multi-region, multi-master database.

When you create a global table, you specify the AWS regions where you want the table to be available.

DynamoDB performs all the necessary tasks to create identical tables in these regions and propagate ongoing data changes to all of them.

Amazon DynamoDB read models:

Eventually consistent reads (Default): The eventual consistency option maximizes your read throughput (best read performance). An eventually consistent read might not reflect the results of a recently completed write. Consistency across all copies reached within 1 second. Strongly consistent reads: A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read (faster consistency).

A fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.

Amazon DynamoDB Accelerator (DAX)

A fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and existing Business Intelligence (BI) tools

Amazon RedShift

Amazon RedShift Facts

RedShift is a SQL based data warehouse used for analytics applications. RedShift is a relational database that is used for Online Analytics Processing (OLAP) use cases. RedShift is used for running complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution. RedShift is ideal for processing large amounts of data for business intelligence. RedShift is 10x faster than a traditional SQL DB.

RedShift uses columnar data storage:

Data is stored sequentially in columns instead of rows. Columnar based DB is ideal for data warehousing and analytics. Requires fewer I/Os which greatly enhances performance.

RedShift provides advanced compression:

Data is stored sequentially in columns which allows for much better performance and less storage space. RedShift automatically selects the compression scheme.

How does Redshift enhance availability and improve durability

RedShift uses replication and continuous backups to enhance availability and improve durability and can automatically recover from component and node failures.

The 3 copies of data Redshift keeps

The original. A replica on compute nodes (within the cluster). A backup copy on S3.

RedShift provides continuous/incremental backups:

Multiple copies within a cluster. Continuous and incremental backups to S3. Continuous and incremental backups across regions. Streaming restore.

RedShift provides fault tolerance for the following failures:

Disk failures. Nodes failures. Network failures. AZ/region level disasters.

A web service that makes it easy to deploy and run Memcached or Redis protocol-compliant server nodes in the cloud.

Amazon ElastiCache

Amazon ElastiCache Benefits

The in-memory caching provided by ElastiCache can be used to significantly improve latency and throughput for many read-heavy application workloads or compute-intensive workloads. Best for scenarios where the DB load is based on Online Analytics Processing (OLAP) transactions.

ElastiCache Web session Store Use Case

In cases with load-balanced web servers, store web session information in Redis so if a server is lost, the session info is not lost, and another web server can pick it up

ElastiCache Use Cases

Web Session Store Database Caching Leaderboards Streaming data dashboards

ElastiCache Database Caching Use Case

Use Memcached in front of AWS RDS to cache popular queries to offload work from RDS and return results faster to users

ElastiCache Leaderboards Use Case

Use Redis to provide a live leaderboard for millions of users of your mobile app

ElastiCache Streaming data dashboards Use Case

Provide a landing spot for streaming sensor data on the factory floor, providing live real-time dashboard displays

ElastiCache Facts

ElastiCache EC2 nodes cannot be accessed from the Internet, nor can they be accessed by EC2 instances in other VPCs. Can be on-demand or reserved instances too (but not Spot instances). ElastiCache can be used for storing session state.

simplest model, can run large nodes with multiple cores/threads, can be scaled in and out, can cache objects such as DBs.

Memcached ElastiCache Engine

complex model, supports encryption, master / slave replication, cross AZ (HA), automatic failover and backup/restore.

Redis ElastiCache Engine

A web service that enables businesses, researchers, data analysts, and developers to process vast amounts of data easily and cost-effectively.

Amazon EMR

Amazon EMR Features

EMR utilizes a hosted Hadoop framework running on Amazon EC2 and Amazon S3. Managed Hadoop framework for processing huge amounts of data. Also support Apache Spark, HBase, Presto and Flink.

Amazon EMR Use Cases

Most commonly used for log analysis, financial analysis, or extract, translate and loading (ETL) activities.

AWS Database Services Flashcards

(47 cards)