Databases and Analytics Flashcards by Christine Darcy

Which Type of Database has a rigid schema (SQL) and is scaled vertically?

Relational Database
Non-Relational Database

Relational Database vs Non-Relational Database

Key difference between Relational and Non-relational is how data is MANAGED and how data is STORED

Relational Database

-Organized by tables, rows and columns
-Rigid schema (SQL)
-Rules enforced within database
-Typically scaled vertically
-Support complex queries and joins
-Amazon RDS, Oracle, MySQL, IBM DB2, PostgreSQL
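
The points above (a fixed schema, rules enforced inside the database, and joins) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for any relational engine; the customers/orders tables and their columns are made up for illustration.

```python
import sqlite3

# In-memory SQLite database stands in for any relational engine.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL CHECK (total >= 0)
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 25.0)")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 17.5)")

# Rule enforced inside the database: an order for a missing customer is rejected.
try:
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A join across the two tables, with aggregation per customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
""").fetchall()
```

The database itself refuses the bad order, so the application does not have to check that rule.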

Which Type of Database allows varied data storage models, has a flexible schema w/ data stored in key value pairs, columns, documents or graphs and scales horizontally?

Relational Database
Non-Relational Database

Non-Relational Database

Key difference between Relational and Non-relational is how data is MANAGED and how data is STORED

-Varied data storage models
-Flexible schema (NoSQL) - data stored in key-value pairs, columns, documents or graphs
-Rules can be defined in application code (outside database)
-Scales horizontally
-Unstructured, simple language that supports any kind of schema
-Amazon DynamoDB, MongoDB (documents), Redis, Neo4j
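
As a rough sketch of "flexible schema" and "rules defined in application code", a plain Python dict can stand in for a key-value/document store. The put_item helper and the rule that every document needs a name are hypothetical, chosen only for illustration.

```python
# A dict keyed by item id stands in for a key-value/document store.
store = {}

def put_item(key, document):
    # The rule lives in application code, not in the database.
    if "name" not in document:
        raise ValueError("every document needs a 'name'")
    store[key] = document

# Two documents with different shapes live side by side.
put_item("user#1", {"name": "Ada", "email": "ada@example.com"})
put_item("user#2", {"name": "Lin", "tags": ["admin"], "address": {"city": "Oslo"}})

try:
    put_item("user#3", {"email": "nobody@example.com"})
    rejected = False
except ValueError:
    rejected = True
```

The store accepts any document shape; only the application's own check stops the third write.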

Which Type of Database is used for Online Transaction Processing (OLTP) and is best for short transactions and simple queries?

Operational/transactional Database
Analytical Database

Operational/transactional Database

Key differences are USE CASES and how the database is OPTIMIZED

-Online Transaction Processing (OLTP)
-Production DBs that process transactions
——ie: adding customer records, checking stock availability (INSERT, UPDATE, DELETE)
-Short transactions and simple queries

Relational examples:
—->Amazon RDS, Oracle, IBM DB2, MySQL

Non-relational examples:
–>MongoDB, Cassandra, Neo4j, HBase, Amazon DynamoDB

Which Type of Database is used for Online Analytical Processing (OLAP) for long transactions and complex queries?

Operational/transactional Database
Analytical Database

Analytical Database

Key differences are USE CASES and how the database is OPTIMIZED

-Online Analytical Processing (OLAP) - the source data comes from OLTP DBs
-Data warehouse
——Typically separated from the customer facing DBs.
——Data is extracted for decision making
-Long transactions and complex queries

Relational examples:
–>Amazon Redshift, Teradata, HP Vertica

Non-relational examples:
—>Amazon EMR, MapReduce
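
To make the OLTP/OLAP contrast concrete, here is a minimal sketch using Python's sqlite3 module. The sales table is hypothetical, and a real warehouse would hold data extracted from the OLTP databases rather than query the same table; one table is used here only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP style: short transactions that each touch a single row.
with conn:
    conn.execute("INSERT INTO sales (region, amount) VALUES ('east', 10.0)")
with conn:
    conn.execute("INSERT INTO sales (region, amount) VALUES ('west', 30.0)")
with conn:
    conn.execute("INSERT INTO sales (region, amount) VALUES ('east', 5.0)")

# OLAP style: one query that scans and aggregates many rows for decision making.
report = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```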

Which AWS Database is the best option if you need full control over instances and the database?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Database on EC2

Use Case:
–Need full control over instance and database (you manage it)

–3rd party database engine (not available in RDS)

Which AWS Database is the best option if you need a traditional Relational Database w/ well-formed and structured data?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Amazon RDS

Use Case:

–Need traditional relational database

–Data is well-formed and structured

–Ex. Oracle, PostgreSQL, Microsoft SQL Server, MariaDB, MySQL

Which AWS Database is the best option if you need a NoSQL database w/ in-memory performance and dynamic scaling?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Amazon DynamoDB

Use Case:
–NoSQL database

–In-memory performance

–High I/O needs

–Dynamic scaling

Which AWS Database is the best option if you have a data warehouse w/ large volumes of aggregated data?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Amazon Redshift

Use Case:

–Data warehouse for large volumes of aggregated data

Which AWS Database is the best option for fast-temporary storage for small amounts of data?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Amazon ElastiCache

Use Case:

–Fast temporary storage for small amounts of data

–In-memory database

–High performance

Which AWS Database is the best option for analytic workloads using the Hadoop framework?

Amazon RDS
Amazon DynamoDB
Amazon Redshift
Amazon ElastiCache
Amazon Elastic MapReduce (EMR)
Amazon EC2

Amazon Elastic MapReduce (EMR)

Use Case:

–Analytics workloads using the Hadoop framework

Amazon Relational Database Service (RDS)

-Managed relational database - Structured Query Language (SQL) Databases
-Easy to setup, highly available, fault tolerant, and scalable
-Runs on EC2 instances so you must choose an instance family/type
-An Online Transaction Processing (OLTP) type of database

Common use cases:
–Online stores and banking systems

Can encrypt your Amazon RDS instances and snapshots at rest
—>Encryption uses AWS Key Management Service (KMS)

RDS supports the following database engines:

-SQL Server, Oracle, MySQL, MariaDB, PostgreSQL, and Aurora
-Scales up by increasing INSTANCE size (compute and storage) or changing the INSTANCE type

Disaster recovery is provided by the Multi-AZ option, which maintains a passive standby instance

Example of RDS database:
–Amazon Aurora

Describe how Amazon Relational Database Service (RDS) provides disaster recovery using Multi-AZ option

Disaster recovery with Multi-AZ option by providing a passive standby instance:

RDS Master

-Runs in one Availability Zone
-Primary database (reads and writes)

RDS Master–> POINTS TO–>RDS Standby Instance

RDS Standby instance
–Master synchronously replicates to the Standby instance in a different Availability Zone

Read Replica

-An ‘asynchronous’ replication of the RDS Master so there is a little bit of a delay
-Can be located in the same Availability Zone, a different Availability Zone, or a different Region
-Used to scale horizontally for reads/queries only (kind of like IDAA at SF)
-Application servers can ONLY read from the read replica (can only write to the RDS Master)
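
The difference between the synchronous standby and the asynchronous read replica can be sketched with a toy model. This is only an illustration of the timing behavior described above, not how RDS is implemented; the Primary class and its method names are hypothetical.

```python
class Primary:
    """Toy model of an RDS Master with one standby and one read replica."""

    def __init__(self):
        self.data = {}
        self.standby = {}    # synchronous copy in a different Availability Zone
        self.replica = {}    # asynchronous copy used for reads only
        self._pending = []   # changes not yet shipped to the read replica

    def write(self, key, value):
        self.data[key] = value
        self.standby[key] = value            # applied before the write returns
        self._pending.append((key, value))   # shipped to the replica later

    def ship_to_replica(self):
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

db = Primary()
db.write("order:1", "paid")
lag_read = db.replica.get("order:1")    # replica has not caught up yet
db.ship_to_replica()
fresh_read = db.replica.get("order:1")  # replica now reflects the write
```

The standby always matches the Master, while a read from the replica can briefly return stale data.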

What is the name of the database that is part of the Amazon RDS family, is MySQL- and PostgreSQL-compatible, and features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128 TB per database instance?

Amazon Aurora

–RDS family

–MySQL- and PostgreSQL-compatible relational database

–Features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128TB per database instance

–VERY fast

Amazon DynamoDB

Non-relational, NoSQL type database (fully managed)
Key-value store and document store
Fully serverless service
–>No need to launch or pay for instances
–>You are just allocating characteristics of the performance

Horizontal scaling

–>Seamless scalability to any scale with PUSH BUTTON scaling or AUTO SCALING, which means you can increase or decrease the performance w/out any interruption (as opposed to Amazon RDS, where scaling an instance up or down causes downtime b/c the instance has to restart)

Highly available and can be reserved
–>On demand backup and restore

DynamoDB is made up of:

-Tables
–>Items exist in the tables
—–>Attributes exist in the items

Global Tables

-fully managed multi-region, multi-master solution
-So, data can exist across multiple regions and be fully synchronized
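
The table/item/attribute hierarchy can be sketched as plain Python data in DynamoDB's low-level JSON format, where each attribute value carries a type tag. The table name, attribute names and the choice of CustomerId as the partition key are assumptions made up for illustration. Nothing is sent to AWS here; the boto3 call appears only in a comment.

```python
# Hypothetical table and attribute names, for illustration only.
table_name = "Customers"

# One item in DynamoDB's low-level JSON format: each attribute carries a type tag
# ("S" = string, "N" = number sent as a string, "L" = list).
item = {
    "CustomerId": {"S": "c-001"},            # partition key (assumed)
    "Name": {"S": "Ada"},
    "LoyaltyPoints": {"N": "120"},
    "Tags": {"L": [{"S": "newsletter"}]},
}

# Items in the same table do not need to share the same attributes.
other_item = {"CustomerId": {"S": "c-002"}, "Name": {"S": "Lin"}}

request = {"TableName": table_name, "Item": item}
# With boto3 installed and AWS credentials configured, the request could be sent as:
#   boto3.client("dynamodb").put_item(**request)
```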

Fully managed in-memory Cache for DynamoDB that increases performance up to 10x:

Dynamo Gateway
Dynamic Duo
DynamoDB Accelerator (DAX)
Amazon Auto Scaling

DynamoDB Accelerator (DAX)

Fully managed in-memory Cache for DynamoDB that increases performance up to 10x

Amazon Redshift

A Structured Query Language (SQL) based data warehouse used for analytics and applications

Relational database used for Online Analytical Processing (OLAP) use cases

Uses EC2 instances, so you must choose an instance family/type

Keeps three copies of your data

Provides continuous/incremental backups

Amazon Elastic MapReduce (EMR)

Managed cluster platform for BIG DATA including Apache Hadoop and Apache Spark

Hadoop is a framework for big data

Used for processing data for analytics and business intelligence
–Can also be used for transforming and moving large amounts of data

Performs Extract, Transform, and Load (ETL) functions

Amazon ElastiCache

Fully managed implementations of Redis and Memcached
A key/value store
In-memory database used to cache data
High performance and low latency

Web session store (Redis)
–In cases w/ load-balanced web servers, store web session information in Redis so if a server is lost, the session info is not lost, and another web server can pick it up

Database caching (Memcached)
–Use Memcached in front of Amazon RDS or DynamoDB to cache popular queries to offload work from the database and return results faster to users

Leaderboards
–Use Redis to provide a live leaderboard for millions of users of your mobile app

Streaming data dashboards
–Provide a landing spot for streaming sensor data on the factory floor, providing live real-time dashboard displays

ElastiCache Node runs on EC2 instance
–Data is loaded into ElastiCache and is often used as web session store
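
The database-caching use case above is often implemented with a cache-aside pattern, sketched below. A dict stands in for the Memcached or Redis node and query_database is a hypothetical placeholder for a slow RDS or DynamoDB query. Expiry and invalidation are left out to keep the sketch short.

```python
cache = {}                   # stands in for a Memcached or Redis node
stats = {"db_calls": 0}      # counts how often the database is actually hit

def query_database(product_id):
    """Pretend this is a slow query against RDS or DynamoDB."""
    stats["db_calls"] += 1
    return {"id": product_id, "name": f"product-{product_id}"}

def get_product(product_id):
    key = f"product:{product_id}"
    if key in cache:             # cache hit: no database work
        return cache[key]
    value = query_database(product_id)
    cache[key] = value           # populate the cache for the next reader
    return value

first = get_product(7)    # miss: goes to the database
second = get_product(7)   # hit: served from the cache
```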

Amazon Athena

Interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL
–Serverless database

Can be connected to other data sources with Lambda

Uses a managed Data Catalog (AWS Glue) to store information about databases and tables, including their schemas
–AWS Glue:
–>Metadata Catalog that can be used with Amazon Athena
–>Fully managed Extract, Transform, and Load (ETL) service
–>You transform and move the data to various destinations with AWS Glue
–>It is used to prepare and load data for analytics

A managed Data Catalog that can store information and schemas in the databases and tables:

AWS Glue

Metadata Catalog that can be used with Amazon Athena

Fully managed Extract, Transform, and Load (ETL) service

You transform and move the data to various destinations with AWS Glue

It is used to prepare and load data for analytics
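
The Extract, Transform, and Load steps can be sketched in a few lines of standard-library Python. This is not AWS Glue code; the CSV columns, the Fahrenheit-to-Celsius conversion and the JSON Lines output are assumptions chosen only to show the three stages.

```python
import csv
import io
import json

raw_csv = "device,temp_f\nsensor-a,68.0\nsensor-b,\nsensor-c,71.6\n"

# Extract: read rows from the source format.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: drop incomplete rows and convert Fahrenheit to Celsius.
clean = [
    {"device": r["device"], "temp_c": round((float(r["temp_f"]) - 32) * 5 / 9, 1)}
    for r in rows
    if r["temp_f"]
]

# Load: write JSON Lines, a format that analytics tools commonly read.
output = "\n".join(json.dumps(r) for r in clean)
```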

Amazon Kinesis Data Streams

Service for processing streaming data
–Makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information

Producers send data, which is stored in shards for 24 hours by default and for up to 365 days if extended retention is configured (Shard - a logical chunk of the stream that helps maintain order)
—>Consumers process the data and save to another service

Typically used for very large, steady volumes of data
—>Ex. Data recording info about equipment temperature, movement of a car, etc
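
How shards help maintain order can be sketched as follows. Kinesis Data Streams maps each record's partition key to a 128-bit integer with MD5 and routes it to the shard whose hash range contains that value. The shard count and the even split of the hash space below are assumptions for illustration, as are the sensor names.

```python
import hashlib

NUM_SHARDS = 4           # assumed shard count
HASH_SPACE = 2 ** 128    # MD5 yields a 128-bit integer

def shard_for(partition_key):
    """Map a partition key to a shard index via its MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * NUM_SHARDS // HASH_SPACE

shards = {i: [] for i in range(NUM_SHARDS)}
readings = [("sensor-a", 20.1), ("sensor-b", 19.4), ("sensor-a", 20.3), ("sensor-a", 20.6)]
for key, temperature in readings:
    shards[shard_for(key)].append((key, temperature))

# All records with the same key land in the same shard, in arrival order.
sensor_a = [t for k, t in shards[shard_for("sensor-a")] if k == "sensor-a"]
```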

Which Amazon Kinesis service has no shards, is completely automated and elastically scalable, and saves data directly to another service (such as S3, Splunk, Redshift, or Elasticsearch)?

Amazon Kinesis Data Firehose
Amazon Kinesis Data Analytics

Amazon Kinesis Data Firehose

No shards, completely automated and elastically scalable

Saves data directly to another service such as S3, Splunk, Redshift, or Elasticsearch

Which Amazon Kinesis service provides real-time SQL processing for streaming data?

Amazon Kinesis Data Firehose
Amazon Kinesis Data Analytics

Amazon Kinesis Data Analytics

Provides real-time SQL processing for streaming data

Processes and moves data between different AWS compute and storage services:

Amazon Kinesis Data Firehose
Amazon Neptune
Amazon Aurora
AWS Data Pipeline

AWS Data Pipeline

Processes and moves data between different AWS compute and storage services

Save results to services such as:
—>S3, RDS, DynamoDB, and EMR

A scalable, serverless, embeddable, machine learning-powered Business Intelligence (BI) service that provides fast, cloud-powered business analytics, including easy-to-build visualizations and rich dashboards:

Amazon Kinesis Data Firehose
Amazon QuickSight
Amazon Aurora
AWS Data Pipeline

Amazon QuickSight

–A scalable, serverless, embeddable, machine learning-powered Business Intelligence (BI) service
–Provides a fast, cloud-powered business analytics service
–Easy to build visualizations and rich dashboards
–Can be accessed from any browser or mobile device

Fully managed graph database:

Amazon Kinesis Data Firehose
Amazon Neptune
Amazon Aurora
AWS Data Pipeline

Amazon Neptune

–Fully managed graph database
–Ex. social network relationships (a Facebook-style friend graph)
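
A typical graph query is "friends of friends". The sketch below uses plain Python sets as a stand-in for a graph database's vertices and edges; it does not use Neptune or any graph query language, and the names are made up.

```python
# Adjacency sets stand in for a graph database's vertices and edges.
friends = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee", "eli"},
    "dee": {"ben", "cy"},
    "eli": {"cy"},
}

def friends_of_friends(person):
    """People two hops away who are not already direct friends."""
    direct = friends[person]
    two_hops = set()
    for friend in direct:
        two_hops |= friends[friend]
    return two_hops - direct - {person}

suggestions = friends_of_friends("ana")
```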

Fully managed document non-relational database service:

Amazon Kinesis Data Firehose
Amazon Neptune
Amazon DocumentDB
AWS Data Pipeline

Amazon DocumentDB

–Fully managed document non-relational database service
–Queries and indexes JSON data
–Supports MongoDB workloads

Fully managed ledger database that provides cryptographically verifiable transaction logging:

Amazon Kinesis
Amazon Neptune
Amazon Quantum Ledger Database (QLDB)
AWS Data Pipeline

Amazon Quantum Ledger Database (QLDB)

–Fully managed ledger database with an immutable change history
–Provides cryptographically verifiable transaction logging
–A recording (ledger) of what transactions have taken place
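
The idea of a cryptographically verifiable history can be illustrated with a simple hash chain: each entry's hash covers the previous entry's hash, so changing an old transaction breaks verification. This is a generic sketch of the concept, not QLDB's actual journal format, and the function names are hypothetical.

```python
import hashlib
import json

def entry_hash(previous_hash, transaction):
    # Hash the previous entry's hash together with this transaction's content.
    payload = json.dumps(transaction, sort_keys=True).encode("utf-8")
    return hashlib.sha256(previous_hash.encode("utf-8") + payload).hexdigest()

def append(ledger, transaction):
    previous_hash = ledger[-1]["hash"] if ledger else "0" * 64
    ledger.append({"tx": transaction, "hash": entry_hash(previous_hash, transaction)})

def verify(ledger):
    # Recompute every hash in order; any edited entry breaks the chain.
    previous_hash = "0" * 64
    for entry in ledger:
        if entry["hash"] != entry_hash(previous_hash, entry["tx"]):
            return False
        previous_hash = entry["hash"]
    return True

ledger = []
append(ledger, {"account": "A", "delta": 100})
append(ledger, {"account": "A", "delta": -30})
valid_before = verify(ledger)
ledger[0]["tx"]["delta"] = 1000   # tamper with history
valid_after = verify(ledger)
```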

Fully managed service for joining public and private networks using Hyperledger Fabric and Ethereum:

Databases and Analytics Flashcards

(29 cards)