Databases and Analytics Flashcards by Kerry Gambrel

Describe Relational Databases

A collection of data items with predefined relationships between them.
Think of it like Excel spreadsheets with links between them
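
As a minimal sketch of "predefined relationships", the example below uses Python's built-in sqlite3 module rather than any AWS service. The table names, columns, and sample rows are hypothetical and chosen only for illustration. Each table resembles a spreadsheet, and the customer_id column links the two.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);
""")

# The JOIN follows the relationship from each order to its customer.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)
```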

Describe NoSQL Databases

Non-relational databases

Built for specific data models, with flexible schemas for building applications

Name some of the benefits of NoSQL databases

Flexibility
Scalability
High-performance: optimized for a specific data model
Highly functional: optimized for the data model
Examples: key-value, document, graph, in-memory, and search databases

What does RDS stand for and what is it

Relational Database Service
Managed DB service for databases that use SQL as a query language
Allows you to create databases in the cloud that are managed by AWS

What is Aurora

a relational database service
Supports PostgreSQL and MySQL
not in the free tier

What are caches in AWS databases

In-memory databases with high performance and low latency

Help take load off databases for read-intensive workloads
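
One common way a cache takes read load off a database is the cache-aside pattern: check the cache first and go to the database only on a miss. The sketch below is plain Python. The SlowDatabase and CacheAside classes are hypothetical stand-ins, not ElastiCache or any AWS API.

```python
class SlowDatabase:
    """Hypothetical stand-in for a database; counts how often it is read."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data[key]


class CacheAside:
    """Serve reads from memory and fall back to the database on a miss."""

    def __init__(self, db):
        self._db = db
        self._cache = {}

    def get(self, key):
        if key in self._cache:          # cache hit: no database read
            return self._cache[key]
        value = self._db.get(key)       # cache miss: read the database once
        self._cache[key] = value        # keep the value for later reads
        return value


db = SlowDatabase({"user:1": "Ada"})
cached = CacheAside(db)
values = [cached.get("user:1") for _ in range(100)]
print(db.reads)  # database reads needed for 100 lookups
```

A real cache also needs an expiry or invalidation strategy so that stale values are not served forever; that part is left out here.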

When it comes to ElastiCache, what is Amazon responsible for

OS maintenance / patching, optimizations, setup, config, monitoring, failure recovery and backups

What is DynamoDB

Fully managed, highly available database with replication across 3 AZs
A NoSQL database, not a relational database
Scales to massive workloads; distributed, serverless database

What type of database is DynamoDB

Key-Value database

What is DynamoDB Accelerator (DAX)

Fully managed in-memory cache for DynamoDB

What are DynamoDB Global Tables

Make a DynamoDB table accessible with low latency in multiple Regions

What is Redshift

a fully managed, petabyte-scale data warehouse service in the cloud
It's OLAP - online analytical processing (analytics and data warehousing)
Load data in batches (for example once every hour), not every second
AWS claims up to 10x better performance than other data warehouses; scales to PBs of data
Columnar storage of data (instead of row based)
Massively Parallel Query Execution (MPP)
Pay as you go
Has SQL interface for performing the queries
Integrates with BI tools such as Amazon QuickSight
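
To see why columnar storage suits analytics, the sketch below compares how many values have to be read to total one column in a row-based layout versus a column-based layout. It is a toy model in plain Python with hypothetical sample data, not a description of how Redshift stores data internally.

```python
# Row-based layout: all values of one record are stored together.
rows = [
    {"order_id": 1, "region": "eu", "amount": 20.0},
    {"order_id": 2, "region": "us", "amount": 35.0},
    {"order_id": 3, "region": "eu", "amount": 10.0},
]

# Column-based layout: all values of one column are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_row_store(rows):
    """Sum 'amount' when every full row must be read to reach it."""
    total, values_read = 0.0, 0
    for row in rows:
        values_read += len(row)   # the whole record is read
        total += row["amount"]
    return total, values_read


def total_column_store(columns):
    """Sum 'amount' by reading only the 'amount' column."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(total_row_store(rows))       # total and number of values read
print(total_column_store(columns))
```

Both layouts give the same total, but the columnar one reads a third of the values here, and the gap widens as tables gain more columns.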

What is Amazon EMR

Elastic MapReduce
Helps create Hadoop clusters (big data) to analyze and process vast amounts of data
The clusters can be made of hundreds of EC2 instances
Also supports Apache Spark, HBase, Presto, Flink
EMR takes care of all the provisioning and config
Auto-scaling and integrated with Spot instances

What are some of the use cases for EMR

data processing, machine learning, web indexing, big data

Describe Athena

Serverless query service to perform analytics against S3 objects
Uses standard SQL to query the files
Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
Pricing: $5.00 per TB of data scanned
Use compressed or columnar data for cost savings (less data scanned)
Use cases: business intelligence / analytics / reporting; analyzing and querying VPC Flow Logs, ELB logs, and CloudTrail trails
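
The pricing and the cost-saving tip can be worked through as arithmetic. The sketch below uses the $5.00 per TB figure from this card. Treating 1 TB as 1024^4 bytes and the two scan sizes are assumptions made only for illustration; check current AWS pricing before relying on any number.

```python
PRICE_PER_TB_USD = 5.00       # figure quoted on this card
BYTES_PER_TB = 1024 ** 4      # assumption: binary terabyte


def scan_cost_usd(bytes_scanned):
    """Estimated query cost for a given number of bytes scanned."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


# Hypothetical scan sizes: a query over raw CSV reads the full 2 TB,
# while the same query over compressed columnar files reads 0.25 TB.
raw_csv_bytes = 2 * BYTES_PER_TB
columnar_bytes = BYTES_PER_TB // 4

print(scan_cost_usd(raw_csv_bytes))    # cost when scanning everything
print(scan_cost_usd(columnar_bytes))   # cost when scanning less data
```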

What database will you use if you need to analyze data in S3 using serverless SQL

Athena

What type of database is MongoDB

A NoSQL database

What is Neptune

Fully managed graph database (think of a social network like Facebook)

What can you do with Neptune

Build and run apps that work with highly connected datasets - optimized for complex, hard queries over them
Can store billions of relationships and query the graph with millisecond latency
Highly available, with replication across multiple AZs
Great for knowledge graphs, fraud detection,
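
A typical "highly connected" query asks who is reachable within a few hops of a given node. The sketch below answers that with a breadth-first search over a small in-memory graph. The graph data and the within_hops helper are hypothetical, and this is plain Python, not Neptune's query languages or API.

```python
from collections import deque

# Hypothetical social graph: each person maps to their direct connections.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}


def within_hops(graph, start, max_hops):
    """Return {node: distance} for nodes reachable within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue                      # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:  # first visit is the shortest path
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return {n: d for n, d in distance.items() if n != start}


print(within_hops(graph, "alice", 2))
```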

What is QLDB

Quantum Ledger Database - used to review the history of all changes made to your application data over time
Immutable system - no entry can be removed or modified; cryptographically verifiable
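
"Cryptographically verifiable" can be illustrated with a hash chain: each entry's hash covers the previous entry's hash, so changing any past entry breaks verification. The sketch below is a generic illustration using Python's hashlib. The Ledger class and its fields are hypothetical and do not reproduce QLDB's actual journal format or API.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class Ledger:
    """Append-only list of entries linked together by SHA-256 hashes."""

    def __init__(self):
        self._entries = []

    def append(self, data):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        payload = json.dumps(data, sort_keys=True)
        digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self._entries.append(
            {"data": payload, "prev_hash": prev_hash, "hash": digest}
        )

    def verify(self):
        """Recompute every hash; any edited entry makes this return False."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            expected = hashlib.sha256(
                (prev_hash + entry["data"]).encode()
            ).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"account": "A", "change": -50})
ledger.append({"account": "B", "change": 50})
print(ledger.verify())
```

Editing the stored data of an earlier entry afterwards makes verify() return False, which is how tampering is detected.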

What is a ledger

A book recording financial transactions

What is DMS

Database Migration Service - AWS service for migrating databases

What is AWS GLUE

Managed extract, transform, and load (ETL) service
Useful to prepare and transform data for analytics
Fully serverless service
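
The three ETL stages can be sketched in a few lines of plain Python. The sample CSV, the field names, and the extract / transform / load helpers are hypothetical; this shows the idea of ETL only and is not AWS Glue code.

```python
import csv
import io

# Hypothetical raw input with stray spaces, a missing amount, and mixed case.
RAW_CSV = "id,amount,currency\n1, 10.50 ,usd\n2,,usd\n3,7.25,USD\n"


def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop incomplete rows and normalize types and casing."""
    cleaned = []
    for record in records:
        amount = record["amount"].strip()
        if not amount:
            continue  # skip rows with no amount
        cleaned.append({
            "id": int(record["id"]),
            "amount": float(amount),
            "currency": record["currency"].strip().upper(),
        })
    return cleaned


def load(records, target):
    """Load: write the cleaned records to the destination."""
    target.extend(records)
    return len(records)


warehouse = []  # stand-in for an analytics store
loaded = load(transform(extract(RAW_CSV)), warehouse)
print(loaded, warehouse)
```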

If you want to build a relational database what service would you use

OLTP: RDS and Aurora (SQL)

In-memory database

ElastiCache

Key/Value Database

DynamoDB (serverless) and DAX (cache for DynamoDB)

Warehouse

OLAP: Redshift (SQL)

Hadoop Cluster

EMR

Athena

query data on Amazon S3 (serverless and SQL)

QuickSight

dashboards on your data (serverless)

DocumentDB

The "Aurora" equivalent for MongoDB (JSON NoSQL database)

Amazon QLDB

Financial transactions ledger (immutable journal, cryptographically verifiable)

Amazon Managed Blockchain

managed Hyperledger Fabric and Ethereum blockchains

Glue

Managed ETL (Extract, Transform, Load) and Data Catalog service

Database Migration

DMS

Neptune

Graph database

Databases and Analytics Flashcards

(36 cards)