Databases and Analytics Flashcards

1
Q

Describe Relational Databases

A

A collection of data items with pre-defined relationships between them.
Think of tables that look like Excel spreadsheets, linked to one another.
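
A minimal sketch of "pre-defined relationships", using Python's built-in sqlite3 module rather than any AWS service. The table and column names (departments, students, dept_id) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Two tables; students.dept_id is declared up front as a link to departments.id.
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES departments(id))"
)
conn.execute("INSERT INTO departments VALUES (1, 'Physics')")
conn.execute("INSERT INTO students VALUES (10, 'Ada', 1)")

# The declared relationship is what makes this SQL join meaningful.
rows = conn.execute(
    "SELECT s.name, d.name FROM students s "
    "JOIN departments d ON s.dept_id = d.id"
).fetchall()
```

Here `rows` holds one (student, department) pair per student, built by following the link between the two tables.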

2
Q

Describe NoSQL Databases

A

Non-relational databases

Built for specific data models, with flexible schemas for building applications

3
Q

Name some of the benefits of NoSQL databases

A

Flexibility
Scalability
High-performance: optimized for a specific data model
Highly functional: data types and APIs optimized for the data model
Examples: key-value, document, graph, in-memory, search databases
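
A rough sketch of what "flexible schema" can mean in practice, using a plain Python dict as a toy document store. This is not how any particular NoSQL engine is implemented; the `put_document` and `get_document` helpers and the keys are hypothetical.

```python
import json

# Toy document store: each key maps to a JSON document.
store = {}

def put_document(key, document):
    """Store a document under a key; no fixed set of fields is enforced."""
    store[key] = json.dumps(document)

def get_document(key):
    """Fetch a document by key and decode it."""
    return json.loads(store[key])

# Two documents in the same store with different shapes.
put_document("user:1", {"name": "Ada"})
put_document("user:2", {"name": "Lin", "tags": ["admin"], "address": {"city": "Oslo"}})
```

Unlike a relational table, the second document can carry extra nested fields without any schema change.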

4
Q

What does RDS stand for and what is it

A

Relational Database Service
Managed DB service for databases that use SQL as a query language
Allows you to create databases in the cloud that are managed by AWS

5
Q

What is Aurora

A

a relational database service
Supports PostgreSQL and MySQL
not in the free tier

6
Q

What are caches in AWS databases

A

In-memory databases with high performance and low latency

Help reduce the load on databases for read-intensive workloads
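
One common way a cache takes read load off a database is the cache-aside pattern, sketched below with plain dicts standing in for the cache and the database. The names (`DATABASE`, `cache`, `get_user`) are hypothetical, and this is an illustration of the pattern rather than ElastiCache's API.

```python
DATABASE = {"user:1": "Ada"}   # stands in for a slower database
cache = {}                     # stands in for an in-memory cache
stats = {"db_reads": 0}        # counts how often the database is hit

def read_from_db(key):
    """Simulate a database read and record that it happened."""
    stats["db_reads"] += 1
    return DATABASE[key]

def get_user(key):
    """Cache-aside read: try the cache first, fall back to the database."""
    if key in cache:
        return cache[key]
    value = read_from_db(key)
    cache[key] = value  # populate the cache for later reads
    return value

first = get_user("user:1")   # cache miss: goes to the database
second = get_user("user:1")  # cache hit: database is not touched
```

Repeated reads of the same key reach the database only once in this sketch.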

7
Q

When it comes to ElastiCache, what is AWS responsible for

A

OS maintenance / patching, optimizations, setup, config, monitoring, failure recovery and backups

8
Q

What is DynamoDB

A

Fully managed, highly available database with replication across 3 AZs
A NoSQL database, not a relational database
Scales to massive workloads; distributed, serverless database

9
Q

What type of database is DynamoDB

A

Key-Value database

10
Q

What is DynamoDB Accelerator (DAX)

A

Fully Managed in-memory cache for DynamoDB

11
Q

What are DynamoDB Global Tables

A

Makes a DynamoDB table accessible with low latency in multiple regions

12
Q

What is Redshift

A

a fully managed, petabyte-scale data warehouse service in the cloud
It's OLAP - online analytical processing (analytics and data warehousing)
Load data once every hour, not every second
Up to 10x better performance than other data warehouses, per AWS; scales to PBs of data
Columnar storage of data (instead of row-based)
Massively Parallel Processing (MPP) query execution
Pay as you go
Has a SQL interface for performing queries
Integrates with BI tools such as Amazon QuickSight
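
A small sketch of why columnar storage suits analytics, using plain Python lists. It only illustrates the layout idea and says nothing about Redshift's internals; the sample orders are made up.

```python
# Row-based layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": "US", "amount": 35.5},
    {"order_id": 3, "region": "EU", "amount": 12.5},
]

# Columnar layout: one list per column, built here from the rows.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An analytical query such as SUM(amount) only needs one column,
# so a columnar store can skip order_id and region entirely.
total_amount = sum(columns["amount"])
```

With a row layout the same sum would have to read every field of every record.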

13
Q

What is Amazon EMR

A

Elastic MapReduce
Helps create Hadoop clusters (big data) to analyze and process vast amounts of data
The clusters can be made of hundreds of EC2 instances
Also supports Apache Spark, HBase, Presto, Flink
EMR takes care of all the provisioning and config
Auto-scaling and integrated with Spot instances
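
For intuition about the MapReduce model that Hadoop clusters run, here is a single-process word count in Python. It is a sketch of the map, shuffle, and reduce phases only; a real cluster distributes these phases across many machines, and the function names are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big clusters", "data on clusters"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle_phase(pairs))
```

Because each line is mapped independently, the map phase is the part that parallelizes naturally across instances.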

14
Q

What are some of the use cases for EMR

A

data processing, machine learning, web indexing, big data

15
Q

Describe Athena

A

Serverless query service to perform analytics against S3 objects
Uses standard SQL to query the files
Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
Pricing: $5.00 per TB of data scanned
Use compressed or columnar data for cost savings (less data scanned)
Use cases: business intelligence, analytics, reporting; analyzing and querying VPC Flow Logs, ELB logs, CloudTrail trails
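
A quick worked example of the per-TB pricing on this card. The $5.00 figure is the one quoted above; the data sizes are hypothetical, and real bills depend on region and rounding rules not modeled here.

```python
PRICE_PER_TB_USD = 5.00  # figure quoted on the card; assumed constant here

def scan_cost_usd(tb_scanned):
    """Estimated query cost for a given number of terabytes scanned."""
    return tb_scanned * PRICE_PER_TB_USD

# Hypothetical: a query over raw CSV has to scan the full 2 TB.
full_scan_cost = scan_cost_usd(2.0)

# Hypothetical: with a columnar format the same query reads 1 of 8
# equally sized columns, so it scans 0.25 TB.
columnar_cost = scan_cost_usd(2.0 / 8)
```

Under these assumptions the full scan costs $10.00 and the columnar query $1.25, which is why scanning less data is the main cost lever.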

16
Q

What database will you use if you need to analyze data in S3 using serverless SQL

A

Athena

17
Q

What type of database is MongoDB

A

A NoSQL (document) database

18
Q

What is Neptune

A

Fully managed graph database (think of a social network like Facebook)

19
Q

What can you do with Neptune

A

Build and run apps working with highly connected datasets - optimized for these complex and hard queries
Can store up to billions of relations and query the graph with milliseconds latency
Highly available with replications across multiple AZs
Great for knowledge graphs and fraud detection
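
A sketch of the kind of "highly connected" query a graph database is built for, here a friends-of-friends lookup done with breadth-first search over an adjacency list. The tiny social graph and the `within_hops` helper are hypothetical; Neptune itself is queried with graph query languages, not with this code.

```python
from collections import deque

# Hypothetical social graph as an adjacency list.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob"],
}

def within_hops(start, max_hops):
    """Return everyone reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return seen - {start}

friends = within_hops("alice", 1)
friends_of_friends = within_hops("alice", 2)
```

In a relational database the same question would need one self-join per hop, which is what makes graph-native storage attractive for this workload.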

20
Q

What is QLDB

A

Quantum Ledger Database - Used to review history of all the changes made to your application data over time
Immutable system - no entry can be removed or modified; cryptographically verifiable
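
To show what "cryptographically verifiable" can look like, here is a toy hash-chained journal using Python's hashlib. It is an illustration of the general idea only, not QLDB's actual data structure or API; the helper names and sample entries are hypothetical.

```python
import hashlib
import json

def entry_hash(prev_hash, data):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append_entry(journal, data):
    """Append-only write: each entry is chained to its predecessor."""
    prev = journal[-1]["hash"] if journal else ""
    journal.append({"data": data, "prev": prev, "hash": entry_hash(prev, data)})

def verify(journal):
    """Recompute the chain; any edited entry breaks the hashes."""
    prev = ""
    for entry in journal:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["data"]):
            return False
        prev = entry["hash"]
    return True

journal = []
append_entry(journal, {"account": "A", "delta": 100})
append_entry(journal, {"account": "A", "delta": -30})
valid_before = verify(journal)

journal[0]["data"]["delta"] = 1000  # simulate tampering with history
valid_after = verify(journal)
```

The chain verifies before the edit and fails after it, which is the property that lets a ledger prove its history was not rewritten.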

21
Q

What is a ledger

A

A book recording financial transactions

22
Q

What is DMS

A

Database Migration Service - AWS service for migrating databases to AWS

23
Q

What is AWS Glue

A

Managed extract, transform, and load (ETL) service
Useful to prepare and transform data for analytics
Fully serverless service
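
A minimal extract-transform-load sketch in plain Python, to make the three ETL steps concrete. It does not use Glue; the sample CSV, field names, and cleaning rules are hypothetical.

```python
import csv
import io
import json

# Hypothetical raw input with messy whitespace and one bad row.
raw = "name,amount\nada, 10 \nlin,abc\nbob,5\n"

def extract(text):
    """Extract: parse CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: trim fields, normalize names, drop non-numeric amounts."""
    clean = []
    for record in records:
        amount = record["amount"].strip()
        if amount.isdigit():
            clean.append({"name": record["name"].strip().title(), "amount": int(amount)})
    return clean

def load(records):
    """Load: serialize records as JSON lines, ready for an analytics store."""
    return "\n".join(json.dumps(record) for record in records)

cleaned = transform(extract(raw))
output = load(cleaned)
```

The row with a non-numeric amount is dropped during the transform step, so only clean records reach the load step.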

24
Q

If you want to build a relational database what service would you use

A

OLTP: RDS and Aurora (SQL)

25
Q

In-memory database

A

ElastiCache

26
Q

Key/Value Database

A

DynamoDB (serverless) and DAX (cache for DynamoDB)

27
Q

Warehouse

A

OLAP: Redshift (SQL)

28
Q

Hadoop Cluster

A

EMR

29
Q

Athena

A

query data on Amazon S3 (serverless and SQL)

30
Q

QuickSight

A

dashboards on your data (serverless)

31
Q

DocumentDB

A

The "Aurora" of MongoDB: a managed, MongoDB-compatible document database (JSON, NoSQL)

32
Q

Amazon QLDB

A

Financial Transactions Ledger (immutable journal, cryptographically verifiable)

33
Q

Amazon Managed Blockchain

A

managed Hyperledger Fabric and Ethereum blockchains

34
Q

Glue

A

Managed ETL (Extract, Transform, Load) and Data Catalog service

35
Q

Database Migration

A

DMS

36
Q

Neptune

A

Graph database