Big Data Flashcards
Does Redshift support HA? How many AZs can a cluster span? How do you achieve HA?
Not natively; a cluster runs in a single AZ
To achieve HA, replicate the data to a second cluster in a different AZ
What is the use case for Redshift? Does it replace RDS?
BI and big data analytics (OLAP); it does not replace RDS, which handles transactional (OLTP) workloads
How much data can Redshift store?
Up to 16 PB
What is EMR?
Managed big data platform for ETL built on open-source frameworks such as Hadoop and Spark
What rules apply to EMR? What is underlying EMR?
VPC rules apply, because EMR runs on EC2 instances
How can you save on costs with EMR?
Use the usual EC2 savings options: Spot Instances and Reserved Instances (RIs)
What are the 3 types of Kinesis services?
- Kinesis Data Streams
- Kinesis Data Firehose
- Kinesis Data Analytics (SQL)
What is Kinesis for?
Ingesting, processing, and analyzing real-time streaming data
What is Kinesis Data Streams?
Real-time data ingestion; you build the consumers and manage scaling yourself
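Scaling a data stream means managing shards, and each record is routed to a shard by the MD5 hash of its partition key, which maps to a 128-bit integer. The sketch below illustrates that routing only. It assumes the hash space is split evenly across shards and that the digest is read as a big-endian integer; `hash_key` and `shard_for` are hypothetical helper names, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits wide


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer using MD5.

    Assumption: the digest is interpreted as a big-endian integer.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains the key.

    Assumption: the hash space is divided into equal ranges, one per shard.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    range_size = HASH_SPACE // shard_count
    # The last shard absorbs any remainder left by integer division.
    return min(hash_key(partition_key) // range_size, shard_count - 1)
```

Records with the same partition key always land on the same shard, so a single hot key cannot be spread out by adding shards.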
What is Kinesis Data Firehose?
Near real-time (about 60 seconds); a more managed, plug-and-play service that delivers into select AWS services and scales automatically
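The "near real-time" label comes from buffering: records are collected and delivered in batches once a size or time threshold is reached. The toy model below shows that idea only. `BufferedDelivery`, its parameters, and the default byte threshold are hypothetical, and nothing here calls AWS.

```python
from typing import List, Optional


class BufferedDelivery:
    """Toy model of a delivery buffer that flushes on size or age."""

    def __init__(self, max_bytes: int = 1_000_000, max_seconds: float = 60.0):
        self.max_bytes = max_bytes        # hypothetical size threshold
        self.max_seconds = max_seconds    # interval taken from the card above
        self.delivered: List[List[bytes]] = []  # batches handed to the destination
        self._records: List[bytes] = []
        self._size = 0
        self._opened_at: Optional[float] = None

    def put(self, record: bytes, now: float) -> None:
        """Buffer one record; `now` is the current time in seconds."""
        if self._opened_at is None:
            self._opened_at = now  # the buffer's age starts at its first record
        self._records.append(record)
        self._size += len(record)
        self._maybe_flush(now)

    def tick(self, now: float) -> None:
        """Let time pass without new data, so an old buffer still flushes."""
        self._maybe_flush(now)

    def _maybe_flush(self, now: float) -> None:
        if not self._records:
            return
        too_big = self._size >= self.max_bytes
        too_old = now - self._opened_at >= self.max_seconds
        if too_big or too_old:
            self.delivered.append(self._records)
            self._records = []
            self._size = 0
            self._opened_at = None
```

A small record can therefore wait up to the full interval before it reaches the destination, which is the latency cost of the managed, batch-oriented delivery.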
What is Kinesis Data Analytics?
Lets you process and analyze streaming data with SQL as it arrives
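Queries over a stream typically aggregate within time windows rather than over a finished table. As a rough illustration of what a one-minute tumbling-window count computes, here is a sketch in plain Python. It is not Kinesis SQL; `tumbling_counts` and the event shape are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_counts(
    events: Iterable[Tuple[float, str]], window_seconds: int = 60
) -> Dict[Tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping time windows.

    Each event is a (timestamp_in_seconds, key) pair. The result maps
    (window_start, key) to the number of events seen in that window.
    """
    counts: Counter = Counter()
    for timestamp, key in events:
        # Every timestamp belongs to exactly one window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

A streaming engine emits each window's result as the window closes, whereas this sketch processes a finished list for simplicity.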
How does Kinesis compare with SQS?
Kinesis handles real-time streaming and big data but needs more configuration; SQS is not real-time but is simpler to set up