Big Data Flashcards
Can you use Redshift in place of RDS?
No
What is Redshift?
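A fully managed data warehouse; essentially RDS for business intelligence (analytical OLAP workloads rather than transactional OLTP workloads)
What is Amazon EMR?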
Amazon Elastic MapReduce
A managed big data platform that lets you process large volumes of data using open-source tools such as Apache Spark, Hive, and Hadoop.
What is Amazon EMR built out of?
Clusters of EC2 instances
Do EC2 rules apply to the instances in your Amazon EMR cluster?
Yes
What is Amazon Kinesis?
A service that allows you to ingest, process, and analyze real-time streaming data.
What is the purpose of Kinesis Data Streams?
Real-time streaming for ingesting data
What does ETL stand for?
Extract, Transform, Load
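To make the three ETL stages concrete, here is a minimal sketch using only the Python standard library. The sample CSV, the `user_totals` table, and all function names are hypothetical and chosen for illustration; a real pipeline (for example one built with Glue) would read from and write to actual data stores.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the second row is deliberately malformed.
RAW_CSV = """name,amount
alice,10.50
bob,not-a-number
alice,4.50
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with non-numeric amounts and total amounts per name."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that cannot be parsed
        totals[row["name"]] = totals.get(row["name"], 0.0) + amount
    return totals


def load(totals, conn):
    """Load: write the aggregated totals into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_totals (name TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO user_totals VALUES (?, ?)", sorted(totals.items())
    )
    conn.commit()


def run_etl(text, conn):
    """Run all three stages and return the loaded rows, ordered by name."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT name, total FROM user_totals ORDER BY name"
    ).fetchall()
```

Calling `run_etl(RAW_CSV, sqlite3.connect(":memory:"))` runs the whole pipeline against an in-memory database.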
What is the speed of Kinesis Data Streams?
Real time
With Kinesis Data Streams, are you responsible for creating the consumer and scaling the stream?
Yes (you write the consumers yourself and, in provisioned capacity mode, manage the number of shards)
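Scaling a provisioned stream comes down to choosing a shard count. The sketch below estimates it from the commonly documented per-shard limits (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads). Those constants are assumptions to verify against the current AWS documentation, and `shards_needed` is a hypothetical helper, not part of any AWS SDK.

```python
import math

# Assumed per-shard limits for provisioned mode; verify against current AWS docs.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three throughput limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )
```

For example, `shards_needed(3.5, 2500, 5.0)` is driven by the write bandwidth: 3.5 MB/s rounds up to 4 shards, which also covers the record-rate and read requirements.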
What is the purpose of Kinesis Data Firehose?
A data transfer tool that delivers streaming data to S3, Redshift, Elasticsearch, or Splunk
What is the speed of Kinesis Data Firehose?
Near real time (within about 1 minute)
With Kinesis Data Firehose, are you responsible for creating the consumer and scaling the stream?
No
What is Athena?
Serverless SQL
An interactive query service that makes it easy to analyze data in S3 using SQL.
What is Glue?
Serverless ETL
A serverless data integration service that makes it easy to discover, prepare, and combine data. It allows you to perform ETL workloads without managing underlying servers.
Can you use Athena to query logs stored in an S3 bucket?
Yes
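As a rough illustration of querying S3 logs with Athena, the sketch below builds the request for boto3's `start_query_execution` call. The database, table, and bucket names in the usage note are hypothetical, and the code assumes a table over the log files has already been defined (for example in the Glue Data Catalog). `run_query` needs boto3 and AWS credentials, so it is kept separate from the pure request builder.

```python
def build_athena_request(database, table, output_location, limit=10):
    """Build keyword arguments for the Athena start_query_execution call."""
    query = f'SELECT * FROM "{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]
```

A call such as `build_athena_request("logs_db", "access_logs", "s3://example-bucket/athena-results/", limit=5)` produces the request; passing it to `run_query` returns an execution ID that can be polled for results.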
Are Athena and Glue serverless?
Yes