Big Data Flashcards
Redshift
Redshift isn’t a replacement for RDS in traditional apps
Redshift clusters run in a single AZ by default (Multi-AZ is a newer option, available only on RA3 node types)
You can create multiple clusters in different AZs, but they’re technically separate deployments. A cluster tolerates node failures within its AZ by default, but not the loss of the AZ
EMR (Elastic MapReduce)
Is made up of EC2 instances
This means you can employ your standard EC2 cost-saving measures: Reserved Instances (RIs) & Spot Instances
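As a rough sketch of the Spot idea, the instance groups for a cluster could be laid out like this. The group names, instance type, and counts are made up for illustration; the dictionary keys follow the `InstanceGroups` shape that boto3's `emr.run_job_flow` accepts. Reserved Instances need no setting here, since they are a billing discount applied to matching On-Demand usage.

```python
def build_instance_groups(task_nodes, spot=True):
    """Return an InstanceGroups list shaped for boto3's emr.run_job_flow.

    Instance type and counts are illustrative assumptions, not recommendations.
    """
    return [
        # Master and core nodes stay On-Demand so that losing Spot capacity
        # does not take down the cluster or the data held on core nodes.
        {"Name": "master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": 1},
        {"Name": "core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": 2},
        # Task nodes only add compute, so they are the usual place for Spot.
        {"Name": "task", "InstanceRole": "TASK",
         "Market": "SPOT" if spot else "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": task_nodes},
    ]


groups = build_instance_groups(task_nodes=4)
```

The list would be passed as `Instances={"InstanceGroups": groups, ...}` when launching the cluster.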
Kinesis
Of the services covered here, Kinesis is the only one w/ a real-time response
If the question asks for a real-time solution for processing or moving data, look for Kinesis
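A minimal sketch of moving one event into a Kinesis data stream, assuming boto3 is installed and AWS credentials are configured. The stream name, event fields, and helper names are hypothetical; `StreamName`, `Data`, and `PartitionKey` are the parameters of boto3's `kinesis.put_record`.

```python
import json


def build_put_record(stream_name, event, partition_key):
    """Return keyword arguments shaped for boto3's kinesis.put_record."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as opaque bytes.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key land on the same shard.
        "PartitionKey": partition_key,
    }


def send(params):
    """Send the record. Not called here: needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported lazily
    return boto3.client("kinesis").put_record(**params)


params = build_put_record("clickstream", {"user": "u1", "action": "click"}, "u1")
```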
Kinesis Firehose
Near-real-time delivery & ease of use
SQS and Kinesis Queues
Each has its own pros & cons
SQS is easier & simpler
Kinesis is faster & can retain data for up to a year (the default is 24 hours)
Athena
Anytime serverless SQL or querying data stored in S3 comes up, think Athena
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run
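To make "standard SQL over S3" concrete, here is a hedged sketch of submitting a query through boto3. The database, table, and bucket names are placeholders; `QueryString`, `QueryExecutionContext`, and `ResultConfiguration` are parameters of boto3's `athena.start_query_execution`.

```python
def build_athena_query(sql, database, output_location):
    """Return keyword arguments shaped for athena.start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes result files to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run(params):
    """Submit the query. Not called here: needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported lazily
    athena = boto3.client("athena")
    return athena.start_query_execution(**params)["QueryExecutionId"]


params = build_athena_query(
    "SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="weblogs",                               # hypothetical database
    output_location="s3://example-bucket/athena-results/",  # hypothetical bucket
)
```

The call is asynchronous: it returns a query execution ID, and the results are fetched separately once the query finishes.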
Glue
Serverless ETL
It can help create a schema for your data when paired w/ Athena
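One way the Glue and Athena pairing tends to look in practice is a crawler that scans an S3 path and records the inferred schema in the Glue Data Catalog, which Athena then queries. The sketch below assumes boto3; the crawler name, role ARN, database, and path are placeholders, while `Name`, `Role`, `DatabaseName`, and `Targets` are parameters of boto3's `glue.create_crawler`.

```python
def build_crawler(name, role_arn, database, s3_path):
    """Return keyword arguments shaped for boto3's glue.create_crawler."""
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler assumes
        "DatabaseName": database,    # catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start(params):
    """Create and run the crawler. Not called here: needs boto3 and credentials."""
    import boto3  # third-party dependency, imported lazily
    glue = boto3.client("glue")
    glue.create_crawler(**params)
    glue.start_crawler(Name=params["Name"])


params = build_crawler(
    "access-logs-crawler",                                  # hypothetical name
    "arn:aws:iam::123456789012:role/ExampleGlueRole",       # hypothetical role
    "weblogs",
    "s3://example-bucket/access-logs/",
)
```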
QuickSight
Creating a dashboard? QuickSight is a data visualization tool
Elasticsearch
Excels when combined w/ Logstash & Kibana (the ELK stack) to query server logs
Search engine commonly used for log analytics, full-text search, security intelligence, business intelligence, and operations intelligence use cases.
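For the log-analytics case, a query might look roughly like the sketch below: a full-text match on the log message, filtered to a recent time window. The index pattern, endpoint, and helper names are assumptions; `message` and `@timestamp` are the field names Logstash emits by default, and the body uses the standard `bool` / `match` / `range` query DSL. A managed AWS domain may also require signed requests, which this sketch leaves out.

```python
import json
import urllib.request


def build_log_search(text, since="now-1h", size=20):
    """Build a query body: full-text match on `message`, newest first."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Scored full-text match against the log line.
                "must": [{"match": {"message": text}}],
                # Unscored time-window filter.
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
    }


def search(endpoint, index, body):
    """POST the body to <endpoint>/<index>/_search. Not called here."""
    req = urllib.request.Request(
        f"{endpoint}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


body = build_log_search("timeout")
```

A call such as `search("https://search.example.com", "logs-*", body)` would return the matching log events, assuming that hypothetical endpoint and index pattern exist.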