Big Data Whitepaper Flashcards
What is the minimum number of DPUs required for a Glue ETL job?
2
What is the default number of DPUs allocated to a Glue ETL job?
10
What two languages does Glue ETL use when generating code?
Python and Scala (the generated code runs on Apache Spark)
What is the minimum scheduling interval for Glue ETL jobs?
5 minutes (not the right tool for streaming data)
What kind of databases is AWS Glue not compatible with?
NoSQL
What is the maximum item size in DynamoDB?
400 KB
Across how many data centers does DynamoDB replicate data?
3 (within a single Region)
What needs to be used to achieve cross-Region replication in DynamoDB?
DynamoDB Streams
By default, how often does Amazon Elasticsearch Service take snapshots and back them up to S3?
Daily
What is the maximum EBS volume size per Elasticsearch instance?
1.5 TB
What is the default maximum number of nodes per ES domain?
20
What is the absolute maximum number of nodes per ES domain?
100
What are two anti-patterns for ES?
OLTP and ad hoc queries
Which QuickSight edition is required for encryption at rest?
Enterprise
When should EC2 be used in a big data setting instead of serverless or managed solutions?
Specialized environments or compliance requirements
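
Worked example for the DynamoDB item-size card: a minimal sketch of how the 400 KB limit might be checked before a write. The helper names (estimate_item_size, fits_in_dynamodb) are hypothetical and not part of any AWS SDK. The sketch assumes 400 KB means 400 * 1024 bytes and handles only string attributes, where both the attribute name and its value count toward the size; other attribute types are sized differently and are not covered.

```python
# Assumption: 400 KB is read as 400 * 1024 bytes.
MAX_ITEM_BYTES = 400 * 1024


def estimate_item_size(item: dict[str, str]) -> int:
    """Approximate size in bytes of an item made only of string attributes.

    Each attribute contributes the UTF-8 length of its name plus the
    UTF-8 length of its value. Non-string types are out of scope here.
    """
    return sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in item.items()
    )


def fits_in_dynamodb(item: dict[str, str]) -> bool:
    """Return True if the estimated size is within the item limit."""
    return estimate_item_size(item) <= MAX_ITEM_BYTES
```

A payload that is itself exactly 400 KB will not fit, because the attribute name adds to the total. Larger objects are usually stored elsewhere (for example in S3), with only a pointer kept in the item.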