11. S3 & Athena Flashcards
What is S3 Cross Region Replication?
- Must be set up for each region you want to replicate to
- Replicated files are read-only copies, updated asynchronously in near real-time
- Great for dynamic content that needs to be available at low latency in a few regions
What is Athena?
- serverless service to perform analytics directly against S3 files
- uses SQL language to query the files
- charged per query, based on the amount of data scanned
- “analyze data directly on S3” = Athena
- best for business intelligence, analytics, and reporting, e.g. querying VPC Flow Logs, ELB logs, and CloudTrail logs
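
As a rough illustration of the bullets above, submitting a SQL query against S3-backed data might look like the sketch below. It assumes a boto3 Athena client (`boto3.client("athena")`) is passed in; the function name, the `vpc_flow_logs` table, and any database or bucket names are hypothetical placeholders, not part of the original notes.

```python
def start_athena_query(athena, sql, database, output_location):
    """Submit a SQL query to Athena and return its execution id.

    `athena` is assumed to be a boto3 Athena client.
    `output_location` is an S3 URI where Athena writes the result files.
    """
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical query: top source addresses by bytes in a flow-logs table.
EXAMPLE_SQL = """
SELECT srcaddr, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
GROUP BY srcaddr
ORDER BY total_bytes DESC
LIMIT 10
"""
```

The query runs asynchronously, so the returned id would then be polled (for example with `get_query_execution`) before reading results. Because billing follows data scanned, selecting fewer columns from a columnar format such as Parquet tends to cost less than scanning whole CSV files.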
What is Multi-Part upload?
- S3 feature to help parallelize uploads (speed up transfers)
- required for files > 5 GB (the size limit of a single PUT)
- recommended for files > 100 MB
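
To make the parallelization idea concrete, here is a minimal sketch of how a file could be split into parts before uploading. The function name and the 100 MiB default are illustrative assumptions; the 5 MiB minimum part size and the 10,000-part cap are S3 limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(file_size, part_size=100 * MIB):
    """Return a list of (part_number, offset, length) tuples covering the file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts
```

Each tuple describes one independent chunk, so the parts can be sent concurrently (for instance with `upload_part`) and a failed part retried on its own. In practice boto3's `upload_file` performs this splitting automatically above a configurable threshold.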
What is S3 Transfer Acceleration?
- speeds up long-distance transfers to and from a bucket (most often used for uploads)
- increases transfer speed by routing the file through an AWS edge location, which forwards the data to the S3 bucket in the target region
- compatible with multi-part upload
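
For reference, an accelerated request is sent to a dedicated endpoint rather than the regular regional one. The helper below is a small sketch of that naming rule; the function name is hypothetical, and it assumes acceleration has already been enabled on the bucket.

```python
def accelerate_endpoint(bucket):
    """Return the Transfer Acceleration hostname for a bucket."""
    # Acceleration needs a DNS-compliant bucket name without periods.
    if "." in bucket:
        raise ValueError("bucket names containing '.' cannot use Transfer Acceleration")
    return f"{bucket}.s3-accelerate.amazonaws.com"
```

With boto3 the same effect is usually achieved by creating the client with `Config(s3={"use_accelerate_endpoint": True})` instead of building hostnames by hand.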
What are S3 Byte-Range fetches?
- parallelize GETs by requesting specific byte ranges
- better resilience in case of failures
- can be used to speed up downloads
- can be used to retrieve only partial data (for example, the head of a file)
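
The bullets above can be sketched as a helper that splits an object into HTTP `Range` header values, one per parallel GET. The function name and the chunk sizes are illustrative assumptions.

```python
def byte_ranges(object_size, chunk_size):
    """Return HTTP Range header values that together cover the whole object."""
    if object_size <= 0 or chunk_size <= 0:
        raise ValueError("sizes must be positive")
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

Each value can be passed as the `Range` argument of a boto3 `get_object` call, so the chunks download concurrently and a failed chunk is retried alone. Requesting only `bytes=0-1023` would fetch just the first kilobyte of a file, such as a header.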
What is S3 Select?
- retrieve less data using SQL by performing server-side filtering
- can filter by rows and columns (simple SQL statements)
- less network transfer, less CPU cost client-side
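
A minimal sketch of server-side filtering, assuming a boto3 S3 client is passed in and the object is a CSV file with a header row. The function name and any bucket, key, or column names are hypothetical.

```python
def select_rows(s3, bucket, key, expression):
    """Run an S3 Select SQL expression against a CSV object and return the matching text."""
    response = s3.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    # The payload is a stream of events; only "Records" events carry row data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Decode once at the end so multi-byte characters split across chunks survive.
    return b"".join(chunks).decode("utf-8")
```

An expression such as `SELECT s.name FROM S3Object s WHERE s.city = 'Paris'` filters both columns and rows inside S3, so only the matching bytes cross the network.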
How do you ensure that an event notification is sent for every successful write to your S3 bucket?
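Enable versioning on the bucket (without it, two simultaneous writes to the same object may produce only one notification)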
What file types does Athena support?
CSV, JSON, ORC, Avro, and Parquet
Is S3 a global or regional service?
Regional, but buckets must have a globally unique name across all regions and all accounts
If you’re hosting a static website and receive a 403 Forbidden error, what should you do?
Make sure the bucket policy allows public reads and that Block Public Access settings are not overriding it
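
For illustration, a public-read policy of the kind this answer refers to could be generated as follows. This is a sketch: the function name and `example-bucket` are placeholders, and the statement grants read access to every object in the bucket.

```python
import json


def public_read_policy(bucket):
    """Build a bucket policy document that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Print the policy as JSON for a placeholder bucket name.
print(json.dumps(public_read_policy("example-bucket"), indent=2))
```

The JSON string could then be applied with boto3's `put_bucket_policy(Bucket=..., Policy=...)`. Note that the `Resource` ends in `/*` because `s3:GetObject` applies to objects, not to the bucket itself.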
Which S3 storage class automatically moves objects between access tiers based on usage, for a small monthly monitoring fee?
S3 Intelligent-Tiering