Data Stores Flashcards
Concepts
What are the three types of data persistence?
Persistent, Transient, Ephemeral
Concepts
What is a persistent data store?
Durable data that sticks around after reboots, restarts, power cycles, etc.
ex. Glacier, RDS
Concepts
What is a transient data store?
Temporary data that is stored and passed along to another process or storage
ex. SQS, SNS
Concepts
What is an ephemeral data store?
A temporary store whose data is lost when the instance or process stops.
ex. EC2 Instance store, memcached
Concepts
Explain IOPS
Input/output operations per second: a measure of how fast we can read from and write to a device
Concepts
What is throughput?
A measure of how much data can be moved in a given amount of time
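As a rough illustration of how the two measures relate, throughput can be estimated as IOPS multiplied by the size of each I/O operation. The function name and the sample numbers are assumptions made for this sketch, not figures for any particular device.

```python
def throughput_mib_per_s(iops: int, io_size_kib: int) -> float:
    """Estimate throughput in MiB/s from IOPS and the size of each I/O in KiB."""
    return iops * io_size_kib / 1024


# Hypothetical device: 3,000 IOPS at 16 KiB per operation.
print(throughput_mib_per_s(3000, 16))  # 46.875
```

The same IOPS figure gives very different throughput depending on the I/O size, which is why the two are quoted separately.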
Concepts
What are the two types of consistency models in data storage?
ACID and BASE
Concepts
What is the ACID consistency model?
Atomic - All or nothing
Consistent - Transactions are valid
Isolated - Transactions don’t interfere with each other
Durable - Transactions stick around and won’t disappear.
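A minimal sketch of the "all or nothing" property, using Python's built-in sqlite3 module as a stand-in for an ACID database. The accounts table, the names, and the transfer helper are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 150))  # False: would overdraw alice
print(balances(conn))                       # {'alice': 100, 'bob': 0}
```

The credit to bob runs first and succeeds, but it is rolled back when the debit from alice violates the constraint, so no partial transfer is ever visible.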
Concepts
What is the BASE consistency model?
Basically available - Favors availability, even if the data returned is stale
Soft-state - Might not be consistent across all stores
Eventual consistency - Will achieve consistency eventually
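The three BASE properties can be sketched with a toy in-memory model. This is not how any real data store is implemented; the class and method names are assumptions made for illustration.

```python
class EventuallyConsistentStore:
    """Toy model: a write lands on one replica and reaches the rest on sync()."""

    def __init__(self, replica_count: int = 2):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes accepted but not yet propagated

    def write(self, key, value):
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica: int = 0):
        # Always answers, even if this replica has not seen the latest write.
        return self.replicas[replica].get(key)

    def sync(self):
        # Propagate pending writes in order; afterwards all replicas agree.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("price", 10)
store.sync()
store.write("price", 12)
print(store.read("price", replica=0))  # 12 (latest)
print(store.read("price", replica=1))  # 10 (stale until the next sync)
store.sync()
print(store.read("price", replica=1))  # 12
```

Reads never block on replication, which is where the scalability comes from, at the price of occasionally returning an old value.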
Concepts
What is the benefit of ACID over BASE?
Data is always consistent
Concepts
What is the benefit of BASE?
It scales much better than ACID
S3
What kind of Store is S3?
Object Store
S3
What is the maximum size of an object that can be stored in S3?
5TB
S3
What is the maximum PUT size when uploading to S3?
5GB
S3
When is it recommended to use multi-part uploads?
When the file size is larger than 100MB
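A hedged sketch of how the 100MB guideline might be turned into an upload plan. The 10,000-part limit and 5 MiB minimum part size are S3's documented multipart limits; the function name, the 16 MiB default part size, and treating 100MB as 100 MiB are assumptions made for this example.

```python
import math

MIB = 1024 ** 2
MAX_PARTS = 10_000               # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * MIB          # S3 minimum part size (except the last part)
MULTIPART_THRESHOLD = 100 * MIB  # guideline from the card above (assumed MiB)


def plan_upload(file_size: int, part_size: int = 16 * MIB):
    """Return ("single", 1) or ("multipart", part_count) for file_size bytes."""
    if file_size <= MULTIPART_THRESHOLD:
        return ("single", 1)
    # Grow the part size if the file would otherwise need more than MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, math.ceil(file_size / MAX_PARTS))
    return ("multipart", math.ceil(file_size / part_size))


print(plan_upload(50 * MIB))    # ('single', 1)
print(plan_upload(1024 * MIB))  # ('multipart', 64)
```

Files above 5GB have no choice, since a single PUT cannot carry them; between 100MB and 5GB multipart is a recommendation rather than a requirement.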
S3
The S3 “path” is not a file path, but what?
ex: s3://mybucket/finance/april/16/invoice_45675.pdf
A “key”
This key uniquely identifies the record in the file store. The record be
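To make the bucket/key split concrete, the example URI can be taken apart with Python's standard library. The helper name split_s3_uri is hypothetical; the slashes inside the key are ordinary characters, not directories.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str):
    """Split an s3:// URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Drop only the single slash that separates the bucket from the key.
    return parsed.netloc, parsed.path[1:]


bucket, key = split_s3_uri("s3://mybucket/finance/april/16/invoice_45675.pdf")
print(bucket)  # mybucket
print(key)     # finance/april/16/invoice_45675.pdf
```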
S3
What security measures does S3 provide?
Resource-based (object ACL, bucket policy)
User-based (IAM policies)
Optional multi-factor authentication before delete
S3
True or False: You can enable versioning in S3?
True
S3 will create a new version of the file with each write
S3
What is the benefit to S3 versioning?
Enables “rollback,” and can be integrated with lifecycle management capabilities
S3
True or False: Versioning is not compatible with S3 lifecycle capabilities?
False
Versioning IS compatible with lifecycle capabilities
S3
What is the downside to versioning?
Old versions count as billable size until permanently deleted
Ensure you have a lifecycle management policy to control costs
S3
True or False: S3 supports cross-region replication?
True
S3
What are the benefits of cross-region replication?
Security Requirements
Compliance Requirements
Latency
S3
Why should you consider latency for cross-region replication?
Customers accessing data closer to them results in faster response times
S3
Tier: S3 Standard
Frequently accessed data with redundancy
S3
Tier: Standard-IA
Long-lived, infrequently accessed data
S3
Tier: One Zone-IA
Long-lived, infrequently accessed, non-critical data
S3
Tier: Reduced Redundancy
Frequently accessed, non-critical data
S3
Tier: Intelligent Tiering
Long-lived data with changing or unknown access patterns
S3
Tier: Glacier
Long-term data archiving
Retrieval times ranging from minutes to hours
S3
Tier: Glacier Deep Archive
Long-term data archiving
Retrieval times within 12 hours
S3
What is intelligent tiering?
Automatically moves data between access tiers based on usage
to reduce storage costs
S3
Why should someone consider intelligent tiering when it’s more expensive?
Depending on the use case, the reduced storage costs can outweigh the extra charge
S3
True or False: Intelligent tiering will automatically archive unused data?
False: You have to configure archival policies
S3
What are the 3 benefits of S3 lifecycle management?
Optimize storage costs
Adhere to data retention policies
Keep S3 buckets well maintained
S3
In what two ways can lifecycle management transition your files?
Transition between storage classes
Transition to archive
S3
What three ways can files be marked as lifecycle managed?
Prefixes
Tags
Current version vs previous version
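The selectors and transitions from the cards above can be sketched as a lifecycle rule. The dictionary is intended to follow the shape that boto3's put_bucket_lifecycle_configuration accepts; the helper name, rule ID, prefix, tag, and day counts are all hypothetical, and nothing here calls AWS.

```python
def lifecycle_rule(rule_id, prefix, tag=None, ia_after_days=30,
                   archive_after_days=90, expire_old_versions_after_days=365):
    """Build one lifecycle rule selected by prefix and, optionally, a tag."""
    if tag is None:
        rule_filter = {"Prefix": prefix}
    else:
        key, value = tag
        # Combining a prefix with tags requires the "And" wrapper.
        rule_filter = {
            "And": {"Prefix": prefix, "Tags": [{"Key": key, "Value": value}]}
        }
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": rule_filter,
        # Current versions: move between storage classes, then to archive.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        # Previous versions: delete them after a retention period.
        "NoncurrentVersionExpiration": {
            "NoncurrentDays": expire_old_versions_after_days
        },
    }


config = {
    "Rules": [lifecycle_rule("archive-invoices", "finance/", tag=("type", "invoice"))]
}
print(config["Rules"][0]["Filter"])
```

Expiring noncurrent versions is also what keeps the versioning costs mentioned earlier under control.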
S3
What services are used with data lakes?
S3 Analytics
Athena, Redshift Spectrum, QuickSight
S3
What service is used with streaming data?
S3 Analytics
Kinesis Firehose
S3
What services support AI/ML?
S3 Analytics
Rekognition, Lex, MXNet
S3
What service supports Storage class analysis?
S3 Analytics
S3 Management Analytics