S3 Flashcards
S3 basics
- Safe place to store files
- Object-based storage (files are uploaded and stored as objects)
- Data is spread across multiple devices & facilities
- 0 bytes to 5 TB per file
- Storage unlimited
- Files are stored in buckets; bucket names share a universal namespace, so each name must be globally unique
- After a successful upload, an HTTP 200 status code is returned to your browser
S3 objects
Consist of:
- Key & value
- Version ID
- Metadata
- Subresources: access control lists, torrents
S3 data consistency and guarantees
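- Strong ‘Read after Write’ consistency for PUTS of new objects, overwrite PUTS and DELETES (since December 2020): a read issued after a successful write returns the latest version
- Before that change, overwrite PUTS and DELETES were only ‘Eventually Consistent’ (i.e. not immediate; if you read straight after the write, you could get an older version)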
Guarantees:
- 99.99% availability for the S3 platform
- 99.999999999% (11 9s) durability for S3 information (i.e. info not being lost)
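To get a feel for what 11 9s of durability means, here is a rough worked calculation. It assumes the figure can be read as an annual per-object loss probability of 1 in 10^11 and that losses are independent, which is a simplification; the function name is made up for illustration.

```python
from fractions import Fraction

# Assumption: 99.999999999% durability ~ annual loss probability of 1e-11 per object.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses (simplified model)."""
    expected_losses_per_year = object_count * ANNUAL_LOSS_PROBABILITY
    return 1 / expected_losses_per_year


# Storing 10 million objects: on average one object lost every 10,000 years.
print(expected_years_per_lost_object(10_000_000))  # 10000
```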
S3 features
- Tiered storage available
- Lifecycle management
- Versioning
- Encryption
- MFA for deleting objects
- Secure data further using Access Control Lists and Bucket Policies
S3 Storage Classes
- Standard
- Infrequent Access (IA): for data accessed less often but needing rapid access when required. Charged a retrieval fee
- One Zone-IA: lower-cost option for IA; data is stored in only one AZ
- Intelligent-Tiering: optimises costs by moving data between tiers automatically, with no impact on performance and no operational overhead
S3 Glacier
- Glacier: secure, durable, low-cost data archiving. Store any amount of data at a cost at or below on-prem solutions. Retrieval times configurable from minutes to hours
- Glacier Deep Archive: lowest-cost storage class, for data where a retrieval time of 12 hours is acceptable
S3 costs
Costs incurred on:
- Storage
- Number of requests
- Storage management pricing (i.e. moving between tiers)
- Data transfer
- Transfer acceleration (fast transfer of files over long distances using CloudFront edge locations)
- Cross-region replication
S3 Versioning
- All versions of an object are stored, including every write; a delete does not remove data but places a delete marker on top of the current version
- Once enabled, versioning cannot be disabled, only suspended; to remove it entirely you have to delete the bucket
- Integrates with lifecycle rules (e.g. moving older versions to Glacier)
- Supports MFA Delete as an additional layer of security
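The version stack and delete-marker behaviour can be pictured with a small in-memory toy model. This is only a sketch for revision purposes, not how S3 is implemented; `ToyVersionedBucket`, its methods and the sequential version IDs are all invented for illustration (real version IDs are opaque strings, and real S3 returns errors rather than `None`).

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    version_id: str
    data: Optional[bytes]  # None means this entry is a delete marker


class ToyVersionedBucket:
    """Hypothetical in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}  # key -> list of Version, oldest first
        self._ids = itertools.count(1)

    def _append(self, key, data):
        version = Version(f"v{next(self._ids)}", data)
        self._versions.setdefault(key, []).append(version)
        return version.version_id

    def put(self, key, data):
        # Every write adds a new version; older versions are kept.
        return self._append(key, data)

    def delete(self, key):
        # A plain delete only stacks a delete marker on top.
        return self._append(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest entry wins; a delete marker makes the key look deleted.
            return versions[-1].data if versions else None
        for version in versions:
            if version.version_id == version_id:
                return version.data
        return None

    def versions(self, key):
        return list(self._versions.get(key, []))


bucket = ToyVersionedBucket()
first = bucket.put("notes.txt", b"draft")
bucket.put("notes.txt", b"final")
bucket.delete("notes.txt")
print(bucket.get("notes.txt"))         # None (hidden by the delete marker)
print(bucket.get("notes.txt", first))  # b'draft' (older version still stored)
```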
Lifecycle management with S3
- Lifecycle rules automate managing objects (e.g. moving between tiers or deleting after a certain number of days)
- Can be used in conjunction with versioning (e.g. applied to current and previous versions)
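A lifecycle rule is ultimately a small configuration document. The sketch below builds one as a plain Python dict; the shape is intended to match what boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing is sent to AWS here. The rule ID, prefix and day counts are illustrative assumptions, not recommendations.

```python
def lifecycle_configuration(prefix: str) -> dict:
    """Build an example lifecycle configuration (hypothetical values)."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",  # made-up rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Current versions: move to IA, then Glacier, then delete.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
                # Previous (noncurrent) versions get their own schedule.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
                ],
                "NoncurrentVersionExpiration": {"NoncurrentDays": 180},
            }
        ]
    }


config = lifecycle_configuration("logs/")
print(config["Rules"][0]["Transitions"][1]["StorageClass"])  # GLACIER
```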
S3 Object Lock and Glacier Vault Lock
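- S3 Object Lock: store objects using a ‘write once, read many’ (WORM) model. Stops objects from being modified or deleted for a fixed period or indefinitely
- Can assist in meeting regulatory requirements for WORM storage. Protection is applied via retention modes or legal holds:
- Governance mode: only users with special permissions can overwrite or delete a locked version
- Compliance mode: no one (not even the root user) can overwrite or delete a locked version until its retention period expires
- Legal holds: placed on an object indefinitely until removed (no retention period needed)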
- Glacier vault lock: similar to object lock, allows for WORM models in Glacier
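The way the retention modes and legal holds interact can be summarised as a small decision function. This is a simplified sketch of the rules listed above, not AWS code; the function name, parameters and mode strings are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def can_delete_version(
    mode: Optional[str],
    retain_until: Optional[datetime],
    legal_hold: bool,
    now: datetime,
    has_bypass_permission: bool = False,
) -> bool:
    """Return True if a locked object version may be deleted (toy model)."""
    if mode not in (None, "GOVERNANCE", "COMPLIANCE"):
        raise ValueError(f"unknown retention mode: {mode}")
    if legal_hold:
        return False  # a legal hold blocks deletion until it is removed
    if mode is None or retain_until is None or now >= retain_until:
        return True   # no retention set, or the retention period has expired
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only specially permitted users
    return False      # COMPLIANCE: nobody, not even the root user


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(can_delete_version("GOVERNANCE", later, False, now, True))   # True
print(can_delete_version("COMPLIANCE", later, False, now, True))   # False
```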
S3 Performance
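- Prefixes: the part of the key between the bucket name and the object name (like subfolders within a bucket). Request-rate limits apply per prefix (at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second), so you can get better performance by spreading reads across different prefixes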
- Limitations with KMS: uploading/downloading with KMS encryption counts towards KMS quota and adds to latency
- Multipart uploads: recommended for objects over 100MB, required for files over 5GB. Can parallelize uploads as well
- Byte-range fetches: parallelize downloads by specifying byte ranges
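The multipart thresholds and byte-range fetches above come down to simple arithmetic. The sketch below uses hypothetical helper names, treats MB/GB as binary units (an assumption), and only computes `Range` header values and an upload strategy; it does not talk to S3.

```python
from typing import List

MB = 1024 ** 2
GB = 1024 ** 3


def upload_strategy(object_size: int) -> str:
    """Pick an upload approach using the 100 MB / 5 GB thresholds."""
    if object_size > 5 * GB:
        return "multipart required"
    if object_size > 100 * MB:
        return "multipart recommended"
    return "single PUT"


def byte_ranges(object_size: int, chunk_size: int) -> List[str]:
    """Split an object into HTTP Range header values for parallel GETs."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size > 0")
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # end is inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(upload_strategy(250 * MB))  # multipart recommended
print(byte_ranges(10, 4))         # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range can then be requested by a separate worker, and a failed range can be retried on its own without re-downloading the whole object.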
S3 Select and Glacier Select
- S3 Select lets you retrieve only a subset of the data inside an object using SQL expressions. E.g. pull specific rows out of a compressed CSV file, instead of downloading and decompressing the whole object
- Highly regulated industries write data directly to Glacier to satisfy compliance rules. Others use lifecycle rules to move objects to Glacier. Glacier Select lets you run SQL queries against data in Glacier
AWS Organisations and Consolidated Billing
Organisations:
- An account management service that enables consolidation of multiple AWS accounts to manage centrally.
- Best practice is to use the root (management) account just for billing and separate accounts for certain teams/role types (devs, testers etc.). Policies are made at the top level and inherited by the accounts below
Consolidated billing:
- Paying account is independent but settles the resource bills for linked accounts
- Benefit is the ability to aggregate usage across accounts for better pricing
- Service Control Policies enable/disable account services
Sharing S3 buckets across accounts
- Use bucket policies & IAM (entire bucket, programmatic access only)
- Use bucket ACLs & IAM (down to individual objects, programmatic access only)
- Cross-account IAM roles. Programmatic and console access
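For the bucket-policy option, the policy itself is a JSON document attached to the bucket. The sketch below generates one that grants another account read access; the bucket name, the account ID `111122223333` and the `Sid` are placeholders, and the chosen actions are just one plausible read-only set.

```python
import json


def cross_account_read_policy(bucket: str, account_id: str) -> str:
    """Build a bucket policy granting another account read access (example)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CrossAccountRead",  # made-up statement ID
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # needed for s3:ListBucket
                    f"arn:aws:s3:::{bucket}/*",  # needed for s3:GetObject
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(cross_account_read_policy("example-shared-bucket", "111122223333"))
```

Users in the other account still need IAM permissions on their side that allow the same actions on this bucket.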
Cross-region replication
- Objects in the source bucket are copied to a bucket in another region
- Versioning must be enabled on both the source and destination buckets
- Existing objects in the source bucket are not replicated automatically; only objects added or updated after replication is enabled are replicated