S3, Databases and Analytics Flashcards
What is Amazon S3, and how are files stored?
Amazon S3 (Simple Storage Service) is an object storage service that stores files (objects) in “buckets”, which are top-level containers loosely comparable to directories.
What are the naming conventions for S3 buckets?
S3 buckets must:
- Have a globally unique name
- Be 3-63 characters long
- Not contain uppercase letters or underscores
- Not start with the prefix xn--
- Not end with -s3alias
- Start with a lowercase letter or number
- Not be an IP address
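The rules above can be sketched as a local check. This is a minimal illustration, not an official validator: the function name is_valid_bucket_name is hypothetical, it covers only the rules listed on this card, and global uniqueness cannot be checked without asking AWS.

```python
import ipaddress
import string

# Characters a bucket name may start with, per the card above.
_ALLOWED_FIRST = set(string.ascii_lowercase + string.digits)


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the rules listed on this card (hypothetical helper).

    Global uniqueness is NOT checked; that requires a call to AWS.
    """
    if not 3 <= len(name) <= 63:
        return False
    if name[0] not in _ALLOWED_FIRST:
        return False
    if any(c.isupper() for c in name) or "_" in name:
        return False
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False
    try:
        # A name formatted like an IPv4 address (e.g. 192.168.5.4) is rejected.
        ipaddress.IPv4Address(name)
        return False
    except ValueError:
        return True


if __name__ == "__main__":
    for candidate in ["my-bucket-01", "My-Bucket", "my_bucket", "192.168.5.4"]:
        print(candidate, is_valid_bucket_name(candidate))
```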
What is the maximum size of an S3 object?
The maximum size of an S3 object is 5 TB. A single upload request can carry at most 5 GB, so files larger than 5 GB must use multipart upload.
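The size limits above can be turned into a small decision helper. This is a sketch under assumptions: the function names, the binary interpretation of GB and TB (powers of 1024), and the 100 MB example part size are illustrative choices, not values from this card.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3  # assumption: binary units
TB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TB  # maximum object size, from the card above
MAX_SINGLE_PUT = 5 * GB   # above this, multipart upload is required


def upload_strategy(size_bytes: int) -> str:
    """Return 'single', 'multipart', or 'too large' for a given file size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_SIZE:
        return "too large"
    if size_bytes > MAX_SINGLE_PUT:
        return "multipart"
    return "single"


def parts_needed(size_bytes: int, part_size: int = 100 * MB) -> int:
    """Number of parts when splitting into chunks of part_size (hypothetical default)."""
    return math.ceil(size_bytes / part_size)


if __name__ == "__main__":
    print(upload_strategy(1 * GB))
    print(upload_strategy(20 * GB))
    print(parts_needed(20 * GB))
```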
What is the difference between the key and object in Amazon S3?
The key is the full path to the object within its bucket; the object is the content (file) stored under that key. Example: in the S3 URI s3://my-bucket/my_folder/my_file.txt, the bucket is my-bucket and the key is my_folder/my_file.txt.
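To make the bucket/key split concrete, here is a minimal sketch that separates an S3 URI into its bucket name and object key. The helper name split_s3_uri is hypothetical, and the sketch assumes the key contains no "?" or "#" characters, which urlparse would treat as query or fragment markers.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/path/to/file' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Drop only the single slash that separates the bucket from the key.
    key = parsed.path[1:] if parsed.path.startswith("/") else parsed.path
    return parsed.netloc, key


if __name__ == "__main__":
    bucket, key = split_s3_uri("s3://my-bucket/my_folder/my_file.txt")
    print("bucket:", bucket)
    print("key:", key)
```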
What are the types of metadata that can be associated with S3 objects?
S3 objects can have:
- Metadata (system or user-defined key-value pairs)
- Tags (Unicode key-value pairs, up to 10 per object)
- Version ID (if versioning is enabled)
What are the security options for Amazon S3?
- User-based: IAM policies that control API access.
- Resource-based: Bucket policies, Object ACLs, and Bucket ACLs.
- Encryption: Server-side and client-side encryption.
What can an S3 bucket policy control?
S3 bucket policies (JSON-based) can:
- Grant public access to a bucket
- Force encryption of objects during upload
- Grant cross-account access
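As an illustration of the "force encryption" case, the sketch below builds a bucket policy document that denies uploads sent without a server-side encryption header. The function name and the Sid value are hypothetical, and the statement follows a commonly used pattern rather than a policy taken from this deck; adapt it before applying it to a real bucket.

```python
import json


def deny_unencrypted_uploads_policy(bucket_name: str) -> dict:
    """Build a bucket policy that rejects PutObject requests lacking an SSE header.

    Hypothetical helper; the returned dict is plain JSON-serializable data.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",  # illustrative identifier
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # "Null": "true" matches requests where the header is absent.
                "Condition": {
                    "Null": {"s3:x-amz-server-side-encryption": "true"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_unencrypted_uploads_policy("my-bucket"), indent=2))
```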
What is S3 versioning and its benefits?
S3 versioning allows multiple versions of the same file to exist. Benefits include protection against accidental deletions and easy rollback to previous versions.
What are the two types of replication in S3?
- Cross-Region Replication (CRR): Replicates objects to a bucket in a different region.
- Same-Region Replication (SRR): Replicates objects within the same region.
Both require versioning to be enabled.
What are the use cases for CRR and SRR in S3?
- CRR: Compliance, lower latency, replication across accounts.
- SRR: Log aggregation, replication between production and test accounts.
Name the main S3 storage classes and their use cases.
- S3 Standard: For frequently accessed data (e.g., big data analytics).
- S3 Standard-IA: For infrequently accessed data (e.g., disaster recovery).
- S3 One Zone-IA: For infrequently accessed data in a single AZ.
- S3 Glacier: For archival data with varying retrieval times.
- S3 Intelligent-Tiering: For automatic cost optimization.
What is server-side encryption in Amazon S3, and how does it differ from client-side encryption?
Server-side encryption: S3 encrypts each object when it receives it, before storing it, and decrypts it when the object is downloaded.
Client-side encryption: The client encrypts the data before sending it to S3 and is responsible for managing the encryption keys.
What is the AWS Snow Family?
The AWS Snow Family includes offline devices (Snowcone, Snowball, Snowmobile) for data migration to S3 and edge computing, used when transferring large amounts of data is impractical over the network.
What is AWS Storage Gateway?
AWS Storage Gateway is a hybrid cloud service that connects on-premises environments with S3, providing file, volume, and tape gateway options for backup, disaster recovery, and tiered storage.
What is S3 Standard used for?
General-purpose storage class for frequently accessed and updated data with high durability and fast access.
How many Availability Zones does S3 Standard store data in?
A minimum of three Availability Zones.
Can S3 Standard host static websites?
Yes, it can host static websites.
What is S3 Standard-Infrequent Access (S3 Standard-IA) designed for?
Infrequently accessed data with lower storage costs but higher retrieval costs.
What is S3 Standard-IA commonly used for?
Long-term storage and backup.
What is the minimum storage duration for S3 Standard-IA?
30 days
Where does S3 One Zone-Infrequent Access store data?
In a single AWS Availability Zone for cost savings.