Amazon S3 Flashcards
Amazon S3 use cases
List a few use cases of Amazon S3.
- Backup & Storage
- Disaster Recovery
- Archive
- Hybrid Cloud Storage
- Application hosting
- Media hosting
- Data lakes & big data analytics
- Software delivery
- Static website
Sysco runs analytics on its data for insights
Nasdaq uses S3 Glacier
What are Amazon S3 buckets?
How do Amazon S3 buckets function in the context of cloud storage?
What types of data are commonly stored in Amazon S3 buckets?
Why are Amazon S3 buckets a fundamental component in cloud-based data storage?
Amazon S3 (Simple Storage Service) buckets are containers for storing and organizing data in the cloud.
These buckets act as virtual containers that hold various types of data, including images, videos, documents, and other files. Buckets are created at the region level, even though S3 appears to be a globally available service.
Each bucket is identified by a globally unique name and can be configured with specific access controls and storage settings.
- Buckets must have a globally unique name (across all regions and all accounts).
- Buckets are defined at the region level.
- S3 looks like a global service, but buckets are created in a region.
Amazon S3 buckets are crucial in cloud-based data storage because they provide a secure and scalable way to store various types of data. Whether it’s photos, videos, or important documents, these buckets ensure that your data is accessible, organized, and protected in the cloud.
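For illustration, here is a minimal boto3 sketch for creating a bucket in a specific region (the bucket name and region are placeholders; for us-east-1 you would omit the CreateBucketConfiguration):

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Bucket names are globally unique; this name is a hypothetical placeholder.
s3.create_bucket(
    Bucket="my-example-bucket-123456",
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},  # the bucket lives in this region
)
```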
What are Amazon S3 objects?
How do Amazon S3 objects function within the context of cloud storage?
What distinguishes Amazon S3 objects from S3 buckets?
Why are Amazon S3 objects an essential element in cloud-based data storage?
Amazon S3 objects are the individual pieces of data stored within an S3 bucket.
Each object consists of data (such as a file or document) and metadata (information about the data).
- Objects have a key (unique).
- The key is a full path.
- There is no concept of directories within buckets.
- Keys are made up of a prefix + object name, separated with “/”.
Amazon S3 objects play a vital role in cloud-based data storage as they represent the individual pieces of data stored in S3 buckets.
Whether it’s a photo, a video, or any other kind of file, these objects make sure your data is organized, accessible, and secure in the cloud.
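A minimal sketch of uploading an object with boto3 (bucket name and file are hypothetical); note that the key is one flat string, even though it looks like a directory path:

```python
import boto3

s3 = boto3.client("s3")

# The key "images/2024/photo.jpg" is a single string; "images/2024/" is just a prefix.
s3.put_object(
    Bucket="my-example-bucket-123456",   # hypothetical bucket
    Key="images/2024/photo.jpg",         # full path = prefix + object name
    Body=open("photo.jpg", "rb"),        # local file assumed to exist
)
```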
How does security access work in Amazon S3?
What measures are in place to secure access to data stored in Amazon S3?
How does S3 provide control over who can access and manipulate stored data?
Why is understanding Amazon S3 security access crucial for protecting sensitive information in the cloud?
Security access in Amazon S3 is managed through a combination of mechanisms:
IAM Policies:[User-Based] (IAM) policies define permissions for users, roles, or groups, specifying what actions they can perform on S3 resources.
Bucket Policies:[Resource-Based] S3 bucket policies allow for fine-grained control over access to the entire bucket, including defining rules based on IP addresses, conditions, or other factors.
Access Control Lists (ACLs):[Resource-based] ACLs enable users to control access at the individual object level, specifying which AWS accounts or groups of users have permission to access specific objects.
Encryption: S3 supports server-side encryption to protect data at rest.
Note: An IAM principal can access an S3 object if the user's IAM permissions ALLOW it OR the resource policy ALLOWS it, AND there is no explicit DENY.
Understanding Amazon S3 security access is crucial because it ensures that only authorized users can access and manipulate data stored in S3. By using IAM policies, bucket policies, ACLs, and encryption, sensitive information is protected from unauthorized access, providing a secure environment for cloud-based data storage.
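As an illustration of a resource-based bucket policy, here is a hedged sketch (bucket name is a placeholder) that allows public reads of all objects, applied with boto3:

```python
import json
import boto3

s3 = boto3.client("s3")

# Resource-based policy allowing anyone to GET objects from the (hypothetical) bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-example-bucket-123456/*",
    }],
}

s3.put_bucket_policy(Bucket="my-example-bucket-123456", Policy=json.dumps(policy))
```

Note that the bucket's Block Public Access settings must also permit public policies for a rule like this to take effect.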
What is Amazon S3 Static Website Hosting?
How does Amazon S3 facilitate the hosting of static websites?
What distinguishes static websites from dynamic ones?
Why is Amazon S3 Static Website Hosting a popular choice for certain types of web applications?
Amazon S3 Static Website Hosting allows users to host static websites (those with fixed content) directly from an S3 bucket.
To enable static website hosting, users configure their S3 bucket, define the main HTML document, and set up optional error documents. Unlike dynamic websites that generate content on-the-fly, static websites present pre-existing content to users.
Static websites are typically composed of HTML, CSS, JavaScript, and media files. Amazon S3 provides a cost-effective and scalable solution for hosting such websites, ensuring reliable performance and ease of setup.
If you get a 403 Forbidden error, make sure the bucket policy allows public reads.
Amazon S3 Static Website Hosting is a popular choice for certain types of web applications, especially those with fixed content, such as personal blogs, portfolios, or informational websites.
It provides a simple and cost-effective way to host static content, ensuring reliable performance and scalability for users who don’t require dynamic, database-driven websites.
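A minimal sketch of enabling static website hosting on a bucket (bucket and document names are placeholders); the main HTML document and an optional error document are set in the website configuration:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_website(
    Bucket="my-example-bucket-123456",              # hypothetical bucket holding the site files
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},  # main HTML document
        "ErrorDocument": {"Key": "error.html"},     # optional error document
    },
)
```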
What is Amazon S3 versioning?
How does Amazon S3 versioning work in the context of cloud storage?
What benefits does versioning offer in terms of data management and recovery?
Why is understanding Amazon S3 versioning important for ensuring data integrity and protection?
Amazon S3 versioning is a feature that allows users to keep multiple versions of an object (file) in the same S3 bucket. When versioning is enabled, each time an object is overwritten or deleted, a new version is created.
Users can then retrieve and restore previous versions of objects, providing a history of changes.
Benefits of versioning include:
1. Data Protection: It protects against accidental deletion or overwrites by maintaining a history of changes.
2. Recovery: In case of data corruption or unintended changes, users can easily revert to a previous version.
3. Audit Trail: Versioning creates an audit trail, helping track modifications and who made them.
- Versioning is enabled at bucket level.
- Easy rollbacks to a previous version.
Note: - Any file that is not versioned prior to enabling versioning will have version “null”.
- Suspending versioning doesn’t delete the previous versions.
Understanding Amazon S3 versioning is crucial because it ensures that your online treasures are protected from accidental changes or deletions. It’s like having a magical safety net that keeps a history of all the changes, allowing you to go back in time and recover any version of your data.
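As a sketch, versioning is turned on with a single bucket-level call (bucket name is a placeholder); objects uploaded before this call keep version “null”:

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning at the bucket level; use Status="Suspended" to suspend it later.
s3.put_bucket_versioning(
    Bucket="my-example-bucket-123456",
    VersioningConfiguration={"Status": "Enabled"},
)
```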
What is Amazon S3 Replication, specifically CRR and SRR?
How do Cross-Region Replication (CRR) and Same-Region Replication (SRR) work in Amazon S3?
Why would you choose CRR, and when would SRR be more appropriate?
Why is Amazon S3 Replication important for enhancing data durability and availability?
Amazon S3 Replication automatically and asynchronously copies objects from a source bucket to a target bucket. This replication can occur within the same AWS region (SRR) or across different AWS regions (CRR).
Cross-Region Replication (CRR): This involves replicating objects from one S3 bucket in one region to another S3 bucket in a different region. CRR helps enhance data durability and availability by providing redundancy in geographically distant locations.
Same-Region Replication (SRR): SRR replicates objects from one S3 bucket to another within the same AWS region. While it may not provide geographical redundancy, SRR is useful for maintaining copies of data within the same region for compliance or business continuity purposes.
- Buckets can be in different AWS accounts.
- Copying is asynchronous.
- Requires proper IAM permissions for S3.
USE-CASES:
- CRR: Compliance, Lower latency access, replication across accounts.
- SRR: log aggregation, live replication between production and test accounts.
NOTE:
- After you enable replication, only new objects are replicated.
- But, you can still replicate existing objects using S3 Batch Replication.
- For DELETE operations, you can replicate delete markers from source to target. Deletions with a version ID are not replicated, to avoid malicious deletes.
- There is no chaining of replication, i.e., if bucket 1 has replication into bucket 2, which has replication into bucket 3, then objects created in bucket 1 aren’t replicated to bucket 3.
Amazon S3 Replication is important because it provides an extra layer of protection for your data. Whether you want copies within the same region for quick access or copies in a different region for added security, replication ensures that your online treasures are resilient and available, even in the face of unexpected events.
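A minimal sketch of a replication rule, assuming hypothetical source/destination buckets and a placeholder IAM role ARN; both buckets must already have versioning enabled:

```python
import boto3

s3 = boto3.client("s3")

# Versioning must be enabled on BOTH buckets; the role grants S3 permission to replicate.
s3.put_bucket_replication(
    Bucket="source-bucket-123456",  # hypothetical source bucket
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",  # placeholder role ARN
        "Rules": [{
            "ID": "ReplicateEverything",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate all new objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::destination-bucket-123456"},
        }],
    },
)
```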
What are the different Amazon S3 storage classes?
How do S3 storage classes differ in terms of performance, durability, and cost?
In what scenarios would you choose one storage class over another?
Why is understanding S3 storage classes important for optimizing cloud storage costs?
Amazon S3 offers several storage classes, each designed for specific use cases:
1. Standard:
* It provides low-latency and high-throughput performance, suitable for frequently accessed data.
* Use-cases: Big Data Analytics, mobile & gaming applications, content distribution, etc.
2. Intelligent-Tiering:
* This class automatically moves objects between access tiers based on changing access patterns, optimizing costs.
3. Standard-Infrequent Access (Standard-IA):
* It costs less than Standard and suits data that is accessed less frequently but requires rapid access when needed.
* Use-cases: Disaster Recovery, backups.
4. One Zone-Infrequent Access (One Zone-IA):
* It stores data in a single Availability Zone, providing a cost-effective option for infrequently accessed data that can be lost if the AZ is destroyed.
* Use-case: Storing secondary copies of on-premises data or data you can recreate.
5. Glacier:
* Glacier offers very low-cost storage for archival data with longer retrieval times.
6. Glacier Deep Archive:
* This is the lowest-cost storage class, designed for data archival with the longest retrieval times.
S3 storage classes have high durability (99.999999999%, i.e., 11 nines) and high availability (99.99% for Standard).
Understanding S3 storage classes is important because it helps you choose the right magical place for each of your online treasures. Whether it’s frequently accessed stories or precious treasures you only open once in a while, selecting the right storage class ensures you’re optimizing your cloud storage costs.
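For illustration, the storage class is chosen per object at upload time; a hedged sketch with placeholder bucket, key, and file names:

```python
import boto3

s3 = boto3.client("s3")

# Upload directly into an infrequent-access tier; other valid values include
# "STANDARD", "INTELLIGENT_TIERING", "ONEZONE_IA", "GLACIER", "DEEP_ARCHIVE".
s3.put_object(
    Bucket="my-example-bucket-123456",   # hypothetical bucket
    Key="backups/db-dump.sql.gz",        # hypothetical key
    Body=open("db-dump.sql.gz", "rb"),   # local file assumed to exist
    StorageClass="STANDARD_IA",
)
```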
You have a 25 GB file that you’re trying to upload to S3 but you’re getting errors. What is a possible solution for this?
1. The file size limit on S3 is 5GB
2. Update your bucket policy to allow the larger file
3. Use Multi-Part Upload when uploading files larger than 5 GB
4. Encrypt the file.
- Use Multi-Part Upload when uploading files larger than 5 GB.
Multi-Part Upload is recommended as soon as the file is over 100 MB.
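A sketch using boto3's managed transfer, which switches to multi-part upload automatically above a configurable threshold (file and bucket names are placeholders):

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multi-part upload for anything above 100 MB, in 100 MB parts.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
)

s3.upload_file(
    Filename="backup-25gb.bin",          # hypothetical 25 GB local file
    Bucket="my-example-bucket-123456",   # hypothetical bucket
    Key="backups/backup-25gb.bin",
    Config=config,
)
```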
You have updated an S3 bucket policy to allow IAM users to read/write files in the S3 bucket, but one of the users complain that he can’t perform a PutObject API call. What is a possible cause for this?
1. The S3 bucket policy must be wrong
2. The user is lacking permissions
3. The IAM user must have an explicit DENY in the attached IAM policy.
4. You need to contact AWS support to lift this limit.
The IAM user must have an explicit DENY in the attached IAM policy.
A user can be granted or restricted permissions through an IAM policy or a bucket policy, and an explicit DENY in the attached IAM policy overrides any ALLOW from the bucket policy.
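As an illustration of why the call fails despite a permissive bucket policy, here is a hypothetical IAM policy statement with an explicit DENY on s3:PutObject (the bucket name is a placeholder):

```python
# Hypothetical IAM policy attached to the user: the explicit Deny on s3:PutObject
# overrides any Allow coming from the bucket policy.
iam_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyObjectUploads",
        "Effect": "Deny",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::my-example-bucket-123456/*",
    }],
}
```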
Important Facts for AWS
S3 (11)
- Bucket Naming convention (3)
- Object limits (Number & Size) in buckets (2)
- S3 application use-case scenarios (5)
- Bucket names are globally unique
- Bucket names must be 3-63 characters long, all lowercase, with no underscores.
- Bucket names must start with a lowercase letter or a number.
- Bucket names can’t be IP formatted, e.g., 1.1.1.1.
- Buckets - an AWS account has a soft limit of 100 buckets, with an extension possible up to 1,000 buckets per account (hard limit).
- Unlimited objects per bucket, each ranging from 0 bytes to 5 TB.
- Key = Name, Value = Data
- S3 is an object store - not a file service (like EFS) or a block service (like EBS). Hence, you can’t mount an S3 bucket (like an EBS volume) to a resource such as EC2, nor can you attach it like EFS to an AWS service.
- It’s great for large-scale data storage, distribution, or upload (e.g., data lakes).
- It’s great for offloading data onto S3.
- It’s ideal for inputting and outputting unprocessed/processed data to many AWS products/services.
What is S3 Intelligent Tiering?
Explain the concept of S3 Intelligent Tiering.
Understand the storage optimization feature offered by Amazon S3.
Dynamic storage optimization for S3.
Answer: S3 Intelligent Tiering is a storage class in Amazon Simple Storage Service (S3) that automatically optimizes storage costs by moving data between two access tiers, based on access patterns:
- Frequent Access
- Infrequent Access
Real world Use-Case: For a company storing vast amounts of data on S3, Intelligent Tiering can significantly reduce costs by automatically moving data that is accessed less frequently to lower-cost storage tiers, while keeping frequently accessed data readily available.
Suitable Analogy: Think of S3 Intelligent Tiering like a self-organizing filing cabinet in an office. Files that are frequently accessed are kept in the top drawers for easy reach, while files that are rarely accessed are stored in the lower drawers, optimizing space and accessibility.
Clarifier: S3 Intelligent Tiering continuously monitors access patterns and automatically adjusts the storage tiering, eliminating the need for manual intervention and ensuring cost-efficient storage.
Dynamic storage optimization for S3.
What are the differences between S3 Glacier and S3 Glacier Deep Archive?
Explain the distinctions between S3 Glacier and S3 Glacier Deep Archive.
Explain varying storage and retrieval characteristics of the S3 classes.
Different storage classes for archiving data on Amazon S3.
Answer: S3 Glacier and S3 Glacier Deep Archive are both storage classes offered by Amazon S3 for long-term data archival, but they differ in terms of retrieval times, costs, and use cases.
Real world Use-Case: S3 Glacier is suitable for data that may need to be retrieved infrequently and within a few minutes to hours, while S3 Glacier Deep Archive is designed for data that is accessed very rarely and with longer retrieval times, typically ranging from 12 to 48 hours.
Suitable Analogy: Comparing S3 Glacier to a storage unit that requires some time to retrieve items stored in the back, while S3 Glacier Deep Archive is akin to storing items in a basement that takes longer to access and retrieve due to the deeper storage.
Clarifier: S3 Glacier offers faster retrieval times and slightly higher costs compared to S3 Glacier Deep Archive, making it more suitable for data that requires occasional access. On the other hand, S3 Glacier Deep Archive provides the lowest storage costs among Amazon S3 storage classes but with the longest retrieval times.
Different storage classes for archiving data on Amazon S3.
What is S3 Glacier Flexible Retrieval?
How does S3 Glacier Flexible Retrieval differ from other retrieval options in Amazon S3 Glacier?
Understand S3 Glacier Flexible Retrieval and other retrieval strategies.
S3 Glacier Flexible Retrieval offers expedited, standard, and bulk retrieval options with variable retrieval rates.
Answer: S3 Glacier Flexible Retrieval is a retrieval option within Amazon S3 Glacier that allows users to retrieve archived data with varying speed and cost options.
It provides:
- expedited,
- standard,
- bulk retrieval options.
Each with different retrieval rates and corresponding costs.
Real-world Use-Case: Imagine a company needing to retrieve critical data quickly for an urgent analysis while opting for a less urgent retrieval for historical records.
S3 Glacier Flexible Retrieval offers the flexibility to balance speed and cost according to specific needs.
Suitable Analogy: Think of S3 Glacier Flexible Retrieval like selecting different shipping options for packages—expedited for urgent deliveries, standard for regular shipments, and bulk for large-scale deliveries.
S3 Glacier Flexible Retrieval provides:
- expedited,
- standard,
- bulk retrieval options.
Enabling users to balance retrieval speed and cost effectively.
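A hedged sketch of initiating a restore for an archived object, choosing one of the retrieval tiers (bucket and key are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Start a restore job for an object stored in a Glacier storage class.
# Tier can be "Expedited", "Standard", or "Bulk" (faster = more expensive).
s3.restore_object(
    Bucket="my-example-bucket-123456",       # hypothetical bucket
    Key="archives/2019/records.zip",         # hypothetical archived object
    RestoreRequest={
        "Days": 7,                           # how long the restored copy stays available
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)
```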
What is S3 Lifecycle Configuration?
Understand the key components and benefits of using S3 Lifecycle Configuration.
How does S3 Lifecycle Configuration help in managing objects stored in Amazon S3?
S3 Lifecycle Configuration automates the process of managing object storage based on predefined rules.
Answer: S3 lifecycle configuration is a feature in Amazon Simple Storage Service (S3) that automates the transition of objects between different storage tiers or the deletion of objects based on predefined rules.
It enables users to define rules to automatically migrate objects to lower-cost storage classes or delete them when they’re no longer needed.
This helps optimize storage costs and performance by efficiently managing data throughout its lifecycle.
Real-world Use-Case: A company might use lifecycle policies to move older data to cheaper storage tiers, reducing storage costs without manual intervention.
Suitable Analogy: Think of S3 lifecycle configuration as a filing system where files are automatically moved to archive cabinets when they’re no longer actively used.
Lifecycle policies can be based on factors such as object age, and can be scoped by prefix or tags.
S3 lifecycle configuration enhances storage efficiency and cost-effectiveness.
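For illustration, a minimal lifecycle rule sketch (bucket name and prefix are placeholders) that transitions objects to cheaper tiers as they age and eventually expires them:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket-123456",      # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # only applies to this hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # after 30 days
                {"Days": 90, "StorageClass": "GLACIER"},      # after 90 days
            ],
            "Expiration": {"Days": 365},                      # delete after a year
        }],
    },
)
```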