Amazon S3 Flashcards

1
Q

Amazon S3 use cases

List a few use cases of Amazon S3

A
  1. Backup & Storage
  2. Disaster Recovery
  3. Archive
  4. Hybrid Cloud Storage
  5. Application hosting
  6. Media hosting
  7. Data lakes & big data analytics
  8. Software delivery
  9. Static website hosting

Sysco runs analytics on its data for insights

Nasdaq uses S3 Glacier for archival storage

2
Q

What are Amazon S3 buckets?

How do Amazon S3 buckets function in the context of cloud storage?

What types of data are commonly stored in Amazon S3 buckets?

Why are Amazon S3 buckets a fundamental component in cloud-based data storage?

A

Amazon S3 (Simple Storage Service) buckets are containers for storing and organizing data in the cloud.
These buckets act as virtual containers that hold various types of data, including images, videos, documents, and other files. Buckets are created at the region level, even though S3 is accessible globally.
Each bucket is identified by a globally unique name and can be configured with specific access controls and storage settings.
- Bucket names must be globally unique (across all regions and all accounts).
- Buckets are defined at the region level.
- S3 looks like a global service, but buckets are created in a specific region (see the sketch below).
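
As an illustration, here is a minimal boto3 sketch (bucket name and region are hypothetical) of creating a bucket in a specific region:

```python
import boto3

s3 = boto3.client("s3")

# The region is fixed at creation time, even though the bucket name
# lives in a single global namespace.
s3.create_bucket(
    Bucket="my-globally-unique-bucket-name",
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
# Note: for us-east-1 (the default region), omit CreateBucketConfiguration.
```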

Amazon S3 buckets are crucial in cloud-based data storage because they provide a secure and scalable way to store various types of data. Whether it’s photos, videos, or important documents, these buckets ensure that your data is accessible, organized, and protected in the cloud.

3
Q

What are Amazon S3 objects?

How do Amazon S3 objects function within the context of cloud storage?

What distinguishes Amazon S3 objects from S3 buckets?

Why are Amazon S3 objects an essential element in cloud-based data storage?

A

Amazon S3 objects are the individual pieces of data stored within an S3 bucket.
Each object consists of data (such as a file or document) and metadata (information about the data).

  • Each object has a key, which is unique within the bucket.
  • The key is the full path to the object.
  • There is no real concept of directories within buckets, only keys that contain “/”.
  • Keys are made up of a prefix + object name, separated by “/” (see the sketch below).
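
As a minimal boto3 sketch (bucket and key names are hypothetical), the key “images/2024/photo.jpg” is one flat string whose prefix is “images/2024/” and whose object name is “photo.jpg”:

```python
import boto3

s3 = boto3.client("s3")

# S3 has no real folders: the whole key is stored as a single flat string.
s3.put_object(
    Bucket="my-example-bucket",
    Key="images/2024/photo.jpg",  # prefix "images/2024/" + name "photo.jpg"
    Body=b"...file bytes...",
)

# Listing by prefix is what makes keys look like directories.
resp = s3.list_objects_v2(Bucket="my-example-bucket", Prefix="images/2024/")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```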

Amazon S3 objects play a vital role in cloud-based data storage as they represent the individual pieces of data stored in S3 buckets.
Whether it’s a photo, a video, or any other kind of file, these objects make sure your data is organized, accessible, and secure in the cloud.

4
Q

How does security access work in Amazon S3?

What measures are in place to secure access to data stored in Amazon S3?

How does S3 provide control over who can access & manipulate stored data?

Why is understanding Amazon S3 security access crucial for protecting sensitive information in the cloud?

A

Security access in Amazon S3 is managed through a combination of mechanisms:

IAM Policies [user-based]: IAM policies define permissions for users, roles, or groups, specifying what actions they can perform on S3 resources.

Bucket Policies [resource-based]: S3 bucket policies allow fine-grained control over access to the entire bucket, including rules based on IP addresses, conditions, or other factors.

Access Control Lists (ACLs) [resource-based]: ACLs control access at the individual object level, specifying which AWS accounts or groups of users have permission to access specific objects.

Encryption: S3 supports server-side encryption to protect data at rest.

Note: An IAM principal can access an S3 object if the user's IAM permissions ALLOW it OR the resource policy ALLOWS it, AND there is no explicit DENY.

Understanding Amazon S3 security access is crucial because it ensures that only authorized users can access and manipulate data stored in S3. By using IAM policies, bucket policies, ACLs, and encryption, sensitive information is protected from unauthorized access, providing a secure environment for cloud-based data storage.
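
As an illustration, here is a minimal boto3 sketch (bucket name and policy are hypothetical) attaching a resource-based bucket policy that allows public reads:

```python
import json

import boto3

s3 = boto3.client("s3")

# Hypothetical resource-based policy: anyone may GET objects in this bucket.
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-example-bucket/*",
    }],
}

s3.put_bucket_policy(
    Bucket="my-example-bucket",
    Policy=json.dumps(public_read_policy),
)
```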

5
Q

What is Amazon S3 Static Website Hosting?

How does Amazon S3 facilitate the hosting of static websites?

What distinguishes static websites from dynamic ones?

Why is Amazon S3 Static Website Hosting a popular choice for certain types of web applications?

A

Amazon S3 Static Website Hosting allows users to host static websites (those with fixed content) directly from an S3 bucket.
To enable static website hosting, users configure their S3 bucket, define the main HTML document, and set up optional error documents. Unlike dynamic websites that generate content on-the-fly, static websites present pre-existing content to users.

Static websites are typically composed of HTML, CSS, JavaScript, and media files. Amazon S3 provides a cost-effective and scalable solution for hosting such websites, ensuring reliable performance and ease of setup.
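
A minimal boto3 sketch of that configuration step (bucket and document names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Define the main HTML document and an optional error document.
s3.put_bucket_website(
    Bucket="my-example-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
# The site is then served at a URL of the form
# http://<bucket>.s3-website-<region>.amazonaws.com
```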

If you get a 403 Forbidden error, make sure the bucket policy allows public reads.

Amazon S3 Static Website Hosting is a popular choice for certain types of web applications, especially those with fixed content, such as personal blogs, portfolios, or informational websites.
It provides a simple and cost-effective way to host static content, ensuring reliable performance and scalability for users who don’t require dynamic, database-driven websites.

6
Q

What is Amazon S3 versioning?

How does Amazon S3 versioning work in the context of cloud storage?

What benefits does versioning offer in terms of data management and recovery?

Why is understanding Amazon S3 versioning important for ensuring data integrity and protection?

A

Amazon S3 versioning is a feature that allows users to keep multiple versions of an object (file) in the same S3 bucket. When versioning is enabled, each time an object is overwritten or deleted, a new version is created.
Users can then retrieve and restore previous versions of objects, providing a history of changes.
Benefits of versioning include:
1. Data Protection: It protects against accidental deletion or overwrites by maintaining a history of changes.
2. Recovery: In case of data corruption or unintended changes, users can easily revert to a previous version.
3. Audit Trail: Versioning creates an audit trail, helping track modifications and who made them.

  • Versioning is enabled at the bucket level (see the sketch below).
  • Easy rollbacks to a previous version.
    Note:
  • Any file that is not versioned prior to enabling versioning will have version “null”.
  • Suspending versioning doesn’t delete the previous versions.
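
A minimal boto3 sketch (bucket and key names are hypothetical) that enables versioning and lists the versions of one object:

```python
import boto3

s3 = boto3.client("s3")

# Versioning is a bucket-level setting.
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},  # or "Suspended"
)

# Each overwrite now creates a new version instead of replacing the object.
resp = s3.list_object_versions(Bucket="my-example-bucket", Prefix="report.csv")
for version in resp.get("Versions", []):
    print(version["VersionId"], version["IsLatest"])
```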

Understanding Amazon S3 versioning is crucial because it ensures that your online treasures are protected from accidental changes or deletions. It’s like having a magical safety net that keeps a history of all the changes, allowing you to go back in time and recover any version of your data.

7
Q

What is Amazon S3 Replication, specifically CRR and SRR?

How do Cross-Region Replication (CRR) and Same-Region Replication (SRR) work in Amazon S3?

Why would you choose CRR, and when would SRR be more appropriate?

Why is Amazon S3 Replication important for enhancing data durability and availability?

A

Amazon S3 Replication automatically copies objects from a source bucket to a destination bucket. This replication can occur within the same AWS region (SRR) or across different AWS regions (CRR).

Cross-Region Replication (CRR): This involves replicating objects from one S3 bucket in one region to another S3 bucket in a different region. CRR helps enhance data durability and availability by providing redundancy in geographically distant locations.

Same-Region Replication (SRR): SRR replicates objects from one S3 bucket to another within the same AWS region. While it may not provide geographical redundancy, SRR is useful for maintaining copies of data within the same region for compliance or business continuity purposes.

  • Source and destination buckets can be in different AWS accounts.
  • Copying is asynchronous.
  • Requires proper IAM permissions for S3 (see the sketch below).

USE-CASES:
- CRR: compliance, lower-latency access, replication across accounts.
- SRR: log aggregation, live replication between production and test accounts.

NOTE:
- After you enable replication, only new objects are replicated.
- You can still replicate existing objects using S3 Batch Replication.
- For DELETE operations, you can replicate delete markers from source to target; deletions with a version ID are not replicated, to avoid malicious deletes.
- There is no chaining of replication, i.e., if bucket 1 replicates into bucket 2, which replicates into bucket 3, then objects created in bucket 1 aren't replicated to bucket 3.
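
A minimal boto3 CRR sketch (bucket names, account ID, and role ARN are hypothetical; versioning must already be enabled on both buckets):

```python
import boto3

s3 = boto3.client("s3")

# The role grants S3 permission to read from the source bucket
# and replicate into the destination bucket.
s3.put_bucket_replication(
    Bucket="source-bucket-us-east-1",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "ID": "crr-all-objects",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate everything
            "DeleteMarkerReplication": {"Status": "Enabled"},
            "Destination": {"Bucket": "arn:aws:s3:::dest-bucket-eu-west-1"},
        }],
    },
)
```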

Amazon S3 Replication is important because it provides an extra layer of protection for your data. Whether you want copies within the same region for quick access or copies in a different region for added security, replication ensures that your online treasures are resilient and available, even in the face of unexpected events.

8
Q

What are the different Amazon S3 storage classes?

How do S3 storage classes differ in terms of performance, durability, and cost?

In what scenarios would you choose one storage class over another?

Why is understanding S3 storage classes important for optimizing cloud storage costs?

A

Amazon S3 offers several storage classes, each designed for specific use cases:
1. Standard:
* It provides low-latency and high-throughput performance, suitable for frequently accessed data.
* Use-cases: big data analytics, mobile & gaming applications, content distribution…
2. Standard-Infrequent Access (Standard-IA):
* Lower-cost storage for data that is accessed less frequently but requires rapid access when needed.
* Use-cases: disaster recovery, backups.
3. Intelligent-Tiering:
* This class automatically moves objects between access tiers based on changing access patterns, optimizing costs.
4. One Zone-Infrequent Access (One Zone-IA):
* It stores data in a single Availability Zone, providing a cost-effective option for infrequently accessed data with lower availability.
* Use-case: storing secondary copies of on-premises data or data you can recreate.
5. Glacier:
* Glacier offers very low-cost storage for archival data with longer retrieval times.
6. Glacier Deep Archive:
* This is the lowest-cost storage class, designed for data archival with the longest retrieval times.

S3 storage offers high durability (99.999999999%, i.e., 11 nines) and high availability (99.99% for S3 Standard).
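
The class can also be chosen per object at upload time, as in this minimal boto3 sketch (names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Each object carries its own storage class.
s3.put_object(
    Bucket="my-example-bucket",
    Key="archive/2020-report.pdf",
    Body=b"...",
    StorageClass="GLACIER",  # or STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, DEEP_ARCHIVE
)
```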

Understanding S3 storage classes is important because it helps you choose the right magical place for each of your online treasures. Whether it’s frequently accessed stories or precious treasures you only open once in a while, selecting the right storage class ensures you’re optimizing your cloud storage costs.

9
Q

You have a 25 GB file that you’re trying to upload to S3 but you’re getting errors. What is a possible solution for this?
1. The file size limit on S3 is 5GB
2. Update your bucket policy to allow the larger file
3. Use multi-part upload when uploading files larger than 5 GB
4. Encrypt the file.

A
  3. Use multi-part upload when uploading files larger than 5 GB

Multi-Part Upload is recommended as soon as the file is over 100 MB.
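
A minimal boto3 sketch (file and bucket names are hypothetical) using the high-level transfer API, which switches to multi-part upload past a threshold:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Split uploads into 100 MB parts once the file exceeds 100 MB.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
)

s3.upload_file(
    Filename="/data/backup-25gb.bin",
    Bucket="my-example-bucket",
    Key="backups/backup-25gb.bin",
    Config=config,
)
```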

10
Q

You have updated an S3 bucket policy to allow IAM users to read/write files in the S3 bucket, but one of the users complains that he can’t perform a PutObject API call. What is a possible cause for this?
1. The S3 bucket policy must be wrong
2. The user is lacking permissions
3. The IAM user must have an explicit DENY in the attached IAM policy.
4. You need to contact AWS support to lift this limit.

A

The IAM user must have an explicit DENY in the attached IAM policy.

A user can be granted or restricted permissions through an IAM policy or a bucket policy, and an explicit DENY in the attached IAM policy overrides any ALLOW from the bucket policy.

11
Q

IMP Facts for AWS

S3 (11)
- Bucket Naming convention (3)
- Object limits (Number & Size) in buckets (2)
- S3 application use-case scenarios (5)

A
  1. Bucket names are globally unique.
  2. Bucket names must be 3-63 characters, all lowercase, with no underscores.
  3. Bucket names must start with a lowercase letter or a number.
  4. Bucket names can't be IP-formatted, e.g., 1.1.1.1.
  5. Buckets: an AWS account has a soft limit of 100 buckets, extendable up to 1,000 buckets per account (hard limit).
  6. Unlimited objects per bucket; each object can range from 0 bytes to 5 TB.
  7. Key = name, value = data.
  8. S3 is an object store, not a file service (like EFS) or a block service (like EBS). Hence, you can't mount an S3 bucket (like an EBS volume) to a resource such as EC2, nor can you attach it like EFS to an AWS service.
  9. It's great for large-scale data storage, distribution, or upload (e.g., data lakes).
  10. Great for offloading data onto S3.
  11. It's ideal for inputting and outputting unprocessed/processed data to many AWS products/services.
12
Q

What is S3 Intelligent Tiering?

Explain the concept of S3 Intelligent Tiering.

Understand the storage optimization feature offered by Amazon S3.

Dynamic storage optimization for S3.

A

Answer: S3 Intelligent Tiering is a storage class in Amazon Simple Storage Service (S3) that automatically optimizes storage costs by moving data between two access tiers based on access patterns:
- a frequent access tier
- an infrequent access tier

Real world Use-Case: For a company storing vast amounts of data on S3, Intelligent Tiering can significantly reduce costs by automatically moving data that is accessed less frequently to lower-cost storage tiers, while keeping frequently accessed data readily available.

Suitable Analogy: Think of S3 Intelligent Tiering like a self-organizing filing cabinet in an office. Files that are frequently accessed are kept in the top drawers for easy reach, while files that are rarely accessed are stored in the lower drawers, optimizing space and accessibility.

Clarifier: S3 Intelligent Tiering continuously monitors access patterns and automatically adjusts the storage tiering, eliminating the need for manual intervention and ensuring cost-efficient storage.

Dynamic storage optimization for S3.

13
Q

What are the differences between S3 Glacier and S3 Glacier Deep Archive?

Explain the distinctions between S3 Glacier and S3 Glacier Deep Archive.

Explain the varying storage and retrieval characteristics of these S3 classes.

Different storage classes for archiving data on Amazon S3.

A

Answer: S3 Glacier and S3 Glacier Deep Archive are both storage classes offered by Amazon S3 for long-term data archival, but they differ in terms of retrieval times, costs, and use cases.

Real world Use-Case: S3 Glacier is suitable for data that may need to be retrieved infrequently and within a few minutes to hours, while S3 Glacier Deep Archive is designed for data that is accessed very rarely and with longer retrieval times, typically ranging from 12 to 48 hours.

Suitable Analogy: Comparing S3 Glacier to a storage unit that requires some time to retrieve items stored in the back, while S3 Glacier Deep Archive is akin to storing items in a basement that takes longer to access and retrieve due to the deeper storage.

Clarifier: S3 Glacier offers faster retrieval times and slightly higher costs compared to S3 Glacier Deep Archive, making it more suitable for data that requires occasional access. On the other hand, S3 Glacier Deep Archive provides the lowest storage costs among Amazon S3 storage classes but with the longest retrieval times.

Different storage classes for archiving data on Amazon S3.

14
Q

What is S3 Glacier Flexible Retrieval?

How does S3 Glacier Flexible Retrieval differ from other retrieval options in Amazon S3 Glacier?

Understand S3 Glacier Flexible Retrieval and other retrieval strategies.

S3 Glacier Flexible Retrieval offers expedited, standard, and bulk retrieval options with variable retrieval rates.

A

Answer: S3 Glacier Flexible Retrieval is a retrieval option within Amazon S3 Glacier that allows users to retrieve archived data with varying speed and cost options.
It provides three retrieval options, each with different retrieval rates and corresponding costs:
- Expedited
- Standard
- Bulk
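
A minimal boto3 sketch (bucket and key are hypothetical) initiating a restore with a chosen tier:

```python
import boto3

s3 = boto3.client("s3")

# Ask Glacier to stage a temporary copy of the archived object.
s3.restore_object(
    Bucket="my-example-bucket",
    Key="archive/2019-logs.tar.gz",
    RestoreRequest={
        "Days": 7,  # keep the restored copy available for 7 days
        "GlacierJobParameters": {"Tier": "Standard"},  # or "Expedited" / "Bulk"
    },
)
```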

Real-world Use-Case: Imagine a company needing to retrieve critical data quickly for an urgent analysis while opting for a less urgent retrieval for historical records.
S3 Glacier Flexible Retrieval offers the flexibility to balance speed and cost according to specific needs.

Suitable Analogy: Think of S3 Glacier Flexible Retrieval like selecting different shipping options for packages—expedited for urgent deliveries, standard for regular shipments, and bulk for large-scale deliveries.

S3 Glacier Flexible Retrieval provides Expedited, Standard, and Bulk retrieval options, enabling users to balance retrieval speed and cost effectively.

15
Q

What is S3 Lifecycle Configuration?

Understand the key components and benefits of using S3 Lifecycle Configuration.

How does S3 Lifecycle Configuration help in managing objects stored in Amazon S3?

S3 Lifecycle Configuration automates the process of managing object storage based on predefined rules.

A

Answer: S3 lifecycle configuration is a feature in Amazon Simple Storage Service (S3) that automates the transition of objects between different storage tiers or the deletion of objects based on predefined rules.
It enables users to define rules to automatically migrate objects to lower-cost storage classes or delete them when they’re no longer needed.
This helps optimize storage costs and performance by efficiently managing data throughout its lifecycle.
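
A minimal boto3 sketch of such rules (bucket name, prefix, and timings are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Transition "logs/" objects to cheaper tiers as they age, then expire them.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-then-expire-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }],
    },
)
```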

Real-world Use-Case: A company might use lifecycle policies to move older data to cheaper storage tiers, reducing storage costs without manual intervention.

Suitable Analogy: Think of S3 lifecycle configuration as a filing system where files are automatically moved to archive cabinets when they’re no longer actively used.

Lifecycle rules are driven by object age (or a specific date) and can filter objects by prefix or tags.

S3 lifecycle configuration enhances storage efficiency and cost-effectiveness.

16
Q

How do S3 presigned URLs enhance secure access to objects?

What is the purpose and mechanism behind generating S3 presigned URLs?

S3 presigned URLs grant temporary access to S3 objects.

Presigned URLs enable controlled, time-limited access.

A

Answer: S3 presigned URLs are URLs that grant temporary access to objects in Amazon S3.
They are generated using AWS SDKs or the AWS CLI by users who have permission to access the object.
These URLs are valid for a specified duration and allow access to the object without requiring the requester to have AWS credentials.
Presigned URLs are commonly used for securely sharing S3 objects or enabling temporary access to private content.
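
A minimal boto3 sketch (bucket and key are hypothetical) generating a one-hour download link:

```python
import boto3

s3 = boto3.client("s3")

# The URL carries the permissions of the credentials that signed it.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "my-example-bucket", "Key": "private/video.mp4"},
    ExpiresIn=3600,  # seconds
)
print(url)  # anyone with this link can download the object for one hour
```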

Real-world Use-Case: A media streaming service may generate presigned URLs for authorized users to access private videos for a limited time.

Suitable Analogy: Think of a presigned URL as a time-limited ticket to access a restricted area without needing a permanent pass.

Clarifier: Presigned URLs are typically used for scenarios where direct access to S3 objects is necessary but the objects are not publicly accessible.

S3 presigned URLs provide secure and temporary access to private S3 objects.

17
Q

How do S3 Select and Glacier Select enhance data retrieval?

What are S3 Select and Glacier Select, and how do they optimize data retrieval from object storage?

S3 Select and Glacier Select enable selective retrieval of data from objects stored in Amazon S3 and Glacier, respectively.

Select options streamline data extraction from large datasets.

A

Answer: S3 Select and Glacier Select are features that allow users to retrieve specific data subsets from objects stored in Amazon S3 and Glacier without needing to download the entire object.
They use SQL-like queries to extract only the necessary data, reducing data transfer costs and processing time.
S3 Select is suitable for interactive querying of data stored in S3, while Glacier Select offers similar functionality for data stored in Glacier archives.
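
A minimal boto3 sketch of S3 Select (bucket, key, and column names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Run a SQL filter inside S3; only the matching rows come back.
resp = s3.select_object_content(
    Bucket="my-example-bucket",
    Key="data/sales.csv",
    ExpressionType="SQL",
    Expression="SELECT * FROM s3object s WHERE s.region = 'EU'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode())
```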

Real-world Use-Case: A data analyst might use S3 Select to query a large CSV file stored in S3 and retrieve only relevant rows based on specific criteria.

Suitable Analogy: S3 Select and Glacier Select act like precise search filters, allowing users to extract only the information they need from vast data repositories.

Clarifier: These features significantly reduce data transfer and processing overheads associated with retrieving large datasets from object storage.

Filtering happens at the source (server-side, inside the S3 bucket) before any data is transferred.

S3 Select and Glacier Select revolutionize data extraction from object storage services.

18
Q

How does S3 event notification enhance system integration?

S3 event notifications notify external systems or services about changes to objects stored in S3 buckets.

S3 events facilitate event-driven architecture.

What role do S3 events play in automating workflows and triggering actions?

A

Answer: S3 event notifications are notifications generated by Amazon S3 whenever certain events occur within S3 buckets.
These events include object creation, deletion, or modification.
Users can configure S3 event notifications to trigger actions in response to these events, such as invoking AWS Lambda functions, updating database records, or sending notifications.
S3 event notifications enable automation and integration with external systems based on changes in S3 storage.
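
A minimal boto3 sketch (bucket name, Lambda ARN, and prefix are hypothetical; S3 must separately be granted permission to invoke the function):

```python
import boto3

s3 = boto3.client("s3")

# Invoke the Lambda whenever an object is created under "uploads/".
s3.put_bucket_notification_configuration(
    Bucket="my-example-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": "uploads/"},
            ]}},
        }],
    },
)
```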

Real-world Use-Case: A company might use S3 event notifications to trigger image processing tasks whenever new images are uploaded to an S3 bucket.

Suitable Analogy: S3 events act like alarms that ring whenever something changes in a specific room, prompting appropriate actions.

Clarifier: S3 event notifications can integrate with various AWS services and third-party applications to automate workflows and respond to changes in real-time.

S3 event notifications enable seamless integration and automation of processes involving S3 storage.

19
Q

How do S3 Access logs enhance visibility and security?

What purpose do S3 Access logs serve, and how do they contribute to monitoring and auditing?

S3 Access logs record requests made to S3 buckets, providing insights into access patterns and aiding in security analysis.

Access logs support compliance and troubleshooting efforts.

A

Answer: S3 Access logs are log files automatically generated by Amazon S3 that capture detailed information about requests made to S3 buckets.
These logs include details such as the requester’s IP address, the time of the request, the requested action, and the response status.

S3 Access logs are useful for monitoring access patterns, troubleshooting issues, and performing security analysis.
They also support compliance efforts by providing an audit trail of S3 bucket activity.
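
A minimal boto3 sketch (bucket names are hypothetical) sending one bucket's access logs to another:

```python
import boto3

s3 = boto3.client("s3")

# Record every request against my-example-bucket into my-logging-bucket.
s3.put_bucket_logging(
    Bucket="my-example-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-logging-bucket",
            "TargetPrefix": "access-logs/my-example-bucket/",
        },
    },
)
```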

Real-world Use-Case: A security team might analyze S3 Access logs to identify and block suspicious access attempts or unauthorized access to sensitive data.

Suitable Analogy: S3 Access logs are like security cameras installed around a building, recording every entry and exit for review and analysis.

Clarifier: Configuring S3 Access logging helps organizations track access to their S3 buckets and detect unauthorized or suspicious activity.

S3 Access logs are essential for maintaining visibility and ensuring the security of data stored in S3 buckets.

20
Q

How does S3 Object Lock enhance data protection and compliance?

What is S3 Object Lock, and how does it contribute to ensuring data immutability and compliance?

S3 Object Lock prevents the deletion or modification of objects for a specified retention period, ensuring data integrity and compliance.

Object Lock facilitates data governance and tamper-proof storage.

A

Answer: S3 Object Lock is a feature of Amazon S3 that allows users to enforce retention periods and legal holds on objects stored in S3 buckets.
Once enabled, Object Lock prevents the deletion or modification of objects for the duration of the retention period, regardless of the permissions granted to users.
This helps ensure data immutability, prevent accidental or malicious alterations, and meet regulatory compliance requirements, such as those in financial services or healthcare industries.

Real-world Use-Case: Organizations subject to regulatory compliance, such as HIPAA or GDPR, can use S3 Object Lock to maintain data integrity and meet retention requirements.

Suitable Analogy: S3 Object Lock acts like a digital vault with a time-locked mechanism, ensuring that stored items remain untouched until the lock expires.

Clarifier: Object Lock settings can be set at the object level or the bucket level, allowing for granular control over data retention policies.

  • Legal Hold: ON/OFF, independent of any retention period.
  • Governance mode: retention for a specified period, but privileged users can still change or remove it.
  • Compliance mode: a fixed retention period and configuration that cannot be shortened by anyone (see the sketch below).
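
A minimal boto3 sketch (bucket, key, and date are hypothetical; the bucket must have been created with Object Lock enabled):

```python
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Compliance mode: the object cannot be deleted or overwritten until 2030.
s3.put_object_retention(
    Bucket="my-locked-bucket",
    Key="records/invoice-001.pdf",
    Retention={
        "Mode": "COMPLIANCE",  # or "GOVERNANCE"
        "RetainUntilDate": datetime(2030, 1, 1, tzinfo=timezone.utc),
    },
)

# A legal hold is an independent ON/OFF switch with no expiry date.
s3.put_object_legal_hold(
    Bucket="my-locked-bucket",
    Key="records/invoice-001.pdf",
    LegalHold={"Status": "ON"},
)
```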

S3 Object Lock provides an essential tool for ensuring data integrity, compliance, and long-term preservation of critical information.

21
Q

How does S3 Access Point simplify access control for S3 buckets?

What is the purpose and functionality of S3 Access Points in managing access to S3 buckets?

S3 Access Points provide a simplified way to manage access to shared data stored in S3 buckets by using unique endpoints with specific permissions.

Access Points enhance security and simplify access management.

A

Answer: S3 Access Points are unique hostname endpoints that simplify managing data access to shared S3 buckets.
They allow granular control over access permissions by enabling users to set access policies and permissions at the access point level, rather than at the bucket level.
This facilitates secure and fine-grained access to shared data, simplifies access management, and reduces the risk of misconfigurations or unintended access.
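
A minimal sketch using the boto3 S3 Control API (account ID and names are hypothetical):

```python
import boto3

s3control = boto3.client("s3control")

# One access point per team; each can then carry its own access policy.
s3control.create_access_point(
    AccountId="123456789012",
    Name="analytics-team-ap",
    Bucket="my-shared-bucket",
)
```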

Real-world Use-Case: An organization may create multiple S3 Access Points for different teams or applications, each with tailored access policies to ensure data security and compliance.

Suitable Analogy: S3 Access Points function like checkpoints at different entrances to a secure facility, each with its own access rules and permissions.

Clarifier: An S3 Access Point can be restricted to a VPC for private access within AWS or left internet-facing, and each access point carries its own policy, offering flexibility in access control.

S3 Access Points provide a secure and efficient way to manage access to shared data stored in S3 buckets, improving security posture and access control.