S3 Flashcards
To maintain compliance with HIPAA, all healthcare-related data being stored on Amazon S3 needs to be encrypted at rest. Assuming S3 is being used for storing the data, which of the following are the preferred methods of encryption?
(Choose 2)
- Store the data on encrypted EBS volumes.
- Enable Server Side Encryption on your S3 bucket. S3 automatically applies AES-256 encryption.
- Encrypt the data locally using your own encryption keys and then transfer the encrypted data to S3.
- Store the data in S3 as EBS snapshots
- Enable Server Side Encryption on your S3 bucket. S3 automatically applies AES-256 encryption.
- Encrypt the data locally using your own encryption keys and then transfer the encrypted data to S3.
You could encrypt locally or let S3 SSE handle encryption for you. Local encryption will generally cost more because of the overhead, testing, and key management that are not required if you use the certified S3 SSE offering.
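As a rough sketch of the SSE option, default server-side encryption can be turned on for a bucket with a single boto3 call; the bucket name below is a placeholder:

```python
import boto3

s3 = boto3.client("s3")

# Enable default server-side encryption (SSE-S3, AES-256) on the bucket.
# "example-health-data" is a placeholder bucket name.
s3.put_bucket_encryption(
    Bucket="example-health-data",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    },
)
```

Once this is set, new objects written to the bucket are encrypted at rest automatically without any change to the uploading applications.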
To enable cross-region replication in S3, what is not required?
- Permission on the destination bucket
- Versioning
- Enable S3 Streams
- Enable the cross-region replication option
- Enable S3 Streams
S3 Streams is not a real feature or option. Permissions on the destination bucket, versioning, and the cross-region replication configuration must all be in place for cross-region replication to function.
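For illustration, a minimal replication setup via boto3 might look like the following, assuming placeholder bucket names and an already-created replication IAM role (versioning must be enabled on the destination bucket as well):

```python
import boto3

s3 = boto3.client("s3")

# Versioning is a prerequisite for cross-region replication; it must be
# enabled on BOTH the source and destination buckets.
s3.put_bucket_versioning(
    Bucket="example-source-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Replicate all objects to the destination bucket using a role that grants
# S3 permission to write there (role ARN and bucket names are placeholders).
s3.put_bucket_replication(
    Bucket="example-source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/example-replication-role",
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Prefix": "",
                "Destination": {"Bucket": "arn:aws:s3:::example-destination-bucket"},
            }
        ],
    },
)
```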
Private content exists on S3. You wish to share this content confidentially with others in the company organization as well as some outside contractors. What is the ideal solution to do so?
- Create an IAM policy allowing the necessary access. Create an IAM Group, and add all users into the group and apply the policy
- Create a bucket policy permitting specific IAM users access to the objects
- Create a bucket policy permitting a specific role to access the objects; grant the appropriate users access to the role
- Generate pre-signed URLs for the content to be distributed
- Make the content public, but only share the URL with people who need it
- Generate pre-signed URLs for the content to be distributed
Generating pre-signed URLs will ensure the most flexibility. IAM policies and roles are a nice idea here; however, not everyone will have an IAM user or be federated for access. As such, pre-signed URLs allow us to create a separate URL for each viewer if we’d like. We can also revoke the pre-signed URLs when we wish, and even add an expiration time to each one.
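A minimal sketch of generating a pre-signed URL with boto3, using placeholder bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

# Generate a pre-signed GET URL that expires after one hour.
# Bucket and key names are placeholders.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "example-private-content", "Key": "reports/q3-summary.pdf"},
    ExpiresIn=3600,  # validity period in seconds
)
print(url)  # share this link with internal users or outside contractors
```

The caller needs no IAM user of their own; the URL carries a signature made with the generating principal's credentials and stops working once it expires.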
You wish to identify when an S3 bucket is made public, and automatically remediate this with an automated action that reverts it back to a private bucket. How could one efficiently accomplish this? (Choose 2)
- Use AWS Config Rules to identify the change, and trigger a Lambda function to change the Bucket ACL & policy back to private
- Use CloudTrail logs to identify any change to the bucket, and revert the change with Lambda
- Restrict users from making a bucket public through the use of IAM User policies
- Use Amazon Macie, along with CloudWatch Events to identify the public state, and automate its resolution through Lambda
- Use AWS Config Rules to identify the change, and trigger a Lambda function to change the Bucket ACL & policy back to private
- Use Amazon Macie, along with CloudWatch Events to identify the public state, and automate its resolution through Lambda
AWS Config has a built-in AWS Config Rule to detect public buckets. Because of this, “Use AWS Config Rules to identify the change, and trigger a Lambda function to change the Bucket ACL & policy back to private” is a solid option. “Use Amazon Macie, along with CloudWatch Events to identify the public state, and automate its resolution through Lambda” is also good, as Macie can alert on public bucket exposure and automate its resolution through CloudWatch Events. The reason “Use CloudTrail logs to identify any change to the bucket, and revert the change with Lambda” is not correct is that, although it could work, it would be highly inefficient and prone to missing actions. The CloudTrail log entries would require us to set up automation inspecting every single policy change made to the bucket, which adds overhead and complexity, and catching the relevant changes would require a large amount of logic. Using combinations of Macie, CloudWatch Events, and/or AWS Config Rules, we can simply watch for the EFFECT of a change, vs. identifying the specific calls that may make up the unwanted change. Lastly, “Restrict users from making a bucket public through the use of IAM User policies” is incorrect because, although we do not want people who shouldn’t to create public buckets, some users will most likely have that ability at some point in the company’s history and future. Applying protection at the bucket itself is the angle to take here, in addition to IAM user policies and IAM roles that follow the principle of least privilege.
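As a hedged sketch of the remediation step, a Lambda function could re-apply S3 Block Public Access to the offending bucket. The event field used to find the bucket name is an assumption and depends on how the Config rule or CloudWatch Event is wired up:

```python
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Where the bucket name lives in the event depends on the trigger
    # (Config rule vs. CloudWatch Event); this path is an assumption.
    bucket = event["detail"]["requestParameters"]["bucketName"]

    # Re-apply a full public access block, reverting the bucket to private.
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"remediated": bucket}
```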
Your site uses machine learning algorithms to modify user-uploaded images in interesting ways, generating new images in under a second as a result. Both the original user image and the generated images are currently stored in S3 – but your site is currently growing with 50 GB of new content added per day, driving up your storage costs. Recent usage statistics have shown that both user-uploaded and generated images are heavily accessed in the first 21 days after upload or creation, after which access sharply drops off. After 120 days they are never accessed again. You want to keep the good buzz your site has going and want to ensure that images are there when users need them, but at the same time you want to reduce storage costs to keep your site profitable. Which of the below is the best trade-off of the two?
- Store all images on S3. After 21 days, move both user uploaded and generated images to S3-IA with a lifecycle policy, then after 120 days move them to Glacier for archival purposes
- Store all images on S3-IA in the first 21 days. After 21 days move both user uploaded and generated images to S3-1Z-IA with a lifecycle policy, then after 120 days move them to Glacier for archival purposes
- Store all images on S3 in the first 21 days. After 21 days, move user images to S3-IA and generated images to S3-1Z-IA. Delete all content older than 120 days via lifecycle policy
- Store all images on S3 in the first 21 days. After 21 days, move them both to S3-IA with a lifecycle policy. Create a Lambda function that runs daily and deletes anything older than 120 days
- Store all images on S3 in the first 21 days. After 21 days, move user images to S3-IA and generated images to S3-1Z-IA. Delete all content older than 120 days via lifecycle policy
With a complex scenario like this, it’s a good idea to break it down into components. In the first 21 days, due to the high usage of the images, any storage class that includes retrieval costs will not be suitable – ruling out any IA storage. After 21 days, as usage drops off significantly, IA becomes a viable option. Taking it a step further – since your site generates the images from the user-uploaded image, generated images are easily replaceable if lost, as long as you have the user image. This means a lower-resilience storage class is valid for generated images – S3-1Z-IA. Anything older than 120 days can be deleted, as it is no longer needed.
Your manager has approached you about storing some old media files in AWS. These files need to be stored at the lowest cost possible. It is acceptable to wait for files to become available. Which of the following S3 Storage Tiers is best suited for this request?
- S3 Glacier
- S3 One Zone – Standard-Infrequent Access
- S3 Infrequently Accessed
- S3 Standard
- S3 Glacier
S3 Glacier is a secure, durable, and low-cost storage class for data archiving. You can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low yet suitable for varying needs, S3 Glacier provides three retrieval options that range from a few minutes to hours.
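For reference, a restore from Glacier might be requested as follows; the bucket, key, and chosen retrieval tier are illustrative:

```python
import boto3

s3 = boto3.client("s3")

# Request a temporary restored copy of an archived object from S3 Glacier.
# Retrieval tiers range from "Expedited" (minutes) to "Standard" (hours)
# to "Bulk" (cheapest, slowest). Bucket and key names are placeholders.
s3.restore_object(
    Bucket="example-media-archive",
    Key="old-media/interview-2012.mov",
    RestoreRequest={
        "Days": 7,  # how long the restored copy remains available
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)
```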
What is NOT a chargeable event in S3?
- Transfer from S3 to EC2 in the same region
- Transfer OUT to another Region
- PUT / GET / LIST
- Versioned data
- Transfer from S3 to EC2 in the same region
S3 will not charge for transfer within the same region to or from EC2. All data leaving the S3 region will incur a transfer charge, except when destined for CloudFront. Additionally, web operations such as PUT / GET / LIST incur a separate charge, along with the data storage itself.
When creating a website, and hosting exclusively on S3 while using Route53 to point an Alias to the bucket, what naming conventions must be met?
- Bucket name must be DNS compliant
- Any bucket name can be used for S3 hosting
- Bucket name must match the URL
- Bucket name must not contain periods
- Bucket name must match the URL
When directing a Route 53 alias to the bucket, the bucket name must match the URL; for example, www.mysite.com and mysite.com would be the names of two different buckets (though one bucket can redirect to the other). Though any bucket name can host a website, when using aliases in Route 53 this naming is a requirement.
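A small sketch of enabling website hosting on such a bucket; the bucket name here is illustrative and would match the site’s hostname:

```python
import boto3

s3 = boto3.client("s3")

# Enable static website hosting on a bucket whose name matches the hostname
# that the Route 53 alias will point at ("www.mysite.com" is illustrative).
s3.put_bucket_website(
    Bucket="www.mysite.com",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```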
What is the primary unit of data in S3 called?
- Bucket
- Tag
- Object
- File
- Object
A “file” in S3 is referred to as an object. S3 is an object store, which means it is a key-value store: the key is the “name” of the object, and the value is its contents.
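A tiny illustration of the key-value model, with placeholder bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

# The object key acts as the "name"; the body is the value.
s3.put_object(Bucket="example-bucket", Key="notes/hello.txt", Body=b"hello, S3")

obj = s3.get_object(Bucket="example-bucket", Key="notes/hello.txt")
print(obj["Body"].read())  # b'hello, S3'
```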
You are working on a research project for a healthcare insurer and your first task is to ingest 6 months of trial data collected by about 30 participating physicians around the country. Each data set is about 15 GB in size and contains protected health information. You are proposing to use S3 Transfer Acceleration for the data upload to an S3 bucket but a colleague raises some concerns about that. Which of the following statements are valid?
- It will take a long time because S3 Transfer Acceleration does not support all bucket level features including multipart uploads.
- The name of your bucket used for Transfer Acceleration must be DNS-compliant and must not contain periods (‘.’).
- Most physicians have only about 40 to 50Mbps of available bandwidth. S3 Transfer Acceleration is therefore not a good option.
- Because S3 Transfer Acceleration is not a HIPAA eligible service, you can’t use it to transfer protected health information between the physicians and your Amazon S3 bucket.
- The name of your bucket used for Transfer Acceleration must be DNS-compliant and must not contain periods (‘.’).
S3 Transfer Acceleration supports all bucket-level features, including multipart uploads. AWS has expanded its HIPAA compliance program to include Amazon S3 Transfer Acceleration as a HIPAA eligible service. In general, if there are recurring transfer jobs, more than 25 Mbps of available bandwidth, and the transfer will not take more than a week over the Internet, S3 Transfer Acceleration is an acceptable option.
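A possible setup, assuming a placeholder bucket name and file, would be to enable acceleration on the bucket and then upload through the accelerate endpoint:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket (the name is a placeholder and,
# as the answer notes, must be DNS-compliant with no periods).
s3.put_bucket_accelerate_configuration(
    Bucket="example-trial-data",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload through the accelerate endpoint; upload_file handles multipart
# uploads automatically for large data sets.
accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accel.upload_file("trial-data-site-07.zip", "example-trial-data", "uploads/site-07.zip")
```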
You are uploading multiple files ranging from 10 GB to 20 GB in size to an AWS S3 bucket by using multipart upload from an application on EC2. Once the upload is complete, you would like to notify a group of people who do not have AWS IAM accounts. How can you achieve this? (Choose 2 options)
- Use S3 event notification and configure Lambda function which sends email using AWS SES non-sandbox.
- Use S3 event notification and configure SNS which sends email to subscribed email addresses.
- Write a custom script on your application side to poll S3 bucket for new files and send email through SES non-sandbox.
- Write a custom script on your application side to poll S3 bucket for new files and send email through SES sandbox.
- Use S3 event notification and configure Lambda function which sends email using AWS SES non-sandbox.
- Use S3 event notification and configure SNS which sends email to subscribed email addresses.
Answer: A, B
The Amazon S3 notification feature enables you to receive notifications when certain events happen in your bucket. To enable notifications, you must first add a notification configuration identifying the events you want Amazon S3 to publish, and the destinations where you want Amazon S3 to send the event notifications.
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
AWS Simple Email Service (SES) is a cost-effective email service built on the reliable and scalable infrastructure that Amazon.com developed to serve its own customer base. With Amazon SES, you can send transactional email, marketing messages, or any other type of high-quality content.
To help prevent fraud and abuse, and to help protect your reputation as a sender, we apply certain restrictions to new Amazon SES accounts.
We place all new accounts in the Amazon SES sandbox. While your account is in the sandbox, you can use all of the features of Amazon SES. However, when your account is in the sandbox, we apply the following restrictions to your account:
- You can only send mail to verified email addresses and domains, or to the Amazon SES mailbox simulator.
- You can only send mail from verified email addresses and domains. (Note: this restriction applies even when your account is not in the sandbox.)
- You can send a maximum of 200 messages per 24-hour period.
- You can send a maximum of 1 message per second.
You can request to move out of the sandbox mode when you are ready for production mode.
For more information on how to move out of sandbox mode, refer to the documentation here.
https://docs.aws.amazon.com/ses/latest/DeveloperGuide/request-production-access.html
Option A triggers a Lambda function which uses non-sandbox SES to send email to people who do not have an AWS IAM account and are not verified in AWS SES.
Option B triggers SNS.
The following document describes how to add SNS event notification to a bucket.
https://docs.aws.amazon.com/AmazonS3/latest/dev/ways-to-add-notification-config-to-bucket.html
Options C and D, although they sound feasible, require compute resources to continuously poll the S3 bucket for new files.
We should use AWS-provided features wherever applicable. Custom solutions can be built when AWS-provided features do not meet the requirement.
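As an illustrative sketch of option B, the bucket can be configured to publish to an SNS topic when a multipart upload completes. The bucket name and topic ARN are placeholders, and the topic's access policy must already allow S3 to publish to it, with the recipients' email addresses subscribed:

```python
import boto3

s3 = boto3.client("s3")

# Publish an SNS notification whenever a multipart upload completes.
s3.put_bucket_notification_configuration(
    Bucket="example-upload-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:upload-complete",
                "Events": ["s3:ObjectCreated:CompleteMultipartUpload"],
            }
        ]
    },
)
```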
What service does S3 transfer acceleration utilize for ingesting data?
- WAF
- S3
- CloudFront
- Hadoop
- CloudFront
CloudFront provides the edge locations used to ingest data closer to the user; this allows the data to enter the AWS-optimized network as early as possible in the transfer.
You work for a large insurance company that has issued 10,000 insurance policies. These policies are stored as PDFs. You need these policies to be highly available, and company policy says that the data must be able to survive the simultaneous loss of two facilities. What storage solution should you use?
- Glacier
- S3
- EBS
- A single EC2 instance with an EBS volume provisioned as a secondary volume.
- S3
Your best solution would be to use S3, which redundantly stores multiple copies of your data in multiple facilities and on multiple devices within each facility.
You created a bucket named “myfirstbucket” in US West region. What are valid URLs for accessing the bucket? (Choose 2 options)
- http://myfirstbucket.s3.us-west-1.amazonaws.com
- http://s3.myfirstbucket.us-west-1.amazonaws.com
- http://s3.us-west-1.amazonaws.com/myfirstbucket
- http://s3-us-west-1-amazonaws.com/myfirstbucket
- http://s3.amazonaws.com/myfirstbucket
- http://myfirstbucket.s3.us-west-1.amazonaws.com
- http://s3.us-west-1.amazonaws.com/myfirstbucket
Answer: A, C
Option A matches the virtual-hosted-style URL pattern, so it is correct.
Option B does not match any valid URL pattern, so it is incorrect.
Option C matches the path-style URL pattern, so it is correct.
Option D does not match any valid URL pattern.
Option E matches the path-style URL pattern, but since the bucket is in the us-west-1 region, the endpoint must contain the region. So it is incorrect.
https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro
NOTE: Options C and D are different (dot vs. hyphen).
Option C – http://s3.us-west-1.amazonaws.com/myfirstbucket
Option D – http://s3-us-west-1-amazonaws.com/myfirstbucket
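For what it’s worth, most SDKs let you choose between the two valid styles; a small boto3 sketch:

```python
import boto3
from botocore.config import Config

# Virtual-hosted-style (default):
#   https://myfirstbucket.s3.us-west-1.amazonaws.com/<key>
virtual = boto3.client("s3", region_name="us-west-1")

# Path-style:
#   https://s3.us-west-1.amazonaws.com/myfirstbucket/<key>
path = boto3.client(
    "s3",
    region_name="us-west-1",
    config=Config(s3={"addressing_style": "path"}),
)
```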
What is the durability of S3 – IA?
- 99.9%
- 99.999999999%
- 99%
- 99.99%
- 99.999999999%
S3 Standard – IA is designed for the same 99.999999999% durability as S3 Standard and Amazon Glacier.
You have been tasked with storing some PDFs used a couple of times a month in AWS. These files need to be available within seconds when requested and the company cannot afford for these files to go missing therefore they must survive an outage of an Availability Zone. Which of the following S3 Storage Tiers is best suited for this request?
- S3 Infrequently Accessed
- S3 One Zone – Infrequently Accessed
- Glacier
- S3 Standard
- S3 Infrequently Accessed
S3 Standard-IA is for data that is accessed less frequently, but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee.
What is the name of the services that can automatically transition objects in S3 across storage classes, including moving into archive (Glacier) and even expire (delete) objects per defined rules?
- S3 Object Management
- S3 Lifecycle Management
- S3 Transition Manager
- This cannot be done natively; a script would need to be created and run on a schedule
- S3 Lifecycle Management
S3 Lifecycle Management is designed to offer this functionality. “S3 Object Management” and “S3 Transition Manager” are simply non-existent features, and “This cannot be done natively; a script would need to be created and run on a schedule” is simply wrong, since this can in fact be done natively within S3 via Lifecycle Management.
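A minimal lifecycle configuration sketch, with an assumed bucket name, key prefix, and rule timings:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects to Standard-IA after 30 days, archive them to Glacier
# after 90 days, and expire (delete) them after 365 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-and-expire",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```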
You have an application which writes application logs to version enabled S3 bucket. Each object has multiple versions attached to it. After 60 days, application deletes the objects in S3 through DELETE API on the object. However, in next month’s bill, you see charges for S3 usage on the bucket. What could have caused this?
- DELETE API call on the object only deletes latest version.
- DELETE API call on the object does not delete the actual object, but places delete marker on the object.
- DELETE API call moves the object and its versions to S3 recycle bin from where object can be restored till 30 days.
- DELETE API for all versions of the object in version enabled bucket cannot be done through API. It can be only done by bucket owner through console.
- DELETE API call on the object does not delete the actual object, but places delete marker on the object.
Answer: B
When versioning is enabled, a simple DELETE cannot permanently delete an object. Instead, Amazon S3 inserts a delete marker in the bucket, and that marker becomes the current version of the object with a new ID. When you try to GET an object whose current version is a delete marker, Amazon S3 behaves as though the object has been deleted (even though it has not been erased) and returns a 404 error. To permanently delete versioned objects, you must use DELETE Object versionId, which permanently removes that specific object version.
For information on how to delete versioned objects through API, refer documentation here.
https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html#delete-obj-version-enabled-bucket-rest
Option A is not true. A DELETE call on an object does not delete the latest version unless the call is made with the latest version ID.
Option C is not true. AWS S3 does not have a recycle bin.
Option D is not true. A DELETE call on a versioned object can be made through the API by providing the version ID of the object version to be deleted.
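As a sketch of permanent deletion, every version (and delete marker) can be removed by its version ID; the bucket and key below are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-log-bucket", "app/2023-01-15.log"  # placeholders

# A simple DELETE only adds a delete marker; to actually reclaim storage,
# every version must be deleted explicitly by its version ID.
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for v in versions.get("Versions", []) + versions.get("DeleteMarkers", []):
    s3.delete_object(Bucket=bucket, Key=v["Key"], VersionId=v["VersionId"])
```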
Your current website currently manages its state locally. This state is preventing the ability to scale properly and requires the use of sticky sessions on the load balancers. You wish to change this. What is not an acceptable way to do this?
- Track the state in DynamoDB
- Track the state in Elasticache running Redis
- Track the state in an SQL database
- Track the state using an S3 object that contains customer details
- Track the state using an S3 object that contains customer details
S3 is eventually consistent. Though consistency is in fact quite fast most of the time, there are no guarantees around it, and when managing something like state, the data may change more rapidly than S3 is suited for. Updated/changed data is an entirely new object in the eyes of S3 (either a new version or a complete overwrite), so rapidly changing data is not ideal. In addition, due to eventual consistency, it’s possible a state request could pull old state information (even if stale by just milliseconds or seconds), which could cause serious issues in our application.
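For contrast, a minimal sketch of externalizing session state to DynamoDB, assuming a hypothetical table keyed on session_id:

```python
import boto3

# The table name and attribute layout are assumptions; the table would use
# "session_id" as its partition key.
dynamodb = boto3.resource("dynamodb")
sessions = dynamodb.Table("example-session-store")

def save_state(session_id: str, state: dict) -> None:
    # Each write replaces the session item, so any instance behind the
    # load balancer sees the same state on the next request.
    sessions.put_item(Item={"session_id": session_id, **state})

def load_state(session_id: str) -> dict:
    response = sessions.get_item(Key={"session_id": session_id})
    return response.get("Item", {})
```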
You have chosen to use S3 – OneZone-IA with your cloud application. Which limitations have you considered in doing so?
- 1Zone-IA is available only in the US-STANDARD region.
- 1Zone-IA offers only 99.50% availability. Therefore you have to design your application to re-create any objects that may be temporarily unavailable.
- 1Zone-IA has a 3 – 5 hour data recovery window.
- 1Zone-IA offers only 99.50% durability. Therefore you have to design your application to re-create any objects that may be lost.
- 1Zone-IA requires supplementary Access Control Lists.
- 1Zone-IA offers only 99.50% availability. Therefore you have to design your application to re-create any objects that may be temporarily unavailable.
In exchange for a significant cost savings, 1Zone-IA has the same durability as S3 Standard, but a lower availability SLA.
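A small sketch of writing an object directly to One Zone-IA; the bucket, key, and file name are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Store an easily re-creatable object directly in the One Zone-IA class.
with open("img-1042.png", "rb") as f:
    s3.put_object(
        Bucket="example-derived-data",
        Key="thumbnails/img-1042.png",
        Body=f,
        StorageClass="ONEZONE_IA",
    )
```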