Simple Storage Service (S3) Flashcards
S3 Security
-S3 is private by default => the only identity which has any initial access to an S3 bucket is the account root user of the account which owns/created that bucket. Any other permissions have to be explicitly granted.
There are a few ways this can be done:
-S3 Bucket Policies
–A form of resource policy => A resource policy is just like an identity policy, except it's attached to a resource instead of an identity.
–Provide a resource perspective on permissions
The difference between resource policies and identity policies:
Identity policies = You're controlling what that identity can access. These have one limitation: you can only attach identity policies to identities in your own account.
Resource policies = You’re controlling who can access that resource.
–Resource policies ALLOW/DENY access from the SAME account or DIFFERENT accounts.
Since the policy is attached to the resource, it can reference any other identities inside that policy.
–Resource policies ALLOW/DENY access to Anonymous principals.
Resource policies can be used to open a bucket to the world by referencing all principals, even those not authenticated by AWS.
They have one major difference to identity policies and that’s the presence of an explicit “Principal” component. The principal part of a resource policy defines which principals are affected by the policy.
–Bucket policies can be used to control who can access objects, even allowing conditions which block specific IP addresses.
–There can only be ONE bucket policy on a bucket, but it can have multiple statements.
If an identity inside one AWS account is accessing a bucket in that same account, then the effective access is a combination of ALL of the applicable identity policies plus the resource policy (the bucket policy). For anonymous access, so access by an anonymous principal, only the bucket policy applies.
If you're doing cross-account access, the identity in their account needs to be able to access S3 in general and your bucket, and then your bucket policy needs to allow access from that identity, so from that external account.
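As a rough sketch (the bucket name is a placeholder), this is approximately what attaching a bucket policy with boto3 looks like; the policy below uses an explicit Principal of "*" to grant anonymous read access to every object, the classic public-bucket pattern. Because a bucket can only have one bucket policy, put_bucket_policy replaces any existing policy.

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "examplebucket"  # hypothetical bucket name

# Resource policy: note the explicit "Principal" element, which identity
# policies don't have. "*" means all principals, including anonymous ones.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```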
Access Control Lists (ACLs)
-Are a way to apply security to objects or buckets.
-They’re a sub-resource of that object/bucket
S3 subresources only exist in the context of a specific bucket or object. Subresources provide support for storing and managing bucket configuration information and object-specific information.
Bucket Subresources: S3 Object Lifecycle Management, S3 Bucket Versioning, Static Website Hosting, Bucket Policy, Bucket ACL (access control list), CORS (cross-origin resource sharing), Logging - S3 Access Logs, Tagging, Location, Notification.
Object Subresources: Object ACL
-They’re Legacy (AWS don’t recommend their use and prefer that you use Bucket Policies)
-Inflexible & only allow simple permissions
They can't have conditions like bucket policies, so you're restricted to some very broad permissions.
Which permissions can be controlled using an ACL:
What these five permissions do depends on whether they're applied to a bucket or an object.
- READ
- WRITE
- READ_ACP
- WRITE_ACP
- FULL_CONTROL
It’s significantly less flexible than an identity or a resource policy.
You don’t have the flexibility of being able to have a single ACL that affects a group of objects.
Block Public Access
This feature provides settings for access points, buckets, and accounts to help you manage public access to Amazon S3 resources. By default, new buckets, access points, and objects don’t allow public access. However, users can modify bucket policies, access point policies, or object permissions to allow public access. S3 Block Public Access settings override these policies and permissions so that you can limit public access to these resources.
With S3 Block Public Access, account administrators and bucket owners can easily set up centralized controls to limit public access to their Amazon S3 resources that are enforced regardless of how the resources are created.
When Amazon S3 receives a request to access a bucket or an object, it determines whether the bucket or the bucket owner’s account has a block public access setting applied. If the request was made through an access point, Amazon S3 also checks for block public access settings for the access point. If there is an existing block public access setting that prohibits the requested access, Amazon S3 rejects the request.
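As a minimal sketch (the bucket name is a placeholder), the same Block Public Access settings can be applied to a bucket with boto3:

```python
import boto3

s3 = boto3.client("s3")

# Enable all four Block Public Access settings on a bucket; these override
# any public bucket policy or ACL that would otherwise grant public access.
s3.put_public_access_block(
    Bucket="examplebucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,        # reject new public ACLs
        "IgnorePublicAcls": True,       # ignore existing public ACLs
        "BlockPublicPolicy": True,      # reject new public bucket policies
        "RestrictPublicBuckets": True,  # restrict access granted by public policies
    },
)
```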
EXAM POWER UP
- Identity: Controlling different resources
- Identity: You have a preference for IAM
- Identity: Same account
- Bucket: Just controlling S3
- Bucket: Anonymous or Cross-Account
- ACLs: NEVER - unless you must
S3 Static Website Hosting
-We’ve been accessing S3, via the normal method, which is using the AWS APIs
For instance, to access any objects within S3, we’re using the S3 APIs. Assuming we’re authenticated and authorized, we use the get object API call to access those resources. (Secure and Flexible)
-This feature allows access via standard HTTP - e.g. blogs.
-When you enable it, you have to set an Index document and an Error document.
So in enabling static website hosting on an S3 bucket, we have to point the Index document (usually the entry point of a website) at a specific object in the S3 bucket.
The Error document is the same, but it's used when something goes wrong. So if you access a file which isn't there, or there is another type of server-side error, that's when the Error document is shown.
Both of these need to be HTML documents, because the static website hosting feature, delivers HTML files.
-When you enable it, AWS creates a Website Endpoint.
This is a specific address that the bucket can be accessed from using HTTP. The name of this endpoint is influenced by the bucket name that you choose and the region that it is in.
You can use your own custom domain name for a bucket, but if you want to do that, then your bucket name matters. You can only use your custom domain name if the name of the bucket matches the domain.
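A small sketch of enabling static website hosting (the bucket name and document keys are assumptions):

```python
import boto3

s3 = boto3.client("s3")

# Point the Index document and Error document at objects in the bucket.
s3.put_bucket_website(
    Bucket="examplebucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)

# The website endpoint then takes a region-dependent form such as:
# http://examplebucket.s3-website-us-east-1.amazonaws.com
```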
There are two specific scenarios which are perfect for S3:
-Offloading
If you have a website hosted on a compute service (e.g. EC2), it can benefit from offloading its media storage to Amazon S3.
What we can do is, we can take all of the media that the compute service hosts, and we can move that media to an S3 bucket that uses static website hosting.
Then when the compute service generates the HTML file and delivers this to the customer’s browser, this HTML file points at the media that’s hosted on the S3 bucket. So the media is retrieved from S3, not the compute service.
S3 is likely much cheaper for the storage and delivery of any media versus a compute service. (S3 is designed for the storage of large data at scale)
-Out-of-band pages
An out-of-band page is a method of accessing something outside of the main access path.
So for example, you might use out-of-band server management and this lets you connect to a management card that’s in a server using the cellular network. That way, if the server is having networking issues, with the normal access methods (normal network), then you can still access it.
We can use an S3 static website as an out-of-band maintenance page, for example during unscheduled maintenance periods.
S3 Pricing
-Storage = You pay for storing objects in your S3 buckets. The rate you’re charged depends on your objects’ size, how long you stored the objects during the month, and the storage class
-Request & data retrievals = You pay for requests made against your S3 buckets and objects. S3 request costs are based on the request type, and are charged on the quantity of requests
-Data Transfer
You pay for all bandwidth into and out of Amazon S3, except for the following:
–Data transferred out to the internet for the first 100GB per month, aggregated across all AWS Services and Regions (except China and GovCloud)
–Data transferred in from the internet.
–Data transferred between S3 buckets in the same AWS Region.
–Data transferred from an Amazon S3 bucket to any AWS service(s) within the same AWS Region as the S3 bucket (including to a different account in the same AWS Region).
–Data transferred out to Amazon CloudFront (CloudFront).
Object Versioning
-Is something which is controlled at a bucket level
-It starts off in a disabled state; you can optionally enable versioning on a disabled bucket, but once enabled, you cannot disable it again. What you can do is suspend it, and if desired, a suspended bucket can be re-enabled.
Without versioning enabled on a bucket, each object is identified solely by the object key, its name, which is unique inside the bucket.
If you modify an object, the original version of that object is replaced.
Versioning lets you store multiple versions of an object within a bucket. Any operations which would modify objects generate a new version.
-There's an attribute of every object: its version ID.
When versioning on a bucket is disabled, the ID of the objects in that bucket is set to "null".
If you upload or put a new object into a bucket with versioning enabled, then S3 allocates the ID to that object, for example: id = 11111.
If any modifications are made to this object, S3 will allocate a new ID to the newer version, and it retains the old version. The newest version of any object in a version-enabled bucket is known as the "current version".
You can request an object from S3 and provide the ID of a specific version, to get that particular version back, rather than the current version. (Versions can be individually accessed by specifying the ID, and if you don't specify one then it's assumed that you want the current version)
-Versioning also impacts deletions
If we indicate to S3 that we want to delete the object and we don’t give any specific version ID, then S3 will add a new special version of that object, known as a “delete marker”.
The delete marker is a special version of an object, which hides all previous versions of that object. But you can delete the “delete marker”, which essentially undeletes the object, returning the current version to being active again.
If you want to truly delete the object, you have to specify the particular version ID.
If you truly delete the current version of the object, then the next most recent version becomes the current version.
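A minimal boto3 sketch of the versioning behaviour described above (bucket and key names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "examplebucket", "koala.jpg"

# Enable versioning (it can later be suspended, but never disabled).
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Each put now creates a new version; the latest one is the "current version".
v1 = s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"version one")["VersionId"]
v2 = s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"version two")["VersionId"]

# No VersionId => current version; with VersionId => that specific version.
current = s3.get_object(Bucket=BUCKET, Key=KEY)
older = s3.get_object(Bucket=BUCKET, Key=KEY, VersionId=v1)

# Delete without a VersionId => adds a delete marker (object appears gone).
s3.delete_object(Bucket=BUCKET, Key=KEY)

# Delete WITH a VersionId => truly removes that version (or a delete marker).
s3.delete_object(Bucket=BUCKET, Key=KEY, VersionId=v2)
```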
IMPORTANT POINTS OF S3 VERSIONING
-Cannot be switched off - only suspended
-Space is consumed by ALL versions
-You are billed for ALL versions
-Only way to 0 costs - is to delete the bucket
-Suspending it, doesn’t actually remove any of the old versions, so you’re still billed for them.
MFA Delete
-Enabled in Versioning configuration on a bucket
-When you enable MFA Delete, it means MFA is required to change bucket versioning states.
-MFA is required to delete versions.
How it works is that you provide:
-Serial number (MFA) + Code passed with API CALLS
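A sketch of what this looks like via the API; the MFA serial number and code below are placeholders, and changing MFA Delete generally has to be done by the bucket owner (the root user):

```python
import boto3

s3 = boto3.client("s3")

# MFA parameter = "<mfa device serial/ARN> <current code>" (placeholders below)
s3.put_bucket_versioning(
    Bucket="examplebucket",
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)

# Permanently deleting a specific version then also requires the MFA value.
s3.delete_object(
    Bucket="examplebucket",
    Key="koala.jpg",
    VersionId="EXAMPLE-VERSION-ID",
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
)
```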
S3 Performance Optimization
“It’s often about performance and reliability combined and this is especially relevant, when we’re talking about a distributed organization”
Features which help us in this regard:
- Single PUT Upload
- Multipart Upload
- S3 Accelerated Transfer
Single PUT Upload
We know from the “Animals for Life” scenario, that remote workers need to upload large data sets and do so frequently, and we know that they’re often on unreliable internet connections.
-By default, when you upload an object to S3, it’s uploaded as a single data stream to S3.
A file becomes an object, and it’s uploaded using the PutObject API call and placed in a bucket, and this all happens as a single stream.
-If the stream fails, the upload fails
-Requires a full restart
Any delay can be costly and potentially risky.
-Speed & Reliability of the upload will always be limited, because of this single stream of data
“Single stream transfer can often provide much slower speeds than both ends of that transfer are capable of”
-If you utilize a single PUT upload, then you’re limited to 5GB of data as a maximum. (AWS Limit)
Multipart Upload
-Improves the speed and reliability of uploads to S3, and it does this by breaking data up into individual parts.
-The minimum size for using multipart upload is 100MB.
-The upload can be split into a maximum of 10,000 parts, and each part can range in size from 5MB to 5GB.
-The last part can be smaller than 5MB.
-Each individual part is treated as its own isolated upload - each part can fail in isolation and be restarted in isolation, rather than restarting the whole thing.
-Improves the transfer rate = the overall rate is the combined speed of all the parts.
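A sketch using boto3's high-level transfer manager, which switches to multipart upload automatically once a file crosses the configured threshold (file, bucket and size values are illustrative):

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for anything over 100MB, in 100MB parts; failed parts
# are retried individually rather than restarting the whole upload.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=10,
)

s3.upload_file(
    Filename="animal-footage.mp4",   # hypothetical local file
    Bucket="examplebucket",
    Key="uploads/animal-footage.mp4",
    Config=config,
)
```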
S3 Accelerated Transfer
To understand it, first, it’s required to understand how global transfer works to S3 buckets.
We have no control over the public internet data path, routers and ISPs are picking this path based on what they think is best, and potentially commercially viable. That doesn’t always align with what offers the best performance.
So using the public internet for data transit is never an optimal way to get data from source to destination.
S3 Transfer Acceleration uses the network of AWS Edge Locations, which are located in lots of convenient locations globally. An S3 bucket needs to be enabled for transfer acceleration, the default is that it's switched OFF, and there are some restrictions for enabling it.
-The bucket name cannot contain periods and it needs to be DNS-compatible in its naming
Once enabled, data being uploaded, instead of going to the S3 bucket directly, immediately enters the closest, best-performing AWS Edge Location. This part does occur over the public internet, but geographically, it's really close.
At this point, the Edge Locations transit the data being uploaded over the AWS global network, a network which is directly under the control of AWS, and this tends to be a direct link, between these Edge Locations and other areas of the AWS global network, in this case the S3 bucket.
-The internet is not designed primarily for speed; it's designed for flexibility and resilience.
-The AWS network is purpose-built to link regions to other regions in the AWS network. (It's like an express train, only stopping at the source and destination) (It's much faster, with lower, consistent latency)
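A sketch of enabling transfer acceleration and uploading via the accelerate endpoint (the bucket name is hypothetical and must be DNS-compatible with no periods):

```python
import boto3
from botocore.client import Config

# Enable transfer acceleration on the bucket (it's off by default).
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="examplebucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then use the accelerate endpoint (bucket.s3-accelerate.amazonaws.com)
# so uploads enter the nearest edge location and ride the AWS global network.
s3_accel = boto3.client(
    "s3",
    config=Config(s3={"use_accelerate_endpoint": True}),
)
s3_accel.upload_file("animal-footage.mp4", "examplebucket", "footage.mp4")
```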
Key Management Service (KMS)
AWS KMS is a secure and resilient service that uses hardware security modules that have been validated under FIPS 140-2, or are in the process of being validated, to protect your keys.
-Regional & Public Service
-Capable of multi-region features
-It lets you create, store and manage cryptographic keys (keys which can be used to convert plaintext to ciphertext, and vice versa)
-Capable of handling Symmetric and Asymmetric keys
-Capable of performing cryptographic operations (encrypt, decrypt & …)
-Keys never leave KMS - Provides FIPS 140-2 (Level 2) compliance = a US security standard; it's often a key point of distinction between using KMS versus using something like CloudHSM.
(ASSUME HE IS TALKING ABOUT SYMMETRIC KEYS)
-The main type of key that KMS manages are KMS keys (also referred to as Customer Master Keys (CMK))
These KMS keys are used by KMS within cryptographic operations = You can use them, applications can use them, and other AWS services can use them.
They are logical, think of them as a container for the actual physical key material, and this is the data that really makes up the key.
-A KMS key contains an ID, creation date, key policy, description & state
-Every KMS key is backed by physical key material
It’s this data which is held by KMS and it’s this material, which is actually used to encrypt and decrypt things that you give to KMS.
-The physical key material can be generated by KMS or imported into KMS
This material contained inside a KMS key can be used to directly encrypt or decrypt data up to 4KB in size
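As a minimal sketch (the key alias is an assumption), encrypting and decrypting small data directly with a KMS key looks roughly like this:

```python
import boto3

kms = boto3.client("kms")

# Encrypt up to 4KB of plaintext directly with the KMS key.
ciphertext = kms.encrypt(
    KeyId="alias/my-app-key",            # hypothetical key alias
    Plaintext=b"battle plans for the cats",
)["CiphertextBlob"]

# KMS works out which key to use from the ciphertext itself.
plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
```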
Data Encryption Keys (DEKs)
DEKs are another type of key which KMS can generate.
-They are generated by using a KMS key using the GenerateDataKey operation.
- DEKs can be used to encrypt or decrypt data larger than 4KB
- DEKs are linked to the KMS key which created them. (KMS can tell that a specific data encryption key was created using a specific KMS key)
- KMS doesn’t store the data encryption key in any way.
It provides it to you or the service using KMS and then it discards it.
The reason it discards it is that KMS doesn't actually do the encryption or decryption of data using DEKs; you do, or the service using KMS performs those operations.
When a DEK is generated, KMS provides you with two versions of that DEK:
-Plaintext Version = Something which can be used immediately to perform cryptographic operations
-Ciphertext Version = Can be given back to KMS for it to be decrypted (Encrypted by using the KMS key that generated it)
-Encrypt data using plaintext key.
-Once finished with that process, discard the plaintext version of that DEK.
-Store the encrypted key (ciphertext) alongside the encrypted data.
Decrypting that data is simple, you pass the encrypted data encryption key back to KMS and ask for it to decrypt it, using the same KMS key used to generate it.
Then you use the decrypted data encryption key that KMS gives you back to decrypt the data, and then you discard the decrypted data encryption key.
–S3 generates a DEK for every single object
-KMS doesn't track the usage of DEKs
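A sketch of the DEK workflow above, using the cryptography library for the local encryption step (the key alias and the data are assumptions; this is illustrative envelope encryption, not S3's internal implementation):

```python
import base64
import boto3
from cryptography.fernet import Fernet  # local symmetric encryption

kms = boto3.client("kms")

# 1. Ask KMS for a data encryption key; we get a plaintext and an encrypted copy.
dek = kms.generate_data_key(KeyId="alias/my-app-key", KeySpec="AES_256")

# 2. Encrypt the data locally with the plaintext DEK, then discard the plaintext key.
fernet = Fernet(base64.urlsafe_b64encode(dek["Plaintext"]))
encrypted_data = fernet.encrypt(b"large file contents ...")
del dek["Plaintext"]

# 3. Store the encrypted DEK alongside the encrypted data.
stored = {"key": dek["CiphertextBlob"], "data": encrypted_data}

# Later: ask KMS to decrypt the stored DEK, then decrypt the data locally.
plain_dek = kms.decrypt(CiphertextBlob=stored["key"])["Plaintext"]
original = Fernet(base64.urlsafe_b64encode(plain_dek)).decrypt(stored["data"])
```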
Key Concepts - KMS
-KMS keys are isolated to a region & never leave
-Supports Multi-region keys (if required) (Keys are replicated to other regions)
-Keys are either AWS owned or Customer owned.
AWS owned keys are a collection of KMS keys that an AWS service owns and manages for use in multiple AWS accounts. They operate in the background
-Customer owned keys have two types:
–Customer Managed = Created by the customer to use directly in an application or within an AWS service.
–AWS Managed = Created automatically by AWS, when you use a service such as S3
-Customer Managed keys are more configurable
You can edit the key policy, which means you could allow cross-account access, so that other AWS accounts can use your keys.
-KMS keys support rotation
Rotation is where physical backing material, so the data used to actually do cryptographic operations is changed.
With AWS managed keys, this can’t be disabled. It’s set to rotate approximately once per year.
With Customer managed keys, rotation is optional; it's disabled by default, and if enabled, it happens approximately once every year.
-KMS keys contain the Backing Key (and previous backing keys caused by rotation)
It means that as a key is rotated, data encrypted with old versions can still be decrypted.
-You can create Aliases, which are shortcuts to keys (also per region)
Key Policies and Security
- Key policies (resource)
-Every KMS key has one, and for Customer managed keys, you can change it.
-KMS has to be explicitly told that keys trust the AWS account that they're contained within.
-Key policies (trust the account) & IAM policies (let IAM users interact with the key)
In high security environments, you might want to remove this account trust and insist on any key permissions being added inside the key policy.
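As a sketch, this is roughly what the account-trust statement looks like when creating a customer managed key (the account ID is a placeholder):

```python
import json
import boto3

kms = boto3.client("kms")

# This statement is what "trusting the account" means: the account root is
# allowed everything, which in turn lets IAM policies in that account grant
# key permissions. The account ID below is a placeholder.
key_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "kms:*",
            "Resource": "*",
        }
    ],
}

kms.create_key(
    Description="Customer managed key for the animals4life app",
    Policy=json.dumps(key_policy),
)
```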
S3 Encryption
“Buckets themselves aren’t encrypted, Objects are”
“Each Object inside a Bucket could be using different encryption settings”
There are two main methods of encryption that S3 is capable of supporting:
- Client-side Encryption
The objects being uploaded are encrypted by the client before they ever leave. (This means that the data is ciphertext the entire time)
From AWS’s perspective, the data is received in a scrambled form and stored in a scrambled form. (AWS can’t see the data)
You own and control the keys, the process, and any tooling. So if your organization needs all of these, then you need to utilize client-side encryption.
- Server-side Encryption
Here, even though the data is encrypted in transit using HTTPS, the objects themselves aren't initially encrypted. The data, which is in plaintext form, is still plaintext when it reaches the S3 endpoint. Once the data hits S3, it's encrypted by the S3 infrastructure.
Using this method, you allow S3 to handle some or all of those processes.
Both of these methods use encryption in transit, between the user side and S3 (like an encrypted tunnel)
“Both of these, refer to Encryption at Rest”
The two components of Server-side encryption are:
-Encryption and Decryption process = So taking plaintext, a key and an algorithm, and generating ciphertext, and the reverse.
-The generation and management of the cryptographic keys.
Types of Server-side Encryption
-Server-Side Encryption with Customer-Provided Keys (SSE-C)
The customer is responsible for the encryption keys that are used for encryption and decryption, and the S3 service manages the actual encryption and decryption process.
You are essentially offloading the CPU requirements for this process, but you still need to generate and manage the key or keys that this S3 encryption process will use.
When you put an object (plaintext) into S3, you're required to provide the key. So when this object and the encryption key arrive at the S3 endpoint, the object is encrypted using the key and at the same time, a hash of the key is taken and attached to the object. (the key is discarded)
The hash is one-way, it can't be used to generate a new key, but if a key is provided during decryption, the hash can identify whether the specific key is the one that was used to encrypt that object or not. (Safety feature)
If you want to decrypt the object, you need to provide the same key that was used to encrypt it.
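A sketch of SSE-C with boto3; the key here is generated locally purely for illustration, and you are responsible for storing it safely:

```python
import os
import boto3

s3 = boto3.client("s3")

# Customer-managed 256-bit key; you must keep this safe yourself.
customer_key = os.urandom(32)

# The key travels with the PUT; S3 encrypts the object, keeps a hash of the
# key, and discards the key itself.
s3.put_object(
    Bucket="examplebucket",
    Key="secret-plans.txt",
    Body=b"top secret",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,
)

# The same key must be supplied again to read the object back.
obj = s3.get_object(
    Bucket="examplebucket",
    Key="secret-plans.txt",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,
)
```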
-Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) (AES-256)
AWS manages both encryption and decryption processes, as well as the key generation and management.
With this method, when you put an Object into S3, you just provide the plain text data. S3 creates a root key to use for the encryption process. (handled end to end by AWS)
When an Object is uploaded to S3, using SSE-S3, it’s actually encrypted by a key that’s unique for every single object. (AWS generates this key)
S3 uses that key to encrypt the object, then the root key is used to encrypt that unique key, and the original unencrypted version of that key is discarded. The encrypted Object and the encrypted unique key are stored side by side in S3.
While it involves very little admin overhead, it does present three significant problems:
-If you are in a regulatory environment, where you need to control the keys that are used and control access to those keys, then this isn’t usable.
-If you need to be able to control the rotation of the key.
-If you need role separation = with SSE-S3, a full S3 admin, so somebody who has full S3 permissions to configure the bucket and manage the objects, can also decrypt and view data.
-Server-Side Encryption with KMS KEYS stored in AWS Key Management Service (SSE-KMS)
AWS handles both the keys and the encryption process, but unlike SSE-S3, the root key is handled by a separate service. (KMS)
When you upload an Object and pick SSE-KMS for the first time, S3 liaises with KMS and creates an AWS managed KMS key. This is the default key, which gets used when using SSE-KMS in the future.
Every time an Object is uploaded, S3 uses a dedicated key to encrypt that object, and that key is a data encryption key, which KMS generates using the KMS key. S3 is provided with a plaintext version of the data encryption key as well as an encrypted one. The plaintext one is used to encrypt the object and is then discarded. The encrypted data encryption key is stored along with the encrypted Object.
Every Object, which is uploaded and encrypted with SSE-KMS requires a KMS key. This KMS key is used to generate one unique data encryption key for every object that’s encrypted using SSE-KMS.
You don't have to use the Default KMS key that S3 creates; you can choose to use your own customer managed KMS key. This means you can control the permissions on it and the rotation of the key material.
You can also have logging and auditing on the KMS key itself.
The best benefit provided by SSE-KMS is the role separation = To decrypt any object in a bucket, where those objects have been encrypted with SSE-KMS, you need access to the KMS key that was used to generate the unique key.
The KMS key is used to decrypt the data encryption key for the object and then that decrypted data encryption key, decrypts the object itself. (If you don’t have access to KMS, you can’t access the Object)
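A sketch of choosing SSE-S3 or SSE-KMS per object at upload time (bucket, keys and the KMS key alias are assumptions):

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: S3 owns and manages the keys end to end.
s3.put_object(
    Bucket="examplebucket",
    Key="picture-sse-s3.jpg",
    Body=b"...",
    ServerSideEncryption="AES256",
)

# SSE-KMS: the object's DEK is generated under a KMS key; using a customer
# managed key gives you control of the key policy, rotation and auditing.
s3.put_object(
    Bucket="examplebucket",
    Key="picture-sse-kms.jpg",
    Body=b"...",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-s3-key",   # hypothetical customer managed key
)
```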
Default Bucket Encryption
When you're uploading Objects to S3, you're actually utilizing the "PutObject" operation. As part of this operation, you can specify a header, "x-amz-server-side-encryption". This is how you direct AWS to use server-side encryption.
-If you don’t specify this header, then Objects will not use encryption.
-If you do specify this header: a value of AES256 utilizes SSE-S3, and a value of aws:kms utilizes SSE-KMS.
You can set one of these types as the default for a bucket, so when you don't specify a header, it will still use that type of encryption.
For example: DEFAULT = AES256 (SSE-S3)
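A sketch of setting a bucket's default encryption so uploads without the header still get encrypted (names are assumptions):

```python
import boto3

s3 = boto3.client("s3")

# Default the bucket to SSE-KMS with a specific key; using "AES256" here
# instead would default the bucket to SSE-S3.
s3.put_bucket_encryption(
    Bucket="examplebucket",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/my-s3-key",  # hypothetical alias
                }
            }
        ]
    },
)
```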
S3 Object Storage Classes - S3 Standard
-This is the default storage of S3. So if you don’t specify the storage class, this is what you’re going to use.
-Data is replicated across at least 3 AZs (able to cope with multiple AZ failures)
-Provides eleven nines (99.999999999%) of durability
-Replication uses MD5 checksums together with Cyclic Redundancy Checks (CRCs), to detect and resolve any data issues.
-When objects uploaded to S3 have been stored durably, S3 responds with an HTTP/1.1 200 OK status.
-You are billed a per GB per month fee for data stored.
-A per-GB charge for transfer OUT (IN is free) and a price per 1,000 requests made to the product.
-No specific retrieval fee, no minimum duration, no minimum size.
-Makes data accessible immediately = It has a milliseconds first-byte latency and objects can be made publicly available. (When data is requested it's available within milliseconds)
-S3 Standard should be used for Frequently Accessed Data which is important and non-replaceable.
S3 Object Storage Classes - S3 Standard-IA (Infrequent Access)
-Data is replicated across at least 3 AZs. (able to cope with multiple AZ failures)
-Provides eleven nines (99.999999999%) of durability.
-Its storage cost is roughly half the price of S3 Standard. (Cost-effective)
-You have a per request charge and a data transfer OUT cost. (Same as Standard)
-It has a per GB data retrieval fee (which S3 Standard doesn't have); overall cost increases with frequent data access.
-It has a minimum duration charge of 30 days - Objects can be stored for less, but the minimum billing always applies.
-It has a minimum capacity charge of 128KB per object.
So this class is cost-effective for data, as long as you don't access the data very often, don't need to store it only short term, and don't need to store lots of tiny objects.
-S3 Standard-IA should be used for long-lived data, which is important but where access is infrequent.
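A sketch of selecting a storage class per object; omitting StorageClass gives S3 Standard, and an existing object can be moved to another class by copying it over itself (names are assumptions):

```python
import boto3

s3 = boto3.client("s3")

# Upload directly into Standard-IA (omit StorageClass for S3 Standard).
s3.put_object(
    Bucket="examplebucket",
    Key="archive/old-report.pdf",
    Body=b"...",
    StorageClass="STANDARD_IA",
)

# Change the class of an existing object by copying it onto itself.
s3.copy_object(
    Bucket="examplebucket",
    Key="archive/old-report.pdf",
    CopySource={"Bucket": "examplebucket", "Key": "archive/old-report.pdf"},
    StorageClass="STANDARD_IA",
    MetadataDirective="COPY",
)
```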