2_S3 and Glacier Flashcards
S3 - Exam Tips
- Read the S3 FAQ before taking the exam!
- S3 is object-based, i.e. it lets you upload files
- Files can be from 0 bytes to 5 TB
- There is unlimited storage
- S3 uses a universal namespace, that is, bucket names must be globally unique
- https://s3-us-west-1.amazonaws.com/julienheck
- Not suitable for installing programs or an OS.
- Successful uploads will generate an HTTP 200 status code
- You can turn on MFA Delete
- You can use Multi-Object Delete to delete large numbers of objects
- By default, you can provision up to 100 buckets per AWS account.
- By default, buckets are private and all objects stored inside them are private
S3 - Exam Tips
- Files can be from 0 bytes to 5 GB with a single PUT operation, or up to 5 TB with Multipart upload (which is also faster)
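The size limits above can be turned into a small decision helper. This is only a sketch: the function name and return strings are made up for illustration, and it assumes the limits are counted in binary units (1 GB = 1024^3 bytes).

```python
GB = 1024 ** 3
TB = 1024 ** 4

MAX_SINGLE_PUT = 5 * GB   # largest object a single PUT can carry
MAX_OBJECT_SIZE = 5 * TB  # largest object S3 stores at all


def upload_method(size_bytes: int) -> str:
    """Return which upload path fits an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_SIZE:
        return "too large"
    if size_bytes > MAX_SINGLE_PUT:
        return "multipart"
    return "single PUT"


print(upload_method(200 * 1024 ** 2))  # a 200 MB file fits in a single PUT
print(upload_method(20 * GB))          # a 20 GB file needs multipart upload
```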
S3 - Core fundamentals of an S3 object
- Key (name of the object)
- Value (data)
- Version ID (for versioning)
- Metadata
- Subresources
  - Access Control Lists
  - Torrent
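One way to picture these parts is as fields of a record. The class below is a hypothetical model for study purposes only, not an AWS SDK type; the field names simply mirror the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class S3Object:
    key: str                           # name of the object
    value: bytes                       # the data itself
    version_id: Optional[str] = None   # only set when versioning is enabled
    metadata: Dict[str, str] = field(default_factory=dict)
    # Subresources hang off the object, e.g. its ACL or torrent information.
    subresources: Dict[str, str] = field(default_factory=dict)


photo = S3Object(
    key="holiday/beach.jpg",
    value=b"\xff\xd8",
    metadata={"Content-Type": "image/jpeg"},
)
photo.subresources["acl"] = "private"
print(photo.key, photo.metadata, photo.subresources)
```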
S3 - Consistency
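- Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs, including overwrites
- Older exam material describes the previous model: read-after-write consistency for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs (changes could take some time to propagate)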
S3 - Storage Classes/Tiers
- S3 Standard: 99.99% availability, 99.999999999% durability, stored redundantly across multiple devices in multiple facilities, and is designed to sustain the loss of 2 facilities concurrently.
- S3 - IA (Infrequently Accessed): For data that is accessed less frequently, but requires rapid access when needed. 99.9% availability. Lower storage fee than S3 Standard, but you are charged a retrieval fee
- S3 One Zone - IA: For when you want a lower-cost option for infrequently accessed data and do not require the resilience of multiple Availability Zones.
- S3 - Intelligent Tiering: Designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.
- S3 Glacier: A secure, durable, and low-cost storage class for data archiving. You can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. Retrieval times are configurable from minutes to hours.
- S3 Glacier Deep Archive: Amazon S3’s lowest-cost storage class, for data where a retrieval time of 12 hours is acceptable.
S3 - Storage Classes/Tiers
- S3 Standard: 99.99% availability, 99.999999999% durability, immediately available, for frequently accessed data
- S3 - IA (Infrequently Accessed): For data that is accessed less frequently, but requires rapid access when needed. 99.9% availability. Lower storage fee than S3 Standard, but you are charged a retrieval fee
- S3 - RRS (Reduced Redundancy Storage): Designed to provide 99.99% availability and 99.99% durability
- Glacier: Archived data, where you can wait 3-5 hours before accessing it
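To get a feel for what these percentages mean, the arithmetic can be worked through directly. The sketch below assumes both figures are measured over one year and uses a made-up fleet of 10 million objects; the function names are illustrative.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Minutes per year a service may be unavailable at this availability."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


def expected_objects_lost_per_year(durability_percent: str, object_count: int) -> Decimal:
    """Average number of objects lost per year at this durability."""
    loss_rate = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_rate * object_count


print(downtime_minutes_per_year("99.99"))                          # about 52.6 minutes
print(expected_objects_lost_per_year("99.999999999", 10_000_000))  # 11 nines
print(expected_objects_lost_per_year("99.99", 10_000_000))         # RRS durability
```

With 10 million objects, eleven nines of durability works out to an average of one lost object every 10,000 years, while 99.99% durability works out to about 1,000 lost objects per year.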
S3 - Pricing Tiers [SAA-C02]
Understand how to get the best value out of S3:
- Prefer S3 - Intelligent Tiering where possible, as it takes advantage of both S3 Standard and S3 - IA.
- Use S3 Standard instead only if you have a very large number of objects, because S3 - Intelligent Tiering charges a monitoring and automation fee per 1,000 objects.
S3 Intelligent Tiering
When the access pattern of a web application that uses S3 buckets is unpredictable, you can use the S3 Intelligent Tiering storage class. It includes two access tiers, frequent access and infrequent access, and moves data between them based on access patterns, which helps save cost. S3 Intelligent Tiering has the same performance as the Standard storage class.
S3 Glacier - Archive Retrieval
- Expedited: Data accessed using Expedited retrievals are typically made available within 1–5 minutes.
- Standard: Standard retrievals allow you to access any of your archives within several hours. Standard retrievals typically complete within 3–5 hours.
- Bulk: Bulk retrievals are S3 Glacier’s lowest-cost retrieval option, which you can use to retrieve large amounts, even petabytes, of data inexpensively in a day. Bulk retrievals typically complete within 5–12 hours.
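Choosing a retrieval option comes down to how long you can wait. The sketch below uses the upper end of each typical range above as the worst case, and assumes the cost order is Bulk (cheapest), then Standard, then Expedited; the function name is hypothetical.

```python
from typing import Optional

# (tier, typical worst-case wait in minutes), ordered cheapest first.
TIERS = [
    ("Bulk", 12 * 60),
    ("Standard", 5 * 60),
    ("Expedited", 5),
]


def cheapest_tier(max_wait_minutes: float) -> Optional[str]:
    """Return the cheapest tier that typically completes within the deadline."""
    for name, worst_case in TIERS:
        if worst_case <= max_wait_minutes:
            return name
    return None  # no tier typically finishes this quickly


print(cheapest_tier(24 * 60))  # a day to spare
print(cheapest_tier(6 * 60))   # needed within six hours
print(cheapest_tier(30))       # needed within half an hour
```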
S3 - Security
- By default, all newly created buckets are PRIVATE
- You can set up access control to your buckets using:
  - IAM policies: You can only grant users within your own AWS account permission to access your Amazon S3 resources
  - Access Control Lists (ACLs): You can only grant other AWS accounts (not specific users) access to your Amazon S3 resources
  - Bucket Policies: Can be used to add or deny permissions across some or all of the objects within a single bucket. Policies can be attached to users, groups, or Amazon S3 buckets, enabling centralized management of permissions. You can grant users within your AWS account or another AWS account access to your Amazon S3 resources
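- S3 buckets can be configured to create access logs, which record all requests made to the bucket. The logs are delivered to another (target) bucket; according to the AWS documentation, the target bucket must be in the same Region and owned by the same account as the source bucket.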
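As an illustration of a bucket policy that grants another AWS account access, the sketch below builds a policy document as JSON. The bucket name and the 12-digit account ID are placeholders, and the helper function is hypothetical; only the policy grammar itself (Version, Statement, Effect, Principal, Action, Resource) is standard.

```python
import json


def cross_account_read_policy(bucket: str, account_id: str) -> str:
    """Build a bucket policy letting another account read every object."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCrossAccountRead",
                "Effect": "Allow",
                # The other account as a whole, not a specific user in it.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(cross_account_read_policy("example-bucket", "111122223333"))
```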
S3 - Security
Customers may use four mechanisms for controlling access to Amazon S3 resources: Identity and Access Management (IAM) policies, bucket policies, Access Control Lists (ACLs) and query string authentication. IAM enables organizations with multiple employees to create and manage multiple users under a single AWS account.
With IAM policies, companies can grant IAM users fine-grained control over their Amazon S3 buckets or objects while also retaining full control over everything the users do. With bucket policies, companies can define rules which apply broadly across all requests to their Amazon S3 resources, such as granting write privileges to a subset of Amazon S3 resources. Customers can also restrict access based on an aspect of the request, such as HTTP referrer and IP address.
With ACLs, customers can grant specific permissions (e.g. READ, WRITE, FULL_CONTROL) to specific users for an individual bucket or object. With query string authentication, customers can create a URL to an Amazon S3 object which is only valid for a limited time.
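Restricting access by an aspect of the request, such as the source IP address, could look roughly like the following bucket policy. It is a sketch: the bucket name and the CIDR range (a documentation-only range) are placeholders and the helper function is hypothetical. It denies every S3 action unless the request comes from the given range.

```python
import json


def ip_restricted_policy(bucket: str, allowed_cidr: str) -> str:
    """Build a bucket policy that denies requests from outside a CIDR range."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRange",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # The condition inspects the caller's IP address.
                "Condition": {"NotIpAddress": {"aws:SourceIp": allowed_cidr}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(ip_restricted_policy("example-bucket", "203.0.113.0/24"))
```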
S3 - Encryption
- In Transit
  - SSL/TLS
- At Rest
  - Server Side Encryption:
    - SSE - S3: S3 Managed Keys
    - SSE - KMS: AWS Key Management Service, managed Keys
    - SSE - C: Server-side encryption with customer provided key
  - Client Side Encryption
S3 - Encryption
SSE-S3 provides an integrated solution where Amazon handles key management and key protection using multiple layers of security. You should choose SSE-S3 if you prefer to have Amazon manage your keys.
SSE-C enables you to leverage Amazon S3 to perform the encryption and decryption of your objects while retaining control of the keys used to encrypt objects. With SSE-C, you don’t need to implement or use a client-side library to perform the encryption and decryption of objects you store in Amazon S3, but you do need to manage the keys that you send to Amazon S3 to encrypt and decrypt objects. Use SSE-C if you want to maintain your own encryption keys, but don’t want to implement or leverage a client-side encryption library.
SSE-KMS enables you to use AWS Key Management Service (AWS KMS) to manage your encryption keys. Using AWS KMS to manage your keys provides several additional benefits. With AWS KMS, there are separate permissions for the use of the master key, providing an additional layer of control as well as protection against unauthorized access to your objects stored in Amazon S3. AWS KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data. Also, AWS KMS provides additional security controls to support customer efforts to comply with PCI-DSS, HIPAA/HITECH, and FedRAMP industry requirements.
Using an encryption client library, such as the Amazon S3 Encryption Client, you retain control of the keys and complete the encryption and decryption of objects client-side using an encryption library of your choice. Some customers prefer full end-to-end control of the encryption and decryption of objects; that way, only encrypted objects are transmitted over the Internet to Amazon S3. Use a client-side library if you want to maintain control of your encryption keys, are able to implement or use a client-side encryption library, and need to have your objects encrypted before they are sent to Amazon S3 for storage.
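At the request level, the three server-side options differ mainly in the headers sent with an upload. The sketch below builds those headers without calling AWS; the helper function names and the KMS key ID are hypothetical, while the header names are the `x-amz-server-side-encryption*` request headers of the S3 REST API.

```python
import base64
import hashlib
import os
from typing import Dict


def sse_s3_headers() -> Dict[str, str]:
    """SSE-S3: Amazon manages the keys, you only ask for AES-256."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id: str) -> Dict[str, str]:
    """SSE-KMS: name the KMS key that should protect the object."""
    return {
        "x-amz-server-side-encryption": "aws:kms",
        "x-amz-server-side-encryption-aws-kms-key-id": kms_key_id,
    }


def sse_c_headers(key: bytes) -> Dict[str, str]:
    """SSE-C: you send your own 256-bit key with every request."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    digest = hashlib.md5(key).digest()  # lets S3 check the key arrived intact
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(digest).decode("ascii"),
    }


print(sse_s3_headers())
print(sse_kms_headers("example-key-id"))
print(sorted(sse_c_headers(os.urandom(32))))
```

With SSE-C the same key has to be supplied again on every GET, which is the key management burden the text above refers to.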
S3 - Version Control
- Stores all versions of an object, including every write, and keeps them even if you delete the object (be careful when versioning large files, since every version is stored)
- Great backup tool
- Once enabled, versioning cannot be disabled, only suspended
- Integrates with Lifecycle rules
- Versioning’s MFA Delete capability, which uses multi-factor authentication, can be used to provide an additional layer of security
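The behaviour of a versioned bucket can be mimicked with a toy in-memory model. This is a simplification for study purposes: real version IDs are opaque strings rather than `v1`, `v2`, and the class and method names here are invented.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = None  # a delete is stored as a version without data


class VersionedBucket:
    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def put(self, key: str, data: bytes) -> str:
        """Every write adds a new version instead of replacing the old one."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key: str) -> str:
        """A plain delete only adds a delete marker on top of the history."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, DELETE_MARKER))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        """Return the latest version, or a specific older one if requested."""
        history = self._versions.get(key, [])
        if version_id is None:
            return history[-1][1] if history else None
        for vid, data in history:
            if vid == version_id:
                return data
        return None


bucket = VersionedBucket()
first = bucket.put("report.txt", b"draft")
second = bucket.put("report.txt", b"final")
bucket.delete("report.txt")
print(bucket.get("report.txt"))         # None: the latest version is a delete marker
print(bucket.get("report.txt", first))  # b'draft': the old version is still stored
```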
S3 - Life Cycle Management
- Automates moving your objects between the different storage tiers.
- Can be used in conjunction with versioning
- Can be applied to current versions and previous versions
- Use Lifecycle policies to expire incomplete multipart uploads: they and their associated storage are removed automatically after a predefined number of days.
S3 - Life Cycle Management
- The following actions can be configured:
  - Transition to the Standard - Infrequent Access storage class (objects must be at least 128 KB, and at least 30 days after the creation date)
  - Archive to the Glacier storage class (30 days after IA, if relevant)
  - Permanently delete
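A lifecycle rule covering these actions might look like the configuration below. It is a sketch: the rule ID, the `logs/` prefix, and the day counts of 60, 365 and 7 are invented for illustration. The structure follows the shape of the S3 lifecycle configuration as accepted by the AWS SDKs.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            # Current versions: IA after 30 days, Glacier 30 days later.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 60, "StorageClass": "GLACIER"},
            ],
            # Permanently delete current versions after a year.
            "Expiration": {"Days": 365},
            # Previous versions can have their own schedule.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
            ],
            # Clean up multipart uploads that were never completed.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```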
S3 - Object Lock and Glacier Vault Lock [SAA-C02]
- Use S3 Object Lock to store objects using a write once, read many (WORM) model
- Object locks can be placed on individual objects or applied across the bucket as a whole
- Object locks come in two modes:
  - Governance mode: users can’t overwrite or delete an object version or alter its lock settings unless they have special permissions.
  - Compliance mode: a protected object version can’t be overwritten or deleted by any user, including the root user in your AWS account.
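The difference between the two modes can be written down as a small decision function. This is a toy model, not an AWS API: the function and parameter names are invented, and `has_bypass_permission` stands in for the special permission mentioned above (in IAM this is `s3:BypassGovernanceRetention`).

```python
def can_delete_version(mode: str, retention_active: bool,
                       has_bypass_permission: bool = False) -> bool:
    """Decide whether a locked object version may be deleted or overwritten."""
    if not retention_active:
        return True  # the retention period is over, the lock no longer applies
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only specially permitted users
    if mode == "COMPLIANCE":
        return False  # nobody, not even the root user
    raise ValueError(f"unknown Object Lock mode: {mode}")


print(can_delete_version("GOVERNANCE", retention_active=True))
print(can_delete_version("GOVERNANCE", retention_active=True, has_bypass_permission=True))
print(can_delete_version("COMPLIANCE", retention_active=True, has_bypass_permission=True))
```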
S3 - Performance [SAA-C02]
Improve performance in S3:
- Prefixes: mybucketname/folder1/subfolder1/myfile.jpg -> /folder1/subfolder1
  - The more prefixes you use, the better performance you are going to get.
  - Achieve a high number of requests: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.
  - Get better performance by spreading your reads across different prefixes.
- SSE-KMS: if you are using SSE-KMS to encrypt your objects in S3, you must keep the KMS limits in mind
  - Uploading/downloading will count towards the KMS quota
  - Currently you cannot request a quota increase for KMS
- Use Multipart uploads to increase performance when uploading files to S3
  - Should be used for any file over 100 MB and must be used for any file over 5 GB
- Use S3 Byte-range fetches to increase performance when downloading files from S3
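Two of these tips are easy to quantify. The sketch below estimates how many prefixes a target read rate needs, using the per-prefix GET/HEAD figure above, and splits an object into the `Range` header values a parallel byte-range download would request. Function names and sizes are illustrative.

```python
import math
from typing import List

GET_PER_SECOND_PER_PREFIX = 5500


def prefixes_needed(target_gets_per_second: int) -> int:
    """Minimum number of prefixes to spread reads across for a target rate."""
    return max(1, math.ceil(target_gets_per_second / GET_PER_SECOND_PER_PREFIX))


def byte_ranges(object_size: int, chunk_size: int) -> List[str]:
    """Range header values covering an object in chunk_size pieces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # ranges are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(prefixes_needed(20_000))
print(byte_ranges(10, 4))
```

Each range can be fetched on its own connection, and a failed range can be retried without downloading the whole object again.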