Storage - S3, EBS, EFS, CloudFront, Storage GW, Snowball, ++ Flashcards
All things storage.
- Use if you need more than 10,000 IOPS
- Can provision up to 20,000 IOPS per volume
- General purpose SSD
- Provisioned IOPS
- Throughput optimized HDD (ST1)
- Cold HDD (SC1)
- Magnetic Standard - Legacy
- Provisioned IOPS
Designed for IO-intensive apps such as large relational or NoSQL databases
a) General purpose SSD
b) Provisioned IOPS
c) Throughput optimized HDD (ST1)
d) Cold HDD (SC1)
e) Magnetic Standard - Legacy
b) Provisioned IOPS
T or F If a spot instance is terminated by EC2, you will not be charged for a partial hour of usage. However, if you terminate the instance yourself, you will be charged for the complete hour in which the instance ran.
True
_____ allows you to create storage volumes and attach them to EC2 instances
EBS - elastic block storage
once attached, you can create a filesystem on top of these volumes, run a database, or use them in any other way you would use a block device.
EBS volumes
____ volumes are placed in a specific availability zone, where they are auto replicated to protect you from the failure of a single component.
EBS volumes
EBS volumes types
- General purpose SSD
- Provisioned IOPS
- Throughput optimized HDD (ST1)
- Cold HDD (SC1)
- Magnetic Standard - Legacy
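To make the volume types concrete, here is a minimal boto3 sketch that provisions and attaches an io1 (Provisioned IOPS) volume; the region, AZ, size, IOPS value, and instance ID are all placeholder assumptions:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a 500GB Provisioned IOPS (io1) volume with 10,000 provisioned IOPS
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # EBS volumes live in a single AZ
    Size=500,                       # size in GB
    VolumeType="io1",               # Provisioned IOPS SSD
    Iops=10000,
)

# Wait until the volume is ready, then attach it to an instance in the same AZ
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
    Device="/dev/sdf",
)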
Ratio of 3 IOPS per GB, with up to 10,000 IOPS and the ability to burst up to 3,000 IOPS for extended periods of time, for volumes at 3,334 GB and above.
a) General purpose SSD
b) Provisioned IOPS
c) Throughput optimized HDD (ST1)
d) Cold HDD (SC1)
e) Magnetic Standard - Legacy
a) General purpose SSD
Lowest cost per GB of all EBS volume types that is bootable. Ideal for workloads where data is accessed infrequently and applications where the lowest storage cost is important.
a) General purpose SSD
b) Provisioned IOPS
c) Throughput optimized HDD (ST1)
d) Cold HDD (SC1)
e) Magnetic Standard - Legacy
e) Magnetic Standard - Legacy
- Big data
- Data warehouses
- Log processing
- Can't be a boot volume
a) General purpose SSD
b) Provisioned IOPS
c) Throughput optimized HDD (ST1)
d) Cold HDD (SC1)
e) Magnetic Standard - Legacy
c) Throughput optimized HDD (ST1)
- Lowest cost storage for infrequently accessed workloads
- file server
- can’t be boot volume
a) General purpose SSD
b) Provisioned IOPS
c) Throughput optimized HDD (ST1)
d) Cold HDD (SC1)
e) Magnetic Standard - Legacy
d) Cold HDD (SC1)
Types of compliance in AWS
- Service Organization Controls (SOC) 1/International Standard on Assurance Engagements (ISAE) 3402, SOC 2, and SOC 3
- Federal Information Security Management Act (FISMA), Department of Defense Information Assurance Certification and Accreditation Process (DIACAP), and Federal Risk and Authorization Management Program (FedRAMP)
- Payment Card Industry Data Security Standard (PCI DSS) Level 1
- International Organization for Standardization (ISO) 9001, ISO 27001, and ISO 27018
What languages does Elastic Beanstalk support?
PHP, Java, Python, Ruby, Node.js, .NET, and Go.
Name some EBS facts
- persistent block-level storage volumes
- each volume is automatically replicated within its Availability Zone
- low-latency performance
How does Storage Gateway work?
It provides low-latency performance by maintaining a cache of frequently accessed data on-premises while securely storing all of your data encrypted in Amazon S3 or Amazon Glacier.
Why use DynamoDB?
A fast and flexible NoSQL database service for all applications that need consistent, single-digit-millisecond latency at any scale. It is a great fit for mobile, web, gaming, ad-tech, Internet of Things, and many other applications.
What is CloudTrail?
A web service that records AWS API calls for an account and delivers log files for audit and review.
Common use cases for S3
- Backup and archive for on-premises or cloud data
- Content, media, and software storage and distribution
- Big data analytics
- Static website hosting
- Cloud-native mobile and Internet application hosting
- Disaster recovery
S3 storage classes
general purpose, infrequent access, and archive.
How does block storage operate?
Block storage operates at a lower level—the raw storage device level—and manages data as a set of numbered, fixed-size blocks.
How does file storage operate?
File storage operates at a higher level—the operating system level—and manages data as a named hierarchy of files and folders.
What protocols do block storage use? SAN - Storage Area Network
iSCSI or Fiber Channel
What protocols does file storage use? NAS - Network Attached Storage
Common Internet File System (CIFS) Network File System (NFS)
What protocol does S3 use?
An Application Programming Interface (API) built on standard HTTP verbs
An S3 ______ contains both data and metadata
object
Objects reside in containers called ______
buckets
How are S3 objects identified?
unique user-specified keys (filename)
Amazon S3 objects are automatically replicated on multiple devices in multiple facilities within a region. T or F?
True
Amazon S3 automatically partitions buckets to support very high request rates and simultaneous access by many clients. T or F?
True
Which storage option provides network-attached shared file storage (NAS storage) using the NFS v4 protocol?
Amazon Elastic File System (Amazon EFS)
Which storage option provides block-level storage for Amazon Elastic Compute Cloud (Amazon EC2) instances?
EBS
Bucket names can contain:
Up to 63 characters: lowercase letters, numbers, hyphens, and periods.
How many buckets can you have per account by default?
100
Best practice
It is a best practice to use bucket names that contain your domain name and conform to the rules for DNS names. This ensures that your bucket names are your own, can be used in all regions, and can host static websites.
What sizes can S3 objects be?
0 bytes to 5TB
How many objects can a single bucket store?
Unlimited
What is included in system metadata?
the date last modified, object size, MD5 digest, and HTTP Content-Type.
When can you create user metadata on an object?
Only at the time the object is created.
An S3 key consists of what?
up to 1024 bytes of Unicode UTF-8 characters, including embedded slashes, backslashes, dots, and dashes.
What is the URL format of S3?
http://mybucket.s3.amazonaws.com/jack.doc http://mybucket.s3.amazonaws.com/fee/fi/fo/fum/jack.doc
Is there a file or folder hierarchy in S3?
There is no actual file and folder hierarchy. A key may contain delimiter characters like slashes or backslashes to help you name and logically organize your Amazon S3 objects, but to Amazon S3 it is simply a long key name in a flat namespace. For convenience, the Amazon S3 console and the Prefix and Delimiter feature allow you to navigate within an Amazon S3 bucket as if there were a folder hierarchy. However, remember that a bucket is a single flat namespace of keys with no structure.
The S3 API includes:
- Create/delete a bucket
- Write an object
- Read an object
- Delete an object
- List keys in a bucket
What type of API does S3 use?
A REST (Representational State Transfer) API. It uses standard HTTP or HTTPS requests to create and delete buckets, list keys, and read and write objects.
How does REST work in S3?
REST maps standard HTTP “verbs” (HTTP methods) to the familiar CRUD (Create, Read, Update, Delete) operations. Create is HTTP PUT (and sometimes POST); read is HTTP GET; delete is HTTP DELETE; and update is HTTP POST (or sometimes PUT).
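To make the verb-to-CRUD mapping concrete, here is a minimal boto3 sketch; the bucket and key names are hypothetical, and each call issues one of the HTTP methods described above:

import boto3

s3 = boto3.client("s3")

# Create (HTTP PUT): write an object
s3.put_object(Bucket="mybucket", Key="fee/fi/fo/fum/jack.doc", Body=b"hello")

# Read (HTTP GET): read the object back
data = s3.get_object(Bucket="mybucket", Key="fee/fi/fo/fum/jack.doc")["Body"].read()

# List keys (HTTP GET on the bucket), optionally narrowed by a prefix
resp = s3.list_objects_v2(Bucket="mybucket", Prefix="fee/")

# Delete (HTTP DELETE): remove the object
s3.delete_object(Bucket="mybucket", Key="fee/fi/fo/fum/jack.doc")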
Best practice
Always use HTTPS for Amazon S3 API requests to ensure that your requests and data are secure.
What are some of the high level interfaces people use to interact with S3 instead of the REST interface itself?
These include the AWS Software Development Kits (SDKs) (wrapper libraries) for iOS, Android, JavaScript, Java, .NET, Node.js, PHP, Python, Ruby, Go, and C++, the AWS Command Line Interface (CLI), and the AWS Management Console.
What does durability mean according to AWS?
Durability addresses the question, “Will my data still be there in the future?”
What does availability mean according to AWS?
Availability addresses the question, “Can I access my data right now?”
How many 9s are Amazon's S3 storage DURABILITY of objects over a given year designed for?
99.999999999% - 11 total 9s. Amazon S3 achieves high durability by automatically storing data redundantly on multiple devices in multiple facilities within a region. It is designed to sustain the concurrent loss of data in two facilities without loss of user data. Amazon S3 provides a highly durable storage infrastructure designed for mission-critical and primary data storage.
How many 9s are Amazon's S3 storage AVAILABILITY of objects over a given year designed for?
99.99% - 4 total 9s
If high durability is not required, what is the best storage to use?
RRS - Reduced Redundancy Storage
What durability does RRS offer?
99.99% with a lower cost of storage
Best Practice
Even though Amazon S3 storage offers very high durability at the infrastructure level, it is still a best practice to protect against user-level accidental deletion or overwriting of data by using additional features such as versioning, cross-region replication, and MFA Delete.
Why is S3 considered an eventually consistent system?
Because your data is automatically replicated across multiple servers and locations within a region, changes to your data may take some time to propagate to all locations. As a result, there are some situations where information that you read immediately after an update may return stale data.
What is meant by an eventually consistent system?
Eventual consistency means that if you PUT new data to an existing key, a subsequent GET might return the old data. Similarly, if you DELETE an object, a subsequent GET for that object might still read the deleted object. In all cases, updates to a single key are atomic—for eventually-consistent reads, you will get the new data or the old data, but never an inconsistent mix of data.
For PUTs to new objects….
Amazon S3 provides read-after-write consistency.
for PUTs to existing objects (object overwrite to an existing key) and for object DELETEs…
Amazon S3 provides eventual consistency.
Types of controls put on S3
coarse-grained access controls (Amazon S3 Access Control Lists [ACLs]), and fine-grained access controls (Amazon S3 bucket policies, AWS Identity and Access Management [IAM] policies, and query-string authentication).
S3 ACLs allow you to grant:
READ, WRITE, or FULL-CONTROL at the object or bucket level. ACLs are a legacy access control mechanism, created before IAM existed. ACLs are best used today for a limited set of use cases, such as enabling bucket logging or making a bucket that hosts a static website be world-readable.
Differences between IAM policies and S3 policies:
S3: They are associated with the bucket resource instead of an IAM principal. They include an explicit reference to the IAM principal in the policy. This principal can be associated with a different AWS account, so Amazon S3 bucket policies allow you to assign cross-account access to Amazon S3 resources.
What does a bucket policy in effect allow you to do in S3?
you can specify who can access the bucket, from where (by Classless Inter-Domain Routing [CIDR] block or IP address), and during what time of day.
Can IAM policies be associated directly with IAM principals?
yes
What do the prefix and delimiter parameters do for S3?
They let you organize, browse, and retrieve the objects within a bucket hierarchically. Typically, you would use a slash (/) or backslash (\) as a delimiter and then use key names with embedded delimiters to emulate a file and folder hierarchy within the flat object key namespace of a bucket.
What are the S3 storage classes?
- Standard
- Intelligent-Tiering (S3 Intelligent-Tiering)
- Standard-Infrequent Access (Standard-IA)
- One Zone-Infrequent Access (S3 One Zone-IA)
- Reduced Redundancy Storage (RRS)
- Amazon Glacier
- Glacier Deep Archive (S3 Glacier Deep Archive)
Amazon S3 Standard (S3 Standard)
S3 Standard offers high durability, availability, and performance object storage for frequently accessed data. Because it delivers low latency and high throughput, S3 Standard is appropriate for a wide variety of use cases, including cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics. S3 Storage Classes can be configured at the object level and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can also use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.
Amazon S3 Standard (S3 Standard) Key features
- Low latency and high throughput performance
- Designed for durability of 99.999999999% of objects across multiple Availability Zones
- Resilient against events that impact an entire Availability Zone
- Designed for 99.99% availability over a given year
- Backed with the Amazon S3 Service Level Agreement for availability
- Supports SSL for data in transit and encryption of data at rest
- S3 Lifecycle management for automatic migration of objects to other S3 Storage Classes
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering)
The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access. For a small monthly monitoring and automation fee per object, Amazon S3 monitors access patterns of the objects in S3 Intelligent-Tiering, and moves the ones that have not been accessed for 30 consecutive days to the infrequent access tier. If an object in the infrequent access tier is accessed, it is automatically moved back to the frequent access tier. There are no retrieval fees when using the S3 Intelligent-Tiering storage class, and no additional tiering fees when objects are moved between access tiers. It is the ideal storage class for long-lived data with access patterns that are unknown or unpredictable. S3 Storage Classes can be configured at the object level and a single bucket can contain objects stored in S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can upload objects directly to S3 Intelligent-Tiering, or use S3 Lifecycle policies to transfer objects from S3 Standard and S3 Standard-IA to S3 Intelligent-Tiering. You can also archive objects from S3 Intelligent-Tiering to S3 Glacier.
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering) Key features:
- Same low latency and high throughput performance of S3 Standard
- Small monthly monitoring and auto-tiering fee
- Automatically moves objects between two access tiers based on changing access patterns
- Designed for durability of 99.999999999% of objects across multiple Availability Zones
- Resilient against events that impact an entire Availability Zone
- Designed for 99.9% availability over a given year
- Backed with the Amazon S3 Service Level Agreement for availability
- Supports SSL for data in transit and encryption of data at rest
- S3 Lifecycle management for automatic migration of objects to other S3 Storage Classes
Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
S3 Standard-IA is for data that is accessed less frequently, but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. This combination of low cost and high performance make S3 Standard-IA ideal for long-term storage, backups, and as a data store for disaster recovery files. S3 Storage Classes can be configured at the object level and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can also use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.
Amazon S3 Standard-Infrequent Access (S3 Standard-IA) Key features:
- Same low latency and high throughput performance of S3 Standard
- Designed for durability of 99.999999999% of objects across multiple Availability Zones
- Resilient against events that impact an entire Availability Zone
- Data is resilient in the event of one entire Availability Zone destruction
- Designed for 99.9% availability over a given year
- Backed with the Amazon S3 Service Level Agreement for availability
- Supports SSL for data in transit and encryption of data at rest
- S3 Lifecycle management for automatic migration of objects to other S3 Storage Classes
Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
S3 One Zone-IA is for data that is accessed less frequently, but requires rapid access when needed. Unlike other S3 Storage Classes which store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. S3 One Zone-IA is ideal for customers who want a lower-cost option for infrequently accessed data but do not require the availability and resilience of S3 Standard or S3 Standard-IA. It’s a good choice for storing secondary backup copies of on-premises data or easily re-creatable data. You can also use it as cost-effective storage for data that is replicated from another AWS Region using S3 Cross-Region Replication. S3 One Zone-IA offers the same high durability†, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. S3 Storage Classes can be configured at the object level, and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can also use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.
Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) Key Features:
- Same low latency and high throughput performance of S3 Standard
- Designed for durability of 99.999999999% of objects in a single Availability Zone†
- Designed for 99.5% availability over a given year
- Backed with the Amazon S3 Service Level Agreement for availability
- Supports SSL for data in transit and encryption of data at rest
- S3 Lifecycle management for automatic migration of objects to other S3 Storage Classes
Amazon S3 Glacier (S3 Glacier)
S3 Glacier is a secure, durable, and low-cost storage class for data archiving. You can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low yet suitable for varying needs, S3 Glacier provides three retrieval options that range from a few minutes to hours. You can upload objects directly to S3 Glacier, or use S3 Lifecycle policies to transfer data between any of the S3 Storage Classes for active data (S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA) and S3 Glacier.
Amazon S3 Glacier (S3 Glacier) Key Features:
- Designed for durability of 99.999999999% of objects across multiple Availability Zones
- Data is resilient in the event of one entire Availability Zone destruction
- Supports SSL for data in transit and encryption of data at rest
- Low-cost design is ideal for long-term archive
- Configurable retrieval times, from minutes to hours
- S3 PUT API for direct uploads to S3 Glacier, and S3 Lifecycle management for automatic migration of objects
Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive)
S3 Glacier Deep Archive is Amazon S3’s lowest-cost storage class and supports long-term retention and digital preservation for data that won’t be regularly accessed. It is designed for customers — particularly those in highly-regulated industries, such as the Financial Services, Healthcare, and Public Sectors — that retain data sets for 7-10 years or longer to meet regulatory compliance requirements. S3 Glacier Deep Archive can also be used for backup and disaster recovery use cases, and is a cost-effective and easy-to-manage alternative to magnetic tape systems, whether they are on-premises libraries or off-premises services. S3 Glacier Deep Archive complements Amazon S3 Glacier, which is ideal for more active archives where data is regularly retrieved and needed in minutes. All objects stored in S3 Glacier Deep Archive are replicated and stored across at least three geographically-dispersed Availability Zones, protected by 99.999999999% of durability, and can be restored within 12 hours.
Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive) Key features:
- Designed for durability of 99.999999999% of objects across multiple Availability Zones
- Lowest cost storage class designed for long-term retention of data that will be retained for 7-10 years
- Ideal alternative to magnetic tape libraries
- Retrieval time within 12 hours
- S3 Lifecycle management for automatic migration of objects
Bucket lifecycle
Store backup data initially in Amazon S3 Standard. After 30 days, transition to Amazon S3 Standard-IA. After 90 days, transition to Amazon Glacier. After 3 years, delete.
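That lifecycle can be expressed as a single bucket lifecycle rule. A minimal boto3 sketch, assuming a hypothetical bucket and a backups/ prefix:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="mybucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "backup-tiering",
            "Filter": {"Prefix": "backups/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # 30 days -> Standard-IA
                {"Days": 90, "StorageClass": "GLACIER"},      # 90 days -> Glacier
            ],
            "Expiration": {"Days": 1095},                     # delete after ~3 years
        }]
    },
)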
How do you encrypt S3 data in transit?
Amazon S3 Secure Sockets Layer (SSL) API endpoints. This ensures that all data sent to and from Amazon S3 is encrypted while in transit using the HTTPS protocol.
How do you encrypt data at rest in S3?
Server-Side Encryption (SSE).
All SSE performed by Amazon S3 and AWS Key Management Service (Amazon KMS) uses the 256-bit Advanced Encryption Standard (AES).
SSE-S3 (AWS-Managed Keys)
This is a fully integrated “check-box-style” encryption solution where AWS handles the key management and key protection for Amazon S3. Every object is encrypted with a unique key. The actual object key itself is then further encrypted by a separate master key. A new master key is issued at least monthly, with AWS rotating the keys. Encrypted data, encryption keys, and master keys are all stored separately on secure hosts, further enhancing protection.
SSE-KMS (AWS KMS Keys)
This is a fully integrated solution where Amazon handles your key management and protection for Amazon S3, but where you manage the keys. SSE-KMS offers several additional benefits compared to SSE-S3. Using SSE-KMS, there are separate permissions for using the master key, which provide protection against unauthorized access to your objects stored in Amazon S3 and an additional layer of control. AWS KMS also provides auditing, so you can see who used your key to access which object and when they tried to access this object. AWS KMS also allows you to view any failed attempts to access data from users who did not have permission to decrypt the data.
SSE-C (Customer-Provided Keys)
This is used when you want to maintain your own encryption keys but don’t want to manage or implement your own client-side encryption library. With SSE-C, AWS will do the encryption/decryption of your objects while you maintain full control of the keys used to encrypt/decrypt the objects in Amazon S3.
Client-Side Encryption
Client-side encryption refers to encrypting data on the client side of your application before sending it to Amazon S3.
You have the following two options for using data encryption keys:
Use an AWS KMS-managed customer master key.
Use a client-side master key.
Best Practice
For maximum simplicity and ease of use, use server-side encryption with AWS-managed keys (SSE-S3 or SSE-KMS).
Versioning
Amazon S3 versioning helps protect your data against accidental or malicious deletion by keeping multiple versions of each object in the bucket, identified by a unique version ID. Versioning allows you to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket. If a user makes an accidental change or even maliciously deletes an object in your S3 bucket, you can restore the object to its original state simply by referencing the version ID in addition to the bucket and object key. Versioning is turned on at the bucket level. Once enabled, versioning cannot be removed from a bucket; it can only be suspended.
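Turning versioning on (and later retrieving a specific version) is a bucket-level call. A minimal boto3 sketch; the bucket name, key, and version ID are placeholders:

import boto3

s3 = boto3.client("s3")

# Enable versioning; once enabled it can only be suspended, never removed
s3.put_bucket_versioning(
    Bucket="mybucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Read a specific object version by passing its version ID
obj = s3.get_object(Bucket="mybucket", Key="jack.doc", VersionId="EXAMPLE-VERSION-ID")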
MFA Delete
MFA Delete adds another layer of data protection on top of bucket versioning. MFA Delete requires additional authentication in order to permanently delete an object version or change the versioning state of a bucket. In addition to your normal security credentials, MFA Delete requires an authentication code (a temporary, one-time password) generated by a hardware or virtual Multi-Factor Authentication (MFA) device. Note that MFA Delete can only be enabled by the root account.
Pre-Signed URLs
All Amazon S3 objects by default are private, meaning that only the owner has access. However, the object owner can optionally share objects with others by creating a pre-signed URL, using their own security credentials to grant time-limited permission to download the objects. When you create a pre-signed URL for your object, you must provide your security credentials and specify a bucket name, an object key, the HTTP method (GET to download the object), and an expiration date and time. The pre-signed URLs are valid only for the specified duration. This is particularly useful to protect against “content scraping” of web content such as media files stored in Amazon S3.
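A minimal boto3 sketch of generating a pre-signed URL; the bucket and key are hypothetical:

import boto3

s3 = boto3.client("s3")

# Time-limited GET URL for a private object, signed with the caller's credentials
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "mybucket", "Key": "media/video.mp4"},
    ExpiresIn=3600,  # URL is valid for one hour
)
print(url)  # anyone holding this URL can GET the object until it expires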
Multipart Upload
Better supports uploading or copying of large objects through parallel transfers, the ability to pause and resume, and the ability to upload objects whose size is initially unknown.
Multipart upload is a three-step process: initiation, uploading the parts, and completion (or abort).
Parts can be uploaded independently in arbitrary order, with retransmission if needed. After all of the parts are uploaded, Amazon S3 assembles the parts in order to create an object.
You should use multipart upload for objects larger than 100MB, and you must use multipart upload for objects larger than 5GB.
When using the high-level APIs and the high-level Amazon S3 commands in the AWS CLI (aws s3 cp, aws s3 mv, and aws s3 sync), multipart upload is automatically performed for large objects.
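With the high-level transfer API, multipart upload and parallel part transfers are handled automatically once a file crosses the threshold. A minimal boto3 sketch; the file and bucket names are assumptions:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for anything over 100MB, with 8 parallel part uploads
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # 100MB
    max_concurrency=8,
)
s3.upload_file("backup.tar", "mybucket", "backups/backup.tar", Config=config)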
Best Practice
You can set an object lifecycle policy on a bucket to abort incomplete multipart uploads after a specified number of days. This will minimize the storage costs associated with multipart uploads that were not completed.
Range Gets
It is possible to download (GET) only a portion of an object in both Amazon S3 and Amazon Glacier by using something called a Range GET. Using the Range HTTP header in the GET request or equivalent parameters in one of the SDK wrapper libraries, you specify a range of bytes of the object. This can be useful in dealing with large objects when you have poor connectivity or to download only a known portion of a large Amazon Glacier backup.
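A minimal boto3 sketch of a Range GET; the bucket, key, and byte range are placeholders:

import boto3

s3 = boto3.client("s3")

# Fetch only the first 1MB of a large object (byte offsets are zero-indexed, inclusive)
resp = s3.get_object(Bucket="mybucket", Key="logs/big.log", Range="bytes=0-1048575")
chunk = resp["Body"].read()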
Cross-Region Replication
Cross-region replication is a feature of Amazon S3 that allows you to asynchronously replicate all new objects in the source bucket in one AWS region to a target bucket in another region. Any metadata and ACLs associated with the object are also part of the replication. After you set up cross-region replication on your source bucket, any changes to the data, metadata, or ACLs on an object trigger a new replication to the destination bucket. To enable cross-region replication, versioning must be turned on for both source and destination buckets, and you must use an IAM policy to give Amazon S3 permission to replicate objects on your behalf.
Cross-region replication is used to reduce the latency required to access objects in Amazon S3 by placing objects closer to a set of users, or to meet requirements to store backup data at a certain distance from the original source data.
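A minimal boto3 sketch of enabling cross-region replication, assuming versioning is already enabled on both buckets and that the IAM role ARN (a placeholder here) grants S3 permission to replicate on your behalf:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
        "Rules": [{
            "ID": "replicate-new-objects",
            "Prefix": "",  # empty prefix = replicate all new objects
            "Status": "Enabled",
            "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
        }],
    },
)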
Best Practice
If turned on in an existing bucket, cross-region replication will only replicate new objects. Existing objects will not be replicated and must be copied to the new bucket via a separate command.
Logging
In order to track requests to your Amazon S3 bucket, you can enable Amazon S3 server access logs. Logging is off by default, but it can easily be enabled. When you enable logging for a bucket (the source bucket), you must choose where the logs will be stored (the target bucket). You can store access logs in the same bucket or in a different bucket. Either way, it is optional (but a best practice) to specify a prefix, such as logs/ or yourbucketname/logs/, so that you can more easily identify your logs.
Logs include this information:
- Requestor account and IP address
- Bucket name
- Request time
- Action (GET, PUT, LIST, and so forth)
- Response status or error code
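A minimal boto3 sketch of enabling the server access logging described above; the bucket names are hypothetical, and the target bucket is assumed to already permit log delivery:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_logging(
    Bucket="mybucket",  # the source bucket being tracked
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-log-bucket",   # where the logs are written
            "TargetPrefix": "mybucket/logs/",  # prefix makes logs easy to find
        }
    },
)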
Event Notifications
Event notifications are sent in response to actions taken on objects uploaded or stored in Amazon S3. They enable you to run workflows, send alerts, or perform other actions in response to changes in your objects stored in Amazon S3. You can use Amazon S3 event notifications to set up triggers to perform actions, such as transcoding media files when they are uploaded, processing data files when they become available, and synchronizing Amazon S3 objects with other data stores.
Event Notifications (continued)
Amazon S3 event notifications are set up at the bucket level, and you can configure them through the Amazon S3 console, through the REST API, or by using an AWS SDK. Amazon S3 can publish notifications when new objects are created (by a PUT, POST, COPY, or multipart upload completion), when objects are removed (by a DELETE), or when Amazon S3 detects that an RRS object was lost. You can also set up event notifications based on object name prefixes and suffixes. Notification messages can be sent through either Amazon Simple Notification Service (Amazon SNS) or Amazon Simple Queue Service (Amazon SQS) or delivered directly to AWS Lambda to invoke AWS Lambda functions.
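A minimal boto3 sketch of a bucket-level notification that invokes a Lambda function on uploads; the function ARN is a placeholder, and the function is assumed to already allow invocation by S3:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="mybucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:transcode",
            "Events": ["s3:ObjectCreated:*"],  # PUT, POST, COPY, multipart completion
            # Only fire for uploads ending in .jpg
            "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}},
        }]
    },
)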
Best Practice
If you are using Amazon S3 in a GET-intensive mode, such as a static website hosting, for best performance you should consider using an Amazon CloudFront distribution as a caching layer in front of your Amazon S3 bucket.
Glacier Archives
In Amazon Glacier, data is stored in archives. An archive can contain up to 40TB of data, and you can have an unlimited number of archives. Each archive is assigned a unique archive ID at the time of creation. (Unlike an Amazon S3 object key, you cannot specify a user-friendly archive name.) All archives are automatically encrypted, and archives are immutable—after an archive is created, it cannot be modified.
Glacier Vaults
Vaults are containers for archives. Each AWS account can have up to 1,000 vaults. You can control access to your vaults and the actions allowed using IAM policies or vault access policies.
Vaults Locks
You can easily deploy and enforce compliance controls for individual Amazon Glacier vaults with a vault lock policy. You can specify controls such as Write Once Read Many (WORM) in a vault lock policy and lock the policy from future edits. Once locked, the policy can no longer be changed.
Glacier data retrieval
You can retrieve up to 5% of your data stored in Amazon Glacier for free each month, calculated on a daily prorated basis. If you retrieve more than 5%, you will incur retrieval fees based on your maximum retrieval rate. To eliminate or minimize those fees, you can set a data retrieval policy on a vault to limit your retrievals to the free tier or to a specified data rate.
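For objects archived via the Amazon S3 Glacier storage class (as opposed to the native vault API), retrieval is initiated with a restore request. A minimal boto3 sketch; the names are hypothetical:

import boto3

s3 = boto3.client("s3")

# Stage a GLACIER-class object for retrieval; the restored copy stays for 7 days
s3.restore_object(
    Bucket="mybucket",
    Key="archives/2017-backup.tar",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Standard"},  # Expedited | Standard | Bulk
    },
)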
Amazon Glacier versus Amazon Simple Storage Service (Amazon S3)
Amazon Glacier is similar to Amazon S3, but it differs in several key aspects. Amazon Glacier supports 40TB archives versus 5TB objects in Amazon S3. Archives in Amazon Glacier are identified by system-generated archive IDs, while Amazon S3 lets you use “friendly” key names. Amazon Glacier archives are automatically encrypted, while encryption at rest is optional in Amazon S3. However, by using Amazon Glacier as an Amazon S3 storage class together with object lifecycle policies, you can use the Amazon S3 interface to get most of the benefits of Amazon Glacier without learning a new interface.
____ provides developers and IT teams with secure, durable, and highly-scalable object storage.
S3
____ is easy to use with a simple web services interface to store and retrieve any amount of data from anywhere on the web.
S3
This object based storage is a safe place to store your files. This is because data is spread across multiple devices and facilities.
S3
S3 can be from ____ bytes to ____ TB
0 to 5
How much storage do you get with S3?
- 5TB
- 10TB
- 100TB
- unlimited
Unlimited
S3 files are stored in _______
buckets
S3 is a universal namespace. What is meant by this?
Bucket names must be unique globally (across all AWS accounts).
When you upload a file to S3, you will receive a _______ code if the upload was successful
- HTTP 200 code
- HTTP 300 code
- HTTP 400 code
- HTTP 500 code
HTTP 200 code
Read after write consistency for _____ of new objects
- Gets
- Puts
- Deletes
Puts
S3's data consistency model includes eventual consistency for overwrite ____ and _____ (can take time to propagate).
- Puts and Gets
- Puts and Deletes
- Gets and Deletes
- Post and Gets
Puts and Deletes
T or F
S3 is a simple key value store
True
S3 is object based. Objects consist of the following:
______ - The name of the object
_______ - the data, made up of a sequence of bytes
key : value
______ is important for versioning
version ID
______ data about data
metadata
_________ - bucket specific configuration: bucket policies, access control lists, cross origin resource sharing (CORS), transfer acceleration
subresources
S3 is built for ______ availability for the S3 platform.
- 9.99%
- 99.9%
- 99.99%
- 99.999999999%
99.99%
Amazon guarantee for S3 is ______ availability.
- 9.99%
- 99.9%
- 99.99%
- 99.999999999%
99.9%
Amazon guarantees _______ durability for S3 info
- 9.99%
- 99.9%
- 99.99%
- 99.999999999%
99.999999999%
Which of the following is a feature of S3?
- Tiered storage available
- lifecycle management
- versioning
- encryption
all of the above
How do you secure your data in S3?
_____ and ______
ACL (access control lists) and bucket policies
S3 has 99.99% availability and 99.999999999% durability, stores data redundantly across multiple devices in multiple facilities, and is designed to sustain the loss of 2 facilities concurrently.
- S3
- S3-IA
- S3-One Zone IA
- Reduced Redundancy Storage
- Glacier
S3
For data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3, but you are charged a retrieval fee.
- S3
- S3-IA
- S3-One Zone IA
- Reduced Redundancy Storage
- Glacier
S3-IA
Very cheap. Archive only. Optimized for data that is infrequently accessed; it takes 3-5 hours to restore from Glacier.
- S3
- S3-IA
- S3-One Zone IA
- Reduced Redundancy Storage
- Glacier
Glacier
Same as S3-IA; however, data is stored in a single AZ only. Still 99.999999999% durability, but only 99.5% availability. Cost is 20% less than regular S3-IA.
- S3
- S3-IA
- S3-One Zone IA
- Reduced Redundancy Storage
- Glacier
S3 One Zone IA
Designed to provide 99.99% durability and 99.99% availability of objects over a given year. Used for data that can be recreated if lost (e.g., thumbnails). Considered legacy.
- S3
- S3-IA
- S3-One Zone IA
- Reduced Redundancy Storage
- Glacier
Reduced Redundancy Storage
99.999999999% durability and 99.99% availability
- standard S3
- standard IA
- one zone IA
- Glacier
- Reduced Redundancy
Standard S3
99.999999999% durability and 99.9% availability. Retrieval fee for objects.
- standard S3
- standard IA
- one zone IA
- Glacier
- Reduced Redundancy
standard IA
99.99% durability and 99.99% availability
- standard S3
- standard IA
- one zone IA
- Glacier
- Reduced Redundancy
Reduced redundancy
99.999999999% durability and 99.5% availability. Not resilient to loss of an AZ.
- standard S3
- standard IA
- one zone IA
- Glacier
- Reduced Redundancy
one zone IA
99.999999999% durability and 99.99% availability (after restore). No real-time access; 3-5 hours to access.
glacier
Automatically moves your data to the most cost-effective tier based on how frequently you access each object.
- S3
- S3-IA
- s5
- S3 intelligent tiering
S3 intelligent tiering
What are the 2 tiers in S3 intelligent tiering?
_____ and _____
frequent and infrequent
T or F
S3 intelligent tiering optimizes cost
True
This S3 option has no fees for accessing your data, but a small monthly fee for monitoring and automation: $0.0025 per 1,000 objects.
- S3-IA
- S3-one zone
- reduced redundancy
- s3 intelligent tiering
S3 intelligent tiering
S3 intelligent tiering has ______ durability and _____ availability
99.999999999% durability and 99.9% availability
T or F
S3 intelligent tiering is designed for data with unknown or unpredictable access patterns
True
S3 is charged for storage per _____
- MB
- KB
- GB
- TB
GB
T or F
S3 is charged for API requests (Get, Put, Copy, etc.)
True
Storage management pricing covers which of the following?
- inventory
- analytics
- object tags
all of the above
T or F
You are charged for moving data out of S3.
true
T or F
In S3, you are charged for transfer acceleration, which uses CloudFront to optimize transfers.
true
T or F
By default, all newly created buckets are public
False
By default, all newly created buckets are private
You can setup access control to your buckets using ____ and _____
bucket policies and access control lists.
bucket policies: applied at bucket level
access control lists: applied at object level
T or F
S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. These logs can be written to another bucket.
True
T or F
S3 does encryption in transit with SSL/TLS
True
Types of S3 at rest encryption:
_____ and _____
server side encryption
client side encryption
SSE-S3 is ______
S3-managed keys
SSE-KMS
AWS Key Management Service (KMS)-managed keys
SSE-C
server side encryption with customer provided keys.
Every time a file is uploaded to S3, a ____ request is initiated
Put
What type of request is this?
PUT /myfile HTTP/1.1
Host: mybucket.s3.amazonaws.com
Date: Wed, 25 Apr 2018 09:50:00 GMT
Authorization: authorization string
content-type: text/plain
content-length: 27364
x-amz-meta-author: Faye
expect: 100-continue
[27364 bytes of object data]
Put request
where is the x-amz-server-side-encryption parameter placed?
in the header when a file is to be encrypted at upload time.
What are the two options currently available for x-amz-server-side-encryption?
AES256 (SSE-S3) and aws:kms (SSE-KMS)
When this parameter is included in the header of the PUT request, it tells S3 to encrypt the object at the time of upload, using the specified encryption method.
x-amz-server-side-encryption
you can enforce the use of server-side encryption by using a ____ _____ which denies any S3 PUT request that doesn't include the x-amz-server-side-encryption parameter in the request header
bucket policy
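A minimal boto3 sketch of such a bucket policy plus a compliant upload; the bucket name is hypothetical:

import json

import boto3

s3 = boto3.client("s3")

# Deny any PutObject request that omits the x-amz-server-side-encryption header
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnencryptedPuts",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::mybucket/*",
        "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
    }],
}
s3.put_bucket_policy(Bucket="mybucket", Policy=json.dumps(policy))

# This upload passes because it sends the header (as AES256, i.e. SSE-S3)
s3.put_object(Bucket="mybucket", Key="file.txt", Body=b"data", ServerSideEncryption="AES256")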
A _____ is a system of distributed servers that delivers webpages and other web content to a user based on the geographic locations of the user, the origin of a webpage, and a content delivery service.
CDN - content delivery network
CDN
This is the location where content is cached and can also be written. Separate from an AWS Region/AZ.
edge location
CDN
This is the origin of all the files that the CDN will distribute. Origins can be an S3 bucket, an EC2 instance, an ELB, or Route53.
Origin
This is the name given to the CDN, which consists of a collection of Edge Locations
Distribution
CDN
Typically used for websites
Web distribution
CDN
Used for media streaming
RTMP
T or F
CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and interactive content, using a global network of edge locations. Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
True
T or F
Amazon CloudFront is optimized to work with other AWS services like S3, EC2, ELB, and Route53. CloudFront also works seamlessly with any non-AWS origin server, which stores the original, definitive versions of your files.
True
Cloudfront distribution types:
___ distribution
____ distribution
web - used for websites, HTTP/HTTPS
RTMP - Adobe real time messaging protocol- used for media streaming/Flash multimedia content
________ enables fast, easy, and secure transfers of files over long distances between your end users and an S3 bucket.
Transfer acceleration
_______ takes advantage of CloudFront's globally distributed edge locations. As the data arrives at an edge location, data is routed to S3 over an optimized network path.
Transfer Acceleration
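A minimal boto3 sketch of enabling transfer acceleration and then uploading through the accelerate endpoint; the bucket and file names are assumptions (note that accelerated bucket names cannot contain periods):

import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# 1) Turn acceleration on for the bucket
s3.put_bucket_accelerate_configuration(
    Bucket="mybucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# 2) Route subsequent transfers through the accelerate (edge) endpoint
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("big-video.mp4", "mybucket", "uploads/big-video.mp4")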
T or F
S3 is designed to support very high request rates
True
T or F
if your S3 buckets are routinely receiving > 100 PUT/LIST/DELETE or > 300 GET requests per second, then there are some best practice guidelines that will help optimize S3 performance.
True
For _____, use the CloudFront content delivery service to get the best performance. CloudFront will cache your most frequently accessed objects and will reduce latency for your GET requests.
- Get-Intensive workloads
- Mixed request type workloads
Get intensive workloads
A mix of GET, PUT, DELETE, and GET Bucket (list) requests; for these intensive workloads, the key names you use for your objects can impact performance.
- Get-Intensive workloads
- Mixed request type workloads
Mixed request type workloads
S3 uses the ____ ____ to determine which partition an object will be stored in.
key name
T or F
The use of non-sequential key names
ie: names prefixed with a time stamp or alphabetical sequence decreases the likelihood of having multiple objects stored on the same partition.
False
The use of sequential key names
ie: names prefixed with a time stamp or alphabetical sequence increases the likelihood of having multiple objects stored on the same partition.
T or F
for heavy workloads, sequential key names in S3 can cause IO issues and contention
True
T or F
By using a random prefix on key names, you can force S3 to distribute your keys across multiple partitions, distributing the IO workload
True
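A minimal sketch of that key-randomization technique; the bucket name and helper function are hypothetical (and, as the cards below note, the 2018 performance increase makes this largely unnecessary today):

import hashlib

import boto3

s3 = boto3.client("s3")

def hashed_key(name: str) -> str:
    # Prepend a short hash so keys spread across partitions instead of
    # clustering under a sequential (e.g. timestamp) prefix
    prefix = hashlib.md5(name.encode()).hexdigest()[:4]
    return f"{prefix}/{name}"

s3.put_object(Bucket="mybucket", Key=hashed_key("2018-04-25-09-50-00.log"), Body=b"...")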
T or F
in 2018, amazon announced a massive increase in S3 performance.
3,500 PUT requests per second per prefix
5,500 GET requests per second per prefix
True
T or F
S3’s new increase in performance negates the previous guidance to randomize your object keynames to achieve faster performance.
True
T or F
S3’s logical and sequential naming patterns can be used without any performance implication.
True