Developing Storage Solutions with Amazon S3 Flashcards
Different storage options in AWS
• Amazon S3
• Amazon S3 Glacier
• Amazon Elastic File System (EFS)
• AWS Storage Gateway
• Amazon Elastic Block Store (EBS)
What is S3 Glacier
Low-cost storage service that provides highly secure, durable, and flexible storage for data archiving and online backup
Types of retrieval in S3 Glacier
- Expedited retrievals: 1–5 minutes
- Standard retrievals: 3–5 hours
- Bulk retrievals: 5–12 hours
What is Elastic File System (EFS) and when to use it
A network file system offered as a managed service to EC2 instances.
It is designed to meet the performance needs of big data and analytics, media processing, content management, web serving, and home directories.
What is Storage Gateway and when to use it
Seamless and secure storage integration between an organization’s on-premises IT environment and the AWS storage infrastructure, such as Amazon S3, Amazon S3 Glacier, and Amazon EBS.
AWS Storage Gateway use cases include the following:
• Corporate file sharing
• Enabling existing on-premises backup applications to store primary backups on Amazon S3
• Disaster recovery
• Mirroring data to cloud-based compute resources and then archiving it to Amazon S3 Glacier
What is Elastic Block Store and when to use it
EBS volumes are network-attached storage that persists independently from the running life of a single EC2 instance. With Amazon EBS, you can also create point-in-time snapshots of volumes, which are stored in Amazon S3.
Amazon EBS typical use cases include the following:
• Big data analytics engines (such as the Hadoop/HDFS ecosystem and Amazon EMR clusters)
• Relational and NoSQL databases (such as Microsoft SQL Server and MySQL or Cassandra and MongoDB)
• Stream and log processing applications (such as Kafka and Splunk)
• Data warehousing applications (like Vertica and Teradata)
What is Amazon S3 and when to use it
Amazon S3 (Simple Storage Service) provides highly secure, durable, and scalable object storage.
You can use Amazon S3 as a storage solution for use cases such as:
• Content storage and distribution
• Backup and archiving
• Big data analytics
• Static website hosting
• Disaster recovery
Basic components of Amazon S3
The basic components of Amazon S3 are the bucket, the objects it holds, each object’s key, and the unique object URL.
Parts of a bucket’s URL and an object’s URL
https://[bucket_name].s3.[region endpoint].amazonaws.com
https://[bucket_name].s3.[region endpoint].amazonaws.com/[object key]
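As a minimal sketch of how the two URL templates above fit together, the following Python builds both from their parts. The function names `bucket_url` and `object_url` are hypothetical, the `region` argument is assumed to be a Region code such as `us-west-2`, and percent-encoding the key with `quote` is an assumption about how you would want unsafe characters handled.

```python
from urllib.parse import quote


def bucket_url(bucket_name: str, region: str) -> str:
    """Fill in the bucket URL template from the card (hypothetical helper)."""
    return f"https://{bucket_name}.s3.{region}.amazonaws.com"


def object_url(bucket_name: str, region: str, object_key: str) -> str:
    """Append the object key to the bucket URL.

    quote() leaves "/" unescaped by default, so key prefixes stay readable,
    while characters such as spaces are percent-encoded.
    """
    return f"{bucket_url(bucket_name, region)}/{quote(object_key)}"


if __name__ == "__main__":
    print(bucket_url("my-bucket", "us-west-2"))
    print(object_url("my-bucket", "us-west-2", "photos/cat 1.jpg"))
```

The bucket name and key used here are placeholders, not real resources.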
Requirements for S3 bucket name
The bucket name must be unique across Amazon S3.
3-63 characters
Lowercase letters, numbers and hyphens (-)
Avoid periods (.), which can cause certificate exceptions when the bucket is accessed over HTTPS
Do not use underscore (_)
A bucket is associated with an AWS Region
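The naming rules on this card can be sketched as a small check. This is an illustration only: `is_valid_bucket_name` is a hypothetical helper, it tests just the rules listed here (length, allowed characters, no periods or underscores), and it cannot test global uniqueness, which only S3 itself can decide. S3 may enforce further rules that this sketch does not cover.

```python
import re

# Only the character and length rules from the card:
# 3-63 characters drawn from lowercase letters, digits, and hyphens.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9-]{3,63}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name satisfies the rules listed on the card."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-bucket-01", "My_Bucket", "ab", "my.bucket"]:
        print(candidate, is_valid_bucket_name(candidate))
```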
Requirements for object key name
Encoded in UTF-8
Max 1024 bytes
Safe characters: 0-9 a-z A-Z ! - _ . * ' ( ) /
Avoid: \ ; : + = @ , ? & $ ` space % < > [ ] # | { } ^ " ~ non-printable characters
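A rough way to apply the key rules above is to measure the UTF-8 encoded length and flag any character outside the safe set. The helper name `check_object_key` is hypothetical, and treating every character outside the safe list as a problem is a deliberately strict assumption: such keys are still legal, they may simply need extra handling.

```python
# Safe characters as listed on the card.
SAFE_CHARS = set(
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "!-_.*'()/"
)
MAX_KEY_BYTES = 1024


def check_object_key(key: str) -> list[str]:
    """Return a list of problems found; an empty list means the key looks fine."""
    problems = []
    # The limit applies to the UTF-8 encoding, not the number of characters.
    size = len(key.encode("utf-8"))
    if size > MAX_KEY_BYTES:
        problems.append(f"key is {size} bytes; limit is {MAX_KEY_BYTES}")
    unsafe = sorted({ch for ch in key if ch not in SAFE_CHARS})
    if unsafe:
        listed = " ".join(repr(ch) for ch in unsafe)
        problems.append(f"characters outside the safe set: {listed}")
    return problems


if __name__ == "__main__":
    print(check_object_key("photos/2024/cat.jpg"))
    print(check_object_key("reports/q1 summary.pdf"))
```

Note that a key made of multi-byte characters can exceed 1024 bytes well before it reaches 1024 characters.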
Two main types of metadata
System-defined metadata includes information such as object creation date, size, and MD5 digest.
User-defined metadata are name-value pairs assigned when an object is uploaded.
The prefix “x-amz-meta-” is automatically added to the metadata name.
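To make the prefix rule concrete, here is a small sketch of how user-defined name-value pairs could be turned into request header names. `metadata_to_headers` is a hypothetical helper, not part of any AWS SDK, and lowercasing the names is an assumption made to keep the output consistent.

```python
def metadata_to_headers(metadata: dict[str, str]) -> dict[str, str]:
    """Map user-defined metadata names to 'x-amz-meta-' prefixed header names."""
    prefix = "x-amz-meta-"
    headers = {}
    for name, value in metadata.items():
        name = name.lower()  # assumption: normalize names to lowercase
        if not name.startswith(prefix):
            name = prefix + name
        headers[name] = value
    return headers


if __name__ == "__main__":
    print(metadata_to_headers({"Author": "alice", "project": "demo"}))
```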
How does versioning work in S3
An object’s version ID is part of the system-defined metadata.
By default, versioning is disabled in S3 buckets.
• In versioning-disabled buckets, an object has a version ID of null.
• In versioning-enabled buckets, each version of an object has a unique version ID.
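The difference between the two bucket states can be illustrated with a toy in-memory model. This is not how S3 is implemented: `ToyBucket` is a hypothetical class, `None` stands in for the null version ID, and `uuid4` is only a stand-in for the version IDs that S3 generates itself.

```python
import uuid


class ToyBucket:
    """In-memory illustration of version IDs; not a real S3 client."""

    def __init__(self, versioning_enabled: bool = False):
        self.versioning_enabled = versioning_enabled
        self._versions = {}  # key -> list of (version_id, data), newest last

    def put(self, key: str, data: bytes):
        if self.versioning_enabled:
            # Each write adds a new version with its own unique ID.
            version_id = uuid.uuid4().hex
            self._versions.setdefault(key, []).append((version_id, data))
        else:
            # Without versioning, a write replaces the object; its ID is null.
            version_id = None
            self._versions[key] = [(None, data)]
        return version_id

    def list_versions(self, key: str):
        return [version_id for version_id, _ in self._versions.get(key, [])]


if __name__ == "__main__":
    plain = ToyBucket()
    plain.put("a.txt", b"v1")
    plain.put("a.txt", b"v2")
    print(plain.list_versions("a.txt"))

    versioned = ToyBucket(versioning_enabled=True)
    versioned.put("a.txt", b"v1")
    versioned.put("a.txt", b"v2")
    print(len(versioned.list_versions("a.txt")))
```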
Old path-style vs. Virtual hosted-style URL
Old path-style:
http://[region specific endpoint]/[bucket name]/[object key]
Virtual hosted-style:
http://[bucket name].s3.amazonaws.com/[object key]
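The relationship between the two styles can be sketched as a conversion: the bucket name moves from the first path segment into the host name. `path_style_to_virtual_hosted` is a hypothetical helper, and it assumes the input really is a path-style URL. Unlike the template on the card, it keeps whatever endpoint host the input used rather than rewriting it to `s3.amazonaws.com`.

```python
from urllib.parse import urlsplit, urlunsplit


def path_style_to_virtual_hosted(url: str) -> str:
    """Move the bucket name from the URL path into the host name."""
    parts = urlsplit(url)
    # The first path segment is the bucket; the rest is the object key.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("path-style URL must contain a bucket name")
    return urlunsplit(
        (parts.scheme, f"{bucket}.{parts.netloc}", f"/{key}", parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(path_style_to_virtual_hosted(
        "http://s3.us-west-2.amazonaws.com/my-bucket/photos/cat.jpg"
    ))
```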
Operation used to upload an object, and the maximum object size for a single upload to S3
Upload an object with PUT
You can upload or copy objects of up to 5 GB in a single PUT operation. Larger objects must be uploaded with multipart upload.
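The size rule above amounts to a simple threshold check. In this sketch `choose_upload_method` is a hypothetical helper, and interpreting "5 GB" as 5 × 1024³ bytes is an assumption; adjust the constant if you read the limit differently.

```python
# Assumption: "5 GB" is taken to mean 5 * 1024**3 bytes.
SINGLE_PUT_LIMIT_BYTES = 5 * 1024**3


def choose_upload_method(size_bytes: int) -> str:
    """Return 'put' for objects that fit in one PUT, otherwise 'multipart'."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "put" if size_bytes <= SINGLE_PUT_LIMIT_BYTES else "multipart"


if __name__ == "__main__":
    print(choose_upload_method(10 * 1024**2))  # a 10 MiB object
    print(choose_upload_method(6 * 1024**3))   # a 6 GiB object
```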