Amazon Simple Storage Service (S3) | Amazon Glacier Flashcards
What is Amazon Redshift Spectrum?
Amazon Glacier
Amazon Simple Storage Service (S3) | Storage
Amazon Redshift Spectrum is a feature of Amazon Redshift that enables you to run queries against exabytes of unstructured data in Amazon S3, with no loading or ETL required. When you issue a query, it goes to the Amazon Redshift SQL endpoint, which generates and optimizes a query plan. Amazon Redshift determines what data is local and what is in Amazon S3, generates a plan to minimize the amount of Amazon S3 data that needs to be read, and requests Redshift Spectrum workers from a shared resource pool to read and process the data from Amazon S3.
Redshift Spectrum scales out to thousands of instances if needed, so queries run quickly regardless of data size. You can use exactly the same SQL for Amazon S3 data as you do for your Amazon Redshift queries today, and connect to the same Amazon Redshift endpoint using the same BI tools. Redshift Spectrum lets you separate storage and compute, allowing you to scale each independently. You can set up as many Amazon Redshift clusters as you need to query your Amazon S3 data lake, providing high availability and limitless concurrency. Redshift Spectrum gives you the freedom to store your data where you want, in the format you want, and have it available for processing when you need it.
Does Amazon S3 provide capabilities for archiving objects to lower cost storage options?
Yes, Amazon S3 enables you to use Amazon Glacier’s extremely low-cost storage service for data archival. Amazon Glacier stores data for as little as $0.004 per gigabyte per month. To keep costs low yet suitable for varying retrieval needs, Amazon Glacier provides three retrieval options, with access times ranging from a few minutes to several hours. Examples of archive use cases include digital media archives, financial and healthcare records, raw genomic sequence data, long-term database backups, and data that must be retained for regulatory compliance.
How can I store my data using the Amazon Glacier option?
You can use Lifecycle rules to automatically archive sets of Amazon S3 objects to Amazon Glacier based on lifetime. Use the Amazon S3 Management Console, the AWS SDKs or the Amazon S3 APIs to define rules for archival. Rules specify a prefix and time period. The prefix (e.g. “logs/”) identifies the object(s) subject to the rule. The time period specifies either the number of days from object creation date (e.g. 180 days) or the specified date after which the object(s) should be archived. Any Amazon S3 Standard or Amazon S3 Standard - IA objects which have names beginning with the specified prefix and which have aged past the specified time period are archived to Amazon Glacier. To retrieve Amazon S3 data stored in Amazon Glacier, initiate a retrieval job via the Amazon S3 APIs or Management Console. Once the job is complete, you can access your data through an Amazon S3 GET object request.
For more information on using Lifecycle rules for archival, please refer to the Object Archival topic in the Amazon S3 Developer Guide.
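As an illustration of the prefix-plus-time-period rule described above, here is a minimal sketch that builds a lifecycle configuration as a plain Python dictionary. The dictionary shape follows the one accepted by the AWS SDK for Python (boto3); the helper name `glacier_lifecycle_rule`, the rule ID, and the bucket name in the comment are hypothetical and chosen only for this example.

```python
def glacier_lifecycle_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build one lifecycle rule that archives objects under `prefix` after `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "ID": rule_id,                 # hypothetical rule name
        "Filter": {"Prefix": prefix},  # e.g. "logs/" selects keys starting with logs/
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


# The example from the text: archive everything under logs/ 180 days after creation.
lifecycle_config = {"Rules": [glacier_lifecycle_rule("logs/", 180)]}

# Assumption: with boto3 installed and credentials configured, the rule could be applied with
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=lifecycle_config)
```

To archive on a fixed calendar date instead of an age in days, the transition entry would carry a `Date` value in place of `Days`.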
Can I use the Amazon S3 APIs or Management Console to list objects that I’ve archived to Amazon Glacier?
Yes, like Amazon S3’s other storage options (Standard or Standard - IA), Amazon Glacier objects stored using Amazon S3’s APIs or Management Console have an associated user-defined name. You can get a real-time list of all of your Amazon S3 object names, including those stored using the Amazon Glacier option, using the Amazon S3 LIST API.
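Because archived objects appear in the normal listing, they can be picked out by their storage class. The sketch below filters a LIST-style response for archived keys. The response is a hand-written sample shaped like what boto3's `list_objects_v2` returns (a `Contents` list whose entries carry `Key` and `StorageClass`); the keys themselves and the helper name `archived_keys` are made up for illustration.

```python
def archived_keys(list_response):
    """Return the keys of objects whose storage class is GLACIER."""
    return [
        obj["Key"]
        for obj in list_response.get("Contents", [])
        if obj.get("StorageClass") == "GLACIER"
    ]


# Hypothetical sample data standing in for a real LIST response.
sample_response = {
    "Contents": [
        {"Key": "logs/2016-01.gz", "StorageClass": "GLACIER"},
        {"Key": "logs/2017-06.gz", "StorageClass": "STANDARD"},
        {"Key": "media/raw.mov", "StorageClass": "GLACIER"},
    ]
}

print(archived_keys(sample_response))  # ['logs/2016-01.gz', 'media/raw.mov']
```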
Can I use Amazon Glacier APIs to access objects that I’ve archived to Amazon Glacier?
No. Because Amazon S3 maintains the mapping between your user-defined object name and Amazon Glacier’s system-defined identifier, Amazon S3 objects that are stored using the Amazon Glacier option are only accessible through the Amazon S3 APIs or the Amazon S3 Management Console.
How can I retrieve my objects that are archived in Amazon Glacier?
To retrieve Amazon S3 data stored in Amazon Glacier, initiate a retrieval request using the Amazon S3 APIs or the Amazon S3 Management Console. The retrieval request creates a temporary copy of your data in Reduced Redundancy Storage (RRS) while leaving the archived data intact in Amazon Glacier. You can specify the number of days for which the temporary copy is kept in RRS. During that period, you can access the temporary copy through an Amazon S3 GET request on the archived object.
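A retrieval request can be sketched as a small payload: how many days to keep the temporary copy, and which retrieval option to use. The dictionary shape below matches the `RestoreRequest` argument of boto3's `restore_object`; the helper name `build_restore_request`, the bucket, and the key are assumptions made for this example.

```python
VALID_TIERS = ("Expedited", "Standard", "Bulk")


def build_restore_request(days, tier="Standard"):
    """Build the payload for restoring an archived object for `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


# Keep the temporary copy for one week, using the default Standard option.
request = build_restore_request(7)

# Assumption: with boto3 installed and credentials configured, the request could be sent with
# boto3.client("s3").restore_object(
#     Bucket="example-bucket", Key="logs/2016-01.gz", RestoreRequest=request)
```

Once the restore completes, an ordinary GET on the same key returns the data until the temporary copy expires.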
How long will it take to retrieve my objects archived in Amazon Glacier?
When processing a retrieval job, Amazon S3 first retrieves the requested data from Amazon Glacier, and then creates a temporary copy of the requested data in RRS (which typically takes a few minutes). The access time of your request depends on the retrieval option you choose: Expedited, Standard, or Bulk. For all but the largest objects (250 MB+), data accessed using Expedited retrievals is typically made available within 1 to 5 minutes. Standard retrievals typically complete within 3 to 5 hours. Bulk retrievals typically complete within 5 to 12 hours. For more information about the retrieval options, please refer to the Glacier FAQ.
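The typical completion windows quoted above can be captured in a small lookup table, which makes it easy to see which options fit a given deadline. This is only a sketch: the figures are the typical values from this card, not guarantees, and the name `tiers_meeting_deadline` is hypothetical.

```python
# Typical completion windows in minutes (lower bound, upper bound), as quoted in the text.
TYPICAL_RETRIEVAL_MINUTES = {
    "Expedited": (1, 5),          # for all but the largest objects (250 MB+)
    "Standard": (3 * 60, 5 * 60),
    "Bulk": (5 * 60, 12 * 60),
}


def tiers_meeting_deadline(deadline_minutes):
    """Return the options whose typical upper bound fits within the deadline."""
    return [
        tier
        for tier, (_, upper) in TYPICAL_RETRIEVAL_MINUTES.items()
        if upper <= deadline_minutes
    ]


print(tiers_meeting_deadline(6 * 60))  # ['Expedited', 'Standard']
```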
What am I charged for archiving objects in Amazon Glacier?
Amazon Glacier storage is priced from $0.004 per gigabyte per month. Lifecycle transition requests into Amazon Glacier cost $0.05 per 1,000 requests. Objects that are archived to Amazon Glacier have a minimum storage duration of 90 days, and objects deleted before 90 days incur a prorated charge equal to the storage charge for the remaining days.
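To make the two charges concrete, the following sketch estimates the one-time transition cost and the monthly storage cost from the rates quoted above. It assumes the lowest quoted storage rate ($0.004 per GB-month), one transition request per object, and no per-object overhead; the function name `archive_costs` is hypothetical, and actual rates vary by region.

```python
from decimal import Decimal

STORAGE_RATE_PER_GB_MONTH = Decimal("0.004")  # "from $0.004", so a lower bound
TRANSITION_RATE_PER_1000 = Decimal("0.05")


def archive_costs(object_count, total_gb):
    """Return (one-time transition cost, monthly storage cost) in US dollars."""
    transition = Decimal(object_count) / 1000 * TRANSITION_RATE_PER_1000
    monthly_storage = Decimal(str(total_gb)) * STORAGE_RATE_PER_GB_MONTH
    return transition, monthly_storage


# 100,000 objects totalling 100,000 GB: $5 to transition, $400 per month to store.
transition, monthly = archive_costs(100_000, 100_000)
print(transition, monthly)
```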
How is my storage charge calculated for Amazon S3 objects archived to Amazon Glacier?
The volume of storage billed in a month is based on the average storage used throughout the month, measured in gigabyte-months (GB-Months). Amazon S3 calculates the object size as the amount of data you stored, plus an additional 32 KB of Amazon Glacier data, plus an additional 8 KB of Amazon S3 Standard storage data. Amazon Glacier requires the additional 32 KB per object for its index and metadata so you can identify and retrieve your data. Amazon S3 requires the 8 KB to store and maintain the user-defined name and metadata for objects archived to Amazon Glacier. This enables you to get a real-time list of all of your Amazon S3 objects, including those stored using the Amazon Glacier option, using the Amazon S3 LIST API. For example, if you have archived 100,000 objects that are 1 GB each, your billable storage would be:
1.000032 gigabytes for each object x 100,000 objects = 100,003.2 gigabytes of Amazon Glacier storage.
0.000008 gigabytes for each object x 100,000 objects = 0.8 gigabytes of Amazon S3 Standard storage.
The fee is calculated based on the current rates for your region on the Amazon S3 Pricing Page.
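The worked example above can be reproduced with a few lines of arithmetic. This sketch treats the overheads as 0.000032 GB and 0.000008 GB per object, as the example does, and uses `Decimal` to avoid floating-point rounding; the function name `billable_storage` is an assumption for illustration.

```python
from decimal import Decimal

GLACIER_OVERHEAD_GB = Decimal("0.000032")  # 32 KB of Glacier index and metadata per object
S3_OVERHEAD_GB = Decimal("0.000008")       # 8 KB of S3 Standard name and metadata per object


def billable_storage(object_count, object_size_gb):
    """Return (Glacier GB, S3 Standard GB) billed for identical archived objects."""
    size = Decimal(str(object_size_gb))
    glacier_gb = (size + GLACIER_OVERHEAD_GB) * object_count
    s3_standard_gb = S3_OVERHEAD_GB * object_count
    return glacier_gb, s3_standard_gb


glacier_gb, s3_gb = billable_storage(100_000, 1)
print(glacier_gb, s3_gb)  # 100003.200000 0.800000
```

The overhead is fixed per object, so it matters most for many small objects: archiving the same data as fewer, larger objects reduces the billed overhead.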
How much data can I retrieve for free?
You can retrieve 10 GB of your Amazon Glacier data per month for free. The free tier allowance can be used at any time during the month and applies to Standard retrievals.
How am I charged for deleting objects from Amazon Glacier that are less than 3 months old?
Amazon Glacier is designed for use cases where data is retained for months, years, or decades. Deleting data that is archived to Amazon Glacier is free if the objects being deleted have been archived in Amazon Glacier for three months or longer. If an object archived in Amazon Glacier is deleted or overwritten within three months of being archived, an early deletion fee applies. This fee is prorated. If you delete 1 GB of data 1 month after uploading it, you will be charged an early deletion fee for 2 months of Amazon Glacier storage. If you delete 1 GB after 2 months, you will be charged for 1 month of Amazon Glacier storage.
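The proration rule can be sketched as a short calculation. It assumes whole months, the lowest quoted storage rate of $0.004 per GB-month, and a hypothetical function name `early_deletion_fee`; actual billing may prorate at a finer granularity and at your region's rate.

```python
from decimal import Decimal

MINIMUM_MONTHS = 3  # three-month minimum storage duration


def early_deletion_fee(size_gb, months_stored, rate_per_gb_month=Decimal("0.004")):
    """Return the fee for deleting `size_gb` after `months_stored` whole months."""
    remaining_months = max(0, MINIMUM_MONTHS - months_stored)
    return Decimal(str(size_gb)) * remaining_months * rate_per_gb_month


print(early_deletion_fee(1, 1))  # 0.008 (2 months of storage for 1 GB)
print(early_deletion_fee(1, 2))  # 0.004 (1 month of storage for 1 GB)
```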