S3/Glacier Flashcards

1
Q

What is S3?

A

Simple Storage Service. It provides object-based (blob) storage and was one of the first AWS services, introduced in 2006.

2
Q

Why would I use S3?

A

Analytics: Data lakes (Athena, Redshift Spectrum, QuickSight), IoT Streaming Data repository (Kinesis Firehose), AI/ML Storage (Rekognition, Lex, MXNet), Storage class analysis (S3 Mgmt analytics)

Static Web Hosting: simple and massively scalable static website hosting

BitTorrent: use the BitTorrent protocol to retrieve any publicly available object by automatically generating a .torrent file

3
Q

What is the largest S3 object size?

A

5TB

4
Q

What is the largest object in a single PUT?

A

5GB

5
Q

When is it recommended to use multi-part uploads?

A

If your file is larger than 100 MB
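The size limits from the last three cards can be combined into a small upload planner. This is only a sketch: `plan_upload` is a hypothetical helper, the units are treated as binary (MiB/GiB/TiB), and the 10,000-part cap is an assumption that does not come from the cards.

```python
import math

# Size limits quoted in the cards, expressed in binary units
# (treating MB/GB/TB as MiB/GiB/TiB is an assumption of this sketch).
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TIB        # largest S3 object
MAX_SINGLE_PUT = 5 * GIB         # largest object in a single PUT
MULTIPART_THRESHOLD = 100 * MIB  # multipart is recommended above this size
MAX_PARTS = 10_000               # assumed cap on parts per multipart upload


def plan_upload(size_bytes: int, part_size: int = 100 * MIB) -> dict:
    """Return a hypothetical upload plan for an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB S3 limit")

    # Small objects fit comfortably in one PUT request.
    if size_bytes <= MULTIPART_THRESHOLD:
        return {"method": "single_put", "parts": 1}

    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(size_bytes / MAX_PARTS))
    return {
        "method": "multipart",
        "part_size": part_size,
        "parts": math.ceil(size_bytes / part_size),
    }
```

For example, `plan_upload(1 * GIB)` chooses a multipart upload, while `plan_upload(50 * MIB)` stays with a single PUT.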

6
Q

What is a key?

A

A key is NOT a file path, though it looks like one. It is the unique name of an object (record) within a bucket in the object store.
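To illustrate why a key is not a path, here is a minimal sketch of a flat key space where "folders" only appear when you list by prefix and delimiter. The `objects` dict and `list_keys` function are hypothetical stand-ins, not an AWS API.

```python
# Hypothetical bucket contents: keys are plain strings, not nested directories.
objects = {
    "photos/2023/cat.jpg": b"...",
    "photos/2023/dog.jpg": b"...",
    "photos/2024/bird.jpg": b"...",
    "readme.txt": b"...",
}


def list_keys(store: dict, prefix: str = "", delimiter: str = "/"):
    """Mimic a prefix/delimiter listing over a flat key space.

    Returns (keys, common_prefixes): keys directly "under" the prefix and
    the grouped prefixes that merely look like sub-folders.
    """
    keys, common_prefixes = [], set()
    for key in sorted(store):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Group everything up to the next delimiter, like a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common_prefixes)
```

Nothing in the store is a directory; `list_keys(objects, "photos/")` simply groups the keys that share the next path-like segment.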

7
Q

What are S3 storage classes and what is the purpose of each?

A
  1. Standard: frequently accessed data
  2. Standard-IA: long-lived, infrequently accessed data
  3. One Zone-IA: long-lived, infrequently accessed, non-critical data
  4. Reduced redundancy: frequently accessed, non-critical data
  5. Intelligent-Tiering: long-lived data with changing or unknown access patterns
  6. Glacier: long-term data archiving with retrieval times ranging from minutes to hours
  7. Glacier Deep Archive: long-term data archiving with retrieval times within 12 hours
8
Q

What is intelligent tiering?

A

Automatically moves objects between access tiers based on how often they are accessed (usage patterns), not on the type of data

9
Q

Does intelligent tiering add to the cost?

A

Yes, there is a small per-object monitoring and automation charge, but you typically save money overall because objects move to lower-cost tiers

10
Q

What is intelligent tiering archive?

A

Automatically moves data to Glacier-class or Glacier Deep Archive-class archive tiers after it has not been accessed for a certain period of time. This is NOT lifecycle management

11
Q

What is S3 Lifecycle Management?

A

It transitions objects to other storage classes, or expires (deletes) them, after a set period of time.

12
Q

What are the benefits of S3 Lifecycle Management?

A
  1. Optimize storage costs
  2. Adhere to data retention policies using automation
  3. Helps keep S3 buckets well maintained
  4. Data destruction is one of the more difficult tasks and this helps provide clarity and enforcement
13
Q

What are the Lifecycle rules based on?

A
  1. Prefixes
  2. Tags
  3. Current vs previous versions
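As a rough illustration of prefix- and tag-based rules, the sketch below builds one lifecycle rule as a plain dict. The key names follow the shape I understand boto3's `put_bucket_lifecycle_configuration` to accept; `build_lifecycle_rule` itself, the rule ID, and the sample prefix are hypothetical, and version-specific actions are left out.

```python
def build_lifecycle_rule(rule_id, prefix=None, tag=None,
                         transitions=(), expire_after_days=None):
    """Build one lifecycle rule dict, filtered by prefix and/or a (key, value) tag."""
    if prefix is not None and tag is not None:
        # Both conditions must match, so they are wrapped in an "And" filter.
        rule_filter = {"And": {"Prefix": prefix,
                               "Tags": [{"Key": tag[0], "Value": tag[1]}]}}
    elif tag is not None:
        rule_filter = {"Tag": {"Key": tag[0], "Value": tag[1]}}
    else:
        # An empty prefix applies the rule to every object in the bucket.
        rule_filter = {"Prefix": prefix or ""}

    rule = {"ID": rule_id, "Filter": rule_filter, "Status": "Enabled"}
    if transitions:
        rule["Transitions"] = [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ]
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


# Example: move logs to Standard-IA after 30 days, Glacier after 90,
# and expire them after a year (all values are illustrative).
log_rule = build_lifecycle_rule(
    "archive-logs",
    prefix="logs/",
    transitions=[(30, "STANDARD_IA"), (90, "GLACIER")],
    expire_after_days=365,
)
```

A configuration would then wrap one or more such rules as `{"Rules": [log_rule]}`.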
14
Q

What is storage class analysis?

A

A useful tool within S3 that helps you ensure you are using your storage in the most cost-effective manner. You can run reports to view the frequency at which data is accessed and then potentially change storage type.

15
Q

What is the “Requester Pays” cost option?

A

The requester rather than the bucket owner pays for requests AND data transfer

16
Q

Does S3 support tagging?

A

Yes, assign tags to objects for use in costing, billing, security etc

17
Q

What is an S3 Event?

A

This occurs when an action is taken on a bucket or object and triggers notifications to SNS, SQS, or Lambda
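A consumer such as a Lambda function receives the notification as a JSON document. The sketch below pulls out the fields most handlers need; `summarize_s3_event` and the sample payload are hypothetical, and the payload is trimmed to the fields used here.

```python
from urllib.parse import unquote_plus


def summarize_s3_event(event: dict) -> list:
    """Extract event name, bucket, and key from each record of an S3 event."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        results.append({
            "event": record["eventName"],
            "bucket": s3["bucket"]["name"],
            # Keys arrive URL-encoded (spaces as '+'), so decode them.
            "key": unquote_plus(s3["object"]["key"]),
        })
    return results


# Trimmed, made-up example of a notification for an uploaded object.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/2024+q1.csv", "size": 1024},
            },
        }
    ]
}
```

Calling `summarize_s3_event(sample_event)` yields one entry whose key has been decoded to `reports/2024 q1.csv`.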

18
Q

What is transfer acceleration?

A

Speeds up data uploads by routing them through CloudFront edge (PoP) locations, effectively using the CDN in reverse

19
Q

How is S3 secured?

A
  1. Resource-based through object ACL and bucket policies
  2. User-based through IAM policies
  3. Optional multi-factor authentication before delete
  4. Through encryption at rest and in transit
20
Q

How is MFA used in S3?

A
  1. Safeguards against accidental deletion of an object
  2. Safeguards against changing the versioning state of your bucket

21
Q

What are the options for encryption at rest?

A
  1. SSE-S3: Use S3’s existing encryption key for AES-256
  2. SSE-C: Upload your own AES-256 key which S3 will use when it writes the objects
  3. SSE-KMS: Use a key generated and managed by AWS KMS
  4. Client-Side: Encrypt objects using own local encryption process before uploading to S3 (PGP, GPG)
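The three server-side options differ mainly in the request headers sent with an upload. The following sketch maps each mode to the headers I understand the S3 REST API to use; `sse_headers` is a hypothetical helper and it does not send any request.

```python
import base64
import hashlib


def sse_headers(mode: str, kms_key_id: str = None, customer_key: bytes = None) -> dict:
    """Return the encryption-related request headers for a server-side mode."""
    if mode == "SSE-S3":
        # S3 manages the key; the client only asks for AES-256.
        return {"x-amz-server-side-encryption": "AES256"}

    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers

    if mode == "SSE-C":
        # The caller supplies the 256-bit key with every request.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        key_b64 = base64.b64encode(customer_key).decode("ascii")
        md5_b64 = base64.b64encode(hashlib.md5(customer_key).digest()).decode("ascii")
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": key_b64,
            "x-amz-server-side-encryption-customer-key-MD5": md5_b64,
        }

    raise ValueError(f"unknown mode: {mode}")
```

Client-side encryption has no entry here because the object is already encrypted before it reaches S3.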
22
Q

What is PGP?

A

Pretty Good Privacy (PGP) is an encryption program that provides cryptographic privacy and authentication for data communication. PGP is used for signing, encrypting, and decrypting texts, e-mails, files, directories, and whole disk partitions and to increase the security of e-mail communications. Phil Zimmermann developed PGP in 1991.

PGP encryption uses a serial combination of hashing, data compression, symmetric-key cryptography, and finally public-key cryptography; each step uses one of several supported algorithms. Each public key is bound to a username or an e-mail address.

23
Q

What is GPG?

A

GNU Privacy Guard (GnuPG or GPG) is a free-software replacement for Symantec’s PGP cryptographic software suite.

GnuPG is a hybrid-encryption software program because it uses a combination of conventional symmetric-key cryptography for speed, and public-key cryptography for ease of secure key exchange, typically by using the recipient’s public key to encrypt a session key which is used only once.

24
Q

Why would you use PGP vs GPG?

A

PGP has historically relied on the RSA and IDEA encryption algorithms, whereas GPG uses openly standardized algorithms such as NIST's AES.

PGP has licensing restrictions for personal and commercial use, whereas GPG is free to download and use for both.
PGP is a proprietary solution owned by Symantec, whereas GPG is open source and follows an open standard.

25
Q

How does S3 protect data?

A
  1. Versioning
  2. Multi-factor authentication
  3. Cross-region replication
26
Q

What is versioning?

A

A means of keeping multiple variants of an object in the same bucket. New version with each write. It enables “roll-back” and “un-delete” capabilities

27
Q

How can you use S3 versioning?

A

To preserve, retrieve, and restore every version of every object stored in an S3 bucket. You can easily recover from both unintended user actions and application failures

28
Q

Do old versions count towards the bill?

A

Yes, until they are permanently deleted

29
Q

Is versioning integrated with lifecycle management?

A

Yes, you can use Lifecycle management to delete old versions automatically after a certain number of days

30
Q

Why would I use Cross-region replication?

A
  1. Security
  2. Compliance
  3. Latency
31
Q

What are some characteristics of Glacier?

A
  1. It’s a service by itself with its own API, console, etc
  2. Cheap, slow to respond, and seldom accessed
  3. Used by AWS Storage Gateway Virtual Tape Library (VTL)
  4. Integrated with AWS S3 via Lifecycle Management
  5. Faster retrieval speed options if you pay more, though it is still meant to be long-term storage. It’s not fast enough for online content.
32
Q

What are the components of Glacier?

A
  1. Vault (like an S3 bucket)
  2. Archive (like an S3 object)
  3. Access policies
33
Q

What is a glacier policy?

A

It defines what rules the vault must obey

34
Q

What is Glacier Vault Lock?

A
  1. It is different than the vault access policy
  2. It enforces rules like no deletes or MFA
  3. It’s immutable: once the lock is completed, the policy can no longer be changed or deleted. Until then, an in-progress lock can still be aborted.
35
Q

What are the characteristics of an archive?

A
  1. It can be a file including a zip, tar, etc
  2. The max size is 40TB
  3. Immutable
36
Q

How do I create a Vault Lock?

A
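  1. Create a Vault Lock policy
  2. Initiate the vault lock, which attaches the policy to the vault
  3. A 24-hour timer starts, during which you test and confirm that the vault lock is performing as intended
    i. If the timer elapses and you have not confirmed the lock, the process aborts
    ii. If you complete the lock, it is set permanently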
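The steps above behave like a small state machine. The toy model below captures only that behaviour (unlocked, in progress for 24 hours, then locked for good); the `VaultLock` class is hypothetical and does not call the Glacier API.

```python
from datetime import datetime, timedelta

LOCK_WINDOW = timedelta(hours=24)  # time allowed to test and confirm the lock


class VaultLock:
    """Toy model of the vault lock workflow: unlocked -> in_progress -> locked."""

    def __init__(self):
        self.state = "unlocked"
        self.policy = None
        self._started = None

    def initiate(self, policy: dict, now: datetime) -> None:
        if self.state == "locked":
            raise RuntimeError("a locked policy is immutable")
        self.state, self.policy, self._started = "in_progress", policy, now

    def abort(self) -> None:
        if self.state == "locked":
            raise RuntimeError("a locked policy is immutable")
        self.state, self.policy, self._started = "unlocked", None, None

    def complete(self, now: datetime) -> None:
        # An unconfirmed lock is discarded once the 24-hour window has passed.
        if self.state == "in_progress" and now - self._started > LOCK_WINDOW:
            self.abort()
        if self.state != "in_progress":
            raise RuntimeError("no lock in progress (it may have expired)")
        self.state = "locked"
```

Completing within the window sets the lock permanently; calling `complete` after 24 hours raises an error and leaves the vault unlocked.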
37
Q

What happens if you delete an object in a versioning-enabled S3 bucket?

A

Amazon S3 inserts a delete marker instead of removing the object permanently. The delete marker becomes the current object version
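A small in-memory model can make the delete-marker behaviour concrete. `VersionedBucket` below is a hypothetical teaching aid, not an S3 client: version IDs are simple counters and a missing object is reported with `KeyError`.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # A plain delete only adds a marker as the new current version.
            return self.put(key, _DELETE_MARKER)
        # Deleting a specific version removes it permanently.
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v[0] != version_id
        ]
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            for vid, body in versions:
                if vid == version_id and body is not _DELETE_MARKER:
                    return body
            raise KeyError(key)
        if not versions or versions[-1][1] is _DELETE_MARKER:
            raise KeyError(key)  # current version is missing or a delete marker
        return versions[-1][1]
```

After a plain `delete`, `get` fails for the key but older versions remain readable by ID, and removing the marker itself brings the previous version back as current.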

38
Q

Does the SOAP API in S3 support versioning?

A

SOAP support over HTTP is deprecated, but it is still available over HTTPS. New Amazon S3 features are not supported for SOAP

39
Q

How are objects charged in S3 versioning?

A

Normal Amazon S3 rates apply for every version of an object stored and transferred.

40
Q

In what versioning states can an S3 bucket reside?

A

Unversioned (the default)
Versioning-enabled
Versioning-suspended

After you version-enable a bucket, it can never return to an unversioned state.

41
Q

What happens to objects that existed prior to enabling versioning in S3?

A

Objects that are stored in your bucket before you set the versioning state have a version ID of null. When you enable versioning, existing objects in your bucket do not change. What changes is how Amazon S3 handles the objects in future requests.

42
Q

What happens if you have an object expiration lifecycle policy in your unversioned bucket and you want to maintain the same permanent delete behavior when you enable versioning?

A

You must add a noncurrent expiration policy. The noncurrent expiration lifecycle policy manages the deletes of the noncurrent object versions in the version-enabled bucket.
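As a hedged illustration, the configuration below pairs the existing expiration with a noncurrent-version expiration. The key names follow the lifecycle configuration shape as I understand it; `expiration_config`, the rule ID, and the day counts are hypothetical.

```python
def expiration_config(days: int, versioned: bool, noncurrent_days: int = 1) -> dict:
    """Build a lifecycle configuration that expires objects after `days`.

    In a versioned bucket, expiring the current version only makes it
    noncurrent, so a second action is added to remove noncurrent versions.
    """
    rule = {
        "ID": "expire-objects",
        "Filter": {"Prefix": ""},  # apply to the whole bucket
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }
    if versioned:
        rule["NoncurrentVersionExpiration"] = {"NoncurrentDays": noncurrent_days}
    return {"Rules": [rule]}
```

With `versioned=False` the rule matches the original unversioned policy; with `versioned=True` it adds the action that permanently removes old versions.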

43
Q

What are the transfer acceleration URLs?

A

xyz.s3-accelerate.amazonaws.com

xyz.s3-accelerate.dualstack.amazonaws.com

(where xyz is the bucket name)
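A small helper can build these hostnames from a bucket name. `accelerate_endpoint` is a hypothetical function; the check for periods reflects my understanding that Transfer Acceleration does not support bucket names containing them.

```python
def accelerate_endpoint(bucket: str, dualstack: bool = False) -> str:
    """Return the Transfer Acceleration URL for a bucket."""
    if not bucket or "." in bucket:
        raise ValueError("bucket name must be non-empty and contain no periods")
    suffix = (
        "s3-accelerate.dualstack.amazonaws.com"  # IPv4 and IPv6
        if dualstack
        else "s3-accelerate.amazonaws.com"       # IPv4 only
    )
    return f"https://{bucket}.{suffix}"
```

For a bucket named `xyz`, this returns `https://xyz.s3-accelerate.amazonaws.com`, or the dual-stack form when `dualstack=True`.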

44
Q

What is the S3 Transfer Acceleration Speed Comparison tool used for?

A

Comparing accelerated and non-accelerated upload speeds across different AWS Regions