S3 Flashcards

1
Q

You have several S3 buckets in your application. Some data is stored in Parquet format while another bucket contains your VPC flow logs. What can you use to analyse this directly?

A

Amazon Athena

2
Q

Can S3 be considered as a database as well as storage?

A

Yes, in a limited sense. S3 data can be queried in place with Athena, so it can be thought of as a database for analytics.

3
Q

What AWS service can be used to mine S3 access logs?

A

Athena

4
Q

Can you delete a file from an S3 bucket with MFA Delete enabled via the S3 console?

A

No. Permanently deleting an object version requires the MFA code, so it must be done via the CLI or API.

5
Q

What are the 3 things that must be done to enable cross region replication in S3?

A
  • Versioning must be enabled on both the source and destination buckets
  • Buckets must be in different regions
  • IAM permissions must be set so that S3 can read from the source and write to the destination
6
Q

Why would you use a signed URL to access an S3 bucket?

A

So you can generate a URL which is valid for a limited time. This ensures that only authorised users have access to the content - e.g. for a premium video or content service for logged-in users.

7
Q

Can an S3 bucket be accessed from instances within a VPC private subnet - i.e. a subnet with no internet access?

A

Yes. S3 supports VPC endpoints so buckets can be accessed from a private subnet with no internet access.

8
Q

In S3, who is principal * ?

A
  • Principal * is anyone, including anonymous (unauthenticated) users
9
Q

When setting up a static website in S3, what do you need to do?

A

You need to enable static website hosting on the bucket and grant public access for s3:GetObject requests.
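
As a rough sketch of the public-read grant, the function below builds a bucket policy that allows anonymous s3:GetObject on every object. The bucket name example-site-bucket is a placeholder, and this assumes the bucket's Block Public Access settings permit a public policy.

```python
import json


def public_read_policy(bucket_name: str) -> dict:
    """Build a bucket policy that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",  # anyone, including anonymous users
                "Action": "s3:GetObject",
                # The /* suffix targets the objects, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "example-site-bucket" is a hypothetical bucket name.
    print(json.dumps(public_read_policy("example-site-bucket"), indent=2))
```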

10
Q

What is the function of Origin Access Identity?

A

OAI limits access to an S3 bucket to CloudFront only. This means that someone can’t access your S3 content directly from the open web.

11
Q

For what requests in S3 do we get read after write consistency - what are the exceptions?

A

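Historically, read-after-write consistency applied to PUTs of new objects. The exception was a GET that returned a 404 before the PUT: that PUT was eventually consistent. Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs, so the exception no longer applies.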

12
Q

What events result in eventual consistency?

A

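Historically, overwrite PUTs and DELETEs on existing objects. Since December 2020, S3 is strongly consistent for these operations as well.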

13
Q

What is the naming convention on S3 buckets?

A

Lowercase, no underscore, 3-63 chars, not an IP, must start with a lowercase letter or a number.
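
The rules on this card can be checked mechanically. The sketch below covers only the rules listed here plus the ban on consecutive dots; the full AWS rule set has further restrictions (reserved prefixes and suffixes, for example) that it does not attempt to check.

```python
import ipaddress
import re

# 3-63 characters: lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate bucket name against the rules on this card."""
    if not _NAME_RE.match(name):
        return False
    if ".." in name:
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not formatted as an IP address, so acceptable
    return False  # looks like an IP address


if __name__ == "__main__":
    for candidate in ("my-bucket-01", "My_Bucket", "192.168.1.1"):
        print(candidate, is_valid_bucket_name(candidate))
```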

14
Q

Where do you enable MFA delete for S3? In the bucket console, the IAM console or the CLI?

A

In the CLI - as the root user.

15
Q

You need to upload a 5.7GB file to S3. What needs to be enabled?

A

Multipart upload must be used for files larger than 5GB, because a single PUT is limited to 5GB.

16
Q

Which policy takes precedence over the other - IAM or Bucket?

A

Neither takes precedence as such. IAM and bucket policies are evaluated together, and an explicit DENY in either one overrides any ALLOW. For example, if the IAM policy has a DENY for writing an object to a bucket, this will override an ALLOW in the bucket policy.
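
A simplified model of how the outcome is decided when a request matches statements in both an IAM policy and a bucket policy. It assumes same-account access and ignores service control policies, permission boundaries and cross-account rules; the function name is illustrative only.

```python
from typing import Iterable


def evaluate(effects: Iterable[str]) -> str:
    """Combine the effects of every matching statement, whichever policy it is in."""
    matched = list(effects)
    if "Deny" in matched:
        return "Deny"  # an explicit deny always wins
    if "Allow" in matched:
        return "Allow"
    return "Deny"  # nothing matched: implicit deny


if __name__ == "__main__":
    # IAM policy says Deny, bucket policy says Allow.
    print(evaluate(["Deny", "Allow"]))
```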

17
Q

In S3 Glacier, what is each item referred to as, where are these stored and what is the max size?

A

Items are referred to as archives. They are stored in vaults, and each archive can be up to 40TB.

18
Q

Can you change the ownership of replicated objects or the storage class for replicated objects for the target in cross region replication?

A

Yes to both

19
Q

If you looked at an IAM policy for a cross region replication rule, what might you see for source and destination actions?

A

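For the source bucket expect to see get and list actions (for example s3:ListBucket and s3:GetObjectVersionForReplication). For the target expect to see replication actions (for example s3:ReplicateObject and s3:ReplicateDelete).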

20
Q

Where are pre-signed URLs generated for S3 and what is the default validity period?

A

Pre-signed URLs are generated via the CLI or SDK. The default validity period is 3600 seconds (1 hour).

21
Q

What permissions does a user of a pre-signed URL inherit for PUTs and GETs in an S3 bucket?

A

The user of the pre-signed URL inherits the permissions of the user who created it.

22
Q

Are delete markers replicated in cross region replication?

A

Not by default. Delete marker replication can be switched on as an option in the replication rule.

23
Q

Can cross region replication replicate to a bucket in another account?

A

Yes.

24
Q

There are 3 options that can be applied for replication in cross region replication to determine what gets replicated. What are they?

A
  1. Replicate the whole bucket
  2. Replicate objects with specific tags
  3. Replicate objects with specific user defined prefix in their name
25
Q

Do all encryption types for S3 support interactions via HTTP AND HTTPS?

A

No. SSE-C must use HTTPS

26
Q

When uploading to an S3 bucket using SSE-C, how is the client key transmitted? What about for other requests for the same object?

A

The encryption key is sent in the headers of the request. Every later request for an object encrypted with SSE-C must supply the same key.

27
Q

When uploading data to S3 using SSE-C, which protocol must you use?

A

HTTPS. The upload must be encrypted in transit

28
Q

With respect to S3 encryption, what does KMS-CMK refer to?

A

KMS Customer Master Key. This is the key S3 uses to encrypt data under SSE-KMS.

29
Q

What are the 4 methods of S3 encryption?

A

SSE-S3: Server side encryption with keys managed by S3
SSE-KMS: Server side encryption using AWS KMS
SSE-C: Server Side encryption using a customer key
Client Side: Client encrypts data on their side and uploads

30
Q

If you enable versioning on a previously un-versioned bucket what will the version numbers be for objects in that bucket?

A

“Null”

31
Q

In S3 what level is versioning enabled at?

A

At the bucket level

32
Q

Does multipart upload support parallel uploads?

A

Yes. This helps with upload throughput.

33
Q

Can you pause or resume an upload to S3?

A

Yes, using Multipart uploads

34
Q

What tiers does S3 Intelligent Tiering (IT) operate on?

A

S3 and S3-IA

35
Q

What is the purpose of the following 4 elements in a bucket policy? Resources, Actions, Effect, Principal?

A

Resource: The bucket or objects in the bucket the policy applies to.
Actions: The API actions against the bucket we will allow or deny
Effect: Allow/Deny
Principal: Account or user the policy applies to.

36
Q

When routing traffic to a static website hosted in S3, does the S3 bucket have to have the same name as the domain or subdomain registered in Route 53?

A

Yes

37
Q

What requests are logged via S3 access logging?

A

All requests, from anywhere and from any account, whether authorised or denied.

38
Q

Does AWS Glacier encrypt data at rest by default?

A

Yes

39
Q

When using S3 logs, should you store these in the bucket being logged, or in a separate one - why?

A

Separate buckets always - or else the logging activity itself gets logged, creating a loop.

40
Q

Are you charged for the amount of data on an EBS volume, or the provisioned space?

A

You are charged for the provisioned space

41
Q

What is the difference between snowball, and snowball edge?

A

Snowball can store 72TB of data. Snowball Edge can store 83TB and has local processing capabilities.

42
Q

What S3 encryption types are indicated by the following request headers:
x-amz-server-side-encryption: AES256
x-amz-server-side-encryption: aws:kms

A

AES256 = SSE-S3

aws:kms = SSE-KMS
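
A small sketch of how a client might pick the header for each mode. The helper name sse_headers is made up for illustration; the header names and values are the ones S3 documents for server-side encryption.

```python
from typing import Dict, Optional


def sse_headers(mode: str, kms_key_id: Optional[str] = None) -> Dict[str, str]:
    """Return the request headers that ask S3 for the given encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific KMS key instead of the default one.
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported mode: {mode}")


if __name__ == "__main__":
    print(sse_headers("SSE-S3"))
    print(sse_headers("SSE-KMS"))
```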

43
Q

Is cross region replication synchronous or asynchronous?

A

Asynchronous

44
Q

There are 2 ways to enforce S3 encryption:
- Use of headers on PUTs
- Use of the default encryption setting on the bucket
Use of headers is enforced by bucket policies. In which order are these evaluated, and which takes precedence over the other?

A

Bucket policies are enforced first.
The easy way to enforce encryption is via the default encryption settings. If something goes wrong on upload, check the policy.
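
For the header-based approach, a bucket policy along the following lines is commonly used: it denies any PutObject that omits the encryption header or sends a different algorithm. This is a sketch rather than a definitive policy, and the bucket name is a placeholder.

```python
import json


def require_sse_policy(bucket_name: str, algorithm: str = "AES256") -> dict:
    """Build a bucket policy that rejects uploads without the expected SSE header."""
    resource = f"arn:aws:s3:::{bucket_name}/*"
    header_key = "s3:x-amz-server-side-encryption"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyIncorrectEncryptionHeader",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                # Header present but naming a different algorithm.
                "Condition": {"StringNotEquals": {header_key: algorithm}},
            },
            {
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                # Header missing altogether.
                "Condition": {"Null": {header_key: "true"}},
            },
        ],
    }


if __name__ == "__main__":
    # "example-secure-bucket" is a hypothetical bucket name.
    print(json.dumps(require_sse_policy("example-secure-bucket"), indent=2))
```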

45
Q

What would you use to log and track every request to an S3 bucket including the requester, bucket name, request action, referrer, turn-around time and error code info? What service would you use and why?

A

You would use server access logging. CloudTrail provides detailed data on API calls but does not capture the referrer or turn-around time.

46
Q

Can you transition an object from S3 to S3-IA or S3-1ZIA after 7 days?

A

No. Transitions to S3-IA or S3-1ZIA can only occur after a minimum of 30 days.

47
Q

What are the minimum and maximum file sizes for S3?

A

0 Bytes. 5TB.

48
Q

For S3-IA, S3-1ZIA and S3-IT what are the minimum retention periods for objects stored in these classes (in days)?

A

30 Days

49
Q

For S3 Glacier and Glacier Deep Archive, what is the minimum retention period for objects (in days)?

A

90 and 180 Days respectively

50
Q

For S3 glacier, what latency can we expect when we retrieve an object for an expedited request?

A

1-5 minutes

51
Q

For S3 Glacier, what latency can we expect when we retrieve an object for a standard request?

A

3-5 Hours

52
Q

For S3 Glacier, what latency can we expect when we retrieve objects for a bulk request?

A

5-12 Hours

53
Q

How many AZ’s is S3 replicated over? What is the exception?

A

At least 3 AZs, with the exception of 1Z-IA

54
Q

What 2 storage tiers does S3-IT operate on?

A

S3-GP and S3-IA

55
Q

Are the S3 performance baselines per account or per S3 Prefix?

A

Per S3 Prefix

56
Q

If we take the following bucket structure, what is the S3 prefix: Bucket/Folder1/Sub1/file

A

Folder1/Sub1
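
Put another way, the prefix is everything in the object key before the final slash; the bucket name is not part of the key. A one-line sketch, with a made-up function name:

```python
def key_prefix(key: str) -> str:
    """Return the part of an object key before the final '/' ('' if there is none)."""
    return key.rpartition("/")[0]


if __name__ == "__main__":
    print(key_prefix("Folder1/Sub1/file"))
```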

57
Q

How many S3 prefixes can you have?

A

Practically there is no limit.

58
Q

For an S3 lifecycle rule, what is a transition action?

A

It defines when an object is transitioned from 1 storage tier to another
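
A sketch of what a rule with transition actions can look like, written as the dictionary shape the S3 lifecycle API accepts. The rule ID and prefix are invented for illustration. The small checker flags transitions to the infrequent-access classes that are scheduled before the 30-day minimum mentioned on an earlier card.

```python
from typing import List, Tuple

# Hypothetical rule: move objects under logs/ to IA after 30 days, Glacier after 90.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}


def too_early_transitions(config: dict, minimum_days: int = 30) -> List[Tuple[str, str, int]]:
    """List (rule ID, storage class, days) for IA transitions before the minimum."""
    problems = []
    for rule in config["Rules"]:
        for transition in rule.get("Transitions", []):
            if transition["StorageClass"] in IA_CLASSES and transition["Days"] < minimum_days:
                problems.append((rule["ID"], transition["StorageClass"], transition["Days"]))
    return problems


if __name__ == "__main__":
    print(too_early_transitions(LIFECYCLE))
```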

59
Q

What is the maximum object size in S3 Glacier?

A

40TiB

60
Q

You have enabled cross region replication on a bucket containing some objects. When you check the replication target, none of the objects are there. Why?

A

Only new objects are replicated. Objects that existed before replication was enabled are not copied automatically.

61
Q

In terms of the S3 baseline performance - how can these be impacted by S3 SSE-KMS?

A

If you are using S3 SSE-KMS, these calls have their own limits per region ranging from 5500 to 30000 requests/second. On a bucket with very high request rates KMS may throttle.

62
Q

In terms of GET, HEAD and PUT requests, what is the baseline requests-per-second performance?

A

3500 for PUTs and 5500 for GETs and HEADs, per prefix
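
Because the baseline applies per prefix, spreading keys over more prefixes raises the aggregate ceiling. A back-of-envelope sketch, assuming load is spread evenly across the prefixes:

```python
PUT_PER_PREFIX = 3500  # PUT requests per second, per prefix
GET_PER_PREFIX = 5500  # GET/HEAD requests per second, per prefix


def aggregate_rates(prefix_count: int) -> dict:
    """Estimate the combined baseline request rates across several prefixes."""
    return {
        "put": PUT_PER_PREFIX * prefix_count,
        "get": GET_PER_PREFIX * prefix_count,
    }


if __name__ == "__main__":
    print(aggregate_rates(10))  # {'put': 35000, 'get': 55000}
```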

63
Q

You are uploading a file to an S3 bucket using Transfer acceleration. Which AWS component is the file uploaded to FIRST?

A

The file is uploaded to an AWS edge location and is then transferred over the AWS private network to the bucket. This can significantly increase the upload speed.

64
Q

When SHOULD you enable multi-part upload, and when MUST you enable it?

A

You SHOULD use multi-part upload for files over 100MB. You MUST use it for files over 5GB.
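
To make the mechanics concrete, the sketch below splits a file into numbered parts the way a multipart upload would. It only plans the parts and uploads nothing. The 100 MiB default part size is an arbitrary choice; the 5 MiB minimum part size and the 10,000-part cap are S3 limits.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # every part except the last must be at least this big
MAX_PARTS = 10_000


def plan_parts(total_size: int, part_size: int = 100 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part number, byte offset, length) for each part of the upload."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the 5 MiB minimum")
    parts = []
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return parts


if __name__ == "__main__":
    parts = plan_parts(5_700_000_000)  # the 5.7GB file from an earlier card
    print(len(parts), "parts; last part is", parts[-1][2], "bytes")
```

Each part can then be uploaded independently, which is what allows parallel uploads and resuming after a failure.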

65
Q

What does an S3 Byte Range Fetch allow?

A

It allows for parallel downloads of a large file by specifying a range of bytes for each parallel request. Similar to a multi-part upload but in reverse.
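
The ranges are expressed with the standard HTTP Range header, whose end offset is inclusive. A sketch of how a client might compute the header value for each chunk (the function name is illustrative):

```python
from typing import List


def range_headers(total_size: int, chunk_size: int) -> List[str]:
    """Return the HTTP Range header values that cover an object in chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    headers = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        headers.append(f"bytes={start}-{end}")
    return headers


if __name__ == "__main__":
    print(range_headers(250, 100))
```

Fetching only the first 50 bytes of an object corresponds to the single range bytes=0-49.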

66
Q

You are required to ensure that your application is resilient when it comes to downloading large files from s3. What can help you achieve this?

A

S3 Byte Range Fetch

67
Q

You need to download the headers of files in S3. You know that in each case the header is contained in the first 50 bytes of the file. What can you use to achieve this?

A

S3 Byte Range Fetch

68
Q

We have a very large CSV file stored on S3 containing 15 years of sales data. We have built an application that allows the user to filter this set by a date or a product code. Currently, when the application runs it retrieves the entire CSV from S3 and filters it locally. We don’t want to build a database to store the data in, so what S3 technology could we use to offload the filtering task from the local application?

A

S3 Select will allow simple SQL queries to be used to filter data directly on S3 allowing us to return only the data requested to the client (Also, Glacier Select for Glacier)

69
Q

There are 3 targets for S3 Event notifications. What are they?

A

SNS, SQS, Lambda

70
Q

We have an S3 event rule set up in an S3 bucket which does not have versioning enabled. Looking at our event handler we can see that an event has fired when an object has been written to. Checking the S3 server access logs though we see that 2 writes were made to the same object concurrently. Why did we only see one event?

A

On non-versioned objects, if two writes happen at the same time, occasionally only ONE of those may trigger an event. To ensure that every write triggers an event enable versioning.

71
Q

We have an application that allows a user to upload images into an S3 bucket for an album. Part of the use case is that we need to generate a thumbnail every time an image is uploaded. What would we use to trigger this process?

A

We can set up an S3 event notification on the bucket which fires an event to SQS, SNS or Lambda (depending on the architecture) when an image object is created.
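
If Lambda is the target, the handler receives the notification as a JSON document listing the affected objects. A minimal sketch of reading it, assuming the standard S3 event record layout; the thumbnail step itself is left as a print statement. Object keys arrive URL-encoded, so they are decoded first.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs for newly created objects from an S3 event."""
    results = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # ignore deletes and other event types
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # spaces arrive as '+'
        results.append((bucket, key))
    return results


def handler(event, context):
    """Lambda entry point; the thumbnail generation is only a placeholder here."""
    for bucket, key in uploaded_objects(event):
        print(f"would create a thumbnail for s3://{bucket}/{key}")
```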

72
Q

We have our VPC flow logs being shipped to an S3 bucket. We want to perform analysis on these logs, but don’t want to pull them down locally to do this. What AWS tool allows us to analyse data directly on S3 that we could apply to our flow logs?

A

Amazon Athena allows you to analyse data directly on S3. This is different to S3 Select.

You can think of S3 Select as a cost-efficient retrieval optimisation that returns only the data matching a predicate from an object in S3 or Glacier, i.e. push-down filtering.

Amazon Athena is a fully managed analytics service that runs arbitrary ANSI SQL compliant queries - group by, having, window and geo functions, SQL DDL and DML.

73
Q

You suspect some of your employees are trying to access files in S3 that they don’t have access to. How can you verify this without them noticing?

A

Enable S3 access logging and then analyse using Athena

74
Q

You are looking to provide temporary URLs to a growing list of federated users in order to allow them to perform a file upload on S3 to a specific location. What should you use?

A

S3 pre-signed URLs