Athena Flashcards

1
Q

use cases for Athena

A
1. BI, analytics, and reporting
2. Analyzing and querying VPC Flow Logs, ELB Logs, CloudTrail trails, S3 Access Logs, and CloudFront Logs

2
Q

Athena

A

A serverless query service that lets you perform analytics directly against files in S3 using standard SQL. It even has JDBC and ODBC drivers if you want to connect your BI tools to it.
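
As an illustration, here is a minimal boto3 sketch (the database, table, and results bucket names are hypothetical) that submits a SQL query to Athena and reads the results back:

import time
import boto3

athena = boto3.client("athena")

# Hypothetical database, table, and results bucket -- replace with your own.
query = athena.start_query_execution(
    QueryString="SELECT status_code, COUNT(*) AS hits FROM alb_logs GROUP BY status_code",
    QueryExecutionContext={"Database": "my_logs_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = query["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

# Results are also written as a CSV file to the OutputLocation above.
if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])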

3
Q

Athena pricing

A

You are only charged per query, based on the amount of data scanned.

4
Q

Athena file support

A

It supports many different file formats, such as CSV, JSON, ORC, Avro, and Parquet. In the back end it runs Presto, a distributed SQL query engine.
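
A hedged sketch of how the format comes into play (database, table, and bucket names here are made up): the file format is declared when you register a table, and a CTAS query can convert data to a columnar format like Parquet to cut scan costs:

import boto3

athena = boto3.client("athena")
results = {"OutputLocation": "s3://my-athena-results/"}  # hypothetical bucket

# Register an existing CSV dataset in S3 as a queryable table; the file
# format is declared as part of the table definition.
athena.start_query_execution(
    QueryString="""
        CREATE EXTERNAL TABLE IF NOT EXISTS sales_csv (
            order_id string,
            amount   double
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION 's3://my-data-bucket/sales/'""",
    QueryExecutionContext={"Database": "my_db"},
    ResultConfiguration=results,
)

# A CTAS query can then convert it to a columnar format such as Parquet,
# which reduces the data scanned (and therefore the cost) per query.
athena.start_query_execution(
    QueryString="""
        CREATE TABLE sales_parquet
        WITH (format = 'PARQUET') AS
        SELECT * FROM sales_csv""",
    QueryExecutionContext={"Database": "my_db"},
    ResultConfiguration=results,
)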

5
Q

S3 Object Lock

A

Implements a WORM (write once, read many) model.
You write the file once to your S3 bucket, and then block that object version from being deleted or modified for a specified amount of time, so no one can touch it.

So you have the guarantee that the file is written only once, and no deletions or modifications can happen to it during that period.
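
A minimal boto3 sketch (bucket name, key, and retention date are placeholders; note Object Lock must have been enabled when the bucket was created):

from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3")

# The bucket must have been created with Object Lock enabled
# (ObjectLockEnabledForBucket=True at creation time).
s3.put_object(
    Bucket="my-worm-bucket",
    Key="audit/report.csv",
    Body=b"written exactly once",
    ObjectLockMode="COMPLIANCE",  # in COMPLIANCE mode no one can override it
    ObjectLockRetainUntilDate=datetime(2032, 1, 1, tzinfo=timezone.utc),
)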

6
Q

Glacier Vault Lock

A

The same WORM (write once, read many) model. You create a lock policy, and that policy prevents future edits to the archives in the vault, so they can no longer be changed. The policy itself is set in stone: once you set it, no one can delete it.
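
A minimal boto3 sketch of the two-step lock (vault name, account ID, and the example deny-delete policy are all placeholders):

import json
import boto3

glacier = boto3.client("glacier")

# An example lock policy that denies archive deletion.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "deny-delete",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "glacier:DeleteArchive",
        "Resource": "arn:aws:glacier:us-east-1:123456789012:vaults/my-vault",
    }],
}

# Initiating the lock attaches the policy in an in-progress state...
lock = glacier.initiate_vault_lock(
    accountId="-",  # "-" means the account of the current credentials
    vaultName="my-vault",
    policy={"Policy": json.dumps(policy)},
)

# ...and completing it makes the policy permanent and undeletable.
glacier.complete_vault_lock(
    accountId="-",
    vaultName="my-vault",
    lockId=lock["lockId"],
)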

7
Q

use case for S3 Object Lock and Glacier Vault Lock

A

Helpful when you have compliance and data retention requirements: you upload an object to S3 or Glacier with the guarantee that no one will ever be able to delete it, so you can retrieve it in, say, seven years' time in case there is an audit.

8
Q

You have enabled versioning and want to be extra careful when it comes to deleting files on S3. What should you enable to prevent accidental permanent deletions?

A

MFA Delete forces users to present an MFA token before deleting object versions. It's an extra level of security to prevent accidental permanent deletions.
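
A hedged boto3 sketch: MFA Delete is turned on through the bucket's versioning configuration and must be requested with the root account's credentials and MFA device (the bucket, device ARN, and token below are placeholders):

import boto3

s3 = boto3.client("s3")  # must be called with root credentials

# The MFA argument is "<mfa-device-serial> <current-token>".
s3.put_bucket_versioning(
    Bucket="my-bucket",
    MFA="arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)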

9
Q

You would like all your files in S3 to be encrypted by default. What is the optimal way of achieving this?

A

Enable “Default encryption” on the S3 bucket.
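
A minimal boto3 sketch (bucket name is a placeholder) that sets SSE-S3 as the bucket default:

import boto3

s3 = boto3.client("s3")

# Every new object in the bucket will now be encrypted with SSE-S3
# (AES-256) unless the upload request specifies otherwise.
s3.put_bucket_encryption(
    Bucket="my-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}
        }]
    },
)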

10
Q

You suspect some of your employees to try to access files in S3 that they don’t have access to. How can you verify this is indeed the case without them noticing?

A

Enable S3 Access Logs: they record all the requests made to the bucket, and Athena can then be used to run serverless analytics on top of the log files.
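
A minimal boto3 sketch (bucket names are placeholders; the target bucket also needs log-delivery permissions) that turns on access logging:

import boto3

s3 = boto3.client("s3")

# Deliver access logs for my-bucket into a separate logging bucket
# (never the bucket being logged, or you create a logging loop).
s3.put_bucket_logging(
    Bucket="my-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-logging-bucket",
            "TargetPrefix": "s3-access/",
        }
    },
)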

11
Q

You are looking for your entire S3 bucket to be available fully in a different region so you can perform data analysis optimally at the lowest possible cost. Which feature should you use?

A

S3 Cross-Region Replication, which replicates data from an S3 bucket to another bucket in a different region.
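
A minimal boto3 sketch (bucket names, account ID, and IAM role are placeholders; versioning must already be enabled on both buckets):

import boto3

s3 = boto3.client("s3")

# Replicate every new object from the source bucket to a bucket
# in another region, using a role that S3 can assume.
s3.put_bucket_replication(
    Bucket="my-source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {"Prefix": ""},  # empty prefix = replicate everything
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::my-destination-bucket"},
        }],
    },
)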

12
Q

You are looking to provide temporary URLs to a growing list of federated users in order to allow them to perform a file upload on S3 to a specific location. What should you use?

A

Pre-Signed URLs are temporary and grant time-limited access to specific actions in your S3 bucket.
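
A minimal boto3 sketch (bucket and key are placeholders) that creates an upload URL valid for one hour:

import boto3
import requests  # any HTTP client works; requests is used for illustration

s3 = boto3.client("s3")

# Generate a URL that allows exactly one operation (a PUT to this key)
# and expires after one hour.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-bucket", "Key": "uploads/user-42/report.csv"},
    ExpiresIn=3600,
)

# The federated user can then upload without holding AWS credentials.
requests.put(url, data=b"file contents")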

13
Q

How can you automate the transition of S3 objects between their different tiers?

A

Use S3 Lifecycle Rules.
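
A minimal boto3 sketch (bucket name, day counts, and tiers are illustrative) of a rule that tiers objects down over time:

import boto3

s3 = boto3.client("s3")

# Move objects to Standard-IA after 30 days, to Glacier after 90,
# and expire them after 365.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tiering-rule",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # applies to the whole bucket
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }]
    },
)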

14
Q

You are looking to build an index of your files in S3 using Amazon RDS PostgreSQL. To build this index, it is necessary to read the first 250 bytes of each object in S3, which contain some metadata about the content of the file itself. There are over 100,000 files in your S3 bucket, amounting to 50 TB of data. How can you build this index efficiently?

A

Create an application that traverses the S3 bucket, issues a Byte-Range Fetch for the first 250 bytes of each object, and stores the information in RDS.
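
A minimal boto3 sketch of that application (bucket name and the RDS insert are placeholders); fetching 250 bytes from 100,000 objects transfers roughly 25 MB in total instead of 50 TB:

import boto3

s3 = boto3.client("s3")

# Paginate through the bucket and fetch only the first 250 bytes of
# each object via a ranged GET.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-bucket"):
    for obj in page.get("Contents", []):
        header = s3.get_object(
            Bucket="my-bucket",
            Key=obj["Key"],
            Range="bytes=0-249",  # inclusive range: the first 250 bytes
        )["Body"].read()
        # index_in_rds(obj["Key"], header)  # hypothetical insert into RDS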
