Development (data processing, analytics and big data solutions) Flashcards

1
Q

What service would you use to ANALYSE data using SQL?

A

Athena

2
Q

What service would allow you to COLLECT and PROCESS large amounts of data on the availability of flights to display to customers?

A

Kinesis Data Streams because it specializes in handling real-time streaming data

3
Q

You want to put some data in and search it.

What is a good option for this?

A

OpenSearch

4
Q

If you hear “real-time streaming of data”, what’s usually the answer?

A

Kinesis, because it captures real-time data and keeps it ready for us to process whenever we want

5
Q

What would you use to write standard SQL queries on STREAMING DATA?

A

Kinesis Data Analytics, which is best for real-time analytics on streaming data using SQL queries

6
Q

A supermarket wants to analyze real-time data about users based on their clicks on a web page.

What service should you use?

A

Kinesis Data Firehose, because it can ingest the clickstream data and send it to an analytics service

7
Q

You have a video doorbell camera. You want to STORE the data for potential playback and for ML ANALYTICS.

What is a good option?

A

Kinesis Video Streams

8
Q

You want to LOAD a video and transform it into DIFFERENT FORMATS

What is a good option?

A

AWS Elemental Media Services (MediaConvert handles file-based transcoding into different formats)

9
Q

What is Amazon’s version of Zoom?

A

Amazon Chime

10
Q

You want to LOAD streaming data into Kinesis Data Analytics, S3, Redshift and OpenSearch.

What should you use?

A

Kinesis Data Firehose, because it can do it in one step

If you used Kinesis Data Streams, you would need to add a consumer, such as a Lambda function, to deliver the data

11
Q

What would you do to maintain strong performance if shards are hot?

A

Split the shards
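
As a rough illustration of what splitting involves: each shard owns a range of 128-bit hash keys, and a split needs a new starting hash key inside that range. The sketch below picks the midpoint so the two child shards get equal halves. The function name `split_point` is hypothetical; the result is the kind of value you would pass as `NewStartingHashKey` to the SplitShard API (for example via boto3's `split_shard`), which is not called here.

```python
def split_point(starting_hash_key: str, ending_hash_key: str) -> str:
    """Return a midpoint hash key for splitting a shard into two equal halves.

    Hash keys are decimal strings, as reported in a shard's hash key range.
    `split_point` is an illustrative helper, not part of any AWS SDK.
    """
    start = int(starting_hash_key)
    end = int(ending_hash_key)
    if end <= start:
        raise ValueError("shard hash key range is too small to split")
    # Child 1 keeps [start, mid - 1]; child 2 takes [mid, end].
    mid = start + (end - start + 1) // 2
    return str(mid)


# A single shard covering the whole key space splits at 2**127.
full_range_mid = split_point("0", str(2**128 - 1))
assert full_range_mid == str(2**127)
```

An even split only helps if the hot traffic is spread across the shard's key range; if one partition key is responsible for the load, it will still land in a single child shard.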

12
Q

A stream needs 5 MB/s of ingest and 40 MB/s of read throughput.

What is the minimum number of shards it needs?

A

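20 shards

because each shard supports 1 MB/s of writes and 2 MB/s of reads

so

5/1 = 5 shards for write
40/2 = 20 shards for read

Take the larger of the two: 20 shards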
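
The same arithmetic as a small sketch, assuming only the two per-shard throughput limits used above (1 MB/s write, 2 MB/s read) and ignoring other limits such as records per second. The function name `min_shards` is made up for illustration.

```python
import math

# Per-shard throughput limits used in this card (assumed to be the only constraints).
WRITE_MB_PER_SHARD = 1  # MB/s of ingest per shard
READ_MB_PER_SHARD = 2   # MB/s of reads per shard


def min_shards(ingest_mb_per_s: float, read_mb_per_s: float) -> int:
    """Smallest shard count that satisfies both the write and the read rate."""
    write_shards = math.ceil(ingest_mb_per_s / WRITE_MB_PER_SHARD)
    read_shards = math.ceil(read_mb_per_s / READ_MB_PER_SHARD)
    # The stream must satisfy whichever requirement is larger (and has at least 1 shard).
    return max(write_shards, read_shards, 1)


print(min_shards(5, 40))  # prints 20
```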

13
Q

What can you use to simplify data integration and ETL (Extract, Transform, Load) processes?

A

Glue

Glue simplifies data integration and ETL processes by automatically cataloging, cleaning, and transforming data

14
Q

Name the 3 steps for processing real time data using Kinesis?

A

Data (temperature sensors, clickstream data, sales information, gathered and batched so it can be processed in real time) → Kinesis → Compute (EC2 cluster, Lambda functions, etc.)

15
Q

Name 2 benefits of Kinesis?

A

Durability
Scalability

16
Q

What is a shard?

A

A shard is the base unit of capacity of a data stream

17
Q

When would you re-shard?

A

You would re-shard (increase the number of shards) when the data rate increases

18
Q

Does the number of shards directly impact its ability to process incoming data?

A

Yes

19
Q

Name 8 ways Kinesis policies can be implemented?

A
  • IAM policies (users, roles or groups)
  • Resource based policies
  • Fine-grained access control (e.g., tags)
  • Kinesis Data Streams Access Control
  • Amazon CloudWatch Metrics and Alarms
  • Cross-Account Access
  • Integration with AWS Organizations
  • Key Rotation and Encryption Policies
20
Q

Name 4 types of Kinesis

A

Kinesis Data Streams
Kinesis Video Streams
Kinesis Data Analytics
Kinesis Data Firehose

21
Q

You need to perform log analysis.

What would you use?

A

OpenSearch, because it's used for full-text search & analytics

It is suitable for building search engines, log analysis and monitoring solutions

22
Q

What service would you use if you want to LOAD live stream data into REDSHIFT?

A

Kinesis Data Firehose because it simplifies the process of capturing and loading data into data stores like Redshift by automatically handling the data delivery and ensuring that it is efficiently and reliably loaded for analysis.

23
Q

What is the difference between Athena and Aurora?

A

Athena is a serverless SQL query service for data stored in S3

Aurora is a relational database service

24
Q

Name the 4 types of Kinesis:

A
  1. Kinesis Data Streams: Collects and processes large streams of data records in real time.
  2. Kinesis Data Firehose: Loads streaming data into AWS data stores for near real-time analytics.
  3. Kinesis Data Analytics: Processes streaming data using SQL for real-time insights.
  4. Kinesis Video Streams: Securely streams video from connected devices to AWS for analysis and processing.