AWS ML Eng Assoc - data storage 2 Flashcards

1
Q

Kinesis Data Streams

A

A managed streaming service that ingests large volumes of data in real time and makes it available to consumers for processing

2
Q

Shard

A

A unit of throughput capacity in Kinesis Data Streams. Each shard provides 1 MB/s (or 1,000 records/s) of write capacity and 2 MB/s of read capacity
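
The per-shard limits turn into a simple sizing calculation. The sketch below is illustrative only: the function name `shards_needed` and its parameters are made up for this card, and the 1,000 records/s write limit is the documented companion to the 1 MB/s figure.

```python
import math

# Per-shard limits: 1 MB/s or 1,000 records/s in, 2 MB/s out.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_s: float, records_s: float, read_mb_s: float) -> int:
    """Return the smallest shard count that satisfies all three limits."""
    by_write_bytes = math.ceil(write_mb_s / WRITE_MB_PER_SHARD)
    by_write_records = math.ceil(records_s / WRITE_RECORDS_PER_SHARD)
    by_read_bytes = math.ceil(read_mb_s / READ_MB_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


if __name__ == "__main__":
    # 5 MB/s in, 2,000 records/s, 12 MB/s out: the read side dominates.
    print(shards_needed(write_mb_s=5, records_s=2000, read_mb_s=12))
```

Whichever limit demands the most shards sets the stream size, which is why a stream with many small records can need more shards than its byte rate suggests.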

3
Q

Partition Key

A

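A value supplied with each record that determines which shard the record is written to. Kinesis hashes the key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it, so records with the same partition key land on the same shard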
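
A minimal sketch of how a partition key could map to a shard, assuming the stream's shards split the 128-bit hash space evenly and that the MD5 digest is read as a big-endian integer. The helper names (`hash_key`, `even_ranges`, `shard_for`) are hypothetical and not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Hash a partition key to an integer in [0, 2**128)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_ranges(shard_count: int) -> list:
    """Split the hash space into contiguous, inclusive (start, end) ranges."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges: list) -> int:
    """Return the index of the range that contains the key's hash."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("ranges do not cover the hash space")


if __name__ == "__main__":
    ranges = even_ranges(4)
    for key in ("user-1", "user-2", "user-1"):
        print(key, "-> shard", shard_for(key, ranges))
```

The same key always prints the same shard index, which is what preserves per-key ordering within a shard.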

4
Q

KPL (Kinesis Producer Library)

A

A library that helps you reliably put data into Kinesis Data Streams, handling batching (record aggregation) and automatic retries for you

5
Q

KCL (Kinesis Client Library)

A

A library that helps you consume and process data from Kinesis Data Streams, handling shard assignment across workers and checkpointing of progress

6
Q

Enhanced Fan-Out

A

A feature that gives each registered consumer its own dedicated read throughput of 2 MB/s per shard, instead of sharing the shard's 2 MB/s with other consumers
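
The difference shows up in simple arithmetic. This is a sketch with a made-up function name (`per_consumer_read_mb_s`), and it assumes shared consumers split a shard's read capacity evenly, which real workloads may not do.

```python
READ_MB_PER_SHARD = 2.0  # read capacity per shard


def per_consumer_read_mb_s(shards: int, consumers: int, enhanced_fan_out: bool) -> float:
    """Estimate the read throughput available to each consumer of a stream."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be at least 1")
    total = shards * READ_MB_PER_SHARD
    # Shared mode: all consumers divide the same 2 MB/s per shard.
    # Enhanced fan-out: every consumer gets the full 2 MB/s per shard.
    return total if enhanced_fan_out else total / consumers


if __name__ == "__main__":
    print(per_consumer_read_mb_s(shards=4, consumers=4, enhanced_fan_out=False))
    print(per_consumer_read_mb_s(shards=4, consumers=4, enhanced_fan_out=True))
```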

7
Q

Kinesis Data Firehose

A

A fully managed service for delivering streaming data in near real time to destinations such as S3, Redshift, Elasticsearch (now Amazon OpenSearch Service), and Splunk

8
Q

Kinesis Data Analytics

A

A service that allows you to process and analyze streaming data using SQL or Apache Flink

9
Q

MSK (Managed Streaming for Apache Kafka)

A

A fully managed Apache Kafka service that allows you to build and run applications that use Apache Kafka to process streaming data

10
Q

Shard Splitting

A

The process of dividing one shard into two to increase a Kinesis stream's capacity
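
A split divides the parent shard's hash key range between two child shards. The sketch below splits at the midpoint, which is a common choice but only one option; `split_range` is a hypothetical helper, not an AWS API call.

```python
def split_range(start: int, end: int) -> tuple:
    """Split an inclusive hash key range into two adjacent child ranges."""
    if end <= start:
        raise ValueError("range must contain at least two hash keys")
    # First hash key of the second child; rounds up so both halves are non-empty.
    new_start = (start + end + 1) // 2
    return (start, new_start - 1), (new_start, end)


if __name__ == "__main__":
    # A single-shard stream owns the whole 128-bit hash space.
    left, right = split_range(0, 2 ** 128 - 1)
    print(left)
    print(right)
```

Each child covers half of the parent's keys, so traffic that was evenly spread over the parent is now spread over two shards' worth of capacity.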

11
Q

Shard Merging

A

The process of combining two adjacent shards in a Kinesis stream into one to decrease capacity
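
Only shards whose hash key ranges are adjacent can be merged. A minimal sketch of that check, using a hypothetical `merge_ranges` helper rather than any AWS API:

```python
def merge_ranges(a: tuple, b: tuple) -> tuple:
    """Merge two adjacent inclusive hash key ranges into one."""
    low, high = sorted([a, b])
    # Adjacent means the higher range starts right after the lower one ends.
    if low[1] + 1 != high[0]:
        raise ValueError("ranges are not adjacent and cannot be merged")
    return (low[0], high[1])


if __name__ == "__main__":
    print(merge_ranges((0, 99), (100, 199)))
```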

12
Q

Hot Shard

A

A shard that receives disproportionately more data than the others, usually because of a skewed partition key, potentially causing throttling on that shard
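
One way to spot a hot shard is to compare each shard's write rate with the per-shard limit and with the stream average. This is a sketch under stated assumptions: the function name, the 80% utilization threshold, and the 2x skew factor are illustrative choices, not AWS recommendations.

```python
from statistics import mean

WRITE_MB_PER_SHARD = 1.0  # per-shard write limit


def hot_shards(write_mb_s_by_shard: dict,
               utilization_threshold: float = 0.8,
               skew_factor: float = 2.0) -> list:
    """Return shard IDs that are near the write limit or far above the average."""
    if not write_mb_s_by_shard:
        return []
    average = mean(write_mb_s_by_shard.values())
    hot = []
    for shard_id, rate in sorted(write_mb_s_by_shard.items()):
        near_limit = rate >= utilization_threshold * WRITE_MB_PER_SHARD
        skewed = average > 0 and rate >= skew_factor * average
        if near_limit or skewed:
            hot.append(shard_id)
    return hot


if __name__ == "__main__":
    rates = {"shard-0": 0.10, "shard-1": 0.95, "shard-2": 0.15}
    print(hot_shards(rates))
```

A flagged shard is a candidate for splitting, or a sign that the partition key needs more distinct values.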

13
Q

Kinesis Agent

A

A stand-alone Java application that collects data, such as log files, and sends it to Kinesis Data Streams or Kinesis Data Firehose

14
Q

Provisioned Mode

A

A capacity mode in Kinesis where you specify the number of shards for your stream

15
Q

On-Demand Mode

A

A capacity mode in Kinesis where capacity is automatically managed to accommodate your workload

16
Q

Random Cut Forest

A

An algorithm used in Kinesis Data Analytics for anomaly detection in streaming data

17
Q

Use case: Streaming ETL

A

Using Kinesis Data Analytics or MSK to perform real-time Extract, Transform, Load operations on streaming data

18
Q

Use case: Real-time Analytics

A

Using Kinesis Data Streams to ingest data and Kinesis Data Analytics to analyze it in real time for insights

19
Q

Use case: Log and Event Data Processing

A

Using Kinesis to ingest and process log files and event data from various sources in real time

20
Q

Mnemonic: KPL puts; KCL gets

A

Remember that the Kinesis Producer Library (KPL) is used to put data into streams, while the Kinesis Client Library (KCL) is used to get data from streams

21
Q

Metaphor: Kinesis as a river

A

Think of Kinesis Data Streams as a river: producers add water (data) upstream, consumers take water out downstream, and shards are the width of the river, determining how much water can flow