Data Analytics - Collection Flashcards

1
Q

What are the three types of collection?

A

Real Time (immediate actions)
Near Real Time (reactive actions)
Batch (historical analysis)

2
Q

What services can be used for real time data collection?

A

Kinesis Data Streams (KDS)
Simple Queue Service (SQS)
AWS IoT Core (Internet of Things)

3
Q

What services can be used for near real time data collection?

A

Kinesis Data Firehose (KDF)
Database Migration Service (DMS)

4
Q

What services can be used for batch data collection?

A

Snowball
Data Pipeline

5
Q

Kinesis data streams are made up of _________

A

Shards

6
Q

What are the two parts of an incoming record in Kinesis Data Streams?

A

Partition key
Data blob
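
The partition key decides which shard a record lands on: Kinesis hashes the key with MD5 into a 128-bit integer, and each shard owns a range of that hash space. The sketch below illustrates the idea under the assumption that the hash space is split evenly across shards; `hash_key` and `shard_for` are hypothetical helper names, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits wide


def hash_key(partition_key: str) -> int:
    """Return the MD5 hash of a partition key as a 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a key to a shard index, ASSUMING evenly sized hash ranges."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return hash_key(partition_key) * shard_count // HASH_SPACE
```

Records with the same partition key always map to the same shard, which is why a skewed key distribution can overload a single shard.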

7
Q

How large can a data blob be in Kinesis Data Streams?

A

Up to 1MB

8
Q

Who sends records to Kinesis Data Streams?

A

Producers

9
Q

At what rate can records be written to Kinesis Data Streams?

A

1 MB/sec or 1,000 records/sec per shard (whichever limit is reached first)
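
Because both write limits apply per shard, the shard count for a workload follows from whichever limit is tighter. A minimal sketch of that arithmetic, using the figures from this card; `shards_for_ingest` is a hypothetical helper, and the constants should be checked against current AWS quotas before relying on them.

```python
import math

WRITE_MB_PER_SHARD = 1.0         # from the card: 1 MB/sec per shard
WRITE_RECORDS_PER_SHARD = 1000   # from the card: 1,000 records/sec per shard


def shards_for_ingest(mb_per_sec: float, records_per_sec: float) -> int:
    """Return the minimum shard count that satisfies both write limits."""
    by_bytes = math.ceil(mb_per_sec / WRITE_MB_PER_SHARD)
    by_count = math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_count)
```

For example, 3.5 MB/sec at 2,000 records/sec is limited by bytes and needs 4 shards, while 0.5 MB/sec at 4,500 records/sec is limited by record count and needs 5.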

10
Q

Who consumes the data from Kinesis Data Streams?

A

Consumers

11
Q

What are the three parts of an outgoing record in Kinesis Data Streams?

A

Partition key
Sequence number
Data blob

12
Q

At what rate can records be consumed from Kinesis Data Streams?

A

Shared (classic) fan-out: 2 MB/sec per shard, shared across all consumers
Enhanced fan-out: 2 MB/sec per shard, per consumer
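
The difference between the two modes shows up once several consumers read the same stream. The sketch below estimates the read throughput available to each consumer, assuming consumers in shared mode split the per-shard limit evenly; `per_consumer_read_mb` is a hypothetical helper name.

```python
READ_MB_PER_SHARD = 2.0  # from the card: 2 MB/sec per shard


def per_consumer_read_mb(shards: int, consumers: int, enhanced: bool) -> float:
    """Estimate MB/sec available to each consumer across the whole stream."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be at least 1")
    total = shards * READ_MB_PER_SHARD
    # Enhanced fan-out gives every consumer the full per-shard rate;
    # shared mode divides it (ASSUMED evenly) among all consumers.
    return total if enhanced else total / consumers
```

With 2 shards and 4 consumers, each consumer gets about 1 MB/sec in shared mode and 4 MB/sec with enhanced fan-out.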

13
Q

What is the retention period in Kinesis Data Streams?

A

Between 1 day and 365 days

14
Q

Kinesis Data Streams can replay / reprocess data (T/F)

A

True
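
Replay works because a shard behaves like an append-only log: reading does not remove records, so a consumer can start again from an earlier position. The toy model below illustrates this; `ToyShard` is a hypothetical class, and its small integer sequence numbers are a simplification (real Kinesis sequence numbers are large opaque values).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyShard:
    """Toy append-only log standing in for one shard (not the real service)."""

    records: List[Tuple[int, str, bytes]] = field(default_factory=list)

    def put(self, partition_key: str, data: bytes) -> int:
        """Append a record and return its toy sequence number."""
        seq = len(self.records)
        self.records.append((seq, partition_key, data))
        return seq

    def read_from(self, sequence_number: int = 0) -> List[Tuple[int, str, bytes]]:
        """Return a copy of all records at or after the given position."""
        return self.records[sequence_number:]
```

Calling `read_from(0)` twice returns the same records both times, which is the replay behavior the card describes.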

15
Q

Once data is inserted into Kinesis Data Streams, it can be deleted (T/F)

A

False (records are immutable and are removed only when the retention period expires)

16
Q

Which producers can write to Kinesis Data Streams?

A

AWS SDK
Kinesis Producer Library (KPL)
Kinesis Agent
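
A producer built on the AWS SDK usually sends records in batches rather than one at a time. The sketch below only prepares the batches and makes no AWS call. The entry fields `PartitionKey` and `Data` follow the PutRecords request shape; `to_batches` is a hypothetical helper, and the 500-record batch size is an assumption to verify against current service limits.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

MAX_BATCH = 500  # ASSUMED per-request record limit; check current quotas


def to_batches(
    events: Iterable[Tuple[str, bytes]], batch_size: int = MAX_BATCH
) -> Iterator[List[Dict[str, object]]]:
    """Group (partition_key, data) pairs into PutRecords-style batches."""
    batch: List[Dict[str, object]] = []
    for partition_key, data in events:
        batch.append({"PartitionKey": partition_key, "Data": data})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch
```

Each yielded batch could then be passed as the `Records` argument of the boto3 Kinesis client's `put_records` call, along with a stream name.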

17
Q

Which consumers can read from Kinesis Data Streams?

A

Write your own (Kinesis Client Library KCL, AWS SDK)
Managed (AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics)

18
Q

Which capacity mode of Kinesis Data Streams should be used if the user knows their capacity in advance?

A

Provisioned mode