Amazon Kinesis Flashcards

1
Q

What is Amazon Kinesis?

A

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information.

2
Q

How is data processed, and what is the ingestion rate per shard?

A

Data is processed in "shards". Each shard can ingest up to 1,000 records per second (up to 1 MB per second).

3
Q

What is the default shard limit?

A

There is a default limit of 500 shards (the default quota can differ by Region), but you can request an increase; there is no fixed upper limit on the number of shards.

4
Q

What does a record consist of?

A

A record consists of a partition key, sequence number, and data blob (up to 1 MB).

5
Q

What is the Kinesis transient data store?

A

Kinesis is a transient data store: records are retained for 24 hours by default, and retention can be configured for up to 7 days.

6
Q

What are the 4 types of Kinesis services?

A
  1. Kinesis Video Streams
  2. Kinesis Data Streams
  3. Kinesis Data Analytics
  4. Kinesis Data Firehose
7
Q

Kinesis Video Streams

A

Kinesis Video Streams makes it easy to securely stream video from connected devices to AWS for analytics, machine learning (ML), and other processing.

Durably stores, encrypts, and indexes video data streams, and allows access to data through easy-to-use APIs.

Producers provide data streams.

Stores data for 24 hours by default, up to 7 days.

Stores data in shards. Each shard supports 5 read transactions per second, up to a maximum read rate of 2 MB per second, and 1,000 records per second for writes, up to a maximum write rate of 1 MB per second.

Consumers receive and process data.

Can have multiple shards in a stream.

Supports encryption at rest with server-side encryption (KMS) with a customer master key.

Kinesis Video Streams does not appear much on AWS exams.

8
Q

Kinesis Data Streams

A

Kinesis Data Streams enables you to build custom applications that process or analyze streaming data for specialized needs.

Kinesis Data Streams enables real-time processing of streaming big data.

9
Q

What are the common use cases for Kinesis Data Streams?

A
  • Accelerated log and data feed intake.
  • Real-time metrics and reporting.
  • Real-time data analytics.
  • Complex stream processing.
10
Q

What is the high-level architecture of Kinesis Data Streams?

A
  • Producers continually push data to Kinesis Data Streams.
  • Consumers process the data in real time.
  • Consumers can store their results using an AWS service such as Amazon DynamoDB, Amazon Redshift, or Amazon S3.
  • Kinesis Streams applications are consumers that run on EC2 instances.
  • Shards are uniquely identified groups of data records in a stream.
  • Records are the data units stored in a Kinesis Stream.
11
Q

How can producers send data to Kinesis?

A
  • Kinesis Streams API.
  • Kinesis Producer Library (KPL).
  • Kinesis Agent.
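
As an illustration of the first option, the sketch below sends one record with the Kinesis Streams API through a boto3-style client. The helper name `send_event`, the `device_id` key field, and the stream name in the usage comment are hypothetical; the client is assumed to expose `put_record` with the `StreamName`, `Data`, and `PartitionKey` parameters, as the boto3 Kinesis client does.

```python
import json


def send_event(client, stream_name, event, key_field="device_id"):
    """Send one event to a stream using the PutRecord API.

    `client` is assumed to be a boto3 Kinesis client (or any object
    with a compatible put_record method).
    """
    return client.put_record(
        StreamName=stream_name,
        # The data blob: the event serialized to JSON bytes.
        Data=json.dumps(event).encode("utf-8"),
        # The partition key decides which shard receives the record.
        PartitionKey=str(event[key_field]),
    )


# Usage (requires boto3, AWS credentials, and an existing stream):
#   import boto3
#   client = boto3.client("kinesis")
#   send_event(client, "example-stream", {"device_id": "sensor-1", "temp": 21.5})
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object when no AWS account is available.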
12
Q

What is a record in Kinesis?

A

A record is the unit of data stored in an Amazon Kinesis data stream.

A record is composed of a sequence number, partition key, and data blob.

By default, records of a stream are accessible for up to 24 hours from the time they are added to the stream (can be raised to 7 days by enabling extended data retention).

13
Q

What is the data blob in a Kinesis stream record?

A

A data blob is the data of interest your data producer adds to a data stream.

The maximum size of a data blob (the data payload before Base64-encoding) within one record is 1 megabyte (MB).
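
The size rule can be checked on the producer side before sending. This is a minimal sketch, assuming that 1 MB here means 1,048,576 bytes; the helper name `check_blob` is hypothetical. It also shows that the limit applies to the raw payload, not to its larger Base64 form.

```python
import base64

# Assumption: the 1 MB limit is taken to mean 1,048,576 bytes.
MAX_BLOB_BYTES = 1024 * 1024


def check_blob(data):
    """Raise ValueError if the raw payload exceeds the limit; otherwise return it."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError(
            f"data blob is {len(data)} bytes; limit is {MAX_BLOB_BYTES}"
        )
    return data


payload = b"x" * MAX_BLOB_BYTES
check_blob(payload)  # passes: exactly at the limit

# The Base64 form is about one third larger, but the limit is measured
# on the raw bytes before encoding.
print(len(base64.b64encode(payload)))  # 1398104
```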

14
Q

What is a shard?

A

A shard is the base throughput unit of an Amazon Kinesis data stream.

One shard provides a capacity of 1MB/sec data input and 2MB/sec data output.

Each shard can support up to 1000 PUT records per second.

A stream is composed of one or more shards.

The total capacity of the stream is the sum of the capacities of its shards.
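
Because stream capacity is the sum of its shards, the per-shard limits above can be turned into a rough sizing calculation. This is only a sketch: the function name `shards_needed` and the sample workload are hypothetical, and real sizing should leave headroom for traffic spikes and uneven partition keys.

```python
import math

# Per-shard limits as stated on this card.
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    by_write_bytes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC)
    by_write_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_read_bytes = math.ceil(read_mb_per_sec / READ_MB_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


# Hypothetical workload: 3.5 MB/s in, 2,500 records/s, 9 MB/s out.
# The read rate is the binding constraint: ceil(9 / 2) = 5 shards.
print(shards_needed(3.5, 2500, 9.0))  # 5
```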

15
Q

What are the two types of resharding?

A
  • In a shard split, you divide a single shard into two shards.
  • In a shard merge, you combine two shards into a single shard.

Splitting increases the number of shards in your stream and therefore increases the data capacity of the stream.

Splitting increases the cost of your stream (you pay per-shard).

Merging reduces the number of shards in your stream and therefore decreases both the data capacity and the cost of the stream.
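
Split and merge can be modeled as operations on hash key ranges. The card does not cover this detail: the sketch assumes that each shard owns a contiguous, inclusive range of 128-bit hash keys, that a split picks a new starting hash key inside the parent range, and that only adjacent shards can be merged. The function names are hypothetical, and the midpoint default is just one possible choice of split point.

```python
def split_shard(hash_range, new_start=None):
    """Split one inclusive hash key range into two adjacent child ranges."""
    start, end = hash_range
    if new_start is None:
        # Assumption: split at the midpoint when no split point is given.
        new_start = (start + end) // 2 + 1
    if not start < new_start <= end:
        raise ValueError("split point must fall inside the parent range")
    return (start, new_start - 1), (new_start, end)


def merge_shards(left, right):
    """Merge two adjacent inclusive hash key ranges into one."""
    if left[1] + 1 != right[0]:
        raise ValueError("only adjacent shards can be merged")
    return (left[0], right[1])


parent = (0, 2 ** 128 - 1)
low, high = split_shard(parent)          # one shard becomes two
assert merge_shards(low, high) == parent  # merging restores the original range
```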

16
Q

What are Kinesis consumers known as?

A

Consumers are known as Amazon Kinesis Streams Applications.

17
Q

What are partition keys used for?

A

Partition keys are used to group data by shard within a stream.
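
The grouping can be sketched as follows. The card does not give the mechanism; the sketch relies on Kinesis taking the MD5 hash of the partition key as a 128-bit integer and routing the record to the shard whose hash key range contains it. The evenly divided ranges and the function names are assumptions made for illustration.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def even_hash_ranges(shard_count):
    """Divide the hash space into contiguous inclusive ranges (assumed even)."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose range contains the key's MD5 hash."""
    hash_key = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside all ranges")


ranges = even_hash_ranges(4)
# Records with the same partition key always land on the same shard.
print(shard_for_key("sensor-1", ranges) == shard_for_key("sensor-1", ranges))  # True
```

One consequence of this scheme is that a single very busy partition key can overload one shard while the others stay idle, so keys with many distinct values tend to spread load better.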

18
Q

Which service does Kinesis Streams use for encryption?

A

AWS KMS, using KMS master keys.

To read from or write to an encrypted stream, the producer and consumer applications must have permission to access the master key.

19
Q

What is the Kinesis Data Streams replication strategy?

A

Kinesis Data Streams replicates synchronously across three AZs.

20
Q
A
21
Q

Security

A

Control access / authorization using IAM policies.

Encryption in flight using HTTPS endpoints.

Encryption at rest using KMS.

Possible to encrypt / decrypt data on the client side.

VPC endpoints are available so that Kinesis can be accessed from within a VPC.