Kinesis overview and Streams Flashcards
What are the 3 types of Data Collection?
- Real-Time (Immediate Actions)
- Near Real-Time (Reactive Actions)
- Batch (Historical Analysis)
What are 3 examples of Real-Time data collection?
- Kinesis Data Streams (KDS)
- Simple Queue Service (SQS)
- Internet of Things (IoT)
What are 2 examples of Near Real-Time Data Collection?
- Kinesis Data Firehose (KDF)
- Database Migration Service (DMS)
What are 2 examples of Batch Data Collection?
- Snowball
- Data Pipeline
What are the 3 Kinesis Services?
- Kinesis Streams
- Kinesis Analytics
- Kinesis Firehose
What does Kinesis Streams allow you to do?
Low latency streaming ingest at scale
What does Kinesis Analytics allow you to do?
Perform real-time analytics on streams using SQL
What does Kinesis Firehose allow you to do?
Load streams into S3, Redshift, Elasticsearch, and Splunk
How do Kinesis Streams work?
- Producers write records to shards (partitions)
- Consumers read from partitions
What is the data retention in Kinesis Streams?
- Default is 24 hours
- Can be extended up to 7 days (and up to 365 days with long-term retention)
How many applications can consume the same stream?
Multiple applications can consume the same stream
Can an application update data in a Kinesis stream?
No, the data is immutable. The stream is append-only, and each record remains until its retention period expires
How are records ordered?
They are ordered per shard
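The append-only and per-shard ordering behavior can be pictured with a toy in-memory model. This is only a sketch: `ToyShard`, `append`, and `read_from` are hypothetical names for illustration and are not part of any AWS API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical in-memory stand-in for one shard (not the AWS API)."""
    records: list = field(default_factory=list)

    def append(self, data: bytes) -> int:
        # Records are only ever appended; the returned position plays
        # the role of a sequence number in this toy model.
        self.records.append(data)
        return len(self.records) - 1

    def read_from(self, position: int) -> list:
        # Reading never removes data, so each consumer tracks its own position.
        return self.records[position:]


shard = ToyShard()
for payload in (b"first", b"second", b"third"):
    shard.append(payload)

# Two independent consumers see the same records in the same order.
consumer_a = shard.read_from(0)
consumer_b = shard.read_from(1)
print(consumer_a)
print(consumer_b)
```

Because there is no update or delete method, a record stays in place once written, and every reader of the shard observes the same order.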
What is produced to a shard?
Records are produced to shards; each record's Record Key determines which shard it lands in
What is a record made up of?
A record is made up of a Data Blob, a Record Key, and a Sequence Number
What is a Data Blob?
It is the data being sent in the stream, serialized as bytes, up to 1 MB
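As a rough illustration of "serialized as bytes", a producer might turn an event into a blob as below. This is a sketch under assumptions: the helper `to_data_blob`, the JSON encoding, and the exact byte value used for the 1 MB limit are illustrative choices, not something the service prescribes.

```python
import json

# Assumption: "1 MB" is treated as 10**6 bytes here; the service may define the limit differently.
MAX_BLOB_BYTES = 1_000_000


def to_data_blob(event: dict) -> bytes:
    """Serialize a dict to bytes and reject payloads over the assumed limit."""
    blob = json.dumps(event, separators=(",", ":")).encode("utf-8")
    if len(blob) > MAX_BLOB_BYTES:
        raise ValueError(f"blob is {len(blob)} bytes, over the {MAX_BLOB_BYTES}-byte limit")
    return blob


blob = to_data_blob({"sensor": "s-1", "temp_c": 21.5})
print(len(blob), blob)
```

Kinesis treats the blob as opaque bytes, so the consumer has to know the encoding (JSON in this sketch) to decode it.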
What is a Record Key?
It helps to group records in shards.
- Same key = same shard
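Kinesis routes a record by taking an MD5 hash of its key and picking the shard whose hash range contains the result. The sketch below mimics that idea. It assumes the hash space is split evenly across shards, and `shard_for_key` is a hypothetical helper, not an AWS call.

```python
import hashlib


def shard_for_key(record_key: str, shard_count: int) -> int:
    """Map a key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(record_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # integer in 0 .. 2**128 - 1
    range_size = 2**128 // shard_count
    # min() guards the last range when 2**128 is not divisible by shard_count.
    return min(hash_value // range_size, shard_count - 1)


first = shard_for_key("user-42", 4)
second = shard_for_key("user-42", 4)
print(first, second)  # the same key always yields the same shard index
```

The mapping is deterministic, which is why all records sharing a key stay together and keep their relative order.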
How do you avoid a “hot partition” problem?
Use a highly distributed record key
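The effect of key choice can be seen by counting which shard each record would land in. This is a sketch under assumptions: a hypothetical 4-shard stream, evenly split hash ranges, and made-up keys (`eu-west`, `device-N`).

```python
import hashlib
from collections import Counter

SHARD_COUNT = 4  # hypothetical stream size


def shard_for_key(record_key: str) -> int:
    # MD5 of the key, mapped onto evenly split hash ranges (an assumption).
    hash_value = int.from_bytes(hashlib.md5(record_key.encode("utf-8")).digest(), "big")
    return min(hash_value // (2**128 // SHARD_COUNT), SHARD_COUNT - 1)


# Low-cardinality key: every record carries the same key, so one shard takes all traffic.
hot = Counter(shard_for_key("eu-west") for _ in range(10_000))

# High-cardinality key: one key per device spreads records across shards.
spread = Counter(shard_for_key(f"device-{i}") for i in range(10_000))

print("single key:", dict(hot))
print("per-device keys:", dict(spread))
```

With a single key, one shard absorbs every write while the others sit idle. With per-device keys, the load is split roughly evenly.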
What is a Sequence Number?
It is the unique identifier Kinesis assigns to each record within its shard when the record is written
How many MBs per second can a producer write per shard?
1MB per second
How many messages per second can a producer write per shard?
1000 messages per second
What happens if I go over the 1MB/s limit or the 1000 messages/s limit?
I get the “ProvisionedThroughputExceededException”
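The two per-shard write limits can be modeled as a budget that resets every second. This is a local simulation only: `ShardWriteBudget` and `ProvisionedThroughputExceeded` are hypothetical stand-ins for the service-side throttling, and 1 MB is assumed to mean 10**6 bytes.

```python
class ProvisionedThroughputExceeded(Exception):
    """Hypothetical local stand-in for the service-side throttling error."""


class ShardWriteBudget:
    """Tracks one shard's write budget within a single one-second window."""

    def __init__(self, max_bytes: int = 1_000_000, max_records: int = 1000):
        self.max_bytes = max_bytes
        self.max_records = max_records
        self.bytes_used = 0
        self.records_used = 0

    def write(self, blob: bytes) -> None:
        # Either limit being exceeded is enough to reject the write.
        if (self.records_used + 1 > self.max_records
                or self.bytes_used + len(blob) > self.max_bytes):
            raise ProvisionedThroughputExceeded("shard write limit reached for this second")
        self.records_used += 1
        self.bytes_used += len(blob)

    def next_second(self) -> None:
        # A new window starts with a fresh budget.
        self.bytes_used = 0
        self.records_used = 0


budget = ShardWriteBudget()
for _ in range(1000):
    budget.write(b"x")  # 1000 tiny records fit within both limits

try:
    budget.write(b"x")  # the 1001st record in the same second is rejected
    throttled = False
except ProvisionedThroughputExceeded:
    throttled = True
print("throttled:", throttled)
```

Producers commonly respond to throttling by retrying with backoff, adding shards, or spreading keys more evenly.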
What is the read data limit per shard for Classic Consumers?
2 MB per second per shard, shared across all consumers
What is the API call limit per shard for Classic Consumers?
5 API calls per second per shard across all consumers
What is the read data limit per shard for Enhanced Fan-Out Consumers?
2 MB per second, per shard, per Enhanced Consumer
What is the API call limit per shard for Enhanced Fan-Out Consumers?
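There is no polling limit. It is a push model: Kinesis pushes records to each subscribed consumer over HTTP/2 (SubscribeToShard), so no GetRecords calls are needed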
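The difference between the two consumer types comes down to simple arithmetic. The sketch below assumes classic consumers split a shard's limits evenly among themselves, which real workloads may not do; the function names are hypothetical.

```python
SHARD_READ_MB_PER_S = 2.0   # per-shard read limit from the cards above
CLASSIC_CALLS_PER_S = 5     # per-shard API call limit for classic consumers


def classic_share(consumers: int):
    """Per-consumer (MB/s, polls/s) when classic consumers split one shard evenly."""
    return SHARD_READ_MB_PER_S / consumers, CLASSIC_CALLS_PER_S / consumers


def enhanced_fan_out_share(consumers: int) -> float:
    """Each enhanced fan-out consumer gets the full per-shard rate, whatever the count."""
    return SHARD_READ_MB_PER_S


mb_per_s, polls_per_s = classic_share(4)
print(mb_per_s, polls_per_s)      # 0.5 1.25
print(enhanced_fan_out_share(4))  # 2.0
```

With four classic consumers on one shard, each gets about 0.5 MB/s and can poll a little more than once per second. Four enhanced fan-out consumers each still read at 2 MB/s.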
What is the default Data Retention for Kinesis Streams?
24 hours
What can the Data Retention for Kinesis Streams be extended to?
7 days (up to 365 days with long-term retention)