[DEVELOPER] Advanced Kinesis Flashcards
What is the retention period for Kinesis Data Streams?
1 to 365 days (the default is 24 hours)
How can you delete data in Kinesis Data Streams without processing it?
You can’t. Records in Kinesis Data Streams are immutable and are removed only when the retention period expires.
How can you ensure record ordering in Kinesis Data Streams?
Use the same partition key. Records that share the same partition key go to the same shard.
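To illustrate why a shared partition key keeps records together: Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below is a minimal model of that routing, assuming the shards split the hash space evenly (real streams may not, especially after splits and merges). The function name `shard_for_key` is hypothetical.

```python
import hashlib

MAX_HASH = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard a partition key would land on.

    Assumption: the hash key space is divided evenly across shards.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    hash_value = int(digest, 16)
    range_size = MAX_HASH // shard_count
    # Clamp so that rounding in range_size never yields an out-of-range index.
    return min(hash_value // range_size, shard_count - 1)


# The same key always maps to the same shard, which preserves ordering per key.
first = shard_for_key("user-42", 4)
second = shard_for_key("user-42", 4)
print(first == second)
```

Because the mapping depends only on the key, ordering is guaranteed per partition key, not across the whole stream.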
What is the I/O performance of Kinesis Data Streams Provisioned Capacity Mode?
Each shard gets:
- 1 MB/s in (or 1000 records per second)
- 2 MB/s out
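The per-shard limits above can be turned into a rough sizing calculation. This is a sketch only: the function name `shards_needed` and the example workload figures are hypothetical, and it assumes standard (shared) consumers, so the 2 MB/s read limit applies to the shard as a whole.

```python
import math

WRITE_MB_PER_SHARD = 1.0      # 1 MB/s in per shard
RECORDS_PER_SHARD = 1000      # or 1000 records per second in per shard
READ_MB_PER_SHARD = 2.0       # 2 MB/s out per shard


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )


# Hypothetical workload: 3.5 MB/s in, 2000 records/s, 5 MB/s out.
print(shards_needed(3.5, 2000, 5))
```

Whichever limit is hit first determines the shard count, so a stream of many small records can need more shards than its byte volume alone suggests.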
What is the pricing model for Kinesis Data Streams Provisioned Capacity Mode?
You pay per shard provisioned per hour
What info does Kinesis Data Streams On-Demand Capacity Mode use to determine how it sets the capacity?
It scales automatically based on the peak throughput observed during the last 30 days.
The default capacity is 4 MB/s in (or 4,000 records per second).
What is the pricing model for Kinesis Data Streams On-Demand Capacity Mode?
Pay per stream per hour & data in/out per GB
In what use case would you use Kinesis Data Streams Provisioned Mode over On-Demand Mode?
When you know your capacity ahead of time.
Suppose you want to send streaming data to Kinesis from within a VPC, but you don’t want to go through the internet. How can you accomplish this?
Use a VPC endpoint for Kinesis. It lets resources inside the VPC reach Kinesis privately, without traversing the internet.
How does encryption work for Kinesis Data Streams?
Encryption at rest using KMS
Encryption in flight using HTTPS
What is the API Call for a producer to send a record to Kinesis Data Streams?
PutRecord
You are using Kinesis Data Streams with multiple producers and multiple shards and repeatedly get ProvisionedThroughputExceeded errors on an individual shard. What can you do to address the problem?
- Use highly distributed partition keys; a hot partition may be receiving too many records
- Implement exponential backoff with retries
- Increase the number of shards (shard splitting)
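The retry advice in this card can be sketched as follows. This is a minimal illustration, not a real client: `ThroughputExceeded` stands in for the throttling error an SDK would raise, `send` is any hypothetical callable that writes one record, and the added random jitter is a common refinement that is assumed here rather than taken from the card.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for the throttling error raised by a real client."""


def put_with_backoff(send, max_attempts=5, base_delay=0.1, max_delay=5.0,
                     sleep=time.sleep):
    """Call send(), retrying with exponential backoff when throttled."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay ceiling doubles each attempt: 0.1 s, 0.2 s, 0.4 s, ...
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries so producers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Backoff only smooths over short bursts. If one shard is throttled constantly, better key distribution or more shards is the lasting fix.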
You are using Kinesis Data Streams with 4 consumers all reading from the same shard in the Shared (Classic) Fan-out consumer pattern. What is the read throughput of each consumer?
0.5 MB/s
Shared (classic) fan-out provides 2 MB/s per shard, divided across all consumers
You are using Kinesis Data Streams with 4 consumers all reading from the same shard in the Enhanced Fan-out consumer pattern. What is the read throughput of each consumer?
2 MB/s
Enhanced fan-out provides 2 MB/s per shard per consumer
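The arithmetic behind the two fan-out cards above can be written out as a small sketch. The function name `per_consumer_read_mb` is hypothetical, and it assumes the shared 2 MB/s is divided evenly among consumers, which is the best case.

```python
SHARD_READ_MB_PER_S = 2.0  # read limit per shard


def per_consumer_read_mb(consumers, enhanced):
    """Read throughput each consumer can expect from a single shard."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    if enhanced:
        # Enhanced fan-out: every consumer gets its own 2 MB/s.
        return SHARD_READ_MB_PER_S
    # Shared fan-out: all consumers split the same 2 MB/s.
    return SHARD_READ_MB_PER_S / consumers


print(per_consumer_read_mb(4, enhanced=False))  # 0.5
print(per_consumer_read_mb(4, enhanced=True))   # 2.0
```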
How is data transferred from shard to consumer in the Kinesis Data Streams Standard (Classic) Fan-out consumer pattern?
Consumers poll (pull) data from Kinesis using the GetRecords API call
How is data transferred from shard to consumer in the Kinesis Data Streams Enhanced Fan-out consumer pattern?
Consumers use the SubscribeToShard API, and Kinesis pushes data to consumers over HTTP/2
When would you prefer the Enhanced Fan-Out Consumer pattern over the Standard Fan-Out Consumer pattern for Kinesis Data Streams?
Enhanced fan-out is better for:
- Many consuming applications reading from the same shard
- Lower latency (70 ms vs. 200 ms for standard)
What is the default limit for the number of consumer applications for Kinesis Data Streams Enhanced Fan-Out Consumer pattern?
5 consumers per stream BUT you can raise this with an AWS support ticket.
Does the Standard Fan-Out pattern for Kinesis Data Streams support Lambda consumers?
Yes
Does the Enhanced Fan-Out pattern for Kinesis Data Streams support Lambda consumers?
Yes
Does the Enhanced Fan-Out pattern for Kinesis Data Streams support batch reads for Lambda consumers?
Yes
What does KCL stand for?
Kinesis Client Library
You are using KCL to read from a Kinesis Data Stream with 4 shards into DynamoDB. What is the maximum number of KCL instances you can use?
4 (the same as the number of shards). Each shard is read by only one KCL instance at a time, so any extra instances would sit idle.
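A toy model makes the limit concrete: since each shard is handled by one worker at a time, workers beyond the shard count get nothing to do. The round-robin assignment below is an assumption for illustration only. The real KCL coordinates shard ownership through leases, and `assign_shards` is a hypothetical name.

```python
def assign_shards(shard_ids, worker_ids):
    """Give each shard to exactly one worker, round-robin (illustrative only)."""
    if not worker_ids:
        raise ValueError("need at least one worker")
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(shard_ids):
        owner = worker_ids[index % len(worker_ids)]
        assignment[owner].append(shard)
    return assignment


shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
workers = ["w1", "w2", "w3", "w4", "w5", "w6"]
idle = [w for w, owned in assign_shards(shards, workers).items() if not owned]
print(idle)  # ['w5', 'w6']
```

With fewer workers than shards, each worker simply handles several shards, so scaling down is always possible but scaling up stops at the shard count.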
In the context of Kinesis Data Streams, what is shard splitting?
A method to increase KDS capacity (and cost) by splitting traffic to a shard into 2 new shards.
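A split can be pictured as cutting the parent shard's hash key range into two child ranges. The sketch below cuts at the midpoint, which is a common choice for an even split but an assumption here, since the split point can be placed elsewhere to isolate a hot key. The function name `split_hash_range` is hypothetical.

```python
def split_hash_range(start, end):
    """Split an inclusive hash key range into two adjacent child ranges."""
    if end <= start:
        raise ValueError("range must contain at least two hash keys")
    mid = (start + end) // 2
    # The first child ends at mid and the second starts right after it,
    # so every hash key belongs to exactly one child.
    return (start, mid), (mid + 1, end)


# A single shard covering the whole 128-bit hash space, split in two.
low, high = split_hash_range(0, 2 ** 128 - 1)
print(low[1] + 1 == high[0])  # True: the children are contiguous
```

Each child shard gets its own 1 MB/s in and 2 MB/s out, which is why splitting raises both capacity and cost.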