Kinesis Flashcards
This deck aims to help retain concepts related to the Kinesis service.
Which AWS serverless streaming data service simplifies capturing, processing, and storing data streams at any scale?
Kinesis Data Streams - public, regional, and highly available AWS service
What is the minimum and maximum retention period for data in Kinesis Data Streams?
By default, Kinesis Data Streams retains data for 24 hours (the minimum); retention can be extended up to 365 days (the maximum) for an additional cost
When using Kinesis Data Streams, is it possible to configure more than one producer and consumer?
Yes, Kinesis Data Streams supports multiple producers for data ingestion and multiple consumers for data reading
Which Kinesis Data Streams component is responsible for scaling?
Shards
Which Kinesis Data Streams components can impact pricing?
The number of shards and the configured data retention window (24 hours to 365 days)
Where is data stored in Kinesis Data Streams?
Data is stored in Kinesis Data Records, with each record up to 1 MB in size
Which AWS service is ideal for large-scale data ingestion by numerous producers, with multiple consumers processing data at varying rates, for use cases such as analytics and monitoring application clicks?
Amazon Kinesis Data Streams
Which AWS service is used to transfer data from Kinesis Data Streams to other AWS services?
Amazon Kinesis Data Firehose
How can Kinesis Data Streams performance be improved?
By increasing the number of shards. Each shard supports up to 1 MB/s for ingestion and 2 MB/s for consumption
Which AWS serverless service is used to capture, transform, and load large volumes of streaming data from hundreds of thousands of sources into AWS services like S3, Redshift, or OpenSearch?
Kinesis Data Firehose
What is the primary purpose of Kinesis Data Firehose?
To load data into data lakes, storage solutions, and analytics services
Is Kinesis Data Firehose a real-time service?
No, it operates in near-real-time, typically delivering data within ~60 seconds
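The near-real-time behaviour comes from buffering: records are collected and delivered as a batch once a size or time threshold is reached. The toy class below is only a sketch of that idea, not Firehose's implementation. `DeliveryBuffer`, its thresholds, and the explicit `now` argument are hypothetical names chosen for illustration.

```python
class DeliveryBuffer:
    """Toy buffer that flushes on a size or age threshold, whichever comes first."""

    def __init__(self, max_bytes=1_000_000, max_seconds=60):
        # Illustrative thresholds only; real limits are configured per delivery stream.
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.records = []
        self.size = 0
        self.opened_at = None

    def add(self, record: bytes, now: float):
        """Buffer a record; return the flushed batch if a threshold is hit, else None.

        Simplification: the age check only runs when a record arrives,
        whereas a real service would also flush on a timer.
        """
        if self.opened_at is None:
            self.opened_at = now
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes or now - self.opened_at >= self.max_seconds:
            batch = self.records
            self.records, self.size, self.opened_at = [], 0, None
            return batch
        return None


buf = DeliveryBuffer(max_bytes=1_000, max_seconds=60)
first = buf.add(b"click-1", now=0)    # buffered, nothing delivered yet
second = buf.add(b"click-2", now=30)  # still buffered
third = buf.add(b"click-3", now=61)   # buffer is older than 60 s, so it flushes
```

A consumer of the destination therefore sees data in batches, delayed by up to the buffering interval.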
Can Kinesis Data Firehose perform on-the-fly data transformations?
Yes, it can use AWS Lambda for transformations, though this may introduce some latency
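A transformation Lambda receives a batch of base64-encoded records and returns each one with the same `recordId`, a `result` status, and the re-encoded `data`. The handler below is a minimal sketch of that contract; the `message` field and the uppercase transformation are assumptions made up for the example.

```python
import base64
import json


def lambda_handler(event, context):
    """Uppercase a hypothetical 'message' field in every incoming record."""
    output = []
    for record in event["records"]:
        # Each payload arrives base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]))
        payload["message"] = str(payload.get("message", "")).upper()
        transformed = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],  # must echo the incoming ID
            "result": "Ok",                  # alternatives: "Dropped", "ProcessingFailed"
            "data": base64.b64encode(transformed).decode("utf-8"),
        })
    return {"records": output}
```

The function can be exercised locally by passing a dictionary shaped like `{"records": [{"recordId": "1", "data": "<base64>"}]}` and `None` for the context.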
How is billing calculated for Kinesis Data Firehose?
It is based on the volume of data processed through the service
What are common destinations for Kinesis Data Firehose?
- S3
- Redshift
- OpenSearch
- HTTP Endpoints
- Datadog
- Splunk
- Elasticsearch
How does scaling differ between Kinesis Data Firehose and Kinesis Data Streams?
- Kinesis Data Firehose scales automatically
- Kinesis Data Streams in provisioned mode requires manual scaling by adjusting the number of shards (on-demand mode scales automatically)
What are the primary use cases for Kinesis Data Firehose?
- Persisting data from Kinesis Data Streams
- Transforming and storing data in different formats
- Delivering data to supported destinations
How do replay capabilities differ between Kinesis Data Firehose and Kinesis Data Streams?
- Kinesis Data Firehose doesn’t support replay
- Kinesis Data Streams allows data replay
What distinguishes Kinesis Data Firehose and Kinesis Data Streams in terms of data persistence?
- Kinesis Data Firehose doesn’t support data persistence
- Kinesis Data Streams can retain data from 24 hours to 365 days
How do consumers differ between Kinesis Data Firehose and Kinesis Data Streams?
- Kinesis Data Firehose is closed-ended with a single destination
- Kinesis Data Streams is open-ended, supporting multiple producers and consumers
Can unmodified data be saved when using Lambda for transformation with Kinesis Data Firehose?
Yes, unmodified data can optionally be delivered to S3
Which service is suitable if Kinesis Data Streams features are unnecessary?
Kinesis Data Firehose
What AWS service combination supports real-time data transformation?
Kinesis Data Streams with Lambda
Which AWS service enables real-time analysis of streaming data for actionable insights?
Kinesis Data Analytics
Which AWS service allows real-time transformations, filtering, and enrichment of streaming data using SQL?
Kinesis Data Analytics
Is Kinesis Data Analytics a real-time service?
Yes, it processes data in real-time
What are the main use cases for Kinesis Data Analytics?
- Processing streaming data in real-time with SQL queries (e.g., time-series data, dashboards, or security metrics)
- Performing complex data manipulations in real-time
What data sources are supported by Kinesis Data Analytics?
- Kinesis Data Streams
- Kinesis Data Firehose
- S3 for static reference data
What are the supported destinations for Kinesis Data Analytics?
- Kinesis Data Streams (real-time)
- Kinesis Data Firehose (near real-time)
How is data retrieved by consumers from Kinesis Video Streams?
Consumers retrieve data frame-by-frame for further analysis
What are typical producers for Kinesis Video Streams?
Devices such as security cameras, smartphones, cars, drones, and sources of time-serialized data like thermal, depth, and RADAR streams
Which AWS service automatically provisions and scales infrastructure for ingesting live video streams from millions of devices?
Kinesis Video Streams
Does Kinesis Video Streams integrate with other AWS services?
Yes, it integrates with a range of AWS services and is commonly used with Rekognition for facial recognition and S3 for data storage
Can Kinesis Video Streams handle secure data persistence?
Yes, it securely persists data, with encryption both in transit and at rest
Can data captured by Kinesis Video Streams be accessed directly from storage?
No, data is indexed and structured, so it can only be accessed via APIs
How would you configure Kinesis Video Streams for two IP cameras?
Each camera streams data to its own video stream
When dealing with live video streams, Real-Time Streaming Protocol (RTSP), or GStreamer, which AWS service comes to mind?
Kinesis Video Streams
Which AWS service supports large-scale data ingestion by multiple producers and consumption by multiple consumers in REAL-TIME?
Kinesis Data Streams
Which AWS managed service offers streaming data delivery and transformation capabilities in near-real-time?
Kinesis Data Firehose
Which service processes streaming data using SQL in real-time?
Kinesis Data Analytics
Can the order of data be guaranteed across multiple shards in AWS Kinesis Data Streams?
No, data order can only be guaranteed within a single shard
What is the formula to calculate the initial number of shards for AWS Kinesis Data Streams?
num_of_shards = max(incoming_write_bandwidth_in_KiB / 1024, outgoing_read_bandwidth_in_KiB / 2048)
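As a worked example, the formula can be wrapped in a small helper. This is a sketch: the function name is hypothetical, and rounding up to a whole shard with a floor of one shard is an assumption added here, since a stream cannot have a fractional number of shards.

```python
import math


def initial_shard_count(write_kib_per_sec: float, read_kib_per_sec: float) -> int:
    """Apply the shard formula, rounding up to a whole number of shards."""
    by_write = write_kib_per_sec / 1024  # each shard ingests up to 1 MiB/s
    by_read = read_kib_per_sec / 2048    # each shard serves up to 2 MiB/s
    # Assumption: round up and never return fewer than one shard.
    return max(1, math.ceil(max(by_write, by_read)))


# 5000 KiB/s in needs about 4.9 shards; 20000 KiB/s out needs about 9.8 shards.
print(initial_shard_count(5000, 20000))  # 10
```

In this case the read side is the bottleneck, so it determines the shard count.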
What is the purpose of the partition key in AWS Kinesis Data Streams?
The partition key is used to distribute data records across multiple shards in a stream
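Kinesis Data Streams hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below imitates this under the assumption that the shards split the hash space evenly; `shard_for_key` is a hypothetical helper, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    # Assumption: shards own equal, contiguous slices of the hash space.
    range_size = HASH_SPACE // shard_count
    # Clamp so the remainder at the top of the space falls in the last shard.
    return min(hash_value // range_size, shard_count - 1)


for key in ["user-1", "user-2", "user-1"]:
    print(key, "->", shard_for_key(key, 4))
```

The same key always maps to the same shard, which is why ordering can be relied on per partition key but not across shards.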
What stand-alone Java software application offers an easy way to collect and send data to Kinesis Data Streams?
Kinesis Agent
How does the Kinesis Client Library (KCL) manage shard processing among multiple EC2 instances?
Each shard is assigned to exactly one worker via a lease, so with 10 shards a maximum of 10 EC2 instances (workers) can process data at the same time
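The one-lease-per-shard rule can be illustrated with a simplified round-robin assignment. This is only a sketch of the constraint, not the KCL's actual lease-balancing logic, and `assign_leases` plus the shard and worker names are hypothetical.

```python
def assign_leases(shard_ids, worker_ids):
    """Give each shard to exactly one worker, round-robin."""
    assignments = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(shard_ids):
        worker = worker_ids[index % len(worker_ids)]
        assignments[worker].append(shard)
    return assignments


shards = [f"shard-{i}" for i in range(10)]
workers = [f"worker-{i}" for i in range(12)]
leases = assign_leases(shards, workers)

# Workers beyond the shard count hold no lease and sit idle.
idle = [worker for worker, held in leases.items() if not held]
print(idle)  # ['worker-10', 'worker-11']
```

Adding more workers than shards therefore adds no throughput; the shard count has to grow first.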