Scaling - Kinesis Flashcards
What is Kinesis?
a managed AWS platform that eases the collection, processing, and analysis of streaming data and video in real time
What problem does Kinesis Solve?
removes the need to provision, deploy, and maintain the hardware and software behind data streams; collects data in real time (such as video and audio) for machine learning, analytics, and further processing
What are the kinesis components?
producer (any source that sends data to Kinesis), Kinesis video stream (transport, storage, and consumption of live video data), and consumer (retrieves data to view, process, and analyze it)
What is a kinesis producer?
any source that puts data into a kinesis data stream
What is a kinesis video stream?
lets you securely transport live video data to AWS, store it, and make it available for consumption, either in real time or ad hoc
What is a kinesis consumer?
retrieves data such as frames and fragments from a Kinesis video stream to view, process or analyze it
What does kinesis video streams provide?
APIs for creating and managing streams and for writing and reading media data to and from a stream; a console that supports live and on-demand playback
What is a shard?
the base throughput unit of a Kinesis data stream; a sequence of data records that provides a fixed unit of capacity; each shard supports up to 1,000 PUT records per second, 1 MB/s of data input, and 2 MB/s of data output
How long are streams stored?
24 hours by default; extended retention up to 7 days and long-term retention up to 365 days
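The retention window above can be sketched as a small range check. This is only an illustration: the helper name `retention_hours` is hypothetical, and expressing the value in hours is an assumption about how a retention period would be passed to a stream configuration call.

```python
# Bounds taken from the card above: 24 hours by default, up to 365 days.
MIN_RETENTION_HOURS = 24
MAX_RETENTION_HOURS = 365 * 24  # 8,760 hours


def retention_hours(days: int) -> int:
    """Convert a retention period in days to hours, rejecting values
    outside the 1-day to 365-day window (hypothetical helper)."""
    hours = days * 24
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError(f"retention must be between 1 and 365 days, got {days}")
    return hours


if __name__ == "__main__":
    print(retention_hours(7))  # extended retention expressed in hours
```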
What are Kinesis streams?
streams are made up of shards
How fast are shards processed?
reads: up to 5 read transactions per second per shard, with a maximum total read rate of 2 MB/s; writes: up to 1,000 records per second per shard, up to a total of 1 MB/s
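A minimal sketch of how these per-shard limits might be checked against an expected load. The `ShardLoad` class and `exceeded_limits` function are hypothetical names introduced for illustration; only the four limit values come from the cards.

```python
from dataclasses import dataclass
from typing import List

# Per-shard limits as stated in the cards.
MAX_WRITE_RECORDS_PER_SEC = 1000
MAX_WRITE_MB_PER_SEC = 1.0
MAX_READ_CALLS_PER_SEC = 5
MAX_READ_MB_PER_SEC = 2.0


@dataclass
class ShardLoad:
    """Hypothetical description of the traffic hitting a single shard."""
    write_records_per_sec: float
    write_mb_per_sec: float
    read_calls_per_sec: float
    read_mb_per_sec: float


def exceeded_limits(load: ShardLoad) -> List[str]:
    """Return the names of the per-shard limits that this load exceeds."""
    exceeded = []
    if load.write_records_per_sec > MAX_WRITE_RECORDS_PER_SEC:
        exceeded.append("write records/s")
    if load.write_mb_per_sec > MAX_WRITE_MB_PER_SEC:
        exceeded.append("write MB/s")
    if load.read_calls_per_sec > MAX_READ_CALLS_PER_SEC:
        exceeded.append("read calls/s")
    if load.read_mb_per_sec > MAX_READ_MB_PER_SEC:
        exceeded.append("read MB/s")
    return exceeded


if __name__ == "__main__":
    # 1,200 records/s is over the write limit; everything else fits.
    print(exceeded_limits(ShardLoad(1200, 0.8, 3, 1.5)))
```

A load that sits exactly at a limit is treated as allowed here; a load that exceeds any one of the four limits would need more shards.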
How can capacity be increased?
by increasing the number of shards
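As a rough sizing sketch, the number of shards can be estimated from the per-shard limits in this deck (1 MB/s in, 1,000 records/s in, 2 MB/s out). The function name `shards_needed` and its parameters are hypothetical, and the estimate assumes traffic is spread evenly across shards.

```python
import math

# Per-shard capacity as stated in the cards.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec: float,
                  write_records_per_sec: float,
                  read_mb_per_sec: float) -> int:
    """Estimate the shard count as the largest of the three per-limit
    requirements, with a minimum of one shard."""
    by_write_bytes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD)
    by_write_records = math.ceil(write_records_per_sec / WRITE_RECORDS_PER_SHARD)
    by_read_bytes = math.ceil(read_mb_per_sec / READ_MB_PER_SHARD)
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


if __name__ == "__main__":
    # 3.5 MB/s in -> 4 shards, 2,500 records/s -> 3 shards, 9 MB/s out -> 5 shards
    print(shards_needed(3.5, 2500, 9))
```

The tightest limit decides the result: in the example, read throughput drives the count even though the write side would fit in four shards.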
What is Kinesis Data Firehose?
captures, transforms, and loads data streams into AWS data stores to enable near-real-time analytics with BI tools; no shards or streams to manage; data can be saved to destinations such as S3 and Redshift
What is Kinesis Data Analytics?
analyzes, queries, and transforms streamed data in real time using standard SQL; data is ingested into Kinesis streams and Kinesis Data Analytics, SQL queries run on that data, and the results are then moved to destinations such as Redshift or S3
What are the limits to shards and streams?
no upper limit on the number of shards in a stream or on the number of streams; a single shard can ingest up to 1 MB/s (1,000 records per second) and supports up to 5 reads per second; the default shard limit is 500 in certain AWS Regions and 200 in the others