Kinesis Flashcards
_____ is a platform on AWS to send your streaming data to.
kinesis
____ makes it easy to load and analyze streaming data, and also provides the ability to build your own custom apps for your business needs.
kinesis
______ data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of KB).
streaming
Which of these are types of streaming data?
- purchases online
- stock prices
- game data (as the game plays)
- social network data
- geospatial data (think Uber)
- IoT sensor data
all of the above
the 3 kinesis services are:
kinesis ____
kinesis ____
kinesis ____
streams, firehose, analytics
kinesis streams consist of ____
shards
Each shard supports up to ___ transactions/second for reads, up to a max total data read rate of ___ MB/second, and up to _____ records/second for writes, up to a max total data write rate of __ MB/second (including partition keys).
5,
2,
1,000,
1
The data capacity of the stream is a function of the number of _____ that you specify for the stream. The total capacity of the stream is the sum of the capacities of its _____.
shards
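The capacity arithmetic can be sketched in a few lines of Python. This is only an illustration that uses the per-shard figures quoted in these cards; the function names and the example workload numbers are hypothetical, and current AWS quotas should be checked before sizing a real stream.

```python
import math

# Per-shard limits as quoted in the cards (assumed still current).
READ_TPS_PER_SHARD = 5
READ_MB_PER_SHARD = 2
WRITE_RECORDS_PER_SHARD = 1000
WRITE_MB_PER_SHARD = 1


def stream_capacity(shards):
    """Total stream capacity: each per-shard limit times the shard count."""
    return {
        "read_tps": shards * READ_TPS_PER_SHARD,
        "read_mb_per_s": shards * READ_MB_PER_SHARD,
        "write_records_per_s": shards * WRITE_RECORDS_PER_SHARD,
        "write_mb_per_s": shards * WRITE_MB_PER_SHARD,
    }


def shards_needed(write_mb_per_s, write_records_per_s, read_mb_per_s):
    """Smallest shard count that covers all three workload figures."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(write_records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )


if __name__ == "__main__":
    print(stream_capacity(4))
    # Hypothetical workload: 3.5 MB/s in, 2,500 records/s in, 6 MB/s out.
    print(shards_needed(3.5, 2500, 6))
```

The write data rate is the binding limit in the hypothetical workload, so it alone determines the shard count.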
Kinesis Firehose has shards
T or F
False, only Streams has shards
Firehose is completely automated
T or F
T
kinesis consumers
kinesis client library runs on the consumer instances
tracks the number of shards in your stream
discovers new shards when you reshard
Kinesis client library
the KCL ensures that for every shard there is a record processor.
the KCL manages the number of record processors relative to the number of shards and consumers.
if you have only one consumer, the KCL will create all the record processors on a single consumer.
if you have two consumers it will load balance and create half the processors on one instance and half on another.
yes
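The balancing behaviour in this card can be pictured with a small sketch. This is not the KCL's actual mechanism, only a round-robin illustration of "one record processor per shard, spread evenly across consumers"; the function name, shard IDs, and consumer names are hypothetical.

```python
def assign_record_processors(shard_ids, consumers):
    """Give every shard one record processor, spread round-robin over consumers."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for index, shard_id in enumerate(shard_ids):
        # Cycle through the consumers so the shards are split evenly.
        consumer = consumers[index % len(consumers)]
        assignment[consumer].append(shard_id)
    return assignment


if __name__ == "__main__":
    shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
    # One consumer: it hosts all four record processors.
    print(assign_record_processors(shards, ["consumer-a"]))
    # Two consumers: two record processors each.
    print(assign_record_processors(shards, ["consumer-a", "consumer-b"]))
```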
with KCL, generally you should ensure that the number of instances does not exceed the number of shards (except for failure or standby purposes)
- you never need multiple instances to handle the processing load of one shard.
- however, one worker can process multiple shards
yes
it’s fine if the number of shards exceeds the number of instances
don’t think that just because you reshard, you need to add more instances.
instead, CPU utilization is what should drive the quantity of consumer instances you have, NOT the number of shards in your Kinesis stream.
- use an Auto Scaling group, and base scaling decisions on CPU load on your consumers
yes
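The sizing rule from the last few cards can be sketched as a small decision function: CPU load drives the instance count, and the shard count only acts as a ceiling. The 70% and 30% thresholds, the one-instance step, and the function name are assumptions for illustration, not AWS defaults. In practice an Auto Scaling policy would make this decision, and any standby instances kept for failover would sit above the cap.

```python
def desired_consumer_count(current, avg_cpu_percent, shard_count,
                           scale_out_at=70.0, scale_in_at=30.0):
    """Choose an instance count from CPU load, capped at the shard count."""
    if avg_cpu_percent > scale_out_at:
        target = current + 1   # consumers are busy: add an instance
    elif avg_cpu_percent < scale_in_at:
        target = current - 1   # consumers are idle: remove an instance
    else:
        target = current       # load is in the comfortable band
    # Never fewer than one instance, never more instances than shards.
    return max(1, min(target, shard_count))


if __name__ == "__main__":
    print(desired_consumer_count(current=2, avg_cpu_percent=85, shard_count=4))
    print(desired_consumer_count(current=4, avg_cpu_percent=85, shard_count=4))
    print(desired_consumer_count(current=2, avg_cpu_percent=50, shard_count=8))
```

Note that in the third call the stream has eight shards but the count stays at two, because resharding alone is not a reason to add instances.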