Data Analytics - Collection Flashcards
What are the three types of collection?
Real Time (immediate actions)
Near Real Time (reactive actions)
Batch (historical analysis)
What services can be used for real time data collection?
Kinesis Data Streams (KDS)
Simple Queue Service (SQS)
AWS IoT Core (IoT)
What services can be used for near real time data collection?
Kinesis Data Firehose (KDF)
Database Migration Service (DMS)
What services can be used for batch data collection?
Snowball
Data Pipeline
Kinesis Data Streams are made up of _________
Shards
What are the two parts of an incoming record in Kinesis Data Streams?
Partition key
Data blob
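Why the partition key matters: Kinesis Data Streams takes the MD5 hash of the partition key as a 128-bit integer and sends the record to the shard whose hash key range contains that value. A minimal sketch of the idea, assuming the shards split the hash space evenly (a resharded stream may not); `shard_for_key` is a hypothetical helper, not an AWS API:

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would receive this partition key.

    Assumption: the hash space is divided into shard_count equal ranges.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so hashes in the leftover top slice still map to the last shard.
    return min(hash_value // range_size, shard_count - 1)
```

Records with the same partition key always land on the same shard, which is what preserves their relative order.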
How large can a data blob be in Kinesis Data Streams?
Up to 1 MB
Who sends records to Kinesis Data Streams?
Producers
How fast can records be written into Kinesis Data Streams?
1 MB/sec or 1,000 records/sec per shard
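The per-shard write limit turns into a quick sizing calculation: whichever of the two limits is hit first decides the shard count. A rough sketch using only the figures on this card; `shards_needed_for_writes` is a hypothetical helper and ignores headroom for traffic spikes:

```python
import math

MB_PER_SEC_PER_SHARD = 1.0        # write limit by volume
RECORDS_PER_SEC_PER_SHARD = 1000  # write limit by record count


def shards_needed_for_writes(mb_per_sec: float, records_per_sec: float) -> int:
    """Return the minimum shard count that satisfies both write limits."""
    by_volume = math.ceil(mb_per_sec / MB_PER_SEC_PER_SHARD)
    by_count = math.ceil(records_per_sec / RECORDS_PER_SEC_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_volume, by_count)
```

For example, 3.5 MB/sec at 2,000 records/sec is bound by volume and needs 4 shards, while 0.5 MB/sec at 4,500 records/sec is bound by record count and needs 5.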
Who consumes the data from Kinesis Data Streams?
Consumers
What are the three parts of an outgoing record in Kinesis Data Streams?
Partition key
Sequence number
Data blob
How fast can records be consumed from Kinesis Data Streams?
2 MB/sec per shard, shared across all consumers (shared)
2 MB/sec per shard, per consumer (enhanced fan-out)
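The difference between the two modes shows up once several consumers read the same stream. A small sketch of the upper bound each consumer can expect, assuming shared consumers split the per-shard limit evenly; `read_mb_per_sec_per_consumer` is a hypothetical helper:

```python
READ_MB_PER_SEC_PER_SHARD = 2.0


def read_mb_per_sec_per_consumer(shards: int, consumers: int, enhanced: bool) -> float:
    """Return the maximum read throughput available to each consumer."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be at least 1")
    if enhanced:
        # Enhanced fan-out: every consumer gets the full limit on each shard.
        per_shard = READ_MB_PER_SEC_PER_SHARD
    else:
        # Shared: all consumers divide the limit on each shard.
        per_shard = READ_MB_PER_SEC_PER_SHARD / consumers
    return shards * per_shard
```

With 4 shards and 2 consumers, each consumer can read up to 4 MB/sec in shared mode and up to 8 MB/sec with enhanced fan-out.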
What is the retention period in Kinesis Data Streams?
Between 1 day and 365 days
Kinesis Data Streams can replay / reprocess data (T/F)
True
Once data is inserted into Kinesis Data Streams, it can be deleted (T/F)
False