Data Processing Services Flashcards
Amazon Kinesis
collect, buffer, process, and analyze real-time, streaming data
How is data processed in Kinesis Data Streams?
How much data can be processed?
Data is processed in “shards”
Up to 1,000 records per second per shard for writes
GBs of data per second from thousands of sources
What does a Kinesis record consist of?
a partition key
sequence number
data blob (up to 1 MB)
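The three parts of a record can be sketched in a few lines of Python. This is only an illustration, not the Kinesis API: the `Record` class and `shard_for_key` helper are hypothetical names, the sequence number is modeled as a plain integer, 1 MB is taken as 1024 * 1024 bytes, and the shards are assumed to split the hash key space evenly. Kinesis hashes the partition key with MD5 into a 128-bit value and routes the record to the shard whose hash range contains it.

```python
import hashlib
from dataclasses import dataclass

HASH_SPACE = 2 ** 128          # MD5 produces a 128-bit hash key
MAX_BLOB_BYTES = 1024 * 1024   # assumption: "1 MB" taken as 1 MiB


@dataclass
class Record:
    partition_key: str     # chosen by the producer, decides the shard
    sequence_number: int   # assigned by the stream on write (modeled as int here)
    data: bytes            # opaque data blob

    def __post_init__(self):
        if len(self.data) > MAX_BLOB_BYTES:
            raise ValueError("data blob exceeds 1 MB")


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains the key.

    Assumes the hash key space is divided evenly across shard_count shards.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE
```

Records that share a partition key always land on the same shard, which is what makes per-key ordering possible.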
Does Kinesis Data Streams Store Data? If so, how? If not, why?
Transient data store: default retention of 24 hours, configurable up to 7 days (long-term retention, added later, extends this to 365 days).
What are the types of Kinesis services?
Kinesis Video Streams
Kinesis Data Streams
Kinesis Data Firehose
Kinesis Data Analytics
Kinesis Video Streams
Durably stores, encrypts, and indexes video data streams, and allows access to data through APIs
Supports encryption at rest with server-side encryption (KMS) with a customer master key
Kinesis Data Streams - max read rate and max write rate per shard?
Reads: 5 transactions per second, up to a maximum read rate of 2 MB per second. Writes: 1,000 records per second, up to a maximum write rate of 1 MB per second.
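The read and write limits quoted above can be turned into a rough shard-count estimate. This is a back-of-the-envelope sketch: `shards_needed` is a hypothetical helper, 1 MB is treated as 10^6 bytes, and every consumer is assumed to share the 2 MB per second read limit of a shard.

```python
import math

WRITE_RECORDS_PER_SEC = 1000     # per shard
WRITE_BYTES_PER_SEC = 1_000_000  # per shard, 1 MB/s taken as 10^6 bytes
READ_BYTES_PER_SEC = 2_000_000   # per shard, shared by all consumers (assumption)


def shards_needed(records_per_sec: int, avg_record_bytes: int, consumers: int = 1) -> int:
    """Estimate the shard count that satisfies all three per-shard limits."""
    write_bytes = records_per_sec * avg_record_bytes
    by_record_count = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_write_bytes = math.ceil(write_bytes / WRITE_BYTES_PER_SEC)
    by_read_bytes = math.ceil(write_bytes * consumers / READ_BYTES_PER_SEC)
    return max(1, by_record_count, by_write_bytes, by_read_bytes)


# 5,000 small records per second: the record-count limit dominates
print(shards_needed(5000, 500))               # 5
# Fewer but larger records read by three consumers: the read limit dominates
print(shards_needed(1000, 1000, consumers=3))  # 2
```

Whichever limit is hit first determines the shard count, so small records are usually bound by the 1,000 records per second limit and large records by the byte limits.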
Kinesis Data Streams
enables real-time processing of streaming data
stores data for later processing by applications
ingest/collect and process large streams of data records in real time at large scales
Amazon Kinesis Data Firehose
Fully managed service for loading streaming data into AWS data stores; it scales automatically to match the throughput of your data and requires no ongoing administration
What data operations can be performed on data in Amazon Kinesis Data Firehose?
batch, compress, transform, and encrypt the data
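As a rough picture of what batch, transform, and compress mean in practice, the sketch below does the same three steps locally with the Python standard library. It is not the Firehose API: `transform`, `batches`, and `build_objects` are hypothetical names, the lower-casing transformation is invented for illustration, and encryption is left out.

```python
import gzip
import json


def transform(record: dict) -> dict:
    """Hypothetical transformation: normalize field names to lower case."""
    return {key.lower(): value for key, value in record.items()}


def batches(records: list, batch_size: int):
    """Yield consecutive slices of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def build_objects(records: list, batch_size: int = 3) -> list:
    """Transform each record, group into batches, gzip each batch as JSON lines."""
    objects = []
    for batch in batches(records, batch_size):
        lines = "\n".join(json.dumps(transform(r)) for r in batch) + "\n"
        objects.append(gzip.compress(lines.encode("utf-8")))
    return objects


objs = build_objects([{"ID": i} for i in range(7)], batch_size=3)
print(len(objs))  # 3 compressed objects: batches of 3, 3, and 1 records
```

Each compressed object stands in for one delivery to the destination, such as a single S3 object.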
Load Targets for Amazon Kinesis Firehose
Amazon S3, Amazon Redshift, Amazon Elasticsearch Service (now Amazon OpenSearch Service), and Splunk
Primary Use Case for AWS Kinesis Services
Handling Streaming Data
Does Kinesis Data Streams or Firehose provide the ability to store streaming data? If so, for how long?
Kinesis Data Streams (KDS) stores streaming data for 1 to 7 days (up to 365 days with long-term retention); Firehose provides no facility for storing streaming data
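The retention window amounts to a simple age check on each record. A minimal sketch, assuming a hypothetical `is_still_readable` helper and the default 24-hour retention:

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_HOURS = 24


def is_still_readable(arrival: datetime, now: datetime,
                      retention_hours: int = DEFAULT_RETENTION_HOURS) -> bool:
    """True while the record is younger than the stream's retention period."""
    return now - arrival < timedelta(hours=retention_hours)


arrival = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
later = arrival + timedelta(hours=30)
print(is_still_readable(arrival, later))                       # False with the 24-hour default
print(is_still_readable(arrival, later, retention_hours=168))  # True with 7 days
```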
Key Features of Amazon Kinesis Data Streams (KDS)?
Collected data is available in milliseconds
Enables real-time analytics
Provides ordering of records
Records can be read or replayed in the same order
Transient data store
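Ordering and replay can be pictured with a toy in-memory shard. This is a simplified model under stated assumptions, not how Kinesis is implemented: the `Shard` class is hypothetical and sequence numbers are plain increasing integers.

```python
class Shard:
    """Toy append-only log: records keep their write order and can be re-read."""

    def __init__(self):
        self._records = []   # list of (sequence_number, data) in write order
        self._next_seq = 0

    def put(self, data: bytes) -> int:
        """Append a record and return the sequence number assigned to it."""
        seq = self._next_seq
        self._records.append((seq, data))
        self._next_seq += 1
        return seq

    def read_from(self, sequence_number: int = 0) -> list:
        """Return all records at or after sequence_number, oldest first."""
        return [data for seq, data in self._records if seq >= sequence_number]


shard = Shard()
for payload in (b"a", b"b", b"c"):
    shard.put(payload)

print(shard.read_from())   # [b'a', b'b', b'c']
print(shard.read_from(1))  # [b'b', b'c'], a replay starting at sequence number 1
```

Reading does not remove records, so any consumer can replay the same sequence again until the retention period expires.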
Amazon SQS
Simple Queue Service (SQS) is a fully managed message queuing service: there is no software or hardware to acquire, install, or configure; queues are created dynamically and scale automatically, so there is no capacity to provision
Amazon SQS Features/Impacts
Decouples and scales microservices, distributed systems, and serverless applications
Buffers messages to smooth out temporary volume spikes or increased latency
Built-in mechanism for retry?
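On the retry question: in SQS, a message that is received but not deleted becomes visible again once its visibility timeout expires, so another consumer can pick it up. The toy queue below models that behavior. It is a sketch, not the SQS API: `ToyQueue` and its methods are hypothetical, and time is passed in explicitly as `now` to keep the example deterministic.

```python
from collections import deque


class ToyQueue:
    """In-memory queue with an SQS-style visibility timeout (illustrative only)."""

    def __init__(self, visibility_timeout: float = 30.0):
        self.visibility_timeout = visibility_timeout
        self._visible = deque()   # messages waiting to be received
        self._in_flight = {}      # receipt -> (message, deadline)
        self._next_receipt = 0

    def send(self, message):
        self._visible.append(message)

    def receive(self, now: float):
        """Return (receipt, message), hiding the message until its deadline."""
        self._requeue_expired(now)
        if not self._visible:
            return None
        message = self._visible.popleft()
        receipt = self._next_receipt
        self._next_receipt += 1
        self._in_flight[receipt] = (message, now + self.visibility_timeout)
        return receipt, message

    def delete(self, receipt):
        """Acknowledge successful processing; the message will not be retried."""
        self._in_flight.pop(receipt, None)

    def _requeue_expired(self, now: float):
        expired = [r for r, (_, deadline) in self._in_flight.items() if deadline <= now]
        for receipt in expired:
            message, _ = self._in_flight.pop(receipt)
            self._visible.append(message)


q = ToyQueue(visibility_timeout=30)
q.send("job")
q.receive(now=0)             # first consumer takes the job and then crashes
print(q.receive(now=10))     # None: the job is still hidden
print(q.receive(now=31)[1])  # job: the timeout expired, so it is delivered again
```

The producer never waits on the consumer, which is the decoupling and buffering described in the cards above; the consumer only calls `delete` after it has finished processing.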