SQS, SNS, Kinesis Flashcards
SQS
- producer/consumer queue model
- standard queue (one of the oldest AWS services)
- unlimited throughput, unlimited number of messages
- encryption
- in flight - https
- at rest - kms
- client side encryption if client wants to decrypt itself
- max message size = 256 KB
- max 10 messages consumed at a time
SQS consumers
- consumers run on EC2 instances, on-prem servers, or Lambda
- receive up to 10 messages at a time
- call the DeleteMessage API to remove a message from the queue once processed
- message visibility timeout
- after a message is polled, it becomes invisible to other consumers
- if not processed and deleted within the visibility timeout (default 30 seconds), it becomes visible to other consumers again
- if the consumer needs more time, have it call the ChangeMessageVisibility API (see the sketch below)
- scale consumers horizontally using an ASG driven by the ApproximateNumberOfMessagesVisible CloudWatch metric (queue length)
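A minimal consumer sketch using boto3 (queue URL is a placeholder, `process` is a hypothetical handler, credentials assumed to be configured):

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# Long poll for up to 10 messages in one ReceiveMessage call
resp = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,  # batch size, max 10
    WaitTimeSeconds=20,      # long polling
)

for msg in resp.get("Messages", []):
    # If processing will exceed the visibility timeout, extend it
    sqs.change_message_visibility(
        QueueUrl=QUEUE_URL,
        ReceiptHandle=msg["ReceiptHandle"],
        VisibilityTimeout=120,  # seconds
    )
    process(msg["Body"])  # hypothetical processing function
    # Delete the message so it is not redelivered after the visibility timeout
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```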
SQS Long polling
- wait for messages to arrive if there are none in the queue
- wait time can be 1 to 20 seconds; long polling is recommended to reduce API calls and latency
- configured at the queue level (ReceiveMessageWaitTimeSeconds attribute) or per API call using the WaitTimeSeconds parameter
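Queue-level long polling sketch with boto3 (queue URL is a placeholder):

```python
import boto3

sqs = boto3.client("sqs")

# Enable long polling for every ReceiveMessage call on this queue
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",
    Attributes={"ReceiveMessageWaitTimeSeconds": "20"},  # 1-20 seconds
)
```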
Delay queue
- delay messages for up to 15 minutes before they become visible to consumers
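A per-message delay sketch (queue URL and body are placeholders); DelaySeconds can also be set as a queue attribute:

```python
import boto3

sqs = boto3.client("sqs")

# Message becomes visible to consumers only after 5 minutes (max 900 s = 15 min)
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",
    MessageBody='{"order_id": 42}',
    DelaySeconds=300,
)
```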
SQS Extended client
- Java library to send messages larger than 256 KB (up to 2 GB) by storing the payload in S3 and sending a reference in the SQS message
SQS API
- CreateQueue (MessageRetentionPeriod), DeleteQueue
- PurgeQueue
- SendMessage (DelaySeconds), ReceiveMessage, DeleteMessage
- MaxNumberOfMessages - default 1, max 10 to receive a batch of messages at once
- ReceiveMessageWaitTimeSeconds - long polling
- ChangeMessageVisibility
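Rough sketch of the queue-management calls above in boto3 (queue name is a placeholder):

```python
import boto3

sqs = boto3.client("sqs")

# CreateQueue with a 4-day message retention period (default 4 days, max 14 days)
queue_url = sqs.create_queue(
    QueueName="my-queue",
    Attributes={"MessageRetentionPeriod": "345600"},  # seconds
)["QueueUrl"]

sqs.purge_queue(QueueUrl=queue_url)   # delete all messages, keep the queue
sqs.delete_queue(QueueUrl=queue_url)  # delete the queue itself
```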
SNS
- pub/sub model (fan out)
- each subscriber gets all the messages
- SNS FIFO - ordering and deduplication like SQS FIFO - can only have SQS FIFO queues as subscribers
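A fan-out sketch with boto3 (topic name and queue ARN are placeholders; the SQS queue also needs an access policy allowing SNS to send to it, omitted here):

```python
import boto3

sns = boto3.client("sns")

topic_arn = sns.create_topic(Name="orders")["TopicArn"]

# Every subscribed SQS queue receives a copy of each published message (fan-out)
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:orders-queue",  # placeholder ARN
)

sns.publish(TopicArn=topic_arn, Message='{"order_id": 42}')
```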
Kinesis
- real time streaming (big data)
- data streams
- shards
- 1 MB/s or 1,000 records/s write per shard (2 MB/s read per shard)
- shard splitting - increase stream capacity by dividing when a shard is ‘hot’
- merge shards - decrease capacity and cost
- the partition key determines which shard a record goes to (see the sketch below)
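Producer-side sketch with boto3 (stream name is a placeholder); records with the same partition key always land on the same shard:

```python
import boto3

kinesis = boto3.client("kinesis")

# The partition key is hashed to pick a shard; same key -> same shard (ordering per key)
kinesis.put_record(
    StreamName="my-stream",  # placeholder
    Data=b'{"user": "u-123", "event": "click"}',
    PartitionKey="u-123",
)
```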
Kinesis client library (KCL)
- Java library that reads records from a Kinesis data stream, with distributed app instances sharing the read workload
- each shard is to be read by only ONE KCL instance
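KCL itself is Java; for intuition, this is the lower-level API it builds on, reading a single shard with boto3 (no checkpointing or load balancing, which KCL adds; stream name is a placeholder):

```python
import boto3

kinesis = boto3.client("kinesis")

shard_id = kinesis.list_shards(StreamName="my-stream")["Shards"][0]["ShardId"]

iterator = kinesis.get_shard_iterator(
    StreamName="my-stream",
    ShardId=shard_id,
    ShardIteratorType="TRIM_HORIZON",  # start from the oldest record in the shard
)["ShardIterator"]

out = kinesis.get_records(ShardIterator=iterator, Limit=100)
for record in out["Records"]:
    print(record["PartitionKey"], record["Data"])
# keep reading with out["NextShardIterator"]
```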
Kinesis Data Stream bird's eye view
Kinesis data Firehose
- takes data (usually from a data stream), optionally transforms it with Lambda, and performs batch writes to S3, Redshift, or OpenSearch (Elasticsearch)
- fully managed by aws
- near real time - minimum ~60 seconds latency for non-full batches
- easiest way to reliably load streaming data into data lakes, data stores, and analytics services. It can capture, transform, and deliver streaming data to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, generic HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk
- lower cost than running a Kinesis data stream (pay only for the data going through Firehose)
Kinesis Data Firehose sources
- Kinesis Data Streams
- direct PUT from applications (SDK, Kinesis Agent)
- CloudWatch Logs & Events, AWS IoT
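Direct PUT sketch with boto3 (delivery stream name is a placeholder); Firehose buffers records and batch-writes them to the configured destination:

```python
import boto3

firehose = boto3.client("firehose")

# Firehose buffers this record and flushes it to the destination (e.g. S3) in batches
firehose.put_record(
    DeliveryStreamName="my-delivery-stream",  # placeholder
    Record={"Data": b'{"user": "u-123", "event": "click"}\n'},
)
```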
Kinesis Data Analytics
- source is either data stream or data firehose
- fully managed
- use case - real time dashboard, metrics, etc.
SQS vs SNS vs Kinesis
Kinesis capacity limits
The capacity of a Kinesis data stream is defined by the number of shards in the stream. The limits can be exceeded either by data throughput or by the number of read calls per shard. Each shard allows 1 MB/s of incoming data and 2 MB/s of outgoing data. Increase the number of shards in the stream to provide more capacity.
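One way to add capacity is the UpdateShardCount API, which splits or merges shards uniformly (sketch; stream name and target count are placeholders):

```python
import boto3

kinesis = boto3.client("kinesis")

# Double capacity by scaling from 2 to 4 shards (uniform scaling splits shards evenly)
kinesis.update_shard_count(
    StreamName="my-stream",
    TargetShardCount=4,
    ScalingType="UNIFORM_SCALING",
)
```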