SQS, SNS Flashcards
In SQS, what is the message visibility timeout, and what happens if it is too low or too high?
The message visibility timeout is the period during which a received message is “invisible” to other consumers while it is being processed.
If it is too low, the message can become visible again before processing finishes and be processed twice. If it is too high, a message whose consumer failed has to wait a long time before it can be re-processed.
A consumer can call ChangeMessageVisibility to extend the timeout if it knows the processing will take longer.
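The behavior above can be sketched with a toy in-memory queue. This is only an illustration under simplifying assumptions, not the SQS API: the class `InMemoryQueue`, its method names, and the explicit `now` argument (simulated time in seconds) are all hypothetical.

```python
import itertools


class InMemoryQueue:
    """Toy model of a queue with a visibility timeout (not the real SQS API)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}         # message id -> body
        self._invisible_until = {}  # message id -> time it becomes visible again
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._invisible_until[msg_id] = 0.0
        return msg_id

    def receive(self, now):
        """Return (id, body) of the first visible message, or None."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                # Hide the message from other consumers while it is processed.
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def change_visibility(self, msg_id, now, timeout):
        # Same idea as ChangeMessageVisibility: the new timeout counts from now.
        self._invisible_until[msg_id] = now + timeout

    def delete(self, msg_id):
        # A consumer deletes the message once processing succeeded.
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)
```

If the consumer neither deletes the message nor extends the timeout, `receive` hands the same message out again once the timeout has passed, which is the "processed twice" case.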
What is a Dead Letter Queue, and when is a message pushed there?
A dead letter queue (DLQ) is an SQS queue that stores messages that consumers failed to process. It is useful for debugging.
A message is pushed to it when the MaximumReceives threshold (maxReceiveCount in the redrive policy) is exceeded. That is, each time the visibility timeout expires without the message being deleted, the message goes back to the queue; once it has been received the number of times set by the threshold, it goes to the DLQ.
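A rough simulation of that flow, assuming a failed handler call stands in for "the visibility timeout expired without a delete". The function `deliver_with_dlq` and the parameter name `max_receive_count` are hypothetical names chosen for this sketch.

```python
from collections import deque


def deliver_with_dlq(messages, handler, max_receive_count):
    """Toy redrive policy. Returns (processed, dead_lettered) lists."""
    source = deque((body, 0) for body in messages)  # (body, receive count)
    processed, dead_letter_queue = [], []
    while source:
        body, receive_count = source.popleft()
        if receive_count >= max_receive_count:
            # Threshold reached: move to the DLQ instead of delivering again.
            dead_letter_queue.append(body)
            continue
        receive_count += 1
        try:
            handler(body)
            processed.append(body)  # success: the consumer deletes the message
        except Exception:
            # Failure: the message becomes visible again in the source queue.
            source.append((body, receive_count))
    return processed, dead_letter_queue
```

With `max_receive_count=3`, a message whose handler always fails is attempted three times and then lands in the dead letter list.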
What is the redrive to source functionality in SQS?
Redrive to source lets you move messages from the DLQ back to the source queue without writing custom code.
What is long polling, and why is it useful?
When long polling is enabled, the way a consumer polls the queue changes. Instead of sending many requests while the queue is empty, the consumer sends one request and, if the queue is empty, the request waits until a message arrives or the wait time (WaitTimeSeconds, up to 20 seconds) runs out.
It is useful for reducing the number of API calls made to the queue and the latency of the application.
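To see the saving in API calls, here is a small back-of-the-envelope model in simulated seconds. The function name `count_receive_calls` and the one-second pause between short polls are assumptions made for illustration.

```python
def count_receive_calls(arrival_time, wait_time_seconds, poll_interval=1):
    """Count receive calls made until a message arriving at `arrival_time`
    is returned. wait_time_seconds=0 models short polling, where the
    consumer pauses `poll_interval` seconds between empty responses."""
    now, calls = 0, 0
    while True:
        calls += 1
        # The call returns the message if it arrives while the call is waiting.
        if arrival_time <= now + wait_time_seconds:
            return calls
        # Empty response: the next call starts after the wait or the pause.
        now += max(wait_time_seconds, poll_interval)
```

For a message that arrives after 45 seconds, `count_receive_calls(45, 0)` gives 46 calls with short polling, while `count_receive_calls(45, 20)` gives 3 calls with a 20-second wait.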
What is the SQS Extended Client?
It is a library that supports sending large messages through SQS. It stores the message payload on S3 and sends only a pointer (metadata) to SQS. The consumer reads this metadata and retrieves the payload from S3.
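The idea can be sketched without AWS at all. Below, a dict stands in for the S3 bucket and a list for the queue; the function names, the JSON envelope, and the `threshold` parameter are hypothetical and do not reflect the real library's message format.

```python
import json
import uuid


def send_large(body, bucket, queue, threshold):
    """Send `body` inline if it is small, otherwise store it and send a pointer."""
    if len(body.encode("utf-8")) <= threshold:
        queue.append(json.dumps({"inline": body}))
    else:
        key = str(uuid.uuid4())
        bucket[key] = body                         # payload goes to "S3"
        queue.append(json.dumps({"s3_key": key}))  # only metadata goes to the queue


def receive_large(bucket, queue):
    """Read the next message and resolve the pointer if there is one."""
    envelope = json.loads(queue.pop(0))
    if "s3_key" in envelope:
        return bucket[envelope["s3_key"]]
    return envelope["inline"]
```

The queue message stays small regardless of payload size, which is the point of the pattern.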
What is a FIFO queue, and how does it differ from a standard queue?
On a FIFO queue, unlike a standard queue, messages are delivered to the consumer in the order they were sent. It also offers exactly-once processing by removing duplicates, and its throughput is limited compared to a standard queue.
What is deduplication in SQS, and how does it work?
Deduplication removes duplicate messages on a FIFO queue. Either the queue hashes the message body (content-based deduplication, SHA-256) and compares the hash with those of incoming messages, or the producer provides a deduplication ID (MessageDeduplicationId) to compare against. Duplicates are detected within a 5-minute deduplication interval.
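A minimal sketch of both deduplication modes, assuming a SHA-256 hash of the body when no ID is given and a 5-minute window. The class `FifoDeduplicator` and the explicit `now` argument (simulated seconds) are hypothetical.

```python
import hashlib

DEDUP_INTERVAL_SECONDS = 300  # 5-minute deduplication window


class FifoDeduplicator:
    """Toy model of FIFO deduplication (not the real SQS API)."""

    def __init__(self):
        self._seen = {}  # deduplication id -> time the message was accepted

    def accept(self, body, now, deduplication_id=None):
        """Return True if the message is accepted, False if it is a duplicate."""
        # Fall back to a content hash when the producer gives no explicit ID.
        dedup_id = deduplication_id or hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_INTERVAL_SECONDS:
            return False
        self._seen[dedup_id] = now
        return True
```

Note that with an explicit ID, two messages with different bodies but the same ID are still treated as duplicates.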
What is message grouping in SQS?
Message grouping is a feature of SQS FIFO queues that lets you tag messages with a group ID (MessageGroupId). Each group keeps its own message order and is processed independently of the others, so different groups can be handled by different consumers, with one consumer working on a given group at a time.
It is useful when you need ordering on a subset of messages of a queue.
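Per-group ordering can be modeled roughly like this: while one message of a group is being processed, later messages of that group are held back, but other groups keep flowing. `GroupedFifoQueue` and its methods are hypothetical names for this sketch.

```python
class GroupedFifoQueue:
    """Toy model of FIFO message groups (not the real SQS API)."""

    def __init__(self):
        self._messages = []      # (group_id, body) in arrival order
        self._in_flight = set()  # groups with a message still being processed

    def send(self, group_id, body):
        self._messages.append((group_id, body))

    def receive(self):
        """Return the oldest message whose group is not busy, or None."""
        for index, (group_id, body) in enumerate(self._messages):
            if group_id not in self._in_flight:
                self._in_flight.add(group_id)
                del self._messages[index]
                return group_id, body
        return None

    def ack(self, group_id):
        # Processing finished: the next message of this group may be delivered.
        self._in_flight.discard(group_id)
```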
Explain the SNS + SQS fanout pattern
In this pattern, messages are published to an SNS topic, and SQS queues subscribe to that topic, so each queue receives a copy of every message. (Similar to the fanout exchange to queues pattern in RabbitMQ.)
Use case - send S3 event to many SQS queues.
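As a plain in-memory illustration of fanout, with lists standing in for SQS queues (the `Topic` class is hypothetical, not the SNS API):

```python
class Topic:
    """Toy topic that copies each published message to every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        # Every subscribed queue gets its own copy of the message.
        for queue in self.subscribers:
            queue.append(message)
```

Each queue can then be consumed independently, at its own pace, by a different service.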
What is a “hot partition” in Kinesis Data Streams?
A hot partition happens when one shard receives much higher throughput than the others. To avoid it, choose a highly distributed partition key.
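Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains it. The sketch below assumes the shards split that range evenly; `shard_for_key` and `shard_load` are hypothetical helper names.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # size of the MD5 hash key space


def shard_for_key(partition_key, shard_count):
    """Index of the shard a key lands on, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def shard_load(partition_keys, shard_count):
    """Count how many records each shard would receive."""
    return Counter(shard_for_key(key, shard_count) for key in partition_keys)
```

Calling `shard_load(["tenant-1"] * 1000, 4)` puts all 1000 records on a single shard (a hot partition), while keys such as `f"user-{i}"` spread the records across the shards.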
Why does the ProvisionedThroughputExceeded error happen in Kinesis Data Streams, and how do you fix it?
It happens when the input throughput of a shard exceeds the shard's maximum. To fix it, use a highly distributed partition key, retry with exponential backoff, and increase the number of shards (scale).
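A minimal retry loop with exponential backoff and jitter might look like this. The exception class `ThroughputExceeded` and the `put_record` callable are stand-ins invented for the sketch; with a real client you would catch the SDK's own throttling error instead.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Stand-in for the throttling error raised by the producer call."""


def put_with_backoff(put_record, max_attempts=5, base_delay=0.1,
                     max_delay=5.0, sleep=time.sleep):
    """Call `put_record()` and retry on throttling with growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return put_record()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles each attempt: 0.1, 0.2, 0.4, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # random jitter avoids synchronized retries
```

The `sleep` parameter is injectable only so the function can be exercised without actually waiting.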
In Kinesis Data Streams, what is the difference between a shared fan-out consumer and an enhanced fan-out consumer?
With shared fan-out consumers, the read throughput of a shard (2 MB/s) is divided across all of its consumers. Adding a consumer decreases the throughput available to the others.
With enhanced fan-out consumers, each consumer of a shard gets the full throughput (2 MB/s) for itself. It is useful when you have many consumers on a shard, but it is more expensive.
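The arithmetic is simple enough to write down, assuming the 2 MB/s per-shard read limit. The function name is hypothetical.

```python
SHARD_READ_MB_PER_SEC = 2.0  # read throughput of one shard


def read_throughput_per_consumer(consumers, enhanced_fan_out):
    """MB/s available to each consumer reading from one shard."""
    if enhanced_fan_out:
        # Each consumer gets a dedicated pipe at the full shard rate.
        return SHARD_READ_MB_PER_SEC
    # Shared: the shard rate is split across all consumers.
    return SHARD_READ_MB_PER_SEC / consumers
```

With four consumers on one shard, the shared model leaves 0.5 MB/s for each, while enhanced fan-out keeps 2 MB/s for each.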
What are shard splitting and shard merging, and why are they useful?
Shard splitting is used to increase stream capacity and to divide a hot shard.
Shard merging is used to decrease stream capacity and to combine shards with low traffic.
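Since each shard owns a contiguous range of hash keys, splitting and merging can be pictured as operations on those ranges. This sketch represents a shard as a `(start, end)` tuple and always splits at the midpoint; both choices are assumptions for illustration.

```python
def split_shard(shard):
    """Split one hash key range into two halves at its midpoint."""
    start, end = shard
    mid = (start + end) // 2
    return (start, mid), (mid + 1, end)


def merge_shards(left, right):
    """Merge two adjacent hash key ranges into one."""
    if left[1] + 1 != right[0]:
        raise ValueError("only adjacent shards can be merged")
    return (left[0], right[1])
```

Splitting the full range `(0, 2**128 - 1)` yields two shards that each take half of the key space, and merging them restores the original range.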
What is the difference between Kinesis Data Streams and Kinesis Data Firehose?
Kinesis Data Streams:
* ingest data at scale
* write custom code for the producer and consumer
* data retention
Kinesis Data Firehose:
* load streaming data into S3 / Redshift / OpenSearch, etc.
* fully managed
* no data storage
What is Kinesis Data Analytics?
It runs SQL statements in real time on data from Kinesis Data Streams or Kinesis Data Firehose.