Section 12: AWS Integration & Messaging: SQS, SNS & Kinesis Flashcards
What are the two patterns of application communication?
Synchronous communications (application to application)
Asynchronous / event-based communications (application to queue to application)
Why can synchronous communications between applications be problematic?
If there is a sudden spike in traffic, the destination probably won't be able to handle everything at once (e.g., an app that usually encodes 10 videos/hour suddenly receives 1,000 videos in a short amount of time)
What are the three integration and messaging fully managed services by AWS?
SQS
SNS
Kinesis
What is the model of SQS?
Queue model
What is the model of SNS?
Pub/Sub model
What is the model of Kinesis?
Real-time streaming model
What are the two types of queue available in SQS?
Standard Queue
FIFO Queue
What name defines the entities sending messages to an SQS queue?
Producers
What name defines the entities consuming messages of an SQS queue?
Consumers
How are messages consumed from an SQS queue?
They are polled by the consumers
How old is the AWS SQS standard queue?
Over 10 years old
What do you have to do in order to scale your standard AWS queue?
Nothing, it scales automatically to tens of thousands of messages per second
What is the default retention period of messages in a standard SQS queue?
4 days
What is the maximum retention configurable for messages in an SQS queue?
14 days
What is the limit of how many messages can be in a standard SQS queue?
No limit
What is the latency for SQS queues?
< 10ms
Can the number of consumers of an SQS queue scale? If so, on which axis?
Yes, it can scale horizontally
Can a standard SQS queue have duplicate messages?
It can occasionally
Are messages in order in a standard SQS queue?
Not necessarily (best effort ordering is built into the service)
How can you get messages 100% in order in SQS?
Use a FIFO Queue rather than a standard SQS queue
What is the maximum size of messages in an SQS queue?
256KB
How can you add a delay between the moment a message is sent and the moment consumers see the message in an SQS queue?
By adding a delivery delay at the queue level
How can you override the default delivery delay for a certain message sent in an SQS standard queue?
By overriding the default DelaySeconds parameter when sending a message
How can you override the default delivery delay for a certain message sent in an SQS FIFO queue?
You can't; a per-message DelaySeconds is only available in standard queues (this makes sense, since otherwise FIFO ordering would not be respected)
What does a message sent to an SQS queue consist of?
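A minimal sketch of the per-message delay rule above. The helper name `build_send_params` is hypothetical; it only builds the keyword arguments you would pass to boto3's `sqs.send_message` (boto3 is assumed to be installed separately), and it rejects a per-message delay on queues whose URL ends with `.fifo`. The 0 to 900 second range is the SQS limit for DelaySeconds.

```python
def build_send_params(queue_url, body, delay_seconds=None):
    """Build keyword arguments for an SQS send_message call (hypothetical helper)."""
    params = {"QueueUrl": queue_url, "MessageBody": body}
    if delay_seconds is not None:
        # FIFO queues only support the queue-level delay.
        if queue_url.endswith(".fifo"):
            raise ValueError("per-message DelaySeconds is not supported on FIFO queues")
        # SQS accepts a delay between 0 and 900 seconds (15 minutes).
        if not 0 <= delay_seconds <= 900:
            raise ValueError("DelaySeconds must be between 0 and 900")
        params["DelaySeconds"] = delay_seconds
    return params
```

Usage would look like `sqs.send_message(**build_send_params(url, "hello", delay_seconds=45))`, where `sqs` is a boto3 SQS client and `url` is your queue URL.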
Body (String, up to 256KB)
Attributes (Metadata)
What does the producer get back when sending a message to an SQS queue?
Message ID
MD5 hash of the body
How many messages at a time can a consumer receive when polling an SQS queue?
Up to 10
What is the name of the time period during which a message is hidden from other consumers because it has been received?
The visibility timeout
What happens during the visibility timeout?
The message received by the consumer is still in the SQS queue but is considered "in flight", so it can't be received by other consumers
What is the responsibility of the consumer once it successfully finishes processing a message?
It needs to delete the message from the queue using the receipt handle returned when the message was received
What is the default time period of the visibility timeout in an SQS queue?
30 seconds
What is the maximum time period of the visibility timeout in an SQS queue?
12 hours
What can happen if the visibility timeout of an SQS queue is too high?
If the consumer fails to process the message, there will be a long delay before trying to process the message again
What can happen if the visibility timeout of an SQS queue is too low?
If the consumer needs time to process the message, another consumer will receive the message and the message will be processed more than once
What can you do if you set the visibility timeout of your SQS queue too low and a consumer needs more time to process a message?
Use the ChangeMessageVisibility API to increase the length of the visibility timeout of the message being processed at the moment
What is the API that a consumer of an SQS queue needs to call after it successfully finishes processing a message?
The DeleteMessage API
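The receive, process, delete cycle described above can be sketched as follows. This is an illustration under assumptions: `process_batch` and `handler` are hypothetical names, and `sqs` is expected to behave like a boto3 SQS client (`receive_message` and `delete_message` with these keyword arguments). A message whose handler raises is simply left alone, so it becomes visible again once its visibility timeout expires.

```python
def process_batch(sqs, queue_url, handler, max_messages=10):
    """Receive up to max_messages, process each, and delete the successes."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
    )
    processed = 0
    # The "Messages" key is absent when the queue returned nothing.
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Not deleted: the message reappears after the visibility timeout.
            continue
        # Deletion is keyed on the receipt handle from this receive call.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed
```

If a handler needs longer than the visibility timeout, the consumer could call `sqs.change_message_visibility(QueueUrl=..., ReceiptHandle=..., VisibilityTimeout=...)` before the timeout runs out.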
(SQS queue) Where should messages that fail to get processed multiple times in a row be transferred to?
To a DLQ
What does DLQ stand for?
Dead Letter Queue
What is the redrive policy?
The redrive policy specifies the source queue, the dead-letter queue, and the conditions under which Amazon SQS moves messages from the former to the latter if the consumer of the source queue fails to process a message a specified number of times.
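As a rough illustration, a redrive policy is set on the source queue as a JSON string under the `RedrivePolicy` attribute, with the keys `deadLetterTargetArn` and `maxReceiveCount`. The helper name `redrive_attributes` and the default of 5 receives are assumptions made for this sketch.

```python
import json

def redrive_attributes(dlq_arn, max_receive_count=5):
    """Build the Attributes dict that attaches a DLQ to a source queue."""
    policy = {
        "deadLetterTargetArn": dlq_arn,        # ARN of the dead-letter queue
        "maxReceiveCount": max_receive_count,  # receives allowed before the move
    }
    # SQS expects the policy itself as a JSON-encoded string.
    return {"RedrivePolicy": json.dumps(policy)}
```

With a boto3 SQS client this could be applied as `sqs.set_queue_attributes(QueueUrl=source_url, Attributes=redrive_attributes(dlq_arn, 3))`, where `source_url` and `dlq_arn` are your own values.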
What strategy for our consumers allows us to save costs when using SQS queues?
Long polling
What is the long polling?
A consumer requests messages from the queue and "waits" for messages to arrive if there are none at the moment
What is the maximum wait time when long polling an SQS queue?
20 seconds
What is the preferred wait time when long polling an SQS queue?
20 seconds
At what level can long polling be enabled?
At the queue level or at the API level
What is the name of the parameter which allows us to set the time for long polling?
WaitTimeSeconds
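A small sketch of both ways to enable long polling. The function names are hypothetical. At the queue level the attribute is `ReceiveMessageWaitTimeSeconds`; at the API level it is the `WaitTimeSeconds` argument of `receive_message`. `sqs` is assumed to behave like a boto3 SQS client.

```python
def queue_level_long_polling(wait_seconds=20):
    """Attributes dict enabling long polling for every receive on a queue."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("wait time must be between 0 and 20 seconds")
    # Queue attribute values are passed as strings.
    return {"ReceiveMessageWaitTimeSeconds": str(wait_seconds)}

def long_poll(sqs, queue_url, wait_seconds=20):
    """One long-polling receive; returns a (possibly empty) list of messages."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=wait_seconds,  # API-level long polling
    )
    return response.get("Messages", [])
```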
What happens to messages which don't get deleted within the visibility timeout period?
They become visible again in the SQS queue until the threshold defined in the redrive policy is reached, at which point they are transferred to the DLQ
What is the particular naming rule for FIFO queues in SQS?
They must end with .fifo
What is the maximum number of messages per second (with / without batching) for FIFO queues?
3000 messages/sec with batching
300 messages/sec without batching
Can there be duplicates in a FIFO queue?
No
Can there be a "per-message delay" in SQS FIFO queues?
No, only per queue delay
What are the two features exclusive to FIFO queues?
Deduplication
Sequencing
How can you get deduplication in FIFO queues?
By providing a MessageDeduplicationId with your message
Do you have to provide your own MessageDeduplicationId when using deduplication in a FIFO queue or is there a better way to do it?
You can use content-based deduplication: the MessageDeduplicationId is generated as the SHA-256 hash of the message body (not the attributes)
What is the deduplication interval in FIFO queues?
A 5-minute period where the queue will track for duplicate messages
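To make the deduplication interval concrete, here is a local model of it. This is not how SQS is implemented internally; `DedupWindow` is a hypothetical class that only mimics the observable behaviour: a message whose deduplication ID was accepted less than 5 minutes ago is dropped, and with content-based deduplication the ID is the SHA-256 hash of the body.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 300  # the 5-minute deduplication interval

class DedupWindow:
    """Toy model of FIFO-queue deduplication (illustration only)."""

    def __init__(self):
        self._accepted_at = {}  # deduplication ID -> time it was last accepted

    def accept(self, body, now, dedup_id=None):
        """Return True if the message is accepted, False if it is a duplicate."""
        if dedup_id is None:
            # Content-based deduplication: hash the body, not the attributes.
            dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        last = self._accepted_at.get(dedup_id)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return False
        self._accepted_at[dedup_id] = now
        return True
```

Here `now` is a timestamp in seconds supplied by the caller, which keeps the sketch deterministic.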
How do you get sequencing in a FIFO queue?
By specifying the same MessageGroupId to messages which you absolutely want to get processed in order
Are messages with different MessageGroupId values guaranteed to be processed in order in a FIFO queue?
No, different consumers can poll and receive messages with different MessageGroupId. Only messages with the same MessageGroupId are sure to get processed one after the other
Let's say you have a user who executes actions like "Adds X to cart", "Purchases X", "Cancels X". What could be the MessageGroupId for the messages (that will be added to a FIFO queue) related to that user?
user_id
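A minimal sketch of the idea, assuming a hypothetical helper `fifo_send_params` that builds the keyword arguments for boto3's `sqs.send_message`. Every action of one user gets the same MessageGroupId, so those actions stay in order relative to each other.

```python
def fifo_send_params(queue_url, user_id, action, dedup_id=None):
    """Build send_message arguments that group a user's actions together."""
    if not queue_url.endswith(".fifo"):
        raise ValueError("MessageGroupId is only meaningful on FIFO queues")
    params = {
        "QueueUrl": queue_url,
        "MessageBody": action,
        "MessageGroupId": str(user_id),  # same user, same ordered group
    }
    if dedup_id is not None:
        # Optional when content-based deduplication is enabled on the queue.
        params["MessageDeduplicationId"] = dedup_id
    return params
```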
What can you use if you need to send messages that are larger than 256KB in a SQS queue?
Use the SQS Extended Client for Java or a custom solution for other environments
What does the SQS Extended Client do?
Producer sends large message to S3
Producer sends small metadata message to SQS queue
Consumer polls/receive the small message
Consumer retrieves the large message from S3
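The four steps above can be sketched with in-memory stand-ins. This is not the SQS Extended Client itself: `store` (a dict) plays the role of an S3 bucket, `queue` (a list) plays the role of the SQS queue, and the envelope keys `inline` and `s3_key` are invented for this illustration.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # the 256KB message size limit

def send_large(store, queue, body):
    """Send body inline if it fits, otherwise park it in the store."""
    envelope = json.dumps({"inline": body})
    if len(envelope.encode("utf-8")) > MAX_SQS_BYTES:
        key = str(uuid.uuid4())
        store[key] = body                       # 1. large payload goes to "S3"
        envelope = json.dumps({"s3_key": key})  # 2. small pointer goes to "SQS"
    queue.append(envelope)

def receive_large(store, queue):
    """Receive the next message, following the pointer when there is one."""
    envelope = json.loads(queue.pop(0))         # 3. consumer gets the small message
    if "s3_key" in envelope:
        return store[envelope["s3_key"]]        # 4. consumer fetches the payload
    return envelope["inline"]
```

A real implementation would also have to delete the stored object once the message has been processed.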
What encryption do you get with SQS?
In flight using the HTTPS endpoint
SSE can be enabled using KMS
What does SSE stand for?
Server Side Encryption
What is encrypted when using SSE in SQS?
Only the message body, not the metadata
What might be wrong if we can't make our applications work with SQS?
We probably have a problem with our IAM policies attached to the Roles of our applications
How can you get finer-grained control over IP addresses when working with SQS?
Using an SQS queue access policy
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-overview-of-managing-access.html
Which of these SQS APIs doesn't have a batch API?
SendMessage
ReceiveMessage
DeleteMessage
ChangeMessageVisibility
ReceiveMessage
Because you can already receive up to 10 messages at a time
What does PurgeQueue do (SQS)?
Deletes all the messages in a queue
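For the APIs that do have a batch variant, a batch holds at most 10 entries, so a longer list has to be chunked. A minimal sketch, assuming `sqs` behaves like a boto3 SQS client exposing `send_message_batch`; the helper name `send_in_batches` and the choice of entry IDs are hypothetical.

```python
def send_in_batches(sqs, queue_url, bodies, batch_size=10):
    """Send bodies in chunks of up to 10; return the IDs of failed entries."""
    failed_ids = []
    for start in range(0, len(bodies), batch_size):
        chunk = bodies[start:start + batch_size]
        # Entry IDs only need to be unique within a single batch request.
        entries = [
            {"Id": str(start + offset), "MessageBody": body}
            for offset, body in enumerate(chunk)
        ]
        response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        # A batch call can partially fail, so the "Failed" list must be checked.
        failed_ids.extend(item["Id"] for item in response.get("Failed", []))
    return failed_ids
```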
What are the three most common use cases for SQS?
Decouple applications
(for example to handle payments asynchronously)
Buffer writes to a database
(for example a voting application)
Handle large loads of messages coming in
(for example an email sender)
Can SQS be integrated with Auto Scaling? If so, how?
Yes, through CloudWatch
What can you use if you want to send a message to multiple receivers in a single call?
SNS
Where does an event producer send messages with SNS?
To one SNS topic
What is the maximum amount of receivers per topic in SNS?
10,000,000
What is the maximum number of topics in an AWS account?
100,000
What can SNS subscribers be?
SQS
HTTP/HTTPS
Lambda
Emails
SMS messages
Mobile Notifications
How do these services use SNS? CloudWatch, ASG, Amazon S3, CloudFormation
CloudWatch: Alarms
ASG: Notifications from alarms to trigger auto scaling
Amazon S3: On bucket events
CloudFormation: State changes, failed to build, etc.
What are the steps to publish in a topic in SNS?
Create a topic
Create a subscription
Publish to the topic
What are the steps to direct publish in SNS (mobile apps SDK)?
Create a platform application
Create a platform endpoint
Publish to the platform endpoint
What is the SNS + SQS: Fan out strategy?
Push once in SNS, receive in many SQS
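A toy in-memory model of the fan-out pattern, only to show the shape of it. `Topic` is a hypothetical class standing in for an SNS topic, and each `deque` stands in for a subscribed SQS queue; no AWS API is called.

```python
from collections import deque

class Topic:
    """Toy stand-in for an SNS topic with SQS queues as subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        # One publish call; every subscribed queue gets its own copy.
        for queue in self.subscribers:
            queue.append(message)
        return len(self.subscribers)

orders = Topic()
fraud_queue, shipping_queue = deque(), deque()
orders.subscribe(fraud_queue)
orders.subscribe(shipping_queue)
delivered = orders.publish("order 1001 created")
```

Each queue can then be consumed independently, at its own pace, by a different application.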
What is AWS Kinesis?
Kinesis is a managed alternative to Apache Kafka
What is Kinesis great for?
Application logs, metrics, IoT, clickstreams
"Real-time" big data
Streaming processing frameworks (Spark, NiFi, etc.)
Is there data replication with Kinesis?
Yes, data is automatically replicated to 3 AZ.
What are the three components of Kinesis?
Kinesis Streams
Kinesis Analytics
Kinesis Firehose
What is Kinesis Streams?
Low latency streaming ingest at scale
What does Kinesis Analytics do?
Performs real-time analytics
What language does Kinesis Analytics leverage?
SQL
What does Kinesis Firehose do?
Loads streams into S3, Redshift, ElasticSearch, etc.
What are Kinesis Streams composed of?
Shards/Partitions
What is the default data retention period in Kinesis Streams?
1 day
What is the max data retention period in Kinesis Streams?
7 days
Do you have the ability to reprocess / replay data in Streams?
Yes
Can many applications consume the same stream?
Yes
Once data is inserted in Kinesis, can it be deleted?
No
What is the write capacity of a shard?
1 MB/s or 1000 messages/sec
What is the read capacity of a shard?
2 MB/s
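Using the per-shard limits above (1 MB/s or 1000 messages/sec for writes, 2 MB/s for reads), a back-of-the-envelope shard count is the largest of the three ratios, rounded up. The function name `shards_needed` is hypothetical, and the estimate ignores uneven partition keys.

```python
import math

def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Estimate the minimum shard count from the per-shard limits."""
    by_write_volume = math.ceil(write_mb_per_s / 1.0)  # 1 MB/s per shard
    by_record_rate = math.ceil(records_per_s / 1000)   # 1000 messages/sec per shard
    by_read_volume = math.ceil(read_mb_per_s / 2.0)    # 2 MB/s per shard
    # A stream always has at least one shard.
    return max(1, by_write_volume, by_record_rate, by_read_volume)
```

For example, writing 5 MB/s as 2000 records per second while reading 4 MB/s gives ratios of 5, 2 and 2, so the write volume is the constraint and at least 5 shards are needed.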
How are Kinesis Streams billed?
They are billed per hour per shard provisioned
How many shards can a stream have?
As many as you want
Can the number of shards of a stream evolve over time?
Yes (reshard/merge)
When records enter a stream, do they stay in order?
Records in the same shard are in order
How can you make sure that records that need to stay in order get in the same shard/partition?
By providing the same PartitionKey
How can you avoid a "hot partition"?
By providing highly distributed Partition Keys
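To see why the same PartitionKey always lands in the same shard, and why varied keys spread the load, here is a simplified model. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it; this sketch additionally assumes the shards split that range evenly, which is an assumption of the illustration. `shard_for` is a hypothetical name.

```python
import hashlib
from collections import Counter

def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")        # 128-bit integer
    return hash_key * shard_count // (1 << 128)     # index in 0..shard_count-1

# Many distinct keys spread across shards; one repeated key hits a single shard.
spread = Counter(shard_for("user-%d" % i, 4) for i in range(1000))
hot = Counter(shard_for("same-key", 4) for _ in range(1000))
```

In this model `hot` contains a single shard index holding all 1000 records, which is exactly the "hot partition" situation.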
What do messages received by Kinesis stream get?
A sequence number
What API can you use to send messages to Kinesis?
PutRecord (without batching)
PutRecords (with batching)
Can you send messages to a Kinesis Stream from the Console?
No, you have to use the CLI, SDKs, or producer libraries from various frameworks
What error will we get if we go over the limit from our Kinesis Stream? (exceeding MB/s or TPS for any shard)
ProvisionedThroughputExceeded
What might be the cause of a ProvisionedThroughputExceeded error?
A "hot partition/shard"
How can you solve ProvisionedThroughputExceeded ?
Retries with backoff
Increase shards (scaling)
Ensure your partition keys are distributed enough to avoid a "hot partition"
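The first remedy, retries with backoff, can be sketched like this. Everything here is a stand-in: `ThroughputExceeded` is a hypothetical local exception playing the role of the ProvisionedThroughputExceeded error, `put` is whatever callable actually sends the record, and the delays (0.1 s base, doubling, plus random jitter) are illustrative values rather than recommended ones.

```python
import random
import time

class ThroughputExceeded(Exception):
    """Local stand-in for the ProvisionedThroughputExceeded error."""

def put_with_backoff(put, record, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call put(record), retrying with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return put(record)
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * (2 ** attempt)       # 0.1, 0.2, 0.4, ...
            sleep(delay + random.uniform(0, delay))   # jitter avoids retry bursts
```

The `sleep` parameter exists only so the waiting can be replaced in tests.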
What can you use to consume a Kinesis Stream?
CLI
SDK
or the Kinesis Client Library
What does KCL stand for?
Kinesis Client Library
What does KCL use to checkpoint offsets?
DynamoDB
What does KCL use to track other workers and share the work amongst shards?
DynamoDB
How many shards can a KCL read?
Many
You can't have more ___ than ___
Words to place:
KCL
shards
You can't have more KCL than shards
What can KCL run on?
EC2, EB, even on premise applications
How is Kinesis secure?
IAM policies (control access / authorization)
Encryption in flight (HTTPS)
Encryption at rest (KMS)
Can you encrypt/decrypt client side when using Kinesis?
Yes, but it's harder
Are VPC endpoints available for Kinesis?
Yes!
What does AWS Kinesis Data Analytics do?
Perform real-time analytics on Kinesis Streams using SQL
What do you have to do to make Kinesis Data Analytics scale?
Nothing, it's fully managed; you pay for what you use
What can you load data into when using AWS Kinesis Firehose?
Redshift, S3, ElasticSearch, Splunk
What do you have to do to make AWS Kinesis Firehose scale?
Nothing, it's a fully managed service; you pay for the amount of data going through Firehose
What is AWS Kinesis meant for?
For real-time big data, analytics and ETL
What does ETL stand for?
"Extract, Transform, Load"