SQS Flashcards
two patterns of application communication
synchronous and asynchronous communication
synchronous communication
your application connects directly to another one of your applications.
buying service –> shipping service
when something is bought, we need to talk to the shipping service to send that item that was just bought.
asynchronous communication
or event based.
There will be a middleware called a queue that will connect your applications.
So this time the buying service says, "hey, someone bought something," puts that event into a queue, and is done. The shipping service then asks the queue, "is there something that got bought recently?", the queue returns that element,
and the shipping service can do whatever it wants with it.
So the buying service and the shipping service are not directly connected.
buying service –> queue –> shipping service
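The decoupled flow above can be sketched with a plain in-memory queue. This is an illustration of the pattern only, assuming a single process; real SQS is a durable, distributed service.

```python
from collections import deque

# A minimal in-memory stand-in for a queue, illustrating the pattern only
# (SQS would replace this with a durable, distributed service).
queue = deque()

def buying_service(order_id):
    # The producer only knows about the queue, not about shipping.
    queue.append({"order_id": order_id})

def shipping_service():
    # The consumer polls the queue and processes whatever it finds.
    shipped = []
    while queue:
        message = queue.popleft()
        shipped.append(message["order_id"])
    return shipped

buying_service(1)
buying_service(2)
print(shipping_service())  # [1, 2]
```

The point is that neither function calls the other; swapping either side out (or scaling it) does not affect its counterpart.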
synchronous communication between applications can be problematic
if one service overwhelms the other, for example because of a sudden spike of purchases.
for example we have a video encoding service, and we need to encode 1,000 videos but usually it’s 10.
Well, our encoding service is going to be overwhelmed and we’re going to have outages.
So, when you have these sudden spikes of traffic or you can’t predict anything, then it’s usually better to decouple your application and have the decoupling layer scale for you.
decoupling layer
SQS for a queue model,
SNS for a pub/sub model,
Kinesis, if you do real time streaming and you have like big data.
SQS
a queuing service
a buffer to decouple between your producers and your consumers, to decouple applications
producer
whatever sends a message into our SQS queue
can be one or more
consumers
something needs to process the messages from the queue and receive them
consumers will poll the messages from the queue
you may have multiple consumers consuming messages from an SQS queue
throughputs
you can send as many messages per second as you want, and the queue can hold as many messages as you want.
So there is no limit on throughput and no limit on the number of messages in the queue.
by default, a message stays in the queue
for 4 days, and the maximum amount of time
that a message can stay in a queue is 14 days.
Once you send a message to a queue, it has to be read by a consumer
and deleted from the queue after being processed, all within that retention period,
otherwise, it will be lost.
low-latency
whenever you send a message or read a message from SQS, you will get a response very quickly, less than 10 milliseconds on publish and receive.
message size
messages in SQS have to be small: less than 256 KB per message sent.
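A client-side size check is a simple guard against oversized messages. The 256 KB limit is real; the helper below is an illustrative sketch, and note that message attributes also count toward the limit on real SQS.

```python
MAX_SQS_MESSAGE_BYTES = 256 * 1024  # 256 KB limit per message

def fits_in_sqs(body: str) -> bool:
    # SQS counts the UTF-8 encoded size of the message body
    # (message attributes also count toward the limit).
    return len(body.encode("utf-8")) <= MAX_SQS_MESSAGE_BYTES

print(fits_in_sqs("hello"))        # True
print(fits_in_sqs("x" * 300_000))  # False
```

For payloads larger than 256 KB, the usual pattern is to store the payload in S3 and send only a pointer through the queue.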
duplication
it is possible to have duplicate messages:
on rare occasions, the retry mechanism may deliver the same message more than once.
This is true for Standard SQS, and you have to account for it in your application.
FIFO SQS solves this problem and also offers message deduplication.
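FIFO queues can use content-based deduplication, which derives a deduplication ID from a SHA-256 hash of the message body. A rough sketch of that idea (the 5-minute deduplication window of real SQS is not modeled here):

```python
import hashlib

seen_dedup_ids = set()

def deduplication_id(body: str) -> str:
    # FIFO queues with content-based deduplication derive the ID
    # from a SHA-256 hash of the message body.
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def accept_message(body: str) -> bool:
    # Simulates FIFO deduplication: a message whose dedup ID was already
    # seen (within 5 minutes on real SQS) is silently dropped.
    dedup_id = deduplication_id(body)
    if dedup_id in seen_dedup_ids:
        return False
    seen_dedup_ids.add(dedup_id)
    return True

print(accept_message("order-42"))  # True
print(accept_message("order-42"))  # False  (duplicate dropped)
```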
message order
Standard SQS can deliver messages out of order, meaning it has best-effort ordering.
FIFO queues, another type of SQS offering, address that limitation.
the producers will send the messages to SQS using
an SDK (Software Development Kit). The API to send a message to SQS is called SendMessage.
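A sketch of the parameters a SendMessage call takes. The parameter names (QueueUrl, MessageBody, MessageAttributes) are the real ones the boto3 SDK expects; the queue URL and message contents are made up for illustration, and the actual network call is left as a comment.

```python
import json

# Hypothetical queue URL for illustration.
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders-queue"

# Parameters for the SendMessage API call, as the boto3 SDK expects them.
params = {
    "QueueUrl": queue_url,
    "MessageBody": json.dumps({"order_id": 42, "item": "book"}),
    "MessageAttributes": {
        "source": {"DataType": "String", "StringValue": "buying-service"},
    },
}

# With boto3 installed and AWS credentials configured, the actual call is:
#   sqs = boto3.client("sqs")
#   response = sqs.send_message(**params)
print(sorted(params))  # ['MessageAttributes', 'MessageBody', 'QueueUrl']
```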
consumers can be running on
- EC2 instances (virtual servers on AWS)
- your own on-premises servers
- Lambda functions on AWS Lambda
consumer may receive up to
10 messages at a time
the consumer (your code) has the responsibility to
process these messages, for example inserting orders into an RDS database. This is logic you have to write yourself.
and then because these messages have been processed, your consumer will go ahead and delete these messages from the queue using the DeleteMessage API.
And this will guarantee that no other consumer will be able to see these messages and therefore, the message processing is complete.
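The receive/process/delete loop can be sketched against a simulated queue (a plain list standing in for SQS; with boto3 the equivalent calls would be receive_message with MaxNumberOfMessages=10 and delete_message):

```python
# A minimal consumer loop sketch over a simulated queue (a plain list).
queue = [{"id": i, "body": f"order-{i}"} for i in range(25)]
processed = []

def receive_messages(max_messages=10):
    # ReceiveMessage returns at most 10 messages per call.
    return queue[:max_messages]

def delete_message(message):
    # DeleteMessage removes the message so no other consumer sees it again.
    queue.remove(message)

while queue:
    batch = receive_messages()
    for message in batch:
        processed.append(message["body"])  # e.g. insert an order into RDS
        delete_message(message)

print(len(processed))  # 25
```

Note that deletion is explicit and happens only after processing succeeds; if the consumer crashed mid-batch, the undeleted messages would eventually be received again.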
multiple consumers will
receive and process these messages in parallel
if a message is not processed (and deleted) fast enough by a consumer, it will be received by other consumers.
This is why SQS has at-least-once delivery,
and also why it has best-effort message ordering.
if we need to increase the throughput because we have more messages,
then we can add consumers and scale horizontally
this is a perfect use case for using SQS with your Auto Scaling groups, or ASG.
your consumers will be running on EC2 instances inside of an Auto Scaling group and they will be polling for messages from the SQS queue.
But now your Auto Scaling group has to scale on some kind of metric, and a metric that is available to us is the queue length. It's called ApproximateNumberOfMessages.
It is a CloudWatch metric available on any SQS queue. We can set up a CloudWatch Alarm so that whenever the queue length goes over a certain level, the alarm increases the capacity of the Auto Scaling group by some amount.
This guarantees that the more messages you have in your SQS queue, maybe because there's a surge of orders on your website, the more EC2 instances your Auto Scaling group provides, and the higher the throughput at which you process these messages.
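The scaling decision that the CloudWatch Alarm automates can be sketched as a small function. The threshold and step values here are arbitrary illustration numbers, not SQS or ASG defaults.

```python
# Sketch of the scaling decision an ASG + CloudWatch alarm automates:
# when ApproximateNumberOfMessages exceeds a threshold, add instances.
ALARM_THRESHOLD = 1000   # queue length that triggers the alarm (illustrative)
SCALE_OUT_STEP = 2       # instances added per alarm (illustrative)
MAX_CAPACITY = 10        # ASG maximum capacity (illustrative)

def desired_capacity(current_capacity, approximate_number_of_messages):
    if approximate_number_of_messages > ALARM_THRESHOLD:
        return min(current_capacity + SCALE_OUT_STEP, MAX_CAPACITY)
    return current_capacity

print(desired_capacity(2, 500))   # 2  (below threshold, no alarm)
print(desired_capacity(2, 5000))  # 4  (alarm fires, scale out)
```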
we can put producers on EC2 instances inside one ASG and put consumers on EC2 instances in the second ASG = complete decoupling and autoscaling
use case for decoupling using SQS
an application that processes videos.
We could have just one big application, called the front-end, that takes the request and, whenever a video needs to be processed, does the processing itself and inserts the result into an S3 bucket. But that processing may take very long, and doing it in the front-end may slow down your website.
So instead, you can decouple your application here so that the request of processing a file and the actual processing of a file can happen in two different applications.
Now, when you do the request to process, that file will be in the SQS queue and you can create a second processing tier called the back-end processing application that will be in its own Auto-Scaling group to receive these messages, process these videos, and insert them into an S3 bucket.
So as we can see here with this architecture, we can scale the front-end accordingly, and we can scale the back-end accordingly as well, but independently.
And because the SQS queue has unlimited throughput and can hold an unlimited number of messages, you are really safe: this is a robust and scalable type of architecture. You can also use the optimal type of EC2 instance for each tier independently. For the back-end, if you're doing video processing, you could use EC2 instances
that have a GPU (Graphics Processing Unit), because you know that this type of instance will be optimal for that kind of workload.
SQS security
- encryption in-flight, by sending and receiving messages over the HTTPS API
- at-rest encryption using KMS keys
- client-side encryption if we want it, but the client has to perform the encryption and decryption itself
- for access control, IAM policies regulate access to the SQS API
- SQS access policies, helpful when you want cross-account access to an SQS queue, or when you want to allow other services, such as SNS or Amazon S3, to write to an SQS queue, for example with S3 events
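As an illustration, here is the general shape of an SQS access policy that allows Amazon S3 to send messages to a queue. The account ID, queue ARN, and bucket name are hypothetical; the actual boto3 call that attaches the policy is left as a comment.

```python
import json

# Hypothetical account ID, queue ARN, and bucket name for illustration.
# Shape of an SQS access policy that lets Amazon S3 deliver
# event notifications to the queue.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": "arn:aws:sqs:us-east-1:123456789012:orders-queue",
            "Condition": {
                "ArnLike": {"aws:SourceArn": "arn:aws:s3:*:*:my-upload-bucket"}
            },
        }
    ],
}

# With boto3, this would be attached to the queue via:
#   sqs.set_queue_attributes(QueueUrl=..., Attributes={"Policy": json.dumps(policy)})
print(policy["Statement"][0]["Action"])  # sqs:SendMessage
```

The Condition on aws:SourceArn restricts which bucket may write, so that not every S3 bucket in every account can push into your queue.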
message visibility timeout
when a message is polled by a consumer, it becomes invisible to other consumers. A consumer makes a ReceiveMessage request, a message is returned from the queue, and at that point the visibility timeout begins.
By default it is 30 seconds. During these 30 seconds, the message has to be processed. If the same or other consumers make a ReceiveMessage API call during that window, the message will not be returned: during the visibility timeout, the message is invisible to other consumers.
after the visibility timeout has elapsed
if the message has not been deleted, it is put back into the queue, and another consumer (or the same consumer) making a ReceiveMessage API call will receive the same message again.
So if we don't process and delete a message within the visibility timeout window, it may be processed twice: by two different consumers, or twice by the same consumer.
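The visibility timeout behavior can be simulated with a manual clock. This is a sketch assuming a 30-second timeout (the SQS default); real SQS also supports extending the timeout per message with the ChangeMessageVisibility API.

```python
# Simulation of the visibility timeout using a manual clock (no sleeping).
# A received message becomes invisible; if it is not deleted before the
# timeout elapses, it becomes visible again and can be received twice.
VISIBILITY_TIMEOUT = 30  # seconds (the SQS default)

class Queue:
    def __init__(self):
        self.messages = {}  # message id -> timestamp it becomes visible again

    def send(self, msg_id):
        self.messages[msg_id] = 0.0  # visible immediately

    def receive(self, now):
        for msg_id, invisible_until in self.messages.items():
            if now >= invisible_until:
                # Start the visibility timeout: hide from other consumers.
                self.messages[msg_id] = now + VISIBILITY_TIMEOUT
                return msg_id
        return None  # nothing visible right now

    def delete(self, msg_id):
        self.messages.pop(msg_id, None)

q = Queue()
q.send("m1")
print(q.receive(now=0))    # 'm1'  (first receive, timeout starts)
print(q.receive(now=10))   # None  (still invisible)
print(q.receive(now=35))   # 'm1'  (timeout elapsed, received again)
```

Calling `q.delete("m1")` after the first receive is what would prevent the second delivery, which is exactly why the consumer must delete within the window.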