SNS, SQS, Kinesis and Integration patterns Flashcards
What services can consume from an SQS Queue (3)? When they consume a queue, how many messages can be pulled at any one time?
Lambda, EC2 instances and on-premises machines can consume from SQS, and each can pull up to 10 messages at a time.
In SNS, how many topics can a producer send a message to? How many subscribers can a topic have?
A producer sends each message to only one topic in SNS, and that topic can have up to 10,000,000 subscribers (AWS has since raised this quota to 12,500,000 subscriptions per standard topic).
What is the maximum data buffer size in Kinesis Firehose? (hint: it's a magic number) What is the maximum time that data can be buffered?
128 MB and 900sec (15 min)
For Kinesis, how many KCL instances can you have per shard? What order are records read in and where can KCL be deployed?
Each shard is read by only one KCL instance at a time. Records are read in order at the shard level. KCL can be deployed on EC2 instances, on premises or on Elastic Beanstalk.
We have a queue with a visibility timeout set to 10 seconds. What will happen to a message on that queue if the consumer takes more than 10 seconds to successfully process it? Can you do anything to prevent this behaviour without increasing the visibility timeout?
If the message has not been processed within the 10 second window, it will become visible again and another consumer can pick it up and process it. This means that a message can be processed more than once. There is a ChangeMessageVisibility API call which will allow your consumer to change the visibility timeout if processing takes longer than allocated.
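A minimal sketch of how a consumer might extend the timeout while it is still working on a message. It assumes a boto3 SQS client is passed in; the helper name `extend_visibility` and the default of 30 seconds are illustrative and not part of the original card. `change_message_visibility` is the boto3 method behind the ChangeMessageVisibility API call.

```python
def extend_visibility(sqs_client, queue_url, receipt_handle, timeout_seconds=30):
    """Keep an in-flight message hidden for timeout_seconds from now.

    sqs_client is assumed to be a boto3 SQS client, e.g. boto3.client("sqs").
    """
    # SQS accepts visibility timeouts from 0 seconds up to 12 hours.
    if not 0 <= timeout_seconds <= 12 * 60 * 60:
        raise ValueError("timeout_seconds must be between 0 and 43200")
    sqs_client.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=timeout_seconds,
    )
    return timeout_seconds
```

A consumer would call this before the current timeout expires, using the receipt handle it got when it received the message.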
What is the purpose of an SQS access policy?
These are similar to an S3 bucket access policy. An SQS access policy allows for cross account access to SQS as well as defining what other AWS services can write to SQS.
If I have a Kinesis stream set up with one shard and one producer - how many consumers can I have?
You can have as many consumers as you like BUT together they share the shard's read limit of 2MB/sec and 5 read transactions (GetRecords calls) per second.
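As an illustration, a queue access policy that lets an S3 bucket write to a queue might look like the sketch below. The function name, account ID, queue name and bucket name are placeholders invented for the example.

```python
import json

def s3_to_sqs_policy(queue_arn, bucket_arn):
    """Build a queue access policy that lets one S3 bucket send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            # Only accept messages that originate from this bucket.
            "Condition": {"ArnLike": {"aws:SourceArn": bucket_arn}},
        }],
    }

policy = s3_to_sqs_policy(
    "arn:aws:sqs:us-east-1:111122223333:example-queue",  # placeholder ARN
    "arn:aws:s3:::example-bucket",                       # placeholder ARN
)
print(json.dumps(policy, indent=2))
```

For cross account access, the Principal would instead name the other AWS account.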
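To see what sharing the shard limit means in practice, here is a rough calculation. It assumes the consumers split the shard's read capacity evenly, which is a simplification made for this example.

```python
SHARD_READ_MB_PER_SEC = 2.0   # read throughput shared by all consumers of a shard
SHARD_READ_CALLS_PER_SEC = 5  # read transactions per second shared by all consumers

def per_consumer_share(num_consumers):
    """Return (MB/sec, calls/sec) available to each consumer of one shard."""
    if num_consumers < 1:
        raise ValueError("need at least one consumer")
    return (
        SHARD_READ_MB_PER_SEC / num_consumers,
        SHARD_READ_CALLS_PER_SEC / num_consumers,
    )

print(per_consumer_share(4))  # (0.5, 1.25)
```

With four consumers on one shard, each gets about 0.5MB/sec and can poll just over once per second.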
If I needed to load data into Redshift, S3, ElasticSearch or Splunk from Kinesis analytics, what would I use? Is it real time? If not, what is the latency?
Kinesis Firehose - Firehose is near real time, as data is buffered before delivery, giving a latency of around 60 seconds.
For SNS, what mechanisms are used for encryption inflight, and at rest (3)? What mechanism is used to regulate access control to SNS? What would allow cross account access to SNS or to define which AWS services can write to an SNS topic (hint: policies)?
Inflight via HTTPS API
At rest using KMS keys
Or you can encrypt client side
Access control is regulated by IAM policies
SNS access policies define which services can write to SNS and allow for cross account access.
How is Kinesis Firehose billed? Do you need to provision capacity?
Billing is based on the amount of data going through Firehose (and for data conversion). You don’t need to provision capacity as Firehose scales automatically.
I have a situation where I need to trigger jobs based on an S3 object create event type where the object is prefixed with images/. These events will need to be sent to multiple destinations consisting of SQS and Lambda. What pattern would you use and WHY is it appropriate in this instance?
For the same combination of event type (object create) and prefix (images/), S3 allows only one event notification rule, meaning we can’t send that event notification to multiple destinations directly. In this case we would use SNS fanout: send the notification to an SNS topic, and subscribe multiple SQS queues and Lambda functions to that topic.
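A sketch of the single S3 rule that feeds the fanout. The dictionary follows the shape boto3 expects for the `NotificationConfiguration` argument of `put_bucket_notification_configuration`; the function name and the topic ARN are placeholders for this example.

```python
def images_fanout_notification(topic_arn):
    """One S3 rule: object-create events under images/ go to one SNS topic."""
    return {
        "TopicConfigurations": [{
            "TopicArn": topic_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {"FilterRules": [{"Name": "prefix", "Value": "images/"}]}
            },
        }]
    }

config = images_fanout_notification(
    "arn:aws:sns:us-east-1:111122223333:example-topic"  # placeholder ARN
)
```

The SQS queues and Lambda functions are then subscribed to the topic, so S3 only ever needs this one destination.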
I have a situation where I have multiple messages coming into a FIFO queue for different customers. I need to ensure that each set of messages for each customer is processed IN ORDER, for instance customer A has 4 messages which must be processed in order, and customer B has 2 messages which must be processed in order. How can I do this in SQS, will the messages for customer A be processed before customer B, and is there a difference in ordering if I have one SQS consumer versus multiple?
SQS FIFO has a MessageGroupId setting. We can specify a different value per customer (e.g. GroupA, GroupB) which will ensure that messages within that group are processed in order.
Ordering across groups is not guaranteed so customer B might start processing before customer A.
If you only have one consumer then the messages are processed in standard FIFO order. If you have multiple consumers then different groups can be processed by different consumers in parallel, with order still preserved within each group.
How many AZs is Kinesis replicated to?
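The per-group ordering can be pictured with a toy in-memory model. This is not SQS itself, only a sketch of the idea; the message bodies and the function name are made up for the example.

```python
from collections import OrderedDict, deque

def group_by_message_group(messages):
    """Split (group_id, body) pairs into one ordered queue per group."""
    groups = OrderedDict()
    for group_id, body in messages:
        groups.setdefault(group_id, deque()).append(body)
    return groups

# Interleaved arrivals for customer A (4 messages) and customer B (2 messages).
arrivals = [("A", "a1"), ("B", "b1"), ("A", "a2"),
            ("A", "a3"), ("B", "b2"), ("A", "a4")]
groups = group_by_message_group(arrivals)

print(list(groups["A"]))  # ['a1', 'a2', 'a3', 'a4']
print(list(groups["B"]))  # ['b1', 'b2']
```

Each group keeps its own order, but nothing here says whether group A or group B finishes first.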
3
We have a system using an SQS queue to integrate with several EC2 consumer instances. We notice though that several messages fail processing and constantly get sent back to the queue, where they are picked up by other consumers and fail again, at which point the loop starts again. What would you do to ensure that a message is only sent back to the queue 3 times?
You would set up a dead letter queue and define a Maximum receives threshold (maxReceiveCount in the redrive policy) of 3. After 3 attempts, the message will be sent to the dead letter queue.
In terms of Kinesis, what is the write and read rate per shard and how are records ordered? Do you need to provision capacity or does Kinesis scale automatically? How long is data retained in a shard (default and max in days)?
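A sketch of the redrive policy that ties a source queue to its dead letter queue. `RedrivePolicy`, `deadLetterTargetArn` and `maxReceiveCount` are the SQS attribute names; the helper name and the ARN are placeholders for this example.

```python
import json

def redrive_attributes(dlq_arn, max_receive_count=3):
    """Build the queue attributes that send repeatedly failing messages to a DLQ."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return {
        # SQS expects the redrive policy as a JSON-encoded string.
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        })
    }

attrs = redrive_attributes("arn:aws:sqs:us-east-1:111122223333:example-dlq")
print(attrs["RedrivePolicy"])
```

With boto3, a dictionary like this would typically be passed as `Attributes` to `set_queue_attributes` on the source queue.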
Writes = 1MB/sec or 1,000 records/sec per shard.
Reads = 2MB/sec per shard.
Records are ordered within a shard and you need to provision capacity.
Data is retained for 1 day by default, 7 days max (AWS has since extended the maximum to 365 days).
I have an application set up with a front end web tier and some back end applications responsible for generating shipping orders. These are decoupled using SQS. When there is a surge in load, I notice more messages in the queue waiting to be processed. Currently my back end only has 2 EC2 instances. What can I do to elastically scale when I have surges in orders being received?
You would ideally use an Auto Scaling group for the backend. Scale events can be triggered by using a CloudWatch alarm on the queue's ApproximateNumberOfMessagesVisible metric, which triggers a scale out when the number of waiting messages breaches the threshold.
How many shards would I need to support 5MB/sec writes and 6MB/sec reads?
5 shards for writes and 3 for reads (1MB/sec write and 2MB/sec read per shard), so 5 shards cover both.
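The arithmetic can be written out as a short calculation using the per-shard limits of 1MB/sec writes and 2MB/sec reads. The function name is made up for this example.

```python
import math

WRITE_MB_PER_SHARD = 1  # MB/sec a single shard accepts
READ_MB_PER_SHARD = 2   # MB/sec a single shard serves

def shards_needed(write_mb_per_sec, read_mb_per_sec):
    """Return (shards for writes, shards for reads, shards to cover both)."""
    for_writes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD)
    for_reads = math.ceil(read_mb_per_sec / READ_MB_PER_SHARD)
    return for_writes, for_reads, max(for_writes, for_reads)

print(shards_needed(5, 6))  # (5, 3, 5)
```

The stream has to satisfy the larger of the two requirements, which here is the write side.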
If I needed to transform click stream data before delivering it to my consuming application, would I use Kinesis Streams or Kinesis Firehose? What underlying technology supports this?
Kinesis Firehose allows you to transform data in the stream. It uses synchronous Lambda invocations to operate on a buffered batch of stream data (up to 3MB).
What are the impacts if the visibility timeout is set at a high value (hours) or a very low value (a few seconds)?
If the timeout is set very high, then failed messages will take a very long time to reappear - for instance if the consumer crashes, the other consumers won’t pick up the message until the timeout expires. If the value is set too low, then we will likely get a lot of duplicates being processed.
What are the min, max and default values for the VISIBILITY timeout in SQS?
0 sec min
12 hrs max
30 sec default
Assume we have a courier business with 100 drivers nationwide. Each van has its own unique ID and uses this to stream in GPS data. For an SQS FIFO queue, how many group IDs could you create and how many consumers could you have?
We would have 1 FIFO queue with 100 group IDs and up to 100 consumers.
For Kinesis, what is used to control access and authorization? How is encryption in transit and at rest handled?
Access and authorization is handled by IAM. Encryption at rest is handled by KMS and encryption in flight is via HTTPS.