SQS SNS KINESIS Flashcards
SQS
Standard Queue overview
default attributes
Oldest offering
Unlimited throughput
Short-lived messages: retention DEFAULT 4 days, MAX 14 days
Low latency: <10 ms on publish/receive
Max 256KB per message
May have duplicate messages, AT LEAST ONCE DELIVERY
Can be out of order, BEST EFFORT ordering
SQS messages
Producers
API to send message
Producers send messages to SQS via the SDK
- API: *SendMessage*
- Max 256KB per message
- Retention in queue: default 4 days, max 14 days
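A minimal producer sketch using Python (boto3); the queue URL is a placeholder:

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# SendMessage: body must be at most 256KB
response = sqs.send_message(
    QueueUrl=queue_url,
    MessageBody="order-1234 created",
)
print(response["MessageId"])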
SQS messages: Consumers
types of consumers
how messages are retrieved, how many at a time
what happens to processed messages
APIs for receiving and deleting
Consumers need to be applications that run code
- On premise
- AWS lambda
- EC2
Polls- Request messages from SQS, can receive up to 10 messages at a time per poll
API: ReceiveMessage
Consumer will process message, then delete message in Queue
API: DeleteMessage
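A minimal consumer loop sketch in boto3 (queue URL is a placeholder; process() is a hypothetical handler):

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# ReceiveMessage: poll up to 10 messages per call
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)

for msg in resp.get("Messages", []):
    process(msg["Body"])  # hypothetical processing function
    # DeleteMessage: delete only after successful processing
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])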
SQS Multiple Consumers
AND
ASG and SQS
- ASG horizontal scaling via which alarm?
Can have multiple consumers receive messages in parallel.
At-least-once delivery and best-effort ordering apply because of this parallel consumption.
Horizontal scaling is done by adding EC2 instances to the consumer group.
Consumers run inside an ASG; the EC2 instances scale according to a metric tied to queue length.
Queue length (CloudWatch metric ApproximateNumberOfMessagesVisible) attached to a CloudWatch alarm triggers the ASG to scale the EC2 fleet to handle the backlog (sketch below).
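A sketch of the CloudWatch alarm on queue length that drives ASG scaling, assuming a queue named my-queue and a placeholder scaling-policy ARN:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the visible backlog exceeds 100 messages; the alarm action
# points at an ASG scaling policy (placeholder ARN)
cloudwatch.put_metric_alarm(
    AlarmName="sqs-backlog-high",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": "my-queue"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=2,
    Threshold=100,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:placeholder"],
)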
SQS Decouple Application tiers
Front end and back end decouple
Request goes to front end
Front end sends the request to the SQS queue via SendMessage
Backend polls the SQS queue independently; ReceiveMessage pulls tasks onto the backend, which scales according to SQS queue length.
Backend sends final processed files to final destination.
SQS security
Encryption in flight
Encryption at rest
Client side
Access control
SQS access policies
what is it like
what is it used for
Encryption in flight - HTTPS API
Encryption at rest - KMS
Client-side encryption - the client must perform it themselves
Access control - IAM policies regulate access to the SQS API
SQS access policies (like bucket policies)
Useful for cross-account access to SQS queues
Allows other services to write to the SQS queue (policy sketch below)
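A sketch of attaching an access policy to a queue with boto3; the account IDs and ARNs are placeholders:

import boto3, json

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# Allow another account (placeholder) to send messages to this queue
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::999999999999:root"},
        "Action": "sqs:SendMessage",
        "Resource": "arn:aws:sqs:us-east-1:123456789012:my-queue",
    }],
}
sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})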
SQS: Message Visibility Timeout
purpose
how long it lasts
what other consumers see / can do after it expires
API to extend
consequences of a long / short timeout
Messages that are pulled from the queue to be processed become INVISIBLE so other consumers cannot pull them.
VISIBILITY TIMEOUT: default is 30 seconds; the consumer has 30 seconds to process the message before it becomes visible to other consumers again.
after 30 seconds it becomes VISIBLE
*** IF NOT PROCESSED IN TIME, IT MAY BE PROCESSED TWICE!
API: ChangeMessageVisibility allows you to allot more time to process the message after a consumer retrieves it via ReceiveMessage (sketch below)
Consequences
Long timeout: reprocessing is delayed if the consumer crashes
Short timeout: may get duplicate processing
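A sketch of extending the visibility timeout for a message that needs more processing time (queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
for msg in resp.get("Messages", []):
    # Processing will take longer than the default 30 seconds: ask for 2 more minutes
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=msg["ReceiptHandle"],
        VisibilityTimeout=120,
    )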
SQS Dead Letter Queue
DLQ
A separate queue is created, this queue can be designated as a dead letter queue.
If a message is not processed within the visibility timeout, it goes back into the queue; this can happen multiple times for the same message.
MaximumReceives sets the threshold for how many times a message can return to the queue before it is moved to a separate DLQ (configured via the queue's RedrivePolicy; sketch below).
The DLQ is useful for debugging and later reprocessing.
Messages in the DLQ expire like in a normal SQS queue, so it is good to set a long retention (e.g., 14 days) on the DLQ.
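A sketch of wiring a DLQ to a main queue; the main queue URL is a placeholder, and MaximumReceives corresponds to maxReceiveCount in the RedrivePolicy:

import boto3, json

sqs = boto3.client("sqs")

# Create the DLQ and look up its ARN
dlq_url = sqs.create_queue(
    QueueName="my-queue-dlq",
    Attributes={"MessageRetentionPeriod": "1209600"},  # 14 days
)["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# After 3 receives without deletion, the message is moved to the DLQ
main_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder
sqs.set_queue_attributes(
    QueueUrl=main_url,
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
        )
    },
)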
SQS Delay Queue
purpose
default, max
API: what is it, and what is it used for?
Delay messages so consumers cannot see them immediately
Default 0, MAX 15 min (the queue-level setting applies to all messages)
API: DelaySeconds - per-message delay set on the SendMessage call
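A sketch of both ways to set a delay: a queue-level default via the DelaySeconds attribute and a per-message DelaySeconds on SendMessage:

import boto3

sqs = boto3.client("sqs")

# Queue-level default: every message is hidden for 60 seconds
queue_url = sqs.create_queue(
    QueueName="my-delay-queue", Attributes={"DelaySeconds": "60"}
)["QueueUrl"]

# Per-message override: 0-900 seconds (up to 15 minutes)
sqs.send_message(QueueUrl=queue_url, MessageBody="hello", DelaySeconds=300)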
SQS: developer concepts - LONG POLLING
what kind of polling is the default
where is it set
what API
Not like the Delay Queue; this is on the consumer side.
Long polling doesn’t return a response until a message arrives in the message queue
Allows less API Load, less expensive
Time can be 1 - 20 seconds
The ReceiveMessage request waits up to this time for messages to arrive, returning as soon as messages are available on the queue.
Long polling is preferable to short polling.
Enabled at
Queue level
API level: WaitTimeSeconds.
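A sketch of enabling long polling, both at the queue level and per ReceiveMessage call (queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# Queue level: every consumer waits up to 20 seconds by default
sqs.set_queue_attributes(
    QueueUrl=queue_url, Attributes={"ReceiveMessageWaitTimeSeconds": "20"}
)

# API level: this call blocks up to 20 seconds if the queue is empty
resp = sqs.receive_message(
    QueueUrl=queue_url, WaitTimeSeconds=20, MaxNumberOfMessages=10
)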
SQS: developer concepts
Extended Client
The message size limit is 256KB; the Extended Client helps send bigger messages.
USE: SQS Extended Client (Java library)
Instead of sending the large message, the producer sends a SMALL METADATA MESSAGE that references the large payload stored in an Amazon S3 bucket.
The consumer reads the small metadata message from the SQS queue, which directs it to read the actual data from S3.
ex: Video file processing
You can use the Amazon SQS Extended Client Library for Java to do the following:
Specify whether messages are always stored in Amazon S3 or only when the size of a message exceeds 256 KB
Send a message that references a single message object stored in an S3 bucket
Retrieve the message object from an S3 bucket
Delete the message object from an S3 bucket
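The Extended Client itself is a Java library; the sketch below reproduces the same pattern manually with boto3, assuming a placeholder bucket, queue URL, and a hypothetical large_payload variable:

import boto3, json, uuid

s3 = boto3.client("s3")
sqs = boto3.client("sqs")
bucket = "my-large-payload-bucket"  # placeholder bucket
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

# Producer: store the large payload in S3 and send only a small pointer message
key = f"payloads/{uuid.uuid4()}"
s3.put_object(Bucket=bucket, Key=key, Body=large_payload)  # hypothetical payload bytes
sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps({"bucket": bucket, "key": key}))

# Consumer: read the pointer message, then fetch the real payload from S3
resp = sqs.receive_message(QueueUrl=queue_url)
for msg in resp.get("Messages", []):
    ref = json.loads(msg["Body"])
    payload = s3.get_object(Bucket=ref["bucket"], Key=ref["key"])["Body"].read()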
SQS: developer concepts
API
CreateQueue
(MessageRetentionPeriod)
DeleteQueue
PurgeQueue
SendMessage
(DelaySeconds)
ReceiveMessage
DeleteMessage
ReceiveMessageWaitTimeSeconds
ChangeMessageVisibility
BATCH: SendMessage, DeleteMessage, ChangeMessageVisibility
MaximumReceives
CreateQueue - create a queue
(MessageRetentionPeriod) - set how long messages are retained
DeleteQueue - delete the queue itself along with its contents
PurgeQueue - delete all messages in the queue
SendMessage- as producer send message
(DelaySeconds)- delay for each message
ReceiveMessage - consumer uses this to receive messages
DeleteMessage - consumer wants to delete processed message
ReceiveMessageWaitTimeSeconds: For long polling, wait for receiving messages if queue empty
ChangeMessageVisibility: more time to process, change visibility timeout
BATCH: SendMessage, DeleteMessage, ChangeMessageVisibility - batching multiple entries into one request helps decrease costs (example below).
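A sketch of a batch call: one SendMessageBatch request (up to 10 entries) instead of several SendMessage calls (queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

sqs.send_message_batch(
    QueueUrl=queue_url,
    Entries=[
        {"Id": "1", "MessageBody": "task-1"},
        {"Id": "2", "MessageBody": "task-2"},
        {"Id": "3", "MessageBody": "task-3", "DelaySeconds": 60},
    ],
)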
AWS SQS FIFO QUEUE
overview
speed
First in First Out
Ordering of Messages exact
Throughput: 300 messages/s without batching, 3000 messages/s with batching
Exactly-once send capability (duplicates are removed)
Messages are processed in order by the consumer
Use case: decoupling applications when ordering must be maintained and the throughput constraint is acceptable
AWS SQS FIFO QUEUE
Deduplication
interval
2 types of rejection.
Deduplication interval 5 minutes
- Same message twice within 5 minutes will cause 2nd to be rejected
two methods
- Content-based: a SHA-256 hash of the message body; a matching hash is rejected
- Message deduplication ID: if the same ID is encountered within 5 minutes, the message is dropped
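A sketch of FIFO deduplication with boto3: content-based dedup at queue creation plus an explicit deduplication ID on send (names are placeholders):

import boto3

sqs = boto3.client("sqs")

# FIFO queue names must end in .fifo; content-based dedup hashes the body with SHA-256
fifo_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "true"},
)["QueueUrl"]

# A second send with the same MessageDeduplicationId within 5 minutes is dropped
sqs.send_message(
    QueueUrl=fifo_url,
    MessageBody="order-1234 created",
    MessageGroupId="orders",            # required for FIFO queues
    MessageDeduplicationId="order-1234",
)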
AWS SQS FIFO QUEUE
Message Grouping
one consumer
groupings
MessageGroupID: mandatory parameter; if you specify one value for all messages, they all go to one consumer
Grouping at the level of a subset of messages:
Specify different values for MessageGroupID
- messages sharing the same MessageGroupID are grouped together
- each separate ID can have a separate consumer
- ORDERING IS NOT GUARANTEED BETWEEN GROUPS
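A sketch of message grouping: messages with different MessageGroupId values can be consumed by different consumers, with ordering guaranteed only within each group (queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")
fifo_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"  # placeholder

for group, body in [("user-A", "A-1"), ("user-B", "B-1"), ("user-A", "A-2")]:
    sqs.send_message(
        QueueUrl=fifo_url,
        MessageBody=body,
        MessageGroupId=group,                     # ordering is per group
        MessageDeduplicationId=f"{group}-{body}",
    )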
SNS overview
Event Producer:
Event Receiver: Subscriptions
Subscribers can be sent messages via
One Message many receivers
Direct integration: one to many is cumbersome
Pub/Sub: One to a topic, people will subscribe to this topic and adding more just needs to allow more subscriptions
Event Producer: Send message to SNS topic
Event Receivers: subscriptions listen to the SNS topic notifications; VERY highly scalable.
Subscribers can be: SQS, HTTP/HTTPS, Lambda, Email, SMS, Mobile push
SNS Integration with services
examples of some
- CloudWatch: alarms
- Auto Scaling Group: notifications of changes
- S3: bucket events
- CloudFormation: state changes
- etc.
SNS publishing
TOPIC publish
DIRECT publish
TOPIC publish to SNS (USE SDK)
- create a topic
- create subscription, or many
- publish to topic
DIRECT publish (for mobile apps SDK)
- create a platform application
- create a platform endpoint
- publish to the platform endpoint
Works with third-party push notification services to deliver notifications.
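A sketch of TOPIC publish with boto3 (topic name and email endpoint are placeholders):

import boto3

sns = boto3.client("sns")

# Create a topic, create a subscription, publish to the topic
topic_arn = sns.create_topic(Name="my-topic")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")  # placeholder
sns.publish(TopicArn=topic_arn, Subject="status", Message="deployment finished")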
SNS Security
encryption types
access control
SNS ACCESS POLICIES:
Similar to SQS
Encryption in flight: HTTPS by default
Encryption at rest: KMS
Client-side encryption: operations need to be done by the client themselves
Access controls: IAM policies to regulate access to SNS API
SNS ACCESS POLICIES: like s3 bucket policies
- good for cross account access to SNS topics
- good for access to other services like S3 to write to SNS topic.
SNS SQS fanout pattern
process
why is it used
what kind of SQS queues can this NOT work for.
Send same message to many different SQS queues:
-> Push the message to an SNS topic and have the SQS queues subscribe to the topic, so one message reaches multiple SQS queues in parallel.
Fully decoupled, NO DATA LOSS
- Used for: data persistence, delayed processing, and retries
- CAN add more SQS subs over time.
- SQS queue needs access policy for SNS to write
SNS CANNOT SEND MESSAGES TO FIFO QUEUE
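A sketch of the fanout wiring with boto3: create the topic and a standard queue, grant SNS write access via the queue access policy, then subscribe the queue (names are placeholders):

import boto3, json

sns = boto3.client("sns")
sqs = boto3.client("sqs")

topic_arn = sns.create_topic(Name="orders-topic")["TopicArn"]
queue_url = sqs.create_queue(QueueName="orders-audit")["QueueUrl"]
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Access policy so SNS can write to the SQS queue
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
    }],
}
sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})

# Subscribe the queue; every publish now fans out to all subscribed queues
sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
sns.publish(TopicArn=topic_arn, Message="order-1234 created")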
KINESIS OVERVIEW
Good for
what kind of data
availability
Great for application logs, metrics, IoT, clickstreams
REAL TIME BIG DATA
Good fit for stream processing frameworks
AUTOMATICALLY replicated to 3 AZ’s
Kinesis products overview
- Kinesis stream
- Kinesis Analytics
- Kinesis Firehose
main focus on 1.
1. Kinesis Streams - low-latency streaming INGEST
2. Kinesis Analytics - real-time analytics on streams with SQL
3. Kinesis Firehose - load streams into S3, Redshift, and ElasticSearch
Kinesis diagram overview
flow of data to storage
- Data flows into Kinesis streams
- Streams feeds into Kinesis Analytics for processing
- after processing, the output is loaded via Kinesis Firehose into the destination data stores
KINESIS STREAMS
Shards
Default retention time, MAX retention time
IMMUTABLE data
Producers using Streams load data into a scalable system of shards; more shards can be added to handle more data
*Data is retained 1 day by DEFAULT, 7 days MAX
so you should process this data quickly
*Ability to REPROCESS and REPLAY data
*Multiple applications can consume this stream
SCALABLE CONSUMERS
*IMMUTABLE: once inserted, the data cannot be deleted
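A sketch of a producer putting a record into a stream with boto3; the stream name is a placeholder, and the partition key determines which shard the record lands on:

import boto3, json

kinesis = boto3.client("kinesis")

kinesis.put_record(
    StreamName="clickstream",  # placeholder stream name
    Data=json.dumps({"user": "u-1", "page": "/home"}).encode(),
    PartitionKey="u-1",        # records with the same key go to the same shard
)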