SQS, SNS, Kinesis Flashcards
- Oldest offering (over 10 years old)
- Fully managed service, used to decouple applications
- Attributes:
- Unlimited throughput, unlimited number of messages in queue
- Default retention of messages: 4 days, maximum of 14 days
- Low latency (<10 ms on publish and receive)
- Limitation of 256KB per message sent
- Can have duplicate messages (at least once delivery, occasionally)
- Can have out of order messages (best effort ordering)
Amazon SQS – Standard Queue
Messages are produced to SQS using the ______?
SDK (SendMessage API)
The message is ___________ in SQS until a consumer deletes it
persisted
SQS – Producing Messages - Message retention
default 4 days, up to 14 days
What is the throughput of an SQS Standard queue?
unlimited throughput
SQS Consumers run on ____?
EC2 instances
On-premises servers
AWS Lambda
How many messages can an SQS consumer receive at a time?
Up to 10
SQS Consumer - Delete the messages using the __________ API
DeleteMessage
- Consumers receive and process messages in parallel
- At least once delivery
- Best-effort message ordering
- Consumers delete messages after processing them
- We can scale consumers horizontally to improve throughput of processing
SQS – Multiple EC2 Instances Consumers
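The consumer cards above (receive up to 10 messages per call, process, then delete) can be sketched with an in-memory stand-in for a queue. This is only an illustration, not the SQS API: `InMemoryQueue`, `receive`, `delete`, and `drain` are hypothetical names, and the visibility timeout is not modeled. With the real SDK these steps correspond to the ReceiveMessage and DeleteMessage APIs.

```python
import itertools
from collections import OrderedDict

MAX_BATCH = 10  # a single receive call returns at most 10 messages


class InMemoryQueue:
    """Hypothetical stand-in for an SQS queue (no visibility timeout modeled)."""

    def __init__(self):
        self._messages = OrderedDict()  # receipt handle -> body
        self._ids = itertools.count(1)

    def send(self, body):
        handle = f"h{next(self._ids)}"
        self._messages[handle] = body
        return handle

    def receive(self, max_messages=MAX_BATCH):
        # Clamp the request to the per-call limit of 10.
        limit = min(max_messages, MAX_BATCH)
        return list(itertools.islice(self._messages.items(), limit))

    def delete(self, handle):
        # A message stays in the queue until a consumer deletes it.
        self._messages.pop(handle, None)

    def __len__(self):
        return len(self._messages)


def drain(queue, handler):
    """Receive in batches, process each message, then delete it."""
    processed = 0
    while True:
        batch = queue.receive()
        if not batch:
            return processed
        for handle, body in batch:
            handler(body)
            queue.delete(handle)
            processed += 1


queue = InMemoryQueue()
for i in range(25):
    queue.send(f"order-{i}")

seen = []
total = drain(queue, seen.append)
print(total, len(queue))  # 25 0
```

Deleting only after the handler returns is what gives at-least-once behavior: a message whose handler fails is never deleted and can be received again.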
Amazon SQS - Encryption
- In-flight encryption using HTTPS API
- At-rest encryption using KMS keys
- Client-side encryption if the client wants to perform encryption/decryption itself
Amazon SQS - Access Controls
IAM policies to regulate access to the SQS API
2 uses for SQS Access Policies?
- Useful for cross-account access to SQS queues
- Useful for allowing other services (SNS, S3…) to write to an SQS queue
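As a rough illustration of the second use, the sketch below builds a queue access policy that lets an SNS topic write to an SQS queue. The helper name `sns_to_sqs_policy` and both ARNs (including the account ID) are placeholders invented for this example; the policy shape follows the usual IAM JSON layout, and the `aws:SourceArn` condition restricts writes to the one topic.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Build a queue access policy that lets one SNS topic send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only accept messages that originate from this topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


# Placeholder ARNs with a made-up account ID, for illustration only.
policy = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:example-queue",
    "arn:aws:sns:us-east-1:123456789012:example-topic",
)
print(json.dumps(policy, indent=2))
```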
After a message is polled by a consumer, it becomes _______ to other consumers
invisible
By default, the “message visibility timeout” is __________?
30 seconds
If a message is not processed within the visibility timeout, it will be processed ___________?
TWICE
A consumer could call the ____________ API to get more time
ChangeMessageVisibility
What happens if a visibility timeout is high (hours)?
If the consumer crashes, re-processing will take a long time
What happens if visibility timeout is too low (seconds)?
we may get duplicates
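The visibility timeout cards above can be traced with a small simulation. This is a sketch under stated assumptions: `VisibilityQueue` is a hypothetical in-memory class, time is passed in explicitly as `now` (in seconds) instead of using a real clock, and `change_visibility` only mirrors the idea behind the ChangeMessageVisibility API.

```python
class VisibilityQueue:
    """Hypothetical in-memory queue that models the visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout  # default is 30 seconds
        self._bodies = {}      # message id -> body
        self._visible_at = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body, now):
        self._next_id += 1
        self._bodies[self._next_id] = body
        self._visible_at[self._next_id] = now
        return self._next_id

    def receive(self, now):
        """Return one visible message and hide it for the timeout period."""
        for msg_id, visible_at in self._visible_at.items():
            if visible_at <= now:
                self._visible_at[msg_id] = now + self.visibility_timeout
                return msg_id, self._bodies[msg_id]
        return None

    def change_visibility(self, msg_id, timeout, now):
        # The new timeout counts from the moment of the call.
        self._visible_at[msg_id] = now + timeout

    def delete(self, msg_id):
        self._bodies.pop(msg_id, None)
        self._visible_at.pop(msg_id, None)


q = VisibilityQueue(visibility_timeout=30)
msg_id = q.send("resize image", now=0)

first = q.receive(now=0)     # consumer A gets the message
hidden = q.receive(now=10)   # consumer B sees nothing: still invisible
second = q.receive(now=31)   # timeout expired without a delete: redelivered

q.change_visibility(msg_id, timeout=120, now=40)  # ask for more time
still_hidden = q.receive(now=100)                 # hidden until t=160
q.delete(msg_id)                                  # processed: gone for good
```

The second delivery at `now=31` is the "processed twice" case: nothing was deleted within the 30 seconds, so the message became visible again.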
If someone wants to decrease latency, increase efficiency, and reduce the number of API calls made to an SQS queue, what should they do?
Long Polling
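To see why long polling reduces API calls, the sketch below counts how many receive calls a consumer makes before a single message, arriving at a known time, is returned. It is a simplified model with hypothetical names (`count_receive_calls`, `pause`): short polling is a call that returns immediately, long polling is a call that stays open for up to `wait_time` seconds (the service allows a wait of up to 20 seconds).

```python
def count_receive_calls(arrival_time, wait_time, pause=0):
    """Count receive calls made until a message arriving at arrival_time is returned.

    wait_time == 0 models short polling (each call returns immediately).
    wait_time > 0 models long polling (each call blocks up to wait_time seconds).
    pause is the consumer's sleep between calls; short polling needs one to
    avoid a busy loop.
    Returns (number_of_calls, seconds_the_message_waited_in_the_queue).
    """
    if wait_time == 0 and pause <= 0:
        raise ValueError("short polling needs a positive pause between calls")
    now, calls = 0, 0
    while True:
        calls += 1
        if arrival_time <= now + wait_time:
            # The message is present, or shows up while this call is open.
            latency = max(now, arrival_time) - arrival_time
            return calls, latency
        now += wait_time + pause


short = count_receive_calls(arrival_time=62, wait_time=0, pause=5)
long_ = count_receive_calls(arrival_time=62, wait_time=20)
print(short)  # (14, 3)
print(long_)  # (4, 0)
```

In this toy run the long-polling consumer makes 4 calls instead of 14 and picks the message up as soon as it arrives, which is the latency and cost benefit the card refers to.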
- Limited throughput: 300 msg/s without batching, 3000 msg/s with
- Exactly-once send capability (by removing duplicates)
- Messages are processed in order by the consumer
Amazon SQS – FIFO Queue
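The two FIFO properties above, duplicate removal and in-order processing, can be sketched as follows. `FifoQueueSketch` is a hypothetical in-memory model, not the SQS API: it assumes each message carries a deduplication ID and a message group ID, drops repeats of a deduplication ID seen within a 5-minute window, and keeps order per group. Receiving by group ID is a simplification made for the example.

```python
from collections import defaultdict, deque

DEDUP_WINDOW_SECONDS = 5 * 60  # duplicates are dropped within a 5-minute interval


class FifoQueueSketch:
    """Hypothetical model of FIFO deduplication and per-group ordering."""

    def __init__(self):
        self._groups = defaultdict(deque)  # message group ID -> ordered bodies
        self._seen = {}                    # deduplication ID -> time first accepted

    def send(self, body, group_id, dedup_id, now):
        """Accept the message unless the same dedup ID was seen recently."""
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate: dropped
        self._seen[dedup_id] = now
        self._groups[group_id].append(body)
        return True

    def receive(self, group_id):
        """Return the oldest message of a group, or None if it is empty."""
        group = self._groups.get(group_id)
        return group.popleft() if group else None


fifo = FifoQueueSketch()
accepted = [
    fifo.send("create", group_id="user-1", dedup_id="a", now=0),
    fifo.send("create", group_id="user-1", dedup_id="a", now=10),   # retry, dropped
    fifo.send("update", group_id="user-1", dedup_id="b", now=20),
    fifo.send("create", group_id="user-1", dedup_id="a", now=400),  # window passed
]
order = [fifo.receive("user-1") for _ in range(4)]
print(accepted)  # [True, False, True, True]
print(order)     # ['create', 'update', 'create', None]
```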
What if you want to send one message to many receivers?
Amazon SNS
- The “event producer” only sends messages to one SNS topic
- As many “event receivers” (subscriptions) as we want to listen to the SNS topic notifications
- Each subscriber to the topic will get all the messages (note: new feature to filter messages)
- Up to 12,500,000 subscriptions per topic
- 100,000 topics limit
Amazon SNS
What are the 2 types of publishing for AWS SNS?
Topic Publish (using the SDK)
Direct Publish (for mobile apps SDK)
Amazon SNS –
* Create a topic
* Create a subscription (or many)
* Publish to the topic
Topic Publish (using the SDK)
Amazon SNS –
* Create a platform application
* Create a platform endpoint
* Publish to the platform endpoint
* Works with Google GCM, Apple APNS, Amazon ADM…
Direct Publish (for mobile apps SDK)
What is SNS Encryption?
- In-flight encryption using HTTPS API
- At-rest encryption using KMS keys
- Client-side encryption if the client wants to perform encryption/decryption itself
2 uses for the SNS Access Policies
- Useful for cross-account access to SNS topics
- Useful for allowing other services (S3…) to write to an SNS topic
- Push once in SNS, receive in all SQS queues that are subscribers
- Fully decoupled, no data loss
- SQS allows for: data persistence, delayed processing and retries of work
- Ability to add more SQS subscribers over time
- Make sure your SQS queue access policy allows for SNS to write
- Cross-Region Delivery: works with SQS Queues in other regions
SNS + SQS: Fan Out
For the same combination of: event type (e.g. object create) and prefix (e.g. images/) you can only have one S3 Event rule.
What do you use if you want to send the same S3 event to many SQS queues?
Fan Out Method
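The fan-out pattern can be pictured with a toy publish/subscribe model: the producer publishes once to a topic, and every subscribed queue receives its own copy. `Topic` is a hypothetical class, plain Python lists stand in for SQS queues, and the event dictionary is an invented stand-in for an S3 event.

```python
class Topic:
    """Hypothetical stand-in for an SNS topic."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        # Push once; every subscribed queue gets its own copy.
        for queue in self.subscribers:
            queue.append(dict(message))


thumbnail_queue = []
analytics_queue = []

topic = Topic()
topic.subscribe(thumbnail_queue)
topic.subscribe(analytics_queue)

# Invented event payload: one S3 event rule publishes to the topic.
topic.publish({"event": "ObjectCreated", "key": "images/cat.png"})

print(len(thumbnail_queue), len(analytics_queue))  # 1 1
```

Because each queue holds its own copy, a slow or failing consumer on one queue does not affect the other, and a new queue can be subscribed later without touching the producer.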
With SNS FIFO topics, what type of SQS queues can you have as subscribers?
SQS FIFO
Does an Amazon SNS FIFO topic have limited throughput?
YES! Just like SQS FIFO
- JSON policy used to filter messages sent to SNS topic’s subscriptions
- If a subscription doesn’t have a filter policy, it receives every message
SNS – Message Filtering
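A filter policy can be thought of as a JSON object that maps attribute names to lists of accepted values. The sketch below models only exact string matching, which is a subset of what real filter policies support; the function name `matches` and the sample attributes are made up for illustration.

```python
def matches(filter_policy, attributes):
    """Return True if the message attributes satisfy the filter policy.

    Only exact matching is modeled: every key in the policy must be present
    in the attributes with a value that the policy lists.
    """
    if not filter_policy:
        return True  # no filter policy: the subscription receives every message
    return all(
        attributes.get(key) in allowed for key, allowed in filter_policy.items()
    )


placed_only = {"state": ["Placed"]}
placed_or_cancelled_eu = {"state": ["Placed", "Cancelled"], "region": ["eu"]}

message_attributes = {"state": "Cancelled", "region": "eu"}

print(matches(placed_only, message_attributes))             # False
print(matches(placed_or_cancelled_eu, message_attributes))  # True
print(matches(None, message_attributes))                    # True
```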
- Makes it easy to collect, process, and analyze streaming data in real-time
- Ingest real-time data such as: Application logs, Metrics, Website clickstreams, IoT telemetry data…
Kinesis
capture, process, and store data streams
Kinesis Data Streams
load data streams into AWS data stores
Kinesis Data Firehose
analyze data streams with SQL or Apache Flink
Kinesis Data Analytics
capture, process, and store video streams
Kinesis Video Streams
What are the 4 Kinesis Products?
Kinesis Data Streams
Kinesis Data Firehose
Kinesis Data Analytics
Kinesis Video Streams
Kinesis Data Streams - Retention
between 1 day and 365 days
Does Kinesis Data Streams have the ability to reprocess (replay) data?
YES
Once data is inserted in Kinesis, it ________?
can’t be deleted (immutability)
Does data that shares the same partition key go to the same shard?
YES
What is it called when data shares the same partition key and goes to the same shard?
Ordering
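The reason records with the same partition key land on the same shard is that the key is hashed (Kinesis uses MD5, giving a 128-bit integer) and each shard owns a range of that hash space. The sketch below assumes the ranges are split evenly across shards, which is an assumption for illustration; `shard_for` and the sample keys are hypothetical.

```python
import hashlib

HASH_SPACE = 2**128  # MD5 produces a 128-bit value


def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # hash_key < 2**128, so the result is always in [0, shard_count - 1].
    return hash_key * shard_count // HASH_SPACE


# The same key always maps to the same shard, so its records stay in order.
for key in ["truck-1", "truck-2", "truck-1", "truck-3", "truck-1"]:
    print(key, "-> shard", shard_for(key, shard_count=4))
```

A consequence worth noting: a very "hot" partition key sends all of its traffic to one shard, so keys should be chosen to spread load.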
What are 3 examples of Kinesis Data Streams Producers?
AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent
What are the 2 types of Kinesis Data Streams consumers, and their examples?
Write your own: Kinesis Client Library (KCL), AWS SDK
Managed: AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics
Kinesis Data Streams – Capacity Modes (2)
Provisioned mode
On-demand mode
Kinesis Data Streams – Capacity Modes:
* No need to provision or manage the capacity
* Default capacity provisioned (4 MB/s in or 4000 records per second)
* Scales automatically based on observed throughput peak during the last 30 days
* Pay per stream per hour & data in/out per GB
On-demand mode
Kinesis Data Streams – Capacity Modes:
* You choose the number of shards provisioned, scale manually or using API
* Each shard gets 1MB/s in (or 1000 records per second)
* Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
* You pay per shard provisioned per hour
Provisioned mode
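The per-shard limits above translate into a simple sizing calculation for provisioned mode. This is a back-of-the-envelope sketch: `shards_needed` is a hypothetical helper, and it treats the 2 MB/s read limit as shared by all consumers of a shard, which is an assumption that fits classic consumers.

```python
import math

WRITE_MB_PER_SHARD = 1       # 1 MB/s in per shard
RECORDS_PER_SHARD = 1000     # or 1000 records per second per shard
READ_MB_PER_SHARD = 2        # 2 MB/s out per shard


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )


# 3.5 MB/s in, 2500 records/s, 9 MB/s out -> max(4, 3, 5)
print(shards_needed(3.5, 2500, 9))  # 5
```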
- Control access / authorization using IAM policies
- Encryption in flight using HTTPS endpoints
- Encryption at rest using KMS
- You can implement encryption/decryption of data on client side (harder)
- VPC Endpoints available for Kinesis to access within VPC
- Monitor API calls using CloudTrail
Kinesis Data Streams Security
Fully Managed Service, no administration, automatic scaling, serverless
Kinesis Data Firehose
What are the 3 types of destinations (and examples) for Kinesis Data Firehose?
- AWS: Redshift / Amazon S3 / OpenSearch
- 3rd party partner: Splunk / MongoDB / DataDog / NewRelic / …
- Custom: send to any HTTP endpoint
How do you get charged for using Kinesis Data Firehose?
Pay for data going through Firehose
What is Kinesis Data Firehose latency?
- Near Real Time
- 60 seconds latency minimum for non full batches
- Or minimum 1MB of data at a time
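The "near real time" behavior comes from buffering: records are held until either a size threshold or a time interval is reached, whichever comes first. The sketch below models that rule with the thresholds from the card (1 MB, 60 seconds) as defaults; `DeliveryBuffer`, `add`, and `tick` are hypothetical names, and time is passed in explicitly as `now` in seconds.

```python
class DeliveryBuffer:
    """Hypothetical model of a size-or-time buffer in front of a destination."""

    def __init__(self, max_bytes=1024 * 1024, max_seconds=60):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self._records = []
        self._size = 0
        self._opened_at = None  # time the first buffered record arrived

    def add(self, record, now):
        """Buffer a record; return the flushed batch if a threshold is hit."""
        if self._opened_at is None:
            self._opened_at = now
        self._records.append(record)
        self._size += len(record)
        return self._flush_if_due(now)

    def tick(self, now):
        """Call periodically so a quiet stream still flushes on the interval."""
        return self._flush_if_due(now)

    def _flush_if_due(self, now):
        if not self._records:
            return None
        full = self._size >= self.max_bytes
        expired = now - self._opened_at >= self.max_seconds
        if not (full or expired):
            return None
        batch = self._records
        self._records, self._size, self._opened_at = [], 0, None
        return batch


buf = DeliveryBuffer(max_bytes=10, max_seconds=60)  # tiny size limit for the demo
r1 = buf.add(b"abcd", now=0)   # 4 bytes buffered
r2 = buf.add(b"efgh", now=1)   # 8 bytes buffered
r3 = buf.add(b"ij", now=2)     # 10 bytes: size threshold reached, flush
r4 = buf.add(b"x", now=5)      # new batch starts at t=5
r5 = buf.tick(now=64)          # 59 s elapsed: not yet
r6 = buf.tick(now=65)          # 60 s elapsed: flush the partial batch
```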
Does Kinesis Data Firehose support many data formats, conversions, transformations, and compression?
YES
Does Kinesis Data Firehose support custom data transformations using AWS Lambda?
YES
Where can Kinesis Data Firehose send failed or all data?
a backup S3 bucket
Kinesis Data Streams vs Firehose
Kinesis Data Streams:
* Streaming service for ingest at scale
* Write custom code (producer / consumer)
* Real-time (~200 ms)
* Manage scaling (shard splitting / merging)
* Data storage for 1 to 365 days
* Supports replay capability
Kinesis Data Firehose:
* Load streaming data into S3 / Redshift / OpenSearch / 3rd party / custom HTTP
* Fully managed
* Near real-time (buffer time min. 60 sec)
* Automatic scaling
* No data storage
* Doesn’t support replay capability
How is data sent into Kinesis?
using a Partition Key
Does the same key always go to the same shard?
YES
What is similar to Partition Key in SQS?
Group ID
- Consumer “pull data”
- Data is deleted after being consumed
- Can have as many workers (consumers) as we want
- No need to provision throughput
- Ordering guarantees only on FIFO queues
- Individual message delay capability
SQS
- Push data to many subscribers
- Up to 12,500,000 subscribers
- Data is not persisted (lost if not delivered)
- Pub/Sub
- Up to 100,000 topics
- No need to provision throughput
- Integrates with SQS for fan-out architecture pattern
- FIFO capability for SQS FIFO
SNS