AWS Integration & Messaging Flashcards
There are two patterns of application communication
1) Synchronous communications (application to application)
2) Asynchronous / Event based (application to queue to application)
Name the three models for decoupling your application
- using SQS: queue model
- using SNS: pub/sub model
- using Kinesis: real-time streaming model
SQS - How many messages can be in the queue?
Unlimited
SQS - What’s the size limitation per message sent?
256 KB
What is an SQS Delay Queue?
• Delays a message (consumers don’t see it immediately) by up to 15 minutes
• Default is 0 seconds (message is available right away)
What parameter overrides the queue’s default delay for an individual message?
DelaySeconds parameter
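A minimal sketch of how the per-message delay might be set. The helper name `build_send_params` is hypothetical; it only builds the keyword arguments that could be passed to an SDK call such as boto3's `send_message`, and it assumes the 0 to 900 second (15 minute) range stated above.

```python
def build_send_params(queue_url, body, delay_seconds=None):
    """Build keyword arguments for an SQS SendMessage call (illustrative helper)."""
    params = {"QueueUrl": queue_url, "MessageBody": body}
    if delay_seconds is not None:
        # 900 seconds = 15 minutes, the maximum delay mentioned above.
        if not 0 <= delay_seconds <= 900:
            raise ValueError("DelaySeconds must be between 0 and 900")
        params["DelaySeconds"] = delay_seconds
    return params


# The queue URL below is a placeholder, not a real queue.
delayed = build_send_params("https://example.invalid/my-queue", "hello", delay_seconds=60)
immediate = build_send_params("https://example.invalid/my-queue", "hello")
```

When `delay_seconds` is left out, the queue's own default delay applies.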
What can you expect to get back from SQS after sending a message?
- Message identifier
- MD5 hash of the body
What’s the lifecycle of consuming SQS messages?
- Poll SQS for messages (receive up to 10 messages at a time)
- Process the message within the visibility timeout
- Delete the message using the receipt handle that was returned with it
What is SQS - Visibility timeout?
When a consumer polls a message from a queue, the message is “invisible” to other consumers for a defined period: the visibility timeout.
What are the limits and the default for the SQS visibility timeout?
• Set between 0 seconds and 12 hours (default 30 seconds)
SQS - What’s the API to change the visibility timeout while processing a message?
ChangeMessageVisibility
What’s the API to tell SQS the message was successfully processed?
DeleteMessage
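The receive, process, delete cycle and the visibility timeout can be sketched with a toy in-memory queue. This is not the SQS API: `ToyQueue` and its methods are hypothetical, and time is passed in explicitly as `now` so the behaviour is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # message id -> body
        self._invisible_until = {}   # message id -> time at which it reappears
        self._ids = itertools.count(1)

    def send(self, body):
        message_id = next(self._ids)
        self._messages[message_id] = body
        self._invisible_until[message_id] = 0
        return message_id

    def receive(self, now, max_messages=10):
        """Return up to max_messages visible messages and hide them for the timeout."""
        batch = []
        for message_id, body in self._messages.items():
            if len(batch) == max_messages:
                break
            if self._invisible_until[message_id] <= now:
                self._invisible_until[message_id] = now + self.visibility_timeout
                batch.append((message_id, body))
        return batch

    def delete(self, message_id):
        """Acknowledge successful processing so the message never comes back."""
        self._messages.pop(message_id, None)
        self._invisible_until.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("order-1")
first = queue.receive(now=0)          # a consumer receives the message
hidden = queue.receive(now=10)        # still invisible to other consumers
redelivered = queue.receive(now=31)   # never deleted, so it is visible again
queue.delete(redelivered[0][0])
after_delete = queue.receive(now=100) # deleted messages are gone for good
```

A message that is received but not deleted before the timeout expires is delivered again, which is why consumers either delete promptly or extend the timeout.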
What is a Dead Letter Queue?
- If a consumer fails to process a message within the visibility timeout, the message goes back to the queue
- We can set a threshold for how many times a message can be received before it is set aside (maxReceiveCount, configured in the queue’s “redrive policy”)
- After the threshold is exceeded, the message goes into a dead letter queue (DLQ)
- We have to create the DLQ first and then designate it as the dead letter queue of the source queue
- Make sure to process the messages in the DLQ before they expire!
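As a rough sketch, the redrive policy is a JSON string set as the `RedrivePolicy` attribute on the source queue. The helper name and the ARN below are hypothetical; the resulting dictionary is the kind of value that could be passed as `Attributes` to an SDK call such as boto3's `set_queue_attributes`.

```python
import json


def build_redrive_attributes(dlq_arn, max_receive_count=5):
    """Build the queue attribute that points a source queue at its DLQ."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,        # the DLQ must already exist
        "maxReceiveCount": max_receive_count,  # receives allowed before moving to the DLQ
    }
    return {"RedrivePolicy": json.dumps(policy)}


# Placeholder ARN for illustration only.
attributes = build_redrive_attributes("arn:aws:sqs:us-east-1:123456789012:my-dlq", 3)
decoded = json.loads(attributes["RedrivePolicy"])
```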
Define AWS SQS - Long Polling.
- When a consumer requests messages from the queue, it can optionally “wait” for messages to arrive if there are none in the queue
- This is called Long Polling
- Long polling decreases the number of API calls made to SQS while increasing efficiency and reducing the latency of your application
- The wait time can be between 1 and 20 seconds (20 seconds preferable)
SQS Long polling can be enabled at the queue level or at the API level using what param?
WaitTimeSeconds (on the ReceiveMessage call; the queue-level attribute is ReceiveMessageWaitTimeSeconds)
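A back-of-the-envelope illustration of why long polling cuts API calls on an idle queue. The one-second short-polling interval is an assumption made for the comparison, and `calls_while_idle` is a hypothetical helper.

```python
import math


def calls_while_idle(idle_seconds, seconds_per_call):
    """Number of receive calls made while the queue stays empty."""
    return math.ceil(idle_seconds / seconds_per_call)


# One idle hour: polling every second versus waiting 20 seconds per call.
short_polling_calls = calls_while_idle(3600, 1)
long_polling_calls = calls_while_idle(3600, 20)
```

Under these assumptions the idle hour costs 3600 calls with short polling and 180 with a 20 second wait.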
SQS - FIFO has higher or lower throughput?
Lower throughput (up to 3,000 messages per second with batching, 300 per second without)
SQS - FIFO Message groups allow for what?
Messages can be grouped for FIFO ordering using “MessageGroupId”; ordering is guaranteed within each group
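A toy illustration of per-group ordering: messages keep their relative order within a group, while different groups are independent of each other. The function name and sample data are hypothetical.

```python
def group_in_order(messages):
    """Split (group_id, body) pairs into per-group lists, keeping arrival order."""
    groups = {}
    for group_id, body in messages:
        groups.setdefault(group_id, []).append(body)
    return groups


arrivals = [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]
by_group = group_in_order(arrivals)
```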
SNS - Up to ___ subs per topic
Up to 10,000,000 subscriptions per topic
SNS - What’s the topics limit?
100,000 topics limit
SNS - Who can subscribe?
- SQS
- HTTP / HTTPS (with configurable delivery retries)
- Lambda
- Emails
- SMS messages
- Mobile Notifications
SNS - How do you publish to a topic (from within your AWS server, using the SDK)?
- Create a topic
- Create a subscription (or many)
- Publish to the topic
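The three steps above can be sketched with a toy in-memory topic to show the pub/sub fan-out. `ToyTopic` is hypothetical and is not the SNS API; each subscriber is just a callable that receives the message.

```python
class ToyTopic:
    """In-memory pub/sub topic: every subscriber gets every published message."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)
        return len(self._subscribers)  # how many deliveries were made


email_inbox, queue_inbox = [], []
topic = ToyTopic("orders")             # 1) create a topic
topic.subscribe(email_inbox.append)    # 2) create a subscription (or many)
topic.subscribe(queue_inbox.append)
delivered = topic.publish("order created")  # 3) publish to the topic
```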
Explain AWS Kinesis
- Kinesis is a managed alternative to Apache Kafka
- Great for application logs, metrics, IoT, clickstreams
- Great for “real-time” big data
- Great for streaming processing frameworks (Spark, NiFi, etc…)
Kinesis data is automatically replicated to how many AZs?
Data is automatically replicated to 3 AZs
What are the three types of Kinesis?
- Kinesis Streams: low latency streaming ingest at scale
- Kinesis Analytics: perform real-time analytics on streams using SQL
- Kinesis Firehose: load streams into S3, Redshift, ElasticSearch…
Kinesis Streams are divided into?
Shards / Partitions
Kinesis Streams Data retention is what by default and can go up to how many days?
Data retention is 1 day (24 hours) by default; it can be extended (historically up to 7 days, and up to 365 days on current Kinesis Data Streams)
Kinesis Streams give you the ability to reprocess / replay data?
True
Kinesis Streams - Only a single app can consume a single stream. T / F?
False - Multiple applications can consume the same stream
Kinesis Streams - Once data is inserted in Kinesis, it can’t be deleted (immutability). T / F?
True
Kinesis Streams Shards - What size or how many messages per second?
1 MB/s or 1,000 messages/s at write, per shard
Kinesis Streams - How many MB/s at read?
2 MB/s at read, per shard
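The per-shard limits above translate into a simple sizing calculation. This is only a sketch: `shards_needed` is a hypothetical helper and the example workload is made up.

```python
import math


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1),    # 1 MB/s write per shard
        math.ceil(records_per_s / 1000),  # 1,000 records/s write per shard
        math.ceil(read_mb_per_s / 2),     # 2 MB/s read per shard
    )


# Example workload: 3.5 MB/s in, 2,500 records/s, 10 MB/s read by consumers.
example_shards = shards_needed(3.5, 2500, 10)
```

In this example the read side is the bottleneck, so it determines the shard count.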
Kinesis Streams Billing?
Billing is per shard provisioned, can have as many shards as you want
Kinesis Streams shard records are ___ per shard?
Ordered
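Ordering is per shard because records with the same partition key always land in the same shard. Kinesis maps the MD5 hash of the partition key onto the shards' hash key ranges; the sketch below assumes the 128-bit hash space is split evenly between shards, which is a simplification, and `shard_for_key` is a hypothetical helper.

```python
import hashlib
from collections import Counter


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # 128-bit integer
    range_size = 2**128 // shard_count
    return min(hash_value // range_size, shard_count - 1)


# Many distinct keys spread over the shards; one repeated key always hits one shard.
spread = Counter(shard_for_key(f"user-{i}", 4) for i in range(1000))
hot = Counter(shard_for_key("same-key", 4) for _ in range(1000))
```

The `hot` counter has a single entry, which is the "hot shard" situation that a poorly chosen partition key produces.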
How to reduce costs and increase throughput of Kinesis streams?
Use Batching with PutRecords
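A minimal sketch of client-side batching. A single PutRecords request accepts at most 500 records, so a larger list has to be split first; `chunk_records` is a hypothetical helper, and the per-request size limit in bytes is not handled here.

```python
def chunk_records(records, batch_size=500):
    """Split records into lists of at most batch_size, preserving order."""
    if not 1 <= batch_size <= 500:
        raise ValueError("batch_size must be between 1 and 500")
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


records = [{"Data": b"x", "PartitionKey": f"key-{i}"} for i in range(1200)]
batches = chunk_records(records)
batch_sizes = [len(batch) for batch in batches]
```

Each batch could then be sent with one PutRecords call instead of hundreds of individual PutRecord calls.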
AWS Kinesis API - ProvisionedThroughputExceeded exceptions. Explain this
- Happens when sending more data than a shard allows (exceeding the MB/s or records/s limit of any shard)
- Make sure you don’t have a hot shard (for example, a bad partition key that sends too much data to one partition)
Solutions:
- Retries with backoff
- Increase shards (scaling)
- Ensure your partition key is a good one
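The "retries with backoff" idea can be sketched as follows. `ThroughputExceeded` is a stand-in exception and `retry_with_backoff` a hypothetical helper, not part of any AWS SDK; the `sleep` argument is injectable so the example runs without actually waiting.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Stand-in for a throttling error such as ProvisionedThroughputExceeded."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=None):
    """Call operation, doubling the wait (plus random jitter) after each throttle."""
    sleep = time.sleep if sleep is None else sleep
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


attempts = {"count": 0}


def flaky_put():
    """Fails twice with a throttling error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThroughputExceeded("shard is throttled")
    return "ok"


delays = []
result = retry_with_backoff(flaky_put, sleep=delays.append)
```

Backoff only smooths over short bursts; a persistently hot shard still needs more shards or a better partition key.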
AWS Kinesis API - Consumers have two options: a normal consumer (CLI, SDK..) and…
- Can use a normal consumer (CLI, SDK, etc…)
- Can use Kinesis Client Library (in Java, Node, Python, Ruby, .Net)
- KCL uses DynamoDB to checkpoint offsets
- KCL uses DynamoDB to track other workers and share the work amongst shards
Kinesis Security - Control access / authorization using..
IAM policies
Kinesis Security - Encryption in flight using…
HTTPS endpoints
Kinesis Security - Encryption at rest using…
KMS
Is it possible to encrypt / decrypt data client side?
Yes, but it’s harder
What support is there for VPC access?
VPC endpoints are available so that Kinesis can be accessed from within a VPC
Explain AWS Kinesis Data Analytics
Perform real-time analytics on Kinesis Streams using SQL
• Auto Scaling
• Managed: no servers to provision
• Continuous: real time
• Pay for actual consumption rate
• Can create streams out of the real-time queries
Explain AWS Kinesis Firehose
- Fully Managed Service, no administration
- Near Real Time (60 seconds latency)
- Load data into Redshift / Amazon S3 / ElasticSearch / Splunk
- Automatic scaling
- Supports many data formats (pay for conversion)
- Pay for the amount of data going through Firehose
What is Amazon MQ?
- Managed Apache ActiveMQ. When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
- Amazon MQ doesn’t “scale” as much as SQS / SNS
- Amazon MQ runs on a dedicated machine, can run in HA with failover
- Amazon MQ has both queue feature (~SQS) and topic features (~SNS)