Module 11 - Serverless & Messaging Flashcards
What are the advantages of serverless architecture?
- No infrastructure to provision or manage
- No servers to provision, operate, or patch
- Scales automatically by unit of consumption rather than by server unit
- Pay-for-value billing model (pay for the unit rather than by server unit)
- Built-in availability and fault tolerance
- No need to architect for availability because it is built into the service
What can you do with the API gateway?
- Create, publish, maintain, monitor, and secure APIs.
- Connect your applications to AWS services and other public or private websites.
- Host and use multiple versions and stages of your APIs.
- Create and distribute API keys to developers.
- Use Signature Version 4 to authorize access to APIs.
- Use RESTful or WebSocket APIs.
What tasks does the API gateway handle?
traffic management, authorization and access control, monitoring, and API version management
How does the API gateway handle logging and metrics?
It integrates with CloudWatch, sends metrics and log messages there.
What metrics can the API gateway send to CloudWatch?
- Number of API calls
- Latency
- Integration latency
- HTTP 4XX and 5XX errors
- (also who has accessed your API and how it was accessed)
What is Amazon SQS? What does it do?
Amazon Simple Queue Service.
It’s a fully managed service that processes messages. It stores all message queues and messages within a single, highly available AWS Region with multiple redundant Availability Zones
What is the benefit of an SQS queue?
Unlimited throughput, unlimited messages
Isolates the processing logic into its own component and runs it in a separate process from the web application. This makes the system more resilient to spikes in traffic.
You can decouple preprocessing steps from compute steps and post-processing steps, which improves scalability and reliability.
What are some ways to use SQS queues?
Work queues: regular or FIFO
Buffering and batch operations: smooth out temporary volume spikes without losing messages or increasing latency.
Request offloading: Move slow operations off interactive request paths by enqueueing the request.
Auto-scaling instances: use the queue to determine the load, combine with autoscaling.
What are the SQS queue types?
Standard -
• at-least-once message delivery
• best-effort ordering
• nearly unlimited number of API calls per second
FIFO - when the order of operations and events is critical or where duplicates can’t be tolerated
• exactly-once processing
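• a limited number of API calls per second (300 per API action without batching)
• messages in the same message group are processed one at a time, in order (use different MessageGroupId values to let several consumers work in parallel)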
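A minimal sketch of the parameters a FIFO send needs, assuming the deduplication ID is derived from the message body. MessageGroupId and MessageDeduplicationId are SQS SendMessage parameters; the queue URL, group name, and helper name are hypothetical.

```python
import hashlib


def fifo_send_params(queue_url: str, body: str, group_id: str) -> dict:
    """Build keyword arguments for an SQS SendMessage call on a FIFO queue."""
    # Hashing the body means a retried send of the same content reuses the
    # same ID, so SQS can drop it as a duplicate.
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": group_id,  # ordering is kept within this group
        "MessageDeduplicationId": dedup_id,
    }


params = fifo_send_params(
    "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo",  # hypothetical
    '{"order": 42}',
    "customer-7",  # hypothetical group
)
# With boto3 this could be sent as: boto3.client("sqs").send_message(**params)
```

Using one group per customer keeps each customer's messages in order while different customers can be processed in parallel.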
What are some features of SQS queues?
distributed queue system
super low latency (<10ms response)
keeps messages from 1 minute to 14 days, but the default is 4 days
messages must be small: up to 256 KB of text in any format
supports multiple producers and consumers interacting with the same queue
How is an SQS message consumed?
The consumer polls for the message; the message is still in queue during processing.
Consumer can receive up to 10 messages at once.
Amazon SQS sets a visibility timeout so that other consumers don’t grab the same message (default 30 sec; max 12 hours)
When processed, the CONSUMER deletes the message. If processing fails, the message becomes visible again.
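The receive, process, delete cycle above can be sketched as follows. The function takes any client object exposing boto3's `receive_message` and `delete_message` calls (for example `boto3.client("sqs")`); the function name and the handler are hypothetical.

```python
def process_batch(sqs, queue_url: str, handler) -> int:
    """Receive up to 10 messages, handle each, and delete the ones that succeed."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        VisibilityTimeout=30,    # hide received messages from other consumers
    )
    deleted = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Not deleted: the message becomes visible again after the timeout.
            continue
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        deleted += 1
    return deleted
```

Deleting only after the handler returns is what gives at-least-once processing: a crash mid-way leaves the message in the queue.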
What is short polling?
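When the wait time is 0, set either on the call (ReceiveMessage with WaitTimeSeconds = 0) or on the queue (the ReceiveMessageWaitTimeSeconds attribute is 0).
This is the default queue behavior.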
What happens when you consume from a queue with short polling?
SQS samples a subset of servers and returns only those messages (so maybe you won’t get all messages)
If you have < 1000 messages OR you keep consuming, you will eventually get all your messages.
What happens when you consume from a queue with long polling?
Amazon SQS samples ALL servers. It can wait up to 20 seconds for messages to arrive before responding.
Reduces the COST of using Amazon SQS by reducing the number of calls.
Reduces latency, increases efficiency.
Can be enabled at the queue level, or at the API level using WaitTimeSeconds.
What should you do if you implement long polling with multiple queues?
Use one thread for each queue so your application can process the messages in each queue as they become available. Otherwise, your application will be blocked from the other queues.
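A minimal sketch of the one-thread-per-queue approach, assuming a client with boto3's `receive_message` and `delete_message` calls. The function names and the `max_polls` bound are hypothetical; a real worker would more likely loop until shutdown.

```python
import threading


def poll_queue(sqs, queue_url: str, handler, max_polls: int) -> None:
    """Long-poll a single queue; each receive call may block for up to 20 s."""
    for _ in range(max_polls):
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for message in response.get("Messages", []):
            handler(queue_url, message["Body"])
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )


def poll_all(sqs, queue_urls, handler, max_polls: int = 1) -> None:
    """Start one polling thread per queue so a quiet queue cannot block the rest."""
    threads = [
        threading.Thread(target=poll_queue, args=(sqs, url, handler, max_polls))
        for url in queue_urls
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```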
How does a dead-letter queue work?
After a message has been received more times than the source queue's maxReceiveCount ("Maximum receives") without being deleted, the message goes to the DLQ.
Works like any other queue. It must be in the same AWS account and region as the queues that use it.
You should set a high retention so you have time to debug.
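As a sketch, the queue attributes involved could be built like this. RedrivePolicy and MessageRetentionPeriod are SQS queue attributes; the ARN, the threshold of 5, and the helper names are hypothetical.

```python
import json


def source_queue_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Attributes for the source queue: where failed messages go, and when."""
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the redrive policy as a JSON string.
    return {"RedrivePolicy": json.dumps(policy)}


def dlq_attributes(retention_days: int = 14) -> dict:
    """Attributes for the DLQ itself: keep messages long enough to debug."""
    return {"MessageRetentionPeriod": str(retention_days * 24 * 60 * 60)}


source_attrs = source_queue_attributes(
    "arn:aws:sqs:us-east-1:123456789012:orders-dlq"  # hypothetical ARN
)
# With boto3: sqs.set_queue_attributes(QueueUrl=..., Attributes=source_attrs)
```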
What are some use cases that DON’T work for queues?
Selecting specific messages
Large messages
What if I want to pass giant messages?
Store them in S3 and just pass a reference to the message.
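One way to sketch the pass-a-reference pattern, assuming a JSON envelope of my own design (the `inline`, `s3_bucket`, and `s3_key` field names and the key prefix are hypothetical). The `s3` argument is any client with boto3's `put_object` call.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # size limit quoted in these notes


def build_message_body(s3, bucket: str, payload: str) -> str:
    """Return an SQS message body: the payload itself, or a pointer into S3."""
    inline = json.dumps({"inline": payload})
    if len(inline.encode("utf-8")) <= MAX_SQS_BYTES:
        return inline
    # Too large for the queue: store the payload and send only its location.
    key = f"sqs-payloads/{uuid.uuid4()}"
    s3.put_object(Bucket=bucket, Key=key, Body=payload.encode("utf-8"))
    return json.dumps({"s3_bucket": bucket, "s3_key": key})
```

The consumer checks which fields are present and fetches the object from S3 when it finds a pointer.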
What is Amazon SNS?
Amazon Simple Notification Service is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
Provides a low-cost infrastructure for the mass delivery of messages, predominantly to mobile users.
Pub-Sub model.
Subscribers get all messages (unless using a filter feature)
How do you set up SNS?
Create a topic and policies that indicate who can publish or subscribe.
SNS matches the topic to a list of subscribers for that topic and delivers the message to each subscriber.
Each topic has a unique name that identifies the Amazon SNS endpoint where publishers post messages, and where subscribers register for notifications.
Supports encrypted topics using KMS keys.
What are some use cases for SNS?
Alerts when events occur, like autoscaling.
Push SMS or email to news subscribers
Notifications to an app to indicate an update is available
What are the SNS notification types?
- HTTP or HTTPS
- SMS clients
- SQS queues
- Lambda function
- Kinesis Data Firehose
Characteristics of SNS
- best-effort order
- can’t recall a message after successful delivery
- You can use an Amazon SNS Delivery Policy to control the retry pattern
- To prevent messages from being lost, all messages are stored redundantly across multiple servers and data centers.
- an unlimited number of messages at any time.
- applications and end-users on different devices can receive notifications by Mobile Push notification
- access control mechanisms to ensure that topics and messages are secured against unauthorized access
What are some SNS retry policies?
Linear, geometric, exponential backoff, maximum and minimum retry delays, other patterns...
How do you set up SNS to deliver to an SQS for use in microservice architecture?
The SQS queue must subscribe to the topic. If you own the topic and the queue then that’s all.
If the queue owner doesn’t own the topic, then you need an explicit confirmation to the subscription request.
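Besides the subscription, the queue's access policy has to let the topic send messages to it. A minimal sketch of such a policy follows; the ARNs and the helper name are hypothetical.

```python
import json


def sns_to_sqs_policy(queue_arn: str, topic_arn: str) -> str:
    """Build a queue access policy that lets one SNS topic send to one queue."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict the permission to messages coming from this topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }
    return json.dumps(policy)


policy_json = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:orders",  # hypothetical queue
    "arn:aws:sns:us-east-1:123456789012:order-events",  # hypothetical topic
)
# With boto3: sqs.set_queue_attributes(QueueUrl=..., Attributes={"Policy": policy_json})
```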
What is an SNS fan-out scenario and when would you want that?
SNS message is sent to a topic and then replicated and pushed to multiple SQS queues, HTTP endpoints, or email addresses. This allows for parallel asynchronous processing.
Do this instead of sending the same message or request multiple times.
What are the differences between SNS and SQS?
SNS messages are not persistent (messages are discarded after retries are exhausted)
SNS pushes one to many, SQS is one-to-one.
SNS consumers are passive, SQS consumers are active (polling)
SQS allows messages to be delivered even when each component might not be available at the same time.
What is Kinesis?
A serverless streaming data service that makes it easy to capture, process, and store data streams in real-time at any scale
What Kinesis products are there?
Data streams (capture/process/store data streams)
Firehose (load into stores)
Data analytics (analyze streams with SQL or Flink)
Video streams (capture/process/store video streams)
How does Data Streams work?
You first create a stream and specify the number of shards (units of read/write capacity; each shard supports up to 1 MB/s of writes and 2 MB/s of reads). More shards = more throughput.
Producers (EC2 instance, IoT device, etc.) write to the stream. Records = Partition key + data blob (<=1MB)
Consumers read from a shard (more than one consumer can read from a shard)
Lambda can be a consumer
How does Firehose work?
Processes data in NEAR real-time. (60 seconds latency for non-full batches, or min 32MB data in each batch)
Serverless, no administration, auto-scales. (cf. Data Streams)
Producers send data
Data is batched and compressed
Firehose writes to the destination (applications, storage, analytics, HTTP endpoint you own, 3rd party services)
You can send failed or all data to a backup S3 bucket.
How does Firehose deliver data to RedShift?
1. Delivers incoming data to your S3 bucket
2. Issues an Amazon Redshift COPY command to load the data from your S3 bucket to your Amazon Redshift cluster
What is Amazon ES?
Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch APIs and real-time analytics capabilities. Firehose buffers incoming records, based on the buffering configuration of your delivery stream. Then it generates an Elasticsearch bulk request to index multiple records to your Elasticsearch cluster.
What are Step Functions?
AWS Step Functions is a low-code, visual workflow service that you use to build distributed applications, automate IT and business processes, and build data and machine learning pipelines.
You can directly call API actions from the Amazon States Language in Step Functions and pass parameters to the APIs of other services
What is a state machine?
A state machine is an object that has a set number of operating conditions that depend on its previous condition to determine output.
How can you create a state machine with Step Functions?
Use the JSON-based Amazon States Language, which defines a structure made of various states, tasks, choices, error handling, and more. Like Activiti.
What are the different states and functions your state machine can perform?
- Do work (a Task state)
- Make a choice between different branches to run (a Choice state)
- Stop with a failure or success (a Fail or Succeed state)
- Pass its input to its output or inject some fixed data (a Pass state)
- Provide a delay for a certain amount of time or until a specified time or date (a Wait state)
- Begin parallel branches (a Parallel state)
- Dynamically iterate steps (a Map state)
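A small Amazon States Language definition using four of these state types might look like the sketch below, written as a Python dict and serialized to JSON. The state names, the Lambda ARN, and the `$.total` field are hypothetical; the helper is only a sanity check that every transition points at a defined state.

```python
import json

definition = {
    "Comment": "Hypothetical order workflow",
    "StartAt": "CheckOrder",
    "States": {
        "CheckOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:check-order",
            "Next": "IsLargeOrder",
        },
        "IsLargeOrder": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 100, "Next": "WaitForReview"}
            ],
            "Default": "Done",
        },
        "WaitForReview": {"Type": "Wait", "Seconds": 60, "Next": "Done"},
        "Done": {"Type": "Succeed"},
    },
}


def undefined_targets(defn: dict) -> set:
    """Return state names that are referenced but never defined."""
    states = defn["States"]
    targets = {defn["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets - set(states)


asl_json = json.dumps(definition, indent=2)  # the text Step Functions would receive
```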
What are Express Workflows good for?
High-volume, event-processing workloads such as IoT data ingestion, streaming data processing and transformation, and mobile application backends.
At-least-once model; ideal for orchestrating idempotent actions such as transforming input data and storing using PUT in DynamoDB.
What is Amazon MQ?
A managed message broker service for Apache ActiveMQ and RabbitMQ. Industry-standard, so you can migrate to AWS without having to rewrite code.
Has queue and topic (SNS) features
Good for hybrid cloud and when modernizing applications.
NOT serverless. It runs open-source message brokers (ActiveMQ is written in Java).
Does not scale as much as SNS/SQS
Runs on dedicated machine, can run in HA with failover
Data stored redundantly across AZs.
What are the different types of brokers in Amazon MQ?
Single-instance
Active/standby for high availability - two brokers in two different Availability Zones, configured in a redundant pair. One is active, the other is on standby.
When should you choose MQ over SQS and SNS?
You are migrating to the cloud
You need other protocols besides HTTPS (for example AMQP, MQTT, STOMP, or OpenWire)
You need more features, not as concerned about throughput
What are some ways to decouple applications?
SQS: queue
SNS: pub/sub
Kinesis: real-time streaming data
These can all scale independently of the applications
Describe the security for SQS (and SNS).
In-flight encryption with HTTPS API
At rest encryption with KMS keys
Optional client-side encryption
IAM policies control access to the SQS API
SQS policies (work like bucket policies)
• good for cross-account access
• other services can write to the queue
A consumer is processing a message from an SQS queue, but it will take longer than the visibility timeout allows. What should the consumer do?
Call the API ChangeMessageVisibility to get more time.
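A minimal sketch of that call, assuming a client with boto3's `change_message_visibility` method. The helper name is hypothetical; the clamp reflects the 12-hour maximum mentioned earlier in these cards.

```python
MAX_VISIBILITY_SECONDS = 12 * 60 * 60  # 12-hour maximum


def extend_visibility(sqs, queue_url: str, receipt_handle: str, seconds: int) -> int:
    """Ask SQS to keep a message hidden for `seconds` more, counted from now."""
    timeout = max(0, min(seconds, MAX_VISIBILITY_SECONDS))
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=timeout,
    )
    return timeout
```

A long-running consumer could call this periodically as a heartbeat while it is still working on the message.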
I have some messages in a dead letter queue. I fixed the problem, and now I want to process those messages. How do I do that?
“Redrive to source” feature.
What is a delay in SQS?
A setting where consumers don’t see the message immediately (up to 15 mins). Default delay is 0.
You can override the delay on sending the message using the DelaySeconds parameter.
How do you implement the Request-Response System?
Use the SQS Temporary Queue Client. It creates virtual queues (cheaper) to implement the pattern.
How do you set up autoscaling for your SQS consumers?
You have to create a CUSTOM metric in CloudWatch that tracks queue length / # of instances (the backlog per instance). A CloudWatch alarm on that metric triggers scaling.
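The arithmetic behind that metric can be sketched as below. The function names and the example numbers are hypothetical; the queue length would typically come from the queue's ApproximateNumberOfMessages attribute.

```python
import math


def backlog_per_instance(visible_messages: int, running_instances: int) -> float:
    """Queue length divided by the number of consumers (at least 1)."""
    return visible_messages / max(running_instances, 1)


def desired_instances(visible_messages: int, acceptable_backlog: float) -> int:
    """Instances needed so each one holds no more than the acceptable backlog."""
    return max(1, math.ceil(visible_messages / acceptable_backlog))


# Example: 1500 queued messages, 10 instances. If each message takes 0.1 s and
# 10 s of latency is acceptable, one instance can hold a backlog of 100.
current = backlog_per_instance(1500, 10)
target_fleet = desired_instances(1500, 10 / 0.1)
```

Here the current backlog per instance (150) is above the acceptable 100, so the fleet would need to grow to 15 instances.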
Can SNS handle FIFO?
Yes, same as SQS. The subscriber can only be a FIFO SQS queue.
What is SNS message filtering?
A JSON policy is used to filter messages sent to a subscriber. If the subscriber doesn’t have a policy, then it receives every message.
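A simplified sketch of how such a policy is evaluated against a message's attributes. It covers exact string matching only; the real service also supports other operators. The attribute names and values are hypothetical.

```python
def matches(filter_policy: dict, attributes: dict) -> bool:
    """True if every policy key is present and its value is in the allowed list."""
    return all(
        attributes.get(key) in allowed for key, allowed in filter_policy.items()
    )


# Hypothetical policy: this subscriber only wants order events from one store.
policy = {
    "event_type": ["order_placed", "order_cancelled"],
    "store": ["example_corp"],
}

delivered = matches(policy, {"event_type": "order_placed", "store": "example_corp"})
dropped = matches(policy, {"event_type": "order_shipped", "store": "example_corp"})
```

An empty policy matches everything, which mirrors the rule that a subscriber without a policy receives every message.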
What are the characteristics of Kinesis Data Streams?
1 day to 365-day retention.
You can reprocess/replay data.
Once data is added, it can’t be deleted.
Data with the same partition key goes to same shard.
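The key-to-shard mapping can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and picks the shard whose hash key range contains it; this sketch assumes the shards split that range evenly, and the function name is hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose (evenly split) range holds the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    hash_value = int(digest, 16)
    return hash_value * shard_count // HASH_SPACE


first = shard_for_key("device-42", 4)
second = shard_for_key("device-42", 4)  # same key, so same shard
```

Because the mapping depends only on the key, a single very busy key always lands on one shard and can become a hot spot.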
Who can produce to Kinesis Data Streams?
Who can consume?
Produce: AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent
Consume:
• Kinesis Client Library (KCL), AWS SDK
• Lambda
• Firehose or Data Analytics
What are the capacity modes for Data Streams?
Provisioned: you choose the number of shards; use it if you know your capacity.
On-Demand: capacity scales automatically and you are charged based on use.
What are common destinations for Kinesis Firehose?
S3, Redshift (via S3 COPY command), Elasticsearch
3rd party (Datadog, Splunk, etc.) HTTP endpoints
What is the difference between Firehose and Data Streams?
Data Streams:
• ingest data at scale
• write custom code to produce/consume
• real time
• you manage scaling (add/reduce shards)
• data persists 1-365 days
• you can replay data
Firehose:
• load streaming data to somewhere else
• auto-scaling
• NEAR real time
• fully managed
• no data storage
What is Kinesis Data Analytics?
SQL for your Data Streams. Fully managed, autoscaled
Source: Data Streams or Firehose
You write SQL to process data in real-time
Result of analysis goes to Data Streams or Firehose.