Decoupling Workflows Flashcards
What is meant by tight coupling?
One application (or instance) talking directly to another application (or instance).
What is meant by loose coupling?
Not having one application (or instance) talk directly to another. Instead, you want a highly available, scalable, managed service in between resources (like ELB)
What is Simple Queue Service (SQS)?
SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems and serverless applications
What is Simple Notification Service (SNS)?
SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication
What is API Gateway?
API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor and secure APIs at any scale
Which is preferred: tight or loose coupling?
Loose coupling is ALWAYS preferred. Don’t select answers that include instance-to-instance communication.
Where is it important for applications to be loosely coupled?
Every layer, both internal and external
What is poll-based messaging?
A producer of messages writes a message into a queue (SQS) and a consumer can retrieve the message whenever it is ready
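A minimal in-memory sketch of the poll-based pattern, using Python's standard `collections.deque` as a stand-in for an SQS queue. The names `produce` and `poll` are hypothetical and are not part of any AWS SDK.

```python
from collections import deque

# In-memory stand-in for a queue; a real SQS queue is a managed service.
queue = deque()

def produce(body):
    """Producer writes a message and moves on without waiting for a consumer."""
    queue.append(body)

def poll():
    """Consumer asks for the next message when it is ready; None if empty."""
    return queue.popleft() if queue else None

produce("order-1")
produce("order-2")
first = poll()   # consumer retrieves work on its own schedule
second = poll()
third = poll()   # nothing left to retrieve
```

The producer never calls the consumer directly, which is what keeps the two sides loosely coupled.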
What is Simple Queue Service (SQS)?
Simple Queue Service is a messaging queue that allows asynchronous processing of work. One resource will write a message to an SQS queue, and then another resource will retrieve that message from SQS.
What is the delivery delay in SQS?
The queue hides messages for this period of time before making them visible. Default is 0 and maximum is 15 minutes.
What is the message size in SQS?
Messages can be up to 256KB in any format
What encryption is provided by default in SQS?
Messages are encrypted in transit by default, and you can add encryption at rest
What is message retention setting in SQS?
The length of time that a message can be kept in SQS. Default is 4 days; can be set between 1 minute and 14 days.
What is the difference between long and short polling in SQS?
Short polling checks for messages and returns immediately if none exist, which burns extra CPU cycles on empty responses. Long polling isn’t the default in SQS, but it should be; enable it by setting the connection time window (the receive message wait time, up to 20 seconds).
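As a hedged sketch, long polling can be requested per call with the `WaitTimeSeconds` parameter of the SQS `ReceiveMessage` API (0 means short polling, 1 to 20 enables long polling). The helper below only builds the request parameters, so it runs without AWS credentials; the queue URL is a placeholder and `build_receive_request` is a hypothetical name.

```python
def build_receive_request(queue_url, wait_seconds=20, max_messages=10):
    """Build keyword arguments for an SQS ReceiveMessage call.

    wait_seconds=0 is short polling; 1-20 enables long polling.
    """
    if not 0 <= wait_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    if not 1 <= max_messages <= 10:
        raise ValueError("MaxNumberOfMessages must be between 1 and 10")
    return {
        "QueueUrl": queue_url,
        "WaitTimeSeconds": wait_seconds,
        "MaxNumberOfMessages": max_messages,
    }

# Placeholder URL for illustration only.
request = build_receive_request("https://sqs.example.invalid/123456789012/demo-queue")
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("sqs").receive_message(**request)
```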
What is the queue depth in SQS?
This is a value, not a setting. This can be a trigger for autoscaling.
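One way queue depth could drive scaling, sketched as plain arithmetic: divide the backlog by the number of messages one worker can handle. The function name, the per-worker capacity, and the bounds are illustrative assumptions, not AWS settings.

```python
import math

def desired_workers(queue_depth, msgs_per_worker, min_workers=1, max_workers=10):
    """Return a worker count sized to the current backlog, within bounds."""
    needed = math.ceil(queue_depth / msgs_per_worker)
    return max(min_workers, min(max_workers, needed))

small_backlog = desired_workers(queue_depth=40, msgs_per_worker=100)
medium_backlog = desired_workers(queue_depth=450, msgs_per_worker=100)
capped = desired_workers(queue_depth=5000, msgs_per_worker=100)
```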
What is the visibility timeout in SQS?
A lock on a message (default 30 seconds) that prevents other consumers from receiving it while one consumer is processing it.
If the consumer finishes processing within the visibility timeout, it deletes the message from the queue; otherwise the message becomes visible again and can be received by another consumer.
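The locking behavior can be illustrated with a small in-memory model. This is a sketch, not the AWS API: `TinyQueue` and its methods are hypothetical, and time is passed in explicitly as `now` (in seconds) so the example is deterministic.

```python
class TinyQueue:
    """In-memory model of a queue with a visibility timeout (not the AWS API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}      # message id -> body
        self.hidden_until = {}  # message id -> time the lock expires

    def send(self, msg_id, body):
        self.messages[msg_id] = body
        self.hidden_until[msg_id] = 0

    def receive(self, now):
        """Return the first visible message and hide it from other consumers."""
        for msg_id, body in self.messages.items():
            if self.hidden_until[msg_id] <= now:
                self.hidden_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """The consumer calls this after processing succeeds."""
        self.messages.pop(msg_id, None)
        self.hidden_until.pop(msg_id, None)

q = TinyQueue(visibility_timeout=30)
q.send("m1", "resize image")
first = q.receive(now=0)     # consumer A gets the message
blocked = q.receive(now=10)  # still locked, so consumer B gets nothing
retry = q.receive(now=31)    # A never deleted it, so it reappears
q.delete("m1")
gone = q.receive(now=100)    # a deleted message is not redelivered
```

The `retry` case is also how duplicates show up in practice: a consumer that is slower than the timeout, or that forgets to delete, causes the same message to be received again.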
If you are asked a question about what to look for if you are burning too much CPU (cash) with an SQS queue, what setting would you recommend looking at?
The connection time window (set it for long polling)
What is a dead letter queue in SQS?
An SQS queue that you can temporarily sideline messages into when they can’t be processed successfully from the main queue.
What is the dead letter queue setting, maximum receives, used for?
The maximum number of times the message can be read before it gets sidelined to the dead letter queue
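In SQS this setting lives in the source queue's `RedrivePolicy` attribute, a JSON string with `deadLetterTargetArn` and `maxReceiveCount`. The sketch below only builds that attribute, so it runs without AWS access; the ARN is a placeholder and `build_redrive_policy` is a hypothetical helper.

```python
import json

def build_redrive_policy(dlq_arn, max_receive_count=5):
    """Return the RedrivePolicy attribute value as a JSON string."""
    if max_receive_count < 1:
        raise ValueError("maxReceiveCount must be at least 1")
    return json.dumps({
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    })

# Placeholder ARN for illustration only.
policy = build_redrive_policy("arn:aws:sqs:us-east-1:123456789012:orders-dlq", 3)
attributes = {"RedrivePolicy": policy}
# With boto3 and credentials, the attributes could be applied to the source
# queue through boto3.client("sqs").set_queue_attributes, passing the queue's
# URL as QueueUrl and this dict as Attributes.
```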
What is an important metric to monitor on a dead letter queue in SQS?
Queue depth
Can you use dead letter queues with SNS?
Yes, on SNS topics
What type of message ordering is guaranteed with Standard SQS queues?
Best effort ordering
Is there a guarantee around not receiving duplicate messages with Standard SQS queue?
No
What type of SQS queue guarantees messages are delivered in order?
FIFO queue (and also guarantees no duplicates)
What is the guaranteed number of transactions per second processing in a Standard SQS queue vs a FIFO SQS queue?
A Standard queue has nearly unlimited transactions per second, whereas a FIFO queue supports up to 300 messages per second (without batching)
What is the message deduplication Id in FIFO SQS queues?
A unique value used to detect duplicate messages. The deduplication interval is a five-minute window: if the queue sees the same deduplication ID again within that window, it accepts the message successfully but does not deliver it to the consumer.
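The five-minute window can be modeled in a few lines. This is an in-memory sketch rather than the AWS API; `DedupFilter` is a hypothetical name and `now` is passed in as seconds so the behavior is easy to follow.

```python
DEDUP_WINDOW_SECONDS = 5 * 60

class DedupFilter:
    """In-memory model of a FIFO deduplication window (not the AWS API)."""

    def __init__(self):
        self.first_accepted = {}  # deduplication id -> time it was delivered

    def accept(self, dedup_id, now):
        """Return True if the message is delivered, False if it is a duplicate."""
        seen_at = self.first_accepted.get(dedup_id)
        if seen_at is not None and now - seen_at < DEDUP_WINDOW_SECONDS:
            return False  # the send succeeds, but nothing new is delivered
        self.first_accepted[dedup_id] = now
        return True

f = DedupFilter()
original = f.accept("pay-42", now=0)
duplicate = f.accept("pay-42", now=120)     # same id inside the window
other = f.accept("pay-43", now=130)         # different id is unaffected
after_window = f.accept("pay-42", now=301)  # window has expired
```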
If you are given a scenario where it is important that messages are processed in order, or they not be duplicates, what type of SQS queue would you recommend?
FIFO
If you are given a scenario where you must have thousands and thousands of transactions per second processing in an SQS queue, what type of queue would you recommend?
Standard (FIFO maxes out at 300 messages per second)
What is the message group id used for in FIFO queues in SQS?
It is the identifier that ensures that all messages with that same id are received and processed in order.
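A small sketch of what per-group ordering means: messages from different groups may interleave, but within one group the original order is preserved. The function `by_group` and the sample group IDs are hypothetical.

```python
def by_group(messages):
    """Split (group_id, body) pairs into per-group lists, preserving order."""
    groups = {}
    for group_id, body in messages:
        groups.setdefault(group_id, []).append(body)
    return groups

sent = [
    ("cust-A", "create"),
    ("cust-B", "create"),
    ("cust-A", "pay"),
    ("cust-A", "ship"),
]
ordered = by_group(sent)
```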
Which costs more, standard or FIFO queues in SQS?
FIFO
What is push-based messaging?
A producer of messages writes a message into a topic (SNS) and the topic delivers it to its subscribers whenever it is ready; the subscribers must be ready and waiting to receive the message
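For contrast with polling, here is a minimal in-memory sketch of push-based delivery. `Topic` is a hypothetical class, not the SNS API; the point is that the topic calls its subscribers, rather than subscribers asking for work.

```python
class Topic:
    """In-memory model of a topic that pushes to subscribers (not the AWS API)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        """Push the message to every subscriber immediately."""
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)

email_inbox, audit_log = [], []
topic = Topic()
topic.subscribe(email_inbox.append)
topic.subscribe(lambda m: audit_log.append(m.upper()))
delivered = topic.publish("disk usage high")
```

One publish reaches both subscribers, which is the one-to-many (fan-out) shape that topics provide.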
What services can subscribe to an SNS topic?
Kinesis Data Firehose, SQS, Lambda, email, HTTP(S), SMS (text), platform application endpoint
What is the maximum message size allowed in SNS?
256KB of text in any format
Does SNS support a Dead Letter Queue?
Yes, messages that fail to be delivered can be stored in an SQS DLQ
Does SNS offer retries of deliveries?
Yes, SNS retries failed deliveries, but a custom retry policy can be set only for HTTP(S) endpoints
What are the two types of topics in SNS?
Standard and FIFO (FIFO topics support only SQS queues as subscribers)
Is encryption offered by default for SNS?
Yes, messages are encrypted in-transit by default, but you can add at-rest
Can access be restricted to an SNS topic?
Yes, a resource policy can be added to an SNS topic
Can an SNS topic have multiple subscriptions?
Yes
If you set up an email subscription to a topic in SNS, will it work immediately?
No, the email address must accept delivery before messages will be sent
If you are given a scenario relating to alerts or notifications, what service would you recommend?
SNS
If you are given a scenario requiring push based notifications, what service would you recommend?
SNS
If you are given a scenario about sending emails to customers or subscribers, what service would you recommend?
SES (based around marketing emails)
If you are given a scenario about sending notifications from a CloudWatch alarm, what service would you recommend?
SNS
What is API Gateway?
Amazon API Gateway is a fully managed service that allows you to easily publish, create, maintain, monitor and secure your API. It allows you to put a safe “front door” on your application.
What are the key features of API Gateway?
- Security - allows you to easily protect your endpoints by attaching a web application firewall (WAF)
- Stop abuse - allows you to easily implement DDoS protection and rate limiting to curb abuse of your endpoints
- Ease of use
What are stages used for in API Gateway?
They are different versions of an API
What is the main purpose of API Gateway?
It is useful to front applications in AWS (aka “front door”)
If you are given an option of whether to hardcode secret access keys or use API Gateway, which would you choose?
API Gateway
How can you configure an API Gateway to prevent a DDoS attack?
Configure a web application firewall (WAF)
What is the AWS Batch Service?
An AWS service that helps you run batch computing workloads within AWS (on EC2 or ECS/Fargate)
It automatically provisions and scales workloads, making it simple to configure and maintain
No installation is required!
What are the four main components of AWS Batch?
- Job - unit of work that is submitted to AWS Batch (shell scripts, executables, Docker images, etc)
- Job Definitions - specify how your jobs are to be run (blueprint for resources)
- Job Queues - jobs get submitted to specific queues and reside there until scheduled to run in a compute environment
- Compute Environment - set of managed or unmanaged compute resources used to run your jobs
What type of compute is recommended for AWS Batch?
Fargate (because it launches and scales better)
When is EC2 the best choice for a compute environment in AWS Batch?
- When you need a custom AMI
- When you have specific vCPU requirements (anything needing more than 16 vCPUs)
- When you have memory requirements where you need more than 30GB of memory
- If you need a GPU or an Arm-based AWS Graviton CPU
- When you need Linux parameters
- For large numbers of jobs
When would you use AWS Batch instead of AWS Lambda?
- When a job runs longer than 15 minutes: AWS Lambda currently has a 15 minute timeout, and AWS Batch does not
- AWS Lambda has limited disk space
- AWS Lambda has limited runtimes (AWS Batch uses Docker so any runtime can be used)
What is the difference between a managed compute environment and an unmanaged?
Managed
- AWS manages capacity and instance types
- Compute resource specs are defined when an environment is created
- ECS instances are launched into a VPC subnet
- Default is the most recent and approved Amazon ECS AMI
- You can use your own AMI, but it has to meet the requirements of an Amazon ECS optimized image
- You can leverage Fargate, Fargate Spot, and regular Spot instances
Unmanaged
- You manage your own resources entirely
- AMI must meet ECS AMI specs
- Less commonly used
- Great for complex or specific requirements
If you are given a scenario about running a long-running, event driven workload, what service would you recommend (more than 15 minutes)?
AWS Batch
What is the Amazon MQ Service?
A managed message broker service that allows easier migration of existing applications to the cloud
What programming languages, operating systems and messaging protocols can be used in Amazon MQ Service?
A variety of programming languages, operating systems and messaging protocols can be used
What engine types are available in Amazon MQ Service?
Apache ActiveMQ and RabbitMQ engine types
Why would you use Amazon MQ vs SNS/SQS?
Topics and queues are offered in both that allow for one-to-one or one-to-many messaging designs. Therefore, if creating new applications, look at SNS/SQS because they are simpler.
And if you need public accessibility, you might consider SNS/SQS because they are publicly accessible by default, and Amazon MQ requires private networking like VPC, Direct Connect or VPN.
Also, Amazon MQ has NO default AWS integrations.
But if you are migrating existing applications with messaging systems in place, you might consider Amazon MQ.
What are the different ways to configure brokers in Amazon MQ?
- Single Broker Instance - one broker lives within one availability zone, which would be perfect for a dev environment (RabbitMQ has a network load balancer in front)
- Highly Available Architectures - depends on the broker engine type
- Amazon MQ for ActiveMQ - with active/standby deployments, one instance will remain available at all times (configure a network of brokers with separate maintenance windows)
- Amazon MQ for RabbitMQ - has cluster deployments and they are logical groupings of three broker nodes across multiple AZs sitting behind a network load balancer
If you are given a scenario where you must use JMS (or messaging protocols like AMQP 0-9-1, AMQP 1.0, MQTT, OpenWire or STOMP), what service would you recommend?
Amazon MQ
What is AWS Step Functions?
It is a serverless orchestration service combining different AWS services for business applications
What is a State Machine in AWS Step Functions?
A particular workflow with different event-driven steps
What is a Task in AWS Step Functions?
Specific states within a workflow (state machine) representing a single unit of work
What is a State in AWS Step Functions?
Every single step in a workflow (state machine) is considered a state
Leverage states to either make decisions based on input, perform certain actions or pass output
States are elements within your state machine, and each one is referred to by name
What are the two different types of workflows in AWS Step Functions?
Standard
- Have an exactly once execution
- Can run for up to a year
- Useful for long-running workflows that need to have an auditable history
- Rates up to 2,000 executions per second
- Pricing is based per state-transition
Express
- At-least-once workflow execution
- Can run for up to 5 minutes
- Useful for high event-rate workloads (IoT data streaming and ingestion)
- Pricing is based on number of executions, durations and memory consumed
What are Executions in AWS Step Functions?
Instances where you run your workflow in order to perform your tasks
In what language are States and State Machines defined in AWS Step Functions?
Amazon States Language (JSON-based)
What services integrate nicely with AWS Step Functions?
AWS Lambda, AWS Batch, Amazon DynamoDB, Amazon ECS/AWS Fargate, Amazon SQS, Amazon SNS, Amazon API Gateway, Amazon EventBridge, AWS Step Functions, and many more
What are the different states in AWS Step Functions?
- Pass - passes any input directly to its output; no work is done
- Task - Single unit of work
- Choice - branching logic (yes/no, etc.)
- Wait - creates a time delay
- Succeed - stops execution successfully
- Fail - stops execution and marks it as a failure
- Parallel - run parallel branches of executions within state machines
- Map - runs a set of steps based on elements of an input array
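Several of these state types can be seen together in a small Amazon States Language definition. The sketch below builds one as a Python dict and serializes it to JSON; the Lambda ARN, state names, and the `$.total` input field are placeholders chosen for illustration. The `transition_targets` helper is a hypothetical sanity check, not part of Step Functions.

```python
import json

# Placeholder Lambda ARN for illustration only.
PROCESS_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-order"

definition = {
    "Comment": "Wait, run a task, then branch on its result",
    "StartAt": "CoolDown",
    "States": {
        "CoolDown": {"Type": "Wait", "Seconds": 10, "Next": "ProcessOrder"},
        "ProcessOrder": {"Type": "Task", "Resource": PROCESS_ARN, "Next": "CheckTotal"},
        "CheckTotal": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 0, "Next": "Done"}
            ],
            "Default": "Rejected",
        },
        "Done": {"Type": "Succeed"},
        "Rejected": {"Type": "Fail", "Error": "EmptyOrder", "Cause": "Total was not positive"},
    },
}

def transition_targets(states):
    """Collect every state name that some transition points at."""
    targets = set()
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return targets

# Transitions that point at a state name that was never defined.
dangling = transition_targets(definition["States"]) - set(definition["States"])
asl_json = json.dumps(definition, indent=2)
```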
If you are given a scenario where you need to add a wait into a workflow, what service would you recommend?
AWS Step Functions
What is Amazon AppFlow?
Fully managed service that allows you to exchange data between SaaS apps and AWS services (Salesforce, etc.)
Allows you to pull data from third-party SaaS vendors and store it in S3
It is bi-directional with limited combinations
What is a Flow in Amazon AppFlow?
Flows transfer data between sources and destinations; a variety of SaaS applications are supported
What is a Data Mapping in Amazon AppFlow?
Determines how your source data is stored within your destinations
What are Filters in Amazon AppFlow?
Criteria to control which data is transferred from a source to a destination
What is a Trigger in Amazon AppFlow?
How the flow is started
Supported types:
1. Run on demand
2. Run on event
3. Run on schedule
What are use cases for using Amazon AppFlow?
- Transferring Salesforce records to Amazon Redshift
- Ingesting and analyzing Slack conversations in S3
- Migrating ZenDesk and other help desk support tickets to Snowflake
- Transferring aggregate data on a scheduled basis to S3 (100GB per flow)
If you are given a scenario where an application needs to reference large amounts of SaaS data regularly, and the data needs to be accessed within S3, what service would you recommend?
Amazon AppFlow
What are 4 questions to ask when answering questions regarding workloads?
- Are the workloads synchronous or asynchronous?
- What type of decoupling makes sense (pub/sub like SNS/SQS, or ordering of workflows like Step Functions, etc)?
- Does the order of messages matter?
- What type of application load are we going to see?
If you are given a scenario where you are seeing a large number of duplicate messages in a queue, what setting would you recommend looking at?
The queue's visibility timeout, or whether the consumer is failing to delete the message after processing
If you are given a scenario about batch workloads needing queues, what service would you recommend?
AWS Batch
If you are asked about an alternative solution to AWS Lambda due to runtime requirements or execution timeouts, what service would you recommend?
AWS Batch
If you are given a scenario that requires different states or logic during workflows (e.g. condition checks, failure catches or especially wait periods), what service would you recommend?
AWS Step Functions