Decoupling Workflows Flashcards

1
Q

What is SQS?

A

Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.

  • Allows asynchronous processing of work. One resource will write a message to an SQS queue, and then another resource will retrieve that message from SQS.
  • At-least-once delivery for each message in a standard queue.
  • Supports resource policies.
2
Q

What is SNS?

A

Simple Notification Service (SNS) is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.

  • SNS is a push-based messaging service. It proactively delivers messages to the endpoints subscribed to it. This can be used to alert a system or a person.
  • Delivery retries provide reliable delivery.
  • Cross-account access is granted via a topic policy.
  • Supports cross-region delivery (for example, to SQS queues or Lambda functions in another region).
3
Q

What is API Gateway?

A

API Gateway is a fully managed service that makes it easy for developers to
create, publish, maintain, monitor, and secure APIs at any scale.

4
Q

What are the SQS Settings?

A
  • Delivery Delay: Default is 0; can be set up to 15 minutes.
  • Message Size: Messages can be up to 256 KB of text in any format.
  • Encryption: Messages are encrypted in transit by default; encryption at rest can be added.
  • Message Retention: Default is 4 days; can be set between 1 minute and 14 days.
  • Long vs. Short Polling: Long polling isn’t the default, but it should be.
  • Queue Depth: This can be a trigger for autoscaling.
5
Q

What is the difference between long and short polling in SQS?

A
  • Short polling returns a response immediately, even if the queue is empty. This means that if there are no messages in the queue, the consumer will receive an empty response and will need to poll again.
  • Long polling waits for a message to arrive in the queue before returning a response, up to the configured wait time (max 20 seconds). The consumer receives an empty response only if the wait time expires with no messages, so there are far fewer empty responses than with short polling.
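
The difference can be sketched with a toy in-memory queue and a simulated clock. This is only an illustration, not the SQS API: `SimulatedQueue`, `count_empty_polls`, the message arriving at t=45s, and the 1-second sleep between short polls are all assumptions made for the example.

```python
class SimulatedQueue:
    """Toy in-memory stand-in for an SQS queue (not the AWS API)."""

    def __init__(self):
        self.now = 0.0        # simulated clock, in seconds
        self._arrivals = []   # (arrival_time, body), kept sorted by time

    def send_at(self, arrival_time, body):
        self._arrivals.append((arrival_time, body))
        self._arrivals.sort()

    def receive(self, wait_time_seconds=0):
        """Return a message body, or None for an empty response.

        wait_time_seconds=0 models short polling; 1..20 models long polling.
        """
        if not 0 <= wait_time_seconds <= 20:
            raise ValueError("wait time must be between 0 and 20 seconds")
        deadline = self.now + wait_time_seconds
        if self._arrivals and self._arrivals[0][0] <= deadline:
            arrival_time, body = self._arrivals.pop(0)
            # The call returns as soon as the message is available.
            self.now = max(self.now, arrival_time)
            return body
        # Nothing arrived in time: the call returns empty at the deadline.
        self.now = deadline
        return None


def count_empty_polls(wait_time_seconds):
    """Poll until the single message (arriving at t=45s) is received."""
    queue = SimulatedQueue()
    queue.send_at(45.0, "order-1")
    empty = 0
    while queue.receive(wait_time_seconds) is None:
        empty += 1
        if wait_time_seconds == 0:
            queue.now += 1.0  # assumed: short poller sleeps 1s between attempts
    return empty


print("short polling empty responses:", count_empty_polls(0))
print("long polling empty responses:", count_empty_polls(20))
```

Under these assumptions the long poller still sees a couple of empty responses (each 20-second wait that expires), but far fewer than the short poller.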

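For billing, 1 request = 1-10 messages (a receive can also return 0) up to 64 KB total payload; each additional 64 KB chunk of payload is billed as another request.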
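
As a rough worked example of the request arithmetic, assuming the SQS pricing rule that every API call counts as at least one request and each 64 KB chunk of payload is billed as one request (`billed_requests` is a hypothetical helper, not an AWS function):

```python
import math

CHUNK_BYTES = 64 * 1024  # billing chunk size assumed from SQS pricing


def billed_requests(total_payload_bytes):
    """Number of billed requests for one API call carrying this payload."""
    # Even an empty receive counts as one request.
    return max(1, math.ceil(total_payload_bytes / CHUNK_BYTES))


# A batch of 10 messages of 4 KB each fits in one 64 KB chunk.
print(billed_requests(10 * 4 * 1024))
# A single 256 KB payload spans four chunks.
print(billed_requests(256 * 1024))
```

This is why batching small messages lowers cost: ten small messages sent in one call are billed as one request instead of ten.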

6
Q

What is a Dead Letter Queue?

A
  • A dead letter queue (DLQ) is used to hold messages that were not successfully processed. These messages might have failed due to errors, invalid data, or other issues. The purpose of a dead letter queue is to provide a way to review and troubleshoot these problematic messages.
  • When ReceiveCount > maxReceiveCount and the message has not been deleted, it is moved to the DLQ.
  • The retention period of the DLQ should be longer than that of the source queues, because the enqueue timestamp is unchanged when a message enters a DLQ (it keeps the old timestamp).
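
A minimal simulation of the redrive rule and the timestamp behaviour described above. It is a toy model, not the SQS API: `QueueWithDLQ`, `seconds_until_expiry`, and the dictionary layout of a message are assumptions for illustration, and a receive without a delete stands in for a visibility timeout expiring.

```python
from collections import deque


class QueueWithDLQ:
    """Toy model of a source queue with a redrive policy (not the AWS API)."""

    def __init__(self, max_receive_count):
        self.max_receive_count = max_receive_count
        self.main = deque()  # entries: {"body", "enqueued_at", "receive_count"}
        self.dlq = deque()

    def send(self, body, now):
        self.main.append({"body": body, "enqueued_at": now, "receive_count": 0})

    def receive(self):
        """Return the next message, moving exhausted ones to the DLQ first."""
        while self.main:
            message = self.main[0]
            message["receive_count"] += 1
            if message["receive_count"] > self.max_receive_count:
                # Moved as-is: the original enqueue timestamp is kept.
                self.dlq.append(self.main.popleft())
                continue
            return message
        return None

    def delete(self, message):
        """Successful processing: remove the message from the source queue."""
        self.main.remove(message)


def seconds_until_expiry(message, retention_seconds, now):
    """Retention is measured from the ORIGINAL enqueue time, even in the DLQ."""
    return retention_seconds - (now - message["enqueued_at"])


queue = QueueWithDLQ(max_receive_count=3)
queue.send("bad-payload", now=0)
for _ in range(3):
    queue.receive()          # consumer fails each time and never deletes
queue.receive()              # fourth receive exceeds the limit
print(len(queue.dlq), "message(s) in the DLQ")
```

If the message spent 3 days failing in the source queue and the DLQ retention is 4 days, `seconds_until_expiry` shows only 1 day is left to inspect it, which is the reason for giving the DLQ a longer retention period.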
7
Q

What are FIFO Queues?

A
  • FIFO queues do not have the same level of performance as standard queues.
  • You can order messages with SQS standard, but it’s on you to do it.
  • The Message Group ID ensures that messages within the same group are processed one by one, in order.
  • It costs more since AWS must spend computing power to deduplicate messages.
  • Exactly-once processing for each message in the queue.
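
The group ordering and deduplication behaviour can be sketched with a toy model. This is not the SQS API: `FifoQueueModel` and its methods are hypothetical, and the 5-minute deduplication interval is stated here as an assumption about SQS FIFO behaviour rather than something taken from the card.

```python
from collections import deque

DEDUP_WINDOW_SECONDS = 300  # assumed SQS FIFO deduplication interval (5 minutes)


class FifoQueueModel:
    """Toy model of FIFO ordering per message group (not the AWS API)."""

    def __init__(self):
        self._groups = {}       # group_id -> deque of bodies, in send order
        self._seen = {}         # dedup_id -> time the ID was first accepted
        self._in_flight = set() # groups with a message currently being processed

    def send(self, body, group_id, dedup_id, now):
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate within the window: not enqueued again
        self._seen[dedup_id] = now
        self._groups.setdefault(group_id, deque()).append(body)
        return True

    def receive(self):
        """Return (group_id, body) from a group with nothing in flight."""
        for group_id, bodies in self._groups.items():
            if bodies and group_id not in self._in_flight:
                self._in_flight.add(group_id)
                return group_id, bodies[0]
        return None

    def delete(self, group_id):
        """Finish the in-flight message so the group's next one is released."""
        self._groups[group_id].popleft()
        self._in_flight.discard(group_id)


fifo = FifoQueueModel()
fifo.send("a1", "group-A", "id-a1", now=0)
fifo.send("a2", "group-A", "id-a2", now=1)
fifo.send("b1", "group-B", "id-b1", now=2)
print(fifo.send("a1", "group-A", "id-a1", now=10))  # retry of a1 is dropped
```

In this model a second consumer can work on group-B while group-A has a message in flight, but `a2` is not handed out until `a1` is deleted.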
8
Q

What topic types are supported in SNS?

A

FIFO or Standard:
- FIFO only supports SQS as a subscriber
- Standard supports: Kinesis Data Firehose, SQS, Lambda, email, HTTP(S), SMS, platform application endpoint.

  • There is also DLQ Support.
9
Q

API Gateway features.

A
  • Security: This service allows you to easily protect your endpoints by attaching a web application firewall (WAF).
  • Stop Abuse: Users can easily implement DDoS protection and rate
    limiting to curb abuse of their endpoints.
  • Ease of Use: API Gateway is simple to get started with. Easily build out the calls that will kick off other AWS services in your account.
10
Q

What are the components of AWS Batch?

A
  • Jobs: Units of work that are submitted to AWS Batch (e.g., shell scripts, executables, and Docker images).
  • Job Definitions: Specify how your jobs are to be run (essentially, the blueprint for the resources in the job).
  • Job Queues: Jobs get submitted to specific queues and reside there until scheduled to run in a compute environment.
  • Compute Environment: Set of managed or unmanaged compute resources used to run your jobs.
11
Q

Fargate or EC2 Compute Environments for AWS Batch?

A

Fargate is the recommended way of launching most batch jobs.
Sometimes, EC2 is the best choice!

  • Custom AMIs can only be run via EC2.
  • Anything needing more than four vCPUs needs to use EC2.
  • EC2 is recommended for anything needing more than 30 GiB of memory.
  • If your jobs require a GPU, they must run on EC2. Arm-based Graviton CPUs can only be leveraged via EC2 for AWS Batch.
  • When using linuxParameters parameters, you must run on EC2 compute.
  • For a large number of jobs, it’s best to run on EC2. Jobs are dispatched at a higher rate (more concurrency) than on Fargate!
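
The checklist above can be written down as a small decision helper. This is a sketch only: `JobRequirements` and `choose_compute` are hypothetical names, and the thresholds (4 vCPUs, 30 GiB) are copied from this card, so they may not match the limits currently published by AWS.

```python
from dataclasses import dataclass


@dataclass
class JobRequirements:
    vcpus: int
    memory_gib: float
    needs_gpu: bool = False
    needs_custom_ami: bool = False
    needs_arm: bool = False
    uses_linux_parameters: bool = False


def choose_compute(job):
    """Return ("EC2" or "FARGATE", list of reasons that forced EC2)."""
    reasons = []
    if job.needs_custom_ami:
        reasons.append("custom AMI")
    if job.vcpus > 4:                      # threshold taken from the card
        reasons.append("more than 4 vCPUs")
    if job.memory_gib > 30:                # threshold taken from the card
        reasons.append("more than 30 GiB of memory")
    if job.needs_gpu:
        reasons.append("GPU")
    if job.needs_arm:
        reasons.append("Arm-based Graviton CPU")
    if job.uses_linux_parameters:
        reasons.append("linuxParameters")
    return ("EC2" if reasons else "FARGATE", reasons)


print(choose_compute(JobRequirements(vcpus=2, memory_gib=8)))
print(choose_compute(JobRequirements(vcpus=8, memory_gib=64, needs_gpu=True)))
```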
12
Q

AWS Batch or AWS Lambda?

A
  • AWS Lambda currently has a 15-minute execution time limit. Batch does not have this.
  • AWS Lambda has limited disk space, and EFS requires that functions live within a VPC.
  • Lambda is fully serverless, but it has natively limited runtimes! Batch uses Docker, so any runtime can be used.

13
Q

What is Amazon MQ?

A
  • Message broker service allowing easier migration of existing applications to the AWS Cloud.
  • Leverages multiple programming languages, operating systems, and messaging protocols.
  • Currently supports both Apache ActiveMQ and RabbitMQ engine types.
  • Allows you to easily leverage existing apps without managing and maintaining your own system.
14
Q

SNS with SQS vs. Amazon MQ

A
  • Each offers architectures with topics and queues. Allows for one-to-one
    or one-to-many messaging designs.
  • If migrating existing applications with messaging systems in place, you
    likely want to consider Amazon MQ.
  • If creating new applications, look at SNS and SQS: they are simpler to use and highly scalable, with simple APIs. Good fit for most new use cases!
  • Amazon MQ REQUIRES private networking like VPC, Direct Connect, or VPN. SNS and SQS are publicly accessible by default.
15
Q

What are step functions?

A
  • Step Functions is a serverless orchestration service for coordinating AWS services into workflows.
  • Comes with a graphical console for visualizing application workflows.
  • Main components are state machines and tasks.
  • Tasks are specific states within a workflow (state machine) that represent a single unit of work.
  • Every step within a workflow is considered a state.
  • Standard workflows (the default) have a maximum duration of 1 year. Express workflows are for high-event-rate workloads and have a maximum duration of 5 minutes.
16
Q

What are the 2 types of workflow that AWS Step Functions support?

A

Each workflow has executions.
Executions are instances where you run your workflows in order to perform your tasks.

STANDARD
- Have an exactly-once execution
- Can run for up to one year
- Useful for long-running workflows that need to have an auditable history
- Rates up to 2,000 executions per second
- Pricing based per state transition

EXPRESS
- At-least-once workflow execution
- Can run for up to five minutes
- Useful for high-event-rate workloads; an example use is IoT data streaming and ingestion
- Pricing based on number of executions, durations, and memory consumed

Think about an online pickup order: each step in that workflow is considered a state.

17
Q

What are the different states of step functions?

A
  • Pass: Passes any input directly to its output — no work done
  • Task: Single unit of work performed (e.g., Lambda, Batch, and SNS)
  • Choice: Adds branching logic to state machines
  • Wait: Creates a specified time delay within the state machine
  • Succeed: Stops executions successfully
  • Fail: Stops executions and marks them as failures
  • Parallel: Runs parallel branches of executions within state machines
  • Map: Runs a set of steps based on elements of an input array
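
To see the eight state types together, here is a sketch of a state machine definition in Amazon States Language, built as a Python dictionary. The workflow (a pickup order), every state name, the JSON paths, and the Lambda ARN are hypothetical placeholders; `dangling_transitions` is an illustrative helper, not part of any AWS SDK.

```python
import json


def branch(name):
    """A one-state sub-workflow used inside Parallel and Map (illustrative)."""
    return {"StartAt": name, "States": {name: {"Type": "Pass", "End": True}}}


state_machine = {
    "Comment": "Hypothetical pickup-order workflow",
    "StartAt": "ReceiveOrder",
    "States": {
        "ReceiveOrder": {"Type": "Pass", "Next": "ChargeCard"},
        "ChargeCard": {
            "Type": "Task",
            # Placeholder ARN: not a real function.
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ChargeCard",
            "Next": "IsPaid",
        },
        "IsPaid": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.paid", "BooleanEquals": True, "Next": "WaitForPacking"}
            ],
            "Default": "OrderFailed",
        },
        "WaitForPacking": {"Type": "Wait", "Seconds": 60, "Next": "NotifyAll"},
        "NotifyAll": {
            "Type": "Parallel",
            "Branches": [branch("EmailCustomer"), branch("AlertStore")],
            "Next": "LabelItems",
        },
        "LabelItems": {
            "Type": "Map",
            "ItemsPath": "$.items",
            "ItemProcessor": branch("PrintLabel"),
            "Next": "OrderReady",
        },
        "OrderReady": {"Type": "Succeed"},
        "OrderFailed": {"Type": "Fail", "Error": "PaymentDeclined"},
    },
}


def dangling_transitions(definition):
    """List transition targets that do not name a top-level state."""
    states = definition["States"]
    targets = []
    for state in states.values():
        targets += [state[key] for key in ("Next", "Default") if key in state]
        targets += [rule["Next"] for rule in state.get("Choices", [])]
    return [target for target in targets if target not in states]


print(json.dumps(state_machine, indent=2)[:200])
print("dangling transitions:", dangling_transitions(state_machine))
```

The small check at the end is just a sanity test that every `Next` or `Default` points at a state that exists; it does not validate the definition the way Step Functions itself would.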
18
Q

What Is AppFlow?

A
  • Fully managed integration service for exchanging data between SaaS apps and AWS services
  • Pulls data records from third-party SaaS vendors and stores them in
    Amazon S3
  • Bi-directional data transfers with limited combinations
  • Flows can run on demand, on an event, or on a schedule.
19
Q

What Is Redshift?

A

Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It’s a very large relational database traditionally used in big data applications.

  • It can hold up to 16 PB of data.
20
Q

What is EMR?

A

EMR (Elastic MapReduce) is a managed big data platform that allows you to process vast amounts of data using open-source tools, such as Spark, Hive, HBase, Flink, Hudi, and Presto.

  • It is AWS’s ETL tool.
  • It’s an Open-Source Cluster. EMR is a managed fleet of EC2 instances running open-source tools.
21
Q

What is ETL?

A

Extract Transform Load.

22
Q

What Is Kinesis Data Streams?

A

Kinesis Data Streams allows you to ingest, process, and analyze streaming data in real time (around 200 ms). You can think of it as a huge data highway connected to your AWS account. Great for analytics and dashboards.

  • Streams store a 24-hour moving window of data, which can be increased to a maximum of 365 days at an additional cost.
  • Supports multiple producers and consumers (you must configure the consumers). Consumers can access the data from the moving window at different granularities (per second, per hour, etc.).
  • To improve the performance of a Kinesis stream, change the number of shards: more shards means more throughput.
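
A rough sizing sketch for shards, assuming provisioned mode and the commonly quoted per-shard limits (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads), and assuming shards split the 128-bit MD5 hash space evenly. `shards_needed` and `shard_for_key` are hypothetical helpers, not SDK calls.

```python
import hashlib
import math

WRITE_MB_PER_SHARD = 1.0        # assumed per-shard write limit
WRITE_RECORDS_PER_SHARD = 1000  # assumed per-shard record limit
READ_MB_PER_SHARD = 2.0         # assumed per-shard read limit


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index via its MD5 hash.

    Assumes the shards divide the 128-bit hash key space into equal ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = 2**128 // shard_count
    return min(hash_value // range_size, shard_count - 1)


count = shards_needed(write_mb_per_s=5, records_per_s=3000, read_mb_per_s=8)
print("shards needed:", count)
print("device-42 lands on shard", shard_for_key("device-42", count))
```

Because the same partition key always hashes to the same shard, a single very busy key can still saturate one shard no matter how many shards the stream has.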
23
Q

What Is Kinesis Data Firehose?

A
  • Data transfer tool that delivers data from Kinesis Data Streams (or directly from producers) to S3, Redshift (using S3 as an intermediate), Elasticsearch, Splunk, or HTTP endpoints (meaning third-party applications).
  • Offers persistence beyond the moving window of Data Streams.
  • Always near real-time (within about 60 seconds), even when producers are connected directly to Firehose rather than to Data Streams.
  • Supports transformation of the data on the fly (using Lambda).
24
Q

What Is Athena?

A

Athena is an interactive query service that makes it easy to analyze data in S3 using SQL. This allows you to directly query data in your S3 bucket without loading it into a database (schema on read).

  • It supports all AWS logs (CloudTrail, VPC Flow Logs, ELB logs, cost reports, etc.).
  • It also works with the AWS Glue Data Catalog and web server logs.
  • Athena Federated Query also supports sources other than S3 (a newer feature that uses Lambda).
25
Q

What Is Glue?

A
  • Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data. It allows you to perform ETL (Extract, Transform, Load) workloads without managing underlying servers.
  • It is also a data catalog service (Data Catalog) across all the data stored within an organization.

Data sources: Stores (S3, RDS, JDBC compatible, DynamoDB), Streams (Kinesis Data Streams, Apache Kafka)

Data targets: S3, RDS, JDBC

26
Q

What Is QuickSight?

A

Amazon QuickSight is a fully managed business intelligence (BI) data visualization service. It allows you to easily create dashboards and share them within your company.

27
Q

What is AWS Data Pipeline?

A

AWS Data Pipeline is a managed Extract, Transform, Load (ETL) service for automating movement and transformation of your data.

  • It uses servers. It creates EMR clusters to perform the tasks.
28
Q

What is Amazon MSK?

A

Amazon MSK stands for Amazon Managed Streaming for Apache Kafka

  • MSK Serverless is a cluster type within Amazon MSK that offers serverless cluster management, with automatic provisioning and scaling.
  • MSK Serverless is fully compatible with Apache Kafka. Use the same client apps for producing and consuming data.
  • Allows developers to easily stream data to and from Apache Kafka clusters.
29
Q

What is OpenSearch?

A

OpenSearch is a managed service that allows you to run search and analytics engines for various use cases. It is the successor to Amazon Elasticsearch Service.

  • Amazon OpenSearch Service is a good fit for logs (log analytics).
  • It is used as a managed analytics and visualization service.
30
Q

What are the different states in step functions?

A
  • Pass: This state can be used to pass data from one state to another. For example, you could use a pass state to pass the output of a task state to a choice state.
  • Task: This state performs a single unit of work, such as invoking a Lambda function or calling another AWS service. For example, you could use a task state to send an email, start a workflow in another service, or perform a database operation.
  • Choice: This state can be used to make a decision based on the input. For example, you could use a choice state to decide whether to send an email or start a workflow based on the value of a variable.
  • Wait: This state can be used to delay the execution of a state machine for a specified amount of time. For example, you could use a wait state to delay the execution of a task state until a certain time of day.
  • Succeed: This state is used to terminate a state machine with a success. For example, you could use a succeed state to terminate a state machine after a task state has completed successfully.
  • Fail: This state is used to terminate a state machine with a failure. For example, you could use a fail state to terminate a state machine if a task state fails.
  • Parallel: This state is used to execute multiple states in parallel. For example, you could use a parallel state to execute two task states simultaneously.
  • Map: This state is used to execute a state multiple times with different inputs. For example, you could use a map state to send an email to each member of a list.
31
Q

Basic API Gateway Errors.

A
  • 400: Bad request - generic client error
  • 403: Access denied - the authorizer denied the request, or WAF filtered it
  • 429: Too many requests - API Gateway can throttle; this means you have exceeded the allowed request rate
  • 502: Bad gateway exception - bad output returned by Lambda
  • 503: Service unavailable - backing endpoint offline? Major service issue
  • 504: Integration failure/timeout - 29s limit
32
Q

What is visibility timeout in SQS?

A

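Visibility timeout is the amount of time that SQS hides a message from other consumers after one consumer has received it. This gives that consumer time to process and delete the message, which reduces duplicate processing; if the message is not deleted before the timeout expires, it becomes visible again. The default visibility timeout is 30 seconds, and it can be configured from 0 seconds up to 12 hours.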
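
The mechanism can be illustrated with a toy queue driven by an explicit clock. This is a simulation, not the SQS API: `VisibilityQueue` and its methods are hypothetical, and the caller passes the current time in seconds.

```python
MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60  # 12 hours, in seconds


class VisibilityQueue:
    """Toy model of SQS visibility timeout (not the AWS API)."""

    def __init__(self, visibility_timeout=30):
        if not 0 <= visibility_timeout <= MAX_VISIBILITY_TIMEOUT:
            raise ValueError("visibility timeout must be 0 seconds to 12 hours")
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message_id -> {"body", "visible_at"}
        self._next_id = 0

    def send(self, body, now=0):
        message_id = self._next_id
        self._next_id += 1
        self._messages[message_id] = {"body": body, "visible_at": now}
        return message_id

    def receive(self, now):
        """Return (message_id, body) for a visible message and hide it."""
        for message_id, entry in self._messages.items():
            if entry["visible_at"] <= now:
                entry["visible_at"] = now + self.visibility_timeout
                return message_id, entry["body"]
        return None

    def delete(self, message_id):
        """Called by the consumer after successful processing."""
        self._messages.pop(message_id, None)


queue = VisibilityQueue(visibility_timeout=30)
queue.send("resize-image-7")
print(queue.receive(now=0))    # consumer A gets the message
print(queue.receive(now=10))   # consumer B sees nothing: message is hidden
print(queue.receive(now=30))   # A never deleted it, so it is delivered again
```

The last call shows why the timeout should be longer than the expected processing time: a consumer that is still working when the timeout expires will race with whoever receives the message next.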

33
Q

Why is SNS being used to implement a fanout architecture with SQS?

A

Because it allows multiple subscribers to receive a copy of every message published to a topic. SQS queues can subscribe to SNS topics and receive a copy of every message published to that topic. This is not possible with SQS alone, because each message in a queue is received and deleted by a single consumer (or consumer group), so consumers sharing one queue compete for messages instead of each getting a copy.
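
A toy comparison of the two designs, using plain in-memory deques as stand-ins for queues. `Topic`, the queue names, and the order messages are all hypothetical; nothing here calls AWS.

```python
from collections import deque


class Topic:
    """Toy stand-in for an SNS topic (not the AWS API)."""

    def __init__(self):
        self._queues = []

    def subscribe(self, queue):
        self._queues.append(queue)

    def publish(self, message):
        # Fanout: every subscribed queue gets its own copy.
        for queue in self._queues:
            queue.append(message)


# Fanout: one topic, one queue per downstream service.
orders = Topic()
billing_queue, shipping_queue = deque(), deque()
orders.subscribe(billing_queue)
orders.subscribe(shipping_queue)
orders.publish("order-1")
orders.publish("order-2")
print("billing sees:", list(billing_queue))
print("shipping sees:", list(shipping_queue))

# No fanout: both services read from one shared queue and compete.
shared_queue = deque(["order-1", "order-2"])
billing_got = shared_queue.popleft()
shipping_got = shared_queue.popleft()
print("billing got", billing_got, "and shipping got", shipping_got)
```

With the topic in front, billing and shipping each process every order at their own pace; with the shared queue, each order reaches only one of them.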

34
Q

What is High throughput for FIFO queues?

A

High throughput for FIFO queues is a feature that lets you send and receive messages at a higher rate than regular FIFO queues. With it enabled, you can send up to 3,000 messages per second per API action. Without it, FIFO queues are limited to 300 messages per second per API action (3,000 with batching).

35
Q

What are SQS Delay Queues?

A
  • Delay queues provide an initial period of invisibility for messages. A predefined delay ensures that processing of a message doesn’t begin until the period has expired.
  • Message timers set a per-message invisibility period that overrides the queue’s default delay. They are not supported on FIFO queues.

Min=0, Max=15min.
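
The override rule reduces to a few lines of arithmetic. `visible_at` is a hypothetical helper for illustration; times are plain seconds and the 15-minute ceiling comes from the card above.

```python
MAX_DELAY_SECONDS = 15 * 60  # 15 minutes


def visible_at(sent_at, queue_delay_seconds=0, message_delay_seconds=None):
    """Time at which a message first becomes visible to consumers.

    A per-message timer, when given, overrides the queue-level delay.
    """
    if message_delay_seconds is None:
        delay = queue_delay_seconds
    else:
        delay = message_delay_seconds
    if not 0 <= delay <= MAX_DELAY_SECONDS:
        raise ValueError("delay must be between 0 and 900 seconds")
    return sent_at + delay


print(visible_at(sent_at=100, queue_delay_seconds=60))
print(visible_at(sent_at=100, queue_delay_seconds=60, message_delay_seconds=0))
```

Note the difference from a visibility timeout: a delay hides a message when it is first sent, while a visibility timeout hides it after it has been received.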

36
Q

What is Amazon Kinesis Data Analytics?

A
  • It is a fully managed service that makes it easy to process and analyze streaming data that needs real-time SQL-based processing.
  • Can take source/destination streams from Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose, and can also reference Amazon S3 for static data.
  • When used with Amazon Kinesis Data Firehose, processing becomes near real-time.
37
Q

What is Amazon Kinesis Video Streams?

A

Amazon Kinesis Video Streams makes it easy to securely stream video (and other time-encoded data such as audio, radar, and thermal imagery) from connected devices to AWS for analytics, machine learning (ML), playback, and other processing. Kinesis Video Streams automatically provisions and elastically scales all the infrastructure needed to ingest streaming video data from millions of devices.

  • Data cannot be accessed directly in the underlying storage, only through the API.