03 - Application Integration Flashcards

1
Q

Amazon SQS – Standard Queue

• Fully managed service, used to decouple applications

A
  • Attributes:
    • Unlimited throughput, unlimited number of messages in queue
    • Default retention of messages: 4 days, maximum of 14 days
    • Low latency (<10 ms on publish and receive)
    • Limitation of 256KB per message sent
  • Can have duplicate messages (at least once delivery, occasionally)
  • Can have out of order messages (best effort ordering)
2
Q

SQS – Message Visibility Timeout

• After a message is polled by a consumer, it becomes invisible to other consumers

A
  • By default, the “message visibility timeout” is 30 seconds
  • That means the message has 30 seconds to be processed
  • After the message visibility timeout is over, the message is “visible” in SQS
  • If a message is not processed (and deleted) within the visibility timeout, it becomes visible again and may be processed twice
  • A consumer could call the ChangeMessageVisibility API to get more time
  • If visibility timeout is high (hours), and consumer crashes, re-processing will take time
  • If visibility timeout is too low (seconds), we may get duplicates
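The visibility timeout behaviour above can be illustrated with a small in-memory model. This is only a sketch for intuition, not how SQS is implemented: the `ToyQueue` class and its method names are hypothetical, and time is passed in explicitly as `now` (in seconds) so the behaviour is easy to follow.

```python
class ToyQueue:
    """Hypothetical in-memory queue that mimics the SQS visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible again]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0.0]
        return self._next_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def change_visibility(self, msg_id, now, timeout):
        """Rough analogue of ChangeMessageVisibility: restart the timer from now."""
        self._messages[msg_id][1] = now + timeout

    def delete(self, msg_id):
        """A consumer deletes the message once processing has succeeded."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
job = queue.send("resize-image")
first = queue.receive(now=0)    # consumer A gets the message
hidden = queue.receive(now=10)  # consumer B sees nothing: message is invisible
again = queue.receive(now=30)   # A never deleted it, so it is delivered again
```

If the consumer had called `delete` before the 30 seconds elapsed, the second delivery would not have happened, which is why the timeout should comfortably exceed the expected processing time.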
3
Q

Amazon SQS – Dead Letter Queue

A
  • If a consumer fails to process a message within the Visibility Timeout … the message goes back to the queue!
  • We can set a threshold of how many times a message can go back to the queue
  • After the maximum receives threshold (maxReceiveCount) is exceeded, the message goes into a dead letter queue (DLQ)
  • Useful for debugging!
  • Make sure to process the messages in the DLQ before they expire:
    • Good to set a retention of 14 days in the DLQ
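As a rough sketch of how this is configured, the source queue carries a `RedrivePolicy` attribute (a JSON string naming the DLQ and the receive threshold), and the DLQ gets a long `MessageRetentionPeriod`. The helper names, the example ARN, and the default threshold of 5 below are assumptions for illustration; the resulting dictionaries are the kind of value one would pass as queue attributes.

```python
import json

FOURTEEN_DAYS_SECONDS = 14 * 24 * 60 * 60  # 1,209,600 seconds


def source_queue_attributes(dlq_arn, max_receive_count=5):
    """Attributes for the source queue (hypothetical helper)."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy serialized as a JSON string.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


def dlq_attributes():
    """Attributes for the DLQ itself: keep failed messages for 14 days."""
    return {"MessageRetentionPeriod": str(FOURTEEN_DAYS_SECONDS)}


# The ARN below is a made-up example value.
attrs = source_queue_attributes("arn:aws:sqs:us-east-1:123456789012:orders-dlq")
```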
4
Q

Amazon SQS – Delay Queue

A
  • Delay a message (consumers don’t see it immediately) up to 15 minutes
  • Default is 0 seconds (message is available right away)
  • Can set a default at queue level
  • Can override the default on send using the DelaySeconds parameter
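A minimal sketch of the per-message override: the helper below only builds the parameters for a SendMessage call and enforces the 15-minute ceiling. The function name and queue URL are hypothetical; `QueueUrl`, `MessageBody` and `DelaySeconds` are the SendMessage parameter names.

```python
MAX_DELAY_SECONDS = 15 * 60  # 900 seconds, the upper bound for a delay


def send_message_params(queue_url, body, delay_seconds=None):
    """Build SendMessage parameters (hypothetical helper).

    Leaving delay_seconds as None omits DelaySeconds, so the queue-level
    default applies.
    """
    params = {"QueueUrl": queue_url, "MessageBody": body}
    if delay_seconds is not None:
        if not 0 <= delay_seconds <= MAX_DELAY_SECONDS:
            raise ValueError("delay_seconds must be between 0 and 900")
        params["DelaySeconds"] = delay_seconds
    return params


# Made-up queue URL, for illustration only.
params = send_message_params("https://example.invalid/my-queue", "hello", 120)
```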
5
Q

Amazon SQS - Long Polling
• When a consumer requests messages from the queue, it can optionally “wait” for messages to arrive if there are none in the queue

A
  • Long polling decreases the number of API calls made to SQS while increasing the efficiency and reducing the latency of your application.
  • The wait time can be between 1 and 20 seconds (20 seconds preferable)
  • Long Polling is preferable to Short Polling
  • Long polling can be enabled at the queue level or at the API level using WaitTimeSeconds
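To make the saving in API calls concrete, here is a small sketch. It builds the parameters for a ReceiveMessage call with `WaitTimeSeconds`, and estimates how many calls a single consumer makes against an empty queue if it issues long polls back to back. Both function names are hypothetical, and the estimate assumes every poll waits the full time and returns nothing.

```python
import math

MAX_WAIT_SECONDS = 20


def receive_message_params(queue_url, wait_time_seconds=MAX_WAIT_SECONDS, max_messages=10):
    """Build ReceiveMessage parameters for long polling (hypothetical helper)."""
    if not 0 <= wait_time_seconds <= MAX_WAIT_SECONDS:
        raise ValueError("wait_time_seconds must be between 0 and 20")
    if not 1 <= max_messages <= 10:
        raise ValueError("max_messages must be between 1 and 10")
    return {
        "QueueUrl": queue_url,
        "WaitTimeSeconds": wait_time_seconds,
        "MaxNumberOfMessages": max_messages,
    }


def empty_receives_on_idle_queue(idle_seconds, wait_time_seconds):
    """Calls made by one consumer while the queue stays empty."""
    if wait_time_seconds <= 0:
        raise ValueError("short polling returns immediately; the count depends on the caller")
    return math.ceil(idle_seconds / wait_time_seconds)


calls_with_20s = empty_receives_on_idle_queue(3600, 20)  # 180 calls per idle hour
calls_with_1s = empty_receives_on_idle_queue(3600, 1)    # 3600 calls per idle hour
```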
6
Q

SQS – Request-Response Systems

A
  • To implement this pattern: use the SQS Temporary Queue Client
  • It leverages virtual queues instead of creating / deleting SQS queues (cost effective)
7
Q

Amazon SNS – How to publish

A

Topic Publish (using the SDK)
• Create a topic
• Create a subscription (or many)
• Publish to the topic

Direct Publish (for mobile apps SDK)
• Create a platform application
• Create a platform endpoint
• Publish to the platform endpoint
• Works with Google GCM, Apple APNS, Amazon ADM…
8
Q

SNS + SQS: Fan Out

A
  • Push once in SNS, receive in all SQS queues that are subscribers
  • Fully decoupled, no data loss
  • SQS allows for: data persistence, delayed processing and retries of work
  • Ability to add more SQS subscribers over time
  • Make sure your SQS queue access policy allows for SNS to write
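The access policy mentioned in the last point could look roughly like the sketch below: a queue policy that lets the SNS service send messages, restricted to one topic. The function name and the ARNs are made-up examples; the statement shape (service principal `sns.amazonaws.com`, action `sqs:SendMessage`, an `aws:SourceArn` condition) follows the usual pattern for this setup.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Queue access policy allowing one SNS topic to write (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only accept messages that come from this specific topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


# Example ARNs are invented for illustration.
policy = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:fraud-queue",
    "arn:aws:sns:us-east-1:123456789012:orders-topic",
)
policy_json = json.dumps(policy)  # the queue's Policy attribute takes a JSON string
```

Each subscribed queue needs its own policy of this kind, since the `Resource` is the individual queue ARN.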
9
Q

SNS – Message Filtering

A
  • JSON policy used to filter messages sent to SNS topic’s subscriptions
  • If a subscription doesn’t have a filter policy, it receives every message
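The matching rule can be sketched in a few lines. This covers only a simplified subset, exact string matching on message attributes; real SNS filter policies support more operators, which are not modelled here. The function name and the attribute names are hypothetical.

```python
def subscription_receives(filter_policy, message_attributes):
    """Return True if a subscription with this policy would get the message.

    Simplified model: every key in the policy must be present in the message
    attributes, with a value listed in the policy for that key.
    """
    if not filter_policy:
        # No filter policy: the subscription receives every message.
        return True
    return all(
        message_attributes.get(name) in allowed_values
        for name, allowed_values in filter_policy.items()
    )


# Hypothetical policy: only order placements and cancellations.
policy = {"event_type": ["order_placed", "order_cancelled"]}

placed = subscription_receives(policy, {"event_type": "order_placed"})
shipped = subscription_receives(policy, {"event_type": "order_shipped"})
unfiltered = subscription_receives(None, {"event_type": "order_shipped"})
```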
10
Q

Kinesis Overview

• Makes it easy to collect, process, and analyze streaming data in real-time

A
  • Ingest real-time data such as: Application logs, Metrics, Website clickstreams, IoT telemetry data…
  • Kinesis Data Streams: capture, process, and store data streams
  • Kinesis Data Firehose: load data streams into AWS data stores
  • Kinesis Data Analytics: analyze data streams with SQL or Apache Flink
  • Kinesis Video Streams: capture, process, and store video streams
11
Q

Kinesis Data Streams

A
  • Billing is per shard provisioned, can have as many shards as you want
  • Retention from 1 day (default) up to 365 days
  • Ability to reprocess (replay) data
  • Once data is inserted in Kinesis, it can’t be deleted (immutability)
  • Data that shares the same partition key goes to the same shard (ordering)
  • Producers: AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent
  • Consumers:
    • Write your own: Kinesis Client Library (KCL), AWS SDK
    • Managed: AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics
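Why records with the same partition key stay in order can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The sketch assumes the shards split the hash space evenly, which need not hold after shard splits or merges; the function name and key values are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the few values left over by integer division land in the last shard.
    return min(hash_key // range_size, shard_count - 1)


# Every record for the same (made-up) device id lands on the same shard,
# so those records are read back in the order they were written.
first = shard_for("device-42", shard_count=4)
second = shard_for("device-42", shard_count=4)
```

A consequence of this routing is that a single very busy partition key concentrates its traffic on one shard, so keys are usually chosen to spread load.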
12
Q

Kinesis Data Firehose

A
  • Fully Managed Service, no administration, automatic scaling, serverless
  • Destinations:
    • AWS: Redshift / Amazon S3 / ElasticSearch
    • 3rd party partner: Splunk / MongoDB / DataDog / NewRelic / …
    • Custom: send to any HTTP endpoint
  • Pay for data going through Firehose
  • Near Real Time
    • 60 seconds latency minimum for non full batches
    • Or minimum 32 MB of data at a time
  • Supports many data formats, conversions, transformations, compression
  • Supports custom data transformations using AWS Lambda
  • Can send failed or all data to a backup S3 bucket
13
Q

Kinesis Data Streams vs Firehose

A

Kinesis Data Streams
• Streaming service for ingest at scale
• Write custom code (producer / consumer)
• Real-time (~200 ms)
• Manage scaling (shard splitting / merging)
• Data storage for 1 to 365 days
• Supports replay capability

Kinesis Data Firehose
• Load streaming data into S3 / Redshift / ES / 3rd party / custom HTTP
• Fully managed
• Near real-time (buffer time min. 60 sec)
• Automatic scaling
• No data storage
• Doesn’t support replay capability

14
Q

Kinesis Data Analytics (SQL application)

A
  • Perform real-time analytics on Kinesis Streams using SQL
  • Fully managed, no servers to provision
  • Automatic scaling
  • Real-time analytics
  • Pay for actual consumption rate
  • Can create streams out of the real-time queries
  • Use cases:
    • Time-series analytics
    • Real-time dashboards
    • Real-time metrics
15
Q

SQS vs SNS vs Kinesis

A

SQS
• Consumer “pull data”
• Data is deleted after being consumed
• Can have as many workers (consumers) as we want
• No need to provision throughput
• Ordering guarantees only on FIFO queues
• Individual message delay capability

SNS
• Push data to many subscribers
• Up to 12,500,000 subscribers
• Data is not persisted (lost if not delivered)
• Pub/Sub
• Up to 100,000 topics
• No need to provision throughput
• Integrates with SQS for fanout architecture pattern
• FIFO capability for SQS FIFO

Kinesis
• Standard: pull data
  • 2 MB/s per shard, shared across all consumers
• Enhanced fan-out: push data
  • 2 MB/s per shard per consumer
• Possibility to replay data
• Meant for real-time big data, analytics and ETL
• Ordering at the shard level
• Data expires after X days
• Must provision throughput
16
Q

Amazon MQ

• Managed Apache ActiveMQ

A
  • SQS, SNS are “cloud-native” services, and they’re using proprietary protocols from AWS.
  • Traditional applications running on-premises may use open protocols such as: MQTT, AMQP, STOMP, OpenWire, WSS
  • When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
17
Q

AWS AppSync

A
  • Store and sync data across mobile and web apps in real-time
  • Makes use of GraphQL (mobile technology from Facebook)
  • Client Code can be generated automatically
  • Integrations with DynamoDB / Lambda
  • Real-time subscriptions
  • Offline data synchronization (replaces Cognito Sync)
  • Fine Grained Security