Data Processing Services Flashcards

1
Q

Amazon Kinesis

A

collect, buffer, process, and analyze real-time, streaming data

2
Q

How is data processed in Kinesis Data Streams?

How much data can be processed?

A

Data is processed in “shards”

Writes: up to 1,000 records per second (and up to 1 MB per second) per shard

GBs of data per second from thousands of sources
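
As a rough sizing sketch, the per-shard write limits (1,000 records per second and 1 MB per second) can be turned into a minimum shard count. The function name and the idea of sizing from an average record size are illustrative assumptions, not an AWS formula; read throughput and headroom are ignored.

```python
import math

# Per-shard write limits for Kinesis Data Streams.
RECORDS_PER_SEC_PER_SHARD = 1000
MB_PER_SEC_PER_SHARD = 1.0


def shards_needed(records_per_sec: float, avg_record_kb: float) -> int:
    """Return the smallest shard count that satisfies both write limits.

    Hypothetical helper: only write throughput is considered.
    """
    mb_per_sec = records_per_sec * avg_record_kb / 1024
    by_count = math.ceil(records_per_sec / RECORDS_PER_SEC_PER_SHARD)
    by_bytes = math.ceil(mb_per_sec / MB_PER_SEC_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_count, by_bytes)
```

For example, 5,000 small records per second are bound by the record-count limit, while 2,000 records of 2 KB each are bound by the byte limit.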

3
Q

What does a Kinesis record consist of?

A

a partition key
sequence number
data blob (up to 1 MB)
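
The partition key decides which shard a record lands in: Kinesis takes the MD5 hash of the key as a 128-bit integer and routes the record to the shard whose hash key range contains it. A minimal sketch, assuming (for illustration only) that the shards split the hash key space into equal ranges; real streams can have uneven ranges after resharding.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit hash key


def hash_key(partition_key: str) -> int:
    """Map a partition key to its 128-bit hash key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Pick a shard, ASSUMING equal-sized hash key ranges (hypothetical)."""
    return hash_key(partition_key) * shard_count // HASH_SPACE
```

Records that share a partition key always hash to the same shard, which is what preserves their relative order.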

4
Q

Does Kinesis Data Streams Store Data? If so, how? If not, why?

A

Yes. It is a transient data store: the default retention period is 24 hours, and it can be extended (originally up to 7 days; AWS has since raised the maximum to 365 days).

5
Q

What are the types of Kinesis services?

A

Kinesis Video Streams
Kinesis Data Streams
Kinesis Data Firehose
Kinesis Data Analytics

6
Q

Kinesis Video Streams

A

Durably stores, encrypts, and indexes video data streams, and allows access to data through APIs

Supports encryption at rest with server-side encryption (KMS) with a customer master key

7
Q

Kinesis Data Streams - max read rate and max write rate per shard?

A

Reads: 5 transactions per second per shard, up to a maximum of 2 MB per second. Writes: 1,000 records per second per shard, up to a maximum of 1 MB per second.

8
Q

Kinesis Data Streams

A

enables real-time processing of streaming data

stores data for later processing by applications

ingest/collect and process large streams of data records in real time at large scale

9
Q

Amazon Kinesis Data Firehose

A

fully managed service for loading streaming data into AWS data stores; it scales automatically to match the throughput of your data and requires no ongoing administration

10
Q

What data operations can be performed on data in Amazon Kinesis Firehose?

A

batch, compress, transform, and encrypt the data

11
Q

Load Targets for Amazon Kinesis Firehose

A

Amazon S3, Amazon Redshift, Amazon Elasticsearch Service (now Amazon OpenSearch Service), and Splunk

12
Q

Primary Use Case for AWS Kinesis Services

A

Handling Streaming Data

13
Q

Does Kinesis Data Streams or Firehose provide ability to store streaming data? If so, for how long?

A

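KDS stores streaming data for a configurable retention period (24 hours by default, extendable to 7 days and, after a later AWS update, up to 365 days). Firehose doesn’t provide any facility for storing streaming data; it only delivers the data to a destination.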

14
Q

Key Features of Amazon Kinesis Data Streams (KDS)?

A

Collected data is available in milliseconds
Enables real-time analytics
Provides ordering of records
Records can be read or replayed in the same order
Transient data store

15
Q

Amazon SQS

A

Simple Queue Service (SQS) is a fully managed message queuing service. There is no software or hardware to acquire, install, or configure; queues are created dynamically and scale automatically, so there is no capacity to provision.

16
Q

Amazon SQS Features/Impacts

A

Decouple and scale microservices, distributed systems, and serverless applications
Buffer messages to smooth out temporary volume spikes or increased latency
Built-in retry: a message that is received but not deleted becomes visible again after its visibility timeout

17
Q

What are the two types of SQS Queues?

A

FIFO

Standard

18
Q

Features of SQS standard queue

A

maximum throughput
best effort ordering
at least once delivery

19
Q

Features of SQS FIFO queue

A

guarantees exactly-once processing - a message remains available until a consumer processes and deletes it

no duplicate messages

process messages in the order they are sent
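
The "no duplicate messages" property rests on a deduplication ID checked over a 5-minute window; with content-based deduplication, SQS derives that ID from a SHA-256 hash of the message body. The class below is a toy in-memory model of that behavior with an explicit clock, not the SQS implementation; the class and method names are hypothetical.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 5 * 60  # SQS FIFO deduplication interval


class FifoDedupSketch:
    """Toy model of content-based deduplication (illustrative only)."""

    def __init__(self):
        self._seen = {}     # dedup id -> time the message was first accepted
        self.messages = []  # accepted bodies, kept in send order

    def send(self, body: str, now: float) -> bool:
        """Return True if the body was enqueued, False if it was a duplicate."""
        dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first = self._seen.get(dedup_id)
        if first is not None and now - first < DEDUP_WINDOW_SECONDS:
            return False  # same content inside the window: not enqueued again
        self._seen[dedup_id] = now
        self.messages.append(body)
        return True
```

Sending the same body twice within five minutes enqueues it once; sending it again after the window has passed enqueues a new copy.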

20
Q

Disadvantage of KDS

A

Scaling is not automatic in provisioned capacity mode: you must provision and resize shards yourself. (AWS later added an on-demand capacity mode that scales automatically.)

21
Q

Difference between SNS and SQS

A

SNS pushes messages to multiple subscribers

SQS clients poll for messages - SQS distributes messages, and is used to decouple apps

22
Q

When should you use Amazon MQ over SQS?

A

If an existing application is being migrated to the cloud, use Amazon MQ because it supports industry-standard APIs and protocols

If starting from scratch, use SQS

23
Q

How is SQS billed?

A

Only pay for what you use

billed per request, plus data transfer out of SQS unless transfer is to EC2 or Lambda in the same region

Free tier provides 1 million requests per month at no charge

24
Q

SQS Visibility Timeout? Default time? Min and max? Is behavior of time-out different for queue types?

A

A period of time during which Amazon SQS prevents other consumers from receiving and processing a message that a consumer has already received

The default visibility timeout for a message is 30 seconds. The minimum is 0 seconds. The maximum is 12 hours.

For standard queues, the visibility timeout isn’t a guarantee against receiving a message twice.
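
The receive, hide, and reappear cycle can be sketched with a toy in-memory queue and an explicit clock. This is an illustration of the concept only (one consumer at a time, no duplicates modeled); the class and method names are hypothetical and unrelated to the SQS API.

```python
class VisibilityQueueSketch:
    """Toy model of a visibility timeout (illustrative only)."""

    def __init__(self, visibility_timeout: float = 30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time it becomes visible]
        self._next_id = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self._messages[self._next_id] = [body, 0.0]
        return self._next_id

    def receive(self, now: float):
        """Return (id, body) of the first visible message, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message from other consumers for the timeout.
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id: int) -> None:
        """A consumer deletes the message once it has processed it."""
        self._messages.pop(msg_id, None)
```

If the consumer never calls `delete`, the message becomes receivable again once the timeout expires, which is how an unprocessed message gets retried.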

25
Q

SQS Long Polling?
Difference between long and short polling?
What is more costly?

A

Long polling doesn’t return a response until a message arrives or the long poll times out

Short polling returns immediately regardless of if there is a message or not

Short polling is more costly because of the number of potentially empty responses and repeated requests
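
The cost difference is simple arithmetic on request counts, since SQS bills per request. The sketch below counts ReceiveMessage calls against an empty queue; the one-call-per-second short-polling loop is an assumption made for illustration, and 20 seconds is the maximum long-poll wait time.

```python
import math

MAX_LONG_POLL_WAIT_SECONDS = 20  # upper bound for WaitTimeSeconds


def empty_receives(idle_seconds: int, wait_time_seconds: int,
                   short_poll_interval: float = 1.0) -> int:
    """Count receive calls made while the queue stays empty.

    wait_time_seconds == 0 models short polling, where the client is
    ASSUMED to retry every short_poll_interval seconds (hypothetical).
    """
    if not 0 <= wait_time_seconds <= MAX_LONG_POLL_WAIT_SECONDS:
        raise ValueError("wait_time_seconds must be between 0 and 20")
    period = wait_time_seconds if wait_time_seconds > 0 else short_poll_interval
    return math.ceil(idle_seconds / period)
```

Over an idle hour, that is 3,600 billable requests for the short-polling loop versus 180 for long polling with a 20-second wait.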

26
Q

How reliable is the storage of data in Amazon SQS?

A

SQS stores all message queues and messages in a single, highly available AWS region with multiple redundant AZs

No single computer, network or AZ failure will make messages inaccessible

27
Q

Can you encrypt messages in SQS, if so how, and if not why not?

A

Yes. Use server-side encryption with AWS KMS (SSE-KMS), or manage encryption yourself by encrypting message bodies on the client side before sending

28
Q

What is default SQS message retention period and what is min and max?

A

Default period is 4 days and can be set from 1 minute to 14 days
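
The retention period is set through the `MessageRetentionPeriod` queue attribute, expressed in seconds (60 to 1,209,600, that is 1 minute to 14 days). A minimal sketch of a helper that validates the value and builds the attribute map; the helper name is hypothetical, and no AWS call is made here.

```python
MIN_RETENTION_SECONDS = 60                     # 1 minute
MAX_RETENTION_SECONDS = 14 * 24 * 60 * 60      # 14 days = 1,209,600 s
DEFAULT_RETENTION_SECONDS = 4 * 24 * 60 * 60   # 4 days = 345,600 s


def retention_attributes(seconds: int = DEFAULT_RETENTION_SECONDS) -> dict:
    """Build the queue attribute map for a retention period (hypothetical helper)."""
    if not MIN_RETENTION_SECONDS <= seconds <= MAX_RETENTION_SECONDS:
        raise ValueError("retention must be between 60 and 1209600 seconds")
    # SQS queue attribute values are passed as strings.
    return {"MessageRetentionPeriod": str(seconds)}
```

The resulting dictionary has the shape expected by the `Attributes` parameter of the boto3 SQS client's `create_queue` and `set_queue_attributes` calls.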

29
Q

What happens when SQS message reaches retention period?

A

Message is automatically deleted

30
Q

Dead letter queue? Is behavior different for queue types?

A

An Amazon SQS queue to which a source queue can send messages if the source queue's consumer application is unable to consume them successfully

FIFO queue must use FIFO dead letter queue and Standard queue must use Standard dead letter queue
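
A dead-letter queue is attached to a source queue through the `RedrivePolicy` attribute, a JSON string naming the target queue ARN and a `maxReceiveCount`. The sketch below builds that attribute and enforces the matching-type rule; the helper name is hypothetical, and inferring the queue type from the `.fifo` suffix of the ARN is a simplification.

```python
import json


def redrive_policy(dlq_arn: str, max_receive_count: int,
                   source_is_fifo: bool) -> dict:
    """Build the RedrivePolicy attribute for a source queue (hypothetical helper)."""
    # FIFO queue names (and therefore ARNs) end with ".fifo".
    dlq_is_fifo = dlq_arn.endswith(".fifo")
    if dlq_is_fifo != source_is_fifo:
        raise ValueError("dead-letter queue type must match the source queue type")
    if not 1 <= max_receive_count <= 1000:
        raise ValueError("maxReceiveCount must be between 1 and 1000")
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    # The attribute value is the policy serialized as a JSON string.
    return {"RedrivePolicy": json.dumps(policy)}
```

With `max_receive_count=5`, a message moves to the dead-letter queue after it has been received five times without being deleted.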

31
Q

SNS

A

Amazon Simple Notification Service (SNS) is a highly available, durable, secure, fully managed pub/sub messaging service

32
Q

Amazon SNS

A

Amazon Simple Notification Service (Amazon SNS) is a fully managed pub/sub messaging service
provides topics for high-throughput, push-based, many-to-many messaging

33
Q

Features of SNS

A

SNS Message Filtering - subscribers only receive messages of interest

SNS Message Batching - Batch up to 10 messages in a single API request

Ordering and deduplication - Pair an SNS FIFO topic with SQS FIFO queues so that messages are delivered in order and processed exactly once, with no duplicates

Message Encryption with KMS

Traffic Privacy with PrivateLink and VPC

Delivery Retry, Dead Letter Queue - when subscribers are not available

Message Archiving - Firehose, S3 subscriptions

Integration with other AWS services (over 60)

Send A2P notifications via SMS, mobile push and email
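
Message filtering, the first feature in the list above, works by attaching a filter policy to a subscription: a JSON object that maps attribute names to lists of accepted values. The matcher below is a simplified sketch that covers exact string matching on message attributes only; real filter policies also support operators such as prefix and numeric matching, which are not modeled here.

```python
def matches(filter_policy: dict, message_attributes: dict) -> bool:
    """Exact-match subset of SNS filter policy semantics (illustrative only).

    Every attribute named in the policy must be present on the message
    and carry one of the values listed for it.
    """
    for name, allowed_values in filter_policy.items():
        if message_attributes.get(name) not in allowed_values:
            return False
    return True
```

With the policy `{"event_type": ["order_placed", "order_cancelled"]}`, a subscriber would receive order events and skip everything else published to the topic.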

34
Q

Amazon Athena? How is it billed?

A

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage. You pay only for the queries that you run, billed by the amount of data each query scans.

35
Q

What is the naming requirement of a FIFO SQS queue?

A

The name of a FIFO queue must end with the .fifo suffix. The suffix counts towards the 80-character queue name limit.
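
The naming rule is easy to check before calling the API. A minimal sketch, assuming the usual SQS name character set (alphanumeric characters, hyphens, and underscores); the function name is hypothetical.

```python
import re

FIFO_SUFFIX = ".fifo"
MAX_NAME_LENGTH = 80  # the suffix counts toward this limit
_NAME_CHARS = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_fifo_queue_name(name: str) -> bool:
    """Check the .fifo suffix, the 80-character limit, and the allowed characters."""
    if len(name) > MAX_NAME_LENGTH or not name.endswith(FIFO_SUFFIX):
        return False
    stem = name[: -len(FIFO_SUFFIX)]
    # The part before the suffix must be non-empty and use allowed characters.
    return bool(_NAME_CHARS.fullmatch(stem))
```

Because the suffix is included in the limit, the part before `.fifo` can be at most 75 characters long.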

36
Q

How do you convert standard queue into FIFO?

A

To make the move, you must either create a new FIFO queue for your application or delete your existing standard queue and recreate it as a FIFO queue.

37
Q

Amazon Macie

A

fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in Amazon S3

38
Q

Amazon Macie Features

A

automatically provides an inventory of Amazon S3 buckets including a list of unencrypted buckets, publicly accessible buckets, and buckets shared with AWS accounts outside those you have defined in AWS Organizations

identify and alert you to sensitive data, such as personally identifiable information (PII)

39
Q

Amazon SQS message timers

A

Message timers let you specify an initial invisibility period for a message added to a queue.

The default (minimum) delay for a message is 0 seconds. The maximum is 15 minutes.
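
A message timer is set per message through the `DelaySeconds` parameter of a send request. The helper below is a sketch that validates the 0 to 900 second range and builds the parameters; the helper name and the example queue URL are hypothetical. It also rejects timers on FIFO queues, which do not support per-message timers (only the per-queue delay).

```python
MAX_DELAY_SECONDS = 15 * 60  # 900 seconds


def send_message_params(queue_url: str, body: str, delay_seconds: int = 0) -> dict:
    """Build send parameters with a message timer (hypothetical helper)."""
    if not 0 <= delay_seconds <= MAX_DELAY_SECONDS:
        raise ValueError("delay_seconds must be between 0 and 900")
    # FIFO queue URLs end with the ".fifo" queue name suffix.
    if queue_url.endswith(".fifo") and delay_seconds:
        raise ValueError("FIFO queues do not support per-message timers")
    return {"QueueUrl": queue_url, "MessageBody": body, "DelaySeconds": delay_seconds}
```

The returned dictionary uses the parameter names of the boto3 SQS client's `send_message` call, so it could be unpacked into that call as keyword arguments.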

40
Q

Amazon SQS delay queues

A

Delay queues let you postpone the delivery of new messages to a queue for a number of seconds

any messages that you send to the queue remain invisible to consumers for the duration of the delay period

The default (minimum) delay for a queue is 0 seconds. The maximum is 15 minutes

For standard queues, the per-queue delay setting is not retroactive

For FIFO queues, the per-queue delay setting is retroactive

41
Q

Amazon DocumentDB

A

fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data