Applications Flashcards

1
Q

Describe SQS, its queue types, and its features

A

It is the queueing system AWS offers, used to decouple components. In a nutshell, a producer puts a message onto a queue, where it is stored durably; one or more readers then poll the queue for messages.

SQS is pull-based: an application must ask for messages. It is not push-based like SNS.
Messages are up to 256 KB in size, but an increase can be requested if needed
The retention period ranges from 1 minute to 14 days; the default is 4 days
Visibility timeout: the default is 30 seconds, with a maximum of 12 hours. It is the time a received message stays invisible to other readers; if the message is not processed and deleted within that window, it becomes available to another reader
SQS long polling can be used to reduce the billing: a receive call does not return until a message arrives or the long poll times out
There are two types of queues:
Standard queue: unlimited number of transactions per second, but with at-least-once delivery there is a possibility of duplicated and out-of-order messages
FIFO: It is limited to 300 transactions per second. Messages are delivered in order with no duplication, and a message remains available until a consumer processes and deletes it
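The receive/visibility-timeout lifecycle above can be sketched with a toy in-memory queue (a minimal illustration of the semantics, not the real SQS API; `MiniQueue` and its injectable clock are invented for this sketch):

```python
import itertools
import time

class MiniQueue:
    """Toy in-memory queue illustrating SQS visibility-timeout semantics."""

    def __init__(self, visibility_timeout=30, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock                 # injectable so tests avoid sleeping
        self._messages = {}                # receipt -> [body, invisible_until]
        self._ids = itertools.count()

    def send(self, body):
        receipt = str(next(self._ids))
        self._messages[receipt] = [body, 0.0]   # visible immediately
        return receipt

    def receive(self):
        now = self.clock()
        for receipt, entry in self._messages.items():
            body, invisible_until = entry
            if invisible_until <= now:
                # hide the message from other readers for the timeout window
                entry[1] = now + self.visibility_timeout
                return receipt, body
        return None                        # empty receive (short polling)

    def delete(self, receipt):
        # a message only leaves the queue when the consumer deletes it
        self._messages.pop(receipt, None)
```

If the consumer crashes before calling `delete`, the message simply becomes visible again once the timeout expires, which is the at-least-once behaviour of a Standard queue.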

2
Q

What is the Simple Workflow Service (SWF), and what are its actors?

A

It is a task-oriented system that coordinates the execution of tasks (sync/async). It also supports human-task interaction.
It has the following actors:
Workflow starters: an application that initiates a workflow;
Deciders: decide what to do next when a task finishes or fails;
Activity workers: carry out the activity tasks;

A workflow execution can last up to 1 year.

Before creating a workflow, you need to register a domain, on which you define the retention period
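Registering the domain and its retention period can be sketched as below (a hedged boto3 sketch; the domain name and retention value are hypothetical, and the API expects the retention period as a string of days):

```python
def register_domain_params(name, retention_days, description=""):
    """Build the kwargs for SWF RegisterDomain.

    SWF expects workflowExecutionRetentionPeriodInDays as a string.
    """
    return {
        "name": name,
        "description": description,
        "workflowExecutionRetentionPeriodInDays": str(retention_days),
    }

# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   swf = boto3.client("swf")
#   swf.register_domain(**register_domain_params("demo-domain", 30))
```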

3
Q

Simple Notification Service

A

It supports push notifications to mobile devices, SMS, email, SQS queues, or any HTTP endpoint
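Publishing a message that fans out to all subscribed endpoints can be sketched as below (a hedged boto3 sketch; the topic ARN is a hypothetical placeholder):

```python
def publish_params(topic_arn, message, subject=None):
    """Build the kwargs for SNS Publish; each subscriber (SQS, email,
    HTTP, SMS, mobile push) receives the message via its own protocol."""
    params = {"TopicArn": topic_arn, "Message": message}
    if subject is not None:
        params["Subject"] = subject   # used by email subscribers
    return params

# Usage (requires boto3, configured AWS credentials, and an existing topic):
#   import boto3
#   sns = boto3.client("sns")
#   sns.publish(**publish_params(
#       "arn:aws:sns:us-east-1:123456789012:demo-topic", "hello"))
```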

4
Q

Elastic Transcoder

A

It is a workflow service that converts media formats at low cost. A simple use case is to record in 4K and convert to the optimal formats for each device type (smartphones, tablets, PCs, etc).

5
Q

Why would you use AWS API Gateway?

A

To scale and expose HTTPS RESTful APIs whose usage is tracked, throttled, and logged via API keys. It maintains multiple versions of an API for staging/production

You can configure one by defining the API, its (nested) resources and their HTTP methods, security, targets, and request/response transformations

It provides free SSL/TLS certificates when used with AWS Certificate Manager

It supports caching of endpoint responses
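Calling a deployed stage with an API key (which API Gateway uses for usage tracking and throttling) can be sketched with the standard library; the URL and key below are hypothetical:

```python
import urllib.request

def build_request(url, api_key):
    """Attach the x-api-key header that API Gateway's usage plans
    check for tracking and throttling."""
    req = urllib.request.Request(url)
    req.add_header("x-api-key", api_key)
    return req

# Usage (against a hypothetical deployed stage):
#   resp = urllib.request.urlopen(build_request(
#       "https://abc123.execute-api.us-east-1.amazonaws.com/prod/items",
#       "my-api-key"))
```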

6
Q

What is CORS?

A

It is the mechanism that allows restricted resources on a website to be requested from a domain outside the domain from which the first resource was served.

How does it work?

1: The browser makes an HTTP OPTIONS (preflight) call for a URL
2: The server responds with “These other domains are approved to GET this URL”
3: If the requesting origin is not approved, the browser raises an error such as “Origin policy cannot be read at the remote resource”
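The server side of the preflight step can be sketched as a pure function (a minimal illustration; header names are the real CORS response headers, everything else is invented for the sketch):

```python
def preflight_response(origin, allowed_origins, allowed_methods=("GET",)):
    """Answer a CORS preflight (OPTIONS): echo the origin back if it is
    approved; otherwise return no CORS headers, in which case the browser
    blocks the cross-origin call and reports a CORS error."""
    if origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(allowed_methods),
    }
```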

7
Q

Describe Kinesis

A

Streaming data is generated by thousands of data sources such as stock prices, game data (as the gamer plays), social network feeds, geospatial data, IoT sensor data, etc

Kinesis Data Streams retains data from 1 day (the default) up to 7 days.

There are three flavours of Kinesis:
Streams: Producers put data into shards; consumers read the shard data to process it. Each shard supports 1000 writes/second, and 5 reads/second at up to 2 MB/second. Typical outputs are DynamoDB/EMR/S3/Redshift
Firehose: Similar to Streams, but with no persistence/shards. The data must be processed/converted right away (e.g. with Lambda) and is delivered to S3/Elasticsearch
Analytics: Analyses the data on the fly (with SQL or Flink) and outputs to S3/Redshift/Elasticsearch
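Records are routed to shards by an MD5 hash of their partition key over a 128-bit range; this can be sketched as below (an illustration that assumes the shards split the hash range evenly, as they do on a freshly created stream):

```python
import hashlib

def shard_for_key(partition_key, num_shards):
    """Pick a shard the way Kinesis routes records: hash the partition
    key with MD5 and see which even slice of the 128-bit range it
    falls into."""
    h = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return h * num_shards >> 128   # floor(h * num_shards / 2**128)
```

Records with the same partition key always land in the same shard, which is what preserves per-key ordering.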

8
Q

Describe Cognito

A

It provides Web Identity Federation and is recommended for all mobile applications.

There are two flavours:

User pools: provide sign-up/sign-in functionality
Identity pools: the user enters his credentials into the social media site (the Web Identity Provider); Cognito receives an authentication token which is exchanged for temporary AWS credentials, allowing the user to assume an IAM role.

9
Q

What is the size limit of an item in DynamoDB?

A

DynamoDB delivers single-digit millisecond latency at any scale because data is stored on SSDs across three AZs
It supports both document and key-value models
The maximum item size in DynamoDB is 400 KB, which includes both the attribute name lengths (UTF-8 binary length) and the attribute value lengths (again binary length).
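The size rule above can be made concrete with a small helper (a rough sketch that only handles string attributes; `item_size_bytes` and `fits` are invented names):

```python
def item_size_bytes(item):
    """Rough size of a DynamoDB item with string attributes only:
    UTF-8 byte length of each attribute name plus its value."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in item.items())

MAX_ITEM_BYTES = 400 * 1024  # the 400 KB item-size limit

def fits(item):
    """True if the item is within the 400 KB limit."""
    return item_size_bytes(item) <= MAX_ITEM_BYTES
```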

10
Q

Explain Eventually Consistent Reads vs Strongly Consistent Reads

A

Eventually consistent: written data can take up to ~1s to reach all replicas, so a read may return stale data
Strongly consistent read: returns the most recent write; enable it when the application needs to read its writes back in less than 1s
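Requesting a strongly consistent read is a single flag on GetItem, sketched below (a hedged boto3 sketch; table name and key are hypothetical):

```python
def get_item_params(table, key, strongly_consistent=False):
    """Build the kwargs for DynamoDB GetItem; ConsistentRead=True forces
    a strongly consistent read, at a higher read-capacity cost than an
    eventually consistent one."""
    return {"TableName": table, "Key": key,
            "ConsistentRead": strongly_consistent}

# Usage (requires boto3, configured AWS credentials, and an existing table):
#   import boto3
#   ddb = boto3.client("dynamodb")
#   ddb.get_item(**get_item_params(
#       "demo-table", {"id": {"S": "42"}}, strongly_consistent=True))
```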

11
Q

What is DynamoDB Accelerator (DAX)?

A

It is the in-memory cache in front of DynamoDB that improves read performance up to 10x while not requiring any code change, as it is fully compatible with the DynamoDB API

12
Q

How does DynamoDB support transactions?

A

It prepares and then commits, so you are charged for extra capacity
It supports up to 25 items or 4 MB of data per transaction
It is interesting to know that DMS (Database Migration Service) supports DynamoDB/Aurora/Kafka/S3/Kinesis/Elasticsearch as targets
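The per-transaction limits can be checked client-side before calling TransactWriteItems, sketched below (`validate_transaction` is an invented helper, and the size check is only an approximation over the serialised payload):

```python
def validate_transaction(items):
    """Pre-check the transaction limits (25 items, 4 MB of data)
    before submitting a TransactWriteItems request."""
    if len(items) > 25:
        raise ValueError("a transaction supports at most 25 items")
    # approximate: measure the UTF-8 size of each item's repr
    total = sum(len(repr(i).encode("utf-8")) for i in items)
    if total > 4 * 1024 * 1024:
        raise ValueError("a transaction supports at most 4 MB of data")
    return True
```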

13
Q

How is DynamoDB charged?

A

Provisioned: charged per provisioned capacity and storage
On-demand: pay-per-request, but requests are more expensive than with provisioned capacity. Once the application's traffic settles down, it is a good idea to migrate to provisioned mode, and to use VPC endpoints to save money

14
Q

How does DynamoDB backup work?

A

Same-region full backups can be taken anytime with zero performance impact; retention is up to 35 days;
Point-in-time recovery is not enabled by default, but it permits an RPO of about 5 minutes.

15
Q

How do DynamoDB global tables work?

A
Multi-master;
Multi-region;
Less than 1s replication latency over regions for new items;
Based on DynamoDB streams;
No need to rewrite the application;
16
Q

Describe DynamoDB stream

A

It is a time-ordered log of item-level changes. This is the technology behind DynamoDB replication
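A consumer (e.g. a Lambda function subscribed with NEW_AND_OLD_IMAGES) receives each change as a record containing the old and new item images; diffing them can be sketched as below (`changed_attributes` is an invented helper over the real record shape):

```python
def changed_attributes(record):
    """Given one DynamoDB stream record, return the sorted names of the
    attributes whose value changed between OldImage and NewImage."""
    images = record.get("dynamodb", {})
    old = images.get("OldImage", {})
    new = images.get("NewImage", {})
    return sorted(k for k in set(old) | set(new) if old.get(k) != new.get(k))
```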

17
Q

Describe Redshift

A

Petabyte-scale data warehousing solution at a low cost of about $1,000/TB/year;

It compresses data per column by identifying the best algorithm based on the initial sample data;

It doesn’t require indexes or materialised views;

It distributes the data and queries across all nodes via Massively Parallel Processing (MPP)

Snapshot retention is 1 day by default, configurable up to 35 days;

3 replicas of the data: Compute node, Replica node and S3;

It is only available in one AZ, but snapshots can be restored in another AZ for DR;

18
Q

Describe Aurora

A

It is AWS's cost-effective and highly scalable relational database, compatible with MySQL and PostgreSQL

Up to 5x better performance than MySQL and 3x better performance than Postgres

From 10GB to 64TB (autoscaling)
Up to 32vCPU and 244GB RAM

2 copies of data in each AZ, with a minimum of 3 AZs, providing 6 copies of data

Single-region only;

It does not support multiple schemas;

Automated backups are always enabled;

Manual snapshots can be taken and shared with other accounts;

19
Q

Elastic Map-Reduce

A

Industry-leading solution for big data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi and Presto;

Petabyte-scale processing at half the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark;

EMR is a cluster of EC2 instances. Each instance has a role: the master node tracks the health of the cluster; core nodes run tasks and store data; task nodes run tasks but do not store data;

The ms