Scaling - Exam Topics Flashcards

1
Q

When developing an Amazon Kinesis Data Streams application, what is the recommended method to read data from a shard?

A

Data can be read (consumed) from shards in Kinesis Data Streams using either the Kinesis Data Streams API or the Kinesis Client Library (KCL), but AWS recommends using the KCL. The Kinesis Producer Library (KPL) only writes to streams; it cannot read from them.
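
To illustrate what the KCL takes off your hands, here is a minimal sketch of reading one shard directly with the Kinesis Data Streams API. It assumes a boto3-style Kinesis client is passed in (for example `boto3.client("kinesis")`); the function name `read_shard_once` and its parameters are hypothetical. Checkpointing, shard discovery, resharding, and balancing shards across workers are all left out, and that is the work the KCL does for you.

```python
def read_shard_once(client, stream_name, shard_id, max_batches=3):
    """Read up to max_batches batches from one shard via the low-level API.

    `client` is assumed to be a boto3-style Kinesis client.
    """
    # TRIM_HORIZON starts at the oldest record still retained in the shard.
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    records = []
    for _ in range(max_batches):
        if iterator is None:  # no further iterator: the shard was closed
            break
        response = client.get_records(ShardIterator=iterator, Limit=100)
        records.extend(response["Records"])
        # Each call returns the iterator to use for the next call.
        iterator = response.get("NextShardIterator")
    return records
```

A real consumer would also need to remember its position between restarts and react to shard splits and merges, which is why the hand-rolled route is rarely worth it.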

2
Q

In CloudFront, Behaviors permit which of the following scenarios?

A

Routing requests to different origins based on the URL path.
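
The idea can be sketched as a small lookup that mimics how behaviors map path patterns to origins. This is only an illustration, not CloudFront's implementation: the patterns, origin names, and the `pick_origin` helper are hypothetical, and Python's `fnmatchcase` is used as a stand-in for CloudFront's wildcard matching (it also understands `[...]` character classes, which is an extra of `fnmatch`).

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors: (path pattern, origin name), checked in listed order.
BEHAVIORS = [
    ("/images/*", "s3-static-assets"),
    ("/api/*", "alb-backend"),
]
DEFAULT_ORIGIN = "web-origin"  # plays the role of the default (*) behavior


def pick_origin(path):
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in BEHAVIORS:
        # Case-sensitive match, as path patterns are case-sensitive.
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


print(pick_origin("/images/logo.png"))  # s3-static-assets
print(pick_origin("/api/v1/users"))     # alb-backend
print(pick_origin("/index.html"))       # web-origin
```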

3
Q

You are designing a DynamoDB datastore to record electric meter readings from millions of homes once a week. We share weekly live electric consumption charts on our website based on this data, so the week must be part of the primary key. How might we design our datastore for optimal efficiency?

A

Use a table per week to store the data.

General design principles in Amazon DynamoDB recommend that you keep the number of tables you use to a minimum. For most applications, a single table is all you need. However, for time series data, you can often best handle it by using one table per application per period.
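
One way to apply the table-per-period idea is to derive the table name from the date of the reading. The sketch below assumes ISO week numbering and a hypothetical `meter_readings` prefix; neither comes from the card, and any consistent naming scheme would do.

```python
from datetime import date


def weekly_table_name(reading_date, prefix="meter_readings"):
    """Build a per-week table name such as meter_readings_2024_W01."""
    # isocalendar() returns (ISO year, ISO week number, ISO weekday).
    iso_year, iso_week, _ = reading_date.isocalendar()
    return f"{prefix}_{iso_year}_W{iso_week:02d}"


print(weekly_table_name(date(2024, 1, 1)))  # meter_readings_2024_W01
print(weekly_table_name(date(2021, 1, 3)))  # meter_readings_2020_W53
```

Note that the ISO year can differ from the calendar year around New Year, as the second call shows, so the name should be built from the ISO year rather than `reading_date.year`.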

4
Q

We have set up an Auto Scaling group using dynamic scaling based on CPU utilization. During heavy spikes in demand, our fleet is initially unable to keep up but eventually catches up. How might we address this most cost-effectively?

A

Reduce the cooldown time to allow scaling to be more dramatic and responsive.
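
To see why the cooldown matters, consider a toy model, not AWS's actual algorithm: the alarm stays breached for every minute of the spike, each scaling action adds a fixed number of instances, and no new action may start until the cooldown has elapsed. The function name, starting capacity, and step size are all assumptions made for illustration.

```python
def simulate_scale_out(breach_minutes, cooldown_minutes, start_capacity=2, step=1):
    """Return the fleet size at each minute of a sustained spike (toy model)."""
    capacity = start_capacity
    last_action = None  # minute of the most recent scaling action
    history = []
    for minute in range(breach_minutes):
        in_cooldown = (
            last_action is not None
            and minute - last_action < cooldown_minutes
        )
        if not in_cooldown:
            capacity += step
            last_action = minute
        history.append(capacity)
    return history


print(simulate_scale_out(10, cooldown_minutes=5)[-1])  # 4
print(simulate_scale_out(10, cooldown_minutes=1)[-1])  # 12
```

In this model a 5-minute cooldown allows only two scale-out actions during a 10-minute spike, while a 1-minute cooldown allows ten, so the fleet reaches the needed size much sooner.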

5
Q

Based on past statistics of our web traffic, we observe that we sometimes get traffic spikes on Monday morning. What is the most cost-effective type of scaling to use for this scenario?

A

Dynamic

You might be tempted to use Scheduled scaling given the traffic pattern, but this would scale out needlessly on the Mondays when the spike does not occur. The most efficient approach is Dynamic scaling based on a metric such as connections, CPU, or network I/O.
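
A back-of-the-envelope comparison makes the cost argument concrete. The numbers below are invented for illustration (spikes on 13 of 52 Mondays, 4 extra instances for 3 hours each time), and the model ignores the short lag before dynamic scaling reacts.

```python
def extra_instance_hours(weeks, spike_weeks, extra_instances, spike_hours):
    """Compare extra instance-hours for scheduled vs. dynamic scaling.

    Scheduled scaling adds capacity every Monday; dynamic scaling adds it
    only in the weeks where the spike actually happens.
    """
    scheduled = weeks * extra_instances * spike_hours
    dynamic = spike_weeks * extra_instances * spike_hours
    return scheduled, dynamic


scheduled, dynamic = extra_instance_hours(
    weeks=52, spike_weeks=13, extra_instances=4, spike_hours=3
)
print(scheduled, dynamic)  # 624 156
```

The rarer the spike, the larger the gap between the two figures.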

6
Q

After an EMR cluster is terminated, what happens to the data stored in HDFS?

A

It is deleted.

Data stored on HDFS in an EMR cluster is ephemeral, so it is deleted when the cluster is terminated. If persistence is required, S3 might be an option, accessed through the EMRFS file system.
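
In practice the difference shows up in the URI a job writes to. The small check below is a sketch with hypothetical paths and a hypothetical helper name; it treats `s3://` locations as durable and everything else, including `hdfs://`, as tied to the cluster's lifetime.

```python
from urllib.parse import urlparse


def survives_cluster_termination(uri):
    """Return True if the output location outlives the EMR cluster."""
    # On EMR, s3:// paths are served by EMRFS and stored in Amazon S3.
    return urlparse(uri).scheme == "s3"


print(survives_cluster_termination("hdfs:///user/hadoop/output"))  # False
print(survives_cluster_termination("s3://example-bucket/output/"))  # True
```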

7
Q

What are the main uses of Kinesis Data Streams?

Possible Correct: 2

A

They can accept data as soon as it has been produced, without the need for batching

They can enable real-time reporting and analysis of streamed data
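
The first point can be sketched as a producer that sends each event the moment it is created, one record per call, with no batching step. A boto3-style Kinesis client is assumed to be passed in (for example `boto3.client("kinesis")`); the `send_event` helper, stream name, and event fields are hypothetical.

```python
import json


def send_event(client, stream_name, event, partition_key):
    """Write a single event to the stream as soon as it is produced."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),  # payload is sent as bytes
        PartitionKey=partition_key,  # determines which shard gets the record
    )
```

A call might look like `send_event(client, "clickstream", {"page": "/home"}, partition_key="user-42")`, after which consumers can read the record for real-time reporting.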
