[07] Monitoring Flashcards

1
Q

What CloudWatch metrics does ECS generate for clusters?

A

CPUReservation, CPUUtilization, MemoryReservation, MemoryUtilization, GPUReservation

2
Q

What is the CPUReservation metric for clusters?

A

The percentage of the cluster's registered CPU units that are reserved by running EC2 tasks

3
Q

What is the CPUUtilization metric for clusters?

A

The CPU units in use by tasks in the cluster, divided by the total CPU units registered by the cluster's container instances, expressed as a percentage (only includes EC2 tasks)
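
As a worked illustration of the two cluster-level CPU metrics, the sketch below computes both percentages from made-up numbers. It assumes the denominator in both cases is the total CPU units registered by the cluster's container instances; the function names and figures are hypothetical and not part of any AWS API.

```python
def cluster_cpu_reservation(reserved_units: float, registered_units: float) -> float:
    """Percent of the cluster's registered CPU units reserved by running tasks."""
    if registered_units <= 0:
        raise ValueError("registered_units must be positive")
    return 100.0 * reserved_units / registered_units


def cluster_cpu_utilization(used_units: float, registered_units: float) -> float:
    """Percent of the cluster's registered CPU units actually in use by tasks."""
    if registered_units <= 0:
        raise ValueError("registered_units must be positive")
    return 100.0 * used_units / registered_units


# Hypothetical cluster: two container instances with 2048 CPU units each.
registered = 2 * 2048
reserved = 3 * 512   # three running tasks, each reserving 512 units
used = 768           # units the tasks are consuming right now

reservation_pct = cluster_cpu_reservation(reserved, registered)  # 37.5
utilization_pct = cluster_cpu_utilization(used, registered)      # 18.75
```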

4
Q

What is the MemoryReservation metric for clusters?

A

The percentage of the cluster's registered memory that is reserved by running EC2 tasks

5
Q

What is the MemoryUtilization metric for clusters?

A

The memory in use by tasks in the cluster, divided by the total memory registered by the cluster's container instances, expressed as a percentage (only includes EC2 tasks)

6
Q

What is the GPUReservation metric for clusters?

A

The percentage of total available GPUs that are reserved by running tasks in the cluster

7
Q

What CloudWatch metrics does ECS generate for services?

A

CPUUtilization, MemoryUtilization

8
Q

What is the CPUUtilization metric for services?

A

The CPU units in use by the service's tasks, divided by the CPU units reserved for those tasks, expressed as a percentage

9
Q

What is the MemoryUtilization metric for services?

A

The memory in use by the service's tasks, divided by the memory reserved for those tasks, expressed as a percentage
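
Because service-level utilization is measured against what the service's tasks reserved rather than against the cluster's capacity, it can read above 100% when tasks consume more than their reservation (for example, with a soft memory limit). A minimal sketch with hypothetical numbers and a made-up helper name:

```python
def service_utilization(used: float, reserved: float) -> float:
    """Percent of a service's reserved resource (CPU units or MiB) in use."""
    if reserved <= 0:
        raise ValueError("reserved must be positive")
    return 100.0 * used / reserved


# Hypothetical service: 4 tasks, each reserving 256 CPU units and 512 MiB.
cpu_pct = service_utilization(used=640, reserved=4 * 256)    # 62.5
mem_pct = service_utilization(used=2560, reserved=4 * 512)   # 125.0, above the reservation
```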

10
Q

What metrics are generated when Service Connect is enabled?

A

Metrics similar to an Application Load Balancer (ALB)

11
Q

How frequently are CloudWatch metrics published by ECS?

A

With a 1-minute frequency (ECS internally collects multiple samples and aggregates them)

12
Q

How can you collect metrics via Prometheus?

A

By adding an AWS Distro for OpenTelemetry (ADOT) sidecar

13
Q

What is a way to add an ADOT sidecar to a task definition?

A

The console has an option to automatically add this sidecar to the task definition

14
Q

What state is a container instance in when the ECS agent health check passes?

A

OK

15
Q

What state is a container instance in when the ECS agent health check fails?

A

IMPAIRED

16
Q

How frequently does the ECS agent perform health checks on the underlying EC2 instance?

A

Every two minutes

17
Q

What is the IMPAIRED state?

A

The container instance state reported when the ECS agent health check fails

18
Q

What is the OK state?

A

The container instance state reported when the ECS agent health check passes

19
Q

What can tasks use to access metadata about themselves?

A

The container metadata file and the task metadata endpoint

20
Q

What does the container metadata file provide information about?

21
Q

How is the container metadata file enabled for EC2 tasks?

A

By the ECS agent, when ECS_ENABLE_CONTAINER_METADATA is set to true in the agent configuration

22
Q

How is the container metadata file mounted?

A

As a Docker volume
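
Inside the container, the path of the mounted file is exposed through the ECS_CONTAINER_METADATA_FILE environment variable, and the file itself is JSON. The sketch below reads it and only returns the data once its MetadataFileStatus field is READY. The helper name and the sample content are hypothetical, and the demo uses a temporary file instead of a real mount.

```python
import json
import os
import tempfile


def read_container_metadata(path=None):
    """Return the parsed metadata file, or None if it is missing or not READY."""
    path = path or os.environ.get("ECS_CONTAINER_METADATA_FILE")
    if not path or not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # The agent writes the file in stages; treat anything but READY as incomplete.
    return data if data.get("MetadataFileStatus") == "READY" else None


# Demo with a made-up sample written to a temporary file.
sample = {"Cluster": "demo", "ContainerName": "web", "MetadataFileStatus": "READY"}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
demo_metadata = read_container_metadata(tmp.name)
os.remove(tmp.name)
```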

23
Q

What does the task metadata endpoint return information about?

A

The current container or its task
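
For version 4 of the endpoint, the base URI is injected into each container as the ECS_CONTAINER_METADATA_URI_V4 environment variable; the base path describes the current container and the /task path describes its task. A rough sketch using only the standard library follows. The function names are hypothetical, and fetch_task_metadata only works when run inside an ECS task.

```python
import json
import os
import urllib.request


def metadata_urls(base_uri: str) -> dict:
    """Build the container-level and task-level URLs from the base URI."""
    base = base_uri.rstrip("/")
    return {"container": base, "task": f"{base}/task"}


def fetch_task_metadata(timeout: float = 2.0) -> dict:
    """Fetch and parse the task metadata (requires running inside an ECS task)."""
    base_uri = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    url = metadata_urls(base_uri)["task"]
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Illustrative base URI only; the real value is assigned per container.
example_urls = metadata_urls("http://169.254.170.2/v4/example-id/")
```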

24
Q

What allows services on the EC2 instance to access information about the container instance and its agent?

A

Container introspection

25
Q

What types of events does ECS emit?

A

Container instance state changes, task state changes, and service action events

26
Q

What are some examples of container instance state changes that trigger ECS events?

A

Tasks being stopped or started on the instance, or the agent disconnecting

27
Q

What are some examples of INFO events related to service actions?

A

SERVICE_STEADY_STATE, TASKSET_STEADY_STATE, CAPACITY_PROVIDER_STEADY_STATE, SERVICE_DESIRED_COUNT_UPDATED

28
Q

What are some examples of WARN events related to service actions?

A

SERVICE_TASK_START_IMPAIRED, SERVICE_DISCOVERY_INSTANCE_UNHEALTHY

29
Q

What are some examples of ERROR events related to service actions?

A

SERVICE_DAEMON_PLACEMENT_CONSTRAINT_VIOLATED, ECS_OPERATION_THROTTLED, SERVICE_DISCOVERY_OPERATION_THROTTLED, SERVICE_TASK_PLACEMENT_FAILURE, SERVICE_TASK_CONFIGURATION_FAILURE
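
Service action events arrive through EventBridge with source aws.ecs and detail-type "ECS Service Action", and the level (INFO, WARN, ERROR) is carried in detail.eventType. The sketch below builds an event pattern that selects only WARN and ERROR events and checks it with a deliberately simplified local matcher. The matcher is an illustration, not EventBridge's full matching logic, and the helper names and sample events are hypothetical.

```python
def service_action_pattern(levels):
    """EventBridge-style pattern for ECS service action events at the given levels."""
    return {
        "source": ["aws.ecs"],
        "detail-type": ["ECS Service Action"],
        "detail": {"eventType": list(levels)},
    }


def matches(pattern, event):
    """Simplified matcher: every pattern leaf is a list of allowed exact values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = service_action_pattern(["WARN", "ERROR"])

placement_failure = {
    "source": "aws.ecs",
    "detail-type": "ECS Service Action",
    "detail": {"eventType": "ERROR", "eventName": "SERVICE_TASK_PLACEMENT_FAILURE"},
}
steady_state = {
    "source": "aws.ecs",
    "detail-type": "ECS Service Action",
    "detail": {"eventType": "INFO", "eventName": "SERVICE_STEADY_STATE"},
}
```

With this pattern, the placement failure matches and the steady-state event is filtered out.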

30
Q

What service deployment state changes trigger ECS events?

A

SERVICE_DEPLOYMENT_IN_PROGRESS, SERVICE_DEPLOYMENT_COMPLETED, SERVICE_DEPLOYMENT_FAILED

31
Q

What state will a service deployment be in while ECS is performing additional steps?

A

SERVICE_DEPLOYMENT_IN_PROGRESS

32
Q

What state indicates a successful completion of a service deployment?

A

SERVICE_DEPLOYMENT_COMPLETED

33
Q

What state indicates a failed service deployment?

A

SERVICE_DEPLOYMENT_FAILED
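
A handler that receives these deployment events might reduce the three event names to a coarse outcome. The sketch below assumes the event name sits under detail.eventName, as in the EventBridge payload; the mapping labels and the function name are hypothetical.

```python
DEPLOYMENT_OUTCOMES = {
    "SERVICE_DEPLOYMENT_IN_PROGRESS": "running",
    "SERVICE_DEPLOYMENT_COMPLETED": "succeeded",
    "SERVICE_DEPLOYMENT_FAILED": "failed",
}


def deployment_outcome(event: dict) -> str:
    """Map a deployment state change event to a coarse outcome label."""
    name = event.get("detail", {}).get("eventName")
    return DEPLOYMENT_OUTCOMES.get(name, "unknown")


failed_event = {"detail": {"eventName": "SERVICE_DEPLOYMENT_FAILED"}}
outcome = deployment_outcome(failed_event)  # "failed"
```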