[06] Services & Scheduling Flashcards
What types of service autoscaling does ECS support?
Target tracking, step scaling policies, and scheduled scaling
What does target tracking autoscaling do?
The desired count of the service is adjusted to maintain a metric at a target value
What is a limitation of target tracking autoscaling?
It only works for services that should scale-out when the metric is above the target value
How are scaling decisions made for scale-out and scale-in with multiple scaling policies?
Scale-out occurs if any of the policies is ready to scale out; scale-in occurs only if all of the policies are ready to scale in
What does step scaling autoscaling do?
The desired count is adjusted in steps based on the size of alarm breaches
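As a rough illustration of how a step could be chosen from the size of an alarm breach, here is a minimal sketch. The step bounds, the 60% threshold in the comments, and the names `StepAdjustment`, `STEPS` and `pick_adjustment` are hypothetical and not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepAdjustment:
    lower: float            # inclusive lower bound of the breach size
    upper: Optional[float]  # exclusive upper bound; None means unbounded
    change: int             # tasks to add to the desired count


# Hypothetical steps for an alarm with a threshold of 60% CPU.
STEPS = [
    StepAdjustment(0, 10, 1),     # 60% <= metric < 70%: add 1 task
    StepAdjustment(10, 20, 2),    # 70% <= metric < 80%: add 2 tasks
    StepAdjustment(20, None, 4),  # metric >= 80%: add 4 tasks
]


def pick_adjustment(metric: float, threshold: float,
                    steps: List[StepAdjustment]) -> int:
    """Return the change in desired count for the given metric value."""
    breach = metric - threshold
    if breach < 0:
        return 0  # alarm not breached, nothing to do
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            return step.change
    return 0
```

A larger breach lands in a later step, so a sudden spike adds more tasks in a single scaling action than a small breach does.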
What is an advantage of step scaling autoscaling?
Step scaling allows faster scale-outs because the thresholds and step sizes are configurable
What happens during a cooldown period after a scaling event?
Further scale-outs will only occur if they are larger than the previous event, and scale-in activities are blocked
What are scaling decisions based on?
The actual number of running tasks in the service, not the desired count
What happens to scale-in processes during deployments?
Scale-in processes are stopped during deployments, but scale-outs can occur
What are the most common forms of service load balancing?
ALBs and NLBs
Can NLBs be used for dynamic port mapping?
Yes
What should the load balancer’s subnet configuration include?
All the AZs that tasks are running in
What condition should be met to prevent 502 errors?
stopTimeout > target group deregistration delay > client connection timeout
What other condition should be met to prevent 502 errors?
task idle timeout > ALB idle timeout
Do the load balancer and tasks need to be in the same VPC?
Yes, but not necessarily the same subnets
What can be assigned to NLBs to give them a static IP?
Elastic IPs
Can services be registered to multiple target groups?
Yes, but these target groups must be used either all with ALB(s), or all with NLB(s)
How can NLBs be configured to terminate connections when the task is stopped?
By enabling connection termination on deregistration in the target group attributes
What are the two ways to connect to a service via API Gateway?
1. Create an HTTP API and route to the service using an ALB, NLB or Cloud Map 2. Create a REST API, configure private integration, and route to the service using an NLB
What does Service Connect only support connectivity between?
ECS services
Why does Service Connect only support connectivity between ECS services?
Because it doesn’t publish DNS records - the proxy sidecar discovers the endpoints using the Cloud Map API
What networking modes are supported by Service Connect?
bridge and awsvpc
What features do the proxies attached to each task by Service Connect enable?
Round-robin load balancing, retries, and the collection of CloudWatch metrics
What does Service discovery integrate with?
Cloud Map
What does Service discovery support in addition to ECS clients?
Non-ECS clients
What networking modes are supported by Service discovery?
awsvpc, and bridge for clients that support SRV records
What are the three states a service can be in?
ACTIVE, DRAINING, INACTIVE
When is a service in the DRAINING state?
Deletion has been triggered, but there are still active tasks
When is a service in the INACTIVE state?
All tasks have transitioned to STOPPING or STOPPED and ECS is ready to delete the service
What happens to unhealthy tasks in a service?
They are replaced by the service scheduler
If a replacement task is UNHEALTHY, what does the service scheduler do?
It will stop either the original or replacement task
What are the two service scheduler strategies?
REPLICA, DAEMON
What does the REPLICA scheduler strategy do?
The scheduler maintains the desired number of tasks and spreads tasks across AZs by default
What does the DAEMON scheduler strategy do?
Exactly one instance of the task runs on each active container instance that matches the task placement constraints
What happens with daemon tasks on an instance?
They launch before replica tasks are started on an instance, and are the last to stop to ensure they have priority for resource reservations
What are the three update deployment types for services?
Rolling update, Blue / Green deployment with CodeDeploy, External deployment
What is a rolling update?
Current tasks are incrementally replaced with new tasks
What is minimumHealthyPercent for a rolling update?
The lower limit on the number of tasks that should run during a deployment, expressed as a percentage of the desired count
What is maximumPercent for a rolling update?
The upper limit of tasks that can run in the service relative to the desired count
What are stuck deployments?
When the scheduler can neither stop existing tasks nor start new ones
What causes a stuck deployment when tasks can’t be stopped?
The minimum number of required tasks is rounded up, e.g. if the desired count is 2 and minimumHealthyPercent is 75%, the minimum is 2 (1.5 rounded up), so no task can be stopped
What causes a stuck deployment when tasks can’t be started?
The maximum number of allowed tasks is rounded down, e.g. if the desired count is 2 and maximumPercent is 125%, the maximum is 2 (2.5 rounded down), so no task can be started
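The rounding behaviour in the two cards above can be checked with a little arithmetic. This is a minimal sketch, assuming the deployment begins with exactly the desired number of tasks running; `deployment_limits` and `is_stuck` are hypothetical helper names.

```python
from typing import Tuple


def deployment_limits(desired: int, min_healthy_pct: int,
                      max_pct: int) -> Tuple[int, int]:
    """Return (minimum running tasks, maximum running tasks)."""
    # The minimum is rounded up: integer ceiling of desired * pct / 100.
    min_running = -(-desired * min_healthy_pct // 100)
    # The maximum is rounded down: integer floor of desired * pct / 100.
    max_running = desired * max_pct // 100
    return min_running, max_running


def is_stuck(desired: int, min_healthy_pct: int, max_pct: int) -> bool:
    """True when no task can be stopped and no task can be started."""
    min_running, max_running = deployment_limits(desired, min_healthy_pct, max_pct)
    can_stop = min_running < desired   # room to drop below the desired count
    can_start = max_running > desired  # room to go above the desired count
    return not (can_stop or can_start)
```

With a desired count of 2, `deployment_limits(2, 75, 125)` gives `(2, 2)`, so `is_stuck(2, 75, 125)` is true. Lowering minimumHealthyPercent to 50 or raising maximumPercent to 150 leaves room for the scheduler to make progress.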
What is failure detection for deployments?
It identifies when a deployment has failed, and optionally triggers a rollback to the last COMPLETED deployment
What does the deployment circuit breaker check?
That the tasks transition to the RUNNING state, and don’t fail ELB, Cloud Map or container health checks
What is the failure threshold for the deployment circuit breaker?
50% of the desired task count, bounded on [3, 200] tasks
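The threshold above is a clamped calculation. A minimal sketch follows; rounding the 50% figure up to a whole task is an assumption, and `failure_threshold` is a hypothetical name.

```python
import math


def failure_threshold(desired_count: int) -> int:
    """Failed tasks needed to trip the circuit breaker."""
    half = math.ceil(0.5 * desired_count)  # assumption: fractions round up
    return min(200, max(3, half))          # bounded on [3, 200]
```

Under these assumptions a service with a desired count of 1 still needs 3 failed tasks, one with 25 needs 13, and one with 1000 is capped at 200.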
How can CloudWatch alarms be used for custom failure detection criteria?
ECS monitors the configured alarms while the deployment is IN_PROGRESS, including during the bake time after the new tasks have started; an alarm that is already in the ALARM state at the beginning of the deployment is ignored
What steps does the service scheduler take to replace tasks during a deployment?
First removes the task from the load balancer and waits for connections to drain, then sends a SIGTERM to the container(s), then a SIGKILL if they don’t stop before the stopTimeout
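The SIGTERM-then-SIGKILL part of this sequence can be imitated with an ordinary local process. This is only an analogy on a POSIX system, not how the ECS agent is implemented; `stop_gracefully` is a hypothetical helper and the 5-second timeout is an arbitrary stand-in for stopTimeout.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, stop_timeout: float) -> int:
    """Send SIGTERM, wait up to stop_timeout seconds, then send SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=stop_timeout)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    # A stand-in "container" that would otherwise run for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", stop_gracefully(child, stop_timeout=5))
```

An application that handles SIGTERM and finishes in-flight work within the timeout exits cleanly; one that ignores the signal is killed once the timeout expires.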
What handles Blue/Green deployments?
CodeDeploy
What are the three ways to shift traffic during a Blue/Green deployment?
Canary, Linear, All-at-once
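The three strategies differ only in how the share of traffic on the replacement task set grows over time. A minimal sketch, with hypothetical function names and with the wait intervals between steps left out:

```python
from typing import List


def canary(first_pct: int) -> List[int]:
    """Shift a small share first, then everything else in one step."""
    return [first_pct, 100]


def linear(step_pct: int) -> List[int]:
    """Shift traffic in equal increments until 100% is reached."""
    schedule = list(range(step_pct, 100, step_pct))
    schedule.append(100)
    return schedule


def all_at_once() -> List[int]:
    """Shift all traffic in a single step."""
    return [100]
```

Each list holds the cumulative percentage of traffic routed to the replacement task set after each step, for example `canary(10)` gives `[10, 100]` and `linear(25)` gives `[25, 50, 75, 100]`.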
What is required for a Blue/Green deployment?
The service must use an ALB or NLB with two target groups
What does the external deployment type allow?
ECS to delegate deployment to a third-party controller, which uses the ECS API actions e.g. UpdateTaskSet
What is task scale-in protection?
It prevents a specific task from being terminated by service autoscaling or a deployment
How can task scale-in protection be set?
Either with the container agent endpoint or ecs:UpdateTaskProtection
What is service throttling?
When service tasks repeatedly fail to reach RUNNING
What happens during service throttling?
The time between subsequent restart attempts is gradually increased up to 15 minutes
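The general shape of this behaviour is a capped backoff. The sketch below is illustrative only: the card gives just the 15-minute cap, so the 10-second starting delay and the doubling factor are assumptions, and `retry_delays` is a hypothetical name.

```python
from typing import List


def retry_delays(attempts: int, base_seconds: int = 10,
                 cap_seconds: int = 15 * 60) -> List[int]:
    """Delay before each restart attempt, doubling until the cap is hit."""
    # Assumption: doubling from a 10-second base; only the cap comes from the card.
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(attempts)]
```

Under these assumptions `retry_delays(8)` grows from 10 seconds to 640 seconds and then stays at the 900-second cap.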
What tasks will trigger service throttling?
Only tasks that never reach RUNNING, not tasks which fail health checks or immediately exit
What service event is generated during throttling?
(service service-name) is unable to consistently start tasks successfully.
What states do tasks transition through in ECS?
PROVISIONING, PENDING, ACTIVATING, RUNNING, DEACTIVATING, STOPPING, DEPROVISIONING, STOPPED
What state will a task be in while ECS is performing additional steps before it is launched?
PROVISIONING
What is the PROVISIONING state?
ECS needs to perform additional steps before the task is launched e.g. provision an ENI for the task if it’s awsvpc
What state is a task in when ECS is waiting for the agent to take further action?
PENDING
What is the PENDING state?
ECS is waiting for the agent to take further action
What state is a task in when ECS is performing additional steps after it has launched?
ACTIVATING
What is the ACTIVATING state?
ECS is performing additional steps after the task has launched e.g. registering it with an ELB target group
What state is a task in when it is successfully running?
RUNNING
What is the RUNNING state?
The task is successfully running
What state is a task in when ECS is performing additional steps before stopping it?
DEACTIVATING
What is the DEACTIVATING state?
ECS is performing additional steps before the task is stopped e.g. deregistering it from a target group
What state is a task in when ECS is waiting for the agent to stop it?
STOPPING
What is the STOPPING state?
ECS is waiting for the ECS agent to take further action
How are Linux containers stopped when a task is stopped?
For Linux containers, the agent sends a SIGTERM to the container(s), then SIGKILL after waiting the stopTimeout
What state is a task in when ECS is performing additional steps after stopping it?
DEPROVISIONING
What is the DEPROVISIONING state?
ECS is performing additional steps e.g. deleting the ENI for awsvpc tasks
What state is a task in after it has been successfully stopped?
STOPPED
What is the STOPPED state?
The task has been successfully stopped
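The lifecycle cards above can be summarised as an ordered enumeration. This is a simplified sketch: it walks the full path in order, whereas a real task may skip states (for example, a task that fails to start never reaches RUNNING). `TaskState` and `next_state` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class TaskState(Enum):
    PROVISIONING = 1    # extra steps before launch, e.g. ENI for awsvpc
    PENDING = 2         # waiting for the agent to take further action
    ACTIVATING = 3      # extra steps after launch, e.g. target registration
    RUNNING = 4         # task is successfully running
    DEACTIVATING = 5    # extra steps before stop, e.g. target deregistration
    STOPPING = 6        # waiting for the agent to stop the containers
    DEPROVISIONING = 7  # extra steps after stop, e.g. deleting the ENI
    STOPPED = 8         # task has been successfully stopped


def next_state(state: TaskState) -> Optional[TaskState]:
    """Return the following state on the full path, or None after STOPPED."""
    members = list(TaskState)
    index = members.index(state)
    return members[index + 1] if index + 1 < len(members) else None
```

For example, `next_state(TaskState.PENDING)` is `TaskState.ACTIVATING`, and `next_state(TaskState.STOPPED)` is `None`.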