High Availability Practices Flashcards

Question 1

Q

What can affect availability?

Answer

A

System maintenance, software updates, infrastructure issues, malicious attacks, system load, and dependencies. In the cloud, latency and provider issues also play a part.

Question 2

Q

How is availability measured?

Answer

A

Availability is typically expressed as a percentage of uptime, agreed in an SLA, and described in "nines". For example, five nines means 99.999% availability.
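To make the "nines" concrete, the following is a minimal sketch that converts an availability target into a yearly downtime budget. The function name `allowed_downtime_minutes` and the non-leap-year assumption are illustrative choices, not part of the original card.

```python
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes of downtime per year")
```

Under this assumption, five nines leaves a budget of a little over five minutes of downtime per year.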

Question 3

Q

How do you monitor availability?

Answer

A

Create a health check endpoint and have a monitoring tool call it at regular intervals.

Question 4

Q

What should a health check endpoint monitor?

Answer

A

Subsystems such as storage, databases, and third-party dependencies.

Question 5

Q

What should a health check endpoint return, and should you secure it?

Answer

A

A status code and content describing the application's health. Yes, the endpoint should be secured.
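A health check of this kind might be sketched as below. This is framework-independent and makes several assumptions: the names `health_report` and `handle_health_request` are hypothetical, each subsystem check is a callable returning True or False, and a shared token stands in for whatever access control the real endpoint would use.

```python
import hmac
import json
from typing import Callable, Dict, Tuple


def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run each subsystem check and return (status_code, JSON body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failing"
        except Exception:
            # A check that raises is treated the same as a failed check.
            results[name] = "failing"
    healthy = all(state == "ok" for state in results.values())
    body = json.dumps(
        {"status": "healthy" if healthy else "unhealthy", "checks": results}
    )
    return (200 if healthy else 503), body


def handle_health_request(
    token: str, expected_token: str, checks: Dict[str, Callable[[], bool]]
) -> Tuple[int, str]:
    """Reject callers without the shared token, then report health."""
    # compare_digest avoids leaking the token through timing differences.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return 401, json.dumps({"error": "unauthorized"})
    return health_report(checks)


if __name__ == "__main__":
    # Hypothetical checks; real ones would query the database, storage, etc.
    checks = {"database": lambda: True, "storage": lambda: False}
    print(handle_health_request("secret", "secret", checks))
```

Returning 503 when any subsystem fails lets a monitoring tool act on the status code alone, while the body tells an operator which subsystem is at fault.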

Question 6

Q

What are some methods that can be employed to ensure high availability?

Answer

A

Queues/streams and throttling.

Question 7

Q

How can throttling be employed?

Answer

A

Set a limit on each user's access, monitor usage metrics, and reject requests once the limit is exceeded.

Disable or degrade nonessential services so that critical services can keep functioning. For example, a video call can switch to audio only when bandwidth is constrained.

Prioritize certain users so that high-impact customers’ requirements are met.
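The first approach, a per-user limit, can be sketched with a fixed-window counter. The class name `UserThrottle`, the window length, and the injectable `clock` parameter are assumptions made for illustration; production systems often use token buckets or a shared store instead.

```python
import time
from typing import Callable, Dict, Tuple


class UserThrottle:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps user -> (window start time, requests seen in that window).
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, user: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(user, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[user] = (start, count)
            return False  # Limit exceeded: reject the request.
        self._windows[user] = (start, count + 1)
        return True


if __name__ == "__main__":
    throttle = UserThrottle(limit=2, window_seconds=60)
    print([throttle.allow("alice") for _ in range(3)])
```

A caller would typically translate a False result into an HTTP 429 response or a similar "try again later" signal.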

Question 8

Q

How can a queue be employed?

Answer

A

Introduce a queue between the task and the service.

Tasks are placed in the queue, and the service processes them at its own pace.

In more advanced implementations, the service can be autoscaled based on queue size.

If a response is expected, the service must provide a way to deliver it, for example a separate response queue. This pattern isn’t suitable when low-latency responses are required.
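The queue arrangement described in this answer can be sketched in-process with the standard library. Here `queue.Queue` stands in for a real message queue, worker threads stand in for service instances, and `process_with_queue` and `desired_workers` are hypothetical helper names; the scaling rule is one simple assumption, not a prescribed formula.

```python
import math
import queue
import threading
from typing import Callable, Iterable, List


def desired_workers(queue_size: int, tasks_per_worker: int, max_workers: int) -> int:
    """Simple scaling rule: one worker per `tasks_per_worker` queued tasks."""
    wanted = math.ceil(queue_size / tasks_per_worker)
    return max(1, min(max_workers, wanted))


def process_with_queue(
    tasks: Iterable, handle: Callable, worker_count: int = 2
) -> List:
    """Place tasks on a queue and let workers drain it independently."""
    task_queue: queue.Queue = queue.Queue()
    results: List = []
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            task = task_queue.get()
            if task is None:  # Sentinel: no more work for this worker.
                break
            outcome = handle(task)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for task in tasks:
        task_queue.put(task)
    for _ in threads:
        task_queue.put(None)  # One sentinel per worker.
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(sorted(process_with_queue(range(5), lambda n: n * n)))
    print(desired_workers(queue_size=25, tasks_per_worker=10, max_workers=5))
```

Because producers only enqueue and never wait on the service directly, a burst of tasks lengthens the queue rather than overloading the service.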

Question 9

Q

What are some resiliency patterns?

Answer

A

Bulkhead, Circuit Breaker, Compensating Transaction, Retry, Leader Election, and Scheduler Agent Supervisor. On AWS: Multi-Server pattern, Multi-Datacenter pattern, and Floating IP.

Question 10

Q

What is the bulkhead resiliency pattern?

Answer

A

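Partition services into groups and limit each group's resources so that a failure in one group cannot exhaust the others. Define the partitions according to business and technical requirements; for example, high-priority customers get more resources. Leverage frameworks like Polly or Hystrix, which provide bulkhead isolation by limiting the resources (such as concurrent calls) available to each partition.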
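One way to picture a bulkhead is as a cap on concurrent calls per partition. The sketch below uses a semaphore for this; the `Bulkhead` class, the `BulkheadFullError` exception, and the partition sizes are illustrative assumptions and do not reproduce the API of Polly or Hystrix.

```python
import threading
from typing import Any, Callable


class BulkheadFullError(Exception):
    """Raised when a partition has no free capacity."""


class Bulkhead:
    """Cap the number of concurrent calls allowed in one partition."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Reject immediately instead of queueing, so one overloaded
        # partition cannot tie up callers that belong to another.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"partition '{self.name}' is full")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


if __name__ == "__main__":
    # Assumed sizing: high-priority customers get a larger partition.
    high_priority = Bulkhead("high-priority", max_concurrent=10)
    standard = Bulkhead("standard", max_concurrent=2)
    print(high_priority.call(lambda: "served"))
    print(standard.call(lambda: "served"))
```

Each partition has its own semaphore, so exhausting the standard partition leaves the high-priority partition's capacity untouched.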

Question 11

Q

What is the circuit breaker resiliency pattern?

Answer

A

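When calls to a service keep failing, stop calling it for a period so that the failure does not cascade to the applications that depend on it. After a timeout, let a trial call through and resume normal operation if it succeeds.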
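A circuit breaker is commonly implemented as a wrapper around calls to the troubled service: after repeated failures it "opens" and rejects calls immediately, then lets a trial call through once a timeout has passed. The following is a minimal single-threaded sketch; the class name, the thresholds, and the injectable `clock` are assumptions for illustration.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed.

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; failing fast")
            # Timeout elapsed: half-open, so allow one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
    print(breaker.call(lambda: "ok"))
```

Failing fast while the circuit is open spares both the caller, which no longer waits on timeouts, and the failing service, which gets room to recover.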

Question 12

Q

What is the compensating transaction resiliency pattern?

Answer

A

Records each step of a workflow and undoes the completed steps if a later step fails.
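As a rough sketch of this idea, each workflow step can be paired with an action that undoes it, and the undo actions run in reverse order when a step fails. The function `run_with_compensation` and the order-processing steps in the usage example are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[Callable[[], None], Callable[[], None]]  # (action, compensation)


def run_with_compensation(steps: List[Step]) -> None:
    """Run each action; on failure, undo completed steps in reverse order."""
    completed: List[Callable[[], None]] = []
    try:
        for action, compensate in steps:
            action()
            # Record the undo only after the action has succeeded.
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise


if __name__ == "__main__":
    log: List[str] = []

    def ship() -> None:
        raise RuntimeError("shipping failed")

    steps = [
        (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
        (lambda: log.append("charge card"), lambda: log.append("refund card")),
        (ship, lambda: log.append("cancel shipment")),
    ]
    try:
        run_with_compensation(steps)
    except RuntimeError:
        pass
    print(log)
```

The failed step itself is not compensated, since it never completed; only the steps before it are undone.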

Question 13

Q

What is the retry resiliency pattern?

Answer

A

Intelligently attempt to reestablish contact with a failing service, for example by retrying a limited number of times with increasing delays.
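One common reading of "intelligently" is a bounded number of attempts with exponential backoff and random jitter. The sketch below takes that approach; the function name `retry`, its defaults, and the injectable `sleep` parameter are assumptions chosen for illustration.

```python
import random
import time
from typing import Any, Callable, Tuple, Type


def retry(
    func: Callable[[], Any],
    attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> Any:
    """Call func, retrying on the given exceptions with backoff and jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            # Double the delay ceiling each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    calls = []

    def flaky() -> str:
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionError("temporary failure")
        return "connected"

    print(retry(flaky, retry_on=(ConnectionError,)))
```

Restricting `retry_on` to transient errors matters in practice, because retrying a permanent failure such as a bad request only adds load.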

Question 14

Q

What is the leader election resiliency pattern?

Answer

A

A single task instance is elected as leader and coordinates the actions of the other, subordinate instances. If the leader fails, a new leader is elected.
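One simple way to elect a leader is a lease: whichever instance holds an unexpired lease is the leader, and it must keep renewing to stay leader. The sketch below uses an in-memory `LeaseStore` as a stand-in for a shared store such as a database row or a distributed lock service; the class and its method are hypothetical, and real deployments need a store that all instances can reach.

```python
import threading
import time
from typing import Callable, Optional


class LeaseStore:
    """In-memory stand-in for a shared lease that instances compete for."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._holder: Optional[str] = None
        self._expires_at = 0.0
        self._clock = clock

    def try_acquire(self, instance_id: str, ttl: float) -> bool:
        """Acquire or renew the lease; return True if the caller is leader."""
        with self._lock:
            now = self._clock()
            free = self._holder is None or now >= self._expires_at
            if free or self._holder == instance_id:
                self._holder = instance_id
                self._expires_at = now + ttl
                return True
            return False


if __name__ == "__main__":
    store = LeaseStore()
    for instance in ("instance-a", "instance-b"):
        role = "leader" if store.try_acquire(instance, ttl=5.0) else "subordinate"
        print(instance, "is", role)
```

If the leader stops renewing, for example because it crashed, the lease expires and another instance can take over on its next attempt.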

To enhance knowledge of providing high availability for applications.