Availability Flashcards

Question 1

Q

Describe Availability.

Answer

A

Availability is the property of software that it is there and ready to carry out its task when you need it. It is achieved through fault tolerance (preventing a fault from becoming a failure) and recoverability (recovering from a fault).

Question 2

Q

What is the formula for steady-state availability?

Answer

A

Steady-state availability = MTBF / (MTBF + MTTR)
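
As a worked illustration of the formula, here is a minimal Python sketch. The function name and the MTBF/MTTR figures are assumptions made up for the example, not values from the course material.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return MTBF / (MTBF + MTTR) as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 999 hours on average, 1 hour to repair.
availability = steady_state_availability(mtbf_hours=999.0, mttr_hours=1.0)
expected_downtime_per_year = (1.0 - availability) * 24 * 365

print(f"availability: {availability:.3%}")
print(f"expected downtime per year: {expected_downtime_per_year:.2f} hours")
```

Note that shortening MTTR raises availability just as lengthening MTBF does, which is why recovery tactics matter as much as fault prevention.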

Question 3

Q

What properties do we use to measure availability?

Answer

A

MTBF: the mean time between failures
MTTR: the mean time to repair

Question 4

Q

What are the goals of availability tactics?

Answer

A

Availability tactics enable a system to endure faults so that services
remain compliant with their specifications.
The tactics keep faults from becoming failures or at least bound the
effects of the fault and make repair possible.

Question 5

Q

What are the benefits of redundant spares?

Answer

A

The benefit of a redundant spare is a system that continues to function
correctly, after only a brief delay, in the presence of a failure.
The alternative is a system that stops functioning correctly (or
altogether) until the failed component is repaired, which could take
hours or days.

Question 6

Q

What are the tradeoffs of redundant spares?

Answer

A

The tradeoff with any redundant-spare pattern is the additional cost and
complexity incurred in providing a spare.
* The tradeoff among the three alternatives (hot, warm, and cold spares)
is the time to recover from a failure versus the runtime cost incurred
to keep a spare up-to-date.
* For example, a hot spare carries the highest cost but leads to the
fastest recovery time.

Question 7

Q

What are the benefits of the TMR (triple modular redundancy) pattern?

Answer

A

TMR is simple to understand and to implement.
* It is independent of what might be causing disparate results and is concerned only with making a reasonable choice so that the system can continue to function.
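
The voting step at the heart of TMR can be sketched as follows. This is a minimal illustration, assuming three replicas whose outputs can be compared for equality; the function name `majority_vote` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by a strict majority of replicas.

    The voter does not care why replicas disagree; it only picks the
    value that most replicas agree on, or raises if there is no majority.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, votes = Counter(outputs).most_common(1)[0]
    if votes * 2 <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


# Hypothetical replica outputs: one of the three components is faulty.
chosen = majority_vote([42, 42, 7])
print(chosen)
```

With three replicas, a single faulty output is outvoted; if all three disagree, the sketch raises rather than guessing.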

Question 8

Q

What are the tradeoffs in TMR?

Answer

A

There is a tradeoff between increasing the level of replication, which raises
the cost, and the resulting availability. In systems employing TMR, the
statistical likelihood of two or more components failing is vanishingly small,
and three components represent a sweet spot between availability and cost.

Question 9

Q

What are the tradeoffs in TMR?

Answer

A

There is a tradeoff between increasing the level of replication, which raises
the cost, and the resulting availability. In systems employing TMR, the
statistical likelihood of two or more components failing is vanishingly small,
and three components represent a sweet spot between availability and cost.

Question 10

Q

What is the goal of the circuit breaker pattern?

Answer

A

A circuit breaker keeps the invoker from retrying countless times while waiting for a response that never comes.
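
A minimal sketch of the idea, assuming a simple consecutive-failure count and a fixed reset timeout. The class name, parameter names, and default values are hypothetical choices for illustration, not part of the course material.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                # Fail fast instead of waiting on a service that is known to be down.
                raise CircuitOpenError("circuit open; failing fast")
            # Reset timeout elapsed: let one trial call through (half-open state).
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller wraps each remote invocation, for example `breaker.call(lambda: fetch_profile(user_id))`, where `fetch_profile` stands in for whatever remote call is being protected.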

Question 11

Q

What are the tradeoffs in the circuit breaker pattern?

Answer

A

Care must be taken in choosing timeout (or retry) values. If the timeout is too long,
then unnecessary latency is added. But if the timeout is too short, then the circuit
breaker will trip when it does not need to (a kind of “false positive”), which
can lower the availability and performance of these services.
© Len Bass, Paul Clements, Rick Kazman, distributed und

Question 12

Q

Describe the process pairs pattern for availability.

Answer

A

This pattern employs checkpointing and rollback. The backup has been
checkpointing and (if necessary) rolling back to a safe state, so it is
ready to take over when a failure occurs.
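
The checkpoint-and-takeover idea can be sketched as a toy in-process example. It assumes the primary pushes a copy of its state to the backup after each completed unit of work; the class names and the `processed` counter are hypothetical.

```python
import copy
from typing import Dict, Optional

State = Dict[str, int]


class Backup:
    def __init__(self) -> None:
        self.last_checkpoint: State = {"processed": 0}

    def receive_checkpoint(self, state: State) -> None:
        # Keep a private copy so later changes in the primary cannot leak in.
        self.last_checkpoint = copy.deepcopy(state)

    def take_over(self) -> "Primary":
        # Resume from the last safe state; uncheckpointed work is rolled back.
        return Primary(backup=self, initial_state=self.last_checkpoint)


class Primary:
    def __init__(self, backup: Backup, initial_state: Optional[State] = None) -> None:
        self.backup = backup
        self.state: State = (
            copy.deepcopy(initial_state) if initial_state is not None else {"processed": 0}
        )

    def process(self, item: str) -> None:
        # Toy workload: only count how many items have been handled.
        self.state["processed"] += 1

    def checkpoint(self) -> None:
        self.backup.receive_checkpoint(self.state)


backup = Backup()
primary = Primary(backup)
for item in ["a", "b", "c"]:
    primary.process(item)
    primary.checkpoint()

primary.process("d")  # work done but never checkpointed
del primary           # simulated crash of the primary

replacement = backup.take_over()
print(replacement.state)
```

The replacement resumes with three processed items: the fourth was never checkpointed, so it is lost and would have to be redone.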

Question 13

Q

Name the detection tactics for availability.

Answer

A

Ping/echo
Heartbeat
Monitor
Timestamp
Sanity checking
Condition monitoring
Voting
Exception detection
Self-test
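
As one illustration from this list, the heartbeat tactic can be sketched as a monitor that records when each component last reported in and flags those that have gone silent. The class name, method names, and timeout are assumptions made for the example.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def beat(self, component: str) -> None:
        """Called by (or on behalf of) a component each time it emits a heartbeat."""
        self._last_seen[component] = self._clock()

    def suspected_failed(self) -> List[str]:
        """Return components whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

Ping/echo differs only in direction: the monitor actively asks each component and waits for the reply, rather than waiting for components to report on their own.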

Question 14

Q

Name and describe the recovery tactics used in Availability.

Answer

A

* Redundant Spare: a configuration in which one or more duplicate
components can step in and take over the work if the primary component
fails.
* Exception Handling: dealing with the exception by reporting it or
handling it, potentially masking the fault by correcting the cause of
the exception and retrying.
* Rollback: revert to a previous known good state, referred to as the
“rollback line”.
* Software Upgrade: in-service upgrades to executable code images in
a non-service-affecting manner.
* Retry: when a failure is transient, retrying the operation may lead to
success.
* Ignore Faulty Behavior: ignoring messages sent from a source when it
is determined that those messages are spurious.
* Graceful Degradation: maintains the most critical system functions in
the presence of component failures, dropping less critical functions.
* Reconfiguration: reassigning responsibilities to the resources left
functioning, while maintaining as much functionality as possible.
* Shadow: operating a previously failed or in-service upgraded component
in a “shadow mode” for a predefined time prior to reverting the
component back to an active role.
* State Resynchronization: partner to active redundancy and passive
redundancy, where state information is sent from active to standby
components.
* Escalating Restart: recover from faults by varying the granularity of
the component(s) restarted and minimizing the level of service affected.
* Non-stop Forwarding: functionality is split into supervisory and data.
If a supervisor fails, a router continues forwarding packets along known
routes while protocol information is recovered and validated.
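
To make one of these concrete, the retry tactic can be sketched as a bounded loop with exponential backoff. The function name, the default number of attempts, the delays, and the choice of which exceptions count as transient are all assumptions for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T],
          attempts: int = 3,
          base_delay_s: float = 0.1,
          retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying on presumed-transient errors with backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                # Out of attempts: the fault is no longer treated as transient.
                raise
            # Wait longer after each failure: base, 2 * base, 4 * base, ...
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

Bounding the number of attempts matters: unbounded retries against a service that is really down are exactly the behavior a circuit breaker exists to stop.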

(14 cards)