14 Fault Tolerance Flashcards
How to contain defects
Use duplication and backup to reduce the chance of (software) failures or the damage they cause.
Rare event assumption
Rare event assumption: It is impossible to anticipate all rare events.
Failure Independence assumption: Different components fail independently of one another.
Recovery Block
Failures are detected, but the underlying faults are not removed.
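A minimal sketch of the recovery block idea, assuming the usual structure of a checkpoint, an ordered list of alternate versions, and an acceptance test. The names `recovery_block`, `faulty_sort`, and `simple_sort` are hypothetical and chosen only for illustration. The fault in the primary version stays in place; the failure is only detected and worked around.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Run alternates in order until one passes the acceptance test."""
    checkpoint = copy.deepcopy(state)  # save the state before the first attempt
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)  # roll back to the checkpoint
        try:
            result = alternate(candidate)
        except Exception:
            continue  # a crash counts as a detected failure
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def faulty_sort(items):
    # Hypothetical primary version with a defect: it drops the last element.
    return sorted(items)[:-1]


def simple_sort(items):
    # Hypothetical alternate version.
    return sorted(items)


data = [3, 1, 2]
result = recovery_block(
    data,
    [faulty_sort, simple_sort],
    acceptance_test=lambda out: len(out) == len(data) and out == sorted(out),
)
print(result)  # [1, 2, 3]
```

The primary version fails the acceptance test, so the block retries from the checkpoint with the alternate.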
NVP (N-Version Programming)
Software’s basic functional units consist of N parallel independent versions.
The system input is distributed to all the N versions.
The individual output for each version is fed to a decision unit.
The decision unit determines the system output using a specific decision algorithm.
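The four steps above can be sketched as follows, assuming majority voting as the decision algorithm (the card does not name one, so this is only one possible choice). The function `n_version_execute` and the three toy versions are hypothetical. The versions run sequentially here for simplicity, and outputs must be hashable so they can be counted.

```python
from collections import Counter


def n_version_execute(versions, system_input):
    """Feed the same input to every version and return the majority output."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(system_input))
        except Exception:
            pass  # a crashed version contributes no vote
    if not outputs:
        raise RuntimeError("every version failed")
    # Decision unit: strict majority over all N versions (an assumption).
    winner, votes = Counter(outputs).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among version outputs")
    return winner


def version_a(x):
    return x * x


def version_b(x):
    return x ** 2


def version_c(x):
    # Hypothetical faulty version: doubles instead of squaring.
    return x + x


print(n_version_execute([version_a, version_b, version_c], 3))  # 9
```

With three versions, one faulty output is outvoted by the two that agree.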
Ways to achieve diversity in Version Independence
People Diversity
Process diversity
Technology Diversity
In Fault Tree Analysis, circles are:
Uncontrollable Event
In Fault Tree Analysis, rectangles are:
Controllable Events
Risk
“the possibility of suffering loss”
Risk is not bad in itself; it is essential to progress.
The challenge is to manage the amount of risk.
Risk management has two parts:
Risk Assessment
Risk Control
Risk Exposure
For each risk:
RE = p(unsatisfactory outcome) × loss(unsatisfactory outcome)
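A short worked example of the risk exposure formula. The risk names, probabilities, and loss figures are made up for illustration, and `risk_exposure` is a hypothetical helper.

```python
def risk_exposure(probability, loss):
    """RE = p(unsatisfactory outcome) * loss(unsatisfactory outcome)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * loss


# Illustrative risks: (probability, loss in currency units).
risks = {
    "schedule slip": (0.25, 40_000),
    "data loss": (0.125, 200_000),
}

exposures = {name: risk_exposure(p, loss) for name, (p, loss) in risks.items()}
ranked = sorted(exposures, key=exposures.get, reverse=True)

print(exposures)  # {'schedule slip': 10000.0, 'data loss': 25000.0}
print(ranked)     # ['data loss', 'schedule slip']
```

The less likely risk ranks first here because its loss is much larger.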
Risk Reduction Leverage
For each mitigation action:
RRL = (RE_before - RE_after) / cost of intervention
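A short worked example of risk reduction leverage, using made-up figures. The helpers `risk_exposure` and `risk_reduction_leverage` are hypothetical names; the mitigation is assumed to lower the probability of the bad outcome while the loss stays the same.

```python
def risk_exposure(probability, loss):
    """RE = p(unsatisfactory outcome) * loss(unsatisfactory outcome)."""
    return probability * loss


def risk_reduction_leverage(re_before, re_after, cost):
    """RRL = (RE before - RE after) / cost of intervention."""
    if cost <= 0:
        raise ValueError("cost of intervention must be positive")
    return (re_before - re_after) / cost


# Illustrative numbers: mitigation cuts the probability from 0.5 to 0.125.
re_before = risk_exposure(0.5, 100_000)    # 50000.0
re_after = risk_exposure(0.125, 100_000)   # 12500.0
rrl = risk_reduction_leverage(re_before, re_after, cost=15_000)

print(rrl)  # 2.5
```

An RRL above 1 means the reduction in exposure is larger than what the intervention costs.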
Risk Assessment
Quantitative (Standard Costs and probability measures)
Qualitative (Develop a risk classification matrix)
Containment Walls
To contain such damaging disasters.