Fault Tolerance Flashcards
What are faults?
Unavoidable events
What is fault tolerance?
The aim to design a system in such a way that faults do not result in system failure
What is redundancy?
The use of additional elements within a system which would be required if the system was free of faults.
What is the triple modular redundancy (TMR) system?
A fault-tolerant system architecture that triplicate the processing elements and associated hardware , running them concurrently and uses voting mechanisms to achieve reliability by tolerating and correcting single faults.
What does a TMR system consist of?
3 identical hardware modules and one voting module.
What is the output of a TMR’s system voting module?
Output of the majority of the modules.
What is temporal redundancy?
Used in order to tolerate or detect transient faults
What is the disadvantage of TMR?
- Doesn’t cover design faults in the modules
- If one module fails it is likely that identical modules fail at the same time
- Helps only against random faults not systematic faults
- Doesn’t help against simultaneous failures of multiple modules.
What does many fault-tolerance techniques rely on?
Detection of faults
What are the three methods for obtaining hardware fault tolerance?
- Static redundancy
- Dynamic redundancy
- Hybrid approaches
What is static redundancy?
Static redundancy uses fault masking to hide faults, so that the system continues to work correctly even if a fault occurs.
What is dynamic redundancy?
Dynamic redundancy uses fault detection, to detect if a fault occurs and reconfigures the system in order to nullify the effects of faults.
What is hybrid approaches for hardware fault tolerance?
Using fault masking to prevent errors from propagating through the system and fault detect to reconfigure the system so that faulty units are removed
What may static redundancy use?
Fault masking and voting mechanisms.
What are single-point failures?
Vulnerabilities in a system where the malfunction/failure of a component can lead to a complete or partial breakdown of a system.
How can we avoid single point failures in the voting elements?
Triplicate the voting and pass the 3 output of the voting elements on to the next modules.
What is N-Modular Redundancy (NMR)?
Uses N modules instead of 3 modules and voting among these
What is the disadvantage of N-Modular Redundancy (NMR)?
- Additional cost
- Size
- Weight
- Power consumption