12.13.Fault Tolerance - Transactions Flashcards
What is a characteristic that distinguishes distributed systems from single-machine systems?
Partial failure
What is the goal when partial failure occurs?
Tolerate faults
What is being fault tolerant related to?
Dependability
What is dependability?
the trustworthiness of a computing system which allows reliance to be justifiably placed on the service it delivers
What are the requirements for Dependability?
- Availability
- Reliability
- Maintainability
- Safety
What does Safety mean?
If and when FAILURES occur, the CONSEQUENCES are not catastrophic for the system
What does Availability mean?
the probability that the system operates correctly at ANY GIVEN MOMENT
What does Reliability mean?
LENGTH OF TIME that it can run continuously without failure
What does Maintainability mean?
how EASILY a failed system can be REPAIRED
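Availability and reliability are easy to confuse, so a small numeric sketch may help. It assumes the common steady-state model, availability = MTTF / (MTTF + MTTR), where MTTF is the mean time to failure and MTTR the mean time to repair. The function name and both example systems are hypothetical and chosen only for illustration.

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, assuming it alternates between
    running for MTTF hours and being repaired for MTTR hours."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical system A: fails once an hour, but recovers in 1 millisecond.
# Highly available, yet not reliable (it cannot run long without failure).
a = steady_state_availability(mttf_hours=1.0, mttr_hours=0.001 / 3600)

# Hypothetical system B: runs 50 weeks without failure, then is down 2 weeks.
# Reliable, yet less available than A.
b = steady_state_availability(mttf_hours=50 * 7 * 24, mttr_hours=2 * 7 * 24)

print(f"A: {a:.8f}  B: {b:.4f}")
```

Under these assumptions, A is up more than 99.9999% of the time while B is up about 96% of the time, even though B runs far longer without failing. A short MTTR (good maintainability) is what keeps A's availability high.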
What are the different types of failures?
- Crash
- Omission
- Response
- Timing
- Arbitrary (Byzantine)
What is a technique for failure masking?
Redundancy
What are the types of redundancy?
- Physical
- Information (send extra bits to allow for recovery if need be)
- Time (repeat action if need be)
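Two of the redundancy types above can be sketched in a few lines. This is a minimal illustration, not a production scheme: information redundancy is shown with a simple triple-repetition code (real systems typically use more efficient error-correcting codes), and time redundancy with a retry loop. All function names here are hypothetical. Physical redundancy (extra hardware or replicas) is not shown.

```python
from collections import Counter

def encode_triple(bits):
    """Information redundancy: send every bit three times."""
    return [b for b in bits for _ in range(3)]

def decode_triple(coded):
    """Recover each bit by majority vote over its three copies."""
    return [Counter(coded[i:i + 3]).most_common(1)[0][0]
            for i in range(0, len(coded), 3)]

def retry(action, attempts=3):
    """Time redundancy: repeat the action if it fails."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as err:  # broad on purpose for this sketch
            last_error = err
    raise last_error

# One bit is flipped "in transit"; the extra copies let us recover it.
message = [1, 0, 1]
coded = encode_triple(message)
coded[1] ^= 1
recovered = decode_triple(coded)

# A hypothetical action that fails twice with a transient fault, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient fault")
    return "ok"

result = retry(flaky)
```

Note that blind retrying is only safe when repeating the action has no harmful side effects.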
What is one of our most important considerations in failure masking?
Making sure that a failure won’t leave the system in an inconsistent state
How is avoiding leaving the system in an inconsistent state achieved?
Atomic operations!
“The sequence of operations must execute as an ATOMIC operation”
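The all-or-nothing idea can be sketched as follows. This is a simplified, single-threaded, in-memory illustration: the operations run on a private copy of the state, and the copy replaces the original only if every operation succeeds. The helper names (atomic, withdraw, deposit) and the bank-account scenario are hypothetical, and the sketch does not protect against crashes during the commit step or against concurrent access.

```python
import copy

class InsufficientFunds(Exception):
    pass

def atomic(state, operations):
    """Apply all operations or none of them."""
    working = copy.deepcopy(state)   # work on a private copy
    for op in operations:
        op(working)                  # any exception aborts before commit
    state.clear()
    state.update(working)            # commit: make all changes visible

def withdraw(account, amount):
    def op(s):
        if s[account] < amount:
            raise InsufficientFunds(account)
        s[account] -= amount
    return op

def deposit(account, amount):
    def op(s):
        s[account] += amount
    return op

accounts = {"A": 100, "B": 50}

# Succeeds: both steps are applied.
atomic(accounts, [withdraw("A", 30), deposit("B", 30)])
after_commit = dict(accounts)

# Fails at the second step: the deposit already made on the copy is discarded.
try:
    atomic(accounts, [deposit("B", 500), withdraw("A", 500)])
except InsufficientFunds:
    pass
after_abort = dict(accounts)
```

After the failed call, the accounts hold the same balances as before it, so the failure does not leave the state inconsistent.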
When do concurrent executions not interfere with each other?
If their execution is equivalent to a serial one (one in which they don’t interleave)
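Equivalence to a serial execution can be checked by brute force on a toy example. The sketch below assumes two hypothetical transactions, T1 and T2, that each read a shared value x and write back an incremented value. It enumerates every interleaving and compares the final value of x with the results of the two serial orders. It only compares final state for one initial state, which is a simplification of the formal definitions.

```python
from itertools import combinations

def run(schedule):
    """Execute a schedule of read/write steps and return the final value of x."""
    shared = {"x": 0}
    local = {}                       # each transaction's private copy of x
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = shared["x"]
        else:                        # "write"
            shared["x"] = local[txn] + amount
    return shared["x"]

# Each step is (transaction, operation, amount); amount is unused for reads.
T1 = [("T1", "read", 0), ("T1", "write", 10)]
T2 = [("T2", "read", 0), ("T2", "write", 5)]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each transaction's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in positions else next(ib) for i in range(n)]

# Results of the two serial orders: T1 then T2, and T2 then T1.
serial_results = {run(T1 + T2), run(T2 + T1)}

all_schedules = list(interleavings(T1, T2))
equivalent = [s for s in all_schedules if run(s) in serial_results]

print(f"{len(equivalent)} of {len(all_schedules)} schedules match a serial result")
```

In this toy case, every schedule in which both transactions read before either writes loses one of the updates, so those schedules are not equivalent to any serial execution.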