Dependability - Theory Flashcards
What are the basic steps in building reliable systems?
Error detection, error containment, error masking
What is dependability?
A measure of how much we trust a system
The ability of a system to perform its functionality while exhibiting reliability, availability, maintainability, safety, and security
What is reliability?
Continuity of correct service
What is availability?
Readiness for correct service
What is maintainability?
Ability to undergo maintenance and repair easily
What is safety?
Absence of catastrophic consequences
What is security?
Confidentiality and integrity of data
When do we think about dependability?
During design time and runtime
Failures in development should be avoided; failures in operation cannot be avoided, so they must be dealt with
Design should take failures into account and guarantee that control and safety are maintained when failures occur. The effects of such failures should be predictable and deterministic, not catastrophic.
How can we provide dependability?
Through failure avoidance and failure tolerance
What are some of the failure avoidance procedures we can take?
Conservative design
Design validation
Detailed testing
Infant mortality screening
Error avoidance
What techniques can we implement in order to increase tolerance?
Error detection/error masking during system operations
Online monitoring
Diagnostics
Self-recovery and self-repair
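Error masking can be illustrated with majority voting over redundant replicas. The cards do not name this technique; it is used here only as one common example, and the function name `majority_vote` is hypothetical. A minimal sketch, assuming each replica returns a hashable value:

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value agreed on by a strict majority of replicas.

    A minority of wrong outputs is masked. If no strict majority exists,
    the error is detected but cannot be masked, so ValueError is raised.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority: error detected but cannot be masked")
    return value


# Three replicas, one of which produced a wrong result (illustrative values).
masked_result = majority_vote([42, 42, 41])
print(masked_result)
```

With three replicas, one faulty output is masked; when all three disagree, the voter can only report that an error was detected.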
Define reliability and how it is calculated
Ability of a system or component to perform its required functions under stated conditions for a specified period of time
It is therefore the probability that the system will operate correctly in a specified operating environment until time T
Define availability and how it is calculated
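The cards give no formula for this probability. One widely used model, assumed here and not stated in the cards, is a constant failure rate, which gives R(T) = exp(-rate * T). A minimal sketch under that assumption, with a hypothetical function name and illustrative numbers:

```python
import math


def reliability(failure_rate, t):
    """Probability of operating correctly until time t.

    ASSUMPTION: constant failure rate (exponential model), which the
    flashcards do not specify. failure_rate and t must use the same
    time unit (for example failures per hour, and hours).
    """
    if failure_rate < 0 or t < 0:
        raise ValueError("failure_rate and t must be non-negative")
    return math.exp(-failure_rate * t)


# Illustrative values: 0.001 failures per hour, mission time of 1000 hours.
print(round(reliability(0.001, 1000), 4))  # 0.3679
```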
The degree to which a system or component is operational and accessible when required for use
It is calculated by dividing the uptime by the sum of the uptime and the downtime (the total time)
It is the probability that the system will be operating at time T
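The uptime ratio above can be sketched as a small function. The name `availability` and the sample figures are illustrative, not from the cards:

```python
def availability(uptime, downtime):
    """Availability = uptime / (uptime + downtime).

    Both arguments must be in the same time unit.
    """
    if uptime < 0 or downtime < 0:
        raise ValueError("uptime and downtime must be non-negative")
    total = uptime + downtime
    if total == 0:
        raise ValueError("total observed time must be greater than zero")
    return uptime / total


# Illustrative values: 999 hours up, 1 hour down.
print(availability(999, 1))  # 0.999
```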
Is it possible to have systems with low reliability that have high availability? What about the opposite?
Yes. If system failures can be repaired quickly and do not damage data, low reliability may not be a problem
The opposite (high reliability with low availability) is generally more difficult to achieve
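A quick numeric comparison makes the first case concrete. Both systems and all figures below are made up for illustration: system A fails often but recovers in a second, while system B fails rarely but takes a long time to repair.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# System A (hypothetical): fails every hour, each restart takes 1 second.
# Low reliability, because it rarely runs long without a failure.
a_downtime_seconds = 24 * 1
a_availability = (SECONDS_PER_DAY - a_downtime_seconds) / SECONDS_PER_DAY

# System B (hypothetical): fails once in a 365-day year, repair takes 30 days.
# High reliability, because it runs for months without a failure.
b_downtime_days = 30
b_availability = (365 - b_downtime_days) / 365

print(f"A: {a_availability:.4f}")  # A: 0.9997
print(f"B: {b_availability:.4f}")  # B: 0.9178
```

System A is the less reliable of the two, yet it is available far more of the time because each outage is so short.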
What is MTTF?
Mean time to failure: the mean (expected) time before a failure occurs
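In practice, MTTF is often estimated as the average of observed times to failure. A minimal sketch, assuming a list of measured operating times (the function name and the data are hypothetical):

```python
from statistics import mean


def estimate_mttf(times_to_failure):
    """Estimate MTTF as the sample mean of observed times to failure.

    All observations must use the same time unit.
    """
    if not times_to_failure:
        raise ValueError("need at least one observed time to failure")
    return mean(times_to_failure)


# Illustrative observations, in hours of operation before each unit failed.
print(estimate_mttf([900, 1100, 1000]))  # 1000
```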