Software reliability (week 11) Flashcards
How is software reliability defined?
The probability of failure-free software operation for a specified period of time in a specified environment
Describe Probability of Failure on Demand (POFOD) measure
• Likelihood that a system will fail when a request for service is made
• A POFOD of 0.0001 means 1 in 10000 requests may result in a failure
• Highly relevant for safety-critical systems
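As a rough illustration of the figure above, POFOD can be estimated from observed request counts. This is a minimal sketch: the function name `estimate_pofod` and the sample counts are hypothetical, chosen to reproduce the 0.0001 example.

```python
def estimate_pofod(failed_requests: int, total_requests: int) -> float:
    """Estimate POFOD as the fraction of service requests that failed."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return failed_requests / total_requests


# Hypothetical observation: 1 failed request out of 10,000.
pofod = estimate_pofod(failed_requests=1, total_requests=10_000)
print(pofod)
```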
Describe Rate of Occurrence of Failure (ROCOF) measure
• Frequency of occurrence of failures
• A ROCOF of 0.005 means 5 failures are likely in every 1000 time
units
• Relevant for banking/financial systems
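A short sketch of the same idea as a rate rather than a per-request probability. The function name `estimate_rocof` and the sample numbers are assumptions made for illustration.

```python
def estimate_rocof(failure_count: int, observation_time: float) -> float:
    """Estimate ROCOF as the number of failures per unit of operating time."""
    if observation_time <= 0:
        raise ValueError("observation_time must be positive")
    return failure_count / observation_time


# Hypothetical observation: 5 failures over 1000 time units.
rocof = estimate_rocof(failure_count=5, observation_time=1000)
print(rocof)
```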
Describe Mean Time Between Failures (MTBF) measure
• Measure of the average time between recoverable failures over the lifetime of
the product
• An MTBF of 100 means a failure is expected, on average, once every 100 time units
• Relevant for 'long duration transaction' systems such as database
operations, word processor systems, etc.
• MTBF should be longer than the typical transaction length
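One way to picture this is to compute MTBF from a log of failure timestamps and compare it with the typical transaction length. This is a sketch under assumptions: the failure log, the transaction length and the function name are all hypothetical.

```python
def mean_time_between_failures(failure_times: list[float]) -> float:
    """Average gap between consecutive failure timestamps."""
    if len(failure_times) < 2:
        raise ValueError("need at least two failures to measure a gap")
    ordered = sorted(failure_times)
    # Pair each failure with the next one and take the difference.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical failure log, in arbitrary time units.
failure_times = [50, 160, 250, 350, 450]
mtbf = mean_time_between_failures(failure_times)

# Hypothetical typical transaction length, in the same units.
typical_transaction_length = 20
transaction_likely_to_complete = mtbf > typical_transaction_length
print(mtbf, transaction_likely_to_complete)
```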
Describe Availability (AVAIL) measure
• Measure of the proportion of time a system is available for use
• An availability of 0.995 means a system will likely be available for
995 out of 1000 time units
• Relevant for continuous systems
• Such as communication/broadband systems
Describe Mean Time to Recover (MTTR)
• Average time taken to recover from a software failure
• E.g. if Amazon's MTBF were three years and the MTTR were one day,
then customers would notice the long recovery time.
• If failures occurred, say, twice per day (an MTBF of about 12 hours) but the MTTR was less
than a second, then customers would not notice.
• Backup systems/system redundancy attempt to reduce MTTR to
very low levels
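The two scenarios can be compared numerically with the availability formula MTBF/(MTBF+MTTR). This is an illustrative sketch: the one-second recovery time stands in for "less than a second", and the variable names are hypothetical.

```python
def availability(mtbf: float, mttr: float) -> float:
    """Availability as MTBF / (MTBF + MTTR), both in the same time unit."""
    return mtbf / (mtbf + mttr)


SECONDS_PER_DAY = 24 * 60 * 60

# Scenario 1: one failure every three years, one day to recover.
rare_but_slow = availability(mtbf=3 * 365 * SECONDS_PER_DAY, mttr=SECONDS_PER_DAY)

# Scenario 2: two failures per day (MTBF of 12 hours), one second to recover.
frequent_but_fast = availability(mtbf=12 * 60 * 60, mttr=1)

print(f"rare but slow:     {rare_but_slow:.5f}")
print(f"frequent but fast: {frequent_but_fast:.5f}")
```

Under these assumed numbers, the system that fails often but recovers quickly ends up with the higher availability, and each outage is too short for most users to see.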
What does a reliability specification consist of?
• A quantitative statement of the reliability requirement
• A description of the environment where the equipment is to be
used, stored, etc.
• A clear definition of what constitutes a failure
• The tests used to demonstrate reliability
Give an example of reliability specification
For example, a radar system that uses high- and low-power
searching and tracking:
• Specify the MTBF required for each case
• Conditions of use for the system: temperature, vibration, light, pressure, operator ability
• Successful operation: is it measurable?
• Failure performance specification
• The how, who, when and where of the tests used to determine
reliability
Describe reliability engineering
• Focuses on the costs of system downtime, spares, personnel and
repair
• Safety engineering, by contrast, focuses on accident reduction and risk
reduction (minimising life-threatening situations)
• A highly reliable system results not only from good reliability
engineering (predicting failure rates, costs, etc.)
but also from good development engineering, a rigorously applied
methodology, etc.
How is AVAIL calculated?
MTBF/(MTBF+MTTR)
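As a worked check, assume a hypothetical MTBF of 995 time units and an MTTR of 5 time units (both in the same unit). The formula then gives an availability of 0.995, i.e. available for 995 out of 1000 time units:

```latex
\[
\mathrm{AVAIL} = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
               = \frac{995}{995 + 5}
               = 0.995
\]
```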