Reliability Specification Flashcards
What is meant by Reliability Specification?
Meaning: A technical document that outlines the requirements for a system's reliability.
Reliability can be measured, so non-functional reliability requirements may be specified quantitatively.
- Non-functional reliability requirements: define the number of failures that are acceptable during normal use of the system, or the time for which the system must be available.
- Functional reliability requirements: define system and software functions that avoid, detect or tolerate faults in the software and so ensure that these faults do not lead to system failure.
Software reliability requirements may also be included to cope with hardware failure or operator error.
Name each part of the specification process (4)
Risk Identification:
Identify the types of system failures that may lead to economic losses.
Risk Analysis:
Estimate the costs and consequences of the different types of software failure.
Risk Decomposition:
Identify the root causes of system failure.
Risk Reduction:
Generate reliability specifications, including quantitative requirements defining the acceptable levels of failure.
Recap (Name the types of System Failures)
Loss of Service:
The system is unavailable and cannot deliver its services to users.
Incorrect service delivery:
The system does not deliver a service correctly to users.
System/data corruption:
The failure damages the system itself or its data. This usually occurs in conjunction with other types of failure.
What is meant by a reliability metric?
Meaning: A quantitative measure of reliability, typically the probability that a system failure will occur when the system is in use in a particular operating environment.
Note: System reliability is measured by counting the number of operational failures and, where appropriate, relating these to the demands made on the system and the time that the system has been operational.
Key Note: A long-term measurement programme is required to assess the reliability of critical systems.
Name and Describe each Reliability Metric?
Probability of failure on demand (POFOD):
- Meaning: The probability that the system will fail when a request for service is made.
- Used when demands for service are intermittent and relatively infrequent.
- Appropriate for protection systems where services are demanded occasionally and where there are serious consequences in case of failure.
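Worked example (sketch): POFOD can be estimated from observed counts as failed demands divided by total demands. The function name `pofod` and the numbers below are illustrative assumptions, not taken from any real system.

```python
def pofod(failed_demands: int, total_demands: int) -> float:
    """Estimate probability of failure on demand from observed counts."""
    if total_demands <= 0:
        raise ValueError("total_demands must be positive")
    if not 0 <= failed_demands <= total_demands:
        raise ValueError("failed_demands must be between 0 and total_demands")
    return failed_demands / total_demands


# Hypothetical observation: 2 failed demands out of 1000 demands made.
estimate = pofod(2, 1000)
print(f"Estimated POFOD: {estimate}")
```

A small number of observed demands gives a very rough estimate, which is one reason a long measurement period is needed for critical systems.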
Rate of occurrence of failures (ROCOF):
- Meaning: Reflects the rate of occurrence of failure in the system.
* ROCOF of 0.002 means 2 failures are likely in each 1000 operational time units, e.g. 2 failures per 1000 hours of operation.
- The reciprocal of ROCOF is the mean time to failure (MTTF).
- Relevant for systems that process a large number of similar requests in a defined time period.
* E.g., Credit card processing system, Supermarket checkout system.
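Worked example (sketch): ROCOF can be estimated as the number of observed failures divided by the operational time. The helper names `rocof` and `expected_failures` and the figures used are assumptions for illustration only.

```python
def rocof(failures: int, operational_time: float) -> float:
    """Estimate rate of occurrence of failures per operational time unit."""
    if operational_time <= 0:
        raise ValueError("operational_time must be positive")
    if failures < 0:
        raise ValueError("failures cannot be negative")
    return failures / operational_time


def expected_failures(rate: float, operational_time: float) -> float:
    """Number of failures likely over a period, given a ROCOF value."""
    return rate * operational_time


# Hypothetical log: 2 failures observed over 1000 hours of operation.
rate = rocof(2, 1000.0)
print(f"ROCOF: {rate} failures per hour")
print(f"Likely failures in 5000 hours: {expected_failures(rate, 5000.0):.1f}")
```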
Mean Time to Failure (MTTF):
- Meaning: Measures the average length of time a system can be expected to run without failure.
- Relevant for systems with long transactions, i.e. where system processing takes a long time, e.g. a conveyor belt.
- MTTF should be longer than the expected transaction length so that the system does not normally fail during a session or transaction.
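Worked example (sketch): MTTF can be estimated as the mean of the observed times between failures, and it is the reciprocal of ROCOF. The sample data, the function names and the 8-hour transaction length are all hypothetical.

```python
from statistics import mean


def mttf(times_to_failure: list[float]) -> float:
    """Estimate mean time to failure from observed failure-free run times."""
    if not times_to_failure:
        raise ValueError("at least one observation is required")
    return mean(times_to_failure)


def covers_transaction(mttf_value: float, transaction_length: float) -> bool:
    """True if MTTF is longer than the expected transaction length."""
    return mttf_value > transaction_length


# Hypothetical failure-free run times, in hours.
observed = [450.0, 520.0, 530.0]
estimate = mttf(observed)          # mean of the three runs: 500 hours
implied_rocof = 1.0 / estimate     # reciprocal relationship with ROCOF

print(f"MTTF: {estimate} hours, implied ROCOF: {implied_rocof} per hour")
print("Covers an 8-hour transaction:", covers_transaction(estimate, 8.0))
```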
Availability:
- Meaning: Measure of the fraction of the time that the system is available for use.
- Takes repair and restart time into account.
- Availability of 0.998 means software is available for 998 out of 1000 time units.
- Relevant for non-stop, continuously running systems:
* telephone switching systems, railway signalling systems, e-commerce systems, etc.
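Worked example (sketch): availability can be computed as uptime divided by total time, where downtime covers repair and restart. The function names and the yearly downtime conversion are illustrative assumptions.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year


def availability(uptime: float, downtime: float) -> float:
    """Fraction of total time the system was available for use."""
    total = uptime + downtime
    if total <= 0 or uptime < 0 or downtime < 0:
        raise ValueError("uptime and downtime must be non-negative and not both zero")
    return uptime / total


def downtime_per_year(avail: float) -> float:
    """Hours of unavailability per year implied by an availability figure."""
    if not 0.0 <= avail <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - avail) * HOURS_PER_YEAR


# 998 time units up and 2 units down for repair and restart.
avail = availability(998.0, 2.0)
print(f"Availability: {avail}")
print(f"Implied downtime: {downtime_per_year(avail):.2f} hours per year")
```

Note that the same availability figure can come from many short outages or a few long ones, so the metric on its own does not say how disruptive the failures are.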
What are the failure consequences?
When specifying reliability, it is not just the number of system failures that matters but also the consequences of these failures.
Failures that have serious consequences are clearly more damaging than those where repair and recovery are straightforward.
In some cases, therefore, different reliability specifications for different types of failure may be defined.
What is meant by ‘over-specification’ of reliability?
Meaning: A high level of reliability is specified, but it is not cost-effective to achieve it.
- In many cases, it is cheaper to accept and deal with failures than to prevent them from occurring.
How do we avoid this:
- Specify reliability requirements for different types of failure. Minor failures may be acceptable.
- Specify requirements for different services separately. Critical services should have the highest reliability requirements.
- Decide whether or not high reliability is really required or if dependability goals can be achieved in some other way.
Describe the steps to reliability specification?
- For each sub-system, analyse the consequences of possible system failures.
- From the system failure analysis, partition failures into appropriate classes.
- For each failure class identified, set out the reliability using an appropriate metric.
- Identify functional reliability requirements to reduce the chances of critical failures.
Note: Different metrics may be used for different requirements.
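Worked example (sketch): the steps above can be pictured as a table that gives each failure class its own metric and target. The class name `ReliabilityRequirement`, the target values and the measured values are all hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass

# For POFOD and ROCOF a smaller measured value is better;
# for MTTF and AVAIL a larger measured value is better.
LOWER_IS_BETTER = {"POFOD", "ROCOF"}
HIGHER_IS_BETTER = {"MTTF", "AVAIL"}


@dataclass(frozen=True)
class ReliabilityRequirement:
    failure_class: str
    metric: str
    target: float

    def is_met(self, measured: float) -> bool:
        """Compare a measured value against the target for this metric."""
        if self.metric in LOWER_IS_BETTER:
            return measured <= self.target
        if self.metric in HIGHER_IS_BETTER:
            return measured >= self.target
        raise ValueError(f"unknown metric: {self.metric}")


# Hypothetical specification: one metric per failure class.
spec = [
    ReliabilityRequirement("loss of service", "AVAIL", 0.998),
    ReliabilityRequirement("incorrect service delivery", "ROCOF", 0.002),
    ReliabilityRequirement("system/data corruption", "POFOD", 0.0001),
]

# Hypothetical measurements from a test or monitoring programme.
measured = {
    "loss of service": 0.999,
    "incorrect service delivery": 0.003,
    "system/data corruption": 0.00005,
}

for req in spec:
    status = "met" if req.is_met(measured[req.failure_class]) else "not met"
    print(f"{req.failure_class}: {req.metric} target {req.target} -> {status}")
```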
What is meant by Functional reliability requirements?
Name each type of functional reliability requirement?
Meaning: Specifications of system and software functionality that avoids, detects or tolerates software faults.
4 Types:
Checking requirements:
- Identify checks needed to ensure that incorrect data is detected before it leads to a failure.
Recovery requirements:
- Help the system recover after a failure has occurred.
Redundancy requirements:
- Specify redundant features of the system to be included.
Process requirements:
- Specify the development process to be used; these may also be included as reliability requirements.
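Worked example (sketch): a checking requirement might say that every operator input must fall within a predefined range. The function name `check_operator_input`, the input name `dose_units` and the range 0 to 10 are assumptions made up for this illustration.

```python
def check_operator_input(name: str, value: float, low: float, high: float) -> float:
    """Reject an input outside its predefined range before it is used."""
    if not (low <= value <= high):
        raise ValueError(f"{name}={value} is outside the permitted range [{low}, {high}]")
    return value


# Hypothetical input with a permitted range of 0 to 10.
dose = check_operator_input("dose_units", 4.0, 0.0, 10.0)
print("accepted:", dose)

try:
    check_operator_input("dose_units", 25.0, 0.0, 10.0)
except ValueError as err:
    # The incorrect value is detected here, before it can lead to a failure.
    print("rejected:", err)
```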
Summary of Topic (Reliability Specification)
Reliability requirements can be defined quantitatively. The metrics used include probability of failure on demand (POFOD), rate of occurrence of failures (ROCOF) and availability (AVAIL).