What are the main categories of failure against which the behavior of an EUC (Equipment Under Control) should be tested? And how can the system architecture be designed so that it tolerates a failure without losing the safety function?
Hardware Failure
A failure occurs when a required function can no longer be performed, or when performance falls outside the requirements.
Most hardware failures are random failures, which fall into four categories:
- Safe failures
- Dangerous failures
- No effect failures
- No part failures
Safe failures
Failure of an element and/or subsystem and/or system that plays a role in implementing the safety function and that:
- results in the spurious operation of the safety function, bringing the EUC (or part of it) into a safe state or maintaining a safe state
- increases the likelihood that a spurious operation of the safety function will bring the EUC (or part of it) into a safe state or maintain a safe state
Safe failures result in a loss of production or service, but not a loss of safety.
Safe failures can be:
- Detected, when the failure is detected through internal or external diagnosis
- Undetected, when the failure is not detected through internal or external diagnosis
Dangerous failures
Failure of an element and/or subsystem and/or system that plays a role in implementing the safety function that:
- prevents the operation of a safety function when required (demand mode) or causes the failure of a safety function (continuous mode) so that the EUC is put into a dangerous or potentially dangerous state
- decreases the probability that the safety function operates correctly when required
Dangerous failures lead to a loss of safety.
Like safe failures, dangerous failures can be detected or undetected.
No effect failures
Failure of an element that plays a role in implementing the safety function but has no direct effect on it.
No effect failures are not a necessary parameter for SIL evaluation.
No part failures
Failure of a component that plays no part in implementing the safety function. Because such a component does not belong to the safety function, its failures are likewise not a necessary parameter for SIL evaluation.
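The four categories can be sketched as a small classification routine. This is only an illustration: the labels `spurious_trip`, `prevents_function` and `none`, and the function names, are hypothetical, and the sketch assumes the IEC 61508 reading in which a no part failure belongs to a component that plays no part in the safety function.

```python
from enum import Enum


class FailureClass(Enum):
    SAFE = "safe"
    DANGEROUS = "dangerous"
    NO_EFFECT = "no effect"
    NO_PART = "no part"


def classify(plays_part: bool, effect: str) -> FailureClass:
    """Classify a random hardware failure.

    plays_part: does the failed item take part in the safety function?
    effect: hypothetical label, one of 'spurious_trip',
            'prevents_function' or 'none'.
    """
    if not plays_part:
        # Assumption: the item is outside the safety function altogether.
        return FailureClass.NO_PART
    if effect == "spurious_trip":
        return FailureClass.SAFE
    if effect == "prevents_function":
        return FailureClass.DANGEROUS
    if effect == "none":
        return FailureClass.NO_EFFECT
    raise ValueError(f"unknown effect label: {effect!r}")


def needed_for_sil_evaluation(failure_class: FailureClass) -> bool:
    """Only safe and dangerous failures enter the SIL evaluation."""
    return failure_class in (FailureClass.SAFE, FailureClass.DANGEROUS)
```

For instance, `classify(True, "prevents_function")` yields the dangerous class, which `needed_for_sil_evaluation` keeps, while a no effect or no part failure is filtered out.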
Spurious failure
A spurious activation occurs when a safety instrumented function is activated without a demand.
A spurious activation of an SIS normally brings the equipment under control (EUC) to the safe state. However, spurious activations may be unwanted due to:
- Generation of unnecessary production losses
- Generation of “false alarms”, which may again lead to loss of confidence in the SIS
- Increased risk of dangerous events following a spurious activation; for example, during start-up
- Excessive stress on components and systems during shutdown and start-up
- The possibility that the spurious activation itself creates a dangerous event; for example, spontaneous airbag deployment while driving
Common cause failure
Common cause failure can occur when:
- Two or more elements in the same safety instrumented function are not completely independent
- Two or more safety instrumented functions in the same system are not completely independent
Common cause failures have the following characteristics:
- They are a typical concern in redundant configurations
- They are due to a single underlying physical defect or phenomenon
- They occur at the same time or within a specific time interval (often of short duration)
Common cause failures are not considered in, for example, the NooN configuration (parallel configuration), where the safety instrumented function is lost as soon as the first fault occurs. Since no fault is tolerated in that configuration, there is no benefit in considering common cause faults that would affect further channels.
Architecture examples
The examples cover single- and multiple-channel architectures in which, for the sake of simplicity, the sensor and the final element always remain single.
The single-channel architecture has low reliability. One hidden fault in the channel is enough for the safety function to be lost.
The availability of the safety function in a single-channel architecture can be improved by adding a diagnostic circuit that works independently and triggers the emergency action if it detects a fault in the channel.
In the double-channel architecture, one of the two outputs must open to put the system in a safe state:
- This model is more reliable, as it can tolerate a fault without losing the safety function.
- Common cause failures must also be considered.
- Availability is low, as the first spurious failure drives the system to the safe state.
In the triple-channel architecture, 2 out of 3 outputs must open to put the system in a safe state, which gives both high reliability and high availability.
This is the architecture with the best compromise between availability and reliability.
The double- and triple-channel architectures can also be upgraded by combining them with a diagnostic circuit able to put the system in emergency mode.
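The behavior of the three architectures can be checked with a minimal voting sketch. It assumes each channel output is modelled as a boolean (True when the output has opened), ignores diagnostics and common cause failures, and uses hypothetical function names chosen for this illustration.

```python
from typing import Sequence


def system_trips(outputs_open: Sequence[bool], m: int) -> bool:
    """The system goes to the safe state when at least m outputs are open."""
    return sum(outputs_open) >= m


def tolerates_dangerous_fault(m: int, n: int) -> bool:
    """Real demand, but one channel is stuck and cannot open its output.

    Returns True if the remaining healthy channels still trip the system.
    """
    outputs = [False] + [True] * (n - 1)
    return system_trips(outputs, m)


def tolerates_spurious_fault(m: int, n: int) -> bool:
    """No demand, but one channel opens its output spuriously.

    Returns True if the system keeps running (no unwanted trip).
    """
    outputs = [True] + [False] * (n - 1)
    return not system_trips(outputs, m)


# (m, n) pairs: m outputs out of n must open to trip the system.
ARCHITECTURES = {
    "single channel": (1, 1),
    "double channel": (1, 2),
    "triple channel": (2, 3),
}

for name, (m, n) in ARCHITECTURES.items():
    print(
        name,
        "| tolerates dangerous fault:", tolerates_dangerous_fault(m, n),
        "| tolerates spurious fault:", tolerates_spurious_fault(m, n),
    )
```

Under these assumptions the single channel tolerates neither kind of fault, the double channel tolerates a dangerous fault but not a spurious one, and the triple channel tolerates one fault of either kind, which is the compromise described above.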