The Connection Between Equipment Risk and Equipment Reliability and Its Effect

Operational equipment reliability, and the resulting plant uptime, are inversely linked to the risks you allow your equipment and machinery to carry. The inverse connection between equipment risk and reliability is not obvious, but it becomes clear when the risk equation is broken down into its fundamental elements.

We start by examining the most commonly used form of the risk equation:
Risk ($/yr) = Consequence of Occurrence ($) x Frequency of Occurrence (/yr)

The equation says that risk is equal to the cost of a failure event multiplied by the frequency of the event. 

The Frequency of Occurrence divides further, so the full form of the risk equation becomes:
Risk ($/yr) = Consequence ($) x [No. of Opportunities to Fail (/yr) x Chance of a Failure]

The Number of Opportunities to Fail is how many times a year a situation arises that could lead to a failure event. The Chance of a Failure is the odds that a failure will happen once there is an opportunity. Throw the two dice in Figure 1, and every throw is an opportunity to get one on each die, but the odds are 1 in 36 that it will actually happen in the next throw.

Figure 1

The Chance of a Failure is one (1) if a failure will definitely happen every time the opportunity arises, and zero (0) if a failure will never happen when the situation arises. Chance takes values between 0 and 1 because, in most situations, a thing going wrong is possible to some degree but not certain. The chance of both dice showing one is 1/36, or 0.0278, which is poor odds to bet on.
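As a quick check of the dice example, the sketch below enumerates every outcome of two dice and counts the ones where both show one. It assumes two fair six-sided dice; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every possible outcome of throwing two fair six-sided dice (assumed fair).
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes where both dice show one.
favourable = sum(1 for a, b in outcomes if a == 1 and b == 1)

# Chance of a "failure" (a double one) on any single throw.
chance = Fraction(favourable, len(outcomes))

print(chance)                   # 1/36
print(round(float(chance), 4))  # 0.0278
```

Each throw is one opportunity, and the chance stays the same on every throw, which is why the frequency of the event depends on both the number of throws and the chance per throw.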

For operating plant and equipment, the Chance of a Failure becomes the Chance of Equipment Failure, which is the complement of Equipment Reliability (the chance of not failing, i.e., the chance of success).

The reliability equation for equipment is:
Equipment Reliability = 1 - Chance of Equipment Failure
With a little manipulation, this becomes:
Chance of Equipment Failure = 1 - Equipment Reliability
Substituting the Chance of Equipment Failure into the full risk equation, we get:
Risk ($/yr) = Consequence ($) x [No. of Opportunities to Fail (/yr) x {1 - Equipment Reliability}]
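The full risk equation can be sketched as a small function to see how risk responds to reliability. The function name `annual_risk` and the dollar and opportunity figures are hypothetical, chosen only for illustration.

```python
def annual_risk(consequence, opportunities_per_year, reliability):
    """Risk ($/yr) = Consequence ($) x [Opportunities (/yr) x (1 - Reliability)]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    chance_of_failure = 1.0 - reliability
    return consequence * opportunities_per_year * chance_of_failure


# Hypothetical figures, not taken from the article.
consequence = 50_000.0  # cost of one failure event, $
opportunities = 12.0    # situations per year that could lead to a failure

for reliability in (0.90, 0.99, 1.00):
    risk = annual_risk(consequence, opportunities, reliability)
    print(f"reliability={reliability:.2f}  risk=${risk:,.0f}/yr")
```

Under these assumed figures, risk falls as reliability rises and reaches zero at a reliability of 1. Setting `opportunities_per_year` to zero also gives zero risk, whatever the reliability.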

The full risk equation gives us great insight into how we can maximize production equipment uptime. There is a direct inverse connection between equipment risk and equipment reliability. When equipment reliability is perfect (Reliability = 1), the risk is zero, and when there are no opportunities to fail (Opportunities = 0), there is also no risk. If you want high equipment reliability, you must remove the possibility of a failure event arising in your machines and equipment.

Now that the connection between high equipment risk and low reliability is clear, we can make better operational and maintenance strategy choices.

Table 1

Impact of Equipment Risk on Maintenance Strategy

Risk is reduced by minimizing the consequence of an event or by reducing the frequency of an event. Which focus you choose as your key operational risk management strategy will be a major factor in your future production success. Table 1 shows a range of common maintenance and reliability strategies divided into chance reduction strategies and consequence reduction strategies.

Consequence reduction strategies limit cost escalation by reacting quickly to a developing failure. These strategies allow failure to start, and then you manage the problem so that the least time, money, and effort is lost. They tolerate failure and loss as routine. They accept that it is only a matter of time before problems severely affect an operation.

Companies that use consequence reduction strategies minimize their losses by learning to fix problems and breakdowns fast and/or by doing lots of predictive maintenance to find embryonic failures. They hold many spare parts in store for insurance, set up a cache of parts by machines, train their repair people to fix things speedily, improve maintainability to do repairs faster, and have dedicated condition-monitoring groups looking at equipment for problems.

Minimizing risk by reducing its consequences means that you accept failure as normal. In an organization that mainly uses consequence reduction to manage failures, people wait for evidence of failure and then act. Reducing only the consequences of risk still makes work for everyone. This work never ends, because people and resources fix failures instead of removing failure causes so that there are fewer opportunities to have failures. In this way, a reactive culture is instilled in the organization.

Figure 2

The risk matrix of Figure 2 shows that reducing the consequences of an incident reduces risk since less money is lost; you move to the left on the matrix. That is the purpose of such things as emergency plans, fire brigades, and ambulances. If we react quickly, correctly, and early enough, the losses can be minimized.

The use of consequence reduction techniques on your equipment is an important risk control principle to contain costs, but it will not improve your reliability. Those activities that reduce failure consequence improve availability but do not improve reliability. You save some maintenance costs by preventing breakdowns, but there will be much frantic activity and "fire-fighting." For reliability improvement, you must reduce the frequency of failure; you must remove the chance of failure happening.

The alternative equipment risk management strategy is to use chance reduction techniques. Fewer failure incidents occur because chance reduction stops failure opportunities from starting. The risk matrix shows that chance reduction strategies lead to fewer failure events; reliability improves because you reduce the frequency of failure. The number of incidents falls over time. If failures drop from once a quarter to once a year to once every two years to once every five years, you have created reliability. On the risk matrix, reliability improvement moves you down the table.
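To put numbers on the progression from one failure a quarter to one every five years, the sketch below converts each interval to a yearly frequency and applies Risk = Consequence x Frequency. The cost per failure is a hypothetical figure; only the failure intervals come from the text.

```python
# Failure intervals from the text, expressed as failure events per year.
frequencies = {
    "once a quarter": 4.0,
    "once a year": 1.0,
    "once every two years": 0.5,
    "once every five years": 0.2,
}

consequence = 50_000.0  # hypothetical cost of one failure event, $

baseline = frequencies["once a quarter"]
risks = {}
for label, per_year in frequencies.items():
    risks[label] = consequence * per_year  # Risk ($/yr)
    factor = baseline / per_year           # improvement relative to quarterly failures
    print(f"{label:<22} risk=${risks[label]:>9,.0f}/yr  improvement={factor:.0f}x")
```

With the consequence held constant, moving from quarterly failures to one failure every five years cuts the annual risk by a factor of 20 in this example.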

Chance reduction strategies focus on identifying potential problems and making business system changes to prevent or remove the prospect of failure. The chance reduction strategies view failure as avoidable and preventable. These methodologies rely heavily on improving business processes rather than improving failure detection methods. They expend time, money, and effort to identify and stop problems so that the chance of failure is minimized.

The maintenance activities that pay off the most are those that reduce the frequency of a failure event. Stop an equipment risk incident from happening, and the equipment failure event cannot occur. If a maintenance activity does not reduce equipment risk, it is a waste of time, money, and effort. When you reduce failure frequency, you automatically increase equipment reliability. With high reliability comes high availability, high throughput, and low maintenance costs.

You cannot expect to move more than a cell to the left on the risk matrix by using consequence reduction strategies. Your costs might halve, or even drop to a quarter, if you get good at spotting and managing impending failures, but when using frequency reduction strategies, you can easily move down many cells, bringing you a reduction in risk of up to hundreds of times. Consequence reduction strategies cannot achieve that amount of improvement. The use of chance reduction techniques should be your prime means of equipment risk control because they will give you both large maintenance cost reductions and far higher equipment reliability.
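The difference in scale between the two strategies can be illustrated with a rough comparison. The baseline figures and the 100-fold frequency reduction are assumptions made for this sketch; the quartering of cost is the best case mentioned above for consequence reduction.

```python
def annual_risk(consequence, frequency_per_year):
    """Risk ($/yr) = Consequence ($) x Frequency (/yr)."""
    return consequence * frequency_per_year


# Hypothetical baseline: a $100,000 failure that happens four times a year.
base_consequence = 100_000.0
base_frequency = 4.0
baseline = annual_risk(base_consequence, base_frequency)

# Consequence reduction: cost per event drops to a quarter, frequency unchanged.
consequence_strategy = annual_risk(base_consequence / 4, base_frequency)

# Chance reduction: frequency drops 100-fold (assumed), cost per event unchanged.
chance_strategy = annual_risk(base_consequence, base_frequency / 100)

print(f"baseline:              ${baseline:,.0f}/yr")
print(f"consequence reduction: ${consequence_strategy:,.0f}/yr "
      f"({baseline / consequence_strategy:.0f}x lower)")
print(f"chance reduction:      ${chance_strategy:,.0f}/yr "
      f"({baseline / chance_strategy:.0f}x lower)")
```

In this sketch the failure count stays at four a year under consequence reduction, so reliability is unchanged even though cost falls. Under chance reduction, both the annual risk and the number of failures fall.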

Both equipment risk reduction philosophies are necessary for optimal protection, but a business with a chance reduction focus will proactively prevent defects, unlike one with a consequence reduction focus, which will find and fix failures early. Organizations that primarily apply chance reduction strategies have set up their business to ensure decreasing numbers of failures. As a consequence, they get outstanding equipment reliability and reap all the business performance benefits that world-class reliability brings.

It is in your organization's best interest, and it will generate the most profit consistently for the least amount of work, to focus strongly on the use of chance reduction strategies. Consequence reduction strategies are still important and necessary: once a failure sequence has initiated, you must find it quickly, address it, and minimize its effects so you lose the least amount of money. But consequence reduction will not take your organization to world-class success and profit, because it expends resources. Only chance reduction strategies reduce the need for resources, because they proactively eliminate failure incidents through defect elimination and failure prevention that remove the opportunity for failures to start.

Mike Sondalini

Mike Sondalini has been in engineering and maintenance since 1974. Mike's career extends across original equipment manufacturing, beverage production, steel fabrication, industrial chemical manufacturing, quality management, project management, industrial asset management, and industrial training. His specialty is helping capital equipment-intensive companies build sound business risk management practices, introduce world-class lean practices, develop ultra-high-reliability enterprise asset management systems, and instill the precision maintenance skills needed to continually improve plant uptime.
