Saving Cost by Reducing Reliability?

In an economic downturn, we may need lower production availability, which can help reduce maintenance costs. Availability depends on both reliability and maintainability. Which of the two should we work on to achieve the best results?

Cash Streams Related to Reliability

The link between reliability and cost is quite complex. First, there is the ‘set-up’ cost of installing reliability improvement measures and attaining a reliability culture. This takes time, effort and resources, and is a ‘non-refundable cost’: if you do not use what you have built, you do not get your money back; it is merely frittered away. Next, there is the cost of sustaining this position. This is an ongoing cost, which can be reduced or eliminated if we do not wish to retain the achieved levels of reliability. Both of these result in a negative cash flow.

What we get in the benefits stream is distinct from the above ‘costs’. As reliability improves, trips and breakdowns are reduced or eliminated. This has two positive cash flow effects: uptime improves, raising potential profitability, and maintenance costs fall because there are fewer breakdowns.

Availability, or uptime, equates to potential revenues. Two factors contribute to availability: how long the equipment runs before it needs maintenance, and how long that intervention takes (the downtime). We can play with both variables to achieve the desired level of availability. However, there are technical and economic limits to how far we can raise reliability and reduce downtime. For example, once we have achieved very high reliability, i.e. very few trips or breakdowns, further improvements become disproportionately resource-intensive and costly. Similarly, once we have reduced downtime through near-perfect planning and scheduling, further improvements can mean much higher spares inventory, standby manpower (e.g., shift maintenance cover), additional cranes and other lifting gear etc.
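To make the two variables concrete, here is a minimal sketch. It assumes the common steady-state approximation, availability = MTBF / (MTBF + MTTR), which is not stated in this paper; the function name and the figures are illustrative only.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean uptime and mean downtime.

    Assumption (not from the paper): availability = MTBF / (MTBF + MTTR).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative figures only.
baseline = availability(mtbf_hours=1000.0, mttr_hours=50.0)
higher_reliability = availability(mtbf_hours=2000.0, mttr_hours=50.0)
faster_repair = availability(mtbf_hours=1000.0, mttr_hours=25.0)

print(f"baseline:           {baseline:.4f}")
print(f"doubled MTBF:       {higher_reliability:.4f}")
print(f"halved repair time: {faster_repair:.4f}")
```

Under this approximation, doubling the run length and halving the repair time give the same availability, so the choice between them rests on cost and risk rather than on the availability figure itself.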

Maintenance Intervals

A popular concept is to ‘stretch' maintenance intervals to save costs. Let us examine the validity of this approach.

The failure rates or MTBFs of the failure modes depend on the design and build quality, how we operate the equipment, its loading and the degradation mechanisms that apply in each case. If the new regime is such that the operating context is more benign, then we can expect degradation, and hence failure rates, to decrease. If we calculate the effect of these expected reductions in failure rates and then assign a new set of maintenance intervals, there is no problem.

A reality check will show that this is not how it happens in practice. Management issues an order to cut the maintenance budget by some percentage that will keep the unit operating costs as before, at the lower volume of production. Let us say this is 20%. The planner is then told to increase the maintenance intervals of ALL planned tasks by 20%, or better still, by 25%. There are two sources of error in this approach:

1. Each failure mode will be affected differently by the change in operating context - all failure modes are not created equal!
2. The relationship between reliability and maintenance interval is not linear.

[Figure: relationship between reliability and maintenance interval]

Arbitrary changes in maintenance intervals, made without careful thought, can raise failure rates; in other words, more breakdowns will result. That means higher maintenance costs, not lower!
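Both sources of error can be illustrated with a small sketch. It assumes each failure mode follows a Weibull life distribution, which is a modelling choice for illustration and not something this paper specifies; the characteristic life, shape values and interval are hypothetical.

```python
import math


def failure_probability(interval_hours: float, eta_hours: float, beta: float) -> float:
    """Probability of failing before the planned task, under an assumed Weibull model.

    eta_hours is the characteristic life and beta the shape parameter
    (beta = 1 is random failure, beta > 1 is wear-out). Both are hypothetical.
    """
    return 1.0 - math.exp(-((interval_hours / eta_hours) ** beta))


def stretch_effect(interval_hours: float, stretch: float,
                   eta_hours: float, beta: float) -> float:
    """Factor by which the failure probability grows when the interval is stretched."""
    before = failure_probability(interval_hours, eta_hours, beta)
    after = failure_probability(interval_hours * (1.0 + stretch), eta_hours, beta)
    return after / before


# Same 25% stretch applied to three hypothetical failure modes.
for beta in (1.0, 2.0, 3.0):
    ratio = stretch_effect(400.0, 0.25, 1000.0, beta)
    print(f"beta={beta}: failure probability multiplies by {ratio:.2f}")
```

In this sketch the same 25% stretch has a modest effect on the random failure mode and a much larger one on the strong wear-out mode, which is why a single across-the-board percentage is hard to justify.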

Across-the-board interval changes also affect the test intervals of protective devices and systems. These have vital safety functions that protect the integrity of the facility. It is easy to throw out the integrity ‘baby’ with the rest of the bathwater, but the consequences can be dramatic and unacceptable.
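The effect on protective devices can be sketched as follows. The assumptions are mine, not the paper's: hidden failures arrive at a constant rate, the device is as good as new after each test, and test duration is negligible. The failure rate and intervals are hypothetical.

```python
import math


def mean_unavailability(failure_rate_per_year: float, test_interval_years: float) -> float:
    """Average fraction of time a protective device is in a failed, undetected state.

    Assumes exponentially distributed hidden failures that are found only at the
    periodic test, with the device restored to as-new condition after each test.
    """
    x = failure_rate_per_year * test_interval_years
    if x <= 0:
        raise ValueError("failure rate and test interval must be positive")
    # Average of (1 - exp(-rate * t)) over one test interval.
    return 1.0 - (-math.expm1(-x)) / x


# Hypothetical relief valve: one hidden failure every 10 years on average.
yearly_test = mean_unavailability(0.1, 1.0)
stretched_test = mean_unavailability(0.1, 1.25)

print(f"tested every 12 months: {yearly_test:.4f}")
print(f"tested every 15 months: {stretched_test:.4f}")
```

Under these assumptions the time the device spends unable to protect the plant rises roughly in step with the test interval, so a blanket stretch quietly erodes the safety function.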

Attitudes, Behavior and Motivation

Once we reach sustainably high reliability levels, a cultural change takes place in people's mindset. For example, PM compliance will be in the high 90s (percent), and equipment will be released and returned on time. Deliberately reducing reliability at this stage, e.g. by withdrawing the resources that sustain it, is wasteful. It will reduce a small element of the total costs (the cost of sustaining an achieved level of reliability), but people's attitudes and behaviors will be adversely affected. Through a single act of slash-and-burn, they will tend to lose the pride and motivation that have taken years to build. A probable result will be additional breakdowns and increased maintenance costs.

Matching Required Availability to Effort

Actively reducing reliability levels as a means of matching the reduced demand for availability is counter-productive. A better way is to reduce maintainability, e.g. by reducing spares inventories, logistics support etc. By doing so, we will reduce service levels, and the associated costs will drop. So when our need for high availability vanishes with the economic downturn, lowering maintainability is the most effective (and defensible) way to respond.
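As a rough sketch of what this means in numbers, suppose again that availability = MTBF / (MTBF + MTTR). That relationship is an assumption introduced for illustration; the function name and figures are hypothetical. Holding reliability (MTBF) constant, we can ask how much repair time per failure a lower availability target allows.

```python
def allowable_mttr(mtbf_hours: float, target_availability: float) -> float:
    """Mean downtime per failure that still meets the target availability.

    Rearranges the assumed relationship A = MTBF / (MTBF + MTTR)
    to MTTR = MTBF * (1 - A) / A, keeping MTBF (reliability) unchanged.
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if not 0.0 < target_availability <= 1.0:
        raise ValueError("target availability must be in (0, 1]")
    return mtbf_hours * (1.0 - target_availability) / target_availability


# Hypothetical unit with an MTBF of 1000 hours.
before_downturn = allowable_mttr(1000.0, 0.95)
during_downturn = allowable_mttr(1000.0, 0.90)

print(f"95% target: up to {before_downturn:.1f} h downtime per failure")
print(f"90% target: up to {during_downturn:.1f} h downtime per failure")
```

In this example, dropping the target from 95% to 90% roughly doubles the downtime we can tolerate per failure, which is the room that lets us wait longer for spares, logistics and labor without touching reliability.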

Drivers of Maintenance Costs

Two drivers determine maintenance cost: the reliability of the equipment and the productivity of the workforce. Both need to be kept at high levels, whatever our target availability may be. Instead, let production (but not safety system) equipment wait to be repaired: don't keep lots of spares in stock, and minimize the craft strength so that production equipment waits for attention in a longer queue. People will understand the logic of these steps - we don't need high availability, so it is OK to incur downtime. This means we can wait for spares, logistic support and labor.

Don't Prejudice Your Safety Culture

Availability of equipment affects both production volumes and safety. In a downturn, we require lower production volumes, but we have to keep our eye on safety at all times. That means we still need high availability of our safety systems, such as fire protection or relief (valve) systems. We cannot compromise on their reliability at any time. Hence we must maintain a good reliability culture, and culture is all-pervasive: we cannot accept a poor reliability culture for production availability while demanding a good reliability culture for safety systems. Deliberately lowering reliability levels will affect safety adversely. A note of caution: maintainability of safety systems must be retained at an adequate level.

Matching Reliability to Risk Levels

Every industry faces risks; these may be technical, financial and market driven. Management's job is to manage these risks effectively and efficiently. In a quantitative sense, risk has two components, frequency and severity. Severity often depends on local circumstances and we must try to lower this as far as possible through actions to mitigate it. At the same time we have to reduce the frequency of failures, or in other words improve reliability. Thus reliability is an essential feature of risk management.
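A minimal sketch of the two components follows. It assumes the usual convention that risk is the product of frequency and severity, expressed here as an expected annual loss; the paper names the two components but does not give a formula, and all figures are hypothetical.

```python
def annual_risk(failures_per_year: float, cost_per_failure: float) -> float:
    """Expected annual loss, assuming risk = frequency x severity."""
    if failures_per_year < 0 or cost_per_failure < 0:
        raise ValueError("frequency and severity must be non-negative")
    return failures_per_year * cost_per_failure


# Hypothetical failure mode: one event every two years, 200,000 per event.
baseline = annual_risk(0.5, 200_000.0)
mitigated_severity = annual_risk(0.5, 100_000.0)    # consequences halved
improved_reliability = annual_risk(0.25, 200_000.0)  # frequency halved

print(baseline, mitigated_severity, improved_reliability)
```

Either lever lowers the risk in this simple picture; where severity is fixed by local circumstances, frequency (that is, reliability) is the lever that remains.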

There is a risk-optimum level of maintenance work (NOT a cost optimum). That determines what work must be done and when. Any other regime reduces reliability and raises risk, possibly to an unacceptable level. Some experts advocate an RCM philosophy whose stated aim is to reduce costs directly. When risk issues are not the primary focus, such attempts can raise risks uncontrollably. With cost reduction alone in focus, risk issues may be ignored. This has led to major disasters, e.g. Bhopal.

Dealing with the Upturn, When It Arrives

Lowering reliability deliberately makes the return journey difficult and costly. By then, people's mindsets will have changed, and the reliability culture, established after years of painstaking effort, will have been frittered away.

We can raise maintainability fairly quickly, by investing in spares, logistics and additional craft resources.

Summary

When an operating unit faces a lower demand, it is necessary to trim its uptime (availability) to suit. Reducing its reliability and/or its maintainability can do this, but we need to consider whether one or the other is preferable.

Costs are a consequence, not the drivers, of performance. In an attempt to manage costs downwards, we should prune the right drivers. Reducing reliability results in increased maintenance costs due to the increase in trips and breakdowns. Further, it can seriously affect risk levels, so reducing maintainability is the better approach.

A cost focus can also adversely affect the motivations, attitudes and behaviors of people. These take years to develop, but can be lost in one stroke. Human reliability forms a vital link in the reliability chain. Since low motivation and poor behavior result in poor reliability and productivity, this is an important aspect to consider.

Paper written by V.Narayan on 1st February 2009 in response to a Blog posting at Maintenance.org