Reliability: A New Definition

Countless people, mostly engineers, use the word reliability to describe whether equipment can provide its necessary function. The Society for Maintenance & Reliability Professionals (SMRP) states that reliability relates to the level at which operational stability is achieved because equipment does not fail – the equipment is available at rated capacity whenever it is needed and it yields the same results on repeated operation. Reliability could also be defined as the probability that equipment will perform a required function without failure under stated conditions for a stated period of time. Reliability is an inherent characteristic of the system and, therefore, very much a part of the design. SAE International (formerly the Society of Automotive Engineers) has a similar definition: reliability is the probability that machinery/equipment can perform continuously, without failure, for a specified interval of time when operating under stated conditions. Increased reliability implies less failure of the machinery and, consequently, less downtime and loss of production.

Note the words machinery and equipment in these definitions. The emphasis there, and in almost all of the work of SMRP, is on equipment and the maintenance practices that sustain it. Clearly, this is an essential requirement for reliability. However, it’s not sufficient. The equipment in a given operation can be fully functional, yet the operation or the system can be failing functionally, or totally, for a variety of reasons.

While SMRP does wonderful work, my impression for some time has been that it would be more appropriately written as SMrP (big M, little r) because its focus is primarily maintenance of equipment and not the reliability or capability of the process (vs. equipment) to provide its intended function. It gives insufficient attention to design, purchasing, start-up/commissioning, and operating practices, the very practices that have the greatest impact on the reliability of a given process. Maintenance does not control reliability. It supports reliability. In my experience working with hundreds of production plants, maintenance has direct control of only about 10 percent of the total production losses as measured by overall equipment effectiveness (OEE) or asset utilization (AU).

A New Definition

My definition of reliability is more focused on system-level functional capability, that is, process reliability, and takes a broader business perspective. It is as follows:

Process Reliability – the ability of a system to deliver the maximum in quality product or service, on time, in full, at the lowest sustainable cost.

The point is that if you limit yourself to an equipment focus, you will miss many, if not most, of the opportunities to improve the performance of the system as a whole. In a production system, OEE and AU, as shown in Figure 1, are the best measures of reliability. In nonproduction systems, such as rail systems, computer facilities, and hospitals, the measures would require adaptation to the specific system. OEE/AU fundamentally asks the questions: What is perfection or the ideal state? What are your gaps between actual performance and ideal? How will you close those gaps, recognizing that you may never truly achieve the ideal state? You will, however, be superior to other operations because you will have optimized the system.

Figure 1: Asset utilization (AU) and overall equipment effectiveness (OEE)
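The gap-to-ideal question can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it takes "ideal" to mean rated rate multiplied by all calendar hours, which is one common convention for AU and may not match every detail of Figure 1. The function name and the weekly numbers are hypothetical.

```python
def asset_utilization(good_units, rated_units_per_hour, calendar_hours):
    """Return actual good output as a fraction of ideal output.

    Assumption: ideal output = rated rate x all calendar hours.
    """
    ideal_units = rated_units_per_hour * calendar_hours
    if ideal_units <= 0:
        raise ValueError("ideal output must be positive")
    return good_units / ideal_units


# Hypothetical week: 168 calendar hours at a rated 100 units per hour,
# with 12,600 good units actually delivered.
au = asset_utilization(good_units=12600, rated_units_per_hour=100, calendar_hours=168)
gap = 1 - au

print(f"AU = {au:.0%}, gap to ideal = {gap:.0%}")  # AU = 75%, gap to ideal = 25%
```

The 25 percent gap in this made-up example is the total to be explained and managed; it says nothing yet about which function controls each loss.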

As previously noted, it’s been my experience that maintenance directly controls only about 10 percent of the losses from ideal depicted in Figure 1. As shown in Figure 2, two-thirds of the losses typically have nothing to do with equipment, but rather relate to issues like production planning, product changeovers, transition losses, raw material quality and quantity, rate and quality losses, short stops, trial runs, operator absenteeism, shift handover delays, market demand, and perhaps other causes. Of the remaining one-third, some two-thirds relate to poor operating practices or to a system design that makes operation or maintenance more difficult. That leaves maintenance in control of one-third of one-third, or roughly 10 percent, of the total losses from ideal production. Maintenance does not control the reliability of the production process!

Figure 2: Losses from ideal production
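The arithmetic behind the "about 10 percent" figure can be checked in a few lines. This sketch uses only the rough two-thirds proportions stated in the text; the variable names are illustrative, and the shares are rules of thumb from experience rather than measured data.

```python
from fractions import Fraction

# Share of total losses that have nothing to do with equipment.
non_equipment = Fraction(2, 3)

# The balance is equipment related.
equipment_related = 1 - non_equipment  # 1/3

# Of that balance, about two-thirds trace back to operating practices or design.
operating_or_design = Fraction(2, 3) * equipment_related  # 2/9

# What is left is under the direct control of maintenance.
maintenance = equipment_related - operating_or_design  # 1/9

shares = {
    "non-equipment": non_equipment,
    "operating practices or design": operating_or_design,
    "maintenance": maintenance,
}
for name, share in shares.items():
    print(f"{name}: {float(share):.1%}")
```

The maintenance share works out to one-ninth, about 11 percent, which is consistent with the "some 10 percent" quoted above.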

Considerations

Application to Manufacturing and Production Plants – The focus of this article is on manufacturing and production plants and not facilities like data centers, rail systems, hospitals, and other operations that typically don’t produce a product, per se. In those operations, greater focus should be given to maintenance practices and their impact on the reliability of the facility. However, I suspect that design, purchasing, start-up/commissioning, and operating practices will also have a substantial impact on the reliability of those systems and, therefore, deserve additional attention, though I personally have little experience in these types of facilities. Nonetheless, my recommendation for them would be similar – define the ideal state, measure performance against that ideal, and then manage the losses from ideal.

Maximum in Quality Product – As noted in the definition, reliability is the “ability of a system to deliver the maximum in quality product.” To be clear, the fact that you can produce the maximum doesn’t mean you should. You don’t want excess inventory beyond what’s needed to manage variations or disruptions in the supply chain or production processes. You want to match demand with production and be able to meet that demand as the market varies.

Managing the Losses – The focus of your effort should be on managing the losses from ideal in the OEE/AU measure. You will always have losses from ideal. The question you must ask is: Are these losses acceptable to the business? Some will be, but most will not. As such, a business decision must be made for each loss regarding the investment required to reduce or eliminate the loss versus the value of reducing that loss in terms of gross profit, improved quality and delivery, potential loss of customers or future market share, etc. From a business perspective, the optimal state will be less than the ideal production state.
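One simple way to frame the loss-by-loss business decision is to compare the investment needed against the yearly value of removing the loss. The sketch below is a minimal illustration, not a prescribed method: the `Loss` class, the payback measure, and every number in it are hypothetical, and a real decision would also weigh quality, delivery, and customer or market-share effects that a single ratio does not capture.

```python
from dataclasses import dataclass


@dataclass
class Loss:
    name: str
    annual_value: float  # estimated yearly gross profit recovered if the loss is removed
    investment: float    # estimated one-time cost of removing it

    @property
    def payback_years(self) -> float:
        """Years for the recovered value to repay the investment."""
        if self.annual_value <= 0:
            return float("inf")
        return self.investment / self.annual_value


# Hypothetical losses taken from an OEE/AU breakdown.
losses = [
    Loss("short stops", annual_value=150_000, investment=300_000),
    Loss("product changeovers", annual_value=400_000, investment=100_000),
    Loss("raw material quality", annual_value=50_000, investment=20_000),
]

# Shortest payback first: these are the candidates to address first.
ranked = sorted(losses, key=lambda loss: loss.payback_years)

for loss in ranked:
    print(f"{loss.name}: payback {loss.payback_years:.2f} years")
```

Losses with a very long payback are the ones a business may reasonably accept, which is why the optimal state sits below the ideal production state.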

Taking this approach will provide higher capacity, lower costs, higher gross profits, better quality, and better on time/in full performance and, incidentally, better safety and environmental performance. Reliability will be viewed as the ability of a system to deliver the maximum in quality product or service, on time, in full, at the lowest sustainable cost.
