How things fail

"Reliability has two broad ranges of meanings:

  1. Qualitatively-operating without failure for long periods of time just as the advertisements for sale suggest, and
  2. Quantitatively-where life is predictable long and measurable in test to assure satisfactory field conditions are achieved to meet customer requirements.

Reliability is concerned with failure-free operation for periods of time, whereas quality is concerned with avoiding non-conformances at a specified time prior to shipment thus reliability measures a dynamic situation but quality measures a static situation. As in physics, statics is easier to understand and calculate than dynamics which involves higher levels of math and greater mental capabilities for comprehension."

- H. Paul Barringer

We study failed items for the same reason we do autopsies on humans: we want the data and we want it categorized correctly for making important decisions.

Recording a failure requires:

  1. A time origin that is unambiguously defined;
  2. A scale for measuring whatever drives the failure, such as elapsed time, starts or stops;
  3. A definition of failure that is entirely clear, so the event can be recorded consistently.
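
As a rough illustration of these three requirements, the sketch below models a single failure record in Python. Every name in it (`FailureRecord`, `usage_unit`, the pump example and its values) is hypothetical and not taken from any particular CMMS or standard; it only shows how the time origin, the usage scale and the failure definition can be made explicit and checked at the moment the event is recorded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FailureRecord:
    """Hypothetical record of one failure event (illustrative field names)."""
    asset_id: str
    time_origin: datetime     # when the clock started, e.g. commissioning or last repair
    failed_at: datetime       # when the failure event was recorded
    usage_unit: str           # scale that drives failure: "hours", "starts", "cycles", ...
    usage_at_failure: float   # usage accumulated on that scale since time_origin
    failure_definition: str   # agreed criterion that counts as "failed"

    def __post_init__(self) -> None:
        # Reject records that violate any of the three requirements.
        if self.failed_at < self.time_origin:
            raise ValueError("failure cannot precede the time origin")
        if self.usage_at_failure < 0:
            raise ValueError("usage must be non-negative")
        if not self.failure_definition.strip():
            raise ValueError("a failure definition is required")

    def calendar_age_days(self) -> float:
        """Calendar time from the time origin to the failure, in days."""
        return (self.failed_at - self.time_origin).total_seconds() / 86400.0


# Made-up example data, for illustration only.
record = FailureRecord(
    asset_id="PUMP-001",
    time_origin=datetime(2021, 1, 1),
    failed_at=datetime(2021, 1, 31),
    usage_unit="starts",
    usage_at_failure=412,
    failure_definition="discharge pressure below 80% of rated value",
)
print(record.calendar_age_days())  # 30.0
```

Keeping the calendar age and the usage count side by side is a deliberate choice in this sketch: an asset that fails after few days but many starts tells a different story than one that fails after many days and few starts.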

Failures during an asset's life can be attributed to the following causes:

Design Failures: This class of failures takes place due to inherent design flaws in the asset or system. In a well-designed system, this class of failures should make a very small contribution to the total number of failures. Research by Winston Ledet (outlined in "Don't Just Fix It, Improve It! A Journey to the Precision Domain") showed that approximately 20% of corrective work orders could be traced to poor design, build and installation issues.

Figure: Failure patterns

Infant Mortality: This class of failures causes new (and newly repaired) assets to fail early in service. According to "Reliability-Centered Maintenance" by Nowlan and Heap, up to 72% of failures fall into the "worse new" or "worse repaired" (infant mortality) category.

Figure: Infant mortality random failure pattern

Random Failures: Random failures can occur at any point in the life of an asset. These failures are also described in "Reliability-Centered Maintenance" by Nowlan and Heap. Between 77% and 92% of failures are random in pattern.

Wear Out: Once an asset has reached the end of its useful life, degradation of component characteristics will cause it to fail. Ledet's research found that "wear out" is the cause of 12% or less of corrective work orders. Nowlan and Heap and related research show that 8-23% of failures are wear out related.

The following graph shows the contribution of the different failure modes to the overall failure rate.

Contribution of different failure modes towards component failure
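
One common way to reproduce a graph like this numerically is to model each failure pattern with a Weibull hazard (failure rate) function and add the rates together. That is an assumption of this sketch, not something the article specifies. A Weibull shape parameter below 1 gives a decreasing rate (infant mortality), a shape of exactly 1 gives a constant rate (random), and a shape above 1 gives an increasing rate (wear out). The shape and scale values in `MODES` are made up purely for illustration.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Illustrative (shape, scale) pairs; these are assumptions, not measured values.
MODES = {
    "infant mortality": (0.5, 2000.0),  # shape < 1: failure rate falls with age
    "random": (1.0, 1000.0),            # shape = 1: failure rate is constant
    "wear out": (4.0, 1500.0),          # shape > 1: failure rate rises with age
}


def total_hazard(t: float) -> float:
    """Overall failure rate, assuming the modes act independently so their rates add."""
    return sum(weibull_hazard(t, shape, scale) for shape, scale in MODES.values())


if __name__ == "__main__":
    # Print each mode's contribution and the combined rate at a few ages (in hours).
    for t in (10, 100, 500, 1000, 2000):
        parts = ", ".join(
            f"{name}={weibull_hazard(t, shape, scale):.6f}"
            for name, (shape, scale) in MODES.items()
        )
        print(f"t={t:>5} h: {parts}, total={total_hazard(t):.6f}")
```

With these assumed parameters, the combined rate is high early on, dips in mid-life and rises again late in life, which is the familiar "bathtub" shape. Changing the relative weight of the three modes changes the shape of the total curve, which is why the percentages quoted above matter when choosing a maintenance strategy.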

Where does preventive maintenance fit with these patterns?

Where does what some call "predictive" maintenance, but we call asset condition management, fit in?

Where does prescriptive maintenance fit with these patterns?

What else should we understand about failure?

Terrence O'Hanlon

Terrence O’Hanlon, CMRP, and CEO of® and Publisher for Uptime® Magazine, is an asset management leader, specializing in reliability and operational excellence. He is a popular keynote presenter and is the coauthor of the book, 10 Rights of Asset Management: Achieve Reliability, Asset Performance and Operational Excellence.
