by Will McGinnis

Attempting to contextualize generally obscure statistical predictions about machines is an all too common challenge. While statistics and predictive analytics are not new concepts, they have not yet been applied deeply enough to mechanical processes to be embedded in the vernacular. When the goal is analysis that people can act on, similes help to convey the message. This article presents one of them.

Machines As Patients

In terms of failure, the primary difference between biological and mechanical systems is, for the most part, that small variations strengthen the biological and weaken the mechanical. A biological system dropped into an environment for which it is ill suited will eventually adapt to it, while the mechanical will simply fail. Despite this fundamental difference, they can be approached the same way on the doctor's table.

Sickness

People and machines both get sick. The chronic propensity to fail can be seen in both terminally ill patients and mis-sized or improperly maintained machinery. Both cases are characterized by a likelihood of future failure that is independent of short-term variation in environment, behavior or process.

Injury

Likewise, both people and machines can be injured. With a person, this could be a broken arm. In most cases, it is a single, unexpected event that is unlikely to be repeated unless the environment consistently reproduces those events (e.g., a professional skateboarder). The mechanical equivalent would be an operator's error. A broken part due to operator error is not necessarily an indication of a long-term misapplication of the machine, but rather an isolated event caused by outside factors. The only ways to avoid injuries are to change the environment or improve the process (e.g., quit skateboarding or train more to fall less).

Sickness vs. Injury

With these similes, it can be pretty simple to visualize and understand the cases of mechanical failure. Sickness is environmentally independent and chronic, while injuries are heavily influenced by the environment and are acute. A machine can be both sick and injured, or just one of the two.

Measurements for both machine sickness (HealthScore) and injury (Warnings) can be seen in Table 1.

Table 1

There are four assets with the four boundary cases of condition:

  1. Total Health (004)
  2. Injured, but well (001)
  3. Intact, but sick (002)
  4. Sick and Injured (003)

The goal is a machine with both a high HealthScore and no warnings (predicted injuries). The opposite of this ideal is a sick and injured machine, with both a low HealthScore, indicating chronic risk of failure, and a warning, indicating imminent likelihood of injury.

In between these two extremes are machines with a warning, but a high HealthScore, indicating an environmental risk but no systemic misapplication, and machines with no warning but low HealthScore, indicating a systemic misapplication that is being well accounted for by the environment (i.e., users and maintainers). These are both interesting cases because they can be easily rationalized as good results, but while they are far better than sick and injured, they can be greatly improved by changing the environment or system, respectively.
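
The four boundary cases above reduce to two questions per asset: is the HealthScore high, and is there a warning? A minimal Python sketch of that classification follows. The field names, the 0 to 100 scale and the cutoff of 70 are assumptions made for illustration; the article does not define a numeric threshold, and the sample values are invented rather than taken from Table 1.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not say what counts as a "high" HealthScore.
HEALTHY_THRESHOLD = 70.0


@dataclass
class Asset:
    asset_id: str
    health_score: float  # chronic condition; higher is healthier (assumed 0-100 scale)
    has_warning: bool    # True when an injury (acute failure) is predicted


def classify(asset: Asset, threshold: float = HEALTHY_THRESHOLD) -> str:
    """Map an asset onto one of the four boundary cases."""
    healthy = asset.health_score >= threshold
    if healthy and not asset.has_warning:
        return "Total Health"
    if healthy and asset.has_warning:
        return "Injured, but well"
    if not healthy and not asset.has_warning:
        return "Intact, but sick"
    return "Sick and Injured"


# Illustrative values only; these are not the figures from Table 1.
assets = [
    Asset("001", 92.0, True),
    Asset("002", 41.0, False),
    Asset("003", 35.0, True),
    Asset("004", 95.0, False),
]

for asset in assets:
    print(asset.asset_id, classify(asset))
```

Keeping the two measurements as separate fields, rather than blending them into one number, preserves the distinction the simile depends on: the warning points to the environment or process, while the HealthScore points to the machine itself.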

Figure 1 shows an example of how a population of assets might look visualized in this way.

Plants As Populations

After establishing this idea of machines as patients (i.e., biological-esque entities), the natural extension is to look at a plant as a population of such entities. Scholar Nassim Taleb describes all things as being along a scale from fragile to anti-fragile.

If something is fragile, then a small, random variation makes it weaker. Conversely, if something is anti-fragile, then small, random variations make it stronger. Right in the middle of the two is robust or resilient, where variations do not affect the item.

Individual machines are generally fragile by this definition. Likewise, individual people are, in many cases, fragile. A population of people, however, can be extremely anti-fragile, adapting over time to changes in the environment and gradually improving quality and longevity of life. The goal of a plant should be a population of machines that resembles a population of people or animals more so than a house of cards, where a collection of fragile items is more fragile than its constituent parts.

Selection

The first key concept that contributes to the anti-fragility of biological populations is selection. Those animals or bacteria that are least fit for the environment do not continue on into the next generation. Likewise, to increase the anti-fragility of a plant, the fitness (health) of the machines needs to not only be tracked, but used to inform the replacement of machines. Machines that are chronically unfit (unhealthy) must be replaced with machines that are less predisposed to this condition. Where the machines themselves cannot be changed to be more fit for the environment, the environment must be changed.

Continuous, iterative improvement of both the conditions surrounding the machines in a plant and the fitness of the machines to that environment is critical in building the capacity for the plant to not only withstand unexpected variations, but to benefit from them.
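
One way to turn selection into a routine is to flag machines whose HealthScore has stayed low over several consecutive readings, so that a single bad reading does not trigger a replacement. The sketch below is an assumption about how that might look, not a description of any particular product: the function name, the threshold of 70 and the three-reading window are all hypothetical, and the history values are invented.

```python
from statistics import mean


def replacement_candidates(history, threshold=70.0, min_periods=3):
    """Return IDs of assets whose last `min_periods` HealthScores were all
    below `threshold`, ordered from least healthy to most healthy.

    `history` maps an asset ID to its HealthScores, oldest first.
    The threshold and window are illustrative assumptions.
    """
    candidates = []
    for asset_id, scores in history.items():
        recent = scores[-min_periods:]
        # Require a full window so that new assets are not judged too early.
        if len(recent) == min_periods and all(s < threshold for s in recent):
            candidates.append((mean(recent), asset_id))
    return [asset_id for _, asset_id in sorted(candidates)]


# Invented readings for four assets.
history = {
    "001": [65.0, 88.0, 91.0],  # recovered, so not chronic
    "002": [60.0, 55.0, 50.0],  # chronically low
    "003": [40.0, 35.0, 30.0],  # chronically low, and the worst
    "004": [90.0, 95.0, 93.0],  # healthy
}

print(replacement_candidates(history))
```

Where replacement is not an option, the same list identifies the machines whose environment is the next thing to change.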

Compartmentalization

The other critical contribution to a population's anti-fragility is compartmentalization. The lower the effect that one failure has on other members of the population, the more anti-fragile that population will be.
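
As a rough way to quantify that effect in a plant, each machine can be scored by how many other machines sit downstream of it. The sketch below assumes a simple map of which machine feeds which; the machine names and the layout are hypothetical, and the model ignores buffers and redundancy, both of which would reduce the real impact.

```python
from collections import deque


def downstream_of(feeds, failed):
    """Return every machine reachable from `failed` by following `feeds`.

    `feeds` maps a machine to the machines it supplies directly.
    """
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for nxt in feeds.get(current, []):
            if nxt not in seen and nxt != failed:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical plant layout: a compressor supplies a three-machine line.
feeds = {
    "compressor": ["press", "welder", "paint"],
    "press": ["welder"],
    "welder": ["paint"],
    "paint": [],
}

# Number of other machines affected if each machine fails.
impact = {machine: len(downstream_of(feeds, machine)) for machine in feeds}
print(impact)
```

Machines with the highest counts are the least compartmentalized, which makes them natural places to consider buffers, bypasses or spare capacity.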

Taleb uses the examples of airlines and banks. If a plane accidentally crashes somewhere in the world, other planes are no more likely to crash because of it. In fact, the aviation industry will learn from the accident and improve its processes so future planes will be less likely to crash, the very definition of anti-fragility. A bank, on the other hand, is more likely to collapse if one of its peers collapses. The financial crisis of 2007 is testament to this fact; the interconnectedness of the global economies (lack of compartmentalization) makes the population of banks fragile.

For a plant, this means the goal should be to minimize the impact of failures on downstream processes as much as possible.

Having described the idea of a plant as a population of entities with individual variations in health and injury, what can you actually do in your plant?

You want to have less downtime, lower maintenance costs and fewer accidents, and make more money. In a world of certain variation, you want your plant to have lower risk and to better weather storms. To do this, you must follow these steps:

  1. Compartmentalize failures,
  2. Predict machine injuries,
  3. Replace sick machines,
  4. Repeat.

It is an iterative process that reflects the constantly changing environment in which workers and their machines operate. By focusing on Steps 2 and 3, you can leverage preexisting data sources in your plant and use the power of predictive analytics to predict acute failures and quantify long-term machine health.

In closing, here is one more simile. If your plant is a population of sick and injured patients, how are you maintaining it? With predictive analytics, you have a means by which you can perform triage in order to build continuous improvement.
