There is a plethora of oil analysis training courses on the market today, and any reputable course will discuss strategies for setting wear debris alarms. The primary strategies for wear debris alarms include:

  • OEM recommended absolute values,
  • Statistically derived,
  • Rate of change.

Absolute Values

The beauty of OEM recommended absolute values is that they give a starting point for wear alarms. This is especially useful when an end user is just getting started on the road to oil analysis. However, the key phrase here is "starting point." Multiple studies have shown that two like machines running the same process, under the same load, can, and most likely do, have different levels of wear over similar time periods. Herein lies the problem with absolute wear alarms.

Let's look at a case involving a process-critical gearbox at an industrial location that used OEM recommended absolute alarm values. This gearbox had historically shown 0ppm of iron in its regular oil samples. In one particular sample, the iron value came back at 6ppm. In many instances, 6ppm in an industrial gearbox is considered to be just noise, but in this instance there was major cause for alarm. The analyst reviewed the iron data and cross-correlated it with other oil sample data such as PQ index (Figure 1) and particle count data (Figure 2), all of which showed significant increases. The analyst then called for additional testing and warned of an impending failure.

Unfortunately, the customer in this instance relied on OEM data. The OEM indicated that this level of iron should not be considered a problem by any means. In fact, the OEM quoted an AGMA standard as supporting documentation for not alarming on iron until it rose above 50ppm. The AGMA guideline used was not for the type of gearbox discussed in this example; rather, it was for a gearbox used in a completely different application. Five days after the initial warning and request for additional testing, the gearbox suffered a catastrophic failure in which the gearbox casing split in two.

Now we can understand why using OEM defined alarms should be considered just as a starting point.

Statistically Derived Alarms

The next alarm level, and one that is highly supported and recommended in many oil analysis training courses, is the statistically derived alarm. This term is loosely defined as simply utilizing population standard deviation to determine where the caution and critical point should be with respect to wear debris alarms.

In calculating the statistically derived alarm, one must take the average of the selected dataset then calculate the standard deviation. From this point, the end user can establish alarm points. The initial, or caution alarm, is generally set at one or two standard deviations above the average with a critical point being two or three standard deviations above average.

The choice should really be that of the end user. However, we advocate initially alarming wear debris at the average plus two standard deviations, with a critical value of the average plus three standard deviations. This approach allows for a very tight focus on the top 5 percent of problem machines. It is especially useful in the early stages of an oil analysis program to help reduce the natural occurrence of work order overload.
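The calculation described above can be sketched in a few lines. This is a minimal illustration, not a laboratory's actual implementation: the function name `statistical_alarms` and the iron readings are hypothetical, and the multipliers default to the two and three standard deviations recommended here.

```python
from statistics import mean, pstdev


def statistical_alarms(readings_ppm, caution_k=2.0, critical_k=3.0):
    """Return (caution, critical) limits as average + k population standard deviations."""
    avg = mean(readings_ppm)
    sd = pstdev(readings_ppm)  # population standard deviation of the dataset
    return avg + caution_k * sd, avg + critical_k * sd


# Hypothetical iron readings (ppm) from a population of like machines.
iron_ppm = [2, 4, 4, 4, 5, 5, 7, 9]
caution, critical = statistical_alarms(iron_ppm)
print(f"caution at {caution:.1f} ppm, critical at {critical:.1f} ppm")
```

Passing `caution_k=1.0, critical_k=2.0` gives the tighter one and two standard deviation variant mentioned earlier.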

It is worth noting, however, that wear debris distributions do not generally follow the normal distribution bell curve shown in Figure 3. Most real wear distributions are closer to a log-normal or truncated log-normal curve. In these cases, two standard deviations above the average do not necessarily equal the 95th percentile. Figure 4 shows a typical wear distribution curve.
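One way to see this effect on a given dataset is to count how many readings actually exceed the average plus two standard deviations. The sketch below does that; the function name `share_above_limit` and the skewed readings are hypothetical and chosen only to illustrate a long right tail.

```python
from statistics import mean, pstdev


def share_above_limit(readings_ppm, k=2.0):
    """Return the alarm limit (average + k standard deviations) and the
    fraction of readings that actually exceed it."""
    limit = mean(readings_ppm) + k * pstdev(readings_ppm)
    flagged = sum(1 for r in readings_ppm if r > limit)
    return limit, flagged / len(readings_ppm)


# Hypothetical, right-skewed iron readings (ppm): most are low, a few are high.
skewed_ppm = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 45]
limit, share = share_above_limit(skewed_ppm)
print(f"limit = {limit:.1f} ppm, share flagged = {share:.1%}")
```

If the flagged share is far from what was expected, that is a hint the population is not bell shaped and the limits deserve a closer look.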

Rate of Change Alarms

Rate of change alarms have long been considered the most precise method of setting alarms. The idea behind this method is to basically track the wear generation rate. The accepted thought on this is if we can monitor changes in the rate of wear, then we can make a better estimation of machine condition, as well as identify a potential condition very high up on the P-F curve. Generally, the goal is to normalize the data to a specific rate, such as wear per 100 hours of operation.

In order for this method to have a solid level of accuracy, the sample run time must be fairly consistent. This very basic method, while a decent starting point, can quite easily result in a false-positive situation, particularly when the run time on the oil is significantly lower than the normal sampling run time. If we were to use the simple data shown in Table 1, one could conclude that the sample showing a generation rate of 67ppm per 100 hours of operation would indicate a severe wear condition. This would likely result in some level of inspection when, in actuality, absolutely nothing could be wrong with the component.
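The basic normalization, and the way it inflates at short run times, can be sketched as follows. The function name `naive_rate_per_100h` and the two samples are hypothetical; they are not the Table 1 data, only values picked to show the same kind of jump.

```python
def naive_rate_per_100h(fe_ppm, oil_hours):
    """Basic wear generation rate: ppm divided by oil hours, scaled to 100 hours."""
    if oil_hours <= 0:
        raise ValueError("oil_hours must be positive")
    return fe_ppm / oil_hours * 100.0


# Hypothetical samples from the same healthy component.
normal_interval = naive_rate_per_100h(fe_ppm=20, oil_hours=250)  # usual run time
short_interval = naive_rate_per_100h(fe_ppm=10, oil_hours=15)    # sampled early

print(f"normal interval: {normal_interval:.1f} ppm per 100 hours")
print(f"short interval:  {short_interval:.1f} ppm per 100 hours")
```

The second sample has half the iron of the first, yet its calculated rate is several times higher simply because the divisor is small.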

Internal studies were done at Fluid Life, an oil analysis laboratory with facilities in the United States and Canada, to determine the impact of using different sampling run times when calculating the wear generation rate. The study was performed on a vast number of equipment makes and models. The results were the same regardless of the component type, the make, or the model.

It is often stated that as much as 10 percent to 30 percent of a sump volume can be left behind during an oil change. This is attributed to oil remaining behind on the moving components. By using simple math, one often assumes that this means that only 10 percent to 30 percent of residual debris would be left behind as well. That is not the case. According to a portion of the study, which included 3,772 samples, 49 percent of the iron was left behind after an oil change, on average. Looking at a completely different make and model of component with a total of 687 samples, the study showed 58 percent of wear debris remaining after an oil change, on average.

This tells us that we simply can't assume the wear metals start at 0ppm. We also cannot assume that the majority of wear debris is removed during an oil change. In fact, if we do continue to utilize the "as preached" way of calculating generation rate, we are setting up for failure. The Fluid Life study indicates that when comparing the "naïve" rate to the actual wear rate, there is a 273 percent increased chance of calling a component in a state of failure when all may very well be normal.

Referring to Figure 5, the Actual Fe line is the average Fe in ppm for all samples in each oil hour "bucket." The oil hour buckets are grouped in 25 hour increments: the "0" bucket includes all samples from 0 hours up to and including 24 hours, the 25 hour bucket includes all samples from 25 hours up to and including 49 hours, and so on.
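The bucketing can be sketched as below. This is an illustration under assumptions, not the code used in the study: the function names and the sample list are hypothetical, and only the 25 hour bucket width comes from the description above.

```python
from collections import defaultdict


def oil_hour_bucket(oil_hours, width=25):
    """Return the start of the bucket a sample falls in (0, 25, 50, ...)."""
    return int(oil_hours // width) * width


def average_fe_by_bucket(samples, width=25):
    """Average Fe (ppm) per bucket for a list of (oil_hours, fe_ppm) pairs."""
    grouped = defaultdict(list)
    for oil_hours, fe_ppm in samples:
        grouped[oil_hour_bucket(oil_hours, width)].append(fe_ppm)
    return {start: sum(v) / len(v) for start, v in sorted(grouped.items())}


# Hypothetical (oil_hours, fe_ppm) samples.
samples = [(5, 6.0), (24, 8.0), (25, 7.0), (49, 9.0), (60, 10.0)]
averages = average_fe_by_bucket(samples)
print(averages)
```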

The naïve generation rate is the average Fe in each bucket divided by the bucket center value and then multiplied by 100 hours, giving a result in ppm per 100 hours (e.g., for the 0 bucket, 6.59/12.5*100 = 52.72).

The "corrected" generation rate is the average Fe in each bucket, with the estimated y intercept from the fitted line subtracted, then divided by the bucket start and multiplied by 100 hours.
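The two rates can be compared with a short sketch. Everything here other than the 6.59/12.5*100 example is an assumption made for illustration: the function names, the hand-rolled least-squares fit used to estimate the intercept, and the bucket averages are hypothetical. The divisor is passed in as a plain `hours` argument because the 0 bucket has a start of zero and cannot be used as a divisor.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def naive_rate(avg_fe_ppm, hours):
    """Average Fe divided by hours, scaled to ppm per 100 hours."""
    return avg_fe_ppm / hours * 100.0


def corrected_rate(avg_fe_ppm, intercept_ppm, hours):
    """Same as naive_rate, but with the fitted y intercept subtracted first."""
    return (avg_fe_ppm - intercept_ppm) / hours * 100.0


# The 0 bucket example from the text: average Fe 6.59 ppm, bucket center 12.5 h.
zero_bucket_naive = naive_rate(6.59, 12.5)

# Hypothetical bucket averages (hours, average Fe in ppm) for the fitted line.
hours = [0, 100, 200]
avg_fe = [5.0, 7.0, 9.0]
slope, intercept = fit_line(hours, avg_fe)

naive_100 = naive_rate(7.0, 100)
corrected_100 = corrected_rate(7.0, intercept, 100)
print(f"naive: {naive_100:.2f}, corrected: {corrected_100:.2f} ppm per 100 hours")
```

With these made-up numbers the corrected rate matches the slope of the fitted line, while the naïve rate is inflated by the iron that was already present at zero oil hours.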

Using this method allows one to calculate a better estimate of the generation rate using just the data for a single sample coupled with knowledge of the intercept. As can be seen in Figure 5, the naïve generation rate will show substantially higher ppm per 100 hours readings if the oil hours are lower than normal, particularly if they are less than 100 hours. The naïve rate also suggests, counterintuitively, that the generation rate steadily decreases the longer the oil is left in service.

In Conclusion

The proper setting of wear debris alarms can have a make-or-break effect on the overall effectiveness of an oil analysis program. While the goal of predictive maintenance is to identify a potential failure high up on the P-F curve, without a full understanding of alarms it is likely that one will falsely identify a failure. When that happens, the effect on a site is similar to that of a missed failure: the credibility of the entire oil analysis program comes into question.

Jeff Keen is a Professional Computer Engineer. He serves as Vice President of Information Technology and Research and Development with Fluid Life. With 21 years of experience, Jeff designs and develops systems to manage and evaluate oil analysis results and related information.

Matt Spurlock is a Certified Lubrication Specialist and a Certified Maintenance & Reliability Professional. He serves as a Senior Reliability Specialist and Instructor with Fluid Life. With over 20 years in the field, Matt specializes in in-depth oil analysis data evaluation and lubrication program optimization for customers across all industries.
