Recently, a question was forwarded from Lhoist’s National Maintenance Manager Mauricio Arroyo to Reliabilityweb.com’s Global Relationship Leader and Director of Women in Reliability and Asset Management Maura Abad. The question was prompted after Mauricio read the Asset Condition Monitoring Project Manager’s Guide (Guide) to which Maura had provided a download link. The question was:
“In their [the Guide’s coauthors] experience across the industries, what is the average Cost Avoidance that they have found?”
In the original query and subsequent communications, Mauricio elaborated by adding:
“I have always heard that the cost of no maintenance (run to failure) is “x” times higher than planning a job and or applying any of the Preventive and Predictive techniques that we have available today. That ‘x’ was the ratio that I am looking for. We are using at Lhoist a 3 to 1 Ratio, but based on other experiences that I have seen, it could be as high as 6 to 1.”
Maura forwarded the original question to Dave Reiber, Senior Reliability Leader at Reliabilityweb.com, Terrence O’Hanlon, CEO and Publisher of Reliabilityweb.com and Uptime Magazine, and Jack Nicholas, an independent author and consultant who, along with O’Hanlon and Reiber, is a coauthor of the Guide.
This prompted an intercontinental and transcontinental dialogue. Dave Reiber was in Budapest, Hungary; Terrence O’Hanlon was in Florida; and Jack Nicholas was on a cruise touring Norway’s fjords. Jack also asked Anthony M. (Mac) Smith, author of the book Reliability-Centered Maintenance (1993) and coauthor of RCM: Gateway to World Class Maintenance (2003), who was then on vacation in southern California, to comment, as Mac had frequently expressed his opinion on the value of RCM-based maintenance.
I think the first thing to address here is the difference between the definitions of Cost Avoidance and Cost Savings.
Cost Avoidance – An estimated dollar amount that would be paid in the future if proactive actions did not keep the machine, tooling, or system producing units (e.g., avoiding downtime by fixing something that could potentially break down).
Cost Savings – Dollars that are currently spent, but will not be spent in the future (e.g., fixing something so that you use less steam, air, electrical power, etc.).
Note: These definitions differ from, but are not inconsistent with, those terms in The Professional’s Guide to Maintenance and Reliability Terminology published by Reliabilityweb.com.
On page 19 of the Asset Condition Monitoring Project Manager’s Guide, it is stated: “Monetary benefits (i.e., cost avoidance and cost saving actions) rapidly accumulate while an organization is working through the hump period. In fact, there are usually enough, so only a portion of these benefits may need to be calculated to make program justification apparent. It is most important for the ACM team to keep track of these cost avoidance and cost saving KPIs.”
I believe you should capture all of the Cost Avoidance and Cost Savings details. It will take more of your time, but when it becomes a habit, it will be easy to do and the information will always be there to support and promote your program, even after the initial finds become further apart.
The ratio depends entirely on your particular situation. I think it is best to get the Financial Officer for your business or organization involved. A good way to do this is to get the General Ledger (GL) account information for each area of the business.
Example: In the automobile business, the cost of downtime on the final line (GL account) of a car or truck plant can be several hundred dollars a minute. In the same building, because of buffers and counters, the downtime costs in the Paint or Body Shop (GL accounts) are significantly less, but still very high. Therefore, the cost avoidance is much higher on the Final Line. By using the GL account information, no one can dispute the value.
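As a minimal sketch of the GL-based approach described above, the estimate can be reduced to downtime minutes avoided multiplied by the downtime cost per minute for the affected GL account. The account names and dollar rates below are hypothetical placeholders chosen for illustration; real values would come from your Financial Officer.

```python
# Hypothetical downtime cost rates per GL account, in dollars per minute.
# These figures are illustrative assumptions, not data from any plant.
GL_DOWNTIME_RATE = {
    "final_line": 300.0,
    "paint_shop": 120.0,
    "body_shop": 90.0,
}


def cost_avoidance(gl_account: str, downtime_minutes_avoided: float) -> float:
    """Estimate cost avoided as minutes of downtime avoided x GL cost per minute."""
    if downtime_minutes_avoided < 0:
        raise ValueError("downtime_minutes_avoided must be non-negative")
    return GL_DOWNTIME_RATE[gl_account] * downtime_minutes_avoided


# The same 45-minute stoppage avoided is worth a different amount in each area.
for account in GL_DOWNTIME_RATE:
    print(f"{account}: ${cost_avoidance(account, 45):,.0f}")
```

Because the rate comes straight from the GL account, the same find is valued differently depending on where in the plant it occurs, which is the point of the example above.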
Any time you can capture Cost Savings, it should be recorded. It is indisputable.
On page 64 of the Guide, we mention “Find-of-the-Week.” It is important to keep this information at hand to celebrate the wins.
As to ratios, the returns are highly variable and depend on too many factors to generalize. In my experience, it’s better to gather overall return on investment (ROI) together and not try to segregate individual ones. What you’ll find is some things work very well (i.e., give high ROI) and others not so much for many different reasons, including the enthusiasm of those upon whom the savings or cost avoidance initiative and calculations depend, the effect of applications of various strategies and technologies, etc. For example, ultrasonic analysis works better on some systems than others. Same for infrared thermography or vibration analysis. So, the effect of your investment in technologies and all attendant costs will vary and so will your returns.
As to the KPI definitions and usage, as long as your definitions and algorithms make sense, the most important thing is to follow what you come up with rigorously and consistently all the time, allowing no deviation without a change to what you’ve formally written. Any changes should be documented in follow-on KPI definitions and rationale, with reason for the change, so anyone questioning your logic can understand why you made the change. This goes to the heart of your competence and integrity in the eyes of those who try to judge your credibility and the usefulness of your KPIs.
I thought about the ongoing dialogue about the 10 to 1 ratio, and I feel there may be a larger issue that brought this up, namely the Role of Operations and Maintenance (O&M) in Supporting (or Failing to Support) Corporate Financial Goals. So, I took off on that topic last night and the lengthy response is as follows:
Try to look at things in the simplest way possible. For O&M people who are concerned about how their organization affects their company’s financial picture (i.e., Profit and Loss), there are only three aspects of what they do that they can control (or fail to control). An example I frequently use to open most discussions that I have on the topic (Why Are We Here?) is that O&M people COST their company money to keep the production or process working as intended. Depending on how well they perform that function, they play a key role in keeping the product moving to the marketplace (PROFIT). This is why I believe the O&M organization should actually be considered a PROFIT CENTER if it does its job effectively. Now, the COST elements here are how well they PREVENT failures from occurring (Preventive Maintenance - PM) and, if that doesn’t always happen, how quickly they can CORRECT the problem (Corrective Maintenance - CM) and restore production. If they are good at those two things, they are a direct contributor to PROFIT but, if they are not, they are a major factor in LOSS. In other words, if they avoid DOWNTIME (DT), they become a major factor in making money. To me, that is a fairly direct and simple way to think about O&M’s role in the organization.
Now, one of the things that intrigues me about RCM is that it requires you to first understand where your O&M organization feels the most pain: which of my plant’s SYSTEMS are eating my lunch (i.e., costing me the most by constantly being in a fix-it, or REACTIVE, mode)? Enter the Pareto chart and the 80/20 rule. Which of the above parameters could logically supply the data for a Pareto chart? Well, it could be either CM data or DT data. Which of these two is easier and more reliable to access? Anyone with database experience will say CM data, either Work Order (WO) counts or WO cost data. From experience, CM count data is best, although trial exercises have clearly shown that either CM or DT leads to the same Pareto chart and the same 80/20 conclusion.
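The Pareto step described above can be sketched in a few lines. This is only an illustration, assuming the CM work orders have been exported as a flat list with one system name per work order; the system names and counts are made up, and the 80 percent cut-off is a parameter rather than a fixed rule.

```python
from collections import Counter


def pareto_bad_actors(cm_work_orders, threshold=0.80):
    """Rank systems by CM work order count and return those that together
    account for at least `threshold` of all CM work orders.

    Each returned item is (system, cm_count, cumulative_share).
    """
    counts = Counter(cm_work_orders)
    total = sum(counts.values())
    bad_actors = []
    cumulative = 0
    for system, count in counts.most_common():  # highest CM count first
        if cumulative / total >= threshold:
            break
        cumulative += count
        bad_actors.append((system, count, cumulative / total))
    return bad_actors


# Hypothetical export: one entry per CM work order, tagged with its system.
cm_work_orders = (
    ["conveyor"] * 40 + ["kiln"] * 25 + ["crusher"] * 15 + ["baghouse"] * 8
    + ["compressor"] * 6 + ["pumps"] * 4 + ["lighting"] * 2
)

for system, count, share in pareto_bad_actors(cm_work_orders):
    print(f"{system}: {count} CM work orders, cumulative {share:.0%}")
```

In this made-up data set, three of the seven systems account for 80 percent of the CM work orders, so those three would be the first candidates for RCM analysis.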
The second thing that intrigues me about RCM is that it gives me a tool, the Decision Logic Tree, to further determine which Component Failure Modes in that System are the culprits behind its “bad actor” label. The culprits are those Failure Modes that lead directly to a Safety/Environmental, Outage and/or Economic issue when and if they occur. The tree also determines whether they are Hidden. The default decision, for Failure Modes that meet none of these criteria, is to do nothing and fix when convenient. This is commonly referred to as Run to Failure (RTF).
So, where does the 10 to 1 ratio come from and what does it mean? The idea is that if we did the right PM action in the first place, chances are very good that the Failure Mode would occur infrequently, if at all, and we would not be faced with the need to do CM. Here is where my RCM analyses start to show some eye-catching results. First, a large portion, typically 30 to 70 percent, of the Failure Mode population has no assigned PM task at the time of the analysis. Second, where a PM task did exist, it was frequently (10 percent or more of the time) seen to be an action that could not have prevented the Failure Mode in the first place. So, when the analysis did define a new PM task, we became very interested in comparing what the new PM would cost to prevent the Failure Mode with what a CM action could (or did) historically cost to restore the equipment to its originally intended functional state. A pattern emerged over time, and the data showed that the CM cost was typically eight to 12 times larger than the cost of the new PM task. So, we talk about that finding using a 10 to 1 ratio, on average. The projects that provided inputs to this included paper mills, refineries, nuclear and fossil power plants, aircraft assembly lines and wastewater treatment plants, which are among my 87 successfully completed RCM projects. The absence of published data on this ratio is due to the resistance of companies (i.e., their lawyers) to publicly releasing such information. My book, RCM: Gateway to World Class Maintenance, contains seven fairly complete case histories, but even the release of this book was delayed over a year while obtaining legal approval to publish what is there. Virtually no hard cost data is ever seen in public books or papers.
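To make the comparison concrete, here is a small sketch of how a CM-to-PM cost ratio could be computed per Failure Mode and then averaged. The failure modes and dollar figures are hypothetical, chosen only so that the ratios fall in the eight to 12 range mentioned above; they are not data from any RCM project.

```python
from statistics import mean

# Hypothetical records: (failure mode, cost of the new PM task,
# historical cost of the CM action), both in dollars. Illustrative only.
records = [
    ("pump seal leak", 500.0, 4_000.0),
    ("gearbox bearing wear", 1_200.0, 12_000.0),
    ("motor winding overheating", 250.0, 3_000.0),
]


def cm_to_pm_ratio(pm_cost: float, cm_cost: float) -> float:
    """How many times more the corrective action costs than the preventive task."""
    if pm_cost <= 0:
        raise ValueError("pm_cost must be positive")
    return cm_cost / pm_cost


ratios = {name: cm_to_pm_ratio(pm, cm) for name, pm, cm in records}
average_ratio = mean(list(ratios.values()))

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} to 1")
print(f"average: {average_ratio:.0f} to 1")
```

Tracking the ratio per Failure Mode, rather than only the average, also shows where a new PM task pays back the most.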
Another startling result was the multiplier that existed when the Failure Mode led not only to the COST to restore, but also to significant DT and loss of product sales. There, a multiplier of 100 to 10,000 can occur. If those ratios blow your mind, think about a 1,200 megawatt power plant out for just one day, with replacement power costing some $800,000 per day. Or, consider 20 sold 777 airplanes sitting on the tarmac and costing the Seller a penalty of $25,000 for each day that delivery is missed, all because the machines that made the pieces for the overhead bins were in a constant state of fail and fix.
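A rough sketch of how downtime inflates the multiplier: add the downtime loss to the CM cost before dividing by the PM cost. The $800,000 per day figure is the replacement power cost from the power plant example above; the PM and CM costs are hypothetical placeholders set at a 10 to 1 ratio.

```python
def total_failure_multiplier(pm_cost: float, cm_cost: float,
                             downtime_days: float, loss_per_day: float) -> float:
    """Ratio of the full cost of a failure (repair plus downtime loss)
    to the cost of the PM task that could have prevented it."""
    if pm_cost <= 0:
        raise ValueError("pm_cost must be positive")
    return (cm_cost + downtime_days * loss_per_day) / pm_cost


pm_cost = 1_000.0         # hypothetical cost of the preventive task
cm_cost = 10_000.0        # hypothetical repair cost (10 to 1 versus PM)
loss_per_day = 800_000.0  # replacement power cost per day, from the example

repair_only = total_failure_multiplier(pm_cost, cm_cost, 0, loss_per_day)
with_one_day_out = total_failure_multiplier(pm_cost, cm_cost, 1, loss_per_day)

print(f"repair only: {repair_only:.0f} to 1")
print(f"with one day of downtime: {with_one_day_out:.0f} to 1")
```

Under these assumed costs, a single day of downtime moves the ratio from 10 to 1 into the hundreds, which is how multipliers in the 100 to 10,000 range can arise.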
I hope this helps to explain the 10 to 1 ratio.
Fabulous response – I learn more every time we speak or with every e-mail you send.
My one issue is that we should focus on the business issues that may or may not correlate to work order count, but I also agree with your (Mac’s) advice to use whatever data you can easily get your hands on.
Ford Motor Company came up with a 10X rule, which states that the cost of removing a failure mode depends on the lifecycle phase in which you remove it.
Remove the failure mode in design: 1X
Remove the failure mode in early build: 10X
Remove the failure mode in final assembly: 100X
Remove the failure mode in the operations phase: could be 1,000X to 10,000X
So, a multiplier of 10:1 is reasonable.
This thread from Jack, Dave and Mac has been outstanding and I appreciate Mauricio thinking enough of us to ask. He got a real treat with better replies than most would ever have access to. Thank you.
There is such a lack of asking questions/discussion/respectful debate in our community – I really cherish these opportunities.
I hate to sound like a fuddy-duddy, but in the beginning of the Internet – for the first year when all global maintenance experts first connected – we hosted an e-mail discussion group that was basically a college education in all things maintenance, reliability and asset management. Maintenance.org still has healthy discussions, but I sure miss that e-mail group. Eventually people stopped sharing, they disagreed “disrespectfully” and then the dreaded marketing/sales messages followed. It was good while it lasted. This group and discussion reminds me of those days.