The Role of Critical Spares Analysis in Validating Spare Parts Recommendations
The day went from bad to worse an hour later when the mechanical foreman reported that several key parts weren’t on the shelf in the storeroom. The damaged ones were beyond repair and the original equipment manufacturer (OEM) was quoting eight weeks on expedited delivery. The best case was a local machine shop offering to reverse engineer and manufacture the needed parts in two weeks. After the dust settled, one of the corrective action items the site leader assigned me was to investigate why the parts weren’t on the shelf.
Management of spare parts usually generates a spirited discussion, especially when the critical ones aren’t on the shelf when they’re needed most. Sometimes the part has been consumed and not replaced, or perhaps it wasn’t set up to be stocked in the first place. The latter could be because a strong reliability program is either not in place at the facility or it has not identified the need to have the spare part on the shelf. I’ve even found that a part I set up on the shelf was not there because someone deleted it from inventory.
All of these situations are frustrating and illustrate the need for a robust and well-designed spare parts management process. This begins with an asset criticality analysis and a failure mode and effects analysis (FMEA) that identify the critical assets and the parts you are going to set up in maintenance, repair and overhaul (MRO) stores. Other elements of spare parts management include a process for issuing and reordering parts, and an approval step for adding/deleting stock items. During the process of approval of spare parts recommendations for critical assets, one of the tasks performed is a critical spares analysis (CSA). Since these are usually also very expensive parts, a more in-depth economic evaluation is necessary. But first, let’s understand FMEA a little better so we understand the inputs to the CSA.
When your company starts the journey from reactive and urgency-based maintenance to reliability excellence, your decisions on what spare parts to keep on the shelf will no longer be arbitrary. Instead, the decisions will be guided by a risk-based strategy. Best-in-class plants usually start with a failure mode and effects analysis to identify and make recommendations to mitigate risks to production, maintenance costs, and environmental, health, and safety (EH&S) incidents. As each of these risks is analyzed, it is assigned a risk ranking or risk priority number (RPN). This risk assignment is based on the consequences (or severity) of the failure and the likelihood (or probability) that it will happen. A scale or matrix with 1-5, 1-10, or some other relative index, is often used to associate these risks with dollars, pounds of production, etc. The calibration of this scale or matrix is usually approved ahead of time by site leadership or corporate engineering.
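To make the ranking step concrete, here is a minimal sketch in Python. It assumes a 1-5 scale for both severity and likelihood and takes the ranking as their product, which is one common convention rather than the only one; some sites add a detection score or use a lookup matrix instead. The function name, the failure modes, and the scores are all hypothetical and stand in for whatever calibration your site leadership has approved.

```python
# Hypothetical 1-5 scales; a real site would use its own approved calibration.
def risk_ranking(severity: int, likelihood: int, scale_max: int = 5) -> int:
    """Return a relative risk ranking as severity multiplied by likelihood."""
    for name, value in (("severity", severity), ("likelihood", likelihood)):
        if not 1 <= value <= scale_max:
            raise ValueError(f"{name} must be between 1 and {scale_max}")
    return severity * likelihood


# Illustrative failure modes as (description, severity, likelihood).
failure_modes = [
    ("lube oil filter plugging", 2, 4),
    ("compressor thrust bearing failure", 5, 2),
    ("seal oil pump coupling failure", 3, 3),
]

# Sort so the highest-ranked risks are reviewed first.
ranked = sorted(
    failure_modes,
    key=lambda fm: risk_ranking(fm[1], fm[2]),
    reverse=True,
)

for description, severity, likelihood in ranked:
    print(f"{risk_ranking(severity, likelihood):>3}  {description}")
```

The sorted list is only a starting point for discussion. The ranking is a relative index, so it tells you which failure modes deserve attention first, not how many dollars are at stake.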
Stocking spare parts is a common recommendation from an FMEA to mitigate risks. Other well-known categories of recommendation include preventive maintenance, predictive maintenance or condition monitoring, calibrations, equipment redesigns, changes or additions to operations standard operating procedures (SOPs), or formal defect elimination studies.
As mentioned earlier, a best-in-class MRO stores management program will include a process for additions, as well as approvals to do so. This is facilitated by a stock request form. While many of the tasks associated with setting up MRO stores inventory, such as price, delivery, vendor source, etc., will be done by materials management and procurement, the reliability engineer should provide technical input for the justification to stock the part. Spare parts for critical assets are usually the most expensive ones, so the need for the part (the risk of not having it) must be weighed against its actual usage and the cost of stocking it. More importantly, this justification needs to be calculated in actual dollars and cents rather than resting on a subjective assessment that “it’s important.” MRO inventory costs can quickly get out of control if this is not done consistently. This alone could put you at a competitive disadvantage if your plant operates in a market where profit margins are small. This process is what we referred to earlier as the critical spares analysis.
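A dollars-and-cents comparison of this kind can be sketched in a few lines of Python. This is a simplified illustration, not a complete CSA: it assumes the benefit of stocking is the annual probability of needing the part times the production loss avoided by having it on hand, and that the cost of stocking is an annual carrying charge expressed as a fraction of the part’s price. The class name, field names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpareCandidate:
    """Hypothetical inputs for a simplified stock/no-stock comparison."""

    part_cost: float                   # purchase price of the spare, dollars
    loss_avoided_per_failure: float    # production loss avoided if spare is on hand
    annual_failure_probability: float  # chance per year that the part is needed
    carrying_cost_rate: float          # annual holding cost as a fraction of part cost

    def expected_annual_benefit(self) -> float:
        # Risk reduction per year, in dollars.
        return self.annual_failure_probability * self.loss_avoided_per_failure

    def annual_carrying_cost(self) -> float:
        # Cost per year of keeping the part on the shelf.
        return self.carrying_cost_rate * self.part_cost

    def is_justified(self) -> bool:
        # Stock the part only if the yearly benefit exceeds the yearly cost.
        return self.expected_annual_benefit() > self.annual_carrying_cost()


# Illustrative numbers only: an expensive part that is rarely needed.
candidate = SpareCandidate(
    part_cost=300_000,
    loss_avoided_per_failure=300_000,
    annual_failure_probability=0.05,
    carrying_cost_rate=0.20,
)

print(f"Expected annual benefit: ${candidate.expected_annual_benefit():,.0f}")
print(f"Annual carrying cost:    ${candidate.annual_carrying_cost():,.0f}")
print(f"Stocking justified:      {candidate.is_justified()}")
```

The sketch leaves out the purchase price as a capital outlay, the time value of money, shelf-life limits, and EH&S consequences. A real analysis would fold those in using figures agreed with accounting and procurement.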
One key piece of data the reliability engineer usually provides for the CSA is the risk ranking from the FMEA that was discussed earlier. This credible piece of information should surely provide the clout needed to justify keeping the spare part for a critical asset, right? Yet when the CSA is done, the results sometimes indicate that the cost to procure and carry the spare part in MRO stores inventory exceeds the justification for it. For example, we may be mitigating a production loss valued at $300,000, but the spare part may cost $300,000. I’ve oversimplified the analysis for this example, but the point is that we must decide how to reconcile this. Was the risk assessment in the FMEA erroneous? Would we really accept a $300,000 loss because the mitigation recommendation wasn’t economically justified? These were just some of the questions I had when I first encountered this corrective action item.
There are a couple of answers. First, when we did the risk assessment in the FMEA, we may not have had hard data on the consumption rate of parts to verify our prediction of failure likelihood. More understandably, we may not have accurately known what that spare part was going to cost when we made that recommendation. And to some degree, when we’re conducting an FMEA, we can subconsciously get in a mindset that any recommendation is better than no recommendation and cost sometimes gets forgotten. It’s easy to say the FMEA team should have made better estimates or not relied on a single recommendation to mitigate a failure mode. But if you’re doing a full-scale assessment of a world-scale facility, you may be doing hundreds of FMEAs on tens of thousands of assets and under pressure to complete them in a short time, not years. So it’s not always practical to stop and gather all of that detailed information. This is acceptable at the moment because a CSA is a second-pass look at high-value spare parts on critical assets.
Depending on the criticality of the asset, the CSA process considers several possible avenues. Let’s just say, for example, the CSA determines that you should not stock this item at your facility. One of the options is to consider a partnering arrangement with a supplier or vendor to keep the part on consignment in its inventory to avoid or reduce your inventory holding costs. This could be plausible if the vendor has many other customers who also use that part and, therefore, turnover rates justify keeping the part in its inventory. In any case, accounting and procurement should be included in this discussion to ensure best practices are followed with regard to inventory taxes and any regulations that may be involved.
However, if it’s a unique item or one with a long manufacturing lead time, stocking it is no more justifiable for the vendor than it is for you. There may be no incentive for the vendor to assume the financial burden of carrying the part. In this case, the reliability engineer could revisit the FMEA. The intent of this is not to work backwards from the answer and fudge the risk rankings until the justification comes out in the FMEA’s favor. Rather, a less expensive alternative recommendation might be found that could be implemented and still mitigate the failure mode adequately. Or an existing recommendation could be modified. Perhaps vibration monitoring or eddy-current testing frequency could be increased. An inexpensive redesign could be implemented, or an operations SOP could be revised that would reduce the likelihood of failure.
In conclusion, as you take control over spare parts management, implement risk-based asset strategies with spare parts recommendations and deliver fiscally responsible counter responses to critical spares analyses, you move around the roadblocks and arrive closer to your destination of reliability excellence.
Chris Endrai is a Reliability Engineering Subject Matter Expert with Life Cycle Engineering. He is based out of their Houston office and has over 20 years of direct, hands-on experience in petrochemical plant maintenance and reliability, including several world-class facilities that maintained 99%+ equipment availability. Chris holds a degree in mechanical engineering from Purdue University. www.lce.com