One of the time-proven ways to improve equipment and process reliability is to perform a Reliability Centered Maintenance (RCM) analysis on your critical assets. Developed in the 1970s as a tool to build a complete equipment maintenance strategy for commercial aircraft, the RCM process has transformed how companies around the world view and perform maintenance. From food and beverage, to oil and gas, to pharmaceuticals, customers are learning the value that a thorough RCM analysis can bring to their business, with results that have improved productivity by 27% while reducing maintenance costs by a similar amount [1]. The key results of an RCM analysis are failure mitigation tasks from which maintenance plans can be developed. RCM is a powerful tool that effectively integrates maintenance and reliability by focusing maintenance activities on four aspects of failure mitigation:
1. Elimination
2. Prevention
3. Prediction
4. Control (Plan to recover when an unexpected failure occurs)
One of the keys to getting the most from your RCM effort is having a cross-functional team that includes operations, maintenance, and reliability. It is imperative that these three groups work together closely, and develop a joint vision and strategy to address reliability issues. This strategy will include a holistic approach to failure mitigation and there will be spillover into improving work processes and communication channels. The team members representing these groups include: the tradespeople, process engineers, PdM (Predictive Maintenance) technicians, reliability engineers and the often overlooked equipment operators. Working together, this integrated team will provide a multifaceted perspective on the problem. By expanding the identification of failure modes and the corresponding failure mitigating tasks, a cross-functional team will develop a more robust solution than would a group that consists of only one specific work group. The integration of the operators into the RCM team will provide additional methods and expand the number of tasks available to address potential failures.
Figure 1 - Operations, Maintenance and Reliability must work together for any RCM effort to be successful.
The Role of Equipment Operator in RCM Analysis
Allied Reliability and Meridium have over 30 years of combined experience teaching and facilitating Reliability Centered Maintenance, and during these sessions the question instructors are most often asked is this: "The RCM process uses the word 'maintenance' and I'm an equipment operator. Why am I here?"
This question opens a critical discussion around Reliability Centered Maintenance: the importance of having operations deeply involved in the RCM process from start to finish. Simply stated, RCM does not achieve its full benefit without involving key operations people in the process. Completing an RCM analysis without having an experienced and respected equipment operator involved will always result in an incomplete listing of failure modes and in poorly detailed failure effect statements. Done correctly, those statements could help determine the actual cause and mitigate the failure.
So, how important is the operator when it comes to performing an RCM analysis?
Well, our experience has proven that it's not a best practice to conduct an RCM analysis without an experienced equipment operator. If you're thinking your operations people are not important in helping to develop your equipment maintenance strategy, think again. They are not only important in helping to develop the strategy; your strategy is incomplete if the operator is not a part of your equipment maintenance plan!
Let's think of an introductory RCM meeting as a train. This is the point where we can metaphorically see the engineer/operator of our train reach over and apply the hand brake. Just as things were beginning to roll along quite smoothly, we are about to come to a complete and abrupt stop. While the question regarding why an equipment operator should be involved in the RCM process is the most popular, the question that is most feared comes next: "You don't really expect the operator to actually perform maintenance tasks, do you?"
The room falls silent, and while it is now clear that we all understand why we need to have operators involved in the RCM analysis, it is not clear at all why operators need to perform maintenance tasks. In fact, what is clear at this moment is that the operator is not keen on performing any maintenance. He/she is, after all, an equipment operator, not a maintenance technician. And the maintenance people don't want the operator taking away any of their tasks, for fear that giving up certain tasks could cost someone their job. Our train has now screeched to a halt, and it is at this point we begin to discuss some real-life examples of equipment operators who are involved in, and perform, regular maintenance tasks.
The relevance of the operator in maintenance tasks can be illustrated by asking questions that the attendees can relate to: "Do you watch NASCAR? Who is the operator of the race car? Is he or she in any way involved in the maintenance of that car?"
These questions are usually answered with "the driver drives the car and the mechanics and pit crew maintain the car" but within a minute or two someone brings up the fact that the driver actually communicates several things to the race team throughout a race in regard to how the car is performing, and how it holds the track in and out of the turns. Could this information that is being passed from the operator to the race team be considered maintenance?
Here is another relevant example: "Have you ever flown on a jet for vacation or business? Who is the equipment operator of that jet? Does the pilot perform any maintenance tasks for his aircraft?"
This conversation is easy, as most everyone has seen the pilot walk around the aircraft and perform visual inspections of the equipment before the passengers board. Once the visual inspection is complete, the pilot and copilot work to complete the preflight procedures or checklists. Safety and reliability are extremely important to this industry, and as a result there are several items that need to be checked each time to ensure the aircraft is fit for operation. While the pilots are indeed the equipment operators, it is quite clear that they perform several important maintenance tasks each and every time they operate the equipment. The airline pilots also receive some very intensive training on how to start, operate, and shut down the aircraft along with detailed instructions and training on how to handle and address critical component failures.
Types of Failure Modes Often Overlooked When Operators Are Not Included on the RCM Team
The reality of why equipment operators need to be part of your RCM effort comes down to the identification of failure modes. While your maintenance mechanics and engineers can typically do a thorough job at identifying the mechanical and electrical failure modes associated with the components within your system, they will fall short in identifying the process-related failure modes that the operators deal with and experience in the day-to-day operation of your equipment. While we would like to believe that most of our equipment is automated and controlled through a PLC, the reality is that we often rely on the operator's training and experience in starting, operating, and shutting down the equipment and performing equipment and product changes. In reviewing failure modes identified through the typical RCM analysis, 28% of the failure modes identified address failures resulting from how we operate our assets. Equally important, these failures are often the most frequent in occurrence. Without the participation of the operator, these important failure modes would surely be overlooked and continue to impact the reliability of your assets.
The Value Operators Bring in Writing Good Failure Effect Statements
An important aspect of every analysis is identifying and recording the failure effects that result from each failure mode. It is through the failure effects that we determine the consequence to our business should the failure occur. While our maintenance and engineering team members are well suited to identify the effects the failure will have on the component or part being analyzed, they often struggle to identify the effect each failure has on our overall process. Each failure-effect statement should include the following information:
• Events that lead up to the failure, for wear-based components/parts and components/parts that have a useful P-F interval.
• First sign of evidence that the failure has occurred. In most cases, this is where the operator will be able to identify the alarm that results from the failure or the indication they receive that the failure has occurred.
• The secondary effects or damage that results from the failure mode. While, again, our maintenance people will do an outstanding job of identifying the mechanical and electrical collateral damage, we will need to rely on the operator's experience to identify how the process is affected.
• Events required to bring the process back to normal operating condition. While some may consider this to be a simple statement about shutting down, troubleshooting, and replacing or repairing a part, we will again rely on the operator to identify critical effects regarding how best to avoid secondary damage that could result from improper shutdown following the failure.
With a well-written failure-effect statement, we look for one more piece of critical information to help determine the consequence of the failure mode: the downtime that results from the failure mode. This data is best determined by the operator's input with regard to how long it takes from the time the failure occurs and the equipment is shut down to the time the process is restored to normal operating condition again.
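The elements of a failure-effect statement listed above, plus the downtime figure, can be captured as a simple record so the team can see at a glance which parts still need operator input. The following is a minimal sketch only; the class name, field names, and the pump example are assumptions made for illustration and do not come from any particular RCM tool.

```python
from dataclasses import dataclass


@dataclass
class FailureEffectStatement:
    """One failure-effect record. All field names are illustrative assumptions."""

    failure_mode: str
    events_leading_up: str   # what happens before failure (wear, P-F interval evidence)
    first_evidence: str      # alarm or indication the operator receives
    secondary_effects: str   # collateral damage and effect on the process
    recovery_events: str     # steps to return the process to normal operation
    downtime_hours: float    # from failure and shutdown to restored normal operation

    def missing_fields(self) -> list[str]:
        """Return the names of narrative fields that were left blank."""
        narrative = (
            "events_leading_up",
            "first_evidence",
            "secondary_effects",
            "recovery_events",
        )
        return [name for name in narrative if not getattr(self, name).strip()]

    def is_complete(self) -> bool:
        """Complete means every narrative field is filled in and downtime is positive."""
        return not self.missing_fields() and self.downtime_hours > 0


# Hypothetical example; the pump, the effects, and the hours are invented.
statement = FailureEffectStatement(
    failure_mode="Pump mechanical seal leaks",
    events_leading_up="Seal faces wear; drip rate rises over several weeks",
    first_evidence="",  # left blank until the operator supplies the alarm or indication
    secondary_effects="Product loss; bearing contamination",
    recovery_events="Controlled shutdown, isolate, replace seal, restart",
    downtime_hours=6.0,
)
print(statement.missing_fields())  # ['first_evidence']
```

A blank field in a record like this is a prompt for the facilitator to ask the operator on the team, since the first evidence of failure and the downtime are the items operators are best placed to supply.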
The Role of Operators in Sound RCM Task Decisions
As the team moves forward to identify mitigating tasks, the operator will again play a key role in helping to identify operator-care tasks, as well as tasks that can be performed by the operators during equipment start-up and shutdown, product changes, and daily rounds. Experienced operators are also valuable in the development of tasks associated with process monitoring or process verification, including control logic changes and process data trending, as well as warning and shutdown alarms. Process verification is one of the most cost-effective and reliable forms of PdM. By using the PLC to monitor, trend, and alarm key process variables such as pressure, temperature, flow, amp draw, and vibration, the system essentially becomes a condition-based monitoring tool that works while your equipment is running. Our statistics show that nearly 50% of process verification tasks are identified by the operators participating on the RCM team.
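The idea of monitoring, trending, and alarming a process variable can be sketched in a few lines. This is an illustration in Python rather than PLC logic, and it makes several assumptions: the limit values, the bearing temperature readings, and the function names are all hypothetical, and only high-side limits are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """High-side alarm limits for one process variable (values are assumptions)."""

    warning_high: float
    shutdown_high: float


def classify(value: float, limits: Limits) -> str:
    """Compare one reading against the warning and shutdown limits."""
    if value >= limits.shutdown_high:
        return "shutdown"
    if value >= limits.warning_high:
        return "warning"
    return "normal"


def trend_slope(readings: list[float]) -> float:
    """Least-squares slope of evenly spaced readings, in units per sample."""
    n = len(readings)
    if n < 2:
        return 0.0  # a single reading has no trend
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical bearing temperature readings in degrees C, oldest first.
bearing_temp_limits = Limits(warning_high=75.0, shutdown_high=85.0)
readings = [70.0, 72.0, 74.0, 76.0]

print(classify(readings[-1], bearing_temp_limits))  # warning
print(trend_slope(readings))
```

The slope matters as much as the alarm state: a steady rise toward the warning limit gives the team time to plan work before the alarm ever trips, which is where an operator's knowledge of what "normal" looks like helps set sensible limits.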
As time moves forward and technologies improve, the role and involvement of operators in the effort to continuously improve equipment reliability will surely increase. Innovations in handheld data loggers over the last 10 years have made it simple to develop precise operator rounds where critical pressures, temperatures, and conditions can be entered for trending as well as immediate feedback.
The Role of Operators in Implementing RCM Mitigation Tasks
Operators provide a unique opportunity to add significant amounts of content to the proactive maintenance work process. Operators know the operating parameters of their processes and equipment and recognize when something isn't running correctly. As mentioned earlier, using this knowledge in building failure-based maintenance plans will result in sound RCM tasks. Many of these tasks will be proactive inspections which can be integrated into operator rounds (inspections). These inspection tasks utilize the operator's senses and knowledge base to detect process disturbances and equipment problems that are much harder to capture using sensor technology. Due to the proactive nature of these inspections, critical data is obtained either prior to equipment damage occurring or very early in the failure process.
A key to the effective use of this mitigation strategy is the use of handheld technology to enable the inspection work process. Using this technology, the operator can efficiently perform the inspections identified in the RCM analysis. Where abnormalities are seen, he/she can input information to create alarms, which provide the basis for actions to be taken prior to the onset of performance degradation and equipment damage. The effectiveness of this strategy is leveraged by seamlessly linking the operator findings (alarms) into the maintenance work management systems. The work process may create work requests directly through an Asset Performance Management (APM) system, or there may be an intermediate step where the alarm is reviewed before a work request is processed. This enabling technology, which links the output of the RCM analysis to the work management system, is a distinguishing feature of advanced RCM systems.
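The two routes described above, a work request created directly from an operator alarm or one created only after review, can be sketched as follows. This is a hypothetical illustration and not the interface of any APM or work management product; the class names, the function name, and the example asset are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """One entry from an operator round (field names are assumptions)."""

    asset: str
    observation: str
    abnormal: bool


@dataclass
class WorkRequest:
    """A request handed to the work management system (hypothetical shape)."""

    asset: str
    description: str


def route_finding(
    finding: Finding, require_review: bool, approved: bool = False
) -> Optional[WorkRequest]:
    """Turn an abnormal finding into a work request, optionally gated by review."""
    if not finding.abnormal:
        return None  # normal readings are kept for trending only
    if require_review and not approved:
        return None  # alarm waits in the review queue
    return WorkRequest(asset=finding.asset, description=finding.observation)


# Hypothetical finding entered on a handheld during a round.
finding = Finding(asset="P-101", observation="Seal drip rate above normal", abnormal=True)

direct = route_finding(finding, require_review=False)
pending = route_finding(finding, require_review=True)
reviewed = route_finding(finding, require_review=True, approved=True)
print(direct)
print(pending)   # None
print(reviewed)
```

Whether to gate alarms behind a review step is a site decision: direct creation is faster, while a review step filters out duplicate or low-value requests before they reach the planners.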
Results of Operator Development of RCM Tasks
The inclusion of the operators in the development of the failure mitigating tasks can have a dramatic effect on the makeup of the proactive maintenance plan. A breakdown of proactive tasks by task type is shown in Figure 2 [2]. This plot suggests that 60% of the tasks resulting from failure modes analysis (the heart of RCM) are proactive in nature. Figure 3 [3] indicates that 60% of the resulting tasks belong to operations. Had operations not been included in the failure mode analysis and the development of RCM task decisions, that 60% of the proactive maintenance tasks might not have been included in the maintenance plan. Further, much of the content is specifically focused on proactive tasks, a significant upgrade to the overall quality of the maintenance plan that will be manifested in the results realized from execution of the plan.
Importance of Integrated Information Flow in RCM
The ability to obtain key information concerning operations, failures, failure modes, PMs, etc. is important in the RCM process. This ability to get information directly impacts the productivity of the RCM team. By way of example, consider that each piece of critical equipment will have a work history, failure information, maintenance tasks, etc., all of which is helpful information in the RCM analysis. In addition, there is condition monitoring data, preventive maintenance data, process data and operator inspection data, which can provide insight into identifying effective failure mitigation tasks.
The overall information flow supporting the RCM process is the centerpiece to improving efficiency. Further, the ability to access this information from a single portal, which also contains the RCM analysis, will improve documentation and the ability to review/upgrade the RCM results in the future. The information of interest to the RCM team comes from a number of disparate sources. As can be seen in Figure 4, an RCM platform which integrates this information provides the RCM team with the tools to improve their productivity significantly.
Expanding the Role of Operations to Leverage the RCM Process
RCM is an elegant and proven process that addresses failure modes by identifying causes and developing mitigation tasks. Earlier it was pointed out that an RCM analysis has improved productivity by as much as 27%. While this return is impressive, consideration should be given to how to leverage these results further. Implementation of the RCM process can be disruptive to the workforce, as it will introduce many changes to their daily work patterns and processes. Leadership is therefore one area to consider when looking to leverage the RCM process further. There is a need for strong leadership to help the workforce manage the disruption introduced by a new and proactive maintenance approach until the new processes become integrated into the fabric of everyday work life.
Operations is uniquely positioned to have a significant impact on the success of a program by leading this change, and a key part of this effort will involve communication. It is generally accepted that, with a significant change in the work processes, communication is critical: there can never be too much communication during change.
The operators on the team are in a position to greatly impact the success of the RCM effort through positive influence of their peer group. Through communication of team activities, ensuring stakeholder involvement and improving work processes, the powerful RCM solution can be leveraged to deliver much more value. The operator(s) chosen for the RCM team should not only be knowledgeable of equipment failures, but also have the ability to influence their peer group. This ability is important to the communication process, vetting operator routes, creating short-term wins, and making the changes sustainable. The operators will be the interface point between the crews and the RCM team to:
1. Communicate activities, strategies, and progress of the RCM team.
2. Manage all crew interface such as meetings, presentations, and demonstrations.
3. Demonstrate operator routes and duties, get feedback on them, and optimize the proactive tasks based on peer feedback.
4. Report progress, performance (measures) and status to crews.
5. Manage the stakeholders' acceptance, involvement and support of the RCM change through interaction with the operators, coordinators, and maintenance.
6. Provide an important link to the long-term sustainability of the program.
All of these activities are, in one way or another, related to communication. Of these steps, one of the most important is stakeholder management. The operator will know the personnel, culture, and needs of their peer group. He/she will also be uniquely positioned to support sustainability. The sustainability of the program must be addressed as a first step in the project and will require a focus on the work processes between groups, especially the interface points between maintenance and operations. Inefficient work processes cause communications problems between groups which result in ineffective actions, wasted resources, and interpersonal problems, as well as further degradation of communications and a downward spiral in the ability to sustain the process. Operators can play a key role in overcoming these roadblocks to sustainability.
Conclusions
Many reliability and maintenance solutions have been implemented in manufacturing plants in recent years with varied success. The RCM process is a robust technical solution that has proven to deliver significant value in many applications. Technology and information flow must be viewed as a work process enabler, and as a necessary step to execute failure-driven maintenance strategies efficiently. Through improved information management, integration of work processes, and leadership in the change process, the technical solution can be leveraged to deliver significantly more value than a stand-alone technical implementation. The results of this approach have proven to far exceed those that are seen from many traditional R&M programs. In addition, the ability to leverage the technical solution through communication and management of key stakeholders has been discussed. This approach will provide higher returns and faster implementation, and it is key to the sustainability of the RCM program.
Douglas Plucknette is the creator of the RCM Blitz™ method, author of the book Reliability Centered Maintenance - Using RCM Blitz™, and RCM Discipline Leader for GPAllied. Well-known on the conference and lecture circuit, Doug has over 30 years of experience in the field of maintenance and reliability. Over the past 15 years, Doug has facilitated hundreds of RCM analyses and trained certified RCM Blitz™ Facilitators for companies around the world.
Paul Casto is Vice President of Value Implementation at Meridium, Inc. Paul is a leading practitioner in reliability and maintenance improvement methodologies with a focus on helping Meridium clients with value creation and realization. He gained hands-on manufacturing plant experience in reliability, maintenance, operations, engineering, and construction in a variety of industries including chemical, steel, aluminum, automotive, mining, aerospace, and consumer goods. Before joining Meridium, Paul served as the Manager of Reliability Technology for Eastman Chemical. Paul holds a BS in Electrical Engineering from West Virginia University, a Masters in Engineering Management from Marshall University Graduate College, an MBA from Clemson University, and a Masters in Maintenance Management and Reliability Engineering from Monash University in Gippsland, Australia. He is currently pursuing an MS in Applied Statistics and a PhD in Industrial Engineering at the University of Tennessee. Paul is an ASQ certified Six Sigma Black Belt, holds ASQ certifications in Reliability Engineering and Quality Engineering, and is an SMRP Certified Maintenance and Reliability Professional. He has served on the advisory board of the University of Tennessee's Maintenance and Reliability Center and is an active member of ASQ and IEEE.
1. Paul Arnold, RCM Blitz™ at Whirlpool, http://www.maintenanceworld.com/Articles/arnoldp/whirledclass.html
2. Kathy Light and Steve Powers, "Managing Change in a Major Reliability Improvement Effort", MARCON 2010 Proceedings (2010), presentation slide 15.
3. Ibid.