PM Optimisation Maintenance Analysis of the Future - Part 1

by Steve Turner

Introduction

Maintenance is one of the largest controllable operating costs in capital intensive industries. It is also a critical business function that impacts on commercial risk, plant output, product quality, production cost, safety, and environmental performance. For these reasons, best practice organisations regard maintenance not simply as a cost to be avoided but, together with reliability engineering, as a high leverage business function. It is considered a valuable business partner contributing to asset capability and continuous improvement in asset performance.

The dilemma that many of us face (and mostly not of our own doing), is that we are managers in organisations which barely have sufficient resources to keep the plant working, let alone find ways of improving reliability. When this is the case, scarce maintenance resources are rationed and breakdowns consume resources first. Preventive maintenance suffers, which inevitably results in more breakdowns and the cycle continues.

In addition to lost productivity through unplanned maintenance, the "fix-it-quickly" mentality promotes "band aid maintenance", or temporary repairs, that often exacerbate the situation. Temporary repairs take additional labour to correct or, in the worst case, fail before correction.

Often in an effort to control costs, personnel numbers are reduced and morale declines as the fewer remaining personnel almost give up in despair. With this, work standards drop.

The vicious cycle feeds on itself and gradually organisations become almost entirely reactive. This situation is depicted in Figure 1.

In such organisations, it seems that plant availability drops until it stabilises at a low level - a level where the plant is not breaking down because it is not running; i.e. it is being repaired!

For many, the obvious solution is to seek to increase personnel numbers. However, this approach is not often the best. In today's economic climate, management culture is mostly focussed on cost reduction, and managers seeking only to increase staff numbers rarely succeed.

Today many Asset Managers are embarking on improvement programs that focus on improving the maintenance processes and increasing the effectiveness or productivity of asset and human resources. Improving maintenance processes involves process re-engineering and increasing resource effectiveness in the following ways:

Removing all maintenance tasks that serve no purpose or are not cost effective.
Eliminating any duplication of effort where different groups are performing the same PM to the same equipment.
Moving to a mostly condition based maintenance philosophy.
Adding maintenance tasks to manage economically preventable failure modes² that historically have been run to failure.
Spreading the workload around the trades and operators.

The long-term vision is to adopt such a process systematically, so that it remains a "living program" that captures the benefits of future learning and technical advances on a continuing basis.

The methodology used to address the vicious cycle of reactive maintenance has been developed over a period of five years with the co-operation of several of Australia's most notable asset intensive industries.

The program is endorsed by SIRFrt and is the preferred maintenance analysis method of one of the world's largest mining companies. The methodology, training programs and software package are known as PMO2000. Further information can be found at www.pmoptimisation.com.

Aim

This paper is written in three sections.

Section 1. PM Optimisation - Maintenance Analysis of the Future

The aim of Section 1 is to describe the process of PM Optimisation with specific reference to the PMO2000 methodology. This section also aims to demonstrate the origins of many of the problems that today's asset managers face and how PM Optimisation can help organisations improve their asset management.

Section 2. Comparing PMO and RCM methods of Maintenance Analysis

Section 2 of the paper aims to explain how PMO and RCM are quite different processes - RCM being a process developed for the design phase of the asset life cycle and PMO being developed for assets that have been commissioned. The paper demonstrates how PMO achieves the same analysis outcomes as RCM but at one sixth of the cost and six times the speed.

Section 2 of the paper also aims to provide a defence against the series of emotive and misleading articles written by RCM consultants in an attempt to discredit any maintenance analysis process that does not comply with SAE JA1011, titled "Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes".

Section 3. Understanding Statistical Methods of Maintenance Analysis

Section 3 provides a brief insight into the strengths and weaknesses of statistical packages which provide a means of maintenance analysis.

The Origins of Maintenance Problems

The Design and Commissioning Phase

Maintenance engineers commonly deal with the result of someone else's design - whether good or bad. When design is finished, construction starts and finishes, and the plant is commissioned. The Maintenance Engineer arrives some way through this (if he is lucky). Quickly he finds himself left with a maintenance budget being used to finish off construction / over-expenditure, a plant that is going through teething problems, spares arriving in dribs and drabs and little information about plant failure modes and the effect of failure. Rarely is the plant delivered to the maintenance department with a comprehensive and well-documented maintenance requirements analysis and a maintenance plan.

What happens in best practice organisations is that, amongst other things, a fully documented RCM based maintenance program is developed through the design phase. Unfortunately in the vast majority of capital projects in industry, any reliability engineering or failure analysis is done in an informal manner and certainly not provided to the maintenance department for use in developing asset management strategies and policies.

Post-Commissioning

After commissioning, (or sometimes before) the design team disbands and its members find work on new projects. The Maintenance Engineer is left to second guess the design intent, the plant limitations, the potential failure modes, and the likely consequences of them. The operations people are, at the same time, learning how to operate the plant and experimenting with it; pushing it to its limits and occasionally well over its design intent. There is limited money or time to change obvious design or maintainability problems in the new plant.

The task of defining the plant maintenance policy³ is a priority but a most daunting one. Whatever is achieved is done in a rush, often using people in an opportunistic manner. The problems that emerge right from the beginning will be as follows:

There is no consistency of analysis philosophy.
Maintenance personnel, being risk averse, write maintenance policies which over-service and use overhaul or intrusive methods as a means of prevention - often to the detriment of reliability rather than for its good.⁴
There is no audit trail, and only those who wrote the policies know their rationale. It becomes near impossible to review the program and objectively assess its effectiveness.

Full Production

When the plant swings into full operation and breaks down, more maintenance tasks are created and some existing tasks are done more frequently. Many of these new tasks duplicate others. Often, in an attempt to be seen to be doing something about high profile reliability problems, maintenance personnel create and perform tasks that are supposed to prevent the failures but, in reality, serve no useful purpose.

Soon the preventive maintenance (PM) requirements exceed the labour resource available. PM is missed, preventable failures occur and unplanned maintenance work consumes more labour than necessary. The number of temporary repairs grows out of control, and revisiting them or repairing the additional damage they cause wastes more resources.

The vicious circle of breakdown maintenance, temporary repair, and reduced PM gains momentum and becomes well entrenched.

Management Consultants (often with a cost reduction focus) arrive on site and cut staff numbers and budgets. This serves only to tighten the vicious circle and increase the rpm. The end result is typically a large morale problem for the maintenance department and a poorly performing plant.

Many organisations have tried to regain control by using RCM to develop their maintenance program. This is often a pursuit with limited scope and a high failure rate. This is because RCM is highly inefficient when used as a rationalisation tool. It consumes excessive amounts of the most valuable resources on site - those being the scarce maintenance and operations personnel.

A large element of the inefficiency of RCM is that it does not acknowledge the experience and value of the current maintenance program. It starts from scratch and builds a maintenance program from the function down.

The high failure rate of RCM amongst mature operations is not surprising when it is realised that RCM was developed by Nowlan and Heap (Nowlan and Heap, 1978) for use in the design phase of the equipment life cycle (Moubray, 1997). It was not designed for use in mature industries as a rationalisation tool.

Improvement Tactics

The DuPont Experience - Four Common Strategies

In this predicament, case studies and experience suggest that, outside of cultural and behavioural initiatives, asset managers should be focussing on a few key areas.

They must:

Develop focussed maintenance policies;
Improve planning and scheduling based on the revised policies; and
Focus on defect elimination.

The DuPont model of Up-Time featured in the Manufacturing Game⁵ illustrates these points very well. The table below illustrates how DuPont has modelled the relative effect of various strategies on plant uptime.

DuPont analysis suggests that if companies focus on planning only they will improve their uptime by 0.5%. If they focus only on maintenance scheduling, uptime will improve by 0.8%. If they focus on preventive and predictive maintenance only, uptime will actually get worse by 2.4%. If organisations focus on all of these three aspects, they will receive a 5.1% improvement in availability.

These results may well sound appealing in their own right, but DuPont has found that by adding defect elimination to the initiatives undertaken, a 14.8% improvement in availability may be achieved in their plants. This information is provided in the table at Figure 2.

Figure 2: Table showing the effect of different reliability engineering activities on plant availability taken from the Manufacturing Game - SIRFrt. http://www.manufacturinggame.com
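
The figures quoted above can be set side by side in a short sketch. The numbers are those given in the text; the labels and the idea of summing the three single-strategy figures are only illustrative assumptions, used here to show that the combined result is not simply the sum of the parts.

```python
# Uptime changes (percent) quoted in the text for each strategy.
# The dictionary keys are descriptive labels chosen for this sketch.
UPTIME_CHANGE = {
    "planning only": 0.5,
    "scheduling only": 0.8,
    "preventive/predictive only": -2.4,
    "all three combined": 5.1,
    "all three plus defect elimination": 14.8,
}


def sum_of_single_strategies(changes):
    """Add up the three single-strategy figures as if they were independent."""
    singles = ("planning only", "scheduling only", "preventive/predictive only")
    return round(sum(changes[name] for name in singles), 1)


naive_total = sum_of_single_strategies(UPTIME_CHANGE)
combined = UPTIME_CHANGE["all three combined"]
with_defect_elimination = UPTIME_CHANGE["all three plus defect elimination"]

for name, change in UPTIME_CHANGE.items():
    print(f"{name:<36}{change:+.1f}%")
print(f"sum of the three single strategies: {naive_total:+.1f}%")
print(f"extra gain from defect elimination: {with_defect_elimination - combined:+.1f}%")
```

Read this way, the three strategies pursued in isolation would add up to a small loss, whereas pursued together they are reported to yield a gain, and defect elimination accounts for the largest single step.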

Problems with most PM Programs

The common problem with mature maintenance programs that were never designed correctly in the first place is that between 40% and 60% of the PM tasks serve very little purpose (Moubray, 1997). The findings of many PMO reviews are that:

Many tasks duplicate other tasks.
Some tasks are done too often (and some too late).
Some tasks serve no purpose whatsoever.
Many tasks will be intrusive and overhaul based whereas they should be condition based.
There are many costly preventable failures happening.

This poses a significant issue for improving productivity as no amount of perfect planning and scheduling will make up for the inefficiencies of the maintenance program itself. Achieving 100% compliance with a program that is 50% useful and 50% wasteful cannot be good asset management!
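
The arithmetic behind this point can be shown with a minimal sketch. It assumes, purely for illustration, that useful and wasteful tasks are completed in the same proportion and that the annual workload is 1000 hours; neither figure comes from a real site.

```python
def value_adding_fraction(compliance, useful_fraction):
    """Share of the planned PM effort that is both completed and useful."""
    if not (0.0 <= compliance <= 1.0 and 0.0 <= useful_fraction <= 1.0):
        raise ValueError("both inputs must be between 0 and 1")
    return compliance * useful_fraction


planned_hours = 1000  # hypothetical annual PM workload
compliance = 1.0      # every scheduled task is completed
useful = 0.5          # half of the program serves a purpose

share = value_adding_fraction(compliance, useful)
print(f"value-adding hours: {planned_hours * share:.0f}")
print(f"wasted hours:       {planned_hours * compliance * (1 - useful):.0f}")
```

Even with perfect schedule compliance, half of the labour in this example produces no benefit, which is why the content of the program has to be addressed before its execution.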

The DuPont analysis indicates that a process must be implemented that:

Can define the appropriate mix of preventive and predictive maintenance;
Can produce a maintenance program where the servicing intervals and the tasks themselves are sound and value adding in every case; and
Can eliminate, through other means, defects which cannot be maintained out of the plant.

What is suggested, as a fundamental building block in adopting all these strategies, is to ensure all work undertaken is based on RCM decision logic. PMO is a means of rationalising all the Preventive Maintenance work to ensure that all the work adds value and there are no duplications of effort. Figure 3 illustrates this.

PMO 2000 from A to Z

Overview

The PMO 2000 process has nine steps. These steps are listed below and discussed in the following pages.

Step 1 Task Compilation
Step 2 Failure Mode Analysis
Step 3 Rationalisation and FMA Review
Step 4 Functional Analysis (Optional)
Step 5 Consequence Evaluation
Step 6 Maintenance Policy Determination
Step 7 Grouping and Review
Step 8 Approval and Implementation
Step 9 Living Program

Project Ranking

It should be noted that, for a full PMO 2000 assignment, there may need to be some form of criticality or system ranking process. This may be done by reviewing the equipment hierarchy or work schedule hierarchy,⁶ and subdividing it into appropriate systems and/or equipment items or trade filtered maintenance schedules for analysis. Having performed this task, the criticality of each project identified is assessed in terms of its contribution to the client organisation's strategic objectives. Higher criticality systems tend to be those that:

Have a high perceived risk in terms of achieving safety or environmental objectives,
Have a significant impact on plant throughput, operating or maintenance costs, or
Are consuming excessive labour to operate and maintain.

Once the criticality assessment has been conducted, it is used as the basis for deciding which projects should be analysed first, and the overall level of rigour required for each analysis.
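
A ranking of this kind might be sketched as follows. The three scoring criteria mirror the list above, but the 1 to 5 scale, the unweighted sum and the example projects are all assumptions made for illustration; the text does not prescribe a scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    safety_env_risk: int   # 1 (low) to 5 (high), hypothetical scale
    throughput_cost: int   # impact on throughput, operating or maintenance cost
    labour_intensity: int  # labour consumed to operate and maintain


def criticality(p: Project) -> int:
    """Unweighted sum of the three scores; the weighting is an assumption."""
    return p.safety_env_risk + p.throughput_cost + p.labour_intensity


def rank(projects):
    """Return projects ordered from most to least critical."""
    return sorted(projects, key=criticality, reverse=True)


projects = [
    Project("Conveyor system", 2, 4, 3),
    Project("Boiler feed pumps", 5, 4, 2),
    Project("Lube rounds", 1, 2, 4),
]

for p in rank(projects):
    print(f"{criticality(p):>2}  {p.name}")
```

The ordering gives the sequence of analysis; the score itself could also be used to decide how much rigour each project warrants.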

Figure 4 illustrates the sources of PM programs.

Step 1 - Task Compilation

PM Optimisation starts by collecting or documenting the existing maintenance program (formal or informal) and loading it into a database via a spreadsheet. It is important to realise that maintenance is performed by a wide cross section of people including operators. It is also important to realise that in many organisations, most of the Preventive Maintenance program is done on the initiative of the tradesmen or operators and is not documented formally. In this situation, task compilation is a simple matter of writing down what the people are doing. It is common for organisations to have an informal PM system in operation whilst it is rare for an organisation to have no PM at all.
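
As a rough illustration of loading a task list from a spreadsheet into a database, the sketch below reads CSV text into an in-memory SQLite table using only the Python standard library. The column names, the table name `pm_task` and the sample tasks are hypothetical; this is not the PMO2000 software or its data model.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export; in practice this would be a CSV file on disk.
SPREADSHEET = """task,equipment,performed_by,interval_days,documented
Check gearbox oil level,Conveyor CV01,Operator,7,no
Inspect drive belt tension,Conveyor CV01,Fitter,30,yes
Thermograph motor terminals,Conveyor CV01,Electrician,90,yes
"""


def load_tasks(csv_text, conn):
    """Create the task table and load every spreadsheet row into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pm_task ("
        "id INTEGER PRIMARY KEY, task TEXT, equipment TEXT, "
        "performed_by TEXT, interval_days INTEGER, documented TEXT)"
    )
    rows = [
        (r["task"], r["equipment"], r["performed_by"],
         int(r["interval_days"]), r["documented"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany(
        "INSERT INTO pm_task "
        "(task, equipment, performed_by, interval_days, documented) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_tasks(SPREADSHEET, conn)
informal = conn.execute(
    "SELECT COUNT(*) FROM pm_task WHERE documented = 'no'"
).fetchone()[0]
print(f"{loaded} tasks loaded, {informal} previously undocumented")
```

Recording who performs each task and whether it was formally documented keeps the informal operator and tradesman tasks visible alongside the scheduled ones.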

Step 2 - Failure Mode Analysis

Step 2 involves people from the shop floor working in cross-functional teams identifying what failure mode(s) each maintenance task (or inspection) is meant to address.

Figure 5 illustrates the output of Step 2.
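
The output of this step can be thought of as a list of tasks, each tagged with the failure mode it is meant to address. The sketch below uses hypothetical tasks and failure modes, and then inverts the list so that failure modes covered by more than one task stand out. Treating those as candidates for review follows from the duplication problem described earlier in the paper; it is an assumption of this sketch rather than a description of the PMO2000 software.

```python
from collections import defaultdict

# (task, trade, failure mode) triples; all entries are hypothetical examples.
TASK_FAILURE_MODES = [
    ("Check gearbox oil level", "Operator",
     "Gearbox fails due to low oil level"),
    ("Inspect gearbox for leaks", "Fitter",
     "Gearbox fails due to low oil level"),
    ("Inspect drive belt tension", "Fitter",
     "Drive belt slips due to wear"),
    ("Feel motor bearing temperature", "Operator",
     "Motor bearing seizes due to lack of lubrication"),
    ("Vibration survey of motor", "Condition monitoring",
     "Motor bearing seizes due to lack of lubrication"),
]


def tasks_by_failure_mode(triples):
    """Invert the task list so each failure mode lists the tasks addressing it."""
    grouped = defaultdict(list)
    for task, trade, failure_mode in triples:
        grouped[failure_mode].append((task, trade))
    return dict(grouped)


def possible_duplicates(grouped):
    """Failure modes covered by more than one task are candidates for review."""
    return {fm: tasks for fm, tasks in grouped.items() if len(tasks) > 1}


grouped = tasks_by_failure_mode(TASK_FAILURE_MODES)
for failure_mode, tasks in possible_duplicates(grouped).items():
    print(failure_mode)
    for task, trade in tasks:
        print(f"  - {task} ({trade})")
```

Two tasks against the same failure mode are not necessarily redundant, so the result is a prompt for the team's judgement, not an automatic deletion list.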

Step 3 - Rationalisation and Failure Mode Review

Step 4 - Functional Analysis

The functions lost due to each failure mode can be established in this step. This task is optional, and may be justified for analyses on highly critical or very complex equipment items, where sound understanding of all the equipment functions is an essential part of ensuring a comprehensive maintenance program. For less critical items, or simple systems, identifying all of the functions of an equipment item adds cost and time, but yields no benefits. Figure 7 illustrates Step 4.

Step 5 - Consequence Evaluation

In Step 5, each failure mode is analysed to determine whether the failure is hidden or evident. For evident failures a further determination of hazard or operational consequence is made.

Figure 8 illustrates Step 5.
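
The two-stage decision described for Step 5 can be written as a small function. The category names and the boolean inputs are naming choices for this sketch; the failure modes in the usage example are hypothetical.

```python
from enum import Enum


class Consequence(Enum):
    HIDDEN = "hidden"
    HAZARD = "evident, hazard"
    OPERATIONAL = "evident, operational"


def evaluate_consequence(evident: bool, poses_hazard: bool) -> Consequence:
    """Classify a failure mode: hidden first, then hazard or operational."""
    if not evident:
        return Consequence.HIDDEN
    if poses_hazard:
        return Consequence.HAZARD
    return Consequence.OPERATIONAL


examples = {
    "Standby pump fails to start": evaluate_consequence(False, False),
    "Guard falls off conveyor": evaluate_consequence(True, True),
    "Drive belt slips": evaluate_consequence(True, False),
}
for failure_mode, consequence in examples.items():
    print(f"{failure_mode}: {consequence.value}")
```

The hazard question is only asked for evident failures, which matches the order of the determinations in the text.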

Step 6 - Maintenance Policy Determination

Modern maintenance philosophy stems from the premise that successful maintenance programs have more to do with the consequences of failures than the asset itself.

In this step, each failure mode is analysed using Reliability Centered Maintenance (RCM) principles. This step establishes new or revised maintenance policies. During this step the following become evident:

The elements of the current maintenance program that are cost effective, and those that are not (and need to be eliminated),
What tasks would be more effective and less costly if they were condition based rather than overhaul based and vice versa,
What tasks serve no purpose and need to be removed from the program,
What tasks would be more effective if they were done at different frequencies,
What failures would be better managed by using simpler or more advanced technology,
What data should be collected to be able to predict equipment life more accurately, and
What defects should be eliminated by root cause analysis.

Figure 9 illustrates Step 6.
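
One of the judgements made in this step, whether a task is cost effective, can be illustrated with a simple annual cost comparison. This is a sketch of the economic test only, in line with the footnote that some failure modes cost more to prevent than to let fail; it says nothing about hazard consequences, and all costs, intervals and failure rates below are hypothetical.

```python
def annual_pm_cost(task_cost, interval_days):
    """Yearly cost of doing a task at a fixed interval."""
    return task_cost * 365.0 / interval_days


def annual_failure_cost(failure_cost, failures_per_year):
    """Expected yearly cost of failures at a given rate."""
    return failure_cost * failures_per_year


def pm_is_cost_effective(task_cost, interval_days, failure_cost,
                         failures_per_year, residual_failures_per_year=0.0):
    """Compare PM cost plus remaining failures against run-to-failure cost."""
    with_pm = (annual_pm_cost(task_cost, interval_days)
               + annual_failure_cost(failure_cost, residual_failures_per_year))
    without_pm = annual_failure_cost(failure_cost, failures_per_year)
    return with_pm < without_pm


# Cheap monthly inspection against an expensive failure every two years.
worth_doing = pm_is_cost_effective(task_cost=50, interval_days=30,
                                   failure_cost=20000, failures_per_year=0.5)

# Costly weekly overhaul against a cheap failure once a year.
not_worth_doing = pm_is_cost_effective(task_cost=400, interval_days=7,
                                       failure_cost=2000, failures_per_year=1.0)

print(worth_doing, not_worth_doing)
```

The same comparison can be rerun with a different interval or a cheaper condition based task, which is how the questions in the list above about frequency and task type might be explored numerically.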

Step 7 - Grouping and Review

Once task analysis has been completed, the team establishes the most efficient and effective method for managing the maintenance of the asset given local production factors and other constraints. In this step it is likely that tasks will be transferred between trades and operations people for efficiency and productivity gains.
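
Grouping can be pictured as collecting the revised tasks into schedules by who performs them and how often. The sketch below assumes that a schedule is defined by that pair alone; in practice local production factors and other constraints would also shape the grouping. The tasks and intervals are hypothetical.

```python
from collections import defaultdict

# (task, assigned to, interval in days); hypothetical output of the analysis.
TASKS = [
    ("Check gearbox oil level", "Operator", 7),
    ("Listen for abnormal bearing noise", "Operator", 7),
    ("Inspect drive belt tension", "Fitter", 28),
    ("Inspect guards and fasteners", "Fitter", 28),
    ("Thermograph motor terminals", "Electrician", 84),
]


def group_into_schedules(tasks):
    """Collect tasks that share the same assignee and interval."""
    grouped = defaultdict(list)
    for description, assigned_to, interval_days in tasks:
        grouped[(assigned_to, interval_days)].append(description)
    return dict(grouped)


schedules = group_into_schedules(TASKS)
for (assigned_to, interval_days), items in sorted(schedules.items()):
    print(f"{assigned_to}, every {interval_days} days:")
    for description in items:
        print(f"  - {description}")
```

Moving a task between trades and operators is then a matter of changing its assignee and regrouping.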

Step 8 - Approval and Implementation

In Step 8, the analysis is communicated to local stakeholders for review and comment. The group often does this via a presentation and an automatic report generated from the PM Optimisation software. The report details all the changes and the justification for each.

Following approval, the most important aspect of PMO 2000 then commences with implementation. Implementation is the step that is most time consuming and most likely to face difficulties. Strong leadership and attention to detail are required to be successful in this step. The difficulty of this step increases markedly with more shifts and also with organisations that have not experienced much change.

Step 9 - Living Program

Through Steps 1 to 8, the PM Optimisation process has established a framework of rational and cost effective PM. In the "Living Program", the PM program is consolidated and the plant is brought under control. This occurs as reactive maintenance is replaced by planned maintenance. From this point improvement can be easily accelerated as resources are freed to focus on plant design defects or inherent operational limitations.

During this step, several vital processes for the efficient management of assets can be devised or fine tuned as the rate of improvement accelerates.

These processes include the following:
Production / maintenance strategy,
Performance measurement,
Failure history reporting and defect elimination,
Planning and scheduling,
Spares assessing, and
Workshop and maintenance practices.

In this step the intention is to create an organisation that constantly seeks to improve its methods by continued appraisal of every task it undertakes and every unplanned failure that occurs. To achieve this requires a program where the workforce is adequately trained in analysis techniques and is encouraged to change practices to improve their own job satisfaction and to reduce the unit cost of production.

Implementing a Successful PM Optimisation Program

Selling Maintenance as a Process rather than a Department

Change programs are not easy to implement particularly when an organisation has entered the vicious circle of maintenance.

The author's experience is that in most cases there need to be some fundamental shifts in behaviour and motives at all levels across the organisation. This almost invariably involves modifying the behaviour and decision-making priorities of middle managers too. Above all, there needs to be a commitment to the long term; if there need to be some short-term losses, these will often be well worth it if the investment in the future generates returns quickly.

The most important aspects of managing a PM Optimisation change program to break the vicious cycle are described in the following paragraphs:

Choose projects that do not focus solely on one aspect

There needs to be a combination of projects that are likely to result in:

increased uptime, and
reduced labour requirements.

In many cases, this means tackling reliability problems in the process bottleneck as well as looking at maintenance intensive item categories⁷ that are prolific on site.

The reasons for tackling projects that deliver labour productivity, rather than only machine uptime, are that:

First line supervisors will contribute to the program if they see that there is a return on investment in labour terms. A target of returning five days of labour for every one invested should be the lowest acceptable limit.
Returns on labour productivity are compounding (they can be reinvested in more productivity) whereas uptime improvements are finite.

Collect data about the before and after case

Collecting data about plant reliability achieves many things. The two major benefits are as follows:

Steering the analysis in the area of opportunity, and
Providing the basis for the project teams to demonstrate the value of the work that has been done.
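
A before and after comparison might be summarised as in the sketch below, which also checks the labour return against the five-to-one target mentioned earlier. The metric names, the snapshot values and the number of days invested are hypothetical, and the choice of measurement period is left to the project team.

```python
def summarise(before, after, days_invested, target_ratio=5.0):
    """Compare two snapshots and test the labour return against the target."""
    labour_before = before["pm_days"] + before["breakdown_days"]
    labour_after = after["pm_days"] + after["breakdown_days"]
    days_saved = labour_before - labour_after
    ratio = days_saved / days_invested
    return {
        "uptime_gain_pct": round(after["uptime_pct"] - before["uptime_pct"], 1),
        "labour_days_saved": days_saved,
        "return_ratio": round(ratio, 1),
        "meets_target": ratio >= target_ratio,
    }


# Hypothetical snapshots for one system over equal periods.
before = {"uptime_pct": 82.0, "pm_days": 400, "breakdown_days": 600}
after = {"uptime_pct": 88.5, "pm_days": 350, "breakdown_days": 380}

result = summarise(before, after, days_invested=45)
for key, value in result.items():
    print(f"{key}: {value}")
```

Without the "before" snapshot none of these figures can be produced, which is the practical reason for collecting the data at the start of the project.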

Create cross functional teams from the shop floor

PM Optimisation is not a back office function of statistical perfection. It is an empirically based process of considering preventive maintenance options, and task rationalisation. Involving the people who will be left to do the work is constructive in gaining commitment to make the changes happen. Leaving them out of the analysis creates barriers to implementation.

Integrate operations and maintenance work management systems

In redistributing the workload, it is important that the various systems of maintenance scheduling come from the same origin or database. In most organisations this is not the case as the operations department has a system that works in isolation from the trades groups.

Implement outcomes as quickly as possible

There is a temptation to celebrate project success after the analysis and move on to new projects leaving implementation to drift and become poorly managed. This is a very bad outcome because the project has consumed scarce resources and wasted them. Without successful implementation, the work has created a cost and the workforce expectations have not been met. The workforce will correctly blame the middle management and participation in future projects will be more difficult to obtain.

Dysfunctional Organisational Structures

The organisational structure of most capital intensive industries can be described as being departmentalised with maintenance and operations having separate budgets, performance measures, and management structures. There are advantages associated with departmentalised organisational structures; however, such structures often lose efficiency through:

Conflicting goals and objectives of each department that sometimes result in decisions which are not congruent with the overall business goals. The most common being short-term production goals that often clash with the maintenance objective of reducing the overall cost of maintenance.
Duplication of effort with many departments attempting to achieve the same result but in isolation of each other. Electrical, mechanical and production PM schedules commonly fall into this category with each department checking the same machine for the same failure modes.
An overly bureaucratic decision making and approval process at all levels. This is often a result of conflicting objectives between managers.
Excessive demarcation of roles and responsibilities. Though becoming less prevalent, the inability to take responsibility for certain work due to past traditions prevents efficient use of resources at many sites.
A proliferation of independent systems and databases. The most common instance is where operations and maintenance personnel work through their own logbooks and records that are kept independently of the CMMS.
The process of defect elimination is seen largely as an engineering pursuit, whereas problems often have multiple contributing factors and must be solved by cross functional teams. Many factors are not necessarily obvious, and many arise from shop floor people taking practical measures to combat other problems, measures that have secondary effects elsewhere.

Conclusion

There are a number of contributing factors to the difficulties faced by the modern asset manager. To be effective at making changes to the performance of a maintenance function, the asset manager should understand how these factors have arisen, how they impact on the business performance and how they can be effectively tackled. There is a way out of the vicious cycle of maintenance and the Optimisation or Rationalisation of Maintenance tasks is a fundamental strategy in this process.

To break the vicious cycle of maintenance, asset managers should focus on the areas of preventive maintenance and defect elimination. To improve preventive maintenance, organisations must shift to an environment where there is no duplication of PM effort, every PM task serves a purpose, and all PM tasks are completed at the right interval and with the right mix of condition based maintenance and overhaul.

Organisations that have mature PM programs and are struggling to complete them should take steps to rationalise what they have, rather than embarking on a "green field" approach such as the traditional approaches to RCM.

There are many statistically based maintenance analysis tools available; however, users should be careful in their choice. They should be mindful that they could be spending a lot of money on expensive packages and consuming a lot of time collecting vague data which, after years of effort, produces a meaningless result.

Be sure to read Part 2: PM Optimisation Maintenance Analysis of the Future

² Care must be taken as some failure modes cost more to prevent than the cost of failure itself.

³ A maintenance policy is the combination of what is to be done, how frequently and by whom.

⁴ Particularly if the maintenance is intrusive.

⁵ The Manufacturing Game is a practical learning process where participants learn, in an interactive environment, the strategies which will best enhance plant uptime. The game is available through SIRF Roundtable. Information is readily available at http://www.manufacturinggame.com.

⁶ PMO can be conducted on any trade schedule such as lube rounds or operator rounds and can also be used to analyse the PM needs of a major shut down.

⁷ For example, fans, gearboxes, conveyors, machine tooling, pumps, electric motors, instrumentation - where there could be hundreds of similar items. Saving labour on one item multiplies across the whole of the site.

---

About the Author

The author, Mr Steve Turner, is a professional engineer who has been extensively trained in RCM methods and has deployed them over a 20-year period in various roles: as an airworthiness engineer, as a maintenance manager, as part of a design team and as a consultant. Over the past five years, Mr Turner has developed a process of PM Optimisation known as PMO2000.