The Maintenance Function

Webster's dictionary defines maintenance as:


  • To maintain
  • Keep in existing condition
  • Preserve, protect
  • Keep from failure or decline

The ultimate goal of maintenance is to provide optimal reliability that meets the business needs of the company. Many people do not know the definition of reliability, which is:

"The probability or duration of failure-free performance under stated conditions"

Now that we understand that maintenance maintains reliability, let's see how a proactive maintenance process works. John Day, one of the most well-known advocates of proactive maintenance management and formerly the maintenance and engineering manager for Alumax Mt Holly, was my mentor and manager for a number of years. John has spoken all over the world about his model of proactive maintenance. I want to share what a successful plant considered the definition of the word "Maintenance" to be, a definition I align with closely as it relates to the maintenance process.

John Day titled his vision and process to maintenance as:



Alumax of South Carolina is an aluminum smelter that produces in excess of 180,000 MT of primary aluminum each year. It began operation in 1980 after a 2-year construction phase. The plant is the last greenfield aluminum smelter constructed in the U.S. Alumax of SC is a part of Alumax, Inc., which has headquarters in Norcross, Georgia, a suburb of Atlanta. Alumax, Inc. is the third largest producer of primary aluminum in the U.S. and the fourth largest in North America.

The vision of general management was that the new smelter located on the Mt. Holly Plantation near Charleston, SC, would begin operations with a planned maintenance system that could be developed into a total proactive system. At the time, in 1978-79, there were no maintenance computer systems available on the market with the capability to support and accomplish the desired objectives. Thus TSW of Atlanta, Georgia was brought on site not only to take the Alumax of S.C. maintenance concepts and develop a computer system, but also to integrate all the plant business functions into one on-line common data base system available to all employees in the normal performance of their duties.

Since the development and initial operation of the Alumax of SC maintenance management system, it has matured and rendered impressive results. These results have received extensive recognition on a national and international level. The first major recognition came in 1984 when Plant Engineering magazine published a feature article about the system. Then in 1987 A.T. Kearney, an international management consultant headquartered in Chicago, performed a study to find the best maintenance operations in North America. Alumax of S.C. was selected as one of the seven "Best of the Best". And in 1989, Maintenance Technology magazine recognized Alumax of SC as the best maintenance operation in the U.S. within its category and also as the best overall maintenance operation in any category. Mt Holly's proactive model is shown below in figure 1.


From a basic point of view there are two maintenance approaches. One approach is reactive and the other is proactive. In practice there are many combinations of the basic approaches.

The reactive system (see Figure 1.1.1) responds to a work request or identified need, usually production identified, and depends on rapid response measures to be effective. The goals of this approach are to reduce response time to a minimum (the computer helps) and to reduce equipment downtime to an acceptable level. This is the approach used by most operations today. It may well incorporate what is termed a preventative maintenance program and may use proactive technologies.

Fig 1.1.1

Figure 1.1.1. Reactive Maintenance Model

The proactive approach (see figure 1.1.2) responds primarily to equipment assessment and predictive procedures. The overwhelming majority of corrective, preventative, and modification work is generated internally in the maintenance function as a result of inspections and predictive procedures. The goals of this method are continuous equipment performance to established specifications, maintenance of productive capacity, and continuous improvement. Alumax of SC practices the proactive method. The comments which follow are based upon the experience and results of pursuing this vision of maintenance.

Fig 1.1.2

Figure 1.1.2. Mt Holly's Proactive Maintenance Model


Alumax of SC began development of the maintenance management concept with the idea that maintenance would be planned and managed in a way that provides an efficient, continuously operating facility at all times. Add to this that maintenance would also be treated as an investment rather than a cost, and you have the comprehensive philosophy on which the maintenance management system was built. An investment is expected to show a positive return, and so should maintenance be expected to improve the profitability of an operation. The management philosophy for maintenance is just as important as the philosophy established for any business operation. For most industries, maintenance is a supervised function at best, with little real cost control. But it must be a managed function employing the best methods and systems available to produce results that have a positive effect on profitability.

The development of a philosophy to support the concept of proactive planned maintenance is important. It is believed that many maintenance management deficiencies or failures have resulted from having poorly constructed philosophies or the reliance upon procedures, systems, or popular programs that have no real philosophical basis.


Today there is little disagreement that the function and control system of a good maintenance management program must be computer based.

Using the philosophy that maintenance management is to be considered in the same way that all other business functions are considered, it is difficult to justify any approach other than complete integration of maintenance management functions with total organizational management functions. The computer is the tool to use to accomplish this difficult and complex task.

The computer, in an integrated operation, must be available for use by every member of the maintenance organization as well as all other plant employees who have a need. It is an essential part of the maintenance employee's resources for accomplishing his work. It is just as important to a mechanic or electrician as the tools in his toolbox or the analysis and measurement instruments that he uses daily.

The computer must supply meaningful and useful information to the user as opposed to normal computer data.

A successful integration of data systems will tie together maintenance, warehouse, purchasing, accounting, engineering, and production in such a way that all parties must work together and have the use of each other's information. This is part of the answer to the question being asked almost universally: how do you break down the barriers between departments and get them to work as part of the whole, or as a team? The computer system must be on line, available, and time responsive. A batch system or semi-batch system will not provide the support needed for a dynamic, integrated maintenance management system.

In the integrated system with a common data base, data is entered only once and immediately updates all other files so that it is immediately available to all functional areas. This means that anyone in any functional area can use or look at data in any other area, unless it is restricted. Some have referred to this as the "fish bowl effect" since everything is visible to all. This stimulates cooperation; in fact, it dictates cooperation.


Everyone knows what maintenance is; or at least they have their own customized definition of maintenance. If the question is asked, words like fix, restore, replace, recondition, patch, rebuild, and rejuvenate will be repeated. And to some extent there is a place for these words or functions in defining maintenance. However, to key the definition of maintenance to these words or functions is to miss the mark in understanding maintenance, especially if you wish to explore the philosophical nature of the subject. Maintenance is the act of maintaining. The basis for maintaining is to keep, preserve, and protect. That is, to keep in an existing state or preserve from failure or decline. There is a lot of difference between the thoughts contained in this definition and the words and functions normally recalled by most people who are "knowledgeable" of the maintenance function; i.e., fix, restore, replace, recondition, etc.


If we shift our defining thoughts to maintenance in the pure sense, we force ourselves to deal with keeping, preserving, and protecting. But what are we to keep, protect, or preserve? You may think that it is the machine, equipment, or plant, and that is true. But how are you to define the level to which the machine, equipment, or plant is to be kept? One way would be to say - "keep it like new". At face value the concept sounds good, but it is more subjective than objective. The answer to maintenance levels must be defined by a specification.

A specification is a detailed precise presentation of that which is required. We must have a specification for the maintenance of equipment and plant. In actual usage today the specification, if it exists, is not detailed or precise. A specification usually does exist informally in the mind of the mechanic or management member even though they may be unable to recite it. This means that at best, it is a variable, general-type specification. This kind of specification is defined in terms of and is dependent upon time available, personnel training level, pressure to produce a current order now, money allocated or available, or management opinion. Obviously, a specification like this will not qualify as a true specification, nor will it qualify as a supporting component of the act of maintaining. The true maintenance specification may be a vendor specification, a design specification, or an internally developed specification. The specification must be precise and objective in its requirements. The maintenance system and organization must be designed to support a concept based on rational specifications. Detailed work plans and schedules may be constructed to provide the specification requirement at the maintenance level. In the maintaining context, the specification is not a goal. It is a requirement that must be met. The maintenance system must be designed to meet this requirement. The specification must be accepted as the "floor" or minimum acceptable maintenance level. Variation that does occur should be above the specification level or floor. The specifications will probably be stated in terms of attributes and capacity.

In reference to maintenance specifications, included are individual equipment specifications, process specifications, and plant performance specifications.


The maintenance department is responsible and accountable for maintenance. It is responsible for the way equipment runs and looks and for the costs to achieve the required level of performance. This is not to say that the operator has no responsibility for the use of equipment when in his hands - he does. The point is that responsibility and accountability must be assigned to a single function or person, whether it be a mechanic or operator. To split responsibility between maintenance and any other department, so that overlapping responsibility occurs, is to establish an operation where no one is accountable. Alumax of SC considers this a fundamental principle for effective operation of maintenance.

The maintenance function is responsible for the frequency and level of maintenance. They are responsible for the costs to maintain, which requires development of detailed budgets and control of costs to these budgets.

Just as the quality function in an organization should report to the top manager, so should the maintenance function, for the same obvious reasons. This allows maintenance problems to be dealt with in the best interest of the plant or company as a whole. Maintenance efforts and costs must not be manipulated as a means for another department to achieve its desired cost results.

Where the maintenance department or group is held responsible and accountable for maintenance, the relationship with other departments takes on new meaning. The maintenance department can't afford to have adversary relationships with others. They must have credibility and trust as the basis of interdepartmental relationships. This is an essential element for the successful operation of a maintenance management system.

Fig 1.1.3


The organizational chart or better yet the organizational graphic (Figure 1.1.3) is constructed on the basis that the central functional element for core maintenance is the Technical team. The relational (syntax) aspects of the organization are shown with concentric bands of teams. The nearer band of teams represents the tighter relationship to the core teams. Radial connecting lines show a direct relationship to a team or band of teams. Concentric connecting lines show a more indirect relationship between teams. The outer band of teams requires a RELATIONAL ORGANIZATIONAL CHART similar to the maintenance teams chart to define their close relationships and full relationship to other plant teams. This particular chart is predicated on the relationship of all teams to central core maintenance teams.

Technical Teams - Core Maintenance - These teams perform core maintenance for the plant. They are composed of qualified electricians, mechanics, and technicians. The teams are assigned based on a functional requirement plant wide or on the basis of a geographic area of responsibility. The focus, direction of the team, and individual team member needs are provided by an assigned member of the facilitator and directional control team.

Facilitator and Directional Control Team - Members of this team have been trained and qualified to provide team organizational dynamics and traditional supervisory functions as required. With the facilitator, the team must address work performance by categories, administrating, training/safety/housekeeping, budgeting and cost control and information reporting as well as the technical requirements of the team. These members perform the necessary traditional supervisory functions, especially related to personnel functions, for the technical teams.

Work Distribution and Project Coordination Team - This team works with the Facilitator, Planning and Engineering teams to staff technical teams to meet work load requests, inventory requirements, contractor support, and field superintendence of engineering projects.

Job Planning Team - This team works closely with the Technical teams and the Facilitator team to plan and schedule maintenance, overhaul, and contractor work. Where operators are doing maintenance functions, the same applies. In addition, information and reports are prepared by this team for all other teams as required or requested. Quality control of the data input is a responsibility of this team. Coordination of production requirements must also be performed.

Technical Assistance Team - This team is a resource to the Technical teams and Facilitator team for continuous improvements, modifications, trouble shooting, and corrective action.

Materials Support Team - This team works with the Planning team, Facilitator team, and the Technical teams to meet planned job requirements and emergency material requirements.

Maintenance Management Team - This team provides overall coordination of maintenance and material functions to meet the plant capacity requirement. Overview of budget and cost control is also provided.

User/Operator Maintenance Team - This is a team of designated operators who perform assigned and scheduled maintenance work. They must be selected, trained and qualified prior to being assigned to this team.

Plant Engineering Team - This team provides project management for the Plant capital budget program. They provide consulting and trouble shooting to the Technical Teams on an as-requested basis.

Other teams in the outer band of the organizational chart must be specifically defined by individual relational organization charts.

For each of the above teams, a detailed performance requirement document must be developed. Individual team members are guided by a specific job performance document. These documents detail the vision, mission, processes used, and strategies employed.

Does the maintenance function provide a service or produce a product? Again, definition is important in the development of this part of the philosophy. Service is defined as a useful labor that does not produce a tangible commodity. A product is something that is produced, usually tangible, but definitely measurable. In the case of the maintenance function and the development of this philosophy, both a service and a product are considered as an output of maintenance. The current thinking which is related to traditional maintenance (reactive maintenance) suggests that the maintenance function is for the most part a service function. But the philosophy being developed here considers the maintenance function as the provider of a product with a small but limited service component. Consider the product produced by maintenance to be capacity (Production/Plant capacity). Writers on the subject of maintenance have suggested this concept in the past, but little has been made of developing the idea to date. A predominantly service approach to maintenance, as is currently practiced, is a reactive mode of operation, and is typical of most maintenance operations today. React means respond to a stimulus. Most maintenance operations today are designed to respond to the stimulus of breakdown and the work order request, except for small efforts related to preventative maintenance and predictive maintenance, usually less than 25% of man-hours worked. This simply means that the maintenance function must be notified (stimulated) of a problem or service requirement by some means, usually by someone outside of the maintenance organization; then maintenance reacts. Rapid response is the "score card" of this system.

It is being suggested by this proactive philosophy that the maintenance function be addressed as the producer of the product-capacity. Capacity is measured in units of production or output (or up time). A total proactive system must specifically be designed to produce capacity (product). If the maintenance function is to be classified as proactive, it cannot stand by and wait for someone to call or make a request. In a total proactive approach, maintenance must be responsible and accountable for the capacity and capability of all equipment and facilities. The function must provide a facility and equipment that performs to specification and produces the product (capacity). Stated again, the maintenance function is a process that produces capacity which is the product. See Table #2 for a more detailed analysis of service vs capacity.

The results of this model created a benchmark that hundreds of companies have followed and that many continue to adopt. In figure 1.1.4 you will clearly see the "World Class Benchmarks" of Alumax, Mt Holly.

Fig 1.1.4

Figure 1.1.4.

Companies that have adopted John Day's philosophy and strategy have achieved results beyond what was thought possible within the company. One of the many successful companies was a large manufacturing company. Once senior management understood and adopted John's philosophy and approach, the results were:

1. Plant capacity increased by $12 million in the first year
2. A large capital project was deferred because the capacity it was to provide was found as part of what is called the "hidden factory"
3. The need to hire a projected 12 additional maintenance staff members was eliminated
4. The plant maintenance staff was reduced by 20% over the following three years through attrition

The approach to proactive maintenance is not magic. Implementing the process is very difficult, but the results are worth the effort. In order to develop a true proactive maintenance process, a company must have commitment from senior management through to floor-level personnel and the discipline to follow known "best practices" which have been proven to work.

What is Reliability?

Most maintenance professionals are intimidated by the word "reliability". Why? Most people simply associate reliability with RCM (Reliability Centered Maintenance) and are unclear on what it actually means. The definition is simple:

Reliability is the ability of an item to perform a required function under a stated set of conditions for a stated period of time.
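One way to put a number on this definition is sketched below. It assumes a constant failure rate (the exponential model), under which the probability of failure-free performance over a stated period t is exp(-t / MTBF). That model, the function name `reliability`, and the example figures are illustrative assumptions; the definition itself does not prescribe them.

```python
import math


def reliability(period_hours: float, mtbf_hours: float) -> float:
    """Probability of failure-free performance over period_hours.

    Assumes a constant failure rate (exponential model), which is an
    illustrative simplification and not part of the definition above.
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if period_hours < 0:
        raise ValueError("period_hours must be non-negative")
    return math.exp(-period_hours / mtbf_hours)


# Hypothetical asset: MTBF of 1,000 hours, stated period of 100 hours.
print(f"{reliability(100, 1000):.3f}")  # prints 0.905
```

Under these assumed figures the asset has roughly a 90% chance of running the full 100 hours without failure under the stated conditions.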

The definition of reliability is not at all intimidating, is it? Many companies focus on fixing equipment when it has already failed - not on ensuring reliability, and avoiding failure.

A common reason for this finding is that there is no time available to investigate the true requirements to ensure the reliability of equipment. Yet, there is a growing awareness among these reactive maintenance organizations of the consequences of poor equipment performance. These consequences include:

  • Higher maintenance costs
  • Increasing equipment failures
  • Asset availability problems, and
  • Safety and environmental impacts

Companies who operate in a reactive maintenance mode need to face the facts - there is NO 'silver bullet' to solve the complex problem of poor equipment performance. Upper level management has traditionally viewed Lean Manufacturing or World Class Manufacturing as the answer. Yet these strategies don't directly address the true target of optimal asset reliability. Forget the "silver bullet" and focus on asset reliability. The results will follow.

Companies Who Get It
Let's call this company XYZ Corporation. This corporation was fighting an uphill battle to survive against foreign competition, an aging workshop, and many other issues. Their CEO (Chief Executive Officer) decided reliability would be their focus because maintenance is the largest controllable cost in an organization, and without sound reliability of their assets, losses multiply because reliability affects cost in so many areas. This corporation established a dedicated team of over 50 key employees, and over a two-year period they researched the world's best maintenance organizations. This team assimilated all the "best practices" they found in the world and implemented them in a disciplined, structured environment. They found that focusing on reliability had the biggest return with the longest-lasting results. Today this corporation is one of the top producers in its industry worldwide.

Corporations like XYZ Corporation who truly understand reliability typically have the best performing plants. What are some of the common characteristics of a "reliability-focused" organization? They take a holistic approach to asset management, focusing on people and culture. Common characteristics found are:

  • Their goal is optimal asset health at an optimal cost.
  • They focus on processes - what people are doing to achieve results.
  • They measure the effectiveness of each step in the process, in addition to measuring results.
  • Preventive maintenance programs focus mainly on monitoring and managing asset health.
  • Preventive maintenance programs are technically sound, with each task linked to a specific failure mode - formal practices and tools are used to identify the work required to ensure reliability.

Moving toward proactive work
Many companies focus their entire maintenance efforts on a PM program that has little to do with meeting the actual reliability needs of the equipment. When these companies are asked why a particular PM task is done, you will typically hear: "This is the way we've always done it".
Many companies attempt to use statistical analysis to improve reliability. Statistical analysis techniques such as Weibull Analysis only help to identify assets where reliability is a problem. Do you really need to spend your valuable engineering resources to figure out that your MTBF is too small? Statistical analysis is really most helpful in setting PM frequencies for time-based PMs, which should account for a very small portion of your PMs. Besides, it's usually very easy to point to the pieces of equipment that are bad actors. Forget the statistical analysis - rather than measuring failure frequency, figure out how to improve reliability.

The facts
Here are some sobering facts that will make you think twice about the effectiveness of a time-based PM program:

1. Less than 20% of asset failures are age related. So for the 20% of failures that are age related how can you identify the frequency of their Preventive Maintenance activities? Ask yourself this question:

Do I have good data to determine this frequency?

If your answer is yes, then this would mean most asset failures have been correctly documented and coded in the CMMS/EAM. My finding is that 98% of companies do not have good failure history data.

2. Most reliability studies show that over 80% of asset failures are random. So we ask ourselves:

How do you prevent random failure?

In many cases, it is possible to detect early signs of random failure by monitoring the right health indicators for the asset to determine where the asset is on the degradation curve. In simple terms, how much has it degraded and how long will it be before I lose the intended function of the asset? This approach allows time to take the corrective action, in a scheduled and proactive manner - before the functional failure occurs!

Let's take this statement a step further. Preventive Maintenance for random failures must usually focus on the health of the asset (through monitoring indicators such as temperature, tolerance, vibration, etc.) in order to determine where an asset is on the degradation or PF Curve (figure 1.1.5). Point "P" is the first point at which we can detect degradation. Point "F", the new definition of failure, is the point at which the asset fails to perform at the required functional level. In the past, we defined "Failure" as the point at which the equipment broke down. You can see points P and F and the two different definitions of failure in the graphic below.

Fig 1.1.5

Figure 1.1.5. PF Curve

The amount of time that elapses between the detection of a potential failure (P) and its deterioration to functional failure (F) is known as the P-F interval. A maintenance organization needs to know the PF Curve on critical equipment in order to maintain reliability at the level required to meet the needs of the plant. Without this knowledge how can one truly understand how to manage the reliability of the asset?
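As a rough illustration of how a P-F interval might be estimated from condition-monitoring history, the sketch below finds when a health indicator first reaches a detection limit (P) and a functional limit (F). The function names, the limits, and the vibration readings are all hypothetical. The sketch also assumes the indicator rises as the asset degrades and that linear interpolation between readings is acceptable.

```python
from typing import Optional, Sequence, Tuple

Reading = Tuple[float, float]  # (elapsed hours, indicator value)


def first_crossing(readings: Sequence[Reading], limit: float) -> Optional[float]:
    """Time at which the indicator first reaches `limit`, or None if it never does.

    Assumes readings are sorted by time and that a higher value means more
    degradation. Interpolates linearly between the two neighbouring readings.
    """
    prev: Optional[Reading] = None
    for t, value in readings:
        if value >= limit:
            if prev is None:
                return t
            t0, v0 = prev  # v0 < limit <= value, so value - v0 > 0
            return t0 + (limit - v0) * (t - t0) / (value - v0)
        prev = (t, value)
    return None


def pf_interval(readings: Sequence[Reading], p_limit: float, f_limit: float) -> Optional[float]:
    """Elapsed time between potential failure (P) and functional failure (F)."""
    p_time = first_crossing(readings, p_limit)
    f_time = first_crossing(readings, f_limit)
    if p_time is None or f_time is None:
        return None
    return f_time - p_time


# Hypothetical vibration history in mm/s, sampled every 100 operating hours.
history = [(0, 2.0), (100, 2.5), (200, 3.5), (300, 5.0), (400, 8.0)]
interval = pf_interval(history, p_limit=3.0, f_limit=7.0)
print(f"P-F interval: {interval:.1f} hours")
```

Whatever the estimate, the inspection interval has to be shorter than the P-F interval; otherwise an asset could move from P to F between two consecutive inspections without being caught.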

The Barriers
Let's first look at a few barriers which prevent a plant from obtaining a higher level of reliability of its assets.

1. Most maintenance and production departments only understand that a failure means the equipment is broken. A true failure of an asset is when it no longer meets the function required of it at some known rate or standard.

Example: A conveyor is supposed to operate at 200 meters per minute, so when its speed no longer meets this requirement it has functionally failed, causing an immediate loss of revenue for the company.

    2. Maintenance does not get involved when quality or production rate issues arise in the plant. In most cases when an asset has functionally failed in a plant no one in maintenance seems to understand the equipment has failed.
    3. Most maintenance departments do not know the performance targets of the plant equipment and do not understand why it is important that they understand them. This is not a failure of the maintenance department but a breakdown in how the total plant is aligned to meet its goals.

    Overcoming all three of these barriers is essential to rapid improvement in reliability. If all plant personnel understand and focus on functional failure, rapid results will follow in the form of higher asset reliability. The focus must be on aligning the total plant on meeting the performance targets of each asset. These performance targets and current performance rates need to be posted so everyone is aware if a gap occurs in asset performance. Production and maintenance must both know when an asset has functionally failed (is no longer meeting its performance target) and is probably resulting in lost revenue. We must understand this is a production and management problem, and both organizations must accept responsibility for actions to mitigate the performance losses.

    One requirement a company must meet in order to have a rapid breakthrough in performance is that it must define what a failure truly is:

    Old definition of failure (typically used in reactive companies): The equipment is broken or stopped. A good example is a conveyor that has stopped because of a mechanical problem.

    New definition of failure (typically used in proactive companies): The equipment is no longer performing the function required by its user.

    Examples would be:

    Partial Functional Failure Example: A conveyor is supposed to operate at 200 meters per minute; however, because of a problem it can only run at 160 meters per minute.
    Total Functional Failure Example: A conveyor has stopped because of a mechanical problem.

    The function of the example conveyor is:
    1. To transfer a product from point A to point B
    2. To transport product at a speed of 200 meters per minute from point A to Point B
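The distinction between meeting the target, partial functional failure, and total functional failure can be sketched as a simple check of the actual rate against the performance target. The names `FunctionalState`, `classify`, and `capacity_gap_percent` are hypothetical, and treating any rate below target as a partial failure (with no tolerance band) is an assumption made for illustration.

```python
from enum import Enum


class FunctionalState(Enum):
    MEETING_TARGET = "meeting target"
    PARTIAL_FAILURE = "partial functional failure"
    TOTAL_FAILURE = "total functional failure"


def classify(actual_rate: float, target_rate: float) -> FunctionalState:
    """Classify an asset by comparing its actual rate to its performance target.

    Assumption: any rate below target counts as a partial functional failure;
    a real plant might allow a tolerance band around the target.
    """
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    if actual_rate <= 0:
        return FunctionalState.TOTAL_FAILURE
    if actual_rate < target_rate:
        return FunctionalState.PARTIAL_FAILURE
    return FunctionalState.MEETING_TARGET


def capacity_gap_percent(actual_rate: float, target_rate: float) -> float:
    """Shortfall against the target, as a percentage of the target."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return max(0.0, (target_rate - actual_rate) * 100.0 / target_rate)


# The conveyor from the example: target 200 m/min, currently running at 160 m/min.
print(classify(160, 200).value)                 # partial functional failure
print(f"{capacity_gap_percent(160, 200):.0f}%")  # 20%
print(classify(0, 200).value)                   # total functional failure
```

Under the old definition only the last case would register as a failure; under the new definition the 20% shortfall is already a failure that should engage maintenance.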

    Example of a new way to view failure

    After all we have reviewed thus far, my question to you is: "In your plant, is there any equipment operating below defined performance targets, and when there is, is maintenance engaged immediately?" You could have what is called "the hidden plant," and by focusing on equipment performance targets you could rapidly increase the reliability of your assets.

    Have you ever heard the saying "it is what you don't know that kills you"? This statement is true in reliability. Follow my advice and you will see a rapid breakthrough in plant performance, but you must know this is just the beginning of a long journey. Do not sit back and be satisfied with the reliability results you gained by following my advice. A plant must now apply RCM (Reliability Centered Maintenance) methodology to meet the goal of "optimal reliability at optimal cost". I did not say use RCM. I said RCM methodology, which could be RCM II, Streamlined RCM, FMEA, or MTA. Be careful which methodology you use if you want rapid performance.

    The Bottom Line
    A company must take a step back and review the way it manages equipment performance. If equipment continues to fail after performing preventive maintenance or overhauls, then clearly a change is needed. The focus must be on ensuring the reliability of plant assets. As a starting point, everyone in a plant should understand the definition of reliability and what it means to the success of the company. Make "Reliability" your plant's collective buzzword.

    Excerpted from Rules of Thumb for Maintenance and Reliability Engineers by Ricky Smith and R. Keith Mobley
