In continuing the Uptime series on the 10 components of a successful vibration program, this article explores three more components: the right understanding, the right analysis and the right reporting.
Read Part 1: Right Goals
Read Part 2: Right People and Right Leadership
Read Part 3: Right Goals and Right Follow-Up and Review
Read Part 4: Right Tools and Right Data
Figure 1: 10 components of a condition monitoring program
Right Understanding
Right understanding is about knowing the equipment and understanding how it fails. If you do not understand how the equipment fails, then you cannot come up with an appropriate strategy to maintain it. When it comes to maintenance strategies, you generally have four options:
- Redesign the asset to remove the failure mode;
- Condition-based maintenance (CBM): if the machine gives you an indication that it is entering a failure mode, you can monitor that indicator;
- Preventive or time-based maintenance (PM): if the component fails after a known amount of time, you can replace it before that time;
- Run-to-failure maintenance (RTF): if the consequences of failure are very low, you can simply let the component fail.
Of these four options, the first is by far the best, but it is not always practical. It is always better to remove the root cause of a problem than to continuously fight the symptoms. Just think of the airline industry. Every time a plane crashes, the root cause is determined and steps are taken to ensure it never happens again. But as creatures of habit, people are more inclined to keep tripping over the same crease in the carpet than to bend down and straighten it out.
For condition-based maintenance, one needs to consider the different failure modes and how they present themselves. Condition monitoring (CM) is based on the idea that machines tell you when they begin to fail. They can tell you this in a variety of ways, such as by vibrating differently, making different sounds, changing temperature, changing how electricity flows through them, changing pressure, etc. These are called indicators of a change in condition. It is necessary to understand the variety of indicators a machine presents for different failure modes in addition to understanding the failure modes themselves. The monitoring technology you choose (right tools) and the tests you perform (right data collection) are based on the indicator(s) you wish to measure.
You need to know how quickly the failure modes progress in order to know how frequently to take measurements. For example, a turbine with a large journal bearing can go from perfect operation to catastrophic failure in a matter of seconds; therefore, a continuous monitoring protection system is required. By contrast, a centrifugal pump operating in a clean environment will give the first signs of bearing wear a year or more before the bearing actually fails, so monthly or quarterly tests are adequate.
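The link between warning time and test frequency can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the idea of wanting a fixed number of readings (`readings_wanted`) between the first sign of a fault and the failure, and the `shortest_route_interval_days` cutoff, are hypothetical rules of thumb that are not part of this article. Set them to suit your own program.

```python
SECONDS_PER_DAY = 86_400.0


def measurement_interval_days(warning_time_days: float,
                              readings_wanted: int = 4) -> float:
    """Longest test interval that still gives `readings_wanted` readings
    between the first sign of a fault and the failure itself.

    `readings_wanted` is an assumed rule of thumb, not a standard value.
    """
    if warning_time_days <= 0 or readings_wanted < 1:
        raise ValueError("warning time must be positive and readings_wanted >= 1")
    return warning_time_days / readings_wanted


def monitoring_approach(warning_time_days: float,
                        shortest_route_interval_days: float = 7.0,
                        readings_wanted: int = 4) -> str:
    """Suggest continuous monitoring when manual routes cannot keep up.

    `shortest_route_interval_days` is the assumed shortest interval at
    which a technician can realistically revisit the machine.
    """
    interval = measurement_interval_days(warning_time_days, readings_wanted)
    if interval < shortest_route_interval_days:
        return "continuous monitoring"
    return f"route-based test every {interval:.0f} days or less"


# Pump bearing wear that shows up about a year before failure.
pump = monitoring_approach(365.0)

# Turbine journal bearing that can fail within about ten seconds.
turbine = monitoring_approach(10.0 / SECONDS_PER_DAY)

print(pump)
print(turbine)
```

With these assumed settings, the pump lands on roughly quarterly testing and the turbine on continuous monitoring, which matches the two examples above.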
Different indicators will appear at different times. For instance, a bearing will emit high-frequency vibration in the early stages of failure and lower-frequency vibration later. When it’s much closer to failure, it might make audible sounds or get hot. This also needs to be considered when choosing a monitoring technology.
Different machine fault conditions generate different patterns and frequencies of vibration and can appear at different test points and in different axes. Therefore, before taking a vibration test, it is important to understand the machine, its internal components and the faults it is likely to experience. This helps ensure you are testing the machine in the correct way. In order to do this, you need to know shaft rotation rates and the number of gear teeth, pump vanes, fan blades, etc. Because this information might not be readily available, you need to document the information you have and remember to track down the information you need.
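As a minimal sketch of why this information matters, the Python below computes the frequency at which gear teeth, pump vanes or fan blades pass a fixed point, which is the element count multiplied by the shaft rotation rate. It also lists the components whose details still need to be tracked down. The `Element` record and the example machine are hypothetical; the values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """One rotating component (hypothetical record for illustration)."""
    name: str
    shaft_rpm: Optional[float]   # rotation rate of the shaft, if known
    count: Optional[int]         # number of teeth, vanes or blades, if known


def forcing_frequency_hz(element: Element) -> Optional[float]:
    """Return count x shaft rate in Hz, or None if information is missing."""
    if element.shaft_rpm is None or element.count is None:
        return None
    return element.count * element.shaft_rpm / 60.0  # RPM / 60 = rev per second


# Example machine with made-up values; the blade count is not yet documented.
machine = [
    Element("pump vanes", 1780.0, 6),
    Element("gear teeth", 1780.0, 32),
    Element("fan blades", 1780.0, None),
]

known = {}
to_track_down = []
for element in machine:
    frequency = forcing_frequency_hz(element)
    if frequency is None:
        to_track_down.append(element.name)
    else:
        known[element.name] = frequency

print(known)
print("Still needed:", to_track_down)
```

Keeping the unknowns in an explicit list is one way to document what you have while remembering what is still missing.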
Right Analysis
Right analysis boils down to creating baselines and looking for changes in these indicators over time. Most people seem to do all of this in reverse. They start with a tool or monitoring technology, then look for things to test, and then look at the data as if they were reading tea leaves, trying to figure out what it means. A better way to begin is with the asset and its failure modes. Determine the indicators that the machine produces when it begins to fail, select the appropriate technology and test configurations to monitor for those indicators, and then analyze the data to look for changes. If you have good software and take the time to set alarm limits on these specific indicators, your software can do the majority of the analysis work for you.
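The baseline-and-alarm idea can be sketched as follows. This is not how any particular software package works. The function names, the example readings and the `alert_ratio` and `danger_ratio` multipliers are all assumptions chosen for illustration, and real alarm limits should be set per indicator and per machine.

```python
from statistics import mean
from typing import Sequence


def baseline_level(healthy_readings: Sequence[float]) -> float:
    """Average of readings taken while the machine was known to be healthy."""
    if not healthy_readings:
        raise ValueError("at least one healthy reading is required")
    return mean(healthy_readings)


def alarm_state(reading: float, baseline: float,
                alert_ratio: float = 1.5, danger_ratio: float = 2.5) -> str:
    """Compare one reading with its baseline.

    The ratio limits are hypothetical placeholders, not recommended values.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    ratio = reading / baseline
    if ratio >= danger_ratio:
        return "danger"
    if ratio >= alert_ratio:
        return "alert"
    return "ok"


# Made-up trend of one indicator (for example, a vibration level in mm/s).
baseline = baseline_level([2.0, 2.1, 1.9])
trend = [2.2, 3.5, 6.0]
states = [alarm_state(reading, baseline) for reading in trend]

print(states)
```

The point of the sketch is the order of work: the indicator is chosen first, the baseline is established from healthy data, and only then are limits applied so that routine screening can be automated.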
Right Reporting
Alarms are different from reports. For a report to be useful, it should contain what is referred to as actionable information. In other words, the person who receives the report should understand what the problem is and what to do about it. Just saying a machine is in alarm does not provide this information. It does not describe what the problem is or how it should be resolved. A typical report might include a diagnosis, such as "moderate motor bearing wear," and a recommendation, such as "monitor for changes."
Because vibration and other CM technologies aim to diagnose problems very far in advance, it is not always necessary to act on the diagnosis right away. Reports, therefore, should contain priority or severity levels. Definitions of the severity levels should be agreed upon by all parties so the people receiving the reports know what action to take. Here is a typical severity scheme:
Level 1: Slight fault: No recommendation;
Level 2: Moderate fault: Monitor for changes; Consider risks of failure, availability of spare parts, upcoming shutdowns, etc.; Begin to plan the repair;
Level 3: Serious fault: Plan repair for the near future;
Level 4: Extreme fault: Shut down machine.
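One way to keep severity definitions consistent across every report is to store them in a single place. The sketch below encodes the scheme above in Python and builds a one-line report entry containing a diagnosis, a severity and a recommendation. The `ReportLine` class and the machine name are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SLIGHT = 1
    MODERATE = 2
    SERIOUS = 3
    EXTREME = 4


# Recommendations copied from the severity scheme listed above.
RECOMMENDATION = {
    Severity.SLIGHT: "No recommendation",
    Severity.MODERATE: ("Monitor for changes; consider risks of failure, "
                        "availability of spare parts, upcoming shutdowns, "
                        "etc.; begin to plan the repair"),
    Severity.SERIOUS: "Plan repair for the near future",
    Severity.EXTREME: "Shut down machine",
}


@dataclass
class ReportLine:
    """One actionable entry in a report (hypothetical structure)."""
    machine: str
    diagnosis: str
    severity: Severity

    def text(self) -> str:
        label = self.severity.name.capitalize()
        return (f"{self.machine}: {self.diagnosis} "
                f"(Level {self.severity.value}, {label} fault). "
                f"Recommendation: {RECOMMENDATION[self.severity]}.")


line = ReportLine("Pump P-101", "motor bearing wear", Severity.MODERATE)
print(line.text())
```

Because the recommendation is looked up from the agreed definitions rather than typed by hand, two analysts reporting the same severity will always give the planner the same instruction.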
Many analysts prefer to wait until a problem is really bad before they report it. This is because they want to be absolutely sure the problem exists and be certain the machine is not repaired earlier than necessary. This behavior is contrary to the goal of providing an early warning to planners so they can plan better. On the other hand, some planners will receive a report with a low priority and schedule the repair right away because they have not been trained to understand the meaning of the severity levels. Optimally, everyone should have access to the same information and everyone should understand how to interpret the severity levels. In other words, report early with a low severity and train the people receiving the reports on how to interpret them.
The amount of detail in the report will depend on who is receiving it. If an outside service provider is providing reports to the maintenance department, the report might include not only a diagnosis, such as moderate bearing wear, but also the evidence that suggested the fault. This might include appropriate plots or trends and a description of how the conclusion was reached. However, you don’t want to give too much detail to people who are not interested in it or who cannot understand it. The thicker the report or the harder it is to find the important information, the more likely it is to be ignored. One problem facing everyone in this information age is information overload, so make sure each report contains only what is absolutely necessary to the person receiving it, and understand that you might need to create different reports for different individuals.
It is also important to consider the how and when of reporting. How is the report transmitted to the person? When does the person receive it and how does this align with the goals of the program? When it comes to the how, it is important to ask if the report is passive or active. Dropping a paper report on someone’s desk is passive because the person may or may not get around to reading it. If the report arrives by way of e-mail or a software package that requires an acknowledgment, then you will know your message has been received. As for the when, it depends somewhat on the severity of the problem and the rate at which it can progress. A very serious problem cannot wait for an end of the month review. On the other hand, it makes sense to coordinate reporting or review with other planning activities.
Reports are also helpful to the analyst. In most cases, you are trending faults as they progress over time, so don’t look at your data every month like it is the first time you have seen it. Instead, start your analysis by looking at your last report. Your software should have a convenient method for displaying the prior report alongside the new data.
Report procedures should be audited. It is a good idea to occasionally sit down with all the stakeholders and make sure everyone is on the same page regarding the issues raised. It is also important to find out whether or not the reports are valued. Too often, people in a plant do things because it is their job and that job might be presenting vibration reports to managers or planners. But if the people receiving the reports do not actually act on them or find them valuable, then resources are being wasted. Either the reports need to be presented differently or the people receiving them need to be educated about their usefulness.
Lastly, reports should be audited for accuracy. What types of problems are being reported? How much misalignment versus unbalance versus bearing wear? What are the severities of the problems being reported? Are defects being discovered at an early enough stage? What percentage of the diagnoses are correct? How many failures were missed entirely? All of these are important questions that should be answered in a formal way and as part of an ongoing process. This is why the right follow-up and review also needs to be an integral part of the program.
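The audit questions above lend themselves to a simple tally. The Python below is a minimal sketch, assuming each reported problem has been recorded with its fault type, its severity level and whether a later inspection or repair confirmed it. The `Finding` record, the `audit` function and the sample data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    """One reported problem (hypothetical record for illustration)."""
    fault: str        # e.g. "misalignment", "unbalance", "bearing wear"
    severity: int     # 1 (slight) to 4 (extreme)
    confirmed: bool   # did inspection or repair confirm the diagnosis?


def audit(findings: List[Finding], missed_failures: int) -> Dict[str, object]:
    """Summarize what was reported and how accurate it was."""
    total = len(findings)
    correct = sum(1 for f in findings if f.confirmed)
    return {
        "fault_mix": Counter(f.fault for f in findings),
        "severity_mix": Counter(f.severity for f in findings),
        "percent_correct": 100.0 * correct / total if total else None,
        "missed_failures": missed_failures,
    }


# Made-up audit period: four reported problems and one failure missed entirely.
findings = [
    Finding("bearing wear", 2, True),
    Finding("misalignment", 1, True),
    Finding("unbalance", 2, False),
    Finding("bearing wear", 3, True),
]
summary = audit(findings, missed_failures=1)
print(summary)
```

Running the same tally every review period turns the questions into a trend, which makes it easier to see whether defects are being caught earlier and diagnosed more accurately over time.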
Right understanding, right analysis and right reporting are only three parts of the puzzle. In order to have a successful program, one needs to have all 10 components in place. The other seven are right goals, right people, right leadership, right tools, right data collection, right follow-up and review, and right processes and procedures.