by Rolly Angles, RSA, Laguna Philippines
Frequent contributor at www.maintenanceforums.com

One of the biggest sources of confusion in performing a thorough Root Cause Analysis is knowing how deep we should pursue our analysis or, simply stated, where we should stop our investigation. Going too deep will lead us to the Bible, 1 Timothy 6:10, "For the love of money is a root of all evil," and going too shallow will allow the problem to recur again and again.

Although industries today use many analytical tools, are these tools really meant to uncover the root cause, or only what we term the Physical Cause of the problem? One industry found that John Doe was responsible for closing the valve that wrecked one of the turbines. John Doe was suspended from work for one month without pay. The lawyers and managers were happy that the culprit was finally condemned. A year later, the same problem occurred, this time caused by a man named Johnny Thorr, who said, "I just used to sweep dirt around here, but the supervisor instructed me to close the valve, and I didn't know which valve, so I closed them all." The question raised here is: are John Doe and Johnny Thorr the Root Cause of the problem? Are we certain that if John Doe is punished, the problem will be gone for good?

Before we go any further with our analysis, we need to ask ourselves: why do we need to perform a Root Cause Analysis in the first place?

Do we simply do it to comply with a customer's requirements, as manufacturing plants often must? Do we perform a Root Cause Analysis because our management wants to know what happened? Do we perform it simply because we want to know what caused the problem? The reason for asking why your industry performs a Root Cause Analysis is so that we can perform the analysis for the right reasons, and there is no better reason I can think of than learning from the failure itself. We need to learn from the things that go wrong . . .

Again, do we learn from the failure when we discipline someone, or do we just widen the gap between our people and us? I recall a couple of years back when I was teaching a two-day Root Cause Failure Analysis course in one plant. At the end of the first day, one of the participants approached me and said, "Sir, I'm afraid I won't be able to attend the second part of your training tomorrow." When I asked why, he said that he would be serving a one-week suspension starting the next day. My curiosity was aroused, so I asked the person, who was a maintenance technician, what had happened. He told me that the operator had committed an error and that both of them would have to serve a suspension. I asked why he was included if it was the operator who had committed the mistake, and he said that it was the procedure management wanted. And the worst part is, their names had been published, along with the details of the mistake, for everyone to see. VERY HUMILIATING INDEED! By the way, this maintenance technician had already been in that industry for 12 years, and it was his first taste of discipline in all the years he had worked there.

HUMAN CAUSE IN ROOT CAUSE ANALYSIS

Root Cause Analysis holds that all failures are caused by humans and that all humans are prone to commit mistakes and errors. Either someone did the wrong job or simply did the job wrong. People commit slips and lapses. Believe me when I tell you that even with the best procedures in town, people are likely to commit an error.

December 12, 2002 : A small plane belonging to the Philippine Air Force crashed into a plant run by IBIDEN Philippines, killing one person and injuring eight.

A person died in the hospital after he was given the wrong blood. The investigation indicated that the person who died had switched beds with another patient because he wanted to be near a window. The nurse thought that he was the patient who needed the blood transfusion.

People commit errors, and errors can be classified as slips and lapses. A slip is when somebody does something incorrectly, such as an electrician rewinding a motor incorrectly so that it runs backwards. A lapse is missing a step in a key sequence of events, such as leaving a tool behind after an extensive and exhaustive Preventive Maintenance on a piece of equipment. Both are Human Errors, and neither is intentional. Many factors can contribute to these kinds of errors: fatigue, pressure, environment, and inattention to detail, just to name a few.

One thing we must understand is that most human errors are not necessarily the fault of the person who actually committed the mistake. What is important is to understand that the error was caused either by external circumstances far beyond our control or by flawed rules and systems that need to be changed. We also need to understand why the error was committed in the first place so we can learn from it. In fact, are we convinced that if we were in the shoes of the person who committed the error, we would have done it differently? But the most important question is, do we really learn from the failure itself by blaming and punishing people?


LATENT CAUSE IN ROOT CAUSE ANALYSIS

Underneath every Human Cause lies a deeper cause called a Latent Cause. Latent Causes are concealed, hidden causes that eventually lead to the human error being committed. The only way to address these Latent Causes is to expose them, and we can only expose them if we truly understand what they are all about. Latent Cause Analysis is not just about system flaws and procedures that eventually led a person to commit the mistake. It is not just about organizational management weaknesses. It is not all about flawed management decisions. Rather, Latent Causes mean understanding that we ourselves are part of the problem. People create systems and people make decisions, and if the decision or system is flawed, then a human error is most likely to occur.

Learning from the things that go wrong is not as easy as we think. The only way to learn from the things that go wrong is to look at ourselves in the mirror and admit that we are all part of the problem. Collectively, we all have our share in what eventually caused a person to commit a mistake. The sad thing is that when this happens, we isolate the person who committed the mistake and point all fingers and blame at him. We always want a fall guy.

Before we can understand Latency, we need to ask ourselves the following :

- What is it about the way we are that contributes to our problems?

- What is it about the way I am that contributes to our problems?

If we are part of the problem, then we must also take responsibility for being part of the solution. Engineers, technical people, and maintenance personnel can easily arrive at the physical cause of the problem. If a mechanical component such as a bearing fails, these people can dig up evidence that will eventually lead to its physical cause. But for digging deeper into the latent cause, I strongly recommend a third party, an independent and unbiased person, unless you can carry out this probe yourself.

Example of Physical Cause : Evidence on the bearing's raceway shows fatigue and spalling, and after an investigation, the team found that misalignment of the motor and pump appears to be the physical cause of the problem.

Example of Human Cause : After studying the evidence that led to this misalignment problem, the team found that the person performing the alignment did not possess the skills and did not have the tools to perform such work. He was relying only on his eyesight.

Probe on Latency : Investigation shows that there was no instrument (Laser Alignment) and no training at all on why this equipment needs to be aligned. And when we probe further into the Latent Causes, we understand that each of us is part of the problem.

Training Department : After this incident occurred, we at the training department realized that a course on misalignment should be provided for our maintenance people.

Maintenance Department : It is normal for management to budget and reduce operating costs, but this instrument is one we need to justify in order to improve our equipment's uptime. We requisitioned this instrument a year ago, but it was denied by Management. I believe we can justify this instrument if we really want to, by performing a cost study and presenting an ROI to management.

Purchasing Department : We always adhere to management's direction to cut costs, to the extent that we switch to vendors whose parts are cheaper. I think part of our fault is not consulting our technical people to evaluate these changes.

Top Management : We should have provided the budget for training and the instrument in the first place. I have come to realize after this severe downtime that the instrument maintenance is asking for is not a nice-to-have but a must-have after all.

And again, tell me: are we all part of the problem, or should the person who committed the mistake be condemned "ALONE"? Let us all put ourselves in the shoes of this person who made the mistake and ask ourselves, are we certain that we would have done it differently?

Latency is not about system causes but rather about understanding how we as people contribute to the problem, even when we thought that we were serving the best interest of the company. I consider Latency to be a higher form of human cause. Again, exposing these Latent Causes is the only way to achieve a true and meaningful Root Cause Analysis. Hence, to answer the question of where we end our probe in a Root Cause Analysis: only when we reach the Latent Cause of the problem.
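For teams that log their findings in software, the stopping rule can be sketched in a few lines of Python. This is only an illustration and not part of any standard RCA tool: the names `CauseLevel`, `Finding`, `deepest_level`, and `analysis_complete` are hypothetical, and the findings are paraphrased from the bearing example in this article.

```python
from dataclasses import dataclass
from enum import IntEnum


class CauseLevel(IntEnum):
    """Hypothetical ordering of the three cause levels discussed in the article."""
    PHYSICAL = 1  # what failed (e.g. the bearing)
    HUMAN = 2     # what someone did or did not do
    LATENT = 3    # what in the organization set the person up to err


@dataclass
class Finding:
    level: CauseLevel
    description: str


def deepest_level(findings):
    """Return the deepest cause level reached so far, or None if there are no findings."""
    return max((f.level for f in findings), default=None)


def analysis_complete(findings):
    """The probe may stop only once at least one Latent Cause has been exposed."""
    return deepest_level(findings) == CauseLevel.LATENT


# Findings paraphrased from the misalignment example.
findings = [
    Finding(CauseLevel.PHYSICAL, "Motor and pump misaligned; bearing raceway fatigued and spalled"),
    Finding(CauseLevel.HUMAN, "Alignment performed by eyesight, without the skills or tools"),
]
print(analysis_complete(findings))  # False: stopping here only finds someone to blame

findings.append(
    Finding(CauseLevel.LATENT, "No laser alignment instrument and no alignment training provided")
)
print(analysis_complete(findings))  # True: a Latent Cause has been exposed
```

The check is deliberately simple. It does not judge whether a Latent Cause is the right one; it only refuses to call the analysis finished while the deepest finding is still a physical or human cause.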
