Reliabilityweb Six Common Errors when Solving Problems

1. Focus on Prevention, Not Blame

Example: Several employees not following a defined procedure cause a business interruption. Managers identify these people and, in turn, force them to take three days off as disciplinary action. This represents a common story. Companies find those at fault and reprimand them using anything from one to three days off or even termination. This, the thinking goes, shows everyone how serious the organization is about solving problems.

Holding someone accountable doesn’t necessarily mean that they have to be disciplined. In fact, people can take responsibility without being punished. It’s human nature to make unintentional errors. Say you lock your keys in the car. Did you do it on purpose? No. Did you know it wasn’t a good idea? Yes. Then why did you do it? When you locked your keys in your car, did you park your car in the garage for three days as a punishment to make sure it didn’t happen again? Of course not, but this is exactly the logic of companies that blame employees and send them home for three days. Neither action solves anything. People don’t wake up intending to create problems during their work day. (If they do cause problems intentionally, then it is sabotage and the district attorney needs to be contacted for prosecution.) Errors do occur at work, but the overwhelming majority are unintentional.

“Blame” is sometimes confused with “accountability,” and accountability within an organization has come to refer to disciplinary action. Accountability actually means taking responsibility for actions and instigating specific steps so the problem is less likely to occur again—and it does not require punishment.

Organizations need to find out why a mistake was made. The person who made the mistake knows the answers, yet if that person doesn’t tell us why the incident happened (the causes), we won’t know what specific actions will correct the problem. The blame-and-punish approach teaches others in an organization that, if they make a mistake, they should make sure no one finds out.

There must be consequences within an organization, of course, but many jump too quickly to punish the guilty, thinking it will prevent a problem from occurring again. Always identify why a person didn't follow the procedure and focus on the specifics behind the incident. Is this the first time that the person didn't follow the procedure, or does everyone not follow the procedure? Did this person trade shifts with someone else who also never follows the procedure? If so, why hold this one individual accountable for the entire organization?

Put another way, does your organization focus on finding specific actions to prevent the problem or on finding whom to blame? Ultimately, organizations should focus on those actions, not the person.

“Blame cultures” pervade companies because blame is easy. If we’re late to work, it’s easy to blame the alarm clock. If the clock is in perfect running order, then we have to take responsibility for being late and take actions to prevent it from happening again. By the same token, if a mistake happens at work while company procedures and work processes are clear and equipment is performing as it should, employees are encouraged to think about what they can do to prevent the problem from happening again. This increases accountability.

For this to happen, all employees must be able to tell management when a procedure is ineffective or unclear. “Following a procedure” does not mean “disengage your brain.” A procedure should clearly identify specific steps learned through experience, and that experience comes from employees who follow the procedure. They know their jobs better than anyone. If they feel they can’t speak freely and suggest alternative ways of doing things, procedures will never evolve to meet an organization’s goals. Every person on the payroll must feel they can identify what is going well and what is going poorly.

To change the culture of an organization, start with one small problem, get results and repeat. Focus on one problem, work on ways to prevent it by identifying the causes, and have the people involved with the incident offer specific suggestions, including what specific changes to the work processes they think would prevent the issue from happening again.

2. A Cause Never Stands Alone

Example: A person doesn’t follow a defined procedure, resulting in a business interruption. The root cause is determined to be “Did Not Follow Procedure.”

Most people incorrectly believe that root-cause analysis ultimately finds one cause. When asked to define a root cause, they typically say, “It’s the one thing that caused the problem to happen.” A longer explanation might go as follows: “Root cause is the fundamental cause that, if removed or controlled, prevents the problem from occurring.” More significant than just a “cause,” they say, the root cause, if eliminated, prevents the problem from occurring. This seems reasonable, but in reality it’s just not accurate.

The explanation requires a review of systems thinking. A system is nothing more than a combination of parts that work together to perform a function. A car is a “system,” since all of its parts need to work together for the car to function properly. Taking this concept a step further, each part of the car is also a system. The engine, part of the car’s “system,” is also its own system that breaks down into its own parts.

If you were to ask four people, “What’s the most important part of a car?” you might get four different answers: the key, the engine, the driver, the battery. Each person thinks he or she is “right.” The person who says the battery is the most important part thinks that, without the battery, all the other parts wouldn’t function. It makes sense, but this same argument could be used for the key, the engine and the driver. The bottom line: There is no one right answer. There is no part of the system that is “most important”; without any one element, the car won’t run. In this instance, four people provided different answers; all of them told the truth; not one is wrong.

This seemingly paradoxical statement reveals the misconception behind root cause: the thinking that one thing caused the problem. People reason that if someone didn’t follow the procedure, not following it caused the problem; had the procedure been followed, the problem would not have happened. Yet just as a car needs all its parts to function properly, a problem requires multiple causes to happen. And those multiple causes make up the root cause. Put another way, a root cause isn’t one cause but a system of causes working together.

Consider a common example: heat, fuel and oxygen are all required to make fire. Remove any, and we prevent the fire from occurring. There is not “a cause” to any fire – there is a system of causes working together. That system is the root cause. Ultimately, that system of causes will help find the best solutions for any incident that occurs.
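The fire example can be expressed as a small sketch. The Python below is illustrative only; the names `FIRE_CAUSES`, `effect_occurs` and `single_removals_that_prevent` are assumptions introduced here, not part of any established method. It treats the effect as requiring all of its causes at once, so taking away any one of them prevents it.

```python
# Illustrative sketch: an effect that needs every one of its causes (logical AND).
# All names here are hypothetical and exist only for this example.
FIRE_CAUSES = frozenset({"heat", "fuel", "oxygen"})


def effect_occurs(required_causes, present_causes):
    """Return True only when every required cause is present."""
    return set(required_causes) <= set(present_causes)


def single_removals_that_prevent(required_causes):
    """List each cause whose removal, on its own, prevents the effect."""
    required = set(required_causes)
    return sorted(
        cause
        for cause in required
        if not effect_occurs(required, required - {cause})
    )


if __name__ == "__main__":
    print(effect_occurs(FIRE_CAUSES, {"heat", "fuel", "oxygen"}))  # True
    print(effect_occurs(FIRE_CAUSES, {"heat", "fuel"}))            # False
    print(single_removals_that_prevent(FIRE_CAUSES))
```

Every cause appears in the list returned by `single_removals_that_prevent`, which is the point of the example: no single cause is “the” cause, and each one is a candidate place to intervene.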

In the previous example, “not following procedure” was indeed one cause. Yet proper root-cause analysis would have found additional causes (bad equipment maintenance, poorly written procedures, and so on). Put them all together, and you have the root cause. Armed with this comprehensive analysis, organizations can focus on what root-cause analysis is designed to do: not to find one cause, but to find the best solutions based on the system of causes the analysis uncovers.

3. A Problem Description Is Not a Problem Analysis

Example: An organization writes a description of a problem as follows:

On October 14, 2004, there was a pump failure that resulted in the loss of the CSG unit for approximately eight hours. The loss of the pump function was due to a failure on pump B because of a seal leak. The seal flush had been set improperly after an overhaul of the pump. Pump A was out of service because of routine maintenance.

A problem description, defined as a narrative typically written to chronicle an issue, can help piece together what actually happened in an incident. A problem description, however, is not an analysis, which is defined as the breaking down of something into its constituent parts. Translated into problem-solving terms, the “constituent parts” are the problem’s causes; hence, problem analysis means breaking a problem down into its causes.

An analysis involves piecing together a problem’s many cause-and-effect relationships into strings of causes and effects. Unlike job procedures or work processes, which build forward through time (first perform step 1, then step 2, then step 3), a problem analysis builds backward through time, with the analysis asking why each event occurred the way it did. Why was the CSG unit down for eight hours (effect)? Because there was a loss of pump function (cause). Why was there a loss of pump function (effect)? Because pump B had a seal leak (cause 1) and pump A was out of service (cause 2).

The last sentence describes a critical point where one event has two causes, splitting the line of cause-and-effect into two branches. Trying to describe this through sentences and paragraphs can be a significant challenge. So here, visual tools help immensely.
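As one way to picture the branching, the pump incident from the problem description can be sketched as a small cause map in Python. This is a minimal illustration, not the author’s tool: the dictionary `CAUSE_MAP`, the functions `render` and `leaf_causes`, and the wording of each node are assumptions that paraphrase the description above.

```python
# Hypothetical cause map for the pump example: each effect maps to its direct causes.
# Node wording paraphrases the problem description; names are illustrative only.
CAUSE_MAP = {
    "CSG unit down for about eight hours": ["Loss of pump function"],
    "Loss of pump function": ["Pump B failed", "Pump A out of service"],
    "Pump B failed": ["Seal leak on pump B"],
    "Seal leak on pump B": ["Seal flush set improperly after overhaul"],
    "Pump A out of service": ["Routine maintenance on pump A"],
}


def render(effect, cause_map, depth=0):
    """Return indented lines, walking backward from an effect to its causes."""
    lines = ["    " * depth + effect]
    for cause in cause_map.get(effect, []):
        lines.extend(render(cause, cause_map, depth + 1))
    return lines


def leaf_causes(effect, cause_map):
    """Return the causes at the end of each branch (no further 'why' recorded)."""
    causes = cause_map.get(effect, [])
    if not causes:
        return [effect]
    leaves = []
    for cause in causes:
        leaves.extend(leaf_causes(cause, cause_map))
    return leaves


if __name__ == "__main__":
    top = "CSG unit down for about eight hours"
    print("\n".join(render(top, CAUSE_MAP)))
    print(leaf_causes(top, CAUSE_MAP))
```

Each level of indentation answers one “why” question, and the two entries under “Loss of pump function” show the point where the line of cause-and-effect splits into two branches.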

A problem description doesn’t show the nature of the cause-and-effect relationships for a given issue, but is instead a statement of fact used to build a complete problem analysis.

As a side note, a timeline (a linear, one-dimensional sequence of events) can be very helpful on some issues to understand the order in which things happened, but it is not an analysis. Timelines can complement problem analyses very well. A timeline, the problem analysis and its visual tools described here, along with process maps, can together build a complete understanding of the problem at hand.

4. Start an Analysis with the Impact to the Goals, not the Causes

Example: An investigator asks a group of people, “What’s the problem?” (Note: In common use, “problem” and “cause” are used interchangeably.) Everyone answers with something different. Some people respond by saying, “That’s not the problem, this is the problem …”

No question fosters disagreement within an organization more than “What’s the problem?” People see things differently. We have experts in disparate fields who hold different backgrounds and knowledge—and we can use this to our advantage, but only if we communicate effectively. This happens through accommodating different points of view. Knowing that a problem is a “systems issue” created by multiple causes working together (see No. 2, “A Cause Never Stands Alone,” above) helps bring these different views to light. Each view can be seen as just one “cause” among many that lead to the ultimate problem.

When one person says the problem is “the seal flush was not set properly,” someone else may respond that “the problem is that pump A never should have been out of service for maintenance.” Another may say “the real problem is that the seal failed.” What causes this disagreement is the fact that the investigation focuses on the problem, not on the organization’s overall goals.

Overall goals get everyone, regardless of perspective, to give the same answers. These overall goals truly define problems. Try asking any two people in a power plant, “How many injuries do you want to have on a given day?” An operator on the floor would give the same answer as the president in the board room: “zero injuries.” The same would happen by asking, “How many megawatt losses do you want on a given day?” The answer is still zero.

If you want people to disagree, start talking about the problem. If you want them to agree, start the investigation by focusing on the impact to overall goals.

5. Apply the Basics of Cause-and-Effect, not the Buzzwords

Example: A problem occurs within a company. The company incident-report form asks for the immediate cause, the basic cause and the root cause. If one person fills out the form, no issues arise. If two or more contribute, no one agrees on which causes are the “basic,” “immediate” and “root” causes.

Though intended to clarify a complex investigation, these adjectives actually create much of the confusion. A cause is defined in the dictionary as something required to produce an effect; it’s that simple. All these terms can be boiled down to two: “cause” and “possible cause.” A “cause” is anything, supported by evidence, that is causally related to a particular issue. “Possible causes” are hypotheses that lack evidence and need to be substantiated.
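The two-label scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `CauseEntry`, the helper `labels`, and both sample entries (including the evidence string) are hypothetical and not taken from any real investigation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CauseEntry:
    """One entry in an analysis; the label depends only on recorded evidence."""
    description: str
    evidence: List[str] = field(default_factory=list)

    @property
    def label(self) -> str:
        # Evidence present: "cause". No evidence yet: "possible cause".
        return "cause" if self.evidence else "possible cause"


def labels(entries: List[CauseEntry]) -> Dict[str, str]:
    """Map each description to its current label."""
    return {entry.description: entry.label for entry in entries}


# Hypothetical sample entries, for illustration only.
entries = [
    CauseEntry(
        "Seal flush set improperly",
        evidence=["(hypothetical) post-overhaul inspection record"],
    ),
    CauseEntry("Procedure wording unclear"),  # hypothesis, not yet substantiated
]

if __name__ == "__main__":
    for entry in entries:
        print(f"{entry.label}: {entry.description}")
```

Because the label is derived from the evidence list rather than assigned by hand, a possible cause becomes a cause as soon as supporting evidence is recorded, and there is no separate “immediate,” “basic” or “root” category to argue over.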

6. Problem Solving Works throughout an Organization

Example: Within an organization, problem solving (root-cause analysis) investigates only certain types of individual problems that have already happened.

A problem-solving technique should not be limited to only certain situations. The cause-and-effect principle provides the bedrock of all problem-solving methodologies in all circumstances, be it an auto mechanic troubleshooting a car or a team investigating the Space Shuttle Columbia tragedy. The level of detail changes, but cause-and-effect does not change. A bias toward cause-and-effect allows organizations to understand why some things went poorly as well as why others went flawlessly. It can also help prevent undesirable events from happening in the first place, and it can help determine what is specifically needed to create some desired condition in the future.

Reliability-centered maintenance, failure modes and effects analysis and root-cause analysis are all grounded in cause-and-effect. This means that different methods do not need to be used for different situations. Consider Cumulative Cause Maps™, which collect information on specific incidents, retaining an organization’s knowledge and experience in one visual tool. The problem-solving approach not only improves individual investigations but also changes the way an organization captures, stores, communicates and shares knowledge across the enterprise. And it can apply anywhere, from the car shop to Houston’s space-flight control.

A simple approach to cause-and-effect is fundamental for developing an organization’s problem-solving discipline. Anchored by the cause-and-effect principle, problem solving can help investigate safety and environmental issues, work-process deficiencies, production problems, equipment failures, customer service issues, productivity problems and more. Indeed, it can help at every level of an organization.

Conclusion

These six elements identify some of the most common mistakes that organizations make when solving problems, but they are not a complete list. There are many things an organization can do to simplify and improve the way it analyzes, documents, communicates and solves problems. Ultimately, avoiding the buzzwords and sticking to a simple, fundamental approach to cause-and-effect can help transform a company’s problem-solving efforts.