Root Cause Analysis (RCA) is a valuable tool for reliability improvement in manufacturing and production operations. Yet most efforts to implement an RCA program fail to achieve meaningful results despite significant investments in employee training. What needs to be done to ensure that RCA becomes a functional work process in organizations?
If you are a manager of an organization or a frustrated RCA advocate and this sounds familiar, be assured that this is a common situation in many facilities and across all industries. Root Cause Analysis can be one of the most difficult reliability programs to develop, and many have struggled with their implementations.
In my opinion, Root Cause Analysis is currently one of the most underutilized reliability and quality improvement tools available for organizations seeking to eliminate failures and reduce manufacturing costs. Virtually all manufacturing organizations can benefit from having an effective RCA program. Most will agree that it is vital for continuous improvements.
Yet, there are few organizations that have achieved the level of excellence in RCA where it is routinely used to continually improve overall facility performance. In most cases, RCA, if done at all, is used in the case of major production loss events, damage to assets, or in reaction to a crisis situation within the organization.
In these instances, the need to perform RCA is typically driven by the urgent demand within the organization to “know why this happened” so it can be understood by others, usually senior management. Most of us have experienced this kind of urgency at some point in our careers, and have seen how effective problem solving is when it is focused and given priority.
Unfortunately, this application of Root Cause Analysis does not constitute a work process for improvement. In most cases, the outcome of this analysis is focused on preventing a recurrence of this single event. This does not constitute strategic improvement; it is a short-term activity addressing only one issue.
The Need for a Work Process
To better understand why Root Cause Analysis is not fully utilized, we need to review the basic concepts of organizational work processes. A work process is a system that provides a framework for organizations to accomplish tasks in a repeatable, consistent manner. An example of a work process is the payroll function.
Each pay period, there are activities that are driven by organizational objectives and timelines. There are clear expectations for the outcomes of these activities, and individual roles and responsibilities have been defined for all participants. A “system” is in place to ensure that things get done in a defined and predictable manner.
Without getting too conceptual in our discussion of work systems, it becomes apparent that most RCA programs do not have the benefit of a work process such as that described above. In many RCA programs, it is unclear what is to be done, when it is to be done, who will do it and how corrective action will be initiated. Most organizations assume that RCA training is all that is required for individuals to be successful in their efforts.
In reality, those who have had RCA training are usually unable to be effective in the absence of a work process. At best, the success of these individuals will be limited to areas where they can exert their personal influence in obtaining time and resources to correct true root causes. While some individuals have been successful using RCA to eliminate problems at the root cause level, in most cases it is difficult to do without organizational support.
Additionally, those who attempt RCA activities as individuals will usually experience conflict with others who are not aware of their objectives for doing RCA. The reasons for this lie in organizational culture, which has been defined as “observable patterns of behavior that have been positively reinforced over time.”
Most organizations are focused on urgent, task-oriented activities. These “cultures” have encouraged individual participation in efforts to accomplish short-term objectives. Root Cause Analysts focus on improvement issues that may not be viewed as urgent or important by others. As such, analysts’ daily work priorities will be questioned by some who may see these individuals as “not helpful.”
This is a very common situation where individuals in a given department have been trained to do RCA but no overall organizational agenda exists for improvement. Management begins to view the participants in these conflicts as “problem employees,” when the true cause of the conflict is the absence of work processes and the lack of defined goals and roles for those involved.
Many RCA advocates experience the frustration of this conflict situation and lose enthusiasm for RCA. In my opinion, this is why RCA training usually fails to deliver long-term results, and most programs end up faltering.
Rick Kalinauskas, CMRP, is President of Reliability Support Services, an educational consulting firm focused on helping organizations develop their Root Cause Analysis capabilities for achieving cost reductions and strategic improvements. He is an ardent proponent of RCA application and is a specialist in performing Root Cause Analysis for clients in manufacturing and industry.
Rick has over 20 years of experience in the application of structured problem-solving methodologies as part of his 30 years of involvement in manufacturing. He has a particular focus on the behavioral and work process aspects of reliability improvement. Rick has served in several maintenance and operations management roles with Nestle USA, as reliability engineering manager for International Paper, as an Asset Care Program Manager for Coors Brewing Co., and as a corporate reliability specialist in Root Cause Analysis for Halliburton KBR. He has worked with several clients in the oil and gas and pharmaceutical industries as well.
Rick resides in Chesapeake, Virginia with his wife Mary. He can be contacted at 757-646-4128 or email: [email protected]
Semiconductor devices are almost always part of a larger, more complex piece of electronic equipment. These devices operate in concert with other circuit elements and are subject to system, subsystem and environmental influences. When equipment fails in the field or on the shop floor, technicians usually begin their evaluations with the unit's smallest, most easily replaceable module or subsystem. The subsystem is then sent to a lab, where technicians troubleshoot the problem to an individual component, which is then removed--often with less-than-controlled thermal, mechanical and electrical stresses--and submitted to a laboratory for analysis. Although this isn't the optimal failure analysis path, it is generally what actually happens.
I use the term RCPE because it is a waste of good initiatives and time to find the root cause of a problem but not fix it. A more common term is Root Cause Failure Analysis (RCFA), but I prefer the word problem to the word failure, because failure often leads to a focus on equipment and maintenance. The word problem covers all operational, quality, speed, high-cost and other losses. Eliminating problems is a joint responsibility of operations, maintenance and engineering.
This paper presents an overview of an integrated process for system maintenance, fault diagnosis and support. The solution is based on Qualtech Systems, Inc.’s (QSI’s) TEAMS toolset for integrated diagnostics and involves several key innovations. As a showcase of the integrated solution, QSI, along with Antech Systems and Carnegie Mellon University (CMU), has recently completed a research project for the Information Technology Branch at the Naval Air Warfare Center–Aircraft Division (NAWC-AD) in St. Inigoes, MD. The entire system, termed ADAPTS (Adaptive Diagnostic And Personalized Technical Support), provides a comprehensive solution to integrated maintenance and training.
The power industry’s operating and maintenance practices were held up to intense regulator and public scrutiny when on November 6, 2007, a Massachusetts power plant’s steam-generating boiler exploded and three men died. The Department of Public Safety’s Incident Report investigation determined that the primary cause of the Dominion Energy New England’s Salem Harbor Generating Station Unit 3 explosion was extensive corrosion of boiler tubes
I was asked recently to give a second opinion on the cause of failure of an axial piston pump. The hydraulic pump had failed after a short period in service and my client had pursued a warranty claim with the manufacturer. The manufacturer rejected the warranty claim on the basis that the failure had been caused by contamination of the hydraulic fluid. The foundation for this assessment was scoring damage to the valve plate.
Root Cause Analysis has the potential of CHANGING people, IF the leader of the investigation knows of this potential. Far from “just another problem-solving exercise,” the root cause analysis should SLOW PEOPLE DOWN to the extent that they can see the truth of the incident under inquiry, WHATEVER THE TRUTH MIGHT BE. This paper focuses on two parts of our human nature which are large obstacles to root cause discovery, i.e., our unwillingness to slow down, and our unwillingness to let go of certain basic assumptions about life. Warning: This paper is designed to challenge the way you think about Root Cause Analysis.
A fault tree is constructed starting with the final failure and progressively tracing each cause that led to the previous cause. This continues until the trail can be traced back no further. Each effect must clearly flow from its predecessor (the cause before it). If it is clear that a step is missing between causes, it is added in and evidence is sought to support its presence. Below is a sample fault tree for the moral story of the kingdom lost because of a missing horseshoe nail.
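The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original article's diagram: the `Cause` class, the function names, and the evidence strings are hypothetical, and the chain of events follows one common telling of the horseshoe-nail proverb (nail, shoe, horse, rider, battle, kingdom). The "Rider lost" step is deliberately left without evidence to show how an added step can be flagged for follow-up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cause:
    """One node in the tree: an event and the causes that led to it."""
    description: str
    # Hypothetical placeholder text; None means no supporting evidence yet.
    evidence: Optional[str] = None
    causes: List["Cause"] = field(default_factory=list)


def trace_to_roots(node: Cause, path: Optional[List[str]] = None) -> List[List[str]]:
    """Return every chain from the final failure back to a root cause."""
    path = (path or []) + [node.description]
    if not node.causes:  # the trail can be traced back no further
        return [path]
    chains: List[List[str]] = []
    for cause in node.causes:
        chains.extend(trace_to_roots(cause, path))
    return chains


def unverified_steps(node: Cause) -> List[str]:
    """List steps in the tree that still lack supporting evidence."""
    missing = [node.description] if node.evidence is None else []
    for cause in node.causes:
        missing.extend(unverified_steps(cause))
    return missing


# Build the tree from the root cause upward (all evidence text is made up).
nail = Cause("Horseshoe nail missing", evidence="placeholder: nail not found in hoof")
shoe = Cause("Horseshoe lost", evidence="placeholder: shoe found on road", causes=[nail])
horse = Cause("Horse lost", evidence="placeholder: horse went lame", causes=[shoe])
rider = Cause("Rider lost", causes=[horse])  # step added to fill a gap, no evidence yet
battle = Cause("Battle lost", evidence="placeholder: field report", causes=[rider])
kingdom = Cause("Kingdom lost", evidence="placeholder: the final failure", causes=[battle])

chains = trace_to_roots(kingdom)
for chain in chains:
    print(" <- ".join(chain))
print("Steps still needing evidence:", unverified_steps(kingdom))
```

A real tree usually branches, with several contributing causes under one event; `trace_to_roots` then returns one chain per root cause, and `unverified_steps` gives the list of links that still need evidence before the analysis can be considered complete.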