1. Managing the FRACAS
The FRACAS is an important system that requires management attention just like any other. Purposeful management requires that the FRACAS be driven from the top down through management policies and procedures to ensure quality of effort and meaningful results.
The beginning step in the development of the FRACAS is the establishment of management policies for equipment and process reliability improvement that include requirements for reporting, analyzing, and correcting system failures. The policy statement should include a statement of purpose for the FRACAS, a statement of personnel responsibilities at all levels, and a description of the basic elements required in the FRACAS.
The FRACAS is by its nature a procedure-driven system. It requires procedures for reporting, analysis, and correction of system failures. FRACAS procedures will guide how failures are reported, where information is stored, which specific analysis methods will be used, when they will be used, and who will use them.
2. Basic Elements of the FRACAS
2.1. Failure Reporting
Failures must be reported in ways that lend themselves to analysis with Reliability Engineering tools such as Weibull Analysis, RCM, and Availability Simulation. The best reporting schemes use individual failure modes as the basis for failure reporting. Reporting schemes need to follow the hierarchical structure of the equipment within the process.
2.1.1. Failure Modes
Failure modes describe the individual failed components of the maintainable item, including a descriptor for what happened to the component. Failure modes are the things that occur and cause the system to lose its ability to produce its desired outputs.
2.1.2. Developing Failure Modes
Failure modes are best developed using an orderly system that includes a functional analysis of the equipment used in the process. Equipment is generally broken down into a hierarchy that shows graphically how the facility is put together to achieve its business output.
2.1.3. Failure Modes and Effects Analysis (FMEA)
Failure Modes and Effects Analysis (FMEA) is perhaps the best way of developing failure modes for inclusion in the FRACAS reporting system. It is an extremely systematic way of looking at the functions of maintainable items to determine the most likely causes of their loss of function. The causes of functional failure are the equipment's failure modes.
A thorough FMEA that considers all the failure modes present produces the most exact results, but may be too time-consuming to be of practical use in the everyday work environment. A useful group of failure modes can instead be generated by listing the most likely failure modes from a functional breakdown of the equipment. Development of the FMEA is best done by a group of people who work with the equipment day in and day out. What is important is to understand the functions of the equipment and what breaks or fails to cause the equipment to lose its function.
2.1.3.1. Maintainable Items
Maintainable items represent the lowest level of the facility hierarchy that can be further broken down into components. Maintainable items have specific, well-defined functions that enable the system to produce its desired output. It is the loss of the function of these items that leads to lost production, lost quality, safety issues, environmental issues, and operational issues.
The maintainable item level is where we set maintenance tactics and strategies to keep system performance at desired levels.
Functions define the reason for the existence of the maintainable items. Most maintainable items have one or more primary functions and one or more secondary functions. Functions describe what the maintainable item does, not what it is. Functional Statements need to be written in a way that makes it easy to identify what the functional failure is. The best functional statements use everyday language that we all can understand. Local jargon is acceptable as long as everyone who uses the FMEA will understand what the jargon represents.
2.1.3.2. Functional Failures
The functional failure statement describes the loss of a required or desired function of the maintainable item. Such statements usually contain an adjective and the functional noun, and rarely if ever contain a part name.
2.1.3.3. Failure Modes
Once the functional failures are defined we can apply the failure modes much as shown in tables one and two. The important thing to remember is that the failure mode is a combination of a component name as well as a descriptive word to tell what happened to the component.
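The two-part structure of a failure mode can be sketched as a small data type. This is only an illustration: the functional failure statement, the component names, and the descriptors below are hypothetical examples, not entries from tables one and two, and the `COMPONENT-DESCRIPTOR` code format is an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """A failure mode: the failed component plus what happened to it."""
    component: str   # hypothetical example: "bearing"
    descriptor: str  # hypothetical example: "seized"

    def code(self) -> str:
        # Assumed two-part code format: COMPONENT-DESCRIPTOR.
        return f"{self.component.strip().upper()}-{self.descriptor.strip().upper()}"


def modes_for(functional_failure, catalog):
    """Return the failure modes listed under a functional failure statement."""
    return catalog.get(functional_failure, [])


# Hypothetical catalog: functional failure statement -> candidate failure modes.
catalog = {
    "Insufficient flow": [
        FailureMode("bearing", "seized"),
        FailureMode("seal", "leaking"),
    ],
}

for mode in modes_for("Insufficient flow", catalog):
    print(mode.code())
```

Keeping the component and the descriptor as separate fields, rather than one free-text string, is what later allows failures to be grouped and counted by mode.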
2.2. Roles and Responsibilities
Every member of the organization has roles and responsibilities related to the reporting of failures. It is important for each person in the organization to understand his/her roles and responsibilities.
2.2.1. Facility Manager
The facility manager is responsible for establishing policies that require the development of the FRACAS. The facility manager provides the top-down impetus for ensuring that everyone in the organization is focused on reporting, analyzing, and correcting failures.
2.2.2. Program Champion
The FRACAS program champion is responsible for developing the written procedures needed to implement the program. The Champion provides upward and downward communication of program policies, goals, and results. The Champion has direct responsibility for ensuring that required training takes place, and that each individual in the organization understands what his/her roles and goals are within the FRACAS program.
2.2.3. Operations and Maintenance Managers
Successful development and use of the FRACAS depends on close cooperation between the operations and maintenance managers within the organization. Breakdowns in communication at this level often lead to significant reductions in the benefits that can be achieved with a well implemented FRACAS. The tone of communication between these two managers usually sets the tone of communication between their subordinates.
2.2.4. Operations Supervisors
Operations supervisors play an extremely important role in developing and sustaining FRACAS efforts. Operations supervisors are responsible for ensuring that the goals of the FRACAS are made known to their direct reports, and for ensuring that initial failure reports are of high quality. Poor initial reporting will lead to poor final reports, and can make the data gathered useless for predicting and preventing future failures.
2.2.5. Maintenance Supervisors
Maintenance supervisors also play an important role in developing and sustaining FRACAS efforts. They are responsible for ensuring that their maintenance personnel take the time needed to confirm that information about failed components is correct and is in line with the failure modes defined within the FRACAS reporting system. Again, poor quality of information here will often lead to poor final reports and information that is not very useful for predicting and preventing future failures. Good failure reporting requires good communication between the operations and maintenance supervisors.
2.2.6. Operators
Operators provide initial failure reports for the FRACAS. They need to understand the importance of giving meaningful and accurate reports about the functional failures they observe. Operators need to have a thorough understanding of the maintainable items that are present in the system. It is not reasonable to expect that operators will know or be able to determine what is causing the functional failure. It is reasonable to expect that they will be able to describe the functional failure in enough detail to aid maintainers in the troubleshooting process, and to provide useful information to the FRACAS analyst.
2.2.7. Maintainers
Maintainers are in a position to have the greatest impact on the outcome of FRACAS efforts. They are usually in the best position to determine which components failed, and what happened to them. They may be in a position to determine what caused the failure mode to occur, but it is not reasonable to expect that they will be able to determine the cause of every failure mode. The maintainer has very specific responsibilities that require enumeration.
2.2.7.1. Preserving Evidence
The maintainer will usually be the first one on the scene to have direct contact with the failed components. It is his responsibility to document and record the condition of the components as he finds them. The maintainer needs to be taught preservation techniques, and how to record conditions around the component using words and pictures. In no case should the maintainer attempt to clean or alter the condition of the failed components. The maintainer should protect the evidence by covering it loosely with some protection like plastic bags to prevent contamination from outside sources.
2.2.7.2. Recording Conditions
The maintainer should record conditions around the failed component. The best way is to take digital photos and write concise notes about what is found.
2.2.7.3. Identifying Potential Causes or Causal Factors
The maintainer may be able to determine what caused the component to fail, as well as some causal factors that may have led up to the failure. It is important to allow the maintainer to say "I don't know" at this point. Frequently the maintainer will not be able to tell what caused the component to fail during an initial analysis of the scene. In this case saying "I don't know" is better than an unfounded guess as to cause. Determining cause may require further examination by an engineering specialist, such as a metallurgist, and by people experienced in determining causes of failure for the components in question.
2.2.8. Failure Analyst
The failure analyst is responsible for screening initial failure reports to determine whether the reports are complete and whether further analysis is required. The analyst may order a Root Cause Failure Analysis (RCFA) if the consequences of the failure warrant it. The analyst's decision to order the RCFA should be driven by policy and guidelines written into the FRACAS. The analyst is also responsible for ensuring that failure data is analyzed regularly using the available analysis tools to determine whether there is a need for updates to the Preventive and Predictive Maintenance Program, RCFAs for recurrent failure modes, or RCFAs for failure modes exhibiting infant failures.
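A policy-driven screening rule of this kind might be sketched as follows. The thresholds, field names, and the function `rcfa_warranted` are all hypothetical; actual trigger values would come from the facility's own FRACAS policy and guidelines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RcfaPolicy:
    """Hypothetical trigger thresholds; real values belong in the FRACAS policy."""
    cost_limit: float = 10_000.0        # assumed currency units
    downtime_limit_hours: float = 8.0   # assumed
    recurrence_limit: int = 3           # assumed count of the same failure mode


def rcfa_warranted(cost, downtime_hours, safety_or_environmental,
                   occurrences, policy: Optional[RcfaPolicy] = None) -> bool:
    """Return True when the consequences of a failure meet any policy trigger.

    occurrences counts this failure mode on the item, including the current report.
    """
    policy = policy or RcfaPolicy()
    return bool(
        safety_or_environmental
        or cost >= policy.cost_limit
        or downtime_hours >= policy.downtime_limit_hours
        or occurrences >= policy.recurrence_limit
    )
```

Writing the triggers down as data, rather than leaving them to individual judgment, keeps the decision to order an RCFA consistent from one analyst to the next.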
2.3. Analysis Methods
Well collected failure data allows the analyst to use a variety of analysis methods to determine how to improve asset performance. A well trained analyst can use Weibull Analysis, Reliability Centered Maintenance (RCM), Availability Simulation, and Root Cause Failure Analysis (RCFA) to analyze the data and determine solutions to asset performance problems.
2.3.1. Weibull Analysis
Weibull Analysis, invented in the 1930s by Swedish-born Waloddi Weibull, has become the statistical analysis method of choice for examining equipment failures. The low number of data points required for making reasonable decisions, as well as the ability to look at time-to-failure distributions to determine potential maintenance tactics, gives it substantial advantages over other forms of statistical analysis for making asset management decisions.
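As a rough illustration of how few data points are needed, the sketch below estimates the two Weibull parameters from a handful of times to failure. It uses median rank regression with Bernard's approximation and regresses the linearized CDF on log time, which is one common approach among several; it assumes complete data with no suspensions. The failure times are made up for the example.

```python
import math


def weibull_fit(times_to_failure):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete data (no suspensions) and at least two distinct times.
    """
    t = sorted(times_to_failure)
    n = len(t)
    xs, ys = [], []
    for i, ti in enumerate(t, start=1):
        f = (i - 0.3) / (n + 0.4)                 # Bernard's median rank approximation
        xs.append(math.log(ti))
        ys.append(math.log(-math.log(1.0 - f)))   # linearized Weibull CDF
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx                              # slope of the fitted line = shape
    intercept = y_bar - beta * x_bar
    eta = math.exp(-intercept / beta)             # scale (characteristic life)
    return beta, eta


hours = [410.0, 780.0, 1150.0, 1490.0, 2100.0]    # made-up times to failure
beta, eta = weibull_fit(hours)
print(f"shape beta = {beta:.2f}, scale eta = {eta:.0f} hours")
```

In the usual reading, a shape value below 1 suggests infant failures, a value near 1 suggests random failures, and a value above 1 suggests wear-out, which is what links the fitted distribution to a choice of maintenance tactic.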
2.3.2. Reliability Centered Maintenance (RCM) and Availability Simulation
RCM coupled with Availability Simulation allows the analyst to look at a wide variety of potential maintenance tactics to determine which set of tactics can be applied to equipment failures to achieve the best combination of profit, safety criticality, environmental criticality, and operational criticality for meeting the goals of the business. Availability Simulation changes maintenance decision making from a day-to-day exercise into a strategic planning exercise which can look far into the future of the assets.
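A very small Monte Carlo sketch gives the flavor of Availability Simulation for a single maintainable item. It is a toy model under stated assumptions: time to failure follows a Weibull distribution, every repair takes a fixed number of hours, and repair restores the item to as-good-as-new. The parameter values in the usage lines are invented; commercial simulation tools model far more than this.

```python
import random


def simulate_availability(beta, eta, repair_hours, horizon_hours, runs=1000, seed=0):
    """Monte Carlo estimate of availability for one maintainable item.

    Time to failure ~ Weibull(shape=beta, scale=eta); each repair takes a fixed
    repair_hours and restores the item to as-good-as-new (both are assumptions).
    """
    rng = random.Random(seed)
    total_up = 0.0
    for _ in range(runs):
        clock = 0.0
        up = 0.0
        while clock < horizon_hours:
            ttf = rng.weibullvariate(eta, beta)        # argument order is (scale, shape)
            run_time = min(ttf, horizon_hours - clock)  # truncate at the horizon
            up += run_time
            clock += run_time + repair_hours
        total_up += up
    return total_up / (runs * horizon_hours)


# Compare two hypothetical tactics over one year: a slower and a faster repair.
slow = simulate_availability(beta=1.5, eta=500.0, repair_hours=24.0, horizon_hours=8760.0)
fast = simulate_availability(beta=1.5, eta=500.0, repair_hours=8.0, horizon_hours=8760.0)
print(f"24 h repair: {slow:.3f}, 8 h repair: {fast:.3f}")
```

Running the same model with different parameters is what turns a day-to-day repair decision into a comparison of long-run outcomes.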
2.3.3. Root Cause Failure Analysis (RCFA)
RCFA is arguably the most powerful tool available for improving asset performance. RCFA allows the organization to analyze and eliminate major failures as well as the small recurring failures that chip away at company profits each and every day. The FRACAS database is instrumental in ensuring that good hard data is used to back up the potential causes of failure proposed during RCFA exercises. The most important element in successful RCFA programs is the reliance on hard facts rather than supposition by RCFA participants.
It is the strong combination of RCFA and RCM that allows an organization to make rapid and sustainable improvements in asset performance.
3. The FRACAS Database
The FRACAS database is the repository for all gathered failure information. It must be developed in a way that allows easy entry of failure data, and easy retrieval of failure data for analysis using the various methods previously described. The database may take several forms depending on the size and sophistication of the organization.
3.2. Forms of the Database
The FRACAS database may take the form of a custom-built database for use in small organizations, an off-the-shelf database for use across larger organizations, or in some cases it may be integrated into the facility's Computerized Maintenance Management System (CMMS) or Enterprise Asset Management System (EAMS).
3.2.1. Custom Built Databases
Small companies or facilities may often opt to develop their own FRACAS database due to the lack of funds and resources required for purchasing either off the shelf packages or CMMS/EAMS packages. The advantage to this method is low entry cost as well as development based on the specific needs of the organization. It is usually maintained by a single dedicated individual. The major drawback to this type of system is the inability to share and report data across a larger user base.
3.2.2. Off The Shelf Solutions
There is a large variety of off-the-shelf FRACAS software packages available today. They are usually more suitable for larger organizations. Most systems have some form of analysis ability already built in, and offer the ability to attach external documents and pictures to enhance failure reporting and analysis. The available systems can be used in LAN and WAN environments so that they can be a global solution for a large company. Off-the-shelf systems require either totally separate data entry, or some combination of separate data entry and import from a CMMS or an EAMS environment. In most cases the import is accomplished by exporting data from the CMMS/EAMS to an office product such as Excel, and then importing the information into the FRACAS database.
Most providers of FRACAS software are constantly updating and improving the software, and are open to changing the software based on direct inputs from their user base.
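The export-then-import path can be sketched with a CSV file standing in for the spreadsheet export. The column headers, the field names, and the function `import_cmms_export` are hypothetical; a real CMMS/EAMS export will have its own layout, and a real FRACAS package will have its own import facility.

```python
import csv
import io

# Hypothetical mapping from CMMS export column headers to FRACAS field names.
COLUMN_MAP = {
    "Work Order": "work_order",
    "Asset": "maintainable_item",
    "Failure Code": "failure_mode",
    "Reported": "occurred_at",
}


def import_cmms_export(csv_file):
    """Read a CMMS/EAMS CSV export and return FRACAS-style records.

    Rows with no failure code are returned separately for manual review.
    """
    records, rejected = [], []
    for row in csv.DictReader(csv_file):
        record = {field: (row.get(col) or "").strip() for col, field in COLUMN_MAP.items()}
        (records if record["failure_mode"] else rejected).append(record)
    return records, rejected


# Made-up export with one complete row and one row missing its failure code.
sample = io.StringIO(
    "Work Order,Asset,Failure Code,Reported\n"
    "1001,P-101,BEARING-SEIZED,2006-01-05T08:15\n"
    "1002,P-102,,2006-01-06T10:00\n"
)
records, rejected = import_cmms_export(sample)
print(len(records), "imported;", len(rejected), "held for review")
```

Holding back rows without a failure code, instead of importing them blank, keeps incomplete reports visible to the analyst.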
3.2.3. CMMS or EAMS Solutions
Very sophisticated organizations with large Information Technology (IT) or Information Systems (IS) departments may be able to implement the FRACAS database within their EAMS/CMMS. The advantage of this solution is that all information is in a single repository that is accessible from all levels of the organization. The disadvantages are that it requires a sizable investment in programming resources, and that a programming change by the EAMS/CMMS supplier may require an extensive rewrite of the FRACAS module. Most IT/IS departments are unwilling to commit to providing the follow-on resources that may be required to support future changes.
3.3. Minimum Database Requirements
As a minimum the FRACAS database must contain elements that allow the user to analyze failures using Weibull Analysis, RCM, Availability Simulation, and RCFA. The following list is meant to represent the absolute minimum requirements for the database.
3.3.1.1. Equipment Hierarchy
The database must contain the equipment hierarchy down to the maintainable item level.
3.3.1.2. Failure Modes
Failure modes as described in section 2.1 should be in the database in a tabular format. It is helpful if the failure modes are organized into failure mode groups to shorten the list of failure modes to search when assigning a mode to a given failure report.
3.3.1.3. Date and Time Stamp
The exact date and time of the report must be saved so that successful Weibull Analysis can be accomplished. The lack of specific times will impact the ability of the analyst to determine exact times to failure for specific failure modes. As an absolute minimum the date of the failure must be recorded.
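The sketch below shows how stored time stamps become the time-to-failure values that Weibull Analysis needs. It assumes ISO 8601 strings and, as a simplification, treats the interval between consecutive reports of the same failure mode as the time to failure, ignoring repair downtime. The function name and the sample stamps are hypothetical.

```python
from datetime import datetime


def times_between_failures(timestamps):
    """Hours between consecutive failures of one failure mode.

    timestamps: ISO 8601 strings such as "2006-03-01T14:30" (format is an assumption).
    """
    stamps = sorted(datetime.fromisoformat(s) for s in timestamps)
    return [(b - a).total_seconds() / 3600.0 for a, b in zip(stamps, stamps[1:])]


# With full time stamps the intervals resolve to the hour or better.
print(times_between_failures(["2006-01-01T00:00", "2006-01-01T06:00", "2006-01-02T12:00"]))

# With dates only, every interval is a whole number of days.
print(times_between_failures(["2006-01-01", "2006-01-03"]))
```

When only the date is recorded, two failures on the same day yield an interval of zero hours, which is why exact times matter for failure modes that recur quickly.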
3.3.1.4. Failure Description
The failure reporter must have the ability to describe what happened in his or her own words, including the functional failure of the maintainable item.
3.3.1.5. Failure Impact
The database must contain information about the business impact of the failure in terms of cost, downtime, safety criticality, environmental criticality, and operational criticality.
3.3.1.6. Causal Factors
Information about what may have caused the failure, or any causal factors that may have led up to the failure must be recorded. This information can be vital when later analysis of the failures is performed.
3.3.1.7. RCFA Follow-up
Many organizations that undertake RCFA efforts fail to capitalize on the power of RCFA because they are unable to close the loop on following up recommendations. The FRACAS is an excellent place to keep information about which failures require an RCFA, and who has organizational responsibility for completing the implementation of RCFA recommendations.
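For a small custom-built database, the minimum elements listed above could be held in a schema along the following lines. This is a sketch only: the table names, column names, and the use of SQLite are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

# Hypothetical minimal schema; every table and column name is illustrative.
SCHEMA = """
CREATE TABLE equipment (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES equipment(id),  -- hierarchy down to maintainable item
    name      TEXT NOT NULL
);
CREATE TABLE failure_mode (
    id         INTEGER PRIMARY KEY,
    mode_group TEXT NOT NULL,   -- failure mode group, to shorten pick lists
    component  TEXT NOT NULL,
    descriptor TEXT NOT NULL
);
CREATE TABLE failure_report (
    id                     INTEGER PRIMARY KEY,
    item_id                INTEGER NOT NULL REFERENCES equipment(id),
    failure_mode_id        INTEGER REFERENCES failure_mode(id),
    occurred_at            TEXT NOT NULL,   -- ISO 8601 date and time stamp
    description            TEXT NOT NULL,   -- reporter's own words
    cost                   REAL,            -- failure impact
    downtime_hours         REAL,
    safety_critical        INTEGER DEFAULT 0,
    environmental_critical INTEGER DEFAULT 0,
    operational_critical   INTEGER DEFAULT 0,
    causal_factors         TEXT,
    rcfa_required          INTEGER DEFAULT 0,  -- RCFA follow-up
    rcfa_owner             TEXT
);
"""


def create_fracas_db(path=":memory:"):
    """Create the tables and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The failure mode is left nullable on the report so that an operator's initial report can be saved before a maintainer has identified the failed component.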
3.3.2. Reporting Capabilities
The FRACAS database should allow the analyst to produce a variety of textual and graphical reports to aid in the analysis of failures. Reporting of Weibull data, failure frequencies for various failure modes, and database structure are extremely important.
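A failure frequency report of the kind mentioned above can be as simple as a count per failure mode, sorted so the most frequent modes come first. The record layout and the sample data are hypothetical.

```python
from collections import Counter


def failure_frequency(reports):
    """Count failure reports per failure mode, most frequent first.

    reports: iterable of (maintainable_item, failure_mode_code) tuples.
    """
    counts = Counter(mode for _item, mode in reports)
    return counts.most_common()


# Made-up reports for illustration.
reports = [
    ("P-101", "SEAL-LEAKING"),
    ("P-102", "BEARING-SEIZED"),
    ("P-101", "SEAL-LEAKING"),
    ("P-103", "SEAL-LEAKING"),
    ("P-102", "BEARING-SEIZED"),
    ("P-104", "COUPLING-SHEARED"),
]
for mode, count in failure_frequency(reports):
    print(f"{mode:20s} {count}")
```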
A well designed Failure Reporting, Analysis, and Corrective Action System (FRACAS) can be an important part of any continuous improvement effort. Failure codes developed using functional analysis of plant equipment greatly improve the ability of the Reliability Engineer to analyze failures and initiate changes in equipment design, maintenance strategies, and operating strategies.
Simple, two-part failure codes for use in the CMMS/EAMS allow operators and maintainers to better record failure information for use in the FRACAS.
Note: Originally presented by Bill Keeter, Allied Reliability, at EAM-2006, The Enterprise Asset Management Summit, in Las Vegas. Full proceedings are available on CD.