The Reliability Engineering Toolbox

Pareto Distribution

Vilfredo Pareto, an Italian economist in the late 1800s, described the unequal distribution of wealth in the world.

Failure

Failure is the loss of function when you needed the function to occur.

Data

Data is the informational energy that runs the reliability improvement machine. Data is acquired at great cost, so it needs to be retained and used to prevent future failure events. Proper use of data provides an understanding of failure mechanisms and prevents recurrence of events that cause safety incidents or high-cost failures. Reliability data requires a definition of failure. Failures can be catastrophic or slow degradation; you decide by defining the failure. The data must be measured in the units that drive the degradation: sometimes hours, sometimes miles, and so forth; in short, whatever motivates the failure. Reliability always ceases with a failure or with a removal from service in some aged condition; the latter generates a category of data called suspended or censored data. Data is information in the form of facts, figures, or engineering databases obtained from engineering tests, experiments, or actual operating conditions. Reliability data is often incomplete because exact times to failure are rarely known or recorded with much precision, so only partial information is available for analysis. Reliability data comes in two forms: 1) age-to-failure data, and 2) censored/suspended data, which occurs when unfailed items are removed from service or when items fail from a different failure mode than the one being studied. Suspensions are useful information and part of the data set. Some data is better than no data for resolving reliability issues.
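The two forms of reliability data can be kept side by side in one record layout. The following is a minimal sketch, assuming a hypothetical Observation record with an age and a failed flag; the names and the sample ages are illustrative only and do not come from any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One item in a reliability data set (hypothetical layout)."""
    age: float    # age in the unit that drives the failure (hours, miles, cycles)
    failed: bool  # True = failure of the studied mode, False = suspension (censored)


def summarize(observations):
    """Count failures and suspensions and total the accumulated age."""
    failure_ages = [o.age for o in observations if o.failed]
    suspension_ages = [o.age for o in observations if not o.failed]
    return {
        "failures": len(failure_ages),
        "suspensions": len(suspension_ages),
        "total_age": sum(failure_ages) + sum(suspension_ages),
    }


# Illustrative ages in hours; the suspended items were removed unfailed.
records = [
    Observation(1200.0, True),
    Observation(1500.0, False),
    Observation(900.0, True),
    Observation(2000.0, False),
]
summary = summarize(records)
```

Keeping the suspensions in the same list as the failures reflects the point that unfailed items still carry information about how long the equipment can survive.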

FMEA is part of the Reliability Strategy Development toolbox

Failure Mode and Effect Analysis - FMEA

Failure mode and effect analysis (FMEA) is the study of potential failures that might occur in any part of a system to determine the probable effect of each failure on all other parts of the system and on probable operational success.

Quality Function Deployment

Quality Function Deployment or QFD is a bad translation of a good reliability technique for getting the voice of the customer into the design process so the product delivered is the product the customer desires.

Total Productive Maintenance

Total productive maintenance (TPM) is a corporate-wide effort involving all employees to use equipment to its maximum limit, employing an equipment-oriented management concept to reduce failures and increase productive utilization of equipment and processes. TPM programs are teamwork programs and require a corporate culture of teamwork devoid of us-versus-them issues. All employees are expected to accept ownership of the equipment and processes and to do many small things all the time to ensure high levels of availability by eliminating failures in the early stages with low-cost actions. The employees approach the process equipment as owners rather than renters.

Poisson Distribution

Poisson distributions are discrete distributions that describe the simplest statistical process: Poisson events occur randomly in time at a stable average rate, and the distribution describes the number of events counted.
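As an illustration of counting events at a stable average rate, the probability of seeing exactly k events in a given interval can be sketched with the standard Poisson formula, P(k) = exp(-m) m^k / k!, where m is the expected count. The function name, the rate, and the interval below are assumptions chosen for the example.

```python
import math


def poisson_pmf(k, rate, duration):
    """Probability of exactly k events in `duration` at a stable average `rate`."""
    mean_count = rate * duration  # expected number of events in the interval
    return math.exp(-mean_count) * mean_count**k / math.factorial(k)


# Illustrative values: 0.5 events per year observed over 2 years.
p_zero = poisson_pmf(0, rate=0.5, duration=2.0)
p_total = sum(poisson_pmf(k, rate=0.5, duration=2.0) for k in range(30))
```

Here p_zero is the chance of an event-free interval, and p_total confirms that the probabilities over all counts add up to essentially one.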

FRACAS

A failure reporting and corrective action system (FRACAS) is an organized database that aids in solving reliability problems with a common-sense approach by systematically and permanently removing failure mechanisms.

Mean Time

A density figure-of-merit metric often referred to as the average or expected value. In the simplest form it appears as the arithmetic Σ(time)/Σ(events), or in complicated situations as a statistical metric. It applies to mean life (ML), mean down time (MDT), mean maintenance time (MMT), mean time between failures (MTBF for repairable items), mean time to failures (MTTF for replacement items), mean time between maintenance (MTBM), mean time between maintenance scheduled (MTBMs), mean maintenance time unscheduled (MMTu), mean maintenance time scheduled (MMTs), mean time between overhauls (MTBO), mean time between unscheduled removals (MTBRu), mean time to restore (MTR), mean time between downing events (MTBDE), and so forth. The units will be time per event, e.g., hours/failure. The reciprocal of the metric provides an incident rate, e.g., failures/hour.
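The simplest arithmetic form, total time divided by total events, and its reciprocal can be sketched as follows. The function name and the operating hours are illustrative assumptions, not field data.

```python
def mean_time(durations, event_count):
    """Simplest arithmetic mean time: sum of time divided by number of events."""
    if event_count <= 0:
        raise ValueError("at least one event is needed to compute a mean time")
    return sum(durations) / event_count


# Illustrative operating periods in hours, each ended by one failure.
operating_hours = [400.0, 350.0, 250.0]
failure_count = 3

mtbf = mean_time(operating_hours, failure_count)  # hours per failure
failure_rate = 1.0 / mtbf                         # failures per hour
```

The same function serves any of the mean-time metrics; only the meaning of the durations and of the counted events changes.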

Design Reviews For Reliability

Specific questions to ask the design engineers during a review held specifically for reliability, using failure data from operations and maintenance, are: 1) show the calculated availability for the system based on a reliability and maintainability (RAM) model, 2) show the calculated number of failures during the specified mission time between turnarounds based on the RAM model, 3) show details of FMEA studies, 4) show details of FTA calculations, 5) show the calculated mean times between downing events, 6) show the calculated mean time between cutbacks from full production capability and the losses thus incurred, 7) show the QFD matrix and details, and 8) show the calculated cost of unreliability.

Maintenance Engineering

A tactical job for rapidly repairing equipment to operable condition by studying operating and repair manuals. Acquires failure data and prepares maintenance plans for restoring equipment to operable condition in a minimum amount of time. Prepares general diagrams, charts, drawings, and spare parts requirements for maintenance planners. Makes recommendations for improving the repair cycle. Provides manning level forecasts for supervisors and estimates the duration of outages. Determines the cost advantages of alternatives for developing action plans to comply with internal/external customer demands for timely repairs of processes/equipment. The purpose of these activities is to restore equipment to service in a timely manner.

Reliability Policies

Management communicates with its staff through important policy statements. Management policies are general and relate to procedures and rules, which are specific, for implementing the policies. Written statements of policy regarding reliability are decisive documents about avoiding system failures, in the same way as safety policies address the need for absence of human injuries, quality policies address the need for absence of product discrepancies, and environmental policies address the need for avoiding spills and releases. Management also needs to state a reliability policy, which may read like this: We will build an economical and failure free process which will operate for 5 years between planned outages. This statement will clearly communicate that failures of the process (which is the money machine) are to be abhorred and avoided!

Events and Incidents

For reliability purposes, events/incidents are single occurrences, especially significant ones, that result in a failure from a non-aging mechanism.

Highly Accelerated Life Test - HALT

Highly accelerated life test (HALT) is an offspring of older environmental stress screening (ESS) tests and it is a testing process for ruggedization of pre-production products by heavily stressing the product to identify failure modes quickly and to verify weak links in the system.

Weibull Analysis

Weibull analysis is the tool of choice for most reliability engineers when they consider what to do with age-to-failure data.

Weibayes Estimates

If you've got one piece of failure data and nothing else, you're a poor person without much hope.

Exponential Distribution

The exponential distribution describes the probability of survival and of failure of components or equipment under the condition of chance failure, which means a constant instantaneous failure rate: the die-off rate is the same for any surviving (unfailed) population.
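A constant failure rate implies that survivors of any age face the same odds over the next interval. The sketch below uses the standard exponential reliability function R(t) = exp(-λt); the symbol λ, the function names, and the numeric values are assumptions made for illustration.

```python
import math


def reliability(t, failure_rate):
    """Probability of surviving to time t under a constant failure rate."""
    return math.exp(-failure_rate * t)


def conditional_reliability(t, age, failure_rate):
    """Probability that an item already at `age` survives a further time t."""
    return reliability(age + t, failure_rate) / reliability(age, failure_rate)


# Illustrative rate: 0.001 failures per hour.
new_item = reliability(100.0, 0.001)
aged_item = conditional_reliability(100.0, age=500.0, failure_rate=0.001)
```

Both values agree to within floating-point rounding, which is the sense in which the die-off rate is the same for any unfailed population.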

Effectiveness

The potential or actual probability of a system to perform a mission for a given level of performance under specified operating conditions defined as the product of reliability*availability*maintainability*capability. Many variants of the effectiveness equation exist, e.g., OEE, and others.
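The product form of the effectiveness equation can be sketched directly. The function name and the four input values are hypothetical; each factor is assumed to be expressed as a fraction between 0 and 1.

```python
def effectiveness(reliability, availability, maintainability, capability):
    """Product form: reliability * availability * maintainability * capability."""
    factors = {
        "reliability": reliability,
        "availability": availability,
        "maintainability": maintainability,
        "capability": capability,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return reliability * availability * maintainability * capability


# Illustrative factor values only.
system_effectiveness = effectiveness(0.90, 0.95, 0.80, 0.85)
```

Because the factors multiply, the result is always at or below the weakest factor, so one poor element limits the whole system.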

Critical Items List

The critical items list is a top-level summary of problems and costs used for discussions with management about key reliability issues. The summary list converts technical details to a summary of costs and time while placing the issues into a Pareto distribution explained in terms of money and the vital few problems to be solved for competitive reasons.
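Placing issues into a Pareto ranking by money can be sketched as a sort by cost with a running share of the total. The problem names and annual costs below are invented for illustration, and the function name is an assumption.

```python
def pareto_ranking(costs):
    """Rank problems by cost, highest first, with the cumulative share of total cost."""
    total = sum(costs.values())
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0.0
    for problem, cost in ranked:
        running += cost
        rows.append((problem, cost, running / total))
    return rows


# Hypothetical annual costs of unreliability in dollars.
annual_costs = {
    "gaskets": 50_000,
    "pump seals": 500_000,
    "instrument faults": 150_000,
    "compressor valves": 300_000,
}
critical_items = pareto_ranking(annual_costs)
```

In this made-up case the first two rows account for 80 percent of the cost, which is the kind of vital-few result the list is meant to put in front of management.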

Normal Distribution

A fundamental frequency distribution that produces a symmetrical bell-shaped diagram based on the Gaussian distribution to form a normal law of errors.
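The symmetry of the bell-shaped curve can be checked numerically. This sketch uses NormalDist from the Python standard library's statistics module; the mean of 100 and standard deviation of 15 are arbitrary illustrative values.

```python
from statistics import NormalDist

# Arbitrary illustrative parameters.
dist = NormalDist(mu=100.0, sigma=15.0)

# Density at equal distances on either side of the mean.
left_density = dist.pdf(100.0 - 20.0)
right_density = dist.pdf(100.0 + 20.0)

# Half of the population lies below the mean.
share_below_mean = dist.cdf(100.0)

# Share within one standard deviation of the mean.
share_within_one_sigma = dist.cdf(115.0) - dist.cdf(85.0)
```

The two densities match and half of the area lies on each side of the mean, which is what symmetrical means for this distribution.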
