

Reliability Tools - Maintenance

Maintenance comprises all actions necessary, both technical and administrative, for retaining an item in, or restoring it to, a specified condition so that it can perform a required function. These actions include servicing, repair, modification, overhaul, inspection, reclamation, and determination of the restored condition.


Maintainability is the measure of the ability of an item to be retained in, or restored to, a specified condition when maintenance is performed by personnel having specified skill levels, using prescribed procedures and resources.

The Reliability Engineering Toolbox

Weibull Database

The smartest way to maintain a reliability database is in Weibull format, that is, as fitted Weibull parameters for each component or failure mode. Published Weibull databases are also available.
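As a rough illustration of why a Weibull-format database is convenient, the sketch below assumes each entry stores a two-parameter Weibull fit (shape beta and characteristic life eta) and evaluates the reliability function R(t) = exp(-(t/eta)^beta). The component names and parameter values are hypothetical and are not taken from any published database.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """Probability of surviving to age t under a two-parameter Weibull model."""
    if t < 0 or beta <= 0 or eta <= 0:
        raise ValueError("t must be >= 0 and beta, eta must be > 0")
    return math.exp(-((t / eta) ** beta))


# Hypothetical database entries: component -> (beta, eta in hours).
WEIBULL_DB = {
    "pump seal": (1.4, 25_000.0),
    "ball bearing": (1.3, 40_000.0),
}

# Reliability of each component over one year of continuous running (8760 h).
for name, (beta, eta) in WEIBULL_DB.items():
    r = weibull_reliability(8_760.0, beta, eta)
    print(f"{name}: R(8760 h) = {r:.3f}")
```

Two numbers per failure mode are enough to answer survival questions at any age, which is the practical appeal of this storage format.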


Lognormal distributions are continuous life functions that have long tails to the right (positive skewness) in time or usage. A lognormal distribution plotted on semi-log paper, with the life axis on the logarithmic scale, appears as a normal curve.
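The two properties described above can be checked numerically. The following sketch draws simulated lives from a lognormal distribution and compares the raw values with their logarithms; the median life of 1000 hours, the spread of 0.8, and the random seed are arbitrary assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

# Assumed parameters of the underlying normal distribution of log-life.
mu = math.log(1000.0)  # median life of 1000 hours
sigma = 0.8

lives = [random.lognormvariate(mu, sigma) for _ in range(10_000)]
log_lives = [math.log(x) for x in lives]

# Raw lives are skewed right: the mean sits well above the median.
print("mean life  :", round(statistics.mean(lives), 1))
print("median life:", round(statistics.median(lives), 1))

# Log-lives are roughly symmetric: mean and median nearly coincide.
print("mean log-life  :", round(statistics.mean(log_lives), 3))
print("median log-life:", round(statistics.median(log_lives), 3))
```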


Life Cycle Cost

Life cycle cost (LCC) is the total of all costs associated with the acquisition and ownership of a system over its full life. The usual figure of merit is net present value (NPV). Projects are considered most favorable when they have large positive NPVs. However, in many individual cases where only costs are involved, the decision goes to the alternative with the least negative NPV. In all cases, the default position for accounting is to know the NPV of making no change, which is usually the last alternative that people associated with the change consider.
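A minimal sketch of the comparison described above: each alternative is a list of yearly cash flows (year 0 first, costs negative), NPV is computed at an assumed discount rate, and the least negative NPV wins. The cash flows, the 10% rate, and the alternative names are hypothetical.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value of yearly cash flows; cash_flows[0] occurs at year 0."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


DISCOUNT_RATE = 0.10  # assumed cost of money

# Hypothetical cost-only alternatives over five years of ownership.
alternatives = {
    "no change": [0.0] + [-60_000.0] * 5,       # no capital, high upkeep
    "upgrade": [-100_000.0] + [-30_000.0] * 5,  # capital now, lower upkeep
}

results = {name: npv(DISCOUNT_RATE, flows) for name, flows in alternatives.items()}
for name, value in results.items():
    print(f"{name}: NPV = {value:,.0f}")

# With cost-only cases every NPV is negative, so pick the least negative one.
best = max(results, key=results.get)
print("preferred alternative:", best)
```

Including the "no change" case in the dictionary keeps the baseline visible next to every proposed change.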


Overall equipment effectiveness (OEE)

Overall equipment effectiveness (OEE) is a manufacturing index that reduces the complexity of discrete systems to a single measure for problem solving and benchmarking.
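The text above does not give a formula. A commonly used formulation, assumed here, is OEE = availability x performance x quality. The sketch below applies it to one hypothetical shift; the function name, argument names, and numbers are illustrative only.

```python
def oee(planned_time: float, run_time: float, ideal_cycle_time: float,
        total_count: int, good_count: int) -> float:
    """OEE as availability x performance x quality (a common formulation)."""
    if planned_time <= 0 or run_time <= 0 or total_count <= 0:
        raise ValueError("times and counts must be positive")
    availability = run_time / planned_time                     # uptime share
    performance = (ideal_cycle_time * total_count) / run_time  # speed share
    quality = good_count / total_count                         # good-part share
    return availability * performance * quality


# Hypothetical shift: 480 min planned, 400 min running, an ideal cycle of
# 1 min per part, 360 parts produced, 342 of them good.
shift_oee = oee(480.0, 400.0, 1.0, 360, 342)
print(f"OEE = {shift_oee:.1%}")
```

Keeping the three factors separate inside the function makes it easy to report which one is dragging the index down.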

Reliability-Centered Maintenance

Reliability-centered maintenance (RCM) is a systematic planning process used to determine the maintenance requirements for a system. RCM assumes that the system has an inherent reliability. Maintenance requirements are imposed upon this baseline of inherent safety and inherent reliability, which can be no better than what was designed into the system.



Reliability is the probability that a device, system, or process will perform its prescribed duty without failure for a given time when operated correctly in a specified environment.


The International Electrotechnical Commission (IEC) defines dependability as follows: "Dependability describes the availability performance and its influencing factors: reliability performance, maintainability performance and maintenance support performance." MIL-HDBK-338 defines dependability differently, as a measure of the degree to which an item is operable and capable of performing its required function at any (random) time during a specified mission profile, given that the item is available at mission start. (Item state during a mission includes the combined effects of the mission-related system R&M parameters but excludes non-mission time; see availability.) Dependability is related to reliability, with the intention that dependability be a more general concept than the measurable issues of reliability, maintainability, and maintenance.


Failure Forecast

Failure forecasting is a projection of failures into the future based on assumed or documented failure details.
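One simple way to make such a projection, sketched here under stated assumptions, is to take a Weibull model for the failure mode and sum, over every unit still in service, the conditional probability that it fails within the forecast horizon given its current age. The Weibull parameters, fleet ages, and horizon below are hypothetical, and replacement of failed units is ignored.

```python
import math


def reliability(t: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull survival probability at age t."""
    return math.exp(-((t / eta) ** beta))


def expected_failures(ages: list[float], horizon: float,
                      beta: float, eta: float) -> float:
    """Expected number of failures among surviving units over the horizon."""
    total = 0.0
    for age in ages:
        # P(fail before age + horizon | survived to age)
        total += 1.0 - reliability(age + horizon, beta, eta) / reliability(age, beta, eta)
    return total


# Hypothetical fleet: current ages in hours for units that have not failed.
fleet_ages = [500.0, 1_200.0, 2_500.0, 4_000.0, 6_000.0]
forecast = expected_failures(fleet_ages, horizon=1_000.0, beta=2.0, eta=8_000.0)
print(f"expected failures in the next 1000 h: {forecast:.2f}")
```

With a shape parameter above 1 the older units contribute most of the forecast, which is what a wear-out failure mode should show.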

Life Units

A measure of use duration applicable to an item. For example, the life units may be starts-stops, run hours, hot-cold cycles, distances traveled, emergency starts or starts, shelf life, and other measurements which motivate failures.

Environmental Stress Screening (ESS)

ESS is a series of screens conducted under environmental stresses to disclose weak parts and workmanship defects that require correction. Applying it requires an understanding of both burn-in testing and ESS; both techniques identify weak points and eliminate them by motivating early failures. Burn-in is usually a long process of operating under load(s) at a fixed temperature (in short, a special case of ESS), although it can be run at varying loads and accelerated temperatures to achieve a shorter burn-in period. ESS, by contrast, is a scientifically planned and conducted test, usually run under accelerated loads, that produces the same test/use results in a shorter period of time by increasing the stress on the components or assemblies. The objective of these screens is to produce a failure-free product when it is released into operation. ESS is not intended as a test to validate compliance with a design; it is intended to force latent defects to surface as observable failures before the end user finds them in day-to-day usage.


Pareto Distribution

Vilfredo Pareto was an Italian economist in the late 1800s who described the unequal distribution of wealth in the world. The Pareto distribution is named after him.



Failure is the loss of function when you needed the function to occur.


Data is the informational energy that runs the reliability improvement machine. Data is acquired at great cost, so it needs to be retained and used to prevent future failure events. Proper use of data provides an understanding of failure mechanisms and prevents recurrence of bad events that cause safety or high-cost failures. Reliability data requires a definition of failure. Failures can be catastrophic or a slow degradation; you decide by defining the failure. The units of measure for the data must be the units of the degradation: sometimes hours, sometimes miles, and so forth; in short, whatever motivates the failure. An item's life record always ends either with a failure or with a removal from service in some aged condition, and the latter generates a category of data called a suspension or censored data. Data is information in the form of facts, figures, or engineering databases obtained from engineering tests, experiments, or actual operating conditions. Reliability data is often incomplete because the exact times to failure are rarely known or recorded with much precision, so only partial information is available for analysis. Reliability data comes in two forms: 1) age-to-failure data, and 2) censored/suspended data, which occurs when unfailed items are removed from service or when items fail from a failure mode other than the one being studied. Suspensions are useful information and part of the data set. Some data is better than no data for resolving reliability issues.
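To show why suspended data belongs in the data set, the sketch below estimates mean time between failures under the simplifying assumption of a constant failure rate (exponential lives), where the standard estimate is total accumulated age on all units divided by the number of failures. The record type, field names, and ages are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LifeRecord:
    age: float    # accumulated life units (hours here) when the record ended
    failed: bool  # True = age-to-failure, False = suspension (censored)


def mtbf_estimate(records: list[LifeRecord]) -> float:
    """Total age on all units / number of failures (constant-rate assumption)."""
    failures = sum(1 for r in records if r.failed)
    if failures == 0:
        raise ValueError("at least one failure is needed for this estimate")
    total_age = sum(r.age for r in records)  # suspensions still add exposure
    return total_age / failures


# Hypothetical data set: two failures and two unfailed units removed early.
records = [
    LifeRecord(1_200.0, True),
    LifeRecord(3_000.0, False),
    LifeRecord(2_600.0, True),
    LifeRecord(4_200.0, False),
]

with_suspensions = mtbf_estimate(records)
failures_only = mtbf_estimate([r for r in records if r.failed])
print("using failures and suspensions:", with_suspensions)
print("using failures only           :", failures_only)
```

Discarding the suspensions throws away the running time of units that did not fail, so the failures-only figure understates the life actually demonstrated.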

FMEA is part of the Reliability Strategy Development toolbox

Failure Mode and Effect Analysis - FMEA

Failure mode and effect analysis (FMEA) is the study of potential failures that might occur in any part of a system, carried out to determine the probable effect of each failure on all other parts of the system and on probable operational success.


Quality Function Deployment

Quality function deployment (QFD) is a bad translation of a good reliability technique for getting the voice of the customer into the design process, so that the product delivered is the product the customer desires.

Total Productive Maintenance

Total productive maintenance (TPM) is a corporate-wide effort involving all employees to use equipment to its maximum limit. It employs an equipment-oriented management concept to reduce failures and increase productive utilization of equipment and processes. TPM programs are teamwork programs and require a corporate culture of teamwork devoid of us-versus-them issues. All employees are expected to accept ownership of the equipment and processes and to do many small things all the time to ensure high levels of availability by eliminating failures in their early stages with low-cost actions. The employees approach the process equipment as owners rather than renters.


Poisson Distribution

Poisson distributions are discrete distributions that describe the simplest statistical process: events occur randomly in time at a stable average rate, and the distribution gives the probability of observing a given count of those events.
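For a stable average of m events in an interval, the Poisson probability of exactly k events is exp(-m) * m^k / k!. The sketch below evaluates that formula and uses it for a spare-parts style question; the average of 2 failures per year and the stock of 3 spares are hypothetical numbers.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the expected count is `mean`."""
    if k < 0 or mean < 0:
        raise ValueError("k and mean must be non-negative")
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_at_most(k: int, mean: float) -> float:
    """Probability of k or fewer events."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


# Hypothetical case: failures average 2 per year and 3 spares are on hand.
mean_failures = 2.0
spares = 3
coverage = prob_at_most(spares, mean_failures)
print(f"chance that {spares} spares cover the year: {coverage:.3f}")
```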



A failure reporting and corrective action system (FRACAS) is an organized database that aids in solving reliability problems through a common-sense approach: systematically and permanently removing failure mechanisms.
