Reliability growth models are important management tools for making reliability visible with simple displays. On a log-log plot of cumulative failures (Y-axis) against cumulative time (X-axis), the data often fall on a straight line, and the slope b of the trend line is highly significant for telling whether failures are coming faster (b>1), which is undesirable; slower (b<1), which is desirable; or with neither improvement nor deterioration (b=1), which usually drifts toward undesirable results. Reliability growth plots are frequently called Crow-AMSAA plots in honor of Larry Crow, who proved why the charts work, as described in MIL-HDBK-189, when he worked with AMSAA.
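The straight-line behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the method from MIL-HDBK-189: it assumes the power-law form N(t) = lam * t**b and fits it by ordinary least squares on the logarithms. The function names, the sample data, and the tolerance used to call a slope "about 1" are all hypothetical choices made for the example.

```python
import math

def fit_crow_amsaa(cum_times, cum_failures):
    """Fit N(t) = lam * t**b by least squares on (log t, log N).

    Returns (lam, b), where b is the slope of the log-log trend line.
    """
    xs = [math.log(t) for t in cum_times]
    ys = [math.log(n) for n in cum_failures]
    count = len(xs)
    x_bar = sum(xs) / count
    y_bar = sum(ys) / count
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    lam = math.exp(y_bar - b * x_bar)
    return lam, b

def describe_trend(b, tol=0.05):
    """Label the slope; tol is an arbitrary band around b = 1 (an assumption)."""
    if b > 1.0 + tol:
        return "deteriorating"   # failures coming faster
    if b < 1.0 - tol:
        return "improving"       # failures coming slower
    return "no change"

# Hypothetical data: cumulative hours at which the 1st..6th failures occurred.
cum_times = [45.0, 140.0, 300.0, 560.0, 900.0, 1400.0]
cum_failures = [1, 2, 3, 4, 5, 6]

lam, b = fit_crow_amsaa(cum_times, cum_failures)
print(f"lam = {lam:.4f}, b = {b:.3f}, trend: {describe_trend(b)}")
```

Because the gaps between failures in the sample data keep widening, the fitted slope comes out below 1, which is the desirable case.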
For expensive components and expensive tests, sudden death tests load a few components into a test frame and run several items at the same time under the same heavy test loads and conditions. When one of the items fails, the entire test frame is shut down, so you have 1 failure (this is the sudden death!) and several suspensions, because the unfailed units are survivors when the test is halted. The test frame is then loaded with new samples and the life test resumes. Freeing the test frame at the first failure (instead of tying up the frame until all samples have failed) is cost effective. If three units can be tested simultaneously and each run is halted on the first failure, then testing 12 samples will yield only 4 failures and 8 suspensions for preparing the Weibull analysis. Will the results from the 4 failures + 8 suspensions data set differ from those obtained if all 12 samples had been run to failure? Yes, they will differ, but not significantly. So, as with simultaneous testing, the suspensions (censored data) become important details for use in the statistical analysis. Most sudden death tests are accelerated to generate the data in a short period of time, although this carries the risk of introducing unexpected failure modes (which can also be useful information for anticipating field failures).
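To show how the suspensions enter the Weibull analysis, here is a minimal sketch using median rank regression with Johnson's adjusted ranks and Benard's approximation. It is one common convention, not necessarily the one the author has in mind (maximum likelihood is another option), and it regresses Y on X for simplicity. The first-failure times and all function names are hypothetical.

```python
import math

def adjusted_ranks(items):
    """items: list of (time, failed). Returns [(time, adjusted_rank)] for failures.

    Johnson's method: suspensions shift the rank of every later failure.
    """
    # Sort by time; at tied times the failure is placed ahead of the suspensions.
    ordered = sorted(items, key=lambda it: (it[0], not it[1]))
    n = len(ordered)
    prev_rank = 0.0
    ranks = []
    for position, (time, failed) in enumerate(ordered, start=1):
        if failed:
            reverse_rank = n - position + 1
            prev_rank = (reverse_rank * prev_rank + (n + 1)) / (reverse_rank + 1)
            ranks.append((time, prev_rank))
    return ranks

def fit_weibull_mrr(items):
    """Return (beta, eta): Weibull shape and characteristic life."""
    n = len(items)
    xs, ys = [], []
    for time, rank in adjusted_ranks(items):
        median_rank = (rank - 0.3) / (n + 0.4)   # Benard's approximation
        xs.append(math.log(time))
        ys.append(math.log(-math.log(1.0 - median_rank)))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    eta = math.exp(x_bar - y_bar / beta)
    return beta, eta

# Hypothetical sudden death test: 4 runs of 3 units, each run halted at the
# first failure, giving 1 failure and 2 suspensions at the same time per run.
first_failure_hours = [120.0, 210.0, 95.0, 160.0]
data = []
for t in first_failure_hours:
    data += [(t, True), (t, False), (t, False)]

beta, eta = fit_weibull_mrr(data)
print(f"failures = 4, suspensions = 8, beta = {beta:.2f}, eta = {eta:.0f} hours")
```

With 12 items in total, the four failures receive adjusted ranks of 1, 2.2, about 3.74, and about 6.06 rather than 1 through 4, which is how the eight survivors push the fitted line toward longer life.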
Software does not wear out, but it does fail. Most failures are due to specification errors and code errors, with only a few due to errors in copying or use. The only software repair is reprogramming, and adding safety factors is almost impossible. Software reliability improves by finding errors and fixing them, but estimating the number of errors that cause failures is extremely difficult, because many branches of software code may lie dormant and unused until special events occur to make the latent failures obvious. Software failures are not often time related but depend more on which pages of code are executed. Software reliability is improved by extensive testing to disclose the failures, fixing them, and then repeating the test to validate that the fix did not generate more failures and to continue the search for other latent defects.
Configuration control is involved with the management of change by providing traceability of failures back into the design standard. If the design details are not specified, the design will not contain the requirements and thus implementation of the project will be hit or miss for achieving the desired end results beginning with the conceptual design and resulting in the operating facility.
All actions necessary, both technical and administrative, for retaining an item in or restoring it to a specified condition so it can perform a required function. The actions include servicing, repair, modification, overhaul, inspection, reclamation, and restored condition determination.
For reliability success, loads must always be less than strengths. When loads are greater than strengths, failures occur. The issue is determining the probability of load-strength interference, which is the joint probability that loads exceed strengths. The loads should include expected conditions, plus the foolishness of people who violate rules and overload equipment, plus the vagaries of Mother Nature, which impose unexpected static and dynamic loads from hurricanes, tornadoes, earthquakes, wildfires, and so forth.
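As a worked illustration of load-strength interference, the sketch below assumes that load and strength are independent and normally distributed, which the text above does not claim; real loads and strengths often follow other distributions. Under that assumption the margin (strength minus load) is also normal, and the probability of failure is the probability that the margin is negative. The function names and the numbers are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interference_probability(load_mean, load_sd, strength_mean, strength_sd):
    """P(load > strength) for independent normal load and strength."""
    margin_mean = strength_mean - load_mean
    margin_sd = math.hypot(load_sd, strength_sd)  # sqrt(load_sd^2 + strength_sd^2)
    return normal_cdf(-margin_mean / margin_sd)

# Hypothetical values in consistent units (for example MPa).
p_fail = interference_probability(load_mean=300.0, load_sd=40.0,
                                  strength_mean=500.0, strength_sd=30.0)
print(f"probability of interference = {p_fail:.2e}")
print(f"reliability = {1.0 - p_fail:.6f}")
```

Here the mean margin is 200 with a standard deviation of 50, so failure sits four standard deviations out. Widening the load scatter to cover overloads and natural events raises the interference probability quickly even when the means stay the same.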
Lognormal distributions are continuous life functions that have long tails to the right (positive skewness) in time or usage. A lognormal distribution plotted on semi-log paper, with time or usage on the logarithmic axis, appears as a normal curve.
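The right skew can be seen numerically. In the small sketch below, mu and sigma are the mean and standard deviation of ln(t), and the values chosen are hypothetical. The long right tail pulls the mean above the median, while on the logarithmic scale the two coincide, as for any normal curve.

```python
import math

def lognormal_median(mu, sigma):
    """Median of a lognormal whose ln(t) has mean mu and std dev sigma."""
    return math.exp(mu)

def lognormal_mean(mu, sigma):
    """Mean of the same lognormal; exceeds the median whenever sigma > 0."""
    return math.exp(mu + sigma ** 2 / 2.0)

# Hypothetical life distribution: median life of 1000 hours, sigma of 0.8.
mu = math.log(1000.0)
sigma = 0.8

median_life = lognormal_median(mu, sigma)
mean_life = lognormal_mean(mu, sigma)
print(f"median = {median_life:.0f} hours, mean = {mean_life:.0f} hours")
```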
Reliability-centered maintenance (RCM) is a systematic planning process used to determine the maintenance requirements for a system. RCM assumes the system has an inherent reliability, and maintenance requirements are imposed upon the baseline of inherent safety and inherent reliability, which can be no better than what was designed into the system.
Reliability is the probability that a device, system, or process will perform its prescribed duty without failure for a given time when operated correctly in a specified environment.
The International Electrotechnical Commission (IEC) defines dependability as "Dependability describes the availability performance and its influencing factors: reliability performance, maintainability performance and maintenance support performance." MIL-HDBK-338 defines dependability differently, as a measure of the degree to which an item is operable and capable of performing its required function at any (random) time during a specified mission profile, given that the item is available at mission start. (Item state during a mission includes the combined effects of the mission-related system R&M parameters but excludes non-mission time; see availability.) Dependability is related to reliability, with the intention that dependability be a more general concept than the measurable issues of reliability, maintainability, and maintenance.
A measure of how well the product performance meets objectives. In short, how well are the outputs actually accomplished against a standard? Capability is frequently calculated as the product efficiency * utilization.
The concept is derived from the human life experience involving infant mortality, chance failures, plus a wear out period of life since data for births and deaths is accumulated by government agencies. Most equipment lacks the birth/death recording by government agencies and most non-human systems can be regenerated to live/die many times before relegation to the scrap heap.
Publisher's note: When a person has a hammer - everything looks like a nail. Once a maintenance engineer learns techniques like Reliability Centered Maintenance (RCM) or Weibull analysis, it seems like they apply the technique to every potential area of failure they can find - whether RCM or Weibull analysis can add value or not. Reliability tools must be used in the proper context to create the best result and the more tools we understand the better we can apply them.
We asked our favorite reliability guru, Mr. H. Paul Barringer to help us understand what reliability tools are available to us as maintenance professionals, when we can and should use them and what results we can expect if we apply them correctly. - Terrence O'Hanlon, CMRP, Publisher