by Thomas Van Hardeveld
Reliability has become such an integral expectation in our society that it is difficult to imagine a world where things do not work as expected. The word reliability was first used by the poet Samuel Taylor Coleridge, who bestowed it on his friend, the poet Robert Southey, to praise his steadfastness [1]. From this seemingly insignificant beginning, reliability has grown into a broadly accepted, if not entirely understood, property that everyone expects in a wide range of situations. Online searches for reliability and related terms return thousands of references in papers and manuscripts and millions of hits on the Internet.
The Origins of Reliability
The main pillars of reliability are the concepts of probability and statistics, which emerged in the 17th century from the work of two Frenchmen, Blaise Pascal and Pierre de Fermat. The need for quality became apparent with mass production, and this need evolved into statistical quality control and, later, statistical process control in the 1920s.
Reliability emerged as an engineering discipline around the 1950s, with the vacuum tube and its many failures acting as a catalyst. A key moment was the formation of the Advisory Group on Reliability of Electronic Equipment (AGREE), jointly established in 1952 by the U.S. Department of Defense and the American electronics industry. The AGREE report of June 4, 1957, assured all the armed services that reliability could be specified, allocated and demonstrated, and reliability engineering has existed as a discipline ever since. The first conference on quality control and reliability (of electronics) was held in 1954 and its proceedings evolved into a journal that is still published by the Institute of Electrical and Electronics Engineers (IEEE) as the IEEE Transactions on Reliability. Another important development was the work of Waloddi Weibull, who pioneered the flexible statistical distribution that now carries his name.
Reliability gained further prominence in the 1960s, when many military standards (MIL-STD) and specifications were developed to meet the design and implementation needs of defense production in the United States. Industry worldwide came to accept these military standards as the leading source of reliability knowledge and practices. The best-known reference is MIL-HDBK-217, Reliability Prediction of Electronic Equipment, which has been adopted in many countries and used by industry organizations as the framework methodology and basis for failure rate estimation. Other methods for testing, reliability growth and reliability analysis also originated from military standards.
Reliability engineering now encompasses statistical methods; techniques such as failure mode and effects analysis (FMEA) and fault tree analysis; physics of failure; hardware, software and human reliability; probabilistic or quantitative risk assessment; and reliability growth and prediction, to name only a few. Databases of reliability information have been widely established and their use has increased dramatically. Practically every engineering discipline treats these aspects as a key component of business success.
The term reliability now has a much broader meaning. It includes not only the specific meaning of reliability as the probability that an item will perform without failure, but also the related concepts of availability, maintainability, supportability, safety, integrity and a host of other terms. This has led to a proliferation of aggregate terms, such as reliability and maintainability (R&M); reliability, availability and maintainability (RAM); RAMS, where the additional "S" is safety or sometimes supportability; and dependability, which is the term used by international standards.
In 1965, the International Electrotechnical Commission (IEC) established a technical committee (TC56) to address reliability. The initial title of IEC/TC56 was "Reliability of Electronic Components and Equipment." In 1980, the title was amended to "Reliability and Maintainability" to address reliability and associated characteristics applicable to products. In 1989, the title was further changed to "Dependability" to better reflect technological evolution and business needs across a broader scope of applications, with dependability serving as an umbrella term. In 1990, following consultations with the International Organization for Standardization (ISO), it was agreed that the scope of TC56's work should no longer be limited to the electrotechnical field, but should address generic dependability issues across all disciplines, thus making IEC/TC56 what is referred to as a horizontal committee.
The scope of IEC/TC56 [2], according to its strategic business plan, covers the generic aspects of dependability management, testing and analytical techniques, software and system dependability, lifecycle costing and technical risk assessment. This includes standards and application guides related to topics such as system and component reliability, maintainability and supportability, dependability of systems, technical risk assessment, integrated logistics support, dependability management and management of obsolescence.
The Concept of Dependability
Dependability is the "ability to perform as and when required" [3]. It applies to any physical item, such as a system, product, process or service, and may involve hardware, software and human actions or inactions. Dependability is a collective set of time-related performance characteristics that coexist with other requirements of a system, such as output, efficiency, quality, safety and integrity, and, in fact, enhance them [4].
Dependability does not have a single measure that can be attributed to it, but is instead a combination of relevant measures that vary with application. In a broad sense, dependability is trusting an item to provide its required functionality and expected value and benefits.
Dependability is the term that has been adopted internationally to cover the main attributes of availability, reliability, maintainability and supportability (see Figure 1). Quite often, the term reliability is used as a blanket term to include all these attributes. This proliferation of terms leads to considerable misunderstanding of this important engineering discipline, thus adding to the need for standardization.
The main dependability attributes of an item are:
- Reliability for continuity of operation;
- Maintainability for ease of preventive and corrective maintenance actions;
- Supportability for provision of maintenance support and logistics needed to perform maintenance;
- Availability for readiness to operate.
Reliability is an inherent result of the design and is sustained by proper operation within prescribed conditions of use and by appropriate maintenance. Maintainability is primarily a function of an item's design and installation: it depends on the system design architecture and technology implementation and is guided by maintenance strategies. Supportability is the ability of an item to be supported from a maintenance perspective and consists of two components: maintenance support and the logistics required to deliver that maintenance support. The starting point for supportability is the maintainability of the item, which is then enabled with the specific resources and logistics necessary for the use of the item. Availability is the result of a combination of reliability, maintainability and supportability appropriate for the application.
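As a rough illustration of how availability combines the other three attributes, the sketch below uses two common steady-state approximations. They are assumptions brought in for illustration and are not taken from this article: reliability is summarized as a mean time between failures (MTBF), maintainability as a mean time to repair (MTTR), and supportability as a mean logistic delay. The function names and the numbers are hypothetical.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability from reliability (MTBF) and maintainability (MTTR) only.

    Assumed steady-state approximation: uptime / (uptime + repair time).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


def operational_availability(
    mtbf_hours: float, mttr_hours: float, logistic_delay_hours: float
) -> float:
    """Availability when supportability is included.

    The mean logistic delay (waiting for spares, crews, tools) is added
    to the repair time to give the total mean downtime per failure.
    """
    mean_downtime = mttr_hours + logistic_delay_hours
    return mtbf_hours / (mtbf_hours + mean_downtime)


# Hypothetical figures for a single item, in hours.
mtbf = 2000.0   # reliability: mean time between failures
mttr = 8.0      # maintainability: mean time to repair
delay = 40.0    # supportability: mean logistic delay

a_i = inherent_availability(mtbf, mttr)
a_o = operational_availability(mtbf, mttr, delay)

print(f"Inherent availability:    {a_i:.4f}")
print(f"Operational availability: {a_o:.4f}")
```

With these illustrative numbers, the logistic delay lowers availability far more than the repair time itself, which is one way to see why supportability is treated as a separate attribute rather than folded into maintainability.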
Thus, dependability is a general term that provides a framework for these attributes, as well as others, such as recoverability, durability, operability and serviceability. Safety is not considered a direct attribute of dependability, although the two are closely related. Safety is enhanced when dependability is integrated into the design and operation of an item.
Dependability and Risk and Asset Management
With the recent publication of the ISO 55000 suite of standards, increasing emphasis is being placed on the concept and practice of asset management. Lifecycle management is the basis of asset management, including lifecycle costing and financial aspects. Risk management is also considered a major focus of asset management. Dependability shares most of the aspects of asset management, including risk management, the lifecycle, information management and quality. Without proper consideration of dependability, asset management objectives cannot be achieved.
International standards are now leading the way in continuing to improve the very high levels of dependability that have already been achieved.
1. Saleh, J.H. and Marais, K. "Highlights from the early (and pre-) history of reliability engineering." Reliability Engineering and System Safety, Volume 91, 2006: 249-256.
2. International Electrotechnical Commission. TC56 Dependability website: www.iec.ch/tc56.
3. International Electrotechnical Commission. "International Electrotechnical Vocabulary - Chapter 191: Dependability and quality of service." IEC 60050-191.
4. Van Hardeveld, T. and Kiang, D. Practical Application of Dependability Engineering. New York: ASME Press, 2012.
Thomas Van Hardeveld has 40 years of experience in all aspects of the operation and maintenance of gas turbines, compressors and other gas transmission and process equipment. He is a specialist in maintenance management for all types of equipment, as well as in reliability techniques and risk and integrity management, and he conducts training courses on a variety of rotating equipment and asset management topics. Tom is actively involved in standardization activities with the IEC/TC56 Committee on Dependability.