Asset Reliability or System Reliability?

When explaining reliability, I usually ask this question: Where is an airplane naturally located, in the air or on the ground? Some people pick the first, because an airplane is made for flying, whereas others, influenced by the law of gravity, opt for the second. There are large graveyards of airplanes in the Mojave Desert, and I haven’t seen any of them in the sky yet. That’s because the airplane is naturally on the ground, and if it flies, it’s only because everything is working properly. Then, I ask the second question: What is “everything”? It’s at this point that a doubt arises: Is it the reliability of the asset or of the system?

Let’s agree that a system is a combination of means, such as people, materials, equipment, software, facilities, data, etc., integrated in a way that they can perform a certain function in response to a specific need. In short, they are integrated means for the achievement of an objective. Every system is immersed in a context that conditions it, has a core that must achieve the objectives and has enablers that allow the core to do its work.

Let’s go back to the example of the airplane. As an asset, it is intrinsically reliable, but it is not easy to maintain and is difficult to operate; in fact, not everyone can be a pilot. It also has a “problem” with the context: During its operation, it changes its environment. It goes from the ground, where it is by nature, to the air, where it is because “everything” works, and if some part of that “everything” fails, it returns to its natural location, the ground. The problem is how it returns: it can land normally, it can make a forced landing or it could, unfortunately, crash.

Woodhouse¹ postulated Operational Reliability a few years ago. It is an excellent deduction that, in my opinion, is very focused on assets, but it was also very useful to me in highlighting what an efficient system should be. My professional training has given me a systemic vision of things, and I am convinced that, for a system to be reliable, all its components must be reliable and each one of them must be in balance with the rest, with all of them operating within the same context that conditions them. Based on this, I postulated System Equilibrium, where the focus is on the complete system immersed within the context in which it must operate. This concept is shown in Figure 1.

Figure 1

This figure represents the components of the “everything,” which must be as reliable as the asset itself. In this scheme, the reliability of the assets is present through their intrinsic attributes — reliability, maintainability and operability — which are achieved from the design and must be sustained during their useful life. There are also people, whose reliability is defined by their competencies and the leadership that drives them; processes and their associated procedures, which are executed by people who must understand and operate within them; and finally, supportability, which I will explain in greater detail. Please note that, while availability of assets is not included in this diagram, this topic will be discussed later.

Supportability was introduced in 1997 by the U.S. Department of Defense in MIL-HDBK-502, Acquisition Logistics, and defined as “the degree to which system design characteristics and planned logistics resources meet system peacetime and wartime requirements.” Later, it was added that “Supportability is the capability of a total system design to support operations and readiness needs throughout the system’s service life at an affordable cost.” Supportability intervenes at three points in the system life cycle:

  1. When assets are designed to be “supportable” throughout their service life;
  2. When the support system is designed and incorporated, which allows the system to be sustained throughout its useful life; and
  3. When the support is operated throughout the life of the system, so that it is available when needed.

And all this needs to be achievable at an affordable cost.

The DOD² Guide to Achieving Reliability, Availability and Maintainability states that satisfactory system performance is measured in terms of RAM, which “refers to three related characteristics of a system and its operational support: reliability, availability, and maintainability.” It also notes that “Designing for RAM should address not only the system but also: the processes used to manufacture the system, the expected maintenance system, logistics system, and the operational constraints” and adds that “systems engineering activities can be directed to designing and manufacturing reliability and maintainability into the system, but availability is the function of this inherent reliability and maintainability as well as the system's supportability and producibility.”

In short, availability is sustained on the basis of the inherent attributes of the assets and the supportability provided by the system. When the reliability of the human resources and processes works in synergy with these, in line with the operational context in which the system is working, the system reaches balance and, as a consequence, becomes reliable. Figure 2 illustrates this statement.

Figure 2
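
The dependence of availability on both the inherent attributes and supportability can be illustrated with a minimal sketch. It uses commonly cited steady-state ratios that are not given in this article: inherent availability as MTBF / (MTBF + MTTR), and a simplified operational availability in which a mean logistics delay time (MLDT) is added to the downtime. The function names and all numeric values are hypothetical and serve only as an illustration.

```python
def inherent_availability(mtbf_h: float, mttr_h: float) -> float:
    """Availability driven only by design attributes (reliability, maintainability)."""
    return mtbf_h / (mtbf_h + mttr_h)


def operational_availability(mtbf_h: float, mttr_h: float, mldt_h: float) -> float:
    """Same asset, but downtime also includes waiting for parts, people or transport."""
    return mtbf_h / (mtbf_h + mttr_h + mldt_h)


# Illustrative values (assumptions, not from the article), in hours.
mtbf_h, mttr_h = 500.0, 5.0

print(f"Inherent availability: {inherent_availability(mtbf_h, mttr_h):.3f}")
for mldt_h in (0.0, 20.0, 95.0):
    a_o = operational_availability(mtbf_h, mttr_h, mldt_h)
    print(f"MLDT {mldt_h:5.1f} h -> operational availability {a_o:.3f}")
```

With these assumed numbers, the same asset drops from about 0.99 to about 0.83 as the support delay grows, even though its design attributes have not changed.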

The main tool for analyzing supportability is Integrated Logistics Support (ILS), which Benjamin Blanchard³ defined as “A disciplined, unified, and recurring approach to the management and technical activities necessary to:

  • Develop support requirements that are aligned with readiness objectives, with the design, and with each other;
  • Integrate support considerations into system and equipment design;
  • Acquire the required support; and
  • Provide the required support during operation with minimal cost.”

When is it necessary to conduct a supportability analysis? For example, in the following cases:

  1. When designing an asset.
  2. When incorporating a new asset into a system.
  3. When designing a system, such as an element that will be deployed to a remote and isolated location to build a gas pipeline.
  4. When planning an upgrade or modernization of an asset.

Going back to the “aviation system,” the core asset of the system, the aircraft, can be very reliable and maintainable, but if the crew does not have the necessary skills and the company does not have leadership, a problem arises; if the processes and procedures are not adequate and/or the crew does not know them, a problem arises; if maintenance is not doing its job properly, supportability suffers and a problem arises; and if the aircraft is going to operate in an aggressive environment, another problem arises. It also needs to be remembered that these “reliabilities” are multiplied, not added, and each stays between 0 and 1, so the system can never be more reliable than its weakest component. In other words, if the aircraft, an asset with a reliability of 0.995 (99.5%), is in flight and problems arise with any of the other components of the system, the risk of it returning to the ground is high.
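
The multiplication of these reliabilities can be made concrete with a minimal sketch. It assumes the system behaves as a series chain of independent components, so that the system reliability is the product of the component reliabilities. Only the 0.995 figure for the aircraft comes from the text; the other values, the component labels and the function name `system_reliability` are hypothetical and chosen purely for illustration.

```python
from math import prod


def system_reliability(components: dict[str, float]) -> float:
    """Series model: the system works only if every component works."""
    for name, r in components.items():
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"{name}: reliability must be between 0 and 1, got {r}")
    return prod(components.values())


# Only the asset value (0.995) comes from the article; the rest are assumptions.
components = {
    "asset": 0.995,
    "people": 0.90,
    "processes": 0.90,
    "supportability": 0.90,
}

print(f"System reliability: {system_reliability(components):.3f}")
```

Under these assumed values, the system reliability falls to roughly 0.725 even though the asset itself is at 0.995.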

The only difference between the aviation system and the systems we usually operate is that, if something fails, ours don’t fall, because most of them don’t fly. That doesn’t mean that there are no risks. In fact, there are risks, and their consequences can affect people, the environment, the business and/or the organization’s objectives, and in some cases, luckily not frequently, they can be catastrophic. When a problem arises in a system, it is usually the enablers that are most affected.

For example, if the core has assets with low reliability, is difficult to maintain, is complicated to operate or is in an adverse environment, maintenance will have to put in more work to sustain availability. The strain will also spread to the other enablers: purchasing, because more spare parts will be needed; transportation, to carry them; human resources, because staff turnover is likely to be higher; and sales, which will have to justify production shortfalls.

The same happens if the core delivers defective products. The fault could be in the assets, in the incoming raw material, in the poor operation of a machine or in a poorly executed procedure. In such cases, maintenance should check the condition of the asset and, if necessary, correct it or explain to operations that an operator is doing their job incorrectly or that the raw material is not good, all while its workload increases. Additionally, purchasing, sales, transportation, etc., will suffer the consequences.

In short, in situations of system stress, it is supportability that will have to react so that the core can continue to fulfill its function.

Therefore, we can conclude that a system is balanced and works with maximum availability when all its components, and not just assets, are reliable, and that it is the support that must be adapted to achieve this balance.

----

1. Reliabilityweb.com. La Cultura de la Confiabilidad Operacional. https://reliabilityweb.com/sp/articles/entry/la-cultura-de-la-confiabilidad-operacional
2. Department of Defense (DOD). (2005). DOD Guide for Achieving Reliability, Availability, and Maintainability. USA.
3. Blanchard, B. (1995). Ingeniería Logística (p. 17). ISDEFE (Sociedad Estatal Ingeniería de Sistemas para la Defensa de España).