Reliability in a Safety Sensitive Environment

In an airline environment, there is only one acceptable standard - perfection. Safety is everything, and since reliability is a large factor in safety, it gets a lot of attention.

An airline looks at reliability at every level of the operation, from performance of an aircraft to performance of the individual piece-parts of that aircraft. Airlines look at the impact of everything that touches an aircraft. Everything is calculated and recalculated to determine the impact to the operation. Small changes can impact the operation in a large way, and those impacts have to be predicted and dealt with.

During the recent ban on liquids and gels by the TSA, the airline added more drinks to the standard catering of each aircraft, expecting that passengers who could no longer carry their own drinks on board would need more from the galley. After safety, passenger comfort and convenience come next, and running short of drinks on board would hurt both. The impact? Storing four extra cans of soda on every flight cost more than $80,000 annually in additional fuel burn.

A commercial aircraft gets, on average, about 17 man-hours of maintenance for every hour the aircraft is flown. Every time the aircraft is on the ground, someone is looking at it or servicing it. It is serviced every night, and on a regular schedule based on times and cycles. About every 5 years, an aircraft goes in for a "heavy check", where the airplane is gutted and almost every part on the aircraft is refurbished, repaired, overhauled, or replaced. A huge job. A "Light C" check performed on an MD-80 once a year requires 2,100 man-hours and three days to accomplish. A heavy check is significantly more, running four to six weeks of downtime.

Since I mentioned times and cycles in the previous paragraph, I should probably explain those two terms and their relevance. Some components are tracked for reliability against operating hours, while others are tracked against operating cycles. The difference is significant. An "hour" in this definition is a flight hour. A "cycle" in this definition is a takeoff, a flight, and landing.

For a component like a tire or a landing gear, cycles are used. Hours spent in flight have no impact on the reliability or wear of a tire or landing gear component, but the number of landings (cycles) does have an effect. For a component like an engine or a fuel pump, the number of landings (cycles) is irrelevant, but the number of operating hours (flight hours) is relevant.

This has a direct impact on the maintenance planning of an airline depending on their type of operation. Let's use Southwest Airlines and Qantas as examples. Southwest Airlines is an LCC (Low Cost Carrier) that specializes in short haul. The majority of their flights are about an hour (Dallas to Houston, for example), and they do many flight segments each day. In their type of operation, they will accumulate six or seven cycles in eight hours of flying.

Qantas is a major international long haul carrier (San Francisco to Sydney, for example). San Francisco to Sydney is about eighteen flight hours, during which they will accumulate only one cycle.

The significance of this is that Southwest will use more tires, brakes, landing gear, and other cycle controlled parts, where Qantas will use more time controlled parts and fewer cycle controlled parts.
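To put rough numbers on that difference, here is a minimal Python sketch that uses only the figures above: about seven cycles in eight flight hours for the short haul pattern, and one cycle in eighteen flight hours for the long haul pattern. The function name and the per-1,000-hour normalization are illustrative assumptions, not part of any airline planning tool.

```python
def landings_per_1000_flight_hours(cycles: float, flight_hours: float) -> float:
    """Cycles (landings) accumulated per 1,000 flight hours for an operating pattern."""
    return 1000 * cycles / flight_hours


# Figures taken from the examples in the text.
short_haul = landings_per_1000_flight_hours(cycles=7, flight_hours=8)
long_haul = landings_per_1000_flight_hours(cycles=1, flight_hours=18)

print(f"Short haul: {short_haul:.0f} landings per 1,000 flight hours")
print(f"Long haul:  {long_haul:.0f} landings per 1,000 flight hours")
```

On these numbers the short haul pattern puts roughly fifteen times as many landings on its tires, brakes, and landing gear for the same number of flight hours.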


Power-by-the-hour (PBH) is a concept that ties the price paid for a component to the performance of the component. This works quite well for some parts, and not at all for others. The concept is a simple one - pay less money for parts with low reliability. We know from reliability data about how long a component will last. Let's use an aircraft tire for this example. Just to keep the numbers simple, we'll use a base cost for the tire of $1,000 and say that the reliability data indicates this tire will last for 100 landings.

We make a deal with the manufacturer to pay them $10 for each landing (cycle) the tire makes before failing or wearing beyond limits. If a tire makes its reliability average of 100 landings, the vendor gets $1,000 for that tire. If a tire lasts 110 landings, the vendor gets $1,100 for that tire. If the tire fails after only 70 landings, the vendor gets only $700 for that tire.

This gives the vendor some real incentive to improve their reliability figures. If they can double their reliability, they can double the price they get for their component. Of course, these rates are renegotiated when reliability figures indicate significant changes.
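The tire arithmetic can be written out as a short Python sketch. It assumes the simplified numbers from the example (a $1,000 base cost and a 100-landing reliability average); the function names are hypothetical, and a real contract would carry terms this ignores.

```python
def rate_per_landing(base_cost: float, expected_landings: int) -> float:
    """Negotiated rate: base cost spread over the expected life in landings."""
    return base_cost / expected_landings


def pbh_payment(landings_achieved: int, rate: float) -> float:
    """Amount the vendor receives for one tire under the per-landing deal."""
    return landings_achieved * rate


rate = rate_per_landing(base_cost=1000, expected_landings=100)  # $10 per landing

for landings in (100, 110, 70):
    print(f"{landings} landings -> ${pbh_payment(landings, rate):,.0f}")
# 100 landings -> $1,000
# 110 landings -> $1,100
# 70 landings -> $700
```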


The Ship-or-Shelf program holds a removed part in quarantine until we know whether it was really at fault. Let's say we have an aircraft system that includes four parts: a pump, a regulator, a valve, and a controller. This system operates the redundancy widget. The aircraft lands at a remote airport, and the pilot writes up a problem with the redundancy widget.

The technician there changes the regulator and sends the airplane on its way. The regulator is not immediately shipped to a repair vendor, but is instead placed into parts quarantine.

If the redundancy widget performs as advertised for the next three days, changing the regulator has fixed the problem. The regulator is shipped to the repair vendor at that point.

However, suppose that a day or two later another pilot writes up the same system for the same problem, and this time a technician at another station changes the controller. Three days after the controller change, the system is still working fine. The controller change fixed the problem, and the regulator was obviously not at fault.

The regulator is now listed as "time continued" and put back into stock for re-issue. Had the part gone directly to the repair vendor when it was removed, it would have come back as "no-fault-found" from the vendor - along with a bill for a bench check, of course. A properly functioning component was placed back into stock, thus saving the airline time and money. Obviously, there are some components that do not lend themselves well to the Ship-or-Shelf program, but for those that do, significant savings are realized.
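The quarantine decision can be sketched in a few lines of Python. This is a simplified illustration, not the airline's actual procedure: it assumes a three-day window as in the example, and it treats any repeat write-up of the same problem inside that window as evidence that the removed part was not at fault. The function name, the status strings, and the dates are all hypothetical.

```python
from datetime import date, timedelta

QUARANTINE_DAYS = 3  # observation window used in the example above


def disposition(removed_on: date, repeat_writeups: list[date], today: date) -> str:
    """Decide what to do with a part sitting in quarantine."""
    window_end = removed_on + timedelta(days=QUARANTINE_DAYS)
    # Same problem written up again inside the window: the part did not fix it.
    if any(removed_on < d <= window_end for d in repeat_writeups):
        return "time continued"      # back into stock for re-issue
    if today >= window_end:
        return "ship to vendor"      # the fix held, so the part was at fault
    return "hold in quarantine"      # too early to tell


# Hypothetical dates, for illustration only.
print(disposition(date(2024, 1, 1), [date(2024, 1, 2)], date(2024, 1, 2)))  # time continued
print(disposition(date(2024, 1, 1), [], date(2024, 1, 4)))                  # ship to vendor
```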

The TCAS Adventure...

Commercial aircraft have a Traffic Collision Avoidance System installed that utilizes a TCAS Computer as a major component. Reliability detected a sudden and sharp rise in TCAS Computer usage. Teardown reports indicated that the computers were damaged by liquid, and the vendor considered it to be Customer Induced Damage and therefore not covered under warranty.

Investigation revealed that liquid from the aircraft galley was leaking through the floor seams and dripping onto the TCAS computer, which was located under the galley floor.

The source of the liquid was the galley trashcan. The trash bags leaked and allowed liquid to run onto the floor, through the floor seams, and onto the TCAS computer. As this was a sudden spike in usage, further investigation was accomplished.

A couple of factors came into play. First, there had been a change in the In-Flight procedures that called for the Flight Attendants to periodically dump out the coffee and make fresh. It was dumped in the only place available, the galley trashcan. That procedure had been in place for some time before the spike in TCAS usage, so we looked further.

We then discovered that in an effort to be environmentally and financially conscious, the airline had entered into a contract with a new provider of biodegradable trash bags that coincidentally cost less than the old ones.

The new trash bags were not as durable as the old ones, and they leaked. The combination of these two items had been the root cause of the TCAS usage spike.

But wait - it gets better. Maintenance first suggested going back to the old style trash bags. Purchasing said no, as they had a contract. Maintenance then suggested double bagging the trashcans. Purchasing didn't like that idea because it would effectively double trash bag usage and hence costs. Ground services didn't like it because it meant a few seconds of extra work for them.

Currently, our Engineering group is looking at modifying the airplane to put an "umbrella" over the TCAS unit to prevent it from being damaged.

In dealing with reliability, one of the things you need to consider is the significance and rationalization of an event or occurrence. You have to put things in perspective. For example, some things are seasonal. It is not coincidence that the airline goes through more tires and brakes in the summer than in the winter, nor is it coincidental that the airline has more issues with aircraft deicing systems in the winter months than in the summer months. So - if I detect a spike in deicing component usage in October, I don't necessarily flag it as an issue.
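One way to build that perspective into a check is to compare a month against the same month in earlier years rather than against the month before. The Python sketch below is only an illustration of that idea: the function name, the 25 percent tolerance, and the removal counts are all invented for the example and do not come from airline data.

```python
from statistics import mean


def is_unusual(current: float, same_month_history: list[float], tolerance: float = 0.25) -> bool:
    """Flag usage only if it exceeds the same-month average by more than the tolerance."""
    baseline = mean(same_month_history)
    return current > baseline * (1 + tolerance)


# Hypothetical October deicing component removals from three prior years.
october_history = [40, 44, 42]

print(is_unusual(45, october_history))  # False: a normal seasonal rise
print(is_unusual(60, october_history))  # True: worth a closer look
```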

Likewise, you have to look at operational issues with regard to reliability. If the airline parks 35 aircraft in the desert for long term storage, does that impact my reliability numbers? If the airline doubles the number of flights in a given city, does it impact my reliability numbers? If there is a major construction project on the ramp near the airline gates in Los Angeles, can it affect my reliability and dispatch numbers?

Personnel issues can also impact the reliability numbers. For example, a flight crew arrives at the airplane ten minutes late and during the preflight notices that the oxygen needs to be serviced. They call maintenance to perform the service. Maintenance arrives and services the oxygen system, but since they were called late the aircraft misses the D-0 departure time. Maintenance takes the "hit" for causing the delay due to the late servicing of the aircraft. This completely ignores the fact that had the flight crew arrived on time, maintenance would have been contacted ten minutes earlier and the aircraft would have departed on time. Where does the responsibility for this delay truly belong? On maintenance - or on flight?

In an environment where anything less than perfection is unacceptable, reliability is a major undertaking. Far more than I can cover in the limited time available, but perhaps you have a better understanding of the challenges now.

by Bill Brinkley AP / IA,
American Airlines
