What constitutes a component renewal? Should unplanned renewals that were not credited by maintenance schedulers be counted? The answers depend on the objectives of the analyst. In practice, some corrective maintenance actions renew the life of a component as effectively as a planned restoration or replacement, yet are not always reported appropriately to maintenance schedulers. Also, equating a repair with a planned overhaul is a subjective decision, and if the repair falls short of a planned overhaul in scope and quality, error is induced. With that in mind, two approaches to studying the effects of aging are possible.
The first approach is to study physical component health, where the analyst will attempt through all means possible to account for any action that reverses the effects of component aging. Sometimes the analyst may find that a corrective maintenance record, which renewed the life of the component, was not credited as such in the maintenance management system. So, in addition to accounting for planned and unplanned renewals that were credited by maintenance schedulers, the analyst would ensure that unaccredited renewals were accounted for as well. At SUBMEPP, the engineer would reset the "lifecycle clock" to zero by denoting "Renewal Yes" within the appropriate record of the application. If the unaccredited renewal only renewed a specific component part, the renewal would be credited only when studying the component part in isolation. Anyone studying the component to improve design would utilize the physical component health approach.
The second approach is to study the effect of the component's time-directed maintenance plan on system health. This approach measures the effectiveness of the maintenance plan. Only those renewal records, scheduled or unscheduled, that are credited by planned maintenance schedulers are used to reset the component lifecycle clock. The analyst accepts that unplanned events during the lifecycle may reverse the effects of component aging and improve system reliability. Those events were not caused by the existence of an engineered component periodicity, nor did they influence the execution of the time-based maintenance plan. They are therefore treated as outside environmental influences, which may or may not affect system health. In fact, if non-credited renewals occurred at some relevant frequency and kept system reliability within a random failure pattern, a time-directed task would not be worth accomplishing, even though a physical relationship between component age and reliability actually existed. However, if it were known that improvements were being made to report and credit unscheduled renewals, and controls were established to lessen the subjectivity of crediting unscheduled maintenance, the first approach of studying physical component health would be appropriate. The bottom line: don't measure with a micrometer if the cut will be made with a saw.
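The difference between the two approaches can be sketched as a filter on maintenance records. The following is a minimal illustration only: the record fields (`renewed_life`, `credited`) and the function name are hypothetical and are not taken from SUBMEPP's application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MaintenanceRecord:
    # Hypothetical record layout, assumed for illustration only.
    when: date
    renewed_life: bool  # the action physically renewed the component
    credited: bool      # the renewal was credited by maintenance schedulers


def renewal_dates(records, approach):
    """Return the dates that reset the lifecycle clock to zero.

    approach == "physical": every action that renewed the component counts,
                            whether or not schedulers credited it.
    approach == "system":   only renewals credited by schedulers count.
    """
    if approach == "physical":
        keep = lambda r: r.renewed_life
    elif approach == "system":
        keep = lambda r: r.renewed_life and r.credited
    else:
        raise ValueError("approach must be 'physical' or 'system'")
    return sorted(r.when for r in records if keep(r))


# Made-up records: one credited overhaul, one uncredited repair that
# renewed the component, and one minor repair that renewed nothing.
records = [
    MaintenanceRecord(date(1992, 3, 1), renewed_life=True, credited=True),
    MaintenanceRecord(date(1995, 6, 1), renewed_life=True, credited=False),
    MaintenanceRecord(date(1996, 1, 1), renewed_life=False, credited=False),
]
physical_resets = renewal_dates(records, "physical")
system_resets = renewal_dates(records, "system")
```

Under the physical component health approach the uncredited 1995 repair resets the clock; under the system health approach it does not.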
Next, the population for each age interval must be calculated to compute the failure rate. Given a particular age interval, what is the number of components that could have had an observed failure? For instance, there are many Los Angeles class submarine components that have operated at least a year, but there are far fewer with twenty years of service life. A factor complicating this determination is the often limited duration of the corrective maintenance data "window". While sixty Los Angeles class boats have had components serve through at least one year of service time, many of those components served their first twelve-month age interval prior to 1989, and 3-M OARS data cannot be retrieved for dates earlier than that. The SSN 700 boat had components serve during their first year of service in 1982, but it is not known what failed during that time period.
Because the time span of 3-M OARS data has limits, and because an analyst in any industry may choose to study only a targeted calendar time frame, the beginning and end dates of the data window must be accounted for to enable accurate processing. The maintenance plan strategy for a particular component is often changed at a particular date, so for comparison, the analyst may wish to independently study age and reliability relationships both prior to and after the date of that change. By accounting for the data window span, the system will ignore those component service times outside of the window. An analyst should be careful not to confuse the time span of the data window with the overall age span of the population. The two are unrelated.
For example, the data window for a particular analysis may commence on January 1, 1990 and end on January 1, 2000. If the specified age interval duration was 12 months and the application was calculating the third age interval (25-36 months), then the population of components that experienced the age of 25 to 36 months at any time between January 1, 1990 and January 1, 2000 must be counted. All components whose lifecycles commenced between January 1, 1988 and January 1, 1997 would satisfy the requirement of having fully served the third age interval during that ten-year data window, if they indeed lasted that long (see figure 3). If an existing component was placed in service on January 1, 1980, it would experience an age span of 121 to 240 months during the data window. That would represent age intervals 11 through 20. Of course, component lifecycles usually don't start and end at the same time of year as the data window boundaries, so SUBMEPP's application accounts for this by calculating fractional populations. The population for a specific age interval is not always a whole number.
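The fractional population idea can be sketched as follows. This is an illustration under stated assumptions, not SUBMEPP's actual algorithm: the fraction is assumed to be the share of the age interval's calendar days that fall inside both the data window and the component's lifecycle, and the function names and the lifecycle layout (a start date and an optional end date) are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d` (day clamped to month end)."""
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def interval_fraction(start, end, interval, months_per_interval, win_start, win_end):
    """Fraction of a 1-based age interval that one lifecycle served inside the window.

    `start` is the lifecycle start (clock reset); `end` is the date of the next
    renewal or removal, or None if the lifecycle is still running.
    """
    lo = add_months(start, (interval - 1) * months_per_interval)
    hi = add_months(start, interval * months_per_interval)
    seg_start = max(lo, win_start)
    seg_end = min(hi, win_end) if end is None else min(hi, win_end, end)
    overlap_days = (seg_end - seg_start).days
    return max(overlap_days, 0) / (hi - lo).days


def population(lifecycles, interval, months_per_interval, win_start, win_end):
    """Sum the fractional contributions of all lifecycles for one age interval."""
    return sum(
        interval_fraction(s, e, interval, months_per_interval, win_start, win_end)
        for s, e in lifecycles
    )


# Data window from the example above; the lifecycles themselves are made up.
window = (date(1990, 1, 1), date(2000, 1, 1))
lifecycles = [
    (date(1988, 1, 1), None),  # serves all of months 25-36 inside the window
    (date(1997, 7, 1), None),  # about half of months 25-36 falls after the window
]
third_interval_population = population(lifecycles, 3, 12, *window)
```

With these two made-up lifecycles the third-interval population is roughly 1.5 rather than a whole number, because the second component contributes only the part of its third year that falls before the window closes.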
Figure 3. Third Age Interval Population
Constrained data windows can induce error if not accounted for properly. If the analyst is studying physical component health, the ages of components put into service prior to the start of the data window cannot be verified. In that case, those component lifecycles must be omitted from the study until the first known life renewal occurs within the data window. This is one of the reasons why most SUBMEPP analyses are conducted utilizing the system health approach.
Some people do not readily accept the premise that the entire life of some components in the study need not be observed. The process should be thought of as an age comparison. A good analogy for SUBMEPP's process is the study of a newspaper's obituary section. On any given day there are usually more eighty-year-olds listed than thirty-year-olds. If subsequent daily readings yielded the same result, the analyst may generally conclude that death at eighty is more likely than death at thirty. If the analyst then reviewed census data to estimate the regional population counts for those age groups, and normalized the results for the two age groups, the analyst's conclusions would be even more relevant. The analyst would not have to study eighty years' worth of newspaper obituaries to accurately conclude that the probability of death at eighty is higher than the probability of death at thirty.
Finally, when all variables are accounted for properly, the failure rate for each age interval is computed by dividing the total number of failures in that interval by the population for that interval.
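As a minimal sketch of that division, the following computes a rate per age interval. The counts and populations shown are made-up placeholders, not values from the paper's tables, and the function name is hypothetical.

```python
def failure_rates(failures, populations):
    """Failure rate per age interval: failures divided by (fractional) population.

    Returns None for an interval with no population, since no rate can be
    computed when no component was observed at that age.
    """
    return [
        (f / p) if p > 0 else None
        for f, p in zip(failures, populations)
    ]


# Hypothetical inputs for four age intervals (illustrative only).
example_failures = [3, 4, 2, 0]
example_populations = [60.0, 48.5, 20.0, 0.0]
example_rates = failure_rates(example_failures, example_populations)
```

Because populations are fractional, the denominators need not be whole numbers.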
Utilizing the Salvage Air Valve failure counts exhibited in table 1, the failure rates are calculated and displayed in table 2 based on actual age interval populations. Now the effects of age on reliability can be observed (figure 4). This is done through regression analysis where probability of failure is the dependent variable and age is the independent variable. SUBMEPP's application creates a scatter chart of plotted points and fits both a line and a 2nd order polynomial. The mathematical equation for each of these curves is generated as well. While a mathematical function can almost always be created from scattered data, its relevance depends on the results of a correlation analysis. The application therefore conducts a correlation analysis for both curves to show how well the plotted points fit about the generated curves. Coefficients of determination are calculated to indicate what portion of the data variance is explained by the independent variable (age).
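The regression step can be sketched in a few lines. This is an illustration assuming ordinary least squares solved through the normal equations; the paper does not say how SUBMEPP's application performs the fit, and the data points below are made up rather than taken from table 2.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: share of variance in ys explained by the fit."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot


# Hypothetical age intervals (years) and failure rates, for illustration only.
ages = [1, 2, 3, 4, 5, 6, 7, 8]
rates = [0.05, 0.04, 0.05, 0.06, 0.08, 0.11, 0.15, 0.21]
line = polyfit(ages, rates, 1)
quadratic = polyfit(ages, rates, 2)
r2_line = r_squared(ages, rates, line)
r2_quadratic = r_squared(ages, rates, quadratic)
```

Comparing the two coefficients of determination indicates whether the curvature of the 2nd order polynomial explains meaningfully more of the variance than the straight line does.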
Table 2. Failure Rate Computations
Figure 4. Age and Reliability Graph for SSN 688 Class Salvage Air Valves
Results
SUBMEPP's Reliability Centered Maintenance group supports the organization's Engineering division and Maintenance and Availability Planning Programs division. In the capacity of process owners, the group works collaboratively with submarine system maintenance engineers, in a team environment, to conduct RCM analysis on specific system components. Another aspect of the group's mission is to train engineers in all areas of RCM. The data analysis application was created to be utilized by either professional data analysts or maintenance engineers. Both arrangements have worked well and each has its own advantages. To date, Age and Reliability graphs have been generated for fifty-two submarine component types. Some of these components are as complex as communications equipment, refrigeration plants, turbine generators and towed array handling equipment. Simple but vital components have been analyzed as well, such as hull and backup valves, gas regulating valves, steam isolation valves and the ship's whistle. Air dehydrators, switchboards, circuit breakers, hatches, compressors, pumps, condensers, motor generators, torpedo tubes, atmosphere control equipment, and propulsion shaft bearings are all examples of the types of equipment comprising this paper's fifty-two component sample.
71% of the components profiled by SUBMEPP experienced a steady state of random failure after their early years of operation. Some of the components in this group did experience infant mortality or short-lived increases in their rates of failure. This compares generally well with the UAL (89%), Broberg (92%) and MSP (77%) studies. As mentioned previously, UAL and Broberg are based on aircraft. MSP and SUBMEPP are based on navy vessel components, so it is logical that SUBMEPP's results parallel MSP much more closely than UAL and Broberg. The concept of industry norms is reinforced here.
SUBMEPP's age and reliability characteristic findings are categorized in figure 5 based on sample population proportions. Only 12% of the sample supported the traditional belief that equipment operates at a steady state of reliability and then wears out at an identifiable time period. The remaining 17% that demonstrated age related wear out did so at an increasing but steady rate over their life span.
The differences between characteristics B and C may possibly be explained by the complexity of the component. The simpler the component and the fewer failure modes attributed to it, the more likely that sudden wear out occurs, if indeed there is an age and reliability relationship. Interestingly enough, all of the components in the sample that exhibited characteristic B were either valves or valve-like in function. There was one component that matched characteristic A and, being an electro-mechanical device with numerous valves, it suffered predominantly electrical failures in its early years and predominantly valve-related failures in its later years.
Characteristic C components tended to be more complex than characteristic B components. Complex components have multiple modes of failure, and those individual modes may fit characteristic B when viewed in isolation. However, wear out patterns among these individual modes tend to occur at different times, and when viewed in the aggregate, the overall failure rate pattern matches characteristic C.
Figure 5. Age and Reliability Characteristic Categories
Characteristic C represented a larger portion of SUBMEPP's sample than it represented for MSP. Conversely, characteristic B represented a much smaller portion of SUBMEPP's sample than it represented for MSP. The analytical approach may bear some responsibility. Recall the two possible approaches - physical component health and system health. The majority of SUBMEPP analyses were conducted utilizing a system health approach where only planned overhauls were considered life renewals. This inevitably resulted in some dampening of failure rate increases. For instance, SUBMEPP analyzed a desurger and found that it experienced an increased failure rate as it aged. When the physical component health was analyzed, where some unscheduled repairs or part replacements were considered life renewals, the failure pattern matched characteristic B. This was caused by the desurger's rubber bladder, which had a pronounced failure rate increase at 133 months. However, when system health was analyzed, where only scheduled component renewals were credited, the failure pattern matched characteristic C. Under the physical component health approach, not all components last to the later age intervals. The failure rate is termed the "conditional" probability of failure, the condition being that the component must survive to that age interval. Under the system health approach, however, many more components survive to the later age intervals, even though some measure of life renewal occurred along the way. It is that unaccredited measure of life renewal that results in an improved reliability outlook and tends to create a linear incline rather than an exponential incline. It tends to blur any sharp swing in the pattern.
Ideally, life renewal tasks are prescribed when a characteristic B situation occurs - just prior to the upswing in the probability of failure. Life renewal tasks might still be applicable and effective in a characteristic C situation if system health was analyzed. If, for instance, it is demonstrated that a failure rate beyond a certain percentage is undesirable, a maintenance task at that point should return the failure rate to that found at the x-axis origin. What is the return on investment? Figure 6 displays an unconditional probability of failure graph for an asset where only planned renewals were credited. The asset has a planned renewal every ten years. The probability of failure is termed "unconditional" since the entire population shall survive to an age of ten years, unless the component or system is removed from operation. Now, suppose that a 15% failure rate is deemed unacceptable. What would be the effect of a planned renewal at five years? The action would prevent the annual failure rate from increasing beyond 15%, of course, and the increased repair costs beyond five years would be avoided. The reliability effect would be quantifiable by subtracting the area under the curve prior to five years, from the area under the curve beyond five years. If cost were the sole determining factor, the analyst would quantify the costs associated with failure and the costs associated with a planned renewal to determine if there are savings worthy of an investment.
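The area comparison described above can be sketched numerically. The annual failure rates below are hypothetical and are not read from figure 6; the sketch also assumes that each annual rate applies to a full one-year age interval (so the area under the curve is a simple sum) and that a planned renewal returns the failure rate to its value at the x-axis origin.

```python
def expected_failures(annual_rates, renewal_every):
    """Expected failures per asset over the full cycle when a planned renewal
    resets the age to zero every `renewal_every` years (assumed perfect renewal)."""
    return sum(annual_rates[year % renewal_every] for year in range(len(annual_rates)))


# Hypothetical rates for ages 1..10 years, reaching 15% in the fifth year.
annual_rates = [0.02, 0.04, 0.07, 0.10, 0.15, 0.20, 0.26, 0.33, 0.41, 0.50]

area_before = sum(annual_rates[:5])   # area under the curve prior to five years
area_after = sum(annual_rates[5:])    # area under the curve beyond five years
reliability_effect = area_after - area_before

# Same quantity expressed as failures avoided by renewing at five years
# instead of ten.
avoided = expected_failures(annual_rates, 10) - expected_failures(annual_rates, 5)
```

For a renewal at the midpoint of the cycle the two calculations agree, since the second five years simply repeat the failure pattern of the first five. Multiplying the avoided failures by a cost per failure and comparing the result with the cost of the extra planned renewal would give the return on investment in cost terms.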
8% of SUBMEPP's sample population exhibited infant mortality characteristics. This differs significantly from the earlier findings of UAL and Broberg. As mentioned previously, navy vessels go through a lengthy test period prior to entering service. Infant mortality likely exists; however, failures that occur during those test periods are not captured in 3-M OARS. SUBMEPP's infant mortality statistics differ from MSP as well. 32% of MSP's sample suffered from infant mortality. Differences may be caused by the type of equipment analyzed. The majority of SUBMEPP's components fitting characteristics A and F were more electrical in nature than mechanical. Electrical devices are more prone to sudden failure early in their life. The majority of components in SUBMEPP's sample were mechanical in nature, however, and that may differ from MSP and the other studies.
Figure 6. Reliability Effect of Time Based Overhaul