What Am I Supposed to Do With This?

30 May 2024

A multi-stage, multi-year project to install a centralized hydraulic system had just been completed. This newly commissioned milling system had more than doubled the instrumentation and control network for the area. Everything from hydraulic cylinder position indicators to individual hydraulic circuit flowmeters to in-line hydraulic fluid particle counters was included. Both the project engineers and the vendors who supplied the system stated that this was the most advanced monitoring system ever installed for this kind of hydraulic system. All of the data coming off the instrumentation was sent to a centralized control system, with logic written for alerts and alarms that would send notifications to other systems and to the individuals responsible for the item in question. Every condition had been tested and confirmed to be fully operational. The project team rejoiced as the official handoff was completed and went off into the sunset, believing that everything would live happily ever after.

Everything went fine for the first few months, but slowly but surely operations and maintenance supervisors began to receive automated notifications by cell phone and email, while maintenance planners began to see repetitive work requests in the computerized maintenance management system (CMMS). All of the notifications and requests originated from the new hydraulic system. The supervisors met with the on-staff controls engineers to go into the programming and disable the hydraulic system’s notifications, and the planners purged the automated work requests. The system continued to run for several more weeks without sending out any more notifications.

Then, at approximately 3:30 one morning, the system went down when a pump discharge filter collapsed, building pressure on a flexible hose until it ruptured and sprayed hydraulic fluid everywhere. Luckily, no one was present. The room where the incident occurred also functions as a containment area, so no environmental spill occurred.

The facility manager was furious. “How could this happen?” he said. “This system has every bell and whistle known to man!” He ordered the reliability engineer to stop everything and conduct a root cause analysis.

It didn’t take long for the engineer to realize that the notification system had been deactivated. When the engineer asked why, the supervisors shared what the notifications had been for. Most were triggered by filter pressure either exceeding or falling below a predetermined setpoint. These events were traced to start-up and shutdown conditions, when the pressure would normally fluctuate.
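One way to keep normal start-up and shutdown fluctuations from generating notifications is to evaluate the pressure setpoints only once the pump has been running steadily for some time. The following is a minimal sketch of that idea, not the logic used at this facility. The names PumpState, PressureAlarm, and settle_seconds, along with all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class PumpState(Enum):
    """Hypothetical operating states reported by the control system."""
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"


@dataclass
class PressureAlarm:
    """Filter pressure alarm that ignores start-up and shutdown transients."""
    low_setpoint: float    # assumed lower limit, in whatever units the sensor reports
    high_setpoint: float   # assumed upper limit, same units
    settle_seconds: float  # assumed time to wait after the pump reaches RUNNING

    def should_notify(self, pressure: float, state: PumpState,
                      seconds_in_state: float) -> bool:
        # Pressure is expected to fluctuate while starting, stopping, or stopped,
        # so readings taken in those states never raise a notification.
        if state is not PumpState.RUNNING:
            return False
        # Give the system time to settle after it enters the RUNNING state.
        if seconds_in_state < self.settle_seconds:
            return False
        # Steady running: notify only when the reading is outside the limits.
        return pressure < self.low_setpoint or pressure > self.high_setpoint
```

With limits of 2.0 and 10.0 and a 30-second settling time, a reading of 12.0 during start-up would be ignored, while the same reading two minutes into steady running would produce a notification. Suppression of this kind only removes the nuisance messages; what a remaining notification means and who should respond still has to be agreed with operations and maintenance.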

Feeling the pressure from the facility manager, one maintenance planner threw a stack of work requests, each with a header reading “Pressure exceeding setpoint,” on a conference table and said, “What am I supposed to do with this?” The system map created by the project team included every possible function that the instrumentation could provide, but it didn’t explain what the notifications meant or what to do about them, nor did it consider whether the operations and maintenance teams would even find them relevant.

Ultimately, the failure of the hydraulic system originated with the project team. Its members did not work with operations and maintenance to identify which system conditions warranted actions and responses. Instead, the project team focused on delivering the maximum number of messages by fully utilizing all available instrumentation. As a result, people came to see the notifications as a nuisance and the system as continually crying wolf. If the team had spent more time trying to figure out what was actionable rather than what was possible, this incident could have been avoided.

tales from the shop