Mission Critical

Mission critical facilities are like other facilities in that they have electro-mechanical equipment that must be maintained. The difference is that the operators of mission critical facilities, owing to the extremely high availability requirements from management, have to pay much more attention to the equipment so that it will not fail. This requires dual-path power supply systems (for redundancy) and regular testing of the systems.

Systems

* Dual-power technology requires two completely independent electrical systems tied together with switchgear. When the normal source of power fails, these dual-path power supply systems quickly switch to a back-up source. A UPS system keeps the power flowing until the normal source is restored or another source is brought on-line and synchronized. Usually, the UPS, through a PDU or power distribution unit (see figures 1, 2 and 3), takes AC power, converts it to DC where a bank of batteries is tied in, and then inverts it back to AC to feed the computer hardware. Since the systems often cannot be tested on-line, they must be tested during "maintenance windows" (planned outages or times when the impact of testing is low) so that simulations can be run. Resistive load testing, in which a load bank draws power from the system, is used to fully simulate and test all equipment on the floor. Any problems that are encountered during an infrared survey are repaired immediately and the system is rechecked before putting the equipment back on-line.

Figure 1 – Typical PDU in a data center with load bank test being run.

Figure 2 – SCR connection on an inverter assembly at over 550º F.

Figure 3 – Bolted/crimped connector on an output filter.

* Battery back-up systems (see figure 4) must be checked in a real-time battery discharge situation to fully simulate an actual loss of the normal source of power. The batteries, connections, cables, switches and charging systems are checked for unwanted heating conditions.

* Uniform cooling of all data center server, storage, and computer equipment is essential for proper operation. The design objective of the cooling system is to provide a clear path from the source of the cooled air to the equipment and back to the cooling unit. This issue has received much attention lately as miniaturization of the equipment and economic pressures have increased the amount of heat that is generated per square foot of floor space and per cubic foot of rack space in the server racks. This hardware is sensitive to heat and humidity, and some new designs are being tested so that failures do not occur solely due to environmental conditions (see figure 5). How perfect an application for IR!

* Utility main power supplies are typically owned by the local power company but are sometimes owned by the user. A looped system feeds power from two different power company substations and can be "back fed" if the power is out on the primary. No matter who the technical owner of the utility equipment is, it must be checked with IR like all other components (see figure 6).

* Mechanical systems have the same stringent requirements as the electrical system. Again, this is achieved by redundancy and failure prevention engineering.
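The value of the redundancy described in the list above can be illustrated with a simple availability calculation. The sketch below is only an illustration: it assumes that the two paths fail independently and that either path alone can carry the load, and the per-path availability of 99.9% is a hypothetical figure, not a measurement from any facility.

```python
def parallel_availability(path_availabilities):
    """Availability of a system that stays up if at least one path is up.

    Assumption: the paths fail independently of one another.
    """
    unavailability = 1.0
    for a in path_availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        # The system is down only when every path is down at once.
        unavailability *= (1.0 - a)
    return 1.0 - unavailability


# Hypothetical example: each power path is available 99.9% of the time.
single = 0.999
dual = parallel_availability([single, single])
print(f"single path: {single:.4%}, dual path: {dual:.4%}")
```

Under these assumptions, two 99.9% paths combine to roughly 99.9999%. Real systems share components such as switchgear, so the independence assumption is optimistic, which is one reason regular testing remains necessary.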

Accountability

There must be total accountability for all infrared survey results, especially those for equipment associated with the UPS, computer and server systems. This can be accomplished by recording the entire survey on digital videotape and/or capturing fully-radiometric images of all equipment, whether problems exist or not. In either case, a data log of all equipment surveyed must be created, including a time/date stamp reference for all equipment. Documentation is very important.
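One way such a data log might be kept is sketched below. This is a minimal illustration, not a prescribed format: the record fields, equipment tags and image file names are all hypothetical, and the only requirements taken from the text are that every item surveyed is logged, with or without a problem, and that each entry carries a time/date stamp.

```python
import csv
import io
from dataclasses import asdict, dataclass, field, fields
from datetime import datetime, timezone


def _now_utc():
    """Time/date stamp for a log entry, in ISO 8601 format (UTC)."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


@dataclass
class SurveyRecord:
    # All field names are hypothetical; adapt them to the site's conventions.
    equipment_id: str      # tag of the equipment surveyed
    system: str            # e.g. "UPS", "PDU", "battery bank"
    image_file: str        # radiometric image captured for this item
    problem_found: bool    # logged whether or not a problem exists
    notes: str = ""
    surveyed_at: str = field(default_factory=_now_utc)


def write_log(records):
    """Return the survey log as CSV text, one row per item surveyed."""
    buffer = io.StringIO()
    names = [f.name for f in fields(SurveyRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Hypothetical entries: one item with no problem, one with a problem.
log = [
    SurveyRecord("UPS-1", "UPS", "img_0001.jpg", False),
    SurveyRecord("BATT-2", "battery bank", "img_0002.jpg", True,
                 notes="loose lug connection on main breaker"),
]
print(write_log(log))
```

Logging items with no problem alongside those with one gives a record that every piece of equipment was actually surveyed, not just the ones that failed.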

Figure 4 – Small battery bank with a loose lug connection on the main breaker.

Figure 5 – Server rack designs being tested for heat dissipation.

Figure 6 – Pad-mounted transformer with loose connection on line side.

Summary

To achieve five nines availability, it is essential that competent IR testing be performed on all electrical and mechanical systems in conjunction with other testing and in cooperation with management and maintenance personnel.
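To put "five nines" in concrete terms, the short sketch below converts an availability target into the downtime it allows per year. The function name is chosen here for illustration, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability, period_minutes=MINUTES_PER_YEAR):
    """Downtime permitted in the period for a given availability (0 to 1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_minutes


five_nines = 0.99999  # 99.999% availability
print(f"{allowed_downtime_minutes(five_nines):.2f} minutes per year")
```

Five nines therefore leaves a budget of only about 5.26 minutes of downtime per year, which is why a single overheated connection that goes undetected can consume the entire allowance.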

If you maintain an office building, manufacturing facility or any other type of facility where uptime is important, you should take time to follow what is happening with data centers, as they are among the most mission critical of all operations.

Author Biography

Gregory R. Stockton is president of Stockton Infrared Thermographic Services, Inc. Based in Randleman, NC, the corporation operates six applications-specific divisions. Greg has been a practicing infrared thermographer since 1989. He is a Certified Infrared Thermographer with twenty-six years of experience in the construction industry, specializing in maintenance and energy-related technologies. Mr. Stockton has published eleven technical papers on the subject of infrared thermography and written numerous articles about applications for infrared thermography in trade publications. He is a member of the Program Committee of SPIE (Society of Photo-Optical Instrumentation Engineers) Thermosense and Chairman of the Buildings & Infrastructures Session at the Defense and Security Symposium.

Copyright © November 2005

Stockton Infrared Thermographic Services, Inc. (www.stocktoninfrared.com) and Uptime® Magazine (http://www.uptimemagazine.com)
