Data Center Journal

Volume 35 | December 2014

Issue link: https://cp.revolio.com/i/428453


Virtually all owners consider their data centers to be critical. If you operate such a facility, your highest objective is to achieve continuous operation, because the consequences of an interruption to data processing are costly and painful.

When they agree to construction of a new data center, those approving the funding typically do so with an expectation that the new facility will outperform the previous one. Much pressure falls on the group that will operate the new building's infrastructure, yet the startup process almost always fails to prepare them adequately for success.

If we are uncomfortable handing the car keys to a new driver who has had no driver training, why do we assume the team that will operate and maintain the new data center needs no dedicated time to practice with the new systems they are tasked to keep running continuously?

Our industry has indeed made great strides in validating new facility infrastructure systems before construction is complete. The practices of factory acceptance testing and multilevel commissioning help ensure each component and system behaves as expected under all conditions. Effective commissioning also ensures that all of the systems interact properly when utility sources, individual infrastructure systems or components fail. Together with improved quality in the design and manufacture of many infrastructure systems, these commissioning practices have dramatically reduced the odds of equipment failures that can affect data processing over the facility's life span.

But the above-mentioned process improvements largely fail to address the greatest risk of downtime: human error. Industry surveys have consistently found that mistakes made by those operating and maintaining critical facilities are the most frequent cause of disruptions. Examples include inadequate staff size, lack of procedures, incorrect procedures, inadequate funding and, perhaps most importantly, inadequate training.

To expand on the driver's education comparison, novice drivers are first given classroom instruction, then hands-on instruction in the presence of a qualified trainer; they then perform solo operation while accompanied by the instructor. If you have not personally observed the commissioning and startup of a new data center, you might assume that the team tasked with operating the infrastructure systems would receive a similar training sequence. With rare exception, you would be quite wrong.

Today's typical data center startup process offers the team that will operate the facility only the barest opportunity to understand how all systems should operate, particularly in response to component failures and during system transfers, before and after maintenance. This situation owes largely to the significant tension that the sacrosanct construction completion date creates. Once this date is established at the onset of the project, the general contractor and equipment suppliers are increasingly pressured to meet it. Performance-based payments and reputations are tied to this milestone measurement.

As a result of this significant dynamic, any planned operator training is generally rushed. Operators often refer to the experience as "trying to drink from a fire hose." If any part of the construction and commissioning process falls behind schedule, significant portions of scheduled training may be skipped altogether. Even in the best scenario, multiple vendor training sessions are scheduled in the same week and conducted with the entire group of new operators, instead of individually. Thus, no more than one team member receives "hands-on" practice, as the rest observe and try to hear over the noise of the equipment in operation, often while commissioning is being conducted in the same room. Would you be confident driving for the first time if you had only observed the instructor from the back of a fully occupied vehicle, with all of the passengers talking during the instruction?

This common scenario represents the most glaring omission from our industry's efforts to continually improve reliability. Data center owners as a group have failed to recognize the true risk of expecting even qualified and experienced facilities operators to seamlessly operate their new systems despite inadequate training.

Each team of data center facility operators requires extensive, individualized training and practice time, just as those who operate aircraft, submarines, ships and other complex systems receive. We expect training for these professions to be comprehensive, but we ignore our failure to provide similar training to our own critical-systems operators.

Once construction and commissioning are complete, each owner should provide the following:

• System overview training conducted by the engineers of record
• Systems training delivered by individual suppliers, with hands-on time provided to each operator for "normal operation" as well as "configuration changes" for repair/maintenance
• Four to eight weeks of practice time using system transfer procedures to make configuration changes, as well as simulating emergency-response scenarios

It is truly within the owner's control to set a schedule that includes one to two months of rigorous hands-on training for the new facility's operators. A handful of owners have done so. Many will assume they cannot afford this investment, owing to the cost of the labor involved and because of the pressure to begin data processing quickly. This decision should be made carefully and driven by the impact of downtime for the organization. On average, recovery from a single downtime event requires four hours or more before all data processing is restored. For many companies, this downtime equates to a loss of several million dollars. The cost to operate the new facility in "practice mode" for two months will be dramatically less.

Over the lifespan of a data center facility, the payback for scheduling and implementing comprehensive site-specific training and practice time for your building operators will be immense. They will have no other opportunity to develop the confidence you expect them to acquire.
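
The cost comparison in the closing paragraphs can be roughed out with simple arithmetic. The sketch below is only an illustration: the function name and every dollar figure are hypothetical placeholders rather than numbers from the article, and only the four-hour recovery time comes from the text. An owner would substitute the organization's own estimates.

```python
def practice_break_even(practice_cost, downtime_hours, cost_per_downtime_hour):
    """Compare the cost of a practice period with the cost of one downtime event.

    Returns a tuple:
      (cost of one downtime event,
       number of avoided downtime events needed to repay the practice period)
    """
    event_cost = downtime_hours * cost_per_downtime_hour
    if event_cost <= 0:
        raise ValueError("downtime cost must be positive")
    return event_cost, practice_cost / event_cost


# All dollar figures below are assumed for illustration only.
event_cost, events_to_break_even = practice_break_even(
    practice_cost=400_000,           # assumed: two months of labor and running costs
    downtime_hours=4,                # the article's "four hours or more" recovery time
    cost_per_downtime_hour=500_000,  # assumed hourly loss while processing is down
)

print(f"One downtime event: ${event_cost:,.0f}")
print(f"Avoided events needed to break even: {events_to_break_even:.2f}")
```

With placeholder inputs like these, the practice period pays for itself if it prevents even a fraction of one downtime event over the facility's life. The conclusion depends entirely on the figures an owner supplies.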
