Data Center Journal

Page 18 of 36

having a change-control process where any changes, whether IT or facilities related, are reviewed cross-functionally to ensure that none of them put the data center at risk," said Garcia. In some cases, however, certain changes are potentially problematic. Garcia recommends mitigation or contingency plans in such cases to prevent downtime once the work starts.

Unfortunately, even if you've implemented a change-management policy, prudence demands verification of relevant system configurations before you begin critical work. Although doing so may cost some extra time and effort beforehand, you must weigh that perhaps unnecessary effort against the costs that your business might incur should you run into unexpected conditions once work begins. And because critical work requires careful scheduling, such an occurrence can easily throw off large chunks of the schedule, potentially ruining the entire plan.

CONSIDER PERIPHERAL EFFECTS ON OPERATIONS

Focusing on electricity, airflow and network connectivity is critical when performing maintenance on a live data center, but data center managers should also remember the more indirect ways in which their work might affect operations. If the work involves some kind of physical construction, for example, Swedish construction company Skanska recommends sealing the work area with plastic to prevent dust from reaching IT and other sensitive equipment. Furthermore, depending on the location, workers may need to wear booties or other coverings to prevent dust from hitching a ride into the data center proper.

In addition, temporary rearrangements of equipment or the presence of certain gear, if large enough, can cause changes in the normal airflow of the data center. The result can be dangerous hot spots that could lead to system failures. Depending on the scope and budget of the project, one option is to employ computational fluid dynamics (CFD) to model the airflow.
Although doing so may or may not be practical for intermediate stages of the work, it can deliver solid returns if applied during the planning phase for the project, mainly if the work involves a new cooling system or otherwise altered airflow dynamics, such as through rearrangement of server rows.

PRACTICE WHERE POSSIBLE

As far as possible (and practical, given the associated costs in employee time and so forth), conduct a practice run of your plan. The more critical the systems you're working on, the more beneficial practice can be in avoiding downtime. In addition to identifying potential trouble spots that you might not have considered or might otherwise have been unaware of, a dry run can help employees and other involved parties gain confidence in what they'll be doing. It will also help data center managers govern the process more smoothly.

SAFETY

Uptime should always be second to employee and contractor safety. A dangerous shortcut during the process might have the potential to save the entire project, but it can also put lives at risk. Practically speaking, a serious injury or death in the data center is likely to cause more trouble than some downtime because of a problem that arises during the project. From a more compassionate perspective, it's better to lose some business than to risk the lives of those working in the data center.

Safety considerations may not improve uptime, but they can improve morale and encourage responsibility among employees and data center managers alike. They can also help avoid regulatory hassles. Of course, there's always a balance to be struck: it's easy to go overboard with safety to the point of foolishness. Usually, however, a data center manager with some common sense will be able to identify areas of critical work on the live facility that require more care than others.

LEARN FROM PAST EXPERIENCES

Maybe your last effort at maintenance led to downtime.
The only unforgivable failure is the failure to learn from the experience. If you've needed to perform live maintenance or upgrades in the past, you'll need to do them again in the future. Even if your last effort wasn't a resounding success, you can glean information from it that will help you avoid similar difficulties in future projects.

In addition, experiences gained during maintenance projects, whether successful or not, enable opportunities to prepare beforehand for future projects. For instance, a data center manager might consider implementing a change-management policy to keep better track of the equipment configurations in the facility.

CONCLUSIONS

The central facet of any project involving critical work on a live data center should be planning. The more detailed the plan, taking into consideration likely contingencies that could arise during the project, the more likely staff and contractors will execute it successfully. Apart from simply planning ahead of particular projects, however, companies should plan from the very start: the design phase of the data center. Appropriate redundancy in critical systems not only avoids single points of failure, it enables maintenance and upgrades while the data center is still running.

Brocade's Victor Garcia suggests, "From a design standpoint, future-proof your design to the next level by thinking through each discipline: what if you had to provide one more level of redundancy or what if your densities or number of racks had to increase, which increases your total system load. Make sure you can add an extra set of equipment from a space perspective, and being able to tie it into the distribution system without interruptions, for example installing maintenance bypass switches, bypass valves or isolation valves, putting in a tie breaker or simply reserving space in the mechanical or electrical room for expansion capabilities."
This kind of planning and ongoing awareness of data center design and infrastructure not only enables scalability, it enables critical work that doesn't interfere with uptime. If your customers, whether internal or external, demand always-on access to IT resources, you can expect to face live maintenance and upgrade projects. By taking some steps beforehand (including but not limited to detailed planning), you can avoid the high costs of downtime while improving your data center.

www.datacenterjournal.com
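
The "what if" questions in the conclusions (one more level of redundancy, or a higher total system load) can be turned into a quick headroom check during planning. The Python sketch below is only an illustration under stated assumptions: the `RedundantSystem` class, the function names and every figure (unit size, unit count, loads) are hypothetical and do not come from the article or from any vendor. It treats a power or cooling system as a set of identical units and asks whether the load can still be carried with some units taken offline for maintenance.

```python
import math
from dataclasses import dataclass


@dataclass
class RedundantSystem:
    """Hypothetical model: identical power or cooling units sharing one load."""
    unit_capacity_kw: float  # assumed rated capacity of a single unit
    installed_units: int     # assumed number of units currently installed


def units_required(system: RedundantSystem, load_kw: float) -> int:
    """Smallest number of units that can carry the given load (the 'N')."""
    return math.ceil(load_kw / system.unit_capacity_kw)


def spare_units(system: RedundantSystem, load_kw: float) -> int:
    """Units installed beyond N; negative means the load already exceeds capacity."""
    return system.installed_units - units_required(system, load_kw)


def can_maintain_live(system: RedundantSystem, load_kw: float,
                      units_offline: int = 1) -> bool:
    """True if the load is still covered with `units_offline` units out of service."""
    return spare_units(system, load_kw) >= units_offline


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    ups = RedundantSystem(unit_capacity_kw=250.0, installed_units=4)

    # Today's load: 3 units needed, 1 spare, so one unit can be serviced live.
    print(can_maintain_live(ups, load_kw=700.0))  # True

    # After adding racks: 4 units needed, no spare, so live maintenance is not possible.
    print(can_maintain_live(ups, load_kw=800.0))  # False
```

Running the same check against projected loads during the design phase shows when extra equipment, and the reserved space and tie-in points Garcia mentions, would be needed to keep maintenance from requiring downtime.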

Volume 28 | August 2013
