Data Center Journal

Volume 35 | December 2014

Issue link: https://cp.revolio.com/i/428453

drill revealed that neither the data center operator nor the gas station owner had any provision to transfer the committed fuel to the data center facility. What would this have meant in the event of a true disaster? To put it in perspective, a 1MW generator can consume approximately 50 gallons of fuel per hour at approximately 75% loading. Needless to say, that's a lot of 1-gallon gas containers (the only containers available to the gas station operator) for someone to carry between the two properties. In the event of a real disaster, the execution of this plan would certainly have resulted in a loss of critical generator power.

The moral here is to practice executing your emergency action plan. Preparedness is more than simply having a plan; it is having a well-practiced plan. So, when you are auditing your operations, or interviewing your outsourced provider, be sure to ask whether they practice executing their emergency action plan. It may be one of the most important questions you ask.

KEEP UP WITH CHANGE

Best practices change constantly. As a result, it is important to assess hazard risks continually and to ensure that the procedures addressing those risks are always up to date. To underscore the changing risks, the California Geological Survey has released maps as recently as November 6 of this year for parts of Los Angeles County. Moreover, building code is one of the best examples of changing best practices, as illustrated by the code changes that resulted from the Northridge quake. Since the understanding of risks and the regulations associated with those risks are constantly changing, it is only natural that emergency action plans need to be reviewed constantly for required updates.

A well-run data center naturally undergoes significant changes over the course of operation to keep up with changing information technology needs. However, not all data centers keep their documentation up to date.
The lack of up-to-date documentation may result in costly downtime due to errors in responding to a disaster-related emergency, or in lengthy downtime while root cause analysis proceeds without the proper information available for troubleshooting. Once your as-builts are up to date, make sure your emergency action plan exists on paper. If it does not exist on paper, it does not exist. Simply put, it is common to find that only one or two individuals know what is going on. They may be able to talk a good talk, but to truly ensure the correct actions, the plan must be well documented as well as well practiced so the entire organization can be on the same page. Once it is on paper, make sure it is regularly reviewed to ensure that the procedures are current.

INVEST IN HUMAN CAPITAL

The industry has made great headway in the design of high-reliability and high-availability systems. There are now multiple organizations that have established standards for infrastructure design and configurations. However, in the area of operations, our industry is still underdeveloped, and it is here that the battle for disaster preparedness is won or lost. Despite the complex equipment, sensing devices, control systems, and other components of critical infrastructure that comprise a high-reliability, high-availability data center, data centers continue to rely on the most powerful of all information processors, the human brain. As much as data centers support non-human information technology, uninterruptible power systems, and critical air conditioning systems, people continue to be the most powerful infrastructure component behind a truly 24x7 facility. These systems, however automated and smart they may be, rely heavily on the actions of operators to ensure uninterrupted operation. Of utmost importance is having a response team in place.
Human capital does not necessarily have to be in-house; it may extend to strategic relationships that ensure your response team has access to the immediate resources necessary to address a disaster. For example, in California, where seismic disaster risk exists, it is important to have a structural engineer who can be called to the site to perform an immediate post-event assessment. After a major seismic event, if your building's structural system is compromised or suspected to be compromised, a Google search for a nearby structural engineer capable of and willing to deploy immediately will most likely prove unfruitful. The example illustrates the importance of dedicated response team members.

The best infrastructure, design, standards, processes, and plans are meaningless if an organization's culture does not support high availability and reliability. Much of the capability to respond successfully to a disaster lies in our human capital. It is not so much about the development of the plan document or a technical fix to something broken. Rather, successful disaster recovery lies in the ability to manage people, train people, and have the right people in support of the machines the data center supports. And there you have it: what data center managers have often been missing while lost in a sea of N, N+1, and 2N infrastructure. Among the most important assets for high availability and reliability in a data center has always been, and for the foreseeable future will continue to be, human capital.

CULTURE IS THE KEY

Fundamentally, what all this boils down to is organizational culture. To run a data center with high availability and reliability, the organization must have a culture of no downtime. If the culture is well established, the result shows in all aspects of the data center, from planning and implementation to day-to-day operations and disaster response.
The systems designed and constructed for a data center cannot establish a culture. Therefore, with all the recent attention on disaster recovery planning, there is also a call for management to reassess their organizations' operations to ensure that, above all, a culture of zero downtime exists.

About the Author: Jun Yang, PE, LEED AP - Jun is the Managing Principal at Building Networks Group. He is a licensed professional engineer specializing in the planning, design, implementation, and operation of high availability and high reliability infrastructure. Mr. Yang, a two-time graduate of UCLA with his BSME and MBA, is a trusted consultant and advisor to a variety of Fortune 500 companies.

About the Contributing Author: Imran Hoque, PE, LEED AP - Imran is a Principal at Building Networks Group. He is an electrical engineering graduate of UCLA with a diverse background in design and planning of critical power systems. As a licensed professional engineer, he has used his expertise and unique insight in support of improving reliability for data centers, hospitals, airports, and other critical environments.
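
As a back-of-the-envelope check on the fuel drill described in this article, the arithmetic can be sketched in a few lines of Python. The burn rate (approximately 50 gallons per hour for a 1MW generator at roughly 75% loading) and the 1-gallon container size come from the article; the function names and the outage durations of 1, 24, and 72 hours are illustrative assumptions, not figures from the drill.

```python
import math

# Figures quoted in the article.
BURN_RATE_GPH = 50.0   # approx. gallons per hour, 1MW generator at ~75% load
CONTAINER_GAL = 1.0    # the only container size the gas station had available


def containers_needed(burn_rate_gph: float, hours: float,
                      container_gal: float = CONTAINER_GAL) -> int:
    """Whole containers that must be carried to cover `hours` of runtime."""
    if burn_rate_gph < 0 or hours < 0 or container_gal <= 0:
        raise ValueError("burn rate and hours must be >= 0, container size > 0")
    return math.ceil(burn_rate_gph * hours / container_gal)


def seconds_per_container(burn_rate_gph: float,
                          container_gal: float = CONTAINER_GAL) -> float:
    """How long one container lasts at the given burn rate, in seconds."""
    if burn_rate_gph <= 0 or container_gal <= 0:
        raise ValueError("burn rate and container size must be > 0")
    return 3600.0 * container_gal / burn_rate_gph


if __name__ == "__main__":
    # Outage durations below are hypothetical, chosen only for illustration.
    for hours in (1, 24, 72):
        count = containers_needed(BURN_RATE_GPH, hours)
        print(f"{hours:>3} h outage: {count} one-gallon containers")
    pace = seconds_per_container(BURN_RATE_GPH)
    print(f"One container lasts about {pace:.0f} seconds")
```

Under these assumptions, a single day on generator power would take 1,200 one-gallon containers, and each container would keep the generator running for a little over a minute. Running this kind of estimate against your own generator ratings and fuel contracts is one inexpensive way to test a plan before a drill does.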
