Data Center Journal



Natural disasters can put businesses at considerable risk of extended downtime, and downtime is costly. According to a 2016 Ponemon Institute study, an average data center outage costs $740,357, and that figure continues to climb. Organizations that manage their own data centers must formulate contingency plans that keep their mission-critical operations running. That's equally true for organizations that rely on outside data centers to manage important operations. In the case of external data centers, the business might develop its own plan or ask its colocation provider to prepare a detailed disaster-recovery (DR) plan to ensure it's shielded from substantial downtime.

TENETS OF A GOOD DR PLAN

Whether a company houses its data in a public data center, on premises or at a colocated data center, the main tenets of any good disaster-recovery plan apply. Contingency plans should be outlined at the beginning of any data center–customer relationship. Moreover, a solid understanding of the data center operations' emergency-communications practices, staff and facility preparedness, ongoing maintenance checks, and disaster training is essential before signing any service-level agreement.

COMMUNICATION IS KEY

Although communicating day-to-day operational status is typically straightforward, communicating during an expected or unexpected disaster is more urgent and critical. Because unexpected disasters are possible, customers should establish a strong relationship with their data center operations manager and should spend time carefully reviewing proposed emergency-communications plans. While natural disasters are unplanned events, companies certainly should prepare for them. For weather events such as hurricanes and major snowstorms, forecasters typically provide several days' warning, allowing data center operators to monitor these situations and inform tenants.
In the case of anticipated disasters, data center operators should be in touch at least 48 hours in advance, communicating the disaster-recovery procedures. Data center customers should remain available for regular updates. Companies and data centers should also provide an emergency bridge between their local and national operations teams to communicate and update the status of equipment, power and staffing and to ensure that mission-critical applications are working and uptime is maximized.

A communications checklist is also critical and should include data center staff coverage, provisions, equipment testing and generator refueling. Data center personnel should plan to provide status updates for all checklist items every four hours to ensure all parties understand the procedures. Doing so gives customers confidence and peace of mind that their mission-critical operations are running without a hitch.

STAFFING UP FOR DISASTERS

Open channels of communication are critical during a disaster, but staffing is also imperative, and it starts at the top. The data center's VP of operations, facility director, local site manager and technicians should devise a coverage plan that can be implemented during a natural disaster. With planning, emergency staff can be prepared at a moment's notice to weather the storm. That outline should include preparations to engage extra staff, checking emergency kits for flashlights and water, and ensuring cots or beds are available for staff to hunker down in the data center for the duration of the disaster. The key is having qualified staff safely inside the facility before the disaster strikes, ready to continue operations as well as protect and closely monitor equipment.

If local staff members are unavailable to run the data center, a well-planned operation will fly in staff from other locations to ensure smooth operations.
And in the event that the local data center is knocked out, a contingency plan for moving mission-critical operations to another location is imperative. If data centers have standard operating procedures (SOPs), employees from other locations should be able to quickly jump in and seamlessly run the facility until the disaster abates.

KEEPING MOPS AND SOPS UP TO DATE

Preparedness also means reviewing the center's methods of procedure (MOPs) and SOPs regularly. Data centers should have multiple levels of security, industry-specific compliance, and 24x7x365 on-site staff adhering to Bell Systems MOPs and SOPs to provide lifeline services for business continuity.

One question any organization should ask of its data center operator is, "When did you last perform an integrated-systems test (a complete power shutdown from the utility)?" In this scenario, the power is turned off at the box outside the building to ensure battery backups and generators react as intended. Good data center managers maintain logs and can say exactly when they last conducted this test.

Whether companies use colocation data centers or run their own, maintaining proper procedures that document every test is critical. It's important to audit testing procedures at least annually to ensure backup power and cooling can maintain the facility's current power load. A strategic data center will also conduct a facility walkthrough every day and review disaster checklists as part of that ritual. During the walkthroughs, the team should check every piece of infrastructure, such as the CRAC units, all environmental systems, doors, roofs, fuel-tank environments, entry points and the building's perimeter.
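
To make the log-keeping point concrete, here is a minimal Python sketch of how a test log might be checked against the "at least annually" guideline. It is an illustration only, not any operator's actual tooling: the function name `overdue_tests`, the log layout, the 365-day threshold and the sample dates are all assumptions chosen for the example.

```python
from datetime import date, timedelta


def overdue_tests(test_log, today, max_age_days=365):
    """Return the names of tests whose most recent run is older than max_age_days.

    test_log is a list of (test name, date performed) pairs; this layout is
    hypothetical and stands in for whatever records a facility keeps.
    """
    latest = {}
    for name, performed in test_log:
        # Keep only the most recent date recorded for each test.
        if name not in latest or performed > latest[name]:
            latest[name] = performed

    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, performed in latest.items() if performed < cutoff)


# Illustrative entries only; these dates are made up for the example.
test_log = [
    ("integrated-systems test", date(2017, 3, 14)),
    ("generator load test", date(2018, 6, 2)),
    ("integrated-systems test", date(2017, 7, 30)),
]

print(overdue_tests(test_log, today=date(2018, 8, 1)))
```

With these sample entries, the integrated-systems test was last run just over a year before the check date, so it is the one flagged. A customer could ask the same question of a provider's real records during an audit.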


Data Center Journal - VOLUME 56 | AUGUST 2018