Be prepared when disaster strikes
By TS Khoo March 14, 2013
- 27% of companies declared at least one disaster during the past five years; from January to October 2012, 83 disasters hit Asia
- Organizations need a solid disaster recovery plan that outlines the steps an IT department can take to quickly resume operations
BUSINESS disasters can happen at any time and anywhere, and can include anything from power outages to human errors to natural disasters, which can all significantly affect an organization’s IT and business operations.
A study by research firm Forrester dispels the myth that disaster declarations are rare occurrences with the statistic that 27% of companies had declared at least one disaster during the past five years. In 2012, from January to October alone, 83 disasters hit Asia.
Asia Pacific has been declared the most disaster-prone region in the world (International Federation of Red Cross and Red Crescent Societies, Disasters in Asia: the Case for Legal Preparedness). To keep operating in a 24/7/365 environment, the first thing an organization needs is a solid disaster recovery plan that outlines the steps an IT department can take to quickly resume operations in the event of a major service interruption or outage.
In this article, we’ll address common forms of disaster and how an organization can prepare for each.
Despite advances in computer technology, power outages continue to be a major cause of PC and server downtime. Many power problems originate in the commercial power grid, which is subject to conditions such as lightning storms, equipment failure and major switching operations.
Power problems affecting today’s technological equipment are often generated locally within a facility from any number of situations such as local construction, heavy start-up loads and faulty distribution components.
So how can an organization safeguard against these occurrences?
A good way to prepare for outages is to purchase an uninterruptible power supply (UPS) with extended runtime capability and, if required, a generator.
Several surveys (including from Bell Labs and IBM) in the United States revealed that 10% of outages are longer than five minutes and 1% longer than one hour. Given this, purchasing a UPS with extended runtime capability is helpful.
In cases where hours of runtime are required, a generator is recommended. However, you will still require a UPS to maintain the load until you can be connected to the generator.
Organizations can also protect their network equipment with UPSes. Power protection for hubs, routers, and switches is an essential but sometimes overlooked ingredient in ensuring availability of applications.
Lastly, companies can take power outage precautions by accommodating each server’s individual time requirement for shutdown. The time required to properly shut down the operating system varies between systems – some email servers with many accounts have been known to take upwards of 20 minutes to shut down.
It’s important to make sure the UPS software’s settings take each of your computers’ specific requirements into account and are correctly set. Without shutdown software installed on the protected computers, the net effect of the UPS is simply to delay the inevitable.
The small effort involved in installing and configuring such software can be well worth it in the event of an extended power outage that exceeds the runtime of the UPS.
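As a rough illustration of the point above, the sketch below works out the latest moment after an outage begins at which each server's shutdown must start so that it finishes before the UPS battery is exhausted. It is not taken from any particular UPS software: the `Server` class, the `shutdown_start_times` function, the two-minute safety margin and the sample figures are all assumptions made for the example, and it treats the UPS runtime as fixed even though real runtime varies with load.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical record of one protected machine."""
    name: str
    shutdown_minutes: float  # time the OS needs to shut down cleanly


def shutdown_start_times(servers, ups_runtime_minutes, safety_margin_minutes=2.0):
    """Return {server name: minutes after the outage starts at which
    shutdown must begin} so every server finishes before the battery,
    less a safety margin, runs out.

    Raises ValueError if a server cannot finish within the UPS runtime.
    """
    schedule = {}
    for server in servers:
        latest_start = (
            ups_runtime_minutes - safety_margin_minutes - server.shutdown_minutes
        )
        if latest_start < 0:
            raise ValueError(
                f"{server.name} needs {server.shutdown_minutes} min to shut down, "
                f"which exceeds the usable UPS runtime"
            )
        schedule[server.name] = latest_start
    return schedule


if __name__ == "__main__":
    # Assumed figures: a mail server that takes 20 minutes to shut down
    # and a web server that takes 3, on a UPS with 30 minutes of runtime.
    fleet = [Server("mail", 20.0), Server("web", 3.0)]
    for name, start in shutdown_start_times(fleet, ups_runtime_minutes=30.0).items():
        print(f"{name}: begin shutdown no later than {start:.0f} min into the outage")
```

With these assumed numbers, the slow mail server has to begin shutting down far earlier than the web server, which is why a single shutdown delay applied to every machine can leave the slowest one without enough time.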
It’s good to remember that disasters aren’t just of the natural variety. Many disasters are man-made, often the product of human errors and avoidable problems. That is true of the types of failures that can affect data centre power systems.
Here are some tips to help minimize human errors in the IT department:
1. Take care of your generators. Not conducting weekly tests can result in deposits building up that prevent the generator from developing full power under load – a state known as “wet stacking.” To avoid wet stacking, generators should run for two to four hours under full load, which allows the deposits to be blown out. When maintaining your generator, also ensure oil is replenished regularly and coolant levels don’t fall too low.
2. Maintain your automatic transfer switches that start the generator when utility power is not available. These switches contain parts and connections that can and will fail, so ongoing maintenance is vital.
3. UPSes are essential at mission-critical facilities. However, their batteries do have a finite life span and must be tested regularly to prevent failure. To increase reliability, battery monitors are a good idea.
4. When in doubt, quality is king. A number of IT components may actually be involved in bringing power to the data center floor, including power distribution units, remote power panels and distribution panels. This is no place to skimp on quality. The most state-of-the-art generator, utility feeds and UPSes will do no good if the final connection relies on poor cabling, breakers and distribution systems. In short, don’t leave millions of dollars in infrastructure at the mercy of a $4 power strip with a 10-cent push-out circuit breaker and 20-cent switch.
Natural disasters can result in widespread destruction and, in some cases, days of downtime for an organization. Just look at the blackout New York faced during Hurricane Sandy.
With Asia Pacific being particularly disaster-prone and flooding common in areas like the Philippines and Thailand, we can and should learn lessons from these events that can affect our data centers.
For data centers, a comprehensive design is vital in achieving uptime in a mission-critical facility. In addition to the pointers mentioned previously, when preparing against natural disasters, here are three top considerations that need to be taken into account:
1. Site selection – It’s important to be aware of local hazards (floods, fires, typhoons) at an organization’s location and how they could affect the business. It’s important to ask “if there is a disaster, will there be enough resources to cope?” Organizations need to ensure that their utilities are scalable as the company grows, and that there will be enough capacity and redundancy.
2. Control system design – The brain of a business’ mission-critical facility is the many systems controlling the generators, switchgear, UPS systems, chillers, fire alarms, security and other mechanical and electrical systems. It’s vital that these systems are designed to be fault-tolerant and redundant, and that loadable copies of the software (including passwords) are secure and available. In times of an emergency, the company needs to be able to reload and restore key systems independent of outside sources.
3. Monitoring system design – Technology now allows automated monitoring of virtually any type of infrastructure equipment and environmental condition. Automated monitoring can gather trending information that will help predict equipment failure and inform a member of staff when a piece of equipment changes modes of operation or goes into alarm.
The ability to trend data from multiple points allows management to gather data sufficient for predictive analysis. With this data a company can determine when to replace pump bearings, service batteries or rotate equipment. From generator vibration to chiller performance, monitoring the right data saves time and money while increasing system reliability.
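To make the idea of trending concrete, here is a minimal sketch of one simple approach: fit a straight line to a series of readings and estimate how long it will be before the line reaches an alarm threshold. The function name, the sample readings and the threshold are all hypothetical, and a linear least-squares fit is only one assumption about how a metric such as vibration might drift; real monitoring products may use quite different methods.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours from the last reading until a rising metric
    reaches `threshold`, using a least-squares straight-line fit.

    `samples` is a list of (hour, value) pairs in time order.
    Returns None if the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        raise ValueError("samples must span more than one point in time")

    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / spread
    intercept = mean_v - slope * mean_t

    if slope <= 0:
        return None  # not trending toward the threshold

    crossing_hour = (threshold - intercept) / slope
    last_hour = samples[-1][0]
    return max(0.0, crossing_hour - last_hour)


if __name__ == "__main__":
    # Assumed hourly vibration readings (arbitrary units) and alarm level.
    readings = [(0, 1.0), (1, 1.5), (2, 2.0), (3, 2.5)]
    remaining = hours_until_threshold(readings, threshold=4.0)
    print(f"Estimated hours until alarm level: {remaining}")
```

An estimate like this is only as good as the assumption that the drift continues in a straight line, so it is better suited to scheduling an inspection than to replacing one.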
Business continuity is directly tied to the availability of an organization’s computer systems, and the common goal of companies today is zero tolerance for downtime.
These tips to protect against power outages, human errors and natural disasters are designed to help organisations in disaster-prone countries across Asia Pacific prepare for unexpected events and, in doing so, minimise downtime and safeguard critical information.
TS Khoo is the vice president for the Asean IT Business at Schneider Electric