Disasters are inevitable and can strike at any time. They abruptly disrupt all or part of a business's operations, which directly or indirectly results in lost revenue and reputation. It is important to have a good disaster recovery (DR) plan in place so that the business can return to normal after a disaster.
In this blog, we’ll look at how a business needs to prepare to handle disaster scenarios. The article is generic in nature, so its guidelines can be applied to any IT business.
In an IT environment, disaster recovery and high availability are not the same. Though both concepts relate to business continuity, high availability is about providing uninterrupted continuity of operations, whereas disaster recovery is about bringing the business back online after an IT or natural disaster. Disaster recovery involves some amount of downtime, typically measured in minutes, hours, or days.
Best Practices for protecting your data center from disaster
The foremost thing is to implement the DR plan/strategy that best suits your business.
When a disaster strikes, your normal business operations might be affected, and they must immediately be replaced with the operations from the disaster recovery plan built for your business. The disaster recovery system is not a replacement for your normal operations; it only supports the business for a short period of time. At the earliest possible time, the disaster recovery process must be decommissioned and the business should return to normalcy.
The process of preparing a good disaster recovery plan consists of the following steps:
- Identifying the causes of the disaster
- Analyzing its likelihood & severity
- Defining the resources or processes that need to be recovered immediately (priority basis)
- Restoring the business operations to the normal state
- Testing and improving the disaster recovery system constantly
Let’s look into these steps in detail below:
1) Identifying the causes of the disaster
Disaster recovery planning for any business starts by identifying the root causes of potential disasters. Here are 5 common causes of disasters in an IT environment:
- Hardware and equipment failure
The failure of a hard drive or a computer system is the most obvious example of hardware failure. While modern hardware is much more durable these days, no device can run forever. In other words, all hardware will fail eventually.
- Power outages
A power outage can disrupt your ability to continue your business. It can also damage your computer systems and their hardware, which you may then need to repair or replace. If you don’t have a DR plan, an outage may lead to system downtime, and you can even lose business-critical data.
- Viruses and malware attacks
Malware such as Trojans can infect systems and delete critical data, and ransomware such as CryptoLocker encrypts critical data. An infected system that remains connected to the network will systematically affect all network files it has access to, rendering them useless. Sometimes the damage comes from internal forces: ill-intentioned employees can also cause IT disasters.
- Natural disasters
Natural catastrophes such as earthquakes, floods, and fires may stop you from continuing your business. While you have no control over the weather or seismic activity, you can prepare for it by having a good disaster recovery plan in place.
- Human error and improper training
An employee may delete a file by accident or save a new version of a file over an existing one without considering the consequences. Even a simple pop-up dialog box may lead to downtime if employees are not properly trained. For example, many users click anything just to make the box go away so they can shut down and leave for the day.
2) Analyzing its likelihood & severity
A disaster can occur at any time and affect all or part of your business operations. After listing all the possible causes/threats of business interruption, examine each one for its probability of occurrence and the level of disruption it could cause to your business. Also, note the potential consequences and the remedial actions that need to be taken for every type of disaster.
We recommend that you rate the probability of occurrence and the level of disruption to your business for every type of disaster on a scale of 1 to 5 (1=Very High, 2=High, 3=Medium, 4=Low, 5=Very Low).
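One way to keep such ratings in a form that can be sorted is a small risk register. The Python sketch below is only an illustration: the `Threat` class, the idea of multiplying the two ratings into a single score, and the ratings themselves are assumptions made for the example, not values recommended for any particular business. Because 1 means Very High on this scale, a lower score means a more urgent threat.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One row of a hypothetical risk register."""
    name: str
    likelihood: int  # 1 = Very High ... 5 = Very Low
    disruption: int  # 1 = Very High ... 5 = Very Low

    def __post_init__(self):
        # Reject ratings outside the 1-to-5 scale.
        for field_name in ("likelihood", "disruption"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be between 1 and 5, got {value}")

    @property
    def score(self) -> int:
        # Assumed scoring rule: multiply the two ratings.
        # A LOWER score means a MORE urgent threat on this scale.
        return self.likelihood * self.disruption


def rank_threats(threats):
    """Return threats ordered from most to least urgent."""
    # Ties on score are broken by the more disruptive threat first.
    return sorted(threats, key=lambda t: (t.score, t.disruption))


# Placeholder ratings, for illustration only.
threats = [
    Threat("Hardware and equipment failure", likelihood=2, disruption=2),
    Threat("Power outage", likelihood=3, disruption=2),
    Threat("Virus or malware attack", likelihood=2, disruption=1),
    Threat("Natural disaster", likelihood=5, disruption=1),
    Threat("Human error", likelihood=1, disruption=3),
]

ranked = rank_threats(threats)
for threat in ranked:
    print(f"{threat.score:>2}  {threat.name}")
```

Multiplying the ratings is just one possible rule; a business could equally add them or weight disruption more heavily than likelihood.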
3) Defining the resources or processes that need to be recovered immediately once the disaster strikes
Create a checklist of all the essential functions whose interruption would considerably disrupt your business operations and may result in financial or reputational loss. After listing all the essential functions, prioritize them.
Create a list of sequential assignments that need to be carried out at the time of a disaster and assign them to the emergency team/DR team members. It is very important to train the DR personnel on their assignments and educate them on the criticality of the business operations and their priority.
Let’s take a telecom service provider as an example: billing and help desk operations are both essential functions, but billing is the more essential of the two. Hence, at the time of a disaster, billing must be given higher priority for immediate recovery than help desk operations.
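The telecom example can be written down as a prioritized checklist that turns into a sequence of assignments. The sketch below is a minimal illustration in Python: the `EssentialFunction` class, the convention that priority 1 is recovered first, and the owner names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EssentialFunction:
    name: str
    priority: int  # assumed convention: 1 = recover first
    owner: str     # hypothetical DR team member responsible


def recovery_sequence(functions):
    """Return the recovery assignments in the order they should be carried out."""
    ordered = sorted(functions, key=lambda f: f.priority)
    return [
        f"{step}. Recover {f.name} (owner: {f.owner})"
        for step, f in enumerate(ordered, start=1)
    ]


# The two essential functions from the telecom example.
functions = [
    EssentialFunction("help desk", priority=2, owner="DR member B"),
    EssentialFunction("billing", priority=1, owner="DR member A"),
]

sequence = recovery_sequence(functions)
for line in sequence:
    print(line)
```

Keeping the owner next to each function makes it clear who needs to be trained on which assignment.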
4) Restore the business operations back to a normal state
Once the likelihood of each disaster and its severity have been listed and prioritized, analyze all the available recovery methods for each type of disaster and determine the most suitable one for restoring business operations.
Recovery operations may vary based on the criticality of the business data. For critical business data, you can recover by using the instant recovery mechanism of your backup solution or by bringing live the data that has already been replicated elsewhere. For less critical business data, you can make use of spare computers and configure them with the required applications.
The main factors that need to be considered while recovering the critical systems during a disaster are:
- Cost of setup and maintenance
- Recovery time
- Ease of bringing the operation back
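These three factors can be weighed against each other in a simple way: pick the cheapest method that still meets the recovery time a system needs, and use ease of recovery to break ties. The Python sketch below assumes that rule; the `RecoveryMethod` class, the cost figures, the recovery times, and the ease ratings are placeholders invented for the example, not measurements of any real product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryMethod:
    name: str
    monthly_cost: float    # setup and maintenance cost (placeholder values)
    recovery_minutes: int  # expected time to bring the system back
    ease: int              # 1 = hardest to bring back, 5 = easiest


def choose_method(methods, max_recovery_minutes) -> Optional[RecoveryMethod]:
    """Pick the cheapest method that meets the recovery-time target.

    Returns None when no method is fast enough.
    """
    candidates = [m for m in methods if m.recovery_minutes <= max_recovery_minutes]
    if not candidates:
        return None
    # Cheapest first; among equally priced methods, prefer the easier one.
    return min(candidates, key=lambda m: (m.monthly_cost, -m.ease))


# Illustrative figures only.
methods = [
    RecoveryMethod("instant recovery from backup", 500.0, 15, 4),
    RecoveryMethod("failover to replicated data", 1200.0, 5, 5),
    RecoveryMethod("spare computers", 100.0, 480, 2),
]

critical = choose_method(methods, max_recovery_minutes=30)
less_critical = choose_method(methods, max_recovery_minutes=24 * 60)
print(critical.name)
print(less_critical.name)
```

With these placeholder figures, a system that must be back within 30 minutes is matched to instant recovery from backup, while a system that can wait a day is matched to spare computers.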
5) Test and improve the disaster recovery system
Testing the DR plan is essential to developing the right DR plan for your business. Organize DR drills and rehearse the DR plan at regular intervals. Only during these rehearsals will you discover the practical difficulties and bottlenecks in your DR plan and get a chance to fix them. Rehearsals also ensure that your emergency team is familiar with its assignments and build confidence in its capabilities for an actual disaster.
We are hosting a webinar on “Best Practices for protecting your data center from disaster” to help you understand, implement, and set up the best DR strategy.
Reserve your spot and join us on November 28, Thursday at 11.00 AM PDT | 7.00 PM CET.