Have you ever wondered what a disaster really looks like? It depends on your point of view when something bad happens. Nine times out of ten, how long the disaster lasts, and thus how long it takes to recover from it, depends on the plan created up front. But most important in Disaster Recovery is the ability to steer away from disaster by planning ahead and being ready when disaster strikes.
Last week Max wrote a piece on Business Continuity and everything associated with it. If you haven’t read it already, I urge you to do so (link to post).
One of the quotes used by Max for Disaster Recovery is the following:
Disaster Recovery is the discipline that dictates how an enterprise can immediately recover from a disaster (a catastrophic loss or failure of part or all systems at a given facility), how it can recover and which tools/processes need to be used.
A couple of keywords in this quote will be discussed in the following paragraphs.
First of all, it is a discipline. It is always a good idea to know what a word truly means. It should be very clear that discipline in an IT environment is a MUST and not an option.
The Cambridge English Dictionary gives two definitions of the word discipline, and both are important for a swift recovery after a disaster:
- training that makes people more willing to obey or more able to control themselves, often in the form of rules, and punishments if these are broken, or the behaviour produced by this training
- the ability to control yourself or other people, even in difficult situations
Starting with the second definition, the ability to control yourself (or other people) in difficult situations only comes from training. The key to a disciplined, and thus successful, recovery is the ability to control the (difficult) situation and get yourself and others on the path to a quick recovery. In case of a disaster it is not just the IT department that counts on your discipline to recover, but the whole business and in some cases even the lives of others…
Another word that is very important when it comes to discipline, disaster and recovery is dictation. When a disaster occurs, it is of the utmost importance that there is a plan. In his post, Max talked about the Business Continuity Plan. In case of a disaster, this plan needs to describe the strategy and plans needed to continue, or shut down, the business process until the recovery is completed.
The Disaster Recovery plan for IT needs to describe what steps need to be taken when a disaster happens, and what the procedure is to recover afterwards. Working in a bank will probably mean different recovery plans than working in a hospital, although a lot of the recovery steps will be the same.
Ask anybody to describe a disaster and you’ll end up with a million different answers. The same goes for a business, but in the end it all comes down to the same thing:
A disaster is a serious disruption, occurring over a relatively short time, of the functioning of a community or a society involving widespread human, material, economic or environmental loss and impacts, which exceeds the ability of the affected community or society to cope using its own resources.
In other words, it is the catastrophic event that disrupts the key functionality of a business. The disaster can be small and affect only certain parts (or departments) of the business, but it still takes a plan and tools to recover to full functionality.
In a BC plan, multiple tools will be used to recover from a disaster, and in the IT department it is no different. Recovering after a disaster means using the right tools to return to a preferred state so the business can continue with its tasks. This could mean a single tool, but in many cases the Disaster Recovery Plan will dictate how each aspect of the IT environment is to be recovered.
If your business is a hospital, the care for the patients is of the utmost importance. The infrastructure and applications needed to guarantee the welfare of the patients are therefore the primary systems in the recovery process, while other applications can be recovered once the primary systems are back.
To do so, a company should have a phased plan (roadmap) explaining which systems are primary, in what order they need to be started and what the Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) of these systems are. Not the IT department, but the Business (hence the BC/DR phrase) is responsible for determining what the order of recovery should be.
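As a rough illustration of such a phased plan, the sketch below models a handful of systems with a phase, an RTO and an RPO, and derives a start order from them. The system names and numbers are invented for this example (a hospital-style setup), and sorting by phase and then by RTO is only one plausible way to order things; the real order is whatever the business decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    phase: int        # 1 = primary systems, higher numbers are recovered later
    rto_minutes: int  # RTO: maximum acceptable downtime
    rpo_minutes: int  # RPO: maximum acceptable window of lost data


# Hypothetical plan; every name and value here is an assumption for illustration.
PLAN = [
    System("payroll", phase=3, rto_minutes=2880, rpo_minutes=1440),
    System("patient-records", phase=1, rto_minutes=30, rpo_minutes=5),
    System("email", phase=2, rto_minutes=240, rpo_minutes=60),
    System("patient-monitoring", phase=1, rto_minutes=15, rpo_minutes=0),
]


def recovery_order(plan):
    """Return systems sorted by phase first, then by tightest RTO within a phase."""
    return sorted(plan, key=lambda s: (s.phase, s.rto_minutes))


if __name__ == "__main__":
    for position, system in enumerate(recovery_order(PLAN), start=1):
        print(f"{position}. {system.name} (phase {system.phase}, "
              f"RTO {system.rto_minutes} min, RPO {system.rpo_minutes} min)")
```

Keeping the plan as plain data like this makes it easy for the business to review and change the order without touching the logic.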
What tools are used in the process is also a choice of the business, as they provide the resources (read: money) needed to implement the right Disaster Recovery tooling. Backup is part of Disaster Recovery, as are clustering, replication and application awareness: all of them are tools you can use to reach the right RTO and RPO for your applications and data.
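To make the link between tooling and objectives a bit more concrete, here is a minimal sketch of how you might check a tool's settings against an RTO and RPO. It assumes a simplified model: in the worst case a disaster hits just before the next backup or replication run, so up to one full interval of data is lost, and the restore time is an estimate you have measured in a test. The function names and figures are hypothetical.

```python
def meets_rpo(protection_interval_minutes, rpo_minutes):
    """Worst-case data loss is one full backup/replication interval."""
    return protection_interval_minutes <= rpo_minutes


def meets_rto(estimated_restore_minutes, rto_minutes):
    """The measured or estimated restore time must fit within the RTO."""
    return estimated_restore_minutes <= rto_minutes


if __name__ == "__main__":
    rpo, rto = 60, 30  # hypothetical objectives, in minutes

    # A nightly backup (every 1440 minutes) cannot satisfy a 60-minute RPO...
    print("nightly backup meets RPO:", meets_rpo(1440, rpo))
    # ...while replication every 5 minutes can.
    print("5-minute replication meets RPO:", meets_rpo(5, rpo))
    # A restore that takes about 20 minutes fits a 30-minute RTO.
    print("20-minute restore meets RTO:", meets_rto(20, rto))
```

A check like this is no substitute for an actual recovery test, but it shows quickly when a chosen tool or schedule cannot possibly meet what the business asked for.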
A disaster is something we cannot always prevent, although a lot of the disasters that occur in an IT environment are very preventable. Recovering is something we can, and should, plan for so that in case of a disaster we are able to recover back to normal in a swift and efficient way.
A lot can be achieved with resources provided by backup and DR tool vendors like Vembu, but in the end a company needs a BCDR plan that dictates the steps to take when disaster strikes. These steps make sure the business can recover from the disaster, and all employees need the discipline to take the right actions in the order dictated by the plans, which were created by people who thought long and hard about what to do (and not do) when disaster strikes.