Make The Time For a Disaster Recovery Plan

There are a lot of IT-related things we prefer not to think about as we go about our daily business. Maybe we’ve developed good habits that result in our systems always being up-to-date and secure, or maybe we’ve hired people to keep them that way. Every once in a while we’ve been able to restore a previous version of a file from a backup, and confidence is high that we’ll be able to do it again if we need to. The network is solid and we don’t have to pay much attention to it – it just works.

But there are some things we’re guilty of not thinking about because they’re unpleasant and difficult to deal with. For a lot of people responsible for their company’s IT, that thing has a name: disaster recovery.

We’re bigger procrastinators about this than any of us would like to admit. Sage North America (yes, the software company) conducted a study in 2012 that pretty much reiterates what others have been saying for years: we back up our data (mostly on-site: 72%), but most of us (62%) don’t have formal disaster preparedness plans in place, because we either have never experienced a disaster (33%), haven’t really thought about it (30%), don’t think it’s important (27%), or don’t have the time to deal with it (20%). All of this in spite of some really sobering statistics: without a plan, 75% of companies that experience catastrophic data loss fail within three years, 43% never even reopen, and of those that do, only 29% are still open two years later.

Echo’s clients tend to be smart people doing business in one of the most beautiful and seismically active parts of the United States. I’m pretty sure that leaves “don’t have time to deal with it” as the reason you’re reading this blog post, so let’s tackle that one right now.

The first thing we should be clear on is that disaster recovery is a subset of business continuity. Somebody needs to spend time thinking about what constitutes a disaster, who has the authority to declare one, and what the business will need to remain functional afterward. Declaring the data offline and engaging the disaster recovery plan is a non-trivial exercise that can take a lot of time and careful attention to execute – and to reverse. Someone also needs to identify the workflows that are critical to the continued operation of the business, their dependencies on data, and the supporting systems beneath that data.
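
If it helps to make that concrete, here’s a minimal sketch in Python of what such an inventory might look like – every workflow, system, and recovery target in it is hypothetical, and a spreadsheet serves the same purpose. The useful property is that each system inherits the strictest recovery targets of the workflows that depend on it:

```python
# A hypothetical business-impact inventory -- the workflows, systems,
# and numbers are made up for illustration.
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    depends_on: list         # systems this workflow cannot run without
    max_hours_down: float    # tolerable downtime (recovery time objective)
    max_hours_lost: float    # tolerable data loss (recovery point objective)

workflows = [
    Workflow("invoicing", ["SQL Server", "file shares"], 24, 4),
    Workflow("email", ["Exchange", "Active Directory"], 4, 1),
    Workflow("design work", ["file shares", "license server"], 48, 24),
]

# Each system inherits the strictest targets of the workflows that need it.
targets = {}
for wf in workflows:
    for system in wf.depends_on:
        rto, rpo = targets.get(system, (float("inf"), float("inf")))
        targets[system] = (min(rto, wf.max_hours_down),
                           min(rpo, wf.max_hours_lost))

for system, (rto, rpo) in sorted(targets.items()):
    print(f"{system}: back within {rto:g}h, losing at most {rpo:g}h of data")
```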

Once you’ve identified the needs of the business in the event of a disaster, the next step is determining how to provide for them while you’re recovering from the loss of your IT infrastructure. This is where it would admittedly be great if a one-size-fits-all approach to IT worked, but that’s often not the case. While it might be possible to replicate backups offsite and restore them in the event of a disaster, the devil is in the details:
· How long will it take to restore that data to new servers? (There’s a back-of-the-envelope sketch after this list.)
· Whose servers are you restoring that data to?
· How will the network those servers are running on be different from the one they’re in now?
· How up-to-date and application-consistent will that data be? (Was the application aware a backup was happening?)
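
On the first question, a little arithmetic goes a long way. The numbers below are assumptions – plug in your own dataset size and line speed – but they show why a large restore over a typical business connection takes days, before any of the rebuild work on the far end even starts:

```python
# Back-of-the-envelope restore-time estimate. All figures are assumptions;
# substitute your own dataset size and bandwidth.
dataset_tb = 2.0       # data that must land on the new servers
line_mbps = 100        # bandwidth between the backup copy and those servers
efficiency = 0.7       # real-world throughput rarely hits line rate

seconds = (dataset_tb * 1e12 * 8) / (line_mbps * 1e6 * efficiency)
print(f"~{seconds / 3600:.0f} hours ({seconds / 86400:.1f} days) just to move the data")
```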

Backups replicated to and restored in someone else’s cloud infrastructure might be a good fit if:
· the data loss between replicated backups is acceptable (there’s a sketch after this list)
· the time to bring up servers from those backups is acceptable (with large datasets, this can take days)
· those servers can be reconfigured to be usable in the network they come up in
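
That first condition deserves real numbers too. The worst-case data loss isn’t just the interval between backups; it’s that interval plus however long each backup takes to land offsite. A sketch, with assumed figures:

```python
# Worst-case recovery point for replicated backups -- assumed figures only.
backup_interval_hours = 24    # nightly backup job
replication_lag_hours = 6     # time for the backup to finish copying offsite

worst_case_loss = backup_interval_hours + replication_lag_hours
print(f"A badly timed disaster costs up to {worst_case_loss} hours of data")
```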

If one of those conditions is a deal-breaker, we start having to look at other solutions that are more costly to implement but offer advantages in recovery points and recovery time:
· Did you know that you can get space in a datacenter for around $250 per month? If your static capacity requirements make running in someone’s cloud expensive, you may want to consider hosting enough resources offsite to handle your infrastructure needs during a disaster.
· Did you know that applications like Active Directory, Exchange Server and Microsoft SQL Server have built-in replication mechanisms, as does Microsoft’s Distributed File System? Application-level replication to a “hot” disaster recovery site is almost always the best way to make resources available in a disaster with the least data loss. Being able to leverage someone else’s work here is part of what makes taking applications to the cloud an attractive proposition if you’re not able to build out your own disaster recovery infrastructure.
· Backup software nowadays should be capable of snapshotting your running servers (they are virtual servers, aren’t they?), backing up the crash-consistent (and in many cases application-consistent) snapshots, and replicating them to your disaster recovery site or to shared cloud infrastructure. We like Veeam for this task in particular. (Whether that replication fits your WAN link is worth checking – see the sketch after this list.)
· Finally, you’d be surprised at how far the cost of storage area networks capable of efficient snapshot replication has come down – here we think Nimble is the standard against which to judge others’ proposals.
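
For either of the last two options, the question that decides feasibility is whether each day’s changed blocks fit through your WAN link before the next cycle starts. Another sketch with assumed figures – the daily change rate is the number worth measuring in your own environment before believing anyone’s proposal:

```python
# Will nightly snapshot replication fit its window? Assumed figures only;
# measure your own daily change rate before committing to a design.
dataset_tb = 10.0
daily_change_rate = 0.02    # 2% of blocks change per day (assumption)
wan_mbps = 100              # link between production and the DR site
efficiency = 0.7            # real-world throughput rarely hits line rate

changed_bits = dataset_tb * 1e12 * daily_change_rate * 8
hours = changed_bits / (wan_mbps * 1e6 * efficiency) / 3600
print(f"~{hours:.1f} hours to replicate each night's changed blocks")
```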

I hope this has given you enough to start thinking about how best to approach your organization’s need for disaster recovery as part of a business continuity plan. As always, Echo would love to help when questions come up along the way.