From time to time you may hear how secure and efficient public cloud services are. Cloud providers do take good care of their clients’ data, and they routinely assure you that their services are properly maintained and secured. All these assurances can easily lull you into a sense of safety, and you may come to consider your data safe in your chosen cloud service. But do you really have nothing to worry about once you have successfully completed your migration to the cloud?
You may have heard about the disastrous fire in a data center of one of the best-known cloud service providers, OVHcloud. Unfortunately, the backups of the clients’ data were kept in a nearby data center, which was also affected by the fire. As a result, recovering data for some of the services hosted there turned out to be impossible. OVHcloud recommended that these clients implement their own disaster recovery procedures. Some of the clients went looking for such an option in the OVHcloud service management interface. But there was nothing like that there…
Unfortunately, this is how many cloud users learned the hard way that a successful migration to the public cloud is not the end of the work. Unless it was properly prepared beforehand, it’s just the beginning. As you may see in the “It might look like another IaaS, PaaS, and SaaS explanation” post, the data is always yours, regardless of which service delivery model you choose. It is therefore your responsibility to secure your data properly, no matter how secure your provider promises it is.
A disaster recovery plan.
The biggest and most straightforward lesson learned from the OVHcloud fire is to have a disaster recovery plan. But what does it actually mean to have a disaster recovery plan? Well, you have to do four major things to prepare one for yourself or your organization:
- Predict what can go wrong.
- Decide how to mitigate each risk and implement the solution.
- Plan what to do when it goes wrong.
- Test your plans.
Predicting what can go wrong.
Predicting what can go wrong is formally known as a risk assessment. The name of the process alone may make you feel that doing it will hurt. And you’re most likely right. Not many people like to think systematically, thoroughly, and strategically about all the bad things that can happen to their business, their IT infrastructure, and their systems. Although it’s neither easy nor pleasant, it may save you from the aftermath of situations like the OVHcloud fire. OK, but how do you do it?
Basically, the first thing you have to do is list all the bad things that can happen to your data. Next, you identify the cause of each of these hazards, and finally, you evaluate each of the hazards. Your evaluation should include at least two criteria:
- The likelihood of the hazard occurring.
- The severity of the consequences if the hazard actually occurs.
By multiplying these two you get a score for each hazard. You can also estimate what each hazard would cost the business in money if it occurred. This is very helpful in the next steps.
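The scoring described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of any standard: both criteria are rated on a 1 to 5 scale, and the hazards, ratings, and cost figures are made up for the example. The names `Hazard` and `rank_hazards` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    likelihood: int          # assumed scale: 1 (rare) to 5 (almost certain)
    severity: int            # assumed scale: 1 (negligible) to 5 (catastrophic)
    cost_if_happens: float   # illustrative loss estimate, in your currency

    @property
    def score(self) -> int:
        # The score from the text: likelihood multiplied by severity.
        return self.likelihood * self.severity


def rank_hazards(hazards: list[Hazard]) -> list[Hazard]:
    # Highest score first, so the most pressing hazards end up on top.
    return sorted(hazards, key=lambda h: h.score, reverse=True)


# Made-up entries for illustration only, not real statistics.
register = [
    Hazard("Data center fire destroys servers and nearby backups", 1, 5, 500_000),
    Hazard("Ransomware encrypts production data", 3, 5, 200_000),
    Hazard("Internet outage for 5 minutes", 4, 1, 500),
]

for hazard in rank_hazards(register):
    print(f"{hazard.score:>2}  {hazard.name}")
```

Note that an unlikely but catastrophic hazard, such as the fire, can get a modest score. That is one reason to keep the money estimate next to the score instead of relying on the score alone.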
You can find some guidance in the ISO 27001 standard, which covers information security processes. You may also look for clues in one of the hundreds of examples and templates around the Internet. Some websites even provide examples of risk assessments for different industries. You can find risk assessment approaches there and decide which one you would like to follow. As there are tons of resources on how to perform a risk assessment, we won’t cover it here in detail.
Want some general advice about the assessment? Don’t skip hazards that seem irrelevant or very unlikely. Several data centers around Strasbourg burning down at the same time, for example, may look silly at first. No serious cloud provider would let this happen. And if it did happen, what would cause it? A nuclear bomb? A war? There’s not much nuclear war expected in Strasbourg, right?
As you can see from the OVHcloud case, these kinds of things do happen, and when they do, they hurt the affected clients a lot.
Deciding what to do with each of the possible scenarios.
OK, so you have all the bad things that can happen to your IT environment listed and evaluated. Next, you have to think about what you can do about each of them. Generally, there are four strategies at your disposal:
- Risk avoidance. You can decide not to do something that generates a particular risk. You may find that, in the process of developing application code, using some libraries exposes you to vulnerabilities. Let’s say that a particular library has had many critical vulnerabilities discovered in the past. What’s more, its developers didn’t fix them quickly enough. In this case, you may decide not to use this particular library even though it provides useful code… That way you avoid exposure to this risk.
- Risk acceptance. For some of the issues, you can probably accept the risk. For example, losing access to the Internet for 5 minutes may be acceptable for some businesses.
- Risk transference. You might fail to fulfill the SLA promised to your clients from time to time. As a result, you can be obliged to pay some penalty fees. What you can do is buy insurance for these circumstances. That way you transfer your risk to the insurance company.
- Risk limitation. This is the most common strategy, as it lets you do your thing, but with a limited cost of bad consequences. As you know, using a computer exposes you to some risks caused by malware. You don’t stop using computers because of that, though. What you can do is use anti-malware software. You can also educate yourself to use your computer wisely and to act properly when you notice suspicious behavior. These actions don’t make the threat of malware damaging your data go away, but they significantly limit the consequences.
To decide which approach to take and which solutions to implement to mitigate each hazard, you use your estimates of the likelihood and severity of each hazard. It’s even better if you have them expressed as potential money losses for the company. In this case, you can compare them with the costs of implementing the systems that mitigate those risks. With both of these listed in your currency in front of you or your management board, for most issues it’s a no-brainer whether you should implement each of the protective tools.
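The comparison above can be sketched as a small calculation. This is a simplified illustration, not a formal method: it assumes you can express each hazard as a yearly probability and a loss amount, both before and after the mitigation, and all the figures below are invented. The function names are hypothetical.

```python
def expected_loss(probability_per_year: float, loss_if_happens: float) -> float:
    # Expected yearly loss: how likely the hazard is times what it would cost.
    return probability_per_year * loss_if_happens


def mitigation_pays_off(
    probability_per_year: float,
    loss_if_happens: float,
    residual_probability: float,
    residual_loss: float,
    mitigation_cost_per_year: float,
) -> bool:
    # Compare the reduction in expected yearly loss with the yearly cost
    # of the system that provides this reduction.
    before = expected_loss(probability_per_year, loss_if_happens)
    after = expected_loss(residual_probability, residual_loss)
    return (before - after) > mitigation_cost_per_year


# Invented figures: off-site backups do not make a fire less likely,
# but they cut the loss from 500,000 to 20,000.
print(mitigation_pays_off(0.02, 500_000, 0.02, 20_000, 3_000))

# Invented figures: a second Internet link to avoid short outages
# costs far more than the outages themselves.
print(mitigation_pays_off(0.9, 500, 0.1, 500, 6_000))
```

With these made-up numbers the first call returns `True` (limit the risk) and the second returns `False` (a candidate for risk acceptance). Real decisions also have to weigh things this sketch ignores, such as reputation or legal obligations.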
Planning what to do when it goes wrong.
OK, so you have all your risks listed and evaluated. You have also decided which systems you should implement to mitigate all the risks. Let’s say you already have them implemented. Is this all you have to do? Well, not exactly. You also need procedures for what to do with all of these tools when a disaster happens. These are the disaster recovery procedures, after all.
Let’s say you have all the backups checked and verified regularly. But what should you start with when something goes wrong? What should you recover first? Where should you look for infrastructure to recover everything on? When a disaster strikes, time is very precious. Everyone is under tremendous pressure to recover all the services as quickly as possible. You don’t want to have to improvise in these conditions. As Chris Voss always says, “when the pressure is on, you don’t rise to the occasion – you drop to your highest level of preparation”… You may not have heard of him, but he is a former FBI lead hostage negotiator. As you can imagine, he knows a few things about working under pressure.
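One way to avoid improvising the question of what to recover first is to write the dependencies between your services down in advance and derive the order from them. The sketch below is only an illustration: the service names and their dependencies are hypothetical, and it uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or newer) to produce an order in which every service comes after the services it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each one maps to the services it depends on.
dependencies = {
    "database": set(),
    "identity-provider": {"database"},
    "web-shop": {"database", "identity-provider"},
    "reporting": {"database"},
}


def recovery_order(deps: dict[str, set[str]]) -> list[str]:
    # static_order() yields each service only after all of its dependencies,
    # and raises graphlib.CycleError if the dependencies form a loop.
    return list(TopologicalSorter(deps).static_order())


for step, service in enumerate(recovery_order(dependencies), start=1):
    print(f"{step}. recover {service}")
```

A real procedure would also record who is responsible for each step and where the replacement infrastructure comes from. The point here is only that the order is decided before the disaster, not during it.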
Training and overcoming resource limitations.
Preparing a disaster recovery plan may seem like a tremendously laborious and boring process. It’s even worse to rehearse the planned procedures time after time to ensure you’re ready. The hardest part, though, is finding the time to do it. As an IT manager or IT employee, you are usually under constant pressure from your business to implement new features and services and to solve current IT issues. It’s quite obvious that you and your team could do so much other stuff instead of preparing and testing disaster recovery plans. Unfortunately, as the OVHcloud case proved, disasters do come. Your disaster will come someday too. You and your business will then thank yourselves for well-prepared and well-tested disaster recovery plans, regardless of the pain it took to prepare them.
Conclusion
In the following posts, we will cover some of the tools to protect your data in a public cloud. For today, please just remember that, regardless of all the promises, no vendor will secure your data as well as you can. In case of an emergency, the provider will advise you to implement your disaster recovery plan. So you’d better have one… And you’d better have it deliberately planned and conscientiously tested.