Sunday’s AWS outage demonstrates why you should have a cloud plan

Some cloud implementations have stormy weather ahead.
On Sunday afternoon, during storms in Sydney, Amazon Web Services (AWS) had an outage affecting services running in its Sydney region after the weather damaged some of its hardware. As a result, organisations including The Iconic, Domino’s, Foxtel, Stan and Domain were left without websites or key services for several hours. A number of SMEs that use the platform were also affected.
The outage demonstrates that the cloud doesn’t (and can’t) deliver 100% uptime, and that putting all your critical system eggs in one basket will lead to problems should there be an outage. Those making the switch to the cloud need to plan for this contingency.
Many of the SMEs that were affected by the outage had consulted with a cloud partner and implemented a solution with no fail-over. This is a massive oversight: any cloud solution for a critical system (or a system that requires very high availability) needs to be able to run from elsewhere in case of system or network failure at the cloud provider. AWS provides multiple regions around the world so that services can be run from another location in case of failure.
Unfortunately, even those with proper fail-over in place using the AWS API had issues, as the API also went down and the fail-over did not kick in. Carsales, which was also affected, managed to keep running as it uses its own API and shares its services between AWS and Microsoft’s Azure.
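One way to avoid relying on the provider's own API to trigger fail-over is to check your services directly and choose where to send traffic yourself. The following is only a minimal sketch of that idea in Python, not a description of how Carsales or AWS do it. The function names and the idea of a simple health-check URL per location are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection errors, timeouts, HTTP errors and bad URLs all count as "down".
        return False


def pick_endpoint(
    endpoints: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first endpoint that passes the probe, in order of preference.

    The endpoints are hypothetical health-check URLs that you host yourself,
    for example one per region or one per cloud provider.
    """
    for url in endpoints:
        if probe(url):
            return url
    return None  # Nothing is healthy: alert a human rather than guess.
```

The list passed to `pick_endpoint` would start with your primary location, followed by a second region or a second provider. Because the check talks to your own services rather than to the provider's management API, it can still make a decision when that API is unavailable, provided the checker itself runs somewhere outside the affected region.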
When considering a move to the cloud, businesses need to ask their cloud partner what the plan is when an outage occurs and must consider how important a particular system is to their business. Depending on how important the system is, fail-over to another region, using multiple cloud providers or a hybrid on-premise/on-cloud approach may be needed.
Every critical or important business system needs to have a degree of redundancy to protect against failures. Unfortunately, many businesses' cloud transitions are wrongly focused solely on lowering costs. While this is understandable, the cost of outages also needs to be considered and can be used to determine how much redundancy should be built into your system.
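As a rough illustration of weighing outage costs against the cost of redundancy, the comparison can be reduced to a few lines of arithmetic. This is a simplified sketch: the function names are made up for this example, and every figure you feed in (hours of downtime per year, cost per hour of downtime, yearly cost of redundancy) is an estimate you would need to supply for your own business.

```python
def expected_annual_outage_cost(outage_hours_per_year: float, cost_per_hour: float) -> float:
    """Estimated yearly cost of downtime: hours down multiplied by cost per hour."""
    return outage_hours_per_year * cost_per_hour


def redundancy_pays_off(
    hours_without_redundancy: float,
    hours_with_redundancy: float,
    cost_per_hour: float,
    redundancy_cost_per_year: float,
) -> bool:
    """True if the downtime cost avoided exceeds what the redundancy costs."""
    hours_avoided = hours_without_redundancy - hours_with_redundancy
    savings = expected_annual_outage_cost(hours_avoided, cost_per_hour)
    return savings > redundancy_cost_per_year


# Hypothetical figures: 10 hours of downtime a year without fail-over,
# 1 hour with it, $5,000 lost per hour, and $20,000 a year for redundancy.
print(redundancy_pays_off(10, 1, 5_000, 20_000))
```

With these made-up numbers, the redundancy avoids about $45,000 of downtime for $20,000 a year, so it pays for itself. For a system where an hour offline costs very little, the same sums can just as easily point the other way.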
The cloud itself can provide this redundancy, either by working alongside your on-premise systems or through its fail-over features. It is also likely that faults will be repaired more quickly, as organisations large and small use cloud providers and the large ones will put pressure on providers to bring services back online as soon as possible.



