Big Companies Want to Move to the Cloud But Still Have No Idea How
- by 7wData
Given the rampant chatter in Silicon Valley about companies moving software and data from their data centers to a shared public cloud, you might expect they considered what that would mean for their critical business software applications.
But you would be wrong, according to one expert who helps corporations and government agencies make this journey.
"When companies move to the cloud, they're often not doing a basic engineering task, which is to look at all their applications and figure out what happens if they fail," Vishwas Lele, chief technology officer at consulting firm Applied Information Sciences, tells Fortune.
Most big companies are aware of basic best practices for cloud computing. For example, it is recommended that they distribute applications and data across different data centers (or "availability zones" in Amazon Web Services parlance) within one location. That means if there's a power failure in one facility, the application will keep running from another. The same applies across different geographic regions, say between data centers in Oregon and Northern Virginia.
But because these massive data centers rely on tens of thousands of servers, hardware failures are inevitable. The key is to figure out which of a company's applications are most critical to its ongoing operations and "build a reliability bubble" around them, Lele says.
In a blog post written in the wake of two cloud outages earlier this month (one at Amazon Web Services (amzn) and another on Microsoft Azure (msft)), Lele wrote:
"You look at each application, see what can go wrong, and create a checklist," Lele tells Fortune. "Then you look at the potential for failure, and figure out how to mitigate it for the most important applications."
IT professionals can think ahead to keep a small cloud glitch from spiraling. They can, for example, add what Lele calls "retry logic" to an important application. If the application hits a snag, this logic triggers an automatic retry. It's similar to hitting reboot on your computer before calling the help desk. If a resource is temporarily unavailable, the application will know enough to retry in a few seconds instead of shutting down. This, he continues, is something that IT pros don't have to do when running applications in-house, where they manage all the systems.
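To make the idea of automatic retries concrete, here is a minimal Python sketch. It is an illustration only, not code from Lele or from any cloud provider: the names `TemporarilyUnavailable` and `call_with_retry`, the three-attempt limit, and the two-second wait are all assumptions chosen for the example.

```python
import time


class TemporarilyUnavailable(Exception):
    """Hypothetical error meaning a cloud resource is briefly unreachable."""


def call_with_retry(operation, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Run operation(); on a transient failure, wait and then try again.

    `operation` is any zero-argument callable. The attempt count and delay
    are illustrative defaults, not recommended values.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TemporarilyUnavailable:
            if attempt == attempts:
                # Out of retries: let the caller see the failure.
                raise
            # Pause for a few seconds instead of shutting down.
            sleep(delay_seconds)
```

An application would wrap a call to a remote resource in `call_with_retry` so that a brief outage leads to a short pause and another attempt, while a persistent failure is still reported once the attempts run out. Real systems often lengthen the wait between attempts, which this sketch leaves out for brevity.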