We practitioners of the technological arts have a tendency to use specialized jargon. That’s not unusual. Most guilds, priesthoods, and professions have had their own style of communication, either for convenience or to establish a sense of exclusivity. In technology, we also tend to attach very simple buzzwords to very complex topics, and then expect the rest of the world to go along for the ride.
Take, for example, the tag team of “cloud” and “big data.” The term “cloud” came about because we systems engineers used to draw network diagrams of local area networks. Between the LANs, we’d draw a cloud-like jumble meant to refer to, pretty much, “the undefined stuff in between.” Of course, the Internet became the ultimate undefined stuff in between, and the cloud became The Cloud.
To Mom and Dad and Janice in Accounting, “The Cloud” means the place where you store your photos and other stuff. Many people don’t really know that “cloud” is shorthand, or that the reality behind the cloud is the growth of almost unimaginably huge data centers holding vast quantities of information.
Big data is another one of those shorthand terms, but this is one that Janice in Accounting and Jack in Marketing and Bob on the board really do need to understand. Not only can big data answer big questions and open new doors to opportunity, but your competitors are also using big data for their own competitive advantage.
That, of course, raises the question: what is big data? The answer, like most in tech, depends on your perspective. Here’s a good way to think of it. Big data is data that’s too big for traditional data management to handle. Big, of course, is also subjective. That’s why we’ll describe it according to three vectors: volume, velocity, and variety (the three Vs).
Volume is the V most associated with big data because, well, volume can be big. What we’re talking about here is quantities of data that reach almost incomprehensible proportions.
Facebook, for example, stores photographs. That statement doesn’t begin to boggle the mind until you start to realize that Facebook has more users than China has people. Each of those users has stored a whole lot of photographs. Facebook is storing roughly 250 billion images.
Can you imagine? Seriously. Go ahead. Try to wrap your head around 250 billion images.
So, in the world of big data, when we start talking about volume, we’re talking about insanely large amounts of data. As we move forward, we’re going to have more and more huge collections. For example, as we add connected sensors to pretty much everything, all that telemetry data will add up.
Or, consider our new world of connected apps. Everyone is carrying a smartphone. Let’s look at a simple example, a to-do list app. More and more vendors are managing app data in the cloud, so users can access their to-do lists across devices. Since many apps use a freemium model, where a free version is used as a loss leader for a premium version, SaaS-based app vendors attract large numbers of free users and so tend to have a lot of data to store.
Todoist, for example (the to-do manager I use), has roughly 10 million active installs, according to Google Play. That’s not counting all the installs on the Web and iOS. Each of those users has lists of items, and all that data needs to be stored. Todoist is certainly not Facebook scale, but they still store vastly more data than almost any application did even a decade ago.
Then, of course, there are all the internal enterprise collections of data, ranging from the energy industry to healthcare to national security. All of these industries are generating and capturing vast amounts of data.
Remember our Facebook example? 250 billion images may seem like a lot. But if you want your mind blown, consider this: Facebook users upload more than 900 million photos a day. A day. That works out to more than 300 billion photos a year, so that 250 billion number from last year will soon seem small.
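To put that upload rate in perspective, here is a rough back-of-envelope sketch in Python. It uses only the two figures quoted above and assumes, purely for illustration, that the daily rate stays constant; the function names are made up for this example.

```python
# Figures quoted in the article, treated here as rough, fixed values.
STORED_IMAGES = 250_000_000_000   # "roughly 250 billion images"
UPLOADS_PER_DAY = 900_000_000     # "more than 900 million photos a day"


def days_to_match(stored: int, per_day: int) -> float:
    """Days of uploads, at a constant daily rate, needed to equal the stored total."""
    return stored / per_day


def uploads_per_year(per_day: int, days: int = 365) -> int:
    """Total uploads over a year, assuming the daily rate never changes."""
    return per_day * days


if __name__ == "__main__":
    print(f"{days_to_match(STORED_IMAGES, UPLOADS_PER_DAY):.0f} days to match the stored total")
    print(f"{uploads_per_year(UPLOADS_PER_DAY):,} uploads per year")
```

Under that constant-rate assumption, the new uploads alone would equal the entire 250 billion collection in about 278 days, and a full year adds roughly 328 billion more images.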