The on-going Big Data media hype stirs up a lot of passionate voices. There are naysayers (“it is nothing new“), doomsayers (“it will disrupt everything”), and soothsayers (e.g., Predictive Analytics experts). The naysayers are most bothersome, in my humble opinion. (Note: I am not talking about skeptics, whom we definitely and desperately need during any period of maximized hype!)
We frequently encounter statements of the “naysayer” variety that tell us that even the ancient Romans had big data. Okay, I understand that such statements logically follow from one of the standard definitions of big data: data sets that are larger, more complex, and generated more rapidly than your current resources (computational, data management, analytic, and/or human) can handle — whose characteristics correspond to the 3 V’s of Big Data. This definition of Big Data could be used to describe my first discoveries in a dictionary or my first encounters with an encyclopedia. But those “data sets” are hardly “Big Data” — they are universally accessible, easily searchable, and completely “manageable” by their handlers. Therefore, they are SMALL DATA, and thus it is a myth to label them as “Big Data”. By contrast, we cannot ignore the overwhelming fact that in today’s real Big Data tsunami, each one of us generates insurmountable collections of data on our own. In addition, the correlations, associations, and links between each person’s digital footprint and all other persons’ digital footprints correspond to an exponential (actually, combinatorial) explosion in additional data products.