At a recent TDWI conference, I was strolling the exhibition floor when I noticed an interesting phenomenon. A surprising percentage of the exhibiting vendors fell into one of two product categories. One group was selling cloud-based or hosted data warehousing and/or analytics services. The other group was selling data integration products.
Of course, when you think about it, this makes a lot of sense. The economics of cloud computing have already shown benefits with software-as-a-service products like Salesforce.com. The same paradigm can significantly reduce the cost of developing and managing big data projects with tools like Hadoop, because you don't have to pay for the necessary hardware up front. But moving data off-premises does not obviate the need for internal data accessibility for in-house reporting. That means being able to integrate data wherever the data lives.
Therein lies the problem: as organizational applications migrate to hosted environments, so does the data. And once that data is sitting in someone else's environment, you begin to lose control over it. Think about this: when you access customer data sitting in a SaaS CRM product, you are not only bound to the vendor's internal data model, you're also constrained by the vendor's data accessibility methods.