Do you feel like there is data everywhere, but no data that’s really usable? Well, look no further. IBM Bluemix Data Connect announced a brand new Design Data Flow beta capability to help you easily consume large volumes of data coming from disparate data sources in the cloud or on premises. The data can be curated in a self-service fashion through a series of operations and written to a target that can also be in the cloud or on premises. You can then derive trusted insights from the prepared data using your preferred analytics tools.
Now, you might be thinking that any extract, transform and load (ETL) tool can handle this consume, curate and derive-insights workflow. But imagine if each step in the flow ran automatically as soon as you added it, so you could see the state of your data instantly at any point in the flow. Further, imagine a built-in, self-service data preparation capability that provides an innovative way to cleanse a data set and improve its quality.
Consider a simple scenario that demonstrates how IBM Bluemix Data Connect works. Say a customer stores basic customer information in a customer data set in an on-premises IBM DB2 database, and salary, contact and geographic information in a prospect data set in cloud-based dashDB. Also assume the customer stores sentiment information (Twitter data) in a sentiment data set in an object store.
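Conceptually, this scenario amounts to a three-way join across the customer, prospect and sentiment data sets on a shared customer key. The following minimal Python sketch illustrates the idea only; the `customer_id` key, the field names and the sample records are all assumptions for illustration, not the product's actual schemas or API:

```python
# Hypothetical sketch of combining the three data sets in the scenario.
# The customer_id key and all field names are illustrative assumptions.

# Customer data set (stand-in for the on-premises IBM DB2 table)
customers = [
    {"customer_id": 1, "name": "Ada Lovelace"},
    {"customer_id": 2, "name": "Alan Turing"},
]

# Prospect data set (stand-in for the cloud-based dashDB table)
prospects = [
    {"customer_id": 1, "salary": 85000, "city": "London"},
    {"customer_id": 2, "salary": 92000, "city": "Manchester"},
]

# Sentiment data set (stand-in for Twitter data in the object store)
sentiments = [
    {"customer_id": 1, "sentiment": "positive"},
    {"customer_id": 2, "sentiment": "neutral"},
]

def join_on_customer_id(*data_sets):
    """Merge records that share a customer_id across all data sets."""
    merged = {}
    for data_set in data_sets:
        for record in data_set:
            merged.setdefault(record["customer_id"], {}).update(record)
    return list(merged.values())

curated = join_on_customer_id(customers, prospects, sentiments)
for row in curated:
    print(row)
```

In the actual service, a flow like this would be assembled visually from source, operation and target nodes rather than written by hand; the sketch only shows the shape of the data movement.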