While researchers race to develop new methods of machine intelligence, another, perhaps more impactful, trend is brewing in the field. Open source frameworks like Apache Spark are hitting their stride at the ideal time to put data analytics in the hands of business development analysts without neglecting the needs of data scientists.
IBM’s new Project DataWorks is built with both Spark and IBM Watson at its core to prioritize speed and usability without sacrificing robust analytics. The best way to think about DataWorks is as a sort of Google Docs for data analytics. In practice, companies hold huge data libraries that often end up scattered across a variety of decentralized locations. IBM’s new product ingests all of this company data and puts it in one intuitively accessible place.
To keep all that data at the fingertips of those who need it, IBM has deployed a dashboard that displays data assets broken down by access, user, and category statistics. IBM calls its technology for organizing data "catalogs." With natural language search, users can pull up specific data sets from those catalogs much more quickly than with traditional methods. DataWorks also touts data ingestion at speeds ranging from 50 Gbps to hundreds of Gbps.
Using technologies like PixieDust and Brunel, users can produce data visualizations with as little as one line of code.