IT and data professionals are under increased pressure to deliver the goods when it comes to data – that is, insights that can help drive business decisions and boost profits. Toward that goal, many organizations are focusing on collaborative analytics to empower analysts and business users to get their jobs done with greater accuracy and speed.
Stephanie McReynolds, vice president at Alation, spoke with Information Management about what these trends mean and shared her observations on the top data analytics and data management themes to emerge from the recent Strata & Hadoop World event in San Jose.
Information Management: What are the most common themes that you heard at the conference?
Stephanie McReynolds: Re-defining data governance in the context of Hadoop and big data use cases was a key theme of Strata San Jose.
Sessions highlighted how the maturation of Hadoop implementations in the market requires some tooling to support governed self-service. Sessions included one from our customer eBay, which spoke about tooling that helps thousands of analysts more effectively find, understand and trust their data stored in Hadoop and Teradata.
Trifacta, Navigator and Waterline presented a demonstration of what a realistic data governance workflow could look like in Hadoop. And Joe Hellerstein shared an open source metadata management project called Ground. Discussions around data stewardship, governed self-service analytics, and metadata were very topical for the community of attendees.
IM: What were the most common challenges that attendees were facing with regard to data management and data analytics?
SM: One, that machine learning delivers the true business value of big data. Machine learning has emerged as the most likely way that every organization will derive value from data stored in HDFS. No matter which processing engine is used to prepare the data and execute queries, machine learning algorithms are where big data value is derived. Two, that broad-based analyst interaction with big data is important. Big data analysis should be a collaborative endeavor. Detailed knowledge about what the data means, how it was derived and its appropriate business uses sits in the heads of lots of different individuals in the organization.