Expert Interview: HPE Vertica’s Steve Sarsfield on Big Data Innovations

Syncsort’s Paige Roberts caught up with Steve Sarsfield from Hewlett Packard Enterprise (HPE) at the latest Strata. Steve is the product marketing manager for HPE Big Data Software, focused on their Vertica for SQL on Hadoop product. Steve is also a notable name in the arena of data quality and governance, and authored the book The Data Governance Imperative. Enjoy some keen industry insight in this interview between Paige and Steve.

So, we’re at Strata, and you’re a Vertica person. What do you feel the intersection is for Hadoop and Vertica?

HPE and Hadoop intersect quite a bit when it comes to some of the innovations that we’re working on, and we’re showing several of them [here at Strata]. One is our big data reference architectures, which we’ve designed to work in partnership with Hadoop, specifically HDFS and YARN. These reference architectures let you use YARN labels to designate compute and storage resources and to separate the two. So if you want to make that allocation dynamic within the organization, you can use YARN labels to specify how much compute and how much storage to use for any job.

The second part is that we have HPE Vertica for SQL on Hadoop. That is a product that allows you to install our Vertica engine directly into the Hadoop cluster and perform SQL queries on Hadoop. It’s 100% TPC-DS compliant, fully ANSI SQL compliant and can be installed either in the Hadoop cluster or separately as a Vertica cluster. It’s a high-performance engine, and we’re happy to show that off here at Strata, too.
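Editor's illustration: the label-based split of compute and storage that Steve describes can be pictured with a toy model. The sketch below is not YARN's API and does not reflect HPE's reference architectures; the `Node` class, the `place` function, and the label names "compute" and "storage" are all hypothetical, chosen only to show the idea of routing a job to nodes that carry a given label.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    label: str        # hypothetical label, e.g. "compute" or "storage"
    free_slots: int   # abstract capacity units, not real YARN containers


def place(nodes, label, slots_needed):
    """Greedily reserve slots on nodes that carry the requested label.

    Returns a {node_name: slots_reserved} plan, or raises ValueError
    if the labelled nodes cannot cover the request.
    """
    plan = {}
    remaining = slots_needed
    for node in nodes:
        if node.label != label or remaining == 0:
            continue
        take = min(node.free_slots, remaining)
        if take > 0:
            plan[node.name] = take
            remaining -= take
    if remaining > 0:
        raise ValueError(f"not enough '{label}' capacity for {slots_needed} slots")
    # Commit the reservation only once the whole request is known to fit.
    for node in nodes:
        node.free_slots -= plan.get(node.name, 0)
    return plan


cluster = [
    Node("c1", "compute", 4),
    Node("s1", "storage", 2),
    Node("c2", "compute", 4),
]
print(place(cluster, "compute", 6))
```

In this toy model, a request for six "compute" slots never touches the node labelled "storage", which is the separation the labels are meant to provide.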

Syncsort and Vertica have been pretty tight over the years.

What do you see as the synergies? What makes it such a good partnership?

Our strength is in providing very fast analytics for massive amounts of data. We focus all of our effort, from the way we store data to the way we compress columns, so that the analysis happens fast. What Syncsort brings to the table is the basic concept of getting the data into the database. That’s really important, because although we ingest data, we don’t have that completely covered. If you have complex data or particularly tricky data, we rely on our partnerships like Syncsort. I think that’s a really important component, especially in today’s age when there are so many different file formats and unstructured data and a lot of options when it comes to storing data. We need a partner like you guys to do it.

This is a question I’ve been asking everyone to get different perspectives. What do you think Hadoop is for?

It’s a “make you think” question.

Hadoop is a general term that describes many projects that are going on in the open source community. Hadoop, and specifically HDFS, is primarily for storing data at a very low cost. Companies gather data without being sure what it’s good for or what value it has, and they need a low-cost place to put it. Hadoop, or at least the HDFS component of Hadoop, is a really good place for that. The whole Hadoop community is based on the fact that more and more data is coming at us. However, what we aren’t seeing is IT budgets growing by much. What I hear is that data volumes are growing by 25 to 50 percent, or more in certain companies, while IT budgets are growing by about 4 percent. So companies are looking for ways to store data at a low cost, and that’s one of the things Hadoop does well. The other use is data discovery: understanding what data you have and getting into it to see whether there’s any value there. Those two functions are what I think it’s for. Beyond that, it’s pretty exciting to see all the other things the Hadoop community is incubating: countless projects that help companies manage big data.
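Editor's illustration: to see why the gap Steve mentions pushes companies toward low-cost storage, it helps to compound the two growth rates. The sketch below assumes the quoted figures are annual rates that stay constant, and uses a five-year horizon; both are assumptions made for illustration, not claims from the interview.

```python
def compound(start, rate, years):
    """Value after `years` of growth at a fixed annual `rate`."""
    return start * (1 + rate) ** years


def data_per_budget(data_rate, budget_rate, years):
    """How much more data each budget dollar must cover after `years`,
    relative to today (1.0 means no change)."""
    return compound(1.0, data_rate, years) / compound(1.0, budget_rate, years)


YEARS = 5            # assumed horizon, for illustration only
BUDGET_RATE = 0.04   # "about 4 percent" from the interview

for data_rate in (0.25, 0.50):  # the 25 to 50 percent range quoted above
    ratio = data_per_budget(data_rate, BUDGET_RATE, YEARS)
    print(f"data +{data_rate:.0%}/yr: {ratio:.1f}x more data per budget dollar after {YEARS} years")
```

Under these assumptions, each budget dollar has to cover roughly 2.5 to 6.2 times as much data after five years, so the cost per unit of data stored has to fall by about the same factor.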

What do you think of Spark?

Spark is really exciting technology. It seems like something that will be really powerful in the future.

 
