Benchmarks to prove you need an analytical database for Big Data Blog

Benchmarks to prove you need an analytical database for Big Data

by 7wData
November 5, 2016

When it comes to big data, it’s important to ask “What’s next?” Sure, users find lower licensing costs when storing data in Hadoop—although they often do pay for subscriptions. Storing data efficiently in a cluster of nodes is the table stakes for data management today. However, it’s important to remember what happens next. The next step is often about performing analytics on the data as it sits in the Hadoop cluster. When it comes to this, our internal benchmarking testing reveals limitations of the Apache Hadoop platform.

We set up a 5-node cluster of Hewlett Packard Enterprise DL380 ProLiant servers. We created 3 TBs of data in ORC, Parquet, and our own ROS format. Then, we put the TPC-DS benchmarks to the test with Vertica, Impala, Hive on Tez, and even Apache Spark. We took a look at CDH 5.7.1 and Impala 2.5.0 and HDP 2.4.2 Hawq 2.0 in comparison to Vertica.

We first took note of whether all the benchmarks would run. This becomes important when you’re thinking about the analytical workload. Do you plan to perform any complex analytics? In our benchmarks, Vertica completed 100% of the TPC-DS benchmarks while all others could not.

Initially, the results were very poor for the Hadoop-based solutions until we found some rewritten queries on github for some tools.

For example, if you want to perform time series analytics and the queries are not available, how much will it cost you to engineer a solution? How many lines of code will you have to write and maintain to accomplish the desired analytics.

Do You Want to Share Your Story?

Bring your insights on Data, Visualization, Innovation or Business Agility to our community. Let them learn from your experience.

Benchmarks to prove you need an analytical database for Big Data

Leave a Reply Cancel reply

Upcoming Events

MarkLogic World | Amsterdam

Knowledge Graph — The Ultimate Center of Excellence

From Text to Value: Pairing Text Analytics and Generative AI

Bringing Data Closer to Decision Makers with Data Fabric

Categories

Tags

You Might Be Interested In

6 Key Ingredients for a Successful Social Media Marketing Strategy

5 Things to Know About Machine Learning

Healthcare Data Storage Options: On-Premise, Cloud and Hybrid Data Storage

Recent Jobs

Senior Cloud Engineer (AWS, Snowflake)

IT Engineer

Data Engineer

Applications Developer

Do You Want to Share Your Story?

Join our community

Our Services

Company

Work With Us

Follow Us

Get the 3 STEPS

To Drive Analytics Adoption
And manage change

Get Access to Event Discounts

Switch your 7wData account from Subscriber to Event Discount Member by clicking the button below and get access to event discounts. Learn & Grow together with us in a more profitable way!

Get Access to Event Discounts

Create a 7wData account and get access to event discounts. Learn & Grow together with us in a more profitable way!

Don't miss Out!

Stay in touch and receive in depth articles, guides, news & commentary of all things data.

Benchmarks to prove you need an analytical database for Big Data

Leave a Reply Cancel reply

Upcoming Events

Categories

Tags

You Might Be Interested In

Recent Jobs

Do You Want to Share Your Story?

Join our community

Our Services

Company

Work With Us

Follow Us

Get the 3 STEPS

To Drive Analytics Adoption And manage change

Get Access to Event Discounts

Switch your 7wData account from Subscriber to Event Discount Member by clicking the button below and get access to event discounts. Learn & Grow together with us in a more profitable way!

Get Access to Event Discounts

Create a 7wData account and get access to event discounts. Learn & Grow together with us in a more profitable way!

Don't miss Out!

Stay in touch and receive in depth articles, guides, news & commentary of all things data.

To Drive Analytics Adoption
And manage change