Why Google BigQuery excels at BI on big data concurrency
- by 7wData
If you're trying to do Business Intelligence (BI) on big data and the ability to handle large numbers of concurrent queries is a key issue for you, Google BigQuery may well be the way to go, according to a new Business Intelligence Benchmark released Thursday by AtScale, a startup that specializes in helping organizations enable BI on big data.
"Concurrency has been an Achilles' heel, a challenge for SQL-on-Hadoop," says Josh Klahr, vice president of product management for AtScale.
But AtScale's benchmark found concurrency to be BigQuery's greatest strength. Thanks to its serverless model, concurrent query performance on small data sets showed no degradation, even with more than 25 concurrent BI users.
"Concurrency, I think, was the biggest one," Klahr says. "But the user experience with BigQuery was also really nice. Maybe this isn't a surprise because Google has focused so much on consumer products over the years: Everything about using the product was really nice. The thing that actually took the longest was loading the data from our local network onto the cloud. Once we had the data there, the creation of the tables was really easy."
For its benchmark, AtScale used the same model it deployed last year for its benchmark tests of SQL-on-Hadoop engines on BI workloads. For that test, the idea was to help technology evaluators select the best SQL-on-Hadoop technology for their BI use cases. The goal was the same for the Google BigQuery benchmark.
"The AtScale benchmark provides enterprise leaders with useful comparisons they need to make BI work on big data," Doug Henschen, vice president and principal analyst at Constellation Research, said in a statement Thursday. "As the data grows more complex and diverse, these benchmark stats help enterprises understand leading big data query options and make better decisions critical to supporting BI infrastructure."
AtScale's testing team used the Star Schema Benchmark (SSB) data set, based on the widely used TPC-H data and modified to more accurately represent a typical BI-oriented data layout. The data set let the team exercise queries across large tables: the lineorder fact table contains close to 6 billion rows, and the large customer table contains over a billion rows.
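To make the concurrency scenario concrete, here is a minimal sketch of the kind of star-schema aggregation the SSB data set is built around, fanned out to simulate many simultaneous BI users. The table and column names follow the published SSB layout (the lineorder fact table joined to a date dimension), but the query itself is an illustrative assumption, not one of AtScale's actual benchmark queries, and the stub executor stands in for a real BigQuery client call.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative SSB-style query (an assumption, not AtScale's benchmark SQL):
# aggregate revenue from the ~6-billion-row lineorder fact table joined to
# the date dimension -- the shape of query a BI tool issues repeatedly.
ssb_query = """
SELECT d_year, SUM(lo_revenue) AS revenue
FROM lineorder
JOIN dates ON lo_orderdate = d_datekey
WHERE d_year BETWEEN 1993 AND 1997
GROUP BY d_year
ORDER BY d_year
"""

def run_concurrently(queries, run_query, workers=25):
    """Fan out BI queries with a thread pool, mimicking ~25 concurrent
    users; run_query is any callable that executes one SQL string
    (e.g. a BigQuery client's query method in a real setup)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_query, queries))

# Stub executor so the sketch runs without a BigQuery project: it just
# returns the length of each submitted query string.
results = run_concurrently([ssb_query] * 25, lambda q: len(q))
print(len(results))  # 25 -- one result per simulated concurrent user
```

The benchmark's finding is that BigQuery's serverless model keeps per-query latency flat as the number of such simultaneous submissions grows, which is the property a pattern like the fan-out above would expose.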