In the worlds of Big Data, NoSQL and relational databases, Splice Machine's name doesn't come up that often. But a closer look at the company's product, architectural approach and CEO put them on my radar a while back. And Version 2 of the product, which is being announced today, has made that radar dot much brighter.
Also read: The NoSQL community threw out the baby with the bath water
Also read: Full SQL on Hadoop? Splice Machine opens up its database for trials
Have RDBMS cake, eat NoSQL scaling, too
Before we look at version 2, let's cover the motivation behind v1. Specifically, Splice Machine looked long and hard at some pressing database conundrums:
The solution: create an ACID-compliant, SQL relational database on top of Apache HBase -- a NoSQL database that uses HDFS as its storage layer. Now you've got SQL, the relational model, ACID/transactional consistency, horizontal scaling and HDFS, all in one product.
Also read: Splice Machine's SQL on Hadoop database goes on general release
Sparking v2
So version 1 is pretty cool, but version 2 of the product ups the ante considerably: it on-boards another important data technology -- Apache Spark -- as an additional execution engine.
Splice Machine's CEO, Monte Zweben, gave me the lowdown on v2. Zweben is an alumnus of Stuyvesant High School, Carnegie Mellon, Stanford and the AI branch of NASA's Ames Research Center; he's also Rocket Fuel's Chairman of the Board.
Clearly no dummy, Zweben explained that the product employs a cost-based optimizer to enlist the services of Spark for queries that are long-running, have lots of scans and/or multiple phases of execution. Analytical queries often fit that profile, and will be well-handled by Spark. Simpler, operational queries will still be executed via HBase.
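To make that routing idea concrete, here is a toy sketch in Python. It is not Splice Machine's optimizer, whose internals Zweben did not detail; the class name, field names, thresholds and the `choose_engine` function are all hypothetical, invented purely to illustrate the three signals he mentioned (long-running work, many scans, multiple phases of execution).

```python
from dataclasses import dataclass

# Hypothetical cutoffs, chosen only for illustration.
ROW_THRESHOLD = 1_000_000   # rows scanned, a rough proxy for "long-running"
SCAN_THRESHOLD = 2          # number of table scans in the plan
STAGE_THRESHOLD = 1         # number of execution phases in the plan


@dataclass
class PlanEstimate:
    """Made-up cost estimates an optimizer might attach to a query plan."""
    rows_scanned: int
    num_scans: int
    num_stages: int


def choose_engine(plan: PlanEstimate) -> str:
    """Send heavy, analytical-looking plans to Spark and the rest to HBase."""
    looks_analytical = (
        plan.rows_scanned > ROW_THRESHOLD
        or plan.num_scans > SCAN_THRESHOLD
        or plan.num_stages > STAGE_THRESHOLD
    )
    return "spark" if looks_analytical else "hbase"


# A single-row operational lookup versus a multi-phase analytical query.
point_lookup = PlanEstimate(rows_scanned=1, num_scans=1, num_stages=1)
big_rollup = PlanEstimate(rows_scanned=50_000_000, num_scans=3, num_stages=4)

print(choose_engine(point_lookup))  # hbase
print(choose_engine(big_rollup))    # spark
```

A real cost-based optimizer would weigh table statistics and candidate plans rather than compare against fixed cutoffs, but the shape of the decision is the same: estimate the work, then pick the engine.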
Gentlemen, you don't have to choose your engines
Splice Machine users need not concern themselves with these implementation details; they just query the database in SQL and Splice Machine handles the rest. And, by the way, Splice Machine will use the core Spark engine, rather than going through Spark SQL, which would just add an unnecessary layer.
Open source = Open Sesame?
Splice Machine is a well-kept secret though; Zweben told me the company has about 10 customers.