In recent years the field of Business Intelligence (BI) has taken a dramatic turn from IT-managed offerings to self-service solutions that allow end users to load their own data and build their own dashboards. An important part of this “democratization of BI” has been a vendor trend toward interactive data exploration, in which users dynamically create aggregates and apply filters. Unfortunately, this shift toward ad-hoc exploration taxes even the fastest database backends, which cannot keep up with the deluge of data coming into organizations from increasingly diverse sources.
The traditional solution to this “speed gap” was to employ cubing or indexing strategies in the hope that the majority of users’ queries could be precomputed on the BI backend. Unfortunately, this strategy quickly breaks down in the face of the near-unlimited customization that modern BI systems allow: every user on the system is likely building their own dashboards with their own custom aggregates and filters, rendering precomputation mostly ineffective.
To overcome these limitations, vendors pursued a second route: throwing more servers at the problem. In recent years it has become common to see BI tools propped up by backends consisting of tens or even hundreds of servers running the fastest analytics databases. However, this approach has substantial downsides of its own: such systems are immensely costly and complex to implement, and large distributed systems deliver diminishing gains as they scale.
MapD takes a very different approach.