On the surface, a search engine and an analytics engine seem to serve very different purposes, and they are usually accessed through vastly different interfaces. But when you scratch beneath that surface, how different are search and analytics, really?
Search: Find an answer in unstructured data
Fundamentally, a search engine ingests content and indexes it by parsing and storing data to facilitate fast and accurate retrieval of information. Most search engines focus on full-text indexing of natural language documents, but media types such as video, audio and graphics can also be indexed.
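To make the ingest-parse-store cycle concrete, here is a minimal sketch of an inverted index, the kind of structure many full-text engines build on. The sample documents, the tokenizer, and the function names (`tokenize`, `build_index`, `search`) are illustrative assumptions, not any particular product's API, and real engines add ranking, stemming, and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, for illustration only.
documents = {
    "doc1": "Quarterly sales report for the retail division",
    "doc2": "Retail store opening hours and locations",
    "doc3": "Sales training video transcript",
}
index = build_index(documents)
```

With this index in place, `search(index, "retail sales")` matches only the document that contains both terms. The parsing cost is paid once at ingestion time, so each query is reduced to a few set lookups.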
An efficient search engine interface is minimalist: it provides a single field for the user to enter their search terms. It is then up to the query interpreter to make sense of these terms, direct the request to the right set of indexes, and retrieve the most appropriate answer from them.
The user of a search engine usually comes with a simple question, but has little or no idea of where the answer may be found.
Analytics: Find an answer in structured data
At the other end of the spectrum, the user interface of a typical analytics tool is rich, presenting all types of predefined reports and charts to the user, and offering direct access to the underlying data structures that are available for them to drill into.
Most analytics solutions are designed to function with structured data: relational or hierarchical databases, multi-dimensional data structures (OLAP cubes), or “modern” data platforms — which include the different types of NoSQL databases, Hadoop data lakes storing structured data, and more.
The user of an analytics tool typically understands the structure of the data available to them, and comes with a complicated question that requires slicing and dicing of pre-determined data sets.
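As a rough illustration of what "slicing and dicing" means in practice, the sketch below filters a small set of fact rows on one or more dimensions and then aggregates a measure. The rows, the dimension names, and the helpers `filter_rows` and `aggregate` are hypothetical. An actual analytics tool would run the equivalent operation against a database or an OLAP cube.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, product, quarter)
# and one measure (revenue).
sales = [
    {"region": "EMEA", "product": "laptop", "quarter": "Q1", "revenue": 120},
    {"region": "EMEA", "product": "phone", "quarter": "Q1", "revenue": 80},
    {"region": "APAC", "product": "laptop", "quarter": "Q1", "revenue": 100},
    {"region": "EMEA", "product": "laptop", "quarter": "Q2", "revenue": 150},
    {"region": "APAC", "product": "phone", "quarter": "Q2", "revenue": 90},
]


def filter_rows(rows, **fixed):
    """Keep rows whose dimensions match the given values.

    Fixing one dimension corresponds to a slice; fixing several
    is a simple form of dice.
    """
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


def aggregate(rows, by, measure="revenue"):
    """Group rows by one dimension and sum the measure."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[by]] += r[measure]
    return dict(totals)


# Slice on quarter, then aggregate revenue by region.
revenue_by_region_q1 = aggregate(filter_rows(sales, quarter="Q1"), by="region")
```

Note that the caller has to know the dimension names in advance, which is exactly the familiarity with the data structure that the typical analytics user is assumed to have.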
Expanding analytics to unstructured data
The “modern” data platforms I referenced above, and especially Hadoop, close the gap between structured and unstructured data. With relational databases, it was possible (but cumbersome) to store unstructured data such as documents and media, but it was next to impossible to parse and index these elements as efficiently as structured data within the same data structure. Hadoop blurs the line: not only does the data lake contain both structured and unstructured data in the same place, but the mixed-workload engines available on top of this data lake provide indexing capabilities that make it possible to use unstructured data in some analytics use cases.
Expanding search to structured data
Going back to the other end of the spectrum, search engines have long sought to index non-textual data by exploring databases and other structured data sources. The goal is to enrich their results with data not explicitly contained in textual files, or to anticipate dynamic web content, where pages don’t always exist on a server but are created on the fly.
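One simple way to picture a search engine reaching into a structured source is to read rows from a database and index their text columns, as sketched below. The in-memory SQLite table, its columns, and the `index_table` helper are assumptions made for illustration. Production crawlers and connectors are considerably more involved.

```python
import sqlite3
from collections import defaultdict

# Hypothetical structured source: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO products (id, name, category) VALUES (?, ?, ?)",
    [
        (1, "Trail Runner", "shoes"),
        (2, "City Backpack", "bags"),
        (3, "Road Runner", "shoes"),
    ],
)


def index_table(connection):
    """Build a term -> row ids index from the text columns of each row."""
    index = defaultdict(set)
    query = "SELECT id, name, category FROM products"
    for row_id, name, category in connection.execute(query):
        for term in f"{name} {category}".lower().split():
            index[term].add(row_id)
    return index


product_index = index_table(conn)
```

After indexing, a free-text term such as "runner" resolves to row ids without the user knowing which table or column holds the value.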
Two worlds coming together?
A search engine does offer an ease of use that can be appealing to users who are simply “looking for an answer.” This answer can be found in indexed data, but also in predefined reports or data aggregates, and presented in an easy and intuitive way to the knowledge worker, who probably does not care how this information was retrieved and presented to them.