Ten years ago, the phrase “big data” was rarely used outside of a few select academic circles, even though the quest to quantify the volume of information in the world – and the rate at which that volume is growing – dates all the way back to 1941 (Forbes has fairly succinct historical timeline here, if you’re interested). The term made its way into mainstream only when the problems it referenced affected a larger population, and those problems increased in severity. Nowadays, even my 15-year-old nephew knows what “big data” means, and my 64-year-old mother has a fairly good idea too.
I find the origin and uptake of idioms pretty interesting, especially in an industry like ours that’s in rapid flux. We sometimes need new language to effectively communicate data requirements as environments and processes change with technology and culture. We at Appuri think now is just one of those times.
Why? Because we’re witnessing rampant growth in a certain kind of business problem for which the solution has no adequate name. So we’ve decided to coin one ourselves, borrowing a couple concepts from analogous solutions. We’re calling it the Edge Analytics Cache.
In the next few articles in this series, we’ll compare and contrast this idea with related concepts like the data warehouse, data lake, data sandbox, and data cube. But for now, we’ll start with a definition.
The Edge Analytics Cache (EAC) is a self-contained data platform that manages data ingest, storage, processing, and validation to maintain data in a business decision-ready state. It sits between the central data repository (such as an enterprise data warehouse) and internal constituents (business users) and acts as their local data store.
“Edge analytics” is a phrase typically used in the context of IoT, where data processing happens at the “edge” of the network to reduce loads on core computing environments. In this context “edge” refers to the connected devices. We’re adopting this phrase here to refer to the edge of the centralized data repository, but still within the organization — akin to a hub-and-spoke structure. This edge, the EAC, is designed to support the specific needs of a business group or team. For example, when talking to a customer asking for a refund, the Support team can pull purchase activity off of their own EAC.
“Cache” you’re probably more familiar with, likely in a web context, such as a browser cache. In this and other computing contexts, “cache” is information storage technology designed to speed up information retrieval. Information that’s frequently accessed, such as your favorite website, is stored locally to reduce the latency that would result from requesting files from the webpage each and every time it loads. Thus caching improves performance by speeding up read operations.