How corporations use big data can be the difference between business success and failure. Now the fast data architecture promises even more possibilities, given the right data management to support it.
Data generation is increasing at mind-boggling rates, and the evidence surrounds us: 21 million tweets and 9 billion email messages are sent every hour. Soon, even more information will be created. Sensors will collect performance data on items such as light bulbs, personal medical devices will monitor insulin levels and inventory will be tracked as it moves from place to place.
As a result, analyst firm IDC expects data volumes to double every two years and reach 40 zettabytes — a zettabyte equals one million petabytes — in 2020. Enterprises want to do more than collect information for future analysis — they want to evaluate it in real time, a desire that is dramatically changing the data management market.
Recently, big data systems have been all the rage. In fact, IDC projects that the market will grow at 23.1% annually and reach $48.6 billion in 2019. Big data systems have been gaining traction for a few reasons. They allow organizations to collect large volumes of information and use commodity hardware and open source tools to examine it. Those deployments cost much less than traditional proprietary database management systems (DBMSes). Consequently, Hadoop clusters built from thousands of nodes have become common in many organizations.
With competition increasing, management is placing new demands on IT. “Knowledge is power, and knowledge of yesterday is not as valuable as knowledge about what’s happening now in many — but not all — circumstances,” said W. Roy Schulte, vice president and analyst at Gartner. Businesses want to analyze information in real time, an emerging approach dubbed fast data.

Traditionally, acting on large volumes of data instantly was viewed as impossible; the hardware needed to support such applications was too expensive. But that thinking has recently been changing. The use of commodity servers and the rapidly decreasing cost of flash memory now make it possible for organizations to process large volumes of data without breaking the bank, giving rise to the fast data architecture.

In addition, new data management techniques enable firms to analyze information instantly. For example, transaction systems include checks so that only valid transactions take place. A bank would not want to approve two transactions entered within milliseconds that took all of the money out of a checking account. Analytical systems collect information and illustrate trends, such as call center staff taking more time to handle customer inquiries. By linking the two, corporations could build new applications that perform tasks like instantly approving a customer’s request for an overdraft because the client’s payment history is strong; a simple sketch of that linkage appears below.

Technically speaking, a zettabyte is 10²¹ bytes, or a billion terabytes — but data capacity numbers that large can be hard to digest. In more practical terms, a zettabyte is equivalent to roughly 152 million years of high-definition video. Forty zettabytes — the level IDC expects data volume to reach in 2020 — split among the 7 billion people on Earth equates to about 5.7 terabytes per person (40 × 10²¹ bytes ÷ 7 × 10⁹ people ≈ 5.7 × 10¹² bytes).
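Returning to the overdraft example, the sketch below shows the idea in miniature. All of the names here (`Account`, `approve_withdrawal`, the 95% on-time threshold) are invented for illustration, and plain in-memory objects stand in for real systems; an actual fast data platform would feed the analytical signal from a live stream or in-memory data grid.

```python
# Minimal sketch: a transactional check consults an analytical signal
# (payment history) to approve an overdraft in real time.
from dataclasses import dataclass

@dataclass
class Account:
    balance: float
    on_time_payment_rate: float  # analytical signal, e.g. derived from payment history

def approve_withdrawal(account: Account, amount: float) -> bool:
    # Transactional check: allow the withdrawal if funds cover it.
    if amount <= account.balance:
        return True
    # Overdraft path: a strong payment history (hypothetical 95% threshold)
    # lets the request through instantly instead of bouncing it.
    return account.on_time_payment_rate >= 0.95

acct = Account(balance=100.0, on_time_payment_rate=0.98)
print(approve_withdrawal(acct, 250.0))  # True: overdraft approved on history
```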
Traditional data management systems worked only with data at rest, storing information in memory, on a disk, in a file, in a database or in an in-memory data grid and evaluating it later. Emerging products, labeled streaming systems, work with data in motion: information that is evaluated the instant it arrives. The new streaming platforms use various approaches, all with the goal of delivering immediate analysis. “You don’t need any DBMS at all for some fast data applications,” noted Gartner’s Schulte. In certain cases, traditional DBMS products have morphed to support the fast data architecture.
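The contrast between the two models can be sketched in a few lines of Python, assuming an invented list of events and an arbitrary 8-minute alert threshold; production streaming systems such as Apache Storm or Spark Streaming add distribution, fault tolerance and windowing on top of this basic pattern.

```python
import statistics

events = [4.1, 3.9, 8.7, 4.0, 9.2]  # e.g., call-handling times in minutes

# Data at rest: store everything first, analyze later.
stored = list(events)
print("batch average:", statistics.mean(stored))

# Data in motion: evaluate each event the instant it arrives.
def on_arrival(stream, threshold=8.0):
    for value in stream:
        if value > threshold:  # react immediately; nothing is stored
            print("alert: slow call handling:", value)

on_arrival(iter(events))
```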