2015 was the year machine learning emerged from the academic closet. No longer was it an esoteric discipline commanded by the few, the proud, the data scientists. Now it was, in theory, everyone’s business.
2016 was the year theory became practice. Machine learning’s power and promise, and all that surrounded and supported it, moved more firmly into the enterprise development mainstream.
That movement revolved around three trends: new and improved tool kits for machine learning, better hardware (and easier access to it), and more cloud-hosted, as-a-service variants of machine learning that provided both open source and proprietary tools.
Once upon a time, if you wanted to implement machine learning in an app, you had to roll the algorithms yourself. Eventually, third-party libraries came onto the field that saved you the trouble of reinventing the wheel, but still required a lot of heavy lifting to be productive. Now the state of the art involves frameworks designed to make machine learning an assembly-line process: Data in one end, trained models and useful results out the other.
What better way to build such pipelines than on top of existing data-processing frameworks? To that end, superhot (and superfast) data framework Spark not only kicked up its performance with its 2.0 release, but added a revised machine learning library that better complements Spark’s new internal architecture.
Another trend in the same vein: Products that handled data, but previously didn’t have a direct connection to machine learning, started offering machine learning acceleration as a new feature. Redis, the in-memory data caching system that pulls double duty as a database, added Spark-powered machine learning as one application for its new modular architecture.
A third trend in the field is the rise of new support tools for developing machine learning software. Sometimes it’s an entirely new language; for example, Lift was created for writing high-speed, parallel algorithms that run well on CPUs, GPUs, and other hardware. Other times it’s tool kits for existing languages; to wit, Milk enhances C/C++ applications that use the OpenMP tool set, speeding up access to big data sets.
Machine learning wasn’t made possible by the blisteringly fast computational power of GPUs, but GPUs certainly deliver a performance boost that current-generation CPUs can’t even begin to approach.
To that end, two big movements in machine learning in 2016 involved GPUs. First was the accelerating use of GPUs in machine learning products, and not only in frameworks like Spark. GPU speedups also started getting more notice in database applications, especially those marketed as methods to feed data-hungry machine learning systems.
The other big GPU-related change was that every single major cloud vendor could now boast of having GPU-accelerated instances as part of its product lineup.