Spark, Kafka & machine learning: 10 big data start-ups taking analytics to the next level

List: What start-ups are worth watching as they grow in the big data market?

The rise of both structured and unstructured data has created a booming market that is expected to be worth around $41.5 billion by 2018.

The rapid growth of the big data market has resulted in the creation of a large crop of vendors that are all looking to take a slice.

Amid the plethora of vendors competing for market position are a number of start-ups that are aiming to help organisations collect and analyse data. CBR identifies 10 companies that are worth watching.

Founded in 2014, Confluent has raised more than $30 million so far from investors such as LinkedIn, Index Ventures, Benchmark Capital and The Data Collective.

The company was founded by the developers behind Apache Kafka, a real-time messaging and streaming big data engine. Kafka was created inside LinkedIn and later contributed to the Apache Software Foundation; its developers then left to set up a separate company around it.

Confluent is a commercial provider and supporter of the Apache Kafka software. At LinkedIn, Kafka was used to instrument everything that happens in the company and make it available as a real-time feed to data systems such as Hadoop, Search and Newsfeed.

The company is focused on building a stream data platform to help companies get access to enterprise data as real-time streams.

Confluent provides tools in languages that include Java, C and C++, and allows messages to be produced or consumed by any network-connected tool through its REST proxy.
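To illustrate what producing a message through a REST proxy might look like, here is a minimal Python sketch that builds, but does not send, an HTTP request. The host, port and topic name are hypothetical, and the URL path and content types assume version 2 of the proxy's JSON API; a given deployment may differ.

```python
import json
import urllib.request

# Assumptions for illustration only: a proxy on localhost:8082 and a topic
# called "page-views". Neither comes from the article.
PROXY_URL = "http://localhost:8082"
TOPIC = "page-views"


def build_produce_request(base_url, topic, events):
    """Build (but do not send) a POST asking the proxy to publish JSON events."""
    # Each event is wrapped as one record; the proxy forwards records to Kafka.
    body = json.dumps({"records": [{"value": e} for e in events]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={
            # Content types assumed from version 2 of the REST proxy API.
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


request = build_produce_request(PROXY_URL, TOPIC, [{"user": "alice", "page": "/home"}])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With a proxy actually running, the request could be sent with:
# urllib.request.urlopen(request)
```

Because the exchange is plain HTTP and JSON, the same request could come from any tool that can reach the proxy over the network, which is the point the article makes.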

H2O, which used to go by the name 0xdata, was founded in 2011 and has raised $33.6m in capital from investors such as Nexus Venture Partners, Paxion Capital Partners, and Transamerica Ventures.

Founded by SriSatish Ambati, co-founder of Platfora, and Cliff Click, lead developer of the Java Virtual Machine, H2O started with the idea of making it easier for developers and data scientists to use machine learning algorithms in their applications.

The company offers an open source machine learning platform that is designed to work with Hadoop and Spark. It can be used through a Web UI, from programming environments such as R, Java, Python and Scala, or via JSON.

The platform also works with tools such as Microsoft Excel, RStudio, and Tableau.

H2O parses and ingests data and helps to build machine learning models on it. At its most basic, the technology helps to quickly create and deploy machine learning algorithms.

AtScale was founded in 2013 and has so far raised $9m in capital from investors such as AME Cloud Ventures, Storm Ventures, and UMC Capital.

The company was created to let familiar business intelligence (BI) tools and interfaces, such as SQL and Tableau, work with technologies such as Hadoop. It bridges the gap between business users, their visualisation tools and the underlying Hadoop platform.

The goal is for companies to be able to analyse data in place, removing the need to move it to a specialised analysis tool, which can be time-consuming and costly.

AtScale was created by Hadoop and BI veterans who have developed the ability to turn Hadoop clusters into scale-out OLAP servers.

It supports BI tools that speak SQL or MDX.

Interana, which describes itself as delivering behavioural analytics for event data, helps firms to make data-driven decisions.

Co-founded in 2013 by CEO Ann Johnson and CTO Bobby Johnson, the company has raised $28.2m, including $20m in a Series B round led by Index Ventures.

Interana’s focus is on providing interactive analytics that help businesses to answer questions about how their customers behave and how products are being used.

The company uses a proprietary database that allows it to process billions of events quickly.
