Many of the world’s largest companies, like Facebook with FBLearner, have built an edge for themselves by making data science and the study of their users an ongoing process, tracking closely how things change over time. But for other businesses, those studies might end with a single chart or predictive analysis report.
Ian Swanson started DataScience about two and a half years ago to provide those businesses with that kind of information. The company employed its own data scientists, who would work with outside businesses. But over the past year, DataScience has been building an array of tools, ones it has used internally, that businesses can hand to their own data scientists to essentially get the same thing done. That toolset launches today in the form of DataScience Cloud.
Here’s the short version: with DataScience Cloud, employees can look up information across a wide array of sources, ranging from internal, unstructured databases to Salesforce accounts, with single SQL queries that have been optimized to work across all those buckets. They can then write predictive models for that data in the form of code, and deploy that code internally so other parts of the company can use it to run simulations or additional tests in order to predict better outcomes.
“Data scientist, and data science, is a pretty narrow focused term like a lot of roles: statistician, actuary, and so on,” Swanson said. “The common term we might hear is data science. They spend a lot of time performing engineering tasks, many times they fail to make an impact in their business. They might create from a simple standpoint some SQL queries, start to build a model using Python or R, but once they create that model, one they can predict a user is leaving business, what do they do with it? You have to become algorithmic.”
Part of the challenge that led to DataScience was the process of actually building out those models. For example, it might be easier for a data scientist to put together a predictive model using Python, R or MATLAB, but it may need to be implemented in Java to be used across the organization.
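To make that handoff problem concrete, here is a minimal sketch in Python. It is an illustration only, not how DataScience Cloud works. It assumes a toy churn dataset (the features and values are made up), trains a small logistic regression with plain gradient descent, and exports the learned parameters as JSON. The function names `train_logistic` and `score` are hypothetical. Once the model is reduced to a few numbers in a language-neutral format, the scoring step is a handful of lines that could in principle be rewritten in Java or any other language.

```python
import json
import math


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit a tiny logistic regression with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted churn probability
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return {"weights": weights, "bias": bias}


def score(model, x):
    """Score one row using only the exported parameters."""
    z = model["bias"] + sum(w * xi for w, xi in zip(model["weights"], x))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up toy data: [logins_last_week, support_tickets] -> churned (1) or not (0)
rows = [[5.0, 0.0], [4.0, 1.0], [0.0, 3.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]

model = train_logistic(rows, labels)

# The exported JSON is all another team would need to reimplement score().
exported = json.dumps(model)
reloaded = json.loads(exported)

print(exported)
print(score(reloaded, [5.0, 0.0]))  # an active user
print(score(reloaded, [0.0, 3.0]))  # an inactive user with several tickets
```

Real models are rarely this simple, which is part of why the handoff between a data scientist's prototype and production code can be painful.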