On Tuesday, IBM announced the first cloud-based development environment for near real-time, high-performance analytics, giving data scientists the ability to access and ingest data and deliver insight-driven models to developers.
Available on the IBM Cloud Bluemix platform, the Data Science Experience provides 250 curated data sets, open source tools and a collaborative workspace to help data scientists uncover and share meaningful insights with developers.
Building on its $300 million investment in developing Apache Spark as a type of “analytics operating system,” IBM created the Data Science Experience to extend the speed and agility of Spark to more than two million members of the R community through new contributions to SparkR, SparkSQL and Apache SparkML. As a result, data scientists who work in R should have faster access to more data, and in turn, more insights delivered from the IBM Cloud.
The Data Science Experience’s open environment allows data scientists to accelerate and simplify data ingestion, curation and analysis by bringing together content, data, models and open source resources from IBM and others, including H2O, RStudio and Jupyter Notebooks on Apache Spark, in a single security-rich managed environment.
“With Apache Spark, we see an opportunity to significantly transform the role of the data scientist by providing access to curated data sets, open source tools and a collaborative platform to accelerate innovation,” said Bob Picciano, Senior Vice President, IBM Analytics.
IBM is already working with organizations to use data science applications built on Apache Spark.