A methodology for solving problems with Data Science for Internet of Things

This two-part blog is based on my forthcoming book: Data Science for Internet of Things.

It is also the basis for the Data Science for Internet of Things course I teach. I will be syndicating sections of the book on the Data Science Central blog. I welcome your comments. Please email me at ajit.jaokar at futuretext.com; email me also for a PDF version if you are interested in joining the course.

Here, we start off with the question: at which points could you apply analytics to the IoT ecosystem, and what are the implications? We then extend this to a broader question: could we formulate a methodology to solve Data Science for IoT problems? I have illustrated my thinking through a number of companies/examples. I personally work with an open source strategy (based on R, Spark and Python), but the methodology applies to any implementation. We are currently working with a range of implementations including AWS, Azure, GE Predix, Nvidia etc. Thus, the discussion is vendor-agnostic.

I also mention some trends I am following, such as Apache NiFi.

As we move towards a world of 50 billion connected devices, Data Science for IoT (IoT analytics) helps to create new services and business models. IoT analytics is the application of data science models to IoT datasets. The flow of data starts with the deployment of sensors. Sensors detect events or changes in quantities and provide a corresponding output in the form of a signal. Historically, sensors have been used in domains such as manufacturing. Now their deployment is becoming pervasive through ordinary objects like wearables. Sensors are also being deployed through new devices like robots and self-driving cars. This widespread deployment of sensors has led to the Internet of Things.

Features of a typical wireless sensor node are described in this paper (wireless embedded sensor architecture). Typically, data arising from sensors is in time series format and is often geotagged. This means there are two forms of analytics for IoT: time series analytics and spatial analytics. Time series analytics typically lead to insights like anomaly detection, so classifiers are commonly used in IoT analytics to detect anomalies. But by looking at historical trends, streaming data and data combined from multiple sensors (sensor fusion), we can gain new insights. And more use cases for IoT keep emerging, such as augmented reality (think Pokemon Go + IoT).
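To make the time series anomaly detection idea concrete, here is a minimal sketch in Python (pandas/NumPy) using a rolling z-score on a simulated temperature stream. The data, window size and 3-sigma threshold are illustrative assumptions, not part of the methodology itself.

```python
import numpy as np
import pandas as pd

# Simulated time series sensor data (illustrative only): a temperature
# signal sampled every minute, with a few injected spikes.
rng = pd.date_range("2017-01-01", periods=500, freq="min")
values = np.random.normal(loc=22.0, scale=0.5, size=len(rng))
values[[100, 250, 400]] += 5.0          # injected anomalies
readings = pd.Series(values, index=rng, name="temperature")

# Rolling z-score: compare each reading to the mean/std of the last hour.
window = 60
rolling_mean = readings.rolling(window).mean()
rolling_std = readings.rolling(window).std()
z_score = (readings - rolling_mean) / rolling_std

# Flag readings more than 3 standard deviations from the recent mean.
anomalies = readings[z_score.abs() > 3]
print(anomalies)
```

In practice, you would replace the simulated series with your own sensor feed; the same windowed logic can also be expressed as a streaming computation (for example in Spark Streaming) so that it runs close to the event.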

Meanwhile, sensors themselves continue to evolve. Sensors have shrunk due to technologies like MEMS, and their communications protocols have improved through new technologies like LoRa. These protocols enable new forms of communication for IoT, such as device to device, device to server, or server to server. Thus, whichever way we look at it, IoT devices create a large amount of data. Typically, the goal of IoT analytics is to analyse the data as close to the event as possible. We see this requirement in many 'Smart city' type applications such as transportation, energy grids, utilities like water, street lighting, parking etc.
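As an illustration of the device-to-server pattern (a sketch, not a reference implementation), the snippet below publishes JSON-encoded readings over MQTT using the paho-mqtt client. The broker host, topic and payload fields are placeholder assumptions.

```python
import json
import time

import paho.mqtt.client as mqtt

# Hypothetical broker and topic names -- replace with your own deployment.
BROKER_HOST = "broker.example.com"
TOPIC = "sensors/device42/temperature"

# paho-mqtt 1.x style constructor; paho-mqtt 2.x additionally requires a
# CallbackAPIVersion argument.
client = mqtt.Client()
client.connect(BROKER_HOST, 1883)
client.loop_start()

# Publish one reading per second as a small JSON payload.
for i in range(10):
    payload = json.dumps({"device_id": "device42",
                          "ts": time.time(),
                          "temperature": 22.0 + 0.1 * i})
    client.publish(TOPIC, payload, qos=1)
    time.sleep(1)

client.loop_stop()
client.disconnect()
```

The same pattern applies whether the transport is MQTT over Wi-Fi or a gateway bridging a LoRa link to the server side; the analytics downstream only sees a stream of timestamped, device-tagged messages.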

Once data is captured through the sensor, there are a few analytics techniques that can be applied to the data, some of which are unique to IoT. For instance, not all data may be sent to the cloud/data lake. We could perform temporal or spatial analysis. Considering the volume of data, some may be discarded at source or summarized at the edge. Data could also be aggregated, and aggregate analytics could be applied to the IoT data aggregates at the 'edge'. For example, if you want to detect failure of a component, you could find spikes in values for that component over a recent span (thereby potentially predicting failure). You could also correlate data across multiple IoT streams. Typically, in stream processing, we are trying to find out what happened now (as opposed to what happened in the past). Hence, the response should be near real-time.
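As an illustration of edge-side summarization (a sketch under assumed data shapes and thresholds), the snippet below keeps a short rolling window per component, forwards only small summaries instead of raw readings, and flags a component whose latest value spikes well above its recent average.

```python
from collections import defaultdict, deque
from statistics import mean

# Illustrative window size and spike threshold -- tune per deployment.
RECENT_SPAN = 30          # readings kept per component
SPIKE_FACTOR = 1.5        # flag values 50% above the recent average

history = defaultdict(lambda: deque(maxlen=RECENT_SPAN))

def summarize_at_edge(component_id, value):
    """Update the rolling window and return a compact summary.

    Only aggregates (and possible-failure flags) need to leave the
    edge device; the raw reading can be discarded or stored locally.
    """
    window = history[component_id]
    spike = len(window) >= 5 and value > SPIKE_FACTOR * mean(window)
    window.append(value)
    return {
        "component": component_id,
        "window_mean": round(mean(window), 2),
        "window_max": max(window),
        "possible_failure": spike,
    }

# Example: a vibration sensor that starts to spike.
for v in [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 2.4]:
    print(summarize_at_edge("pump-7", v))
```

The design choice here is that the edge node answers "what is happening now" locally and near real-time, while only compact aggregates and alerts travel upstream for correlation with other IoT streams.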

 


