In the last two decades, more than six billion devices have come online. All those connected “things” (collectively called the Internet of Things) generate more than 2.5 quintillion bytes of data daily. That’s enough to fill 57.5 billion 32 GB iPads per day (source: Gartner). All this data is bound to significantly impact many business processes over the next few years. Thus, IoT Analytics (Data Science for IoT) is expected to drive the business models for IoT. According to Forbes, strong analytics skills are likely to lead to 3x more success with the Internet of Things. We cover many of these ideas in the Data Science for IoT course.
Data Science for IoT has similarities with traditional Data Science, but also some significant differences. Here are 10 differences between Data Science for IoT and traditional Data Science.
1. Working with the Hardware and the radio layers
This may sound obvious, but it’s easy to underestimate. IoT involves working with a range of devices and a variety of radio technologies. It is a rapidly shifting ecosystem, with new technologies such as LoRa, LTE-M and Sigfox. The deployment of 5G will make a big difference because we would then have both local area and wide area connectivity. Each of the verticals (we track Smart homes, Retail, Healthcare, Smart cities, Energy, Transportation, Manufacturing and Wearables) also has a specific set of IoT devices and radio technologies. For example, for wearables you see Bluetooth 4.0 in use, but for Industrial IoT you are likely to see cellular technologies that guarantee Quality of Service, as in the GE Predix alliance with Verizon.
In traditional Data Science, Big Data usually resides in the Cloud. Not so for IoT: much of the data is processed on or close to the devices themselves. Many vendors, such as Cisco and Intel, call this Edge Computing. I have covered the impact of Edge analytics and IoT in detail in a previous post: The evolution of IoT Edge analytics.
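To make the idea of Edge processing concrete, here is a minimal Python sketch of a device or gateway that reduces a window of raw readings to a small summary before anything is sent to the Cloud. The function name `summarize_window`, the threshold and the sample temperature readings are all hypothetical and chosen only for illustration; real Edge platforms will look different.

```python
from statistics import mean

def summarize_window(readings, threshold):
    """Reduce one window of raw sensor readings to a compact summary.

    Hypothetical helper: only the summary (not every raw reading)
    would be forwarded to the Cloud.
    """
    return {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
        # Readings above the threshold are kept so they can be reported in full.
        "alerts": [r for r in readings if r > threshold],
    }

# Made-up temperature readings collected on a gateway during one window.
window = [21.0, 21.5, 22.0, 35.0, 21.5]
summary = summarize_window(window, threshold=30.0)
print(summary)
```

The design choice here is simply that the decision about what is worth transmitting is made close to the sensor, so the Cloud receives one small record per window plus any readings that crossed the threshold.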
IoT needs an emphasis on different models, and these models depend on the IoT vertical. In traditional Data Science, we use a variety of algorithms (Top Algorithms Used by Data Scientists). For IoT, time series models are often used, such as ARIMA, Holt-Winters and moving averages. The difference lies in the volume of data, but also in more sophisticated real-time implementations of the same models, for example (pdf): ARIIMA: Real IoT implementation of a machine learning architecture …. The use of models varies across IoT verticals. For example, in Manufacturing, predictive maintenance, anomaly detection, forecasting and missing event interpolation are common. In Telecoms, traditional models such as churn modelling, cross-sell, upsell and customer lifetime value could include IoT as an input.
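As a rough illustration of the kind of time series handling mentioned above, the following Python sketch shows a trailing moving average, simple exponential smoothing (only the level component of Holt-Winters, not the full method with trend and seasonality), linear interpolation of missing events, and a basic anomaly check against the preceding window. All function names, the window size, the threshold and the sample readings are assumptions made for this example; a production system would more likely use a dedicated time series library.

```python
from statistics import mean, stdev

def moving_average(series, window):
    """Trailing moving average: one value per full window."""
    return [mean(series[i - window:i]) for i in range(window, len(series) + 1)]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing (level only, no trend or seasonality)."""
    level = series[0]
    smoothed = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

def interpolate_missing(series):
    """Fill None gaps linearly between known neighbours.

    Gaps at the very start or end have no neighbour on one side
    and are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

def find_anomalies(series, window, z_threshold):
    """Return indices whose value is far from the mean of the preceding window."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        spread = stdev(history)
        if spread > 0 and abs(series[i] - mean(history)) / spread > z_threshold:
            flagged.append(i)
    return flagged

# Made-up sensor readings with one missing event (None) and one spike.
raw = [10.0, 10.2, None, 10.6, 10.4, 10.5, 25.0, 10.3]
clean = interpolate_missing(raw)
flagged = find_anomalies(clean, window=4, z_threshold=3.0)
print(clean)
print(moving_average(clean, 3))
print(exponential_smoothing(clean, alpha=0.5))
print(flagged)
```

With these sample readings the missing event is filled from its two neighbours and only the spike at index 6 is flagged. Note that the spike then inflates the spread of the following windows, which is one reason real deployments tend to use more robust statistics.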
If you consider cameras as sensors, there are many applications of Deep Learning algorithms such as CNNs for security applications, e.g. from hertasecurity. Reinforcement learning also has applications for IoT, as I discussed, building on a post by Brandon Rohrer, in Reinforcement Learning and Internet of Things.
IoT datasets need a different form of pre-processing. Sibanjan Das and I referred to it in Deep learning – IoT and H2O.