1. Introduction

The stereotypical environment for a data scientist is decidedly not heavy industrial. Sleek workplaces furnished with beanbags may seem like a far cry from a factory, a mine, or other environments that rely on heavy equipment to do work. However, the Internet of Things (IoT), a term referring to the estimated billions of devices that can collect data with sensors and transmit that data, will change this image for some practitioners. Heavy equipment can contain hundreds or thousands of sensors, and, with the rise of IoT, the data collected by these sensors can be accumulated and analyzed to create economic value.

Increased connectivity of heavy equipment, and, more generally, of any device with a sensor, is driven by a number of factors. Decreasing costs of bandwidth, accessibility of Wi-Fi and cellular networks, and robust cloud infrastructures are making sensor data collection, transmission, storage, and analysis easier; see the study by Goldman Sachs. This study estimated that there were around two billion connected devices in 2000 and projected 28 billion connected devices by 2020. Consumer products such as exercise bracelets and smart thermostats may be the most visible examples of this phenomenon; however, this same study estimated the opportunity for IoT in the industrial space alone to be $2 trillion in 2020. These estimates, of course, are based on assumptions and data collected in 2014, so some caution is warranted when interpreting these numbers. However, more recent estimates from Gartner, IoT Analytics, and Ericsson further indicate that the market for IoT is large and growing.

Given these new opportunities, traditional industrial companies, tech companies, and a host of startups are competing for space in the industrial IoT market. To do this, many are relying on data scientists to analyze, visualize, and create predictions from these new data streams. Uptake, the company we work for, is one startup that focuses on equipment reliability and productivity. This article focuses on our experience building data science solutions for industrial IoT applications. We first present our approach to framing problems in industrial IoT. Next, we discuss predictive maintenance, a method of using IoT data to improve maintenance practices. In particular, we use predictive maintenance to highlight the challenges of working with sensor data and describe our approaches to overcoming them. Finally, we discuss training for aspiring industrial data scientists.