2. A Top-Down Approach to Creating Value From Industrial IoT Data
Sensor data from heavy equipment are, materially, no different from many other data sources. For example, GPS measurements on construction vehicles could be used in a consumer application to provide motorists with more accurate traffic predictions. However, the new opportunity presented by these data is to improve the efficiency and operation of businesses within traditional industries. Industry analysts have written that the Fourth Industrial Revolution will be enabled in part by the data availability that comes with the industrial IoT. We focus our discussion on a small piece of this transformation. Specifically, we discuss the question: If a company relies on heavy equipment to be productive, how can a data scientist use sensor data to enhance that productivity? We describe an overall approach to answering this question that can be applied to individual companies or industries.
To begin solving problems in industrial IoT, we encourage data scientists to start from the basics of a company's business. Our view is that it is critical for data scientists to understand the details of how a company creates value and, more generally, the key performance indicators (KPIs) on which companies often measure themselves. Data scientists can then measure their own performance by showing improvement on the appropriate KPIs.
For example, in the rail industry, failures per locomotive year (FLY) is a core KPI that gets tracked. Mechanical failures not only result in expensive repairs, but the associated unplanned downtime can be even more costly. Revenue lost due to unplanned downtime has been estimated at $160,000 per locomotive per year, and it has been estimated that Class 1 railroads (those generating a minimum of around $400 million in revenue per year) can realize an annual savings of $80 million if only 10% of unplanned maintenance is converted to planned maintenance. Reduced FLY lowers both maintenance and unplanned downtime costs by catching failures before they get serious and before they affect the overall operation of a rail network. Data scientists in this area can then be confident they are creating value by focusing on reducing failures.
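As a rough illustration of how such a KPI might be computed from fleet records, the sketch below assumes FLY is the total number of failures divided by the total locomotive-years in service. The text does not give an exact formula, so this definition, the `LocomotiveRecord` type, and its field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

DAYS_PER_YEAR = 365.25  # assumption: average calendar year length


@dataclass
class LocomotiveRecord:
    """Hypothetical per-locomotive summary for one reporting period."""
    locomotive_id: str
    days_in_service: float
    failures: int


def failures_per_locomotive_year(records: Iterable[LocomotiveRecord]) -> float:
    """Assumed FLY definition: total failures / total locomotive-years."""
    records = list(records)
    total_failures = sum(r.failures for r in records)
    locomotive_years = sum(r.days_in_service for r in records) / DAYS_PER_YEAR
    if locomotive_years <= 0:
        raise ValueError("no locomotive service time recorded")
    return total_failures / locomotive_years


# Illustrative (made-up) fleet: two locomotives, each in service one year.
fleet = [
    LocomotiveRecord("loco-A", days_in_service=365.25, failures=3),
    LocomotiveRecord("loco-B", days_in_service=365.25, failures=1),
]
fly = failures_per_locomotive_year(fleet)
```

Comparing this number before and after a predictive-maintenance model is deployed is one simple way to tie the model's output back to the KPI.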
Data scientists won't necessarily be asked to tie their work to specific KPIs; a data scientist working in a purely consultative capacity may simply need to solve a set of problems already defined by a stakeholder. However, we believe there are a number of advantages to proactively defining problems and solutions in this way. First, a data scientist's work is clearly tied to a company's mission and bottom line. Second, focusing on business drivers can provide self-evident success criteria for the project and can improve communication across all stakeholders. Third, issues of scale and solvability tend to surface earlier, potentially saving data scientists and others time and effort.
An example in electrical power transmission illustrates the third point. The System Average Interruption Duration Index (SAIDI) measures the total duration of power interruptions experienced by the average customer over a reporting period. However, outages and equipment failures in this industry are frequently caused by squirrels. The American Public Power Association has even written a tongue-in-cheek "Open Letter to Squirrels" as a tribute to their ubiquity. It may be possible for data scientists to estimate spatiotemporal averages of 'squirrel-risk' in an attempt to protect against squirrel-related outage events, but of course a data scientist cannot predict whether such an event will happen on any given day.
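For reference, SAIDI is conventionally computed as the sum of customer-minutes of interruption divided by the total number of customers served. The sketch below follows that convention. The `Interruption` type, its field names, and the sample figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Interruption:
    """Hypothetical outage record for one sustained interruption event."""
    customers_affected: int
    duration_minutes: float


def saidi(events: Iterable[Interruption], customers_served: int) -> float:
    """SAIDI in minutes per customer: customer-minutes / customers served."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    customer_minutes = sum(e.customers_affected * e.duration_minutes for e in events)
    return customer_minutes / customers_served


# Illustrative (made-up) reporting period for a utility serving 10,000 customers.
events = [
    Interruption(customers_affected=1000, duration_minutes=60),
    Interruption(customers_affected=500, duration_minutes=120),
]
saidi_minutes = saidi(events, customers_served=10_000)
```

Because the index is an aggregate over many events, it suits the kind of averaged risk estimate described above better than day-level prediction of individual outages.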
With the hype around both data science and the Internet of Things, data scientists will be under extra pressure to create compelling solutions. We believe a concerted focus on the mechanics of how solving a data science problem leads to business value will help ensure that the problems attempted are realistically solvable and valuable.