3. Predictive Maintenance
3.2. Data and Implementation Challenges
As with many data science problems, the core of solving a predictive maintenance problem involves gathering data, conducting analysis, building and deploying a model, and tracking outcomes and feedback to ensure the model is performing appropriately. A host of technical and statistical issues make this challenging. We enumerate a set of challenges here and refer to them in the following subsections containing our recommendations.
(A) Data quality is difficult to guarantee
A1 Lack of complete, high-quality failure information is perhaps the most difficult problem to solve. Unlike sensor data, which is collected automatically, documentation of failures and their fixes usually depends on mechanics in a shop recording them. Unsurprisingly, producing high-quality data for data science is not a priority for most mechanics, and some shops still keep paper records, which adds another layer of complexity to getting the right data into the hands of data scientists.
A2 Many mobile machines rely on cellular or satellite connections to transmit data, and for older stationary machines, sending data to the cloud often means retrofitting aging hardware. In both cases, data drops can occur and connections can be expensive. The resulting data can be spotty, out of order, and full of duplicates, and bandwidth costs can force tradeoffs about what data to collect before a data scientist has seen a single sample. In addition, critical or erratic machine operation can itself disrupt sensors, so data gaps may appear precisely during the periods when data is most needed.
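A minimal sketch of a first-pass cleanup for such telemetry follows. The function name, record layout, and one-second reporting interval are illustrative assumptions, not a prescribed pipeline: it sorts out-of-order arrivals, drops duplicate transmissions, and flags dropout gaps for downstream handling.

```python
from datetime import datetime, timedelta

def clean_telemetry(records, expected_interval=timedelta(seconds=1)):
    # Sort out-of-order arrivals, drop exact duplicate transmissions,
    # and report any gap longer than the expected reporting interval.
    deduped = sorted(set(records), key=lambda r: r[0])
    gaps = [
        (prev[0], curr[0])
        for prev, curr in zip(deduped, deduped[1:])
        if curr[0] - prev[0] > expected_interval
    ]
    return deduped, gaps

# Hypothetical records: (timestamp, sensor, value), arriving out of order.
raw = [
    (datetime(2023, 5, 1, 12, 0, 1), "temp", 70.8),
    (datetime(2023, 5, 1, 12, 0, 0), "temp", 70.5),
    (datetime(2023, 5, 1, 12, 0, 0), "temp", 70.5),  # duplicate transmission
    (datetime(2023, 5, 1, 12, 0, 2), "temp", 71.0),
    (datetime(2023, 5, 1, 12, 0, 7), "temp", 72.1),  # first reading after a dropout
]
cleaned, gaps = clean_telemetry(raw)  # 4 unique readings, one 5-second gap
```

Recording the gap intervals, rather than silently interpolating across them, preserves the information that the machine may have been in a critical state while disconnected.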
A3 Beyond connectivity, sensor configurations also cause headaches because sensors are not installed in precisely the same way even on machines of the same type. This causes modeling issues because some form of central calibration may be needed before a model can be applied confidently at the desired scale. In addition, not every component carries sensors usable for predictive maintenance: many parts have no sensors at all, and others have sensors that serve purposes other than predictive maintenance.
(B) Clean data isn't always revealing or easy to work with
B1 Replacements are not equivalent to failures. Planned maintenance, such as changing oil every 3,000 miles, results in replacements without failure. Conversely, one part's failure can damage otherwise healthy parts: a flat tire could cause a collision, which in turn causes further replacements. The consequence is that even a perfect record of part replacements may not provide a consistent target to train against when building a machine learning model.
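A sketch of how replacement records might be filtered into training labels follows. The record schema and the two-day cascade window are illustrative assumptions: planned maintenance is excluded entirely, and replacements shortly after an unplanned one are marked as likely cascades rather than independent failures.

```python
from datetime import date, timedelta

def label_failures(replacements, cascade_window=timedelta(days=2)):
    # Keep only unplanned replacements; mark any that follows closely on the
    # heels of an earlier unplanned one as a likely cascade, not a new failure.
    unplanned = sorted(
        (r for r in replacements if not r["planned"]), key=lambda r: r["date"]
    )
    labels, last_primary = [], None
    for r in unplanned:
        cascade = (
            last_primary is not None and r["date"] - last_primary <= cascade_window
        )
        labels.append({**r, "cascade": cascade})
        if not cascade:
            last_primary = r["date"]
    return labels

# Hypothetical shop records for one machine.
replacements = [
    {"part": "oil_filter", "date": date(2023, 3, 1), "planned": True},
    {"part": "tire", "date": date(2023, 6, 10), "planned": False},
    {"part": "bumper", "date": date(2023, 6, 11), "planned": False},  # collision damage
    {"part": "alternator", "date": date(2023, 9, 2), "planned": False},
]
labels = label_failures(replacements)  # oil filter dropped, bumper marked cascade
```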
B2 High-value failures are rare. Machine prognostics mirror medical prognostics and survival analysis, where events are rare or censored. While this is good news for companies operating the machines, data scientists may find that even years of data yield only a handful of failure examples to work with. In addition, the most complicated machines exhibit a wide range of failure types, so the value of preventing any single failure type may be negligible, though value grows significantly as more failure types are prevented.
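The survival-analysis connection can be made concrete with the standard Kaplan-Meier estimator, which handles right-censoring: machines still running when data collection ends contribute to the at-risk count without contributing a failure event. The fleet history below is hypothetical.

```python
def kaplan_meier(samples):
    # samples: (time_to_event, observed); observed=False means the machine
    # was still running when data collection ended (right-censored).
    event_times = sorted({t for t, observed in samples if observed})
    curve, surv = [], 1.0
    for t in event_times:
        at_risk = sum(1 for u, _ in samples if u >= t)
        deaths = sum(1 for u, observed in samples if observed and u == t)
        surv *= 1 - deaths / at_risk  # product-limit update
        curve.append((t, surv))
    return curve

# Four machines: two observed failures, two censored (still running).
history = [(100, True), (150, False), (200, True), (250, False)]
curve = kaplan_meier(history)
```

Note how censored machines still shape the estimate: survival drops to 0.75 at time 100 (one failure among four at risk) and to 0.375 at time 200, rather than the 0.5 and 0.0 a naive failure count would give.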
B3 In contrast to rare failure data, sensor signal data can be enormous. Vibration sensors sample many thousands of times per second, and even nonvibration sensors often report once per second or more frequently. This strains computation during exploratory analysis, and in some cases practitioners must work with summaries of the underlying data rather than the raw data itself.
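A minimal sketch of such summarization reduces a high-rate signal to per-window statistics; RMS and peak amplitude are common choices for vibration data, though the window size and feature set here are illustrative.

```python
import math

def window_summaries(signal, window):
    # Collapse a high-rate signal into per-window RMS and peak amplitude,
    # so exploratory analysis runs on summaries instead of raw samples.
    out = []
    for i in range(0, len(signal) - window + 1, window):
        chunk = signal[i:i + window]
        rms = math.sqrt(sum(x * x for x in chunk) / window)
        out.append({"rms": rms, "peak": max(abs(x) for x in chunk)})
    return out

# A 20 kHz vibration stream summarized in 1-second windows would shrink
# 20,000 samples to one row; here a tiny synthetic signal stands in.
summaries = window_summaries([2.0, -2.0, 2.0, -2.0, 1.0, -1.0, 1.0, -1.0], window=4)
```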
B4 Many sources of data are highly dependent. In a statistical sense, all data coming from a single piece of equipment are dependent, as are data from groups of equipment in the same geographic area, and even data generated by different pieces of equipment run by the same operator. These dependencies affect, and may limit, modeling and validation approaches.
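One standard way such dependence limits validation is that rows from one machine must not be split across train and test folds. A sketch of a group-aware split follows (a simplified, round-robin analogue of grouped k-fold cross-validation; the sample schema is hypothetical).

```python
def group_kfold(samples, key, k):
    # Assign whole groups (machine, site, or operator) to folds so that all
    # dependent rows sharing a group land in the same fold.
    groups = sorted({key(s) for s in samples})
    fold_of = {g: i % k for i, g in enumerate(groups)}
    folds = [[] for _ in range(k)]
    for s in samples:
        folds[fold_of[key(s)]].append(s)
    return folds

# Three hourly rows from each of four machines, grouped by machine id.
samples = [{"machine": m, "hour": h} for m in ("a", "b", "c", "d") for h in range(3)]
folds = group_kfold(samples, key=lambda s: s["machine"], k=2)
```

A random row-level split would leak information between folds through shared machines; holding out entire groups gives a more honest estimate of performance on unseen equipment.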
B5 Machine context matters. Machines age, operate in hot and cold climates, go into tunnels and through mud, and otherwise work in very extreme conditions. Each of these operating modes can change the signatures of the data coming off a machine.
(C) A perfect prediction alone doesn't directly translate into value
C1 Even when a model detects a failure signature, acting on the prediction requires manual work and logistics. For example, to replace a failing part on a machine, the right new part must be available at the right maintenance shop. Inventory management presents tremendous challenges on its own; see, for example, Williams and Tokar for an overview. Creating the right prediction and delivering it in a way that enables the right follow-up workflow can be a challenge.
C2 Predictive maintenance problems can be "high-stakes" problems. High dollar amounts - and in some cases, human safety - are connected to actions both taken and not taken. Consumers of predictions must be able to trust a prediction in order to confidently take the right actions.
Successful approaches to predictive maintenance and prognostics will confront some of these issues head-on and side-step others.