3. Predictive Maintenance
3.3. Recommendations for Model Building
Prognostics models may be as simple as a rule - for example, a low-fuel indicator is a rule that helps operators prevent fuel outages - or as complex as physical simulations that determine acceptable bounds for mechanical parameters. Machine learning approaches fall somewhere between these extremes in complexity and focus directly on developing functions of the data to optimize empirical performance metrics. We offer recommendations for machine learning model building based on our experience. Given the data challenges discussed in Section 3.2, our recommendations center on collecting the right data to enable model building and handling that data carefully so that the right conclusions can be drawn.
- Focus on good cross-validation. Cross-validation is generally good practice for modeling. However, points A1, A2, A3, B1, and B5 make it especially difficult to trust a predictive maintenance model on training performance alone, and dependency in the data (point B4) creates additional overfitting concerns. Various forms of blocking - grouping dependent data together so it does not appear in both the training and testing folds - are practical ways to deal with dependent data and ensure offline performance metrics more accurately reflect the performance of a deployed model (see the blocked cross-validation sketch following this list). In our experience, machine learning methods that are more robust to overfitting - for example, random forests - are not a replacement for good cross-validation. Glickman et al. report similar findings.
- Gather information from subject matter experts (SMEs). Given that failure records may not be dependable for model building (points A1 and B1), input from SMEs can help fill gaps in the records. Their understanding of physical properties and operational context can also help modelers make better sense of high-dimensional data and rare failures (points B2 and B3). SME input will not address every informational and data gap; in some cases, we have gone into the field to gather the right data directly. However, leveraging the holistic experience of many SMEs can help data scientists build and contextualize their models faster.
- Systematically gather contextual data. Per points A3 and B5, individual machines can experience a wide range of conditions. Collecting data on these conditions - and making that data available at model runtime - allows models or modelers to account for different modes in the data. One instructive example we encountered involved locomotive tunnels. Running a locomotive inside a tunnel causes average temperatures to rise and creates spikes in other signals; under non-tunnel conditions, these same signatures could indicate impending part failures. After determining that the existing model could not properly differentiate between genuine problems and tunnel effects, our team (1) built a map of all tunnels in the associated rail networks and (2) made this information available to the model as a feature (see the contextual-feature sketch below). The additional contextual data in this example reduced false positives from our model. Contextual data can also be used to enhance visualizations.
- Seek out or create data sources that measure component degradation or performance directly. As a simple example, consider again the fuel gauge. If running out of fuel is the failure condition, the fuel gauge provides a direct measurement of remaining operational life; solving the predictive maintenance problem is then almost as simple as checking the gauge. Machine components rarely have measures this direct (point A3), but when such measures exist, they should be found and used. If a degradation measure does not exist, one can sometimes be created. Vibration sensors, for example, are added to equipment to indicate degradation of bearings and other components. From vibration data, the root mean square (RMS), a measure of vibration energy, can be computed and trended to find systems or components that are not operating properly (see the RMS sketch below).
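As one concrete way to implement the blocking described in the first recommendation, the sketch below uses scikit-learn's GroupKFold so that all records from the same machine land entirely in a training fold or a test fold, never both. This is a minimal illustration, not our production pipeline; the feature matrix, labels, and machine identifiers are synthetic, and any dependency structure (machine, fleet, time window) could serve as the grouping variable.

```python
# Minimal sketch: blocked cross-validation with scikit-learn's GroupKFold.
# Records from the same machine (the dependency group) never appear in both
# the training and test folds, so scores better reflect deployed performance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)

n_records, n_features, n_machines = 1000, 20, 50
X = rng.normal(size=(n_records, n_features))               # sensor-derived features (synthetic)
y = rng.integers(0, 2, size=n_records)                     # 1 = failure observed soon after record
machine_id = rng.integers(0, n_machines, size=n_records)   # dependency group per record

cv = GroupKFold(n_splits=5)
model = RandomForestClassifier(n_estimators=200, random_state=0)

# groups= keeps each machine's records together within a single fold
scores = cross_val_score(model, X, y, groups=machine_id, cv=cv, scoring="roc_auc")
print(f"Blocked CV ROC AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Note that a random forest is used here only as a convenient estimator; as discussed above, a robust model class does not remove the need for blocked validation.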
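To illustrate how contextual data can enter a model at runtime, the sketch below joins a hypothetical tunnel map against position-stamped sensor records to produce an in_tunnel indicator feature, in the spirit of the locomotive example. The track identifiers, mile-marker intervals, and column names are illustrative assumptions, not the actual data from our rail-network work.

```python
# Minimal sketch: turning a tunnel map into a runtime model feature.
# All names and values here are hypothetical illustrations.
import pandas as pd

# Hypothetical tunnel map: mile-marker intervals of track inside tunnels
tunnels = pd.DataFrame({
    "track_id":   ["A", "A", "B"],
    "start_mile": [12.0, 45.5, 3.2],
    "end_mile":   [12.6, 46.1, 4.0],
})

# Hypothetical sensor records with locomotive position at measurement time
records = pd.DataFrame({
    "track_id":    ["A", "A", "B", "B"],
    "mile_marker": [12.3, 30.0, 3.5, 10.0],
    "avg_temp_c":  [95.0, 70.0, 92.0, 71.0],
})

def in_tunnel(row: pd.Series) -> bool:
    """True if the record's position falls inside any mapped tunnel interval."""
    spans = tunnels[tunnels["track_id"] == row["track_id"]]
    return bool(((spans["start_mile"] <= row["mile_marker"])
                 & (row["mile_marker"] <= spans["end_mile"])).any())

# Attach the contextual flag; a downstream model can then learn that
# temperature spikes inside tunnels are expected rather than failure symptoms.
records["in_tunnel"] = records.apply(in_tunnel, axis=1)
print(records)
```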
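Finally, the RMS trend from the last recommendation can be computed with a few lines of NumPy: RMS = sqrt(mean(x^2)) over fixed-length windows of the raw vibration signal. The sampling rate, window length, and synthetic fault signal below are illustrative assumptions.

```python
# Minimal sketch: windowed RMS of a vibration signal as a degradation measure.
# RMS = sqrt(mean(x^2)); a rising RMS trend can flag a degrading bearing.
import numpy as np

fs = 10_000                      # assumed sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)     # one minute of synthetic signal

# Synthetic vibration: baseline noise plus energy that grows over time,
# standing in for a bearing that is progressively degrading
signal = np.random.default_rng(1).normal(scale=0.1, size=t.size)
signal += (t / 60) * np.sin(2 * np.pi * 157 * t)   # growing fault-frequency component

window = fs  # one-second RMS windows
n_windows = signal.size // window
frames = signal[: n_windows * window].reshape(n_windows, window)
rms = np.sqrt(np.mean(frames**2, axis=1))

# Trend the per-window RMS; a sustained upward slope suggests degradation
slope = np.polyfit(np.arange(n_windows), rms, deg=1)[0]
print(f"RMS per second: first={rms[0]:.3f}, last={rms[-1]:.3f}, trend slope={slope:.4f}")
```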
The final points associated with turning model predictions into real-world value (C1 and C2) may be best addressed through clear model interpretation and communication.