3. Predictive Maintenance
3.4. Recommendations for Communication
Good predictions alone do not immediately translate into value (points C1 and C2). Building trust in a single prediction requires clear interpretations and clear evidence. Building trust in a set of predictions may additionally require experimentation and A/B testing.
When communicating a single predictive maintenance prediction, we prefer to express predictions in binary terms, and we attempt to automatically present clear evidence in support of both positive and negative predictions. We pair each prediction with a written interpretation beginning with a phrase like 'There is evidence of a problem,' with the evidence presented in well-designed figures or a series of plots. To communicate a negative prediction, we might instead write 'There is no evidence of a problem,' a statement that consumers should be able to verify readily against accessible data. More generally, our aim is to present model predictions as simply another data source. Like any data source, a prediction relies on assumptions and can be wrong; like any data source, consumers should be familiar with the assumptions and premises behind it. We believe that thinking through and expressing predictions in this way, even in cases where significant uncertainty about a prediction exists, empowers consumers to evaluate predictions themselves so that, ultimately, the right actions can be taken. Because these are potentially 'high stakes' predictions, we believe this approach also aligns with the arguments in Rudin, which call for deeper interpretability of models used in such scenarios.
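The pairing of a binary prediction with an interpretation and its supporting evidence can be sketched in code. The structure and field names below are illustrative assumptions, not our production schema; the point is only that the interpretation string and the evidence pointers travel with the prediction itself.

```python
from dataclasses import dataclass


@dataclass
class MaintenancePrediction:
    """Hypothetical container pairing a binary prediction with its evidence."""

    machine_id: str
    problem_detected: bool   # the binary prediction
    evidence: list[str]      # pointers to supporting figures or data views

    def interpretation(self) -> str:
        # Lead with the plain-language verdict, then point to the evidence
        # so consumers can evaluate the prediction themselves.
        if self.problem_detected:
            return ("There is evidence of a problem. See: "
                    + "; ".join(self.evidence))
        return ("There is no evidence of a problem. The underlying data are "
                "available for verification: " + "; ".join(self.evidence))


pred = MaintenancePrediction(
    machine_id="pump-07",
    problem_detected=True,
    evidence=["vibration trend plot (last 30 days)",
              "bearing temperature vs. fleet baseline"],
)
print(pred.interpretation())
```

Keeping the evidence list alongside the prediction, rather than in a separate report, is one way to treat the model as "simply another data source" whose premises are inspectable.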
The value of clear, interpretable predictive maintenance predictions will be self-evident in many contexts. When it is not, an ideal way to quantify value is to conduct an experiment. Understandably, industry stakeholders do not jump at the chance to have data scientists run experiments involving their multimillion-dollar assets. Likewise, data scientists should not feel completely free to conduct any experiment they like, since they will not bear the full cost of running those experiments. We find that when experimentation is possible, impactful experiments depend on trust built with stakeholders.
As one example, our team was able to run an A/B test to identify a large set of miscalibrated machines for one customer. To start the experiment, we worked with the manager of these machines to select a subset to receive a special calibration as the treatment; the subset was chosen to maximize measurement capability and minimize potential impact on operations. Machines outside this subset were left untreated. By tracking the productivity of both groups post-treatment, we demonstrated that the special calibration improved output. Consequently, all machines were given the calibration, producing a measurable bump in output for that population of machines. This was a great outcome for both parties, achieved by building trust through quality communication and by designing an experiment small enough to be acceptable yet large enough to measure.
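The post-treatment comparison above can be sketched as a simple two-group analysis. The output numbers below are invented for illustration (the real data are not published here), and the permutation test is one reasonable choice, not necessarily the method used in the original experiment.

```python
import random
import statistics

# Hypothetical post-treatment daily output (units/day) for the calibration
# experiment; both the group sizes and the values are illustrative only.
treated = [104.2, 101.8, 106.5, 103.1, 105.0, 102.7]
untreated = [99.1, 100.4, 98.2, 101.0, 97.6, 99.8, 100.9, 98.5]

observed_lift = statistics.mean(treated) - statistics.mean(untreated)


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """One-sided permutation test: how often does a random re-labeling of
    machines produce a mean difference at least as large as observed?"""
    rng = random.Random(seed)
    pooled = a + b
    observed = statistics.mean(a) - statistics.mean(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if diff >= observed:
            hits += 1
    return hits / n_resamples


p = permutation_p_value(treated, untreated)
print(f"observed lift: {observed_lift:.2f} units/day, one-sided p ~ {p:.4f}")
```

A permutation test is attractive in this setting because it makes no distributional assumptions about machine output and remains honest with the small group sizes that "acceptably small" experiments tend to produce.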