Proactive Supply Chain Performance Management with Predictive Analytics

Read this article. A predictive performance management model is introduced to manage complex business network collaborations and minimize uncertainty. Pay attention to the characteristics of innovative performance management systems. What other attributes would you add to the list?

Data Mining Model for KPI Prediction

While the proliferation of reporting and multidimensional analytics has greatly benefited organizations of all sizes, the next step in promoting business agility and operational efficiency is to move from retrospective analysis of historical data to proactive actions based on predictive analysis of business data, and to embed intelligent, fact-based decision making into business processes. The key to accomplishing this is to use powerful data mining algorithms to analyze large data sets, compare new data to historical facts and behaviors, identify classifications and relationships between business entities and attributes, and deliver accurate predictive insights to all of the systems and users that make business decisions.

Building a data mining model is part of a larger process that includes everything from defining the basic problem the model will solve to deploying it into a working environment. A model typically contains input columns, an identifying column, and a predictable column. The data type of each column is defined in a mining structure, and the algorithms process the data on the basis of that definition. Depending on the case, a column can be one of the following:

(i) continuous column: this column contains numeric measurements on a continuous scale with no upper bound, typically product cost, salary, account balance, shipping date, or invoice date.

(ii) discrete column: these are finite unrelated values such as product category, location, age, and telephone area codes. They do not need to be numeric in nature and typically do not have a fractional component.

(iii) discretized column: this is a continuous column converted to be discrete, for example, grouping salaries into predefined bands.

(iv) key: the column which uniquely identifies the row, similar to the primary key.
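The discretized column type can be illustrated with a minimal Python sketch that groups salaries into predefined bands. The band edges, the labels, and the `discretize` helper are assumptions made for this illustration only; they do not come from the article or from any particular data mining product.

```python
from bisect import bisect_right

# Hypothetical band edges and labels, chosen only for illustration.
# Each edge is the exclusive upper bound of the band to its left.
BAND_EDGES = [30_000, 60_000, 100_000]
BAND_LABELS = ["low", "medium", "high", "very high"]


def discretize(value, edges=BAND_EDGES, labels=BAND_LABELS):
    """Map a continuous value to the label of the band it falls into."""
    return labels[bisect_right(edges, value)]


# A continuous "salary" column converted into a discretized column.
salaries = [18_500, 30_000, 59_999.99, 250_000]
bands = [discretize(s) for s in salaries]
```

After discretization the column holds a small, finite set of labels, so algorithms that expect discrete inputs can use it.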

Different models for a given business problem could be used for analyzing various business scenarios, identifying the analytical requirements, tuning the parameters, and evaluating the results of the models to make a business decision.

Predictive models can be used to forecast explicit values, based on patterns determined from known results. For example, we can define the target customer service level KPI and set the target to 95%. Then, based on the historical data, a model can be built that predicts this KPI in the future.
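As a rough illustration of predicting a KPI from historical data and comparing it with its target, the following Python sketch fits a straight-line trend to past service levels and projects one period ahead. The monthly values are invented, and a least-squares trend line is used here only as a simple stand-in for a trained data mining model.

```python
def fit_line(ys):
    """Ordinary least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(ys, steps_ahead=1):
    """Extrapolate the fitted trend `steps_ahead` periods past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


TARGET = 95.0  # customer service level target in percent, as in the text

# Hypothetical monthly customer service levels in percent (illustrative data).
history = [97.0, 96.5, 96.0, 95.5, 95.0, 94.5]

predicted = forecast(history)
below_target = predicted < TARGET
```

With a predicted value below the target, the KPI would be flagged before the shortfall is actually recorded.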

A variety of techniques has been developed to achieve that goal, typically by applying different models to the same data set and then comparing their performance to choose the best one. For KPI prediction, different data mining (DM) models and algorithms can be used depending on the goal and business case. This section briefly explains how to consider various models and choose the best one based on predictive performance (i.e., how well a model explains the variability in question and produces stable results across samples).

(i) Classification algorithms (such as decision trees) predict one or more discrete variables, based on the other attributes in the dataset.

(ii) Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset.

(iii) Time series algorithms forecast the patterns based on the current set of continuous predictable attributes.

(iv) Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is for creating association rules, which can be used in a market basket analysis or KPI analysis.
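To make the idea of an association rule concrete, the sketch below computes the two measures usually attached to a rule, support and confidence, over a handful of invented transactions. The item names and the helper functions are assumptions for illustration and do not correspond to any specific product's algorithm.

```python
# Hypothetical order baskets (illustrative data only).
transactions = [
    {"brake pads", "brake discs"},
    {"brake pads", "brake discs", "oil filter"},
    {"brake pads", "oil filter"},
    {"oil filter"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated probability of `consequent` given `antecedent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Rule under test: {brake pads} -> {brake discs}
rule_support = support({"brake pads", "brake discs"}, transactions)
rule_confidence = confidence({"brake pads"}, {"brake discs"}, transactions)
```

A rule with high support and high confidence is a candidate for market basket recommendations or for explaining which conditions tend to accompany a KPI outcome.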

Choosing the right algorithm for a specific business task can be challenging. While different algorithms can perform the same business task, each produces a different result, and some can produce more than one type of result. For example, the decision trees algorithm can be used not only for predictions but also to reduce the number of columns in a dataset, because the decision tree can identify columns that do not affect the final mining model. The type of algorithm depends on the type of prediction. For example, if we are predicting a discrete attribute (e.g., out-of-stock), we can use naïve Bayes, decision trees, or neural networks. If we are predicting a continuous attribute (e.g., supply chain sales amount), a time series algorithm can be used. If we are predicting a sequence (e.g., analyzing the factors leading to delivery failure), sequence clustering can be used. Using more than one data mining model over the same mining structure is good practice, since the best model can then be selected for predictions. Lift charts are a useful tool for checking the accuracy of data mining models once they are built on the input data.
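The comparison behind a lift chart can be sketched in a few lines of Python. The example computes the lift of two hypothetical out-of-stock models at the top 40% of ranked cases and selects the one with the higher lift. The scores, the labels, and the `lift_at` helper are illustrative assumptions, not output from real models.

```python
def lift_at(scores, actuals, fraction):
    """Lift of the top `fraction` of cases ranked by score, relative to random selection."""
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    hit_rate_top = sum(actual for _, actual in ranked[:k]) / k
    base_rate = sum(actuals) / len(actuals)
    return hit_rate_top / base_rate


# 1 = item actually went out of stock, 0 = it did not (illustrative data).
actuals = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

# Predicted out-of-stock probabilities from two hypothetical models.
models = {
    "model_a": [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2, 0.3, 0.6],
    "model_b": [0.9, 0.8, 0.3, 0.7, 0.1, 0.6, 0.2, 0.4, 0.5, 0.3],
}

lifts = {name: lift_at(scores, actuals, 0.4) for name, scores in models.items()}
best_model = max(lifts, key=lifts.get)
```

A lift above 1 means the model finds more true out-of-stock cases in its top-ranked group than random selection would. A full lift chart repeats this calculation over a range of fractions.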

In contrast to standard KPIs, which report only past or at best present performance, predictive KPIs look forward and use data mining to show what the situation will be in the next month, quarter, or year. For example, we can use customer data to predict future demand and thus better plan production and inventory management processes, or we can predict a disruption in delivery and proactively plan alternative delivery modes.

This allows an organization to react before a disruption happens. Predictive KPIs can give insight into emerging trends and into potential opportunities or problems.

The advantage of using OLAP-based KPIs is that they are server based and can be consumed by a variety of clients. This means that each client throughout the supply chain accesses a single version of the truth, which makes coordination efforts much easier. Performing complex calculations on the server can also have performance benefits.


Building Prediction Data Mining Models

In this section, two approaches for building KPI prediction models within the UDM are introduced:

(i) using OLAP data mining dimensions,

(ii) using prediction tables.

Data mining dimensions are results of predictive calculations which are saved into the cube as new dimensions. These dimensions can be browsed or even sliced and diced by results of predictions just as with any other dimension. The special MDX predict function can be used to perform predictions. This allows performing prediction joins against data mining models from queries within the cube. When calculating KPI elements such as value, target, or status, we make use of the data mining dimension.
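The relationship between a KPI's value, target, and status can be sketched outside the cube as follows. In the cube itself these elements are MDX expressions; the Python function below is only a stand-in, and the tolerance band and the -1/0/1 status encoding are assumptions chosen for illustration.

```python
def kpi_status(value, target, tolerance=0.02):
    """Return 1 if the value meets the target, 0 if it is within
    `tolerance` (as a fraction of the target) below it, and -1 otherwise."""
    ratio = value / target
    if ratio >= 1.0:
        return 1
    if ratio >= 1.0 - tolerance:
        return 0
    return -1


TARGET = 0.95  # hypothetical target for a service-level KPI

# Hypothetical predicted KPI values, e.g. read from a data mining dimension.
predicted_values = {"next month": 0.96, "next quarter": 0.94, "next year": 0.90}

statuses = {period: kpi_status(v, TARGET) for period, v in predicted_values.items()}
```

A client application can then map each status code to a visual indicator such as a traffic light.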

This procedure can be used for various supply chain analysis tasks such as inventory out-of-stock prediction, supplier lead time prediction, and forecasting of customer demand or order fulfillment time. For these tasks, different data mining algorithms such as decision trees, time series, or neural networks can be used. Models can be evaluated and compared in terms of accuracy and precision.

The alternative way of making KPI predictions is to use prediction tables. Prediction tables are just like any other tables used for an OLAP cube. They can be a measure group or a dimension, but typically they will be a measure group. In this approach, data mining predictions are performed within the ETL (extract, transform, and load) process. The ETL package pulls data from the data source, performs a prediction task, and loads the results into a new prediction table. This gives us greater flexibility because we can add a new table to the data warehouse. It is also more flexible for scheduling the training of the model and for maintaining the model. The model can be defined outside the cube and does not need to be processed along with the cube, so it can be replaced with a better model without altering the cube.
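The prediction-table approach can be approximated with a small extract, predict, and load sketch. It uses an in-memory SQLite database in place of a data warehouse and a hand-written rule in place of a trained data mining model. The table names, column names, and the `predict_lead_time` function are all hypothetical.

```python
import sqlite3


def predict_lead_time(avg_lead_time_days, open_orders):
    """Stand-in for a trained data mining model (hypothetical rule)."""
    return avg_lead_time_days + 0.5 * open_orders


conn = sqlite3.connect(":memory:")

# Source table with historical supplier data (illustrative rows).
conn.execute(
    "CREATE TABLE supplier_history ("
    "supplier_id INTEGER PRIMARY KEY, avg_lead_time_days REAL, open_orders INTEGER)"
)
conn.executemany(
    "INSERT INTO supplier_history VALUES (?, ?, ?)",
    [(1, 5.0, 2), (2, 8.0, 6)],
)

# New prediction table that a cube could later use as a measure group.
conn.execute(
    "CREATE TABLE lead_time_prediction ("
    "supplier_id INTEGER PRIMARY KEY, predicted_lead_time_days REAL)"
)

# Extract: pull rows from the source.
rows = conn.execute(
    "SELECT supplier_id, avg_lead_time_days, open_orders FROM supplier_history"
).fetchall()

# Transform: run the prediction for each row.
predictions = [(sid, predict_lead_time(avg, n_open)) for sid, avg, n_open in rows]

# Load: write the results into the prediction table.
conn.executemany("INSERT INTO lead_time_prediction VALUES (?, ?)", predictions)
conn.commit()

loaded = conn.execute(
    "SELECT supplier_id, predicted_lead_time_days "
    "FROM lead_time_prediction ORDER BY supplier_id"
).fetchall()
```

Because the model lives outside the cube, swapping `predict_lead_time` for a better model changes only the transform step. The prediction table and anything built on it stay the same.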

Figure 5 shows an example of the ETL data and control flows that perform data extraction, integration, cleansing, and data mining prediction.

Figure 5 Data mining ETL package.



In the first step, the ETL package pulls data from different supply chain sources and passes it to a data processing component and then to the prediction component and data mining query. The data mining query can call any OLAP server or data mining model and return the results into the ETL pipeline. We can then perform different operations: populate measures, split predictions into good and poor predictions, or define any type of filtering or modification. To design such ETL packages, we used special ETL components such as a data mining model training destination, which trains data mining models, and a data mining query transformation, which performs predictive analysis on data as it passes through the data flow. The MDX expressions for building KPIs over prediction tables are the same as for any other KPIs.
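The step that splits predictions into good and poor ones might look like the sketch below. It assumes that each prediction carries a probability and that a fixed threshold separates the two groups. Both the threshold and the record layout are assumptions, since the article does not specify the splitting criterion.

```python
# Hypothetical prediction rows leaving the data mining query step.
predictions = [
    {"order_id": 101, "predicted_late": True, "probability": 0.91},
    {"order_id": 102, "predicted_late": False, "probability": 0.55},
    {"order_id": 103, "predicted_late": True, "probability": 0.72},
    {"order_id": 104, "predicted_late": False, "probability": 0.64},
]


def split_predictions(rows, min_probability=0.7):
    """Route rows into (good, poor) according to prediction probability."""
    good = [row for row in rows if row["probability"] >= min_probability]
    poor = [row for row in rows if row["probability"] < min_probability]
    return good, poor


good, poor = split_predictions(predictions)
good_ids = [row["order_id"] for row in good]
poor_ids = [row["order_id"] for row in poor]
```

The good predictions could be loaded into the prediction table, while the poor ones could be logged or routed for review.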

Both of the presented predictive modeling approaches can be used for making KPI predictions, so designers can choose the appropriate design approach. The selection depends on the particular scenario. Generally, data mining dimensions offer slightly better performance and slice-and-dice capabilities, but the data mining models must be within the same cube, which means it is not possible to use data mining models from another cube or server. The prediction table approach, on the other hand, performs predictions within the ETL service instead of the OLAP service. This imposes some additional load on the server during ETL package execution, whereas OLAP cubes can be preprocessed before deployment. However, prediction tables offer more flexibility in terms of scheduling and maintenance, and the models can be defined and maintained outside the cube or replaced with better (more accurate) models without altering the cube. This approach is also more suitable for integration scenarios in which supply chain partners have different analytical systems.


Validation of Data Mining Prediction Models

Before prediction models are deployed and used in production, they must be validated. This is a very important step in the data mining process because we need to know how well the models perform against actual data. For the validation of the proposed predictive models, a real-world data set from an automotive company was used.

There is no single all-inclusive method that can prove the quality of the data and the model, but there are several approaches for evaluating the quality of a data mining prediction model. We can use various statistical techniques or involve supply chain domain experts to analyze the prediction results. Furthermore, we can split the existing data set into training and testing sets in order to check the accuracy of the model. The training set is used to create the mining model, and the testing set is used to test its accuracy.
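The training and testing split can be sketched as follows. A deliberately simple one-threshold classifier stands in for a real mining model, and the data set is synthetic: an order is labelled late when the supplier delay is five days or more. All names and values are assumptions made for this illustration.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(threshold, rows):
    """Share of rows where the rule 'x >= threshold' matches the label."""
    return sum((x >= threshold) == bool(label) for x, label in rows) / len(rows)


def train_stump(train):
    """Pick the threshold, among observed values, with the best training accuracy."""
    candidates = sorted({x for x, _ in train})
    return max(candidates, key=lambda th: accuracy(th, train))


# Synthetic rows: (supplier_delay_days, order_was_late).
rows = [(days, int(days >= 5)) for days in range(10)]

train, test = train_test_split(rows)
threshold = train_stump(train)         # model is created from the training set only
test_accuracy = accuracy(threshold, test)  # and evaluated on unseen rows
```

Accuracy measured on the testing set is the more honest estimate, because those rows played no part in choosing the threshold.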

These approaches are not mutually exclusive and can be combined during the design and testing phases to refine models through a series of iterations. Various tools can be used for testing a data mining prediction model: lift charts, profit charts, scatter plots, classification matrices, cross-validation, and so forth. Figure 6 illustrates sales quantity trends and prediction deviations for a single product in three different geographic regions.
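Of the tools listed, the classification matrix is the simplest to illustrate. The sketch below counts actual versus predicted classes for a hypothetical delivery model and derives accuracy and precision from those counts. The labels and predictions are invented for the example.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count occurrences of each (actual, predicted) pair."""
    return Counter(zip(actual, predicted))


# Hypothetical test-set labels and model predictions.
actual = ["late", "late", "on time", "on time", "on time", "late"]
predicted = ["late", "on time", "on time", "on time", "late", "late"]

matrix = classification_matrix(actual, predicted)

# Accuracy: share of cases on the diagonal of the matrix.
correct = sum(count for (a, p), count in matrix.items() if a == p)
accuracy = correct / len(actual)

# Precision for the "late" class: of all orders predicted late,
# how many were actually late.
predicted_late = sum(count for (a, p), count in matrix.items() if p == "late")
precision_late = matrix[("late", "late")] / predicted_late
```

Reading the off-diagonal cells shows which kind of error the model makes most often, which a single accuracy figure hides.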

Figure 6 Forecasting data with deviations.



Validation needs to include different measures that relate to the accuracy, reliability, and usefulness of the prediction models. Accuracy tells us how well the model correlates its results with the attributes in the data set. Reliability is also a very important characteristic of prediction models; it shows how effective the model is with different data sets. This is especially significant in supply chains that include different divisions and organizations with various data sets. If the model produces similar types of predictions or kinds of patterns across those data sets, it can be considered reliable.

Finally, data mining models have to be useful, meaning that they need to provide answers that support the decision-making process. For example, if the percentage of orders fulfilled on the customer's originally committed date is decreasing, we need to know why. This is where key influencer analysis comes into play.


End-User Analytics

Once we define OLAP-based predictive KPIs, we can use different client applications for browsing and for slicing and dicing based on various criteria. For example, the UDM enables slicing data by different dimensions (e.g., organization, geography, or product) or by dimension hierarchies. Furthermore, data can be filtered by particular values. For example, we can display the prediction of the cash-to-cash cycle time KPI for a particular year and quarter, country, and organization.
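Slicing and filtering can be mimicked on a small in-memory fact list. The sketch below filters predicted cash-to-cash cycle times by year and quarter and then groups them by country. The rows, the dimension names, and the `slice_average` helper are hypothetical, and a real client would issue this query against the cube rather than Python lists.

```python
from collections import defaultdict

# Hypothetical predicted cash-to-cash cycle times in days, one row per fact.
facts = [
    {"year": 2024, "quarter": "Q1", "country": "DE", "organization": "Plant A", "days": 42.0},
    {"year": 2024, "quarter": "Q1", "country": "DE", "organization": "Plant B", "days": 38.0},
    {"year": 2024, "quarter": "Q2", "country": "DE", "organization": "Plant A", "days": 40.0},
    {"year": 2024, "quarter": "Q1", "country": "FR", "organization": "Plant C", "days": 51.0},
]


def slice_average(rows, by, **filters):
    """Average `days` grouped by the `by` dimension, after filtering on exact values."""
    groups = defaultdict(list)
    for row in rows:
        if all(row[key] == value for key, value in filters.items()):
            groups[row[by]].append(row["days"])
    return {member: sum(values) / len(values) for member, values in groups.items()}


# Filter to 2024 Q1, then slice by country.
by_country_q1 = slice_average(facts, by="country", year=2024, quarter="Q1")

# Filter to one country, then slice by organization.
by_org_de = slice_average(facts, by="organization", country="DE")
```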

Additionally, predictive analysis can detect attributes that influence KPIs. Business users can monitor trends and analyze key influencers in order to identify those attributes that have a sustained effect or a significant positive or negative impact on a KPI. For example, they can identify whether a price discount on a certain product has a long-term impact on sales or only produces a short-term effect. Such actionable insights enable companies to better plan their improvement strategies and improve their responsiveness.

The UDM also allows actions to be defined in relation to query results, that is, actions that a client can perform for a given context. This feature goes further than traditional analytical applications, which only present data. It also provides a mechanism for discovering problems and deficiencies, thus improving supply chain performance. An action can start a specific application or load information from a database or a data warehouse. For example, a drill-through action can show the detailed rows behind a total, and a reporting action can launch a report based on a dimension attribute's value (parameters can be passed via URL). Hyperlink actions can open specific pages or applications, such as a web page showing SCOR recommended best practices for a particular process. Actions can be specific to any displayed data, including individual cells, dimension members, or KPIs, resulting in more detailed analysis or even integration of the analysis application into a larger data management framework.

Once data mining has been used to produce predictive KPIs, different client applications and technologies, such as web portal dashboards, scorecard systems, spreadsheets, web services, or feeds, can be used to display and analyze the information.