Read this article. A predictive performance management model is introduced to manage complex business network collaborations and minimize uncertainty. Pay attention to the innovative performance management systems characteristics. What other attributes would you add to the list?
Data Mining Model for KPI Prediction
While the proliferation of
reporting and multidimensional analytics has greatly benefited many
organizations of all sizes, the next step in promoting business agility
and operational efficiency is to make the leap from retrospective
analysis of historical data to proactive actions based on predictive
analysis of business data and to embed intelligent, fact-based decision
making into business processes. The key to accomplishing this is to use
powerful data mining algorithms to analyze large data sets, compare new
data to historical facts and behaviors, identify classifications and
relationships between business entities and attributes, and deliver
accurate predictive insights to all of the systems and users who make
business decisions.
Building a data mining model is a part of a
larger process that includes everything from defining the basic problem
the model will solve to deploying it into a working environment. A model
typically contains input columns, an identifying column, and a
predictable column. The data type of each column can be defined in the
mining structure, and the algorithms process the data accordingly.
Depending on the case, a column can be one of the following:
(i) continuous column: this column contains numeric measurements on a continuous scale with no upper bound, typically product cost, salary, account balance, shipping date, or invoice date.
(ii) discrete column: these are finite unrelated values such as product category, location, age, and telephone area codes. They do not need to be numeric in nature and typically do not have a fractional component.
(iii) discretized column: this is a continuous column converted to be discrete, for example, grouping salaries into predefined bands.
(iv) key: the column that uniquely identifies the row, similar to a primary key.
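As a minimal sketch of the discretized column type, the following Python snippet groups a continuous salary column into predefined bands. The band boundaries, labels, and the employee_id key column are hypothetical and chosen only for illustration; they do not come from the article.

```python
from bisect import bisect_right

# Hypothetical cut points and labels (assumptions for illustration only).
# A value equal to a cut point falls into the upper band.
BAND_EDGES = [30_000, 60_000, 100_000]
BAND_LABELS = ["low", "medium", "high", "very high"]


def discretize(value, edges=BAND_EDGES, labels=BAND_LABELS):
    """Map a continuous value to the label of the band it falls into."""
    return labels[bisect_right(edges, value)]


# employee_id plays the role of the key column, salary is continuous,
# and salary_band is the discretized column derived from it.
rows = [
    {"employee_id": 1, "salary": 28_500.0},
    {"employee_id": 2, "salary": 60_000.0},
    {"employee_id": 3, "salary": 125_000.0},
]
for row in rows:
    row["salary_band"] = discretize(row["salary"])
```

The continuous values stay in the data; the band is an additional column that algorithms expecting discrete inputs can use.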
Different models for a given business problem
could be used for analyzing various business scenarios, identifying the
analytical requirements, tuning the parameters, and evaluating the
results of the models to make a business decision.
Predictive
models can be used to forecast explicit values, based on patterns
determined from known results. For example, we can define the target
customer service level KPI and set the target to 95%. Then, based on the
historical data, a model can be built that predicts this KPI in the
future.
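The idea of predicting a KPI and comparing it with its target can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain least-squares trend line instead of a data mining algorithm, and the monthly service levels are made-up values, not data from the article.

```python
def linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; needs at least two points.

    Returns (slope, intercept).
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_next(values):
    """Extrapolate the fitted trend line one period ahead."""
    slope, intercept = linear_trend(values)
    return slope * len(values) + intercept


TARGET = 0.95  # customer service level target from the text
history = [0.97, 0.96, 0.96, 0.95, 0.94, 0.94]  # hypothetical monthly values

predicted = forecast_next(history)
will_miss_target = predicted < TARGET
```

With a downward trend like the one in this sample, the predicted value falls below the target, which is the kind of early warning a predictive KPI is meant to give.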
A variety of techniques has been developed for making such
predictions, typically by applying different models to the same data set
and then comparing their performance to choose the best one. For KPI
prediction, different data mining (DM) models and algorithms can be used
depending on the goal and business case. The main algorithm types to
consider when comparing models and choosing the best one based on
predictive performance (i.e., explaining the variability in question and
producing stable results across samples) are briefly described below.
(i) Classification algorithms (such as decision trees) predict one or more discrete variables, based on the other attributes in the dataset.
(ii) Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset.
(iii) Time series algorithms forecast the patterns based on the current set of continuous predictable attributes.
(iv) Association algorithms find correlations
between different attributes in a dataset. The most common application
of this kind of algorithm is for creating association rules, which can
be used in a market basket analysis or KPI analysis.
Choosing the
right algorithm to use for a specific business task can be challenging.
While it is possible to use different algorithms to perform the same
business task, each algorithm produces a different result, and some
algorithms can produce more than one type of result. For example, the
decision trees algorithm can be used not only for predictions, but also
as a way to reduce the number of columns in a dataset, because the
decision tree can identify columns that do not affect the final mining
model. The type of algorithm depends on the type of prediction. For
example, if we are predicting a discrete attribute (e.g., out-of-stock),
we can use naïve Bayes, decision trees, or neural networks. If we are
predicting a continuous attribute (e.g., supply chain sales amount), a
time series algorithm can be used. If we are predicting a sequence
(e.g., analyzing the factors leading to delivery failure), a sequence
clustering algorithm can be used. Using more than one data mining model over the
same mining structure is a good practice, since the best model can be
selected for predictions. Lift charts can be a useful tool for checking the
accuracy of the data mining models once they are built on the input data.
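The comparison behind a lift chart can be sketched as follows. The snippet computes a single point of the chart: how much more often the top-scored cases are true positives than a random selection would be. The two score lists stand in for the outputs of two hypothetical models over the same cases, and all values are made up for illustration.

```python
def lift_at(scores, actuals, fraction):
    """Lift of the top `fraction` of cases ranked by score (1.0 = no better than random)."""
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(actual for _, actual in ranked[:top_n]) / top_n
    base_rate = sum(actuals) / len(actuals)
    return top_rate / base_rate


# 1 = the item actually went out of stock, 0 = it did not (hypothetical data).
actuals = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

# Predicted out-of-stock probabilities from two hypothetical models.
model_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2, 0.1, 0.3],
    "model_b": [0.9, 0.8, 0.3, 0.7, 0.1, 0.6, 0.4, 0.2, 0.1, 0.3],
}

lifts = {name: lift_at(scores, actuals, 0.3) for name, scores in model_scores.items()}
best_model = max(lifts, key=lifts.get)
```

A full lift chart repeats this calculation for every fraction from 0 to 100% and plots the curves of all candidate models together.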
In
contrast to standard KPIs, which report only past or at best present
performance, predictive KPIs look forward and use data mining to show
what the situation will be in the next month, quarter, or year. For
example, we can use customer data to predict future demand and thus
better plan the production and inventory management processes. Or we can
predict a disruption in delivery and proactively plan alternative
delivery modes.
This allows an organization to react before a
disruption happens. Predictive KPIs can give insight into emerging
trends or into potential opportunities or problems.
An advantage
of using OLAP-based KPIs is that they are server based and can be
consumed by a variety of clients. This means that each client throughout
the supply chain will access a single version of the truth, thus making
coordination efforts much easier. Also, making a complex calculation on
the server can have performance benefits.
Building Prediction Data Mining Models
In this section, two approaches for building KPI prediction models within the UDM are introduced:
(i) using OLAP data mining dimensions,
(ii) using prediction tables.
Data mining dimensions are results of
predictive calculations which are saved into the cube as new dimensions.
These dimensions can be browsed or even sliced and diced by results of
predictions just as with any other dimension. The special MDX predict
function can be used to perform predictions. This allows performing
prediction joins against data mining models from queries within the
cube. When calculating KPI elements such as value, target, or status, we
make use of the data mining dimension.
This procedure can be
used for various supply chain analysis tasks such as inventory
out-of-stock prediction, supplier lead time prediction, and forecasting
of customer demand or order fulfillment time. For these tasks, different
data mining algorithms such as decision trees, time series, or neural
networks can be used. Models can be evaluated and compared in terms of
accuracy and precision.
The alternative way of making KPI
predictions is to use prediction tables. Prediction tables are just like
any other tables used for an OLAP cube. They can be a measure group or a
dimension, but typically they will be a measure group. In this approach,
data mining predictions are performed within the ETL (extract,
transform, and load) process. An ETL package pulls data from the data
source, performs a prediction task, and loads the results into a new
prediction table. This gives us greater flexibility because we can add a
new table to the data warehouse. It is also more flexible for
scheduling the training of the model and for maintaining the model. The
model can be defined outside the cube and does not need to be processed
along with the cube. So, if we find a better model, the existing model
can be replaced without altering the cube.
Figure 5 shows an example of
the ETL data and control flows that perform data extraction,
integration, cleansing, and data mining prediction.
Figure 5 Data mining ETL package.

In
the first step, the ETL package pulls data from different supply chain
sources and passes it to a data processing component and then
to the prediction component, which runs a data mining query. This query
can call any OLAP server or data mining model and return the results to
the ETL pipeline. Then, we can perform different operations: populate measures,
split predictions into good and poor predictions, or define any type of
filtering or modifications. In order to design such ETL packages we used
special ETL components such as a data mining model training destination
for training data mining models and a data mining query transformation
that can be used to perform predictive analysis on data as it is passed
through the data flow. The MDX expressions for building KPIs over
prediction tables are just the same as for any other KPIs.
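The flow described above (extract, cleanse, predict, split, load) can be sketched in plain Python. This is not the ETL tooling used in the article: every function, field name, and value below is hypothetical, the "model" is a hand-written score standing in for a trained mining model, and splitting "good" from "poor" predictions by a confidence threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PredictionRow:
    order_id: int
    predicted_late: bool
    probability: float


def extract():
    """Stand-in for pulling rows from supply chain sources (made-up values)."""
    return [
        {"order_id": 101, "distance_km": 120, "carrier_delay_rate": 0.05},
        {"order_id": 102, "distance_km": 900, "carrier_delay_rate": 0.30},
        {"order_id": 103, "distance_km": 450, "carrier_delay_rate": None},
        {"order_id": 104, "distance_km": 600, "carrier_delay_rate": 0.15},
    ]


def cleanse(rows):
    """Drop rows with missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def predict(row):
    """Placeholder scoring rule, not a trained data mining model."""
    probability = min(1.0, row["carrier_delay_rate"] + row["distance_km"] / 2000)
    return PredictionRow(row["order_id"], probability >= 0.5, round(probability, 3))


def run_package(min_confidence=0.7):
    """Return (good, poor) predictions; good ones would be loaded into the prediction table."""
    good, poor = [], []
    for row in cleanse(extract()):
        prediction = predict(row)
        confidence = max(prediction.probability, 1 - prediction.probability)
        (good if confidence >= min_confidence else poor).append(prediction)
    return good, poor


prediction_table, review_queue = run_package()
```

Because the model sits behind predict(), it can be retrained or swapped without touching the table that the cube reads, which is the flexibility argument made for prediction tables.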
Both
presented predictive modeling approaches can be used for making KPI
predictions, so designers can choose the appropriate design
approach. The selection depends on the particular scenario. Generally,
data mining dimensions offer slightly better performance and slice and
dice capabilities, but the data mining models must reside within the same
cube, which means it is not possible to use data mining models from another
cube or server. On the other hand, the approach with prediction tables
performs predictions within the ETL service, instead of the OLAP
service. This imposes some additional load on the server during ETL
package execution, whereas OLAP cubes can be preprocessed before
deployment. However, prediction tables offer more flexibility in terms
of scheduling and maintenance, and the models can be defined and
maintained outside the cube or replaced with better (more accurate)
models without altering the cube. Also, this approach is more suitable
for integration scenarios, where supply chain partners can have
different analytical systems.
Validation of Data Mining Prediction Models
Before
prediction models are deployed and utilized in production, they must be
validated. This is a very important step in the data mining process
because we need to know how well models perform against actual data. For
the validation of the proposed predictive models, a real-world data set
from an automotive company was used.
There is no single
all-inclusive method that can prove the quality of the data and the model.
Instead, there are several approaches for evaluating the quality of a data
mining prediction model. We can use various statistical techniques or involve
supply chain domain experts to analyze the prediction results.
Furthermore, we can split the existing data set into training and testing
sets in order to check the accuracy of the model. The training set is
used to create the mining model. The testing set is used to test the
model's accuracy.
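The training and testing split can be sketched as follows. The data, the hold-out size, and the one-parameter threshold "model" are all assumptions made for illustration; a real mining model would be trained in the same position.

```python
import random


def split_rows(rows, n_test, seed=42):
    """Shuffle a copy of the rows and hold out n_test of them (1 <= n_test < len(rows))."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:-n_test], shuffled[-n_test:]


def accuracy(threshold, rows):
    """Share of rows where 'stock <= threshold' matches the actual out-of-stock flag."""
    return sum((stock <= threshold) == out for stock, out in rows) / len(rows)


def train_threshold(train):
    """Toy model: pick the stock level seen in training that classifies it best."""
    candidates = sorted({stock for stock, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))


# Hypothetical (stock_level, went_out_of_stock) rows.
rows = [(stock, stock <= 20) for stock in range(0, 100, 5)]

train, test = split_rows(rows, n_test=6)
threshold = train_threshold(train)      # fitted on the training set only
test_accuracy = accuracy(threshold, test)  # measured on unseen rows
```

The key point is that the testing rows play no part in fitting the model, so the accuracy measured on them is a fairer estimate of how the model behaves on new data.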
These evaluation approaches are not mutually exclusive and can be
combined during the design and testing phases to refine models
through a series of iterations. Various tools can be used for testing data
mining prediction models: lift charts, profit charts, scatter plots,
classification matrices, cross-validation, and so forth. Figure 6
illustrates sales quantity trends and prediction deviations for a single
product in three different geographic regions.
Figure 6 Forecasting data with deviations.
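Of the testing tools listed above, the classification matrix is the simplest to sketch. The snippet below counts actual versus predicted labels and derives precision and recall for the "late" class. The labels and values are hypothetical and serve only as an illustration.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count occurrences of each (actual, predicted) label pair."""
    return Counter(zip(actual, predicted))


# Hypothetical delivery outcomes and the corresponding model predictions.
actual = ["late", "on_time", "late", "on_time", "on_time", "late"]
predicted = ["late", "on_time", "on_time", "on_time", "late", "late"]

matrix = classification_matrix(actual, predicted)

true_late = matrix[("late", "late")]       # late and predicted late
false_late = matrix[("on_time", "late")]   # on time but predicted late
missed_late = matrix[("late", "on_time")]  # late but predicted on time

precision = true_late / (true_late + false_late)
recall = true_late / (true_late + missed_late)
```

Reading the off-diagonal counts shows which kind of mistake a model makes, something a single accuracy figure hides.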

Validation
needs to include different measures that relate to the accuracy,
reliability, and usefulness of the prediction models. Accuracy tells us
how well the model correlates the results with the attributes in the
data set. Reliability is also a very important characteristic of
prediction models; it shows how effective the model is with different
data sets. This is especially significant in supply chains that include
different divisions and organizations with various data sets. If the
model produces similar types of predictions or kinds of patterns across
these data sets, it can be considered reliable.
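One simple way to make the reliability check concrete is to score the same model on several data sets and look at how far the results spread. The sketch below does this with hypothetical data sets from three supply chain partners; the fixed rule used as the model and the 10-point tolerance are assumptions, not values from the article.

```python
from statistics import mean


def accuracy(predict, rows):
    """Share of rows where the model's prediction matches the actual outcome."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def reliability_report(predict, datasets, max_spread=0.10):
    """Score one model on several data sets and flag it as reliable if scores stay close."""
    scores = {name: accuracy(predict, rows) for name, rows in datasets.items()}
    spread = max(scores.values()) - min(scores.values())
    return {
        "scores": scores,
        "mean": mean(scores.values()),
        "spread": spread,
        "reliable": spread <= max_spread,
    }


def predicts_late(lead_time_days):
    """Hypothetical fixed model: a delivery is late if lead time exceeds 10 days."""
    return lead_time_days > 10


# Hypothetical (lead_time_days, was_late) rows from three partners.
datasets = {
    "plant_a": [(5, False), (12, True), (8, False), (15, True)],
    "plant_b": [(6, False), (11, True), (9, True), (14, True)],
    "distributor": [(4, False), (13, False), (7, False), (20, True)],
}

report = reliability_report(predicts_late, datasets)
```

In this made-up case the model is perfect for one partner and noticeably worse for the other two, so it would not be considered reliable under the assumed tolerance.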
And finally, data mining models have to
be useful, meaning that they need to provide certain answers and to
support the decision-making process. For example, if the percentage of
orders that are fulfilled on the customer's originally committed date is
decreasing, we need to know why. This is where key influence
analysis comes into play.
End-User Analytics
Once we
define OLAP-based predictive KPIs, we can use different client
applications for browsing and for slicing and dicing based on various
criteria. For example, the UDM enables slicing data by different
dimensions (e.g., organization, geography, or product) or dimension
hierarchies. Furthermore, data can be filtered by particular values. For
example, we can display the prediction of the cash-to-cash cycle time KPI
for a particular year and quarter, country, and organization.
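Slicing and filtering of predicted KPI values can be illustrated outside an OLAP client with a small Python sketch. The rows, the dimension names, and the choice of an unweighted average as the aggregate are assumptions made for this example; a cube would define its own aggregation.

```python
from collections import defaultdict


def slice_and_aggregate(rows, by, measure, **filters):
    """Filter rows by exact dimension values, group by the `by` dimensions, average the measure."""
    groups = defaultdict(list)
    for row in rows:
        if all(row[dim] == value for dim, value in filters.items()):
            groups[tuple(row[dim] for dim in by)].append(row[measure])
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical predicted cash-to-cash cycle times in days.
facts = [
    {"year": 2013, "quarter": "Q1", "country": "Germany", "organization": "Plant A", "predicted_c2c_days": 42.0},
    {"year": 2013, "quarter": "Q1", "country": "Germany", "organization": "Plant B", "predicted_c2c_days": 38.0},
    {"year": 2013, "quarter": "Q1", "country": "France", "organization": "Plant C", "predicted_c2c_days": 50.0},
    {"year": 2013, "quarter": "Q2", "country": "Germany", "organization": "Plant A", "predicted_c2c_days": 40.0},
]

# Slice by country, filtered to a particular year and quarter.
by_country = slice_and_aggregate(
    facts, by=("country",), measure="predicted_c2c_days", year=2013, quarter="Q1"
)
```

Changing the `by` tuple to ("country", "organization") drills down one level, which mirrors moving along a dimension hierarchy in a client application.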
Additionally,
predictive analysis can detect attributes that influence KPIs. Business
users can monitor trends and analyze key influencers in order to
identify those KPIs (attributes) that have a sustained effect or a
significant positive or negative impact, for example, identifying
whether a price discount on a certain product has a long-term impact on
sales or only produces a short-term effect. Such actionable insights
enable companies to better plan improvement strategies and improve their
responsiveness.
The UDM also allows the option to define actions
in relation to query results. It provides a way to define actions that a
client can perform for a given context. This feature goes further than
traditional analytical applications which only present data.
Furthermore, it provides a mechanism to discover problems and
deficiencies, thus improving the supply chain performance. An action can
start a specific application or load information from a database or a
data warehouse. For example, a drill-through action can show detailed
rows behind a total, or a reporting action can launch a report based on a
dimension attribute's value (parameters can be passed via URL).
Hyperlink actions can open specific pages or applications such as a web
page showing SCOR recommended best practices for a particular process.
Actions can be specific to any displayed data, including individual
cells, dimension members, or KPIs, resulting in more detailed analysis
or even integration of the analysis application into a larger data
management framework.
After data mining has been used to build predictive
KPIs, it is possible to use different client applications and
technologies such as web portal dashboards, scorecard systems,
spreadsheets, web services, or feeds to display and analyze information.