Predictive Analytics and Consumer Loyalty

Using big data to target brand success and build equity has become increasingly valuable. Review the results of this predictive-analytics research and assess how the loyalty rules were derived from the model. Was the classification of the consumers predictive or reflective?

Results and discussion

Apply classification algorithms

Having segmented the customers using grades and recognized the loyalty of each segment, the next step was to determine the causes of loyalty, i.e. the behavioral features of the customers in each segment. The 220 behavioral features, together with the descriptions produced by the segmentation process, were taken as input to the classification algorithms to identify the causes of loyalty and the influential features at each level of loyalty. A further benefit of applying classification algorithms was to build an accurate predictive model for classifying new users by loyalty. Multiple and binary classifiers were built and their results were compared using different criteria. The highest-accuracy classifier gave the best correlation between behavioral features and loyalty categories and identified the behavioral features that best describe each category (class), thus assisting decision-making when building marketing offers for each category and thereby increasing the company's profit.
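The comparison step described above can be sketched as follows. This is a minimal illustration assuming scikit-learn and synthetic data; the real input is the 220 behavioral features per customer with the loyalty labels derived from segmentation.

```python
# Sketch: feed behavioral features with segment-derived loyalty labels to
# several classifiers and compare their accuracies. Synthetic data stands
# in for the study's 220 behavioral features.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the customer feature matrix (220 features).
X, y = make_classification(n_samples=2000, n_features=220, n_informative=20,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "DTC": DecisionTreeClassifier(random_state=0),
    "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    "GBT": GradientBoostingClassifier(random_state=0),
}
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
```

The highest-scoring entry in `scores` would then be kept as the predictive model for new users.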


Performance measurement

The confusion matrix shown in Table 12 contains information on the actual and predicted classifications made by the binary classification system, where loyal = 1 and not loyal = 0. Each cell corresponds to a specific situation, as follows:

Table 12 Confusion matrix for the binary classification (loyal 1, not loyal 0)

                         Actual positive (1)   Actual negative (0)
Predicted positive (1)   TP                    FP
Predicted negative (0)   FN                    TN


  • True Positive (TP): the prediction is yes (the customer is loyal to the company) and the customer is in fact loyal to the company.

  • True Negative (TN): the prediction is no (the customer is not loyal to the company) and the customer in fact has no loyalty to the company.

  • False Positive (FP): the prediction is yes (the customer is loyal to the company), but the customer actually left the company. Also known as a "Type 1 error".

  • False Negative (FN): the prediction is no (the customer is not loyal to the company), but the customer is actually loyal and did not leave the company. Also known as a "Type 2 error".

Some performance measures can be calculated directly from the confusion matrix.

\begin{aligned} Recall\, (True Positive Rate\, (TPR))\,=\,& \frac{T_P}{T_P+F_N} \end{aligned}
(1)

\begin{aligned} Precision\, (Positive Predictive Value)\,=\,& \frac{T_P}{T_P+F_P} \end{aligned}
(2)

\begin{aligned} False Positive Rate\, (FPR)\,=\,& \frac{F_P}{F_P+T_N} \end{aligned}
(3)

\begin{aligned} Accuracy\,=\,& \frac{T_P+T_N}{T_P+F_P+F_N+T_N} \end{aligned}
(4)

TPR is also known as recall or sensitivity.

The accuracy standard rates the proportion of correctly classified cases from both categories. It is expressed by the following equation:

\begin{aligned} Accuracy\,=\,& \frac{T_P+T_N}{T_P+F_P+F_N+T_N} \end{aligned}
(5)

Area under the curve (AUC) measures the overall effectiveness of the classifier; it can be calculated by:

\begin{aligned} AUC\,=\,& \int\nolimits_0^{1}TPR(x)dx \end{aligned}
(6)

\begin{aligned} TPR\,=\,& \frac{T_P}{T_P+F_N} \end{aligned}
(7)

F1-measure: harmonic mean of the precision and recall. Can be calculated by:

\begin{aligned} F1\text{-}measure\,=\,& \frac{2*Precision*Recall}{(Precision + Recall)} \end{aligned}
(8)
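Equations (1)-(5) and (8) map directly onto code; a small helper, assuming raw confusion-matrix counts as input:

```python
# Compute the measures of Eqs. (1)-(5) and (8) from confusion-matrix counts.
def confusion_metrics(tp, fp, fn, tn):
    recall = tp / (tp + fn)                       # Eq. (1): TPR / recall
    precision = tp / (tp + fp)                    # Eq. (2): positive predictive value
    fpr = fp / (fp + tn)                          # Eq. (3): false positive rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)    # Eqs. (4)/(5): accuracy
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (8): F1-measure
    return {"recall": recall, "precision": precision, "fpr": fpr,
            "accuracy": accuracy, "f1": f1}

m = confusion_metrics(tp=90, fp=10, fn=10, tn=90)  # balanced toy counts
# recall, precision, accuracy and F1 all come out to 0.9; FPR to 0.1
```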


Compare binary classifiers

Confusion matrix

An example of the confusion matrix for the Multilayer Perceptron Classifier algorithm is shown in Table 13, and the comparison of the binary classifiers in Table 14.

Table 13 Confusion matrix for multilayer perceptron classifier

                         Actual positive (1)   Actual negative (0)
Predicted positive (1)   TP / 1777.0           FP / 108.0
Predicted negative (0)   FN / 261.0            TN / 145.0


Loyalty categories (loyal 1, not loyal 0)
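As a sanity check, the Table 13 counts can be plugged into the accuracy formula (Eq. 4); a short computation using the counts as printed (the result is slightly above the 0.83 reported in Table 14, plausibly a rounding or class-labeling difference):

```python
# Accuracy recomputed from the Table 13 counts (TP=1777, FP=108, FN=261, TN=145).
tp, fp, fn, tn = 1777.0, 108.0, 261.0, 145.0
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(round(accuracy, 2))  # → 0.84, close to the 0.83 reported in Table 14
```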

Table 14 Comparison of binary classifiers

Multilayer Perceptron Classifier (MLPC)
  Confusion matrix: 1777.0, 108.0, 261.0, 145.0
  Accuracy: 0.83
  Precision (0.0) = 0.92, Precision (1.0) = 0.64
  Recall (0.0) = 0.93, Recall (1.0) = 0.61
  areaUnderROC: 0.64
  F1-score (0.0) = 0.93, F1-score (1.0) = 0.62

Decision Tree Classifier (DTC)
  Confusion matrix: 1757.0, 134.0, 151.0, 234.0
  Accuracy: 0.87
  Precision (0.0) = 0.92, Precision (1.0) = 0.63
  Recall (0.0) = 0.92, Recall (1.0) = 0.60
  areaUnderROC: 0.76
  F1-score (0.0) = 0.92, F1-score (1.0) = 0.62

Random Forest Classifier (RFC)
  Confusion matrix: 1817.0, 68.0, 164.0, 226.0
  Accuracy: 0.87
  Precision (0.0) = 0.91, Precision (1.0) = 0.76
  Recall (0.0) = 0.96, Recall (1.0) = 0.57
  areaUnderROC: 0.77
  F1-score (0.0) = 0.93, F1-score (1.0) = 0.66

Gradient-Boosted Tree (GBT)
  Confusion matrix: 1796.0, 83.0, 137.0, 265.0
  Accuracy: 0.87
  Precision (0.0) = 0.92, Precision (1.0) = 0.76
  Recall (0.0) = 0.95, Recall (1.0) = 0.65
  areaUnderROC: 0.80
  F1-score (0.0) = 0.94, F1-score (1.0) = 0.70


After comparing the classifiers, the Gradient-Boosted Tree classifier was found to perform best.
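Since the GBT classifier also posted the best areaUnderROC (0.80), here is a sketch of computing that criterion; scikit-learn and synthetic data are assumptions (the study's own toolkit may differ):

```python
# Fit a gradient-boosted-tree classifier and score it by area under the
# ROC curve (Eq. 6), the criterion on which GBT led the comparison.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

gbt = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
# roc_auc_score integrates TPR over FPR from the predicted probabilities.
auc = roc_auc_score(y_te, gbt.predict_proba(X_te)[:, 1])
```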


Comparison of multiple classes

Example of a confusion matrix for multiple classes

For each class, recall is calculated from the model's confusion matrix (Table 15) as Recall = TP/(TP + FN):

R(A) = 100/200 = 0.5 (TP: 100, FN: 100)
R(B) = 9/10 = 0.9 (TP: 9, FN: 1)
R(C) = 8/10 = 0.8 (TP: 8, FN: 2)
R(D) = 9/10 = 0.9 (TP: 9, FN: 1)

Average recall = [R(A) + R(B) + R(C) + R(D)]/4 = 0.775

Table 15 Example of a confusion matrix for multiple classes (rows = actual class, columns = predicted class)

       A     B     C     D
A    100    80    10    10
B      0     9     0     1
C      0     1     8     1
D      0     1     0     9
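The worked example above can be reproduced in a few lines; the matrix is the one in Table 15, with TP on the diagonal and FN spread over the rest of each row:

```python
# Per-class recall from the Table 15 confusion matrix (rows = actual
# class), then the macro average over the four classes.
table15 = {
    "A": [100, 80, 10, 10],
    "B": [0, 9, 0, 1],
    "C": [0, 1, 8, 1],
    "D": [0, 1, 0, 9],
}
recalls = {cls: row[i] / sum(row)  # TP is the diagonal entry; FN is the rest of the row
           for i, (cls, row) in enumerate(table15.items())}
avg_recall = sum(recalls.values()) / len(recalls)
print(round(avg_recall, 3))  # → 0.775
```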


Multi-classification (1, 2, 3, 4, 5)

It reflects the loyalty levels, where 5 is very high loyalty, 4 is high loyalty, 3 is medium loyalty, 2 is low loyalty and 1 is very low loyalty.

Note 1: Multilayer Perceptron Classifier. Input: 220 features; 4 hidden layers with 5 nodes each; output: 5 classes.

Note 2: The gradient-boosted tree classifier currently supports only binary classification (Table 16).
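The architecture in Note 1 can be sketched as follows. scikit-learn's MLPClassifier and synthetic 5-class data are assumptions here, standing in for the study's own implementation:

```python
# Note 1's network: 220 input features, 4 hidden layers of 5 nodes each,
# and 5 output classes (the five loyalty levels).
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=220, n_informative=30,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
mlpc = MLPClassifier(hidden_layer_sizes=(5, 5, 5, 5), max_iter=500,
                     random_state=0).fit(X, y)
# n_layers_ counts input + hidden + output layers: 1 + 4 + 1 = 6
assert mlpc.n_layers_ == 6
```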

Table 16 Comparison of multiple classes to predict loyalty

MLPC
  Confusion matrix:
    1246.0 0.0 0.0 0.0 0.0
    97.0 0.0 0.0 0.0 0.0
    616.0 0.0 0.0 0.0 0.0
    334.0 0.0 0.0 0.0 0.0
    17.0 0.0 0.0 0.0 0.0
  Accuracy: 0.55
  Precision (1.0) = 0.53, (2.0) = 0.0, (3.0) = 0.0, (4.0) = 0.0, (5.0) = 0.50
  Recall (1.0) = 1.0, (2.0) = 0.0, (3.0) = 0.0, (4.0) = 0.0, (5.0) = 0.50
  F1-score (1.0) = 0.70, (2.0) = 0.0, (3.0) = 0.0, (4.0) = 0.0, (5.0) = 0.0
  Weighted precision: 0.26
  Weighted recall: 0.51
  Weighted F1-score: 0.34
  Weighted false positive rate: 0.51

DTC
  Confusion matrix:
    995.0 8.0 130.0 66.0 0.0
    74.0 13.0 4.0 2.0 0.0
    211.0 7.0 297.0 97.0 0.0
    54.0 0.0 70.0 211.0 0.0
    8.0 0.0 2.0 2.0 0.0
  Accuracy: 0.67
  Precision (1.0) = 0.74, (2.0) = 0.46, (3.0) = 0.59, (4.0) = 0.55, (5.0) = 0.59
  Recall (1.0) = 0.82, (2.0) = 0.13, (3.0) = 0.48, (4.0) = 0.62, (5.0) = 0.60
  F1-score (1.0) = 0.78, (2.0) = 0.21, (3.0) = 0.53, (4.0) = 0.59, (5.0) = 0.60
  Weighted precision: 0.65
  Weighted recall: 0.53
  Weighted F1-score: 0.65
  Weighted false positive rate: 0.22

RFC
  Confusion matrix:
    1073.0 2.0 80.0 44.0 0.0
    73.0 12.0 7.0 1.0 0.0
    185.0 3.0 341.0 83.0 0.0
    40.0 0.0 46.0 249.0 0.0
    1.0 0.0 6.0 5.0 0.0
  Accuracy: 0.74
  Precision (1.0) = 0.74, (2.0) = 0.70, (3.0) = 0.71, (4.0) = 0.65, (5.0) = 0.74
  Recall (1.0) = 0.89, (2.0) = 0.12, (3.0) = 0.55, (4.0) = 0.74, (5.0) = 0.74
  F1-score (1.0) = 0.83, (2.0) = 0.21, (3.0) = 0.62, (4.0) = 0.69, (5.0) = 0.66
  Weighted precision: 0.73
  Weighted recall: 0.74
  Weighted F1-score: 0.72
  Weighted false positive rate:


After comparing the classification algorithms on the multi-class task, the Random Forest Classifier turns out to be the best. An example of the distinctive features of each level of loyalty, derived from the binary classification model, is given in Table 17:

Table 17 Model tree (rules) and the corresponding behavioral features

Model tree (rules):

If (feature 79 ≤ 3.3846153846153846)
  If (feature 103 ≤ 2.2285714285714286)
    If (feature 214 ≤ 2093.0)
      If (feature 105 ≤ 0.17407765595569782)
        If (feature 78 ≤ 3.0344827586206895)
          Predict: 0.0
        Else (feature 78 > 3.0344827586206895)
          Predict: 1.0
      Else (feature 105 > 0.17407765595569782)
        If (feature 184 ≤ 2518.617404647386)
          Predict: 0.0

Features:

Feature79 = std dur per day holiday out
Feature103 = std dur per call work time out
Feature214 = avg trans per cnt
Feature105 = avg dur per call workday
Feature78 = std dur per day holiday
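Threshold rules like those in Table 17 fall straight out of a fitted decision tree. A sketch using scikit-learn's export_text; the feature names here are illustrative placeholders, not the study's actual 220 behavioral features:

```python
# Train a small decision tree and dump its threshold rules, the same
# kind of nested if/else structure shown in Table 17.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=[f"feature_{i}" for i in range(6)])
print(rules)  # nested "|--- feature_k <= threshold ... class: c" lines
```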