Predictive Analytics and Consumer Loyalty

Syriatel data sources

Call detail records (CDRs)

Each time a call is made, a message is sent, the Internet is used, or an operation is performed on the network, the descriptive information is stored as a call detail record (CDR). Table 1 illustrates some of the call, message, and Internet details available at Syriatel that were used in this research to predict customer loyalty:

Table 1 CDR sample fields in Syriatel company

| Call type | GSM (A) | GSM (B) | Direction | Cell identifier | Duration | Date | ... |
|---|---|---|---|---|---|---|---|
| Call | +963********8 | +963********5 | Out | C83 | 56 s | 10/10/2018 23:30:26 | ... |
| Call | +963********5 | +963********8 | In | C203 | 56 s | 10/10/2018 23:30:26 | ... |
| SMS | +963********9 | +963********3 | Out | C322 | Null | 10/10/2018 23:59:11 | ... |
| SMS | +963********3 | +963********9 | In | C164 | Null | 10/10/2018 23:59:11 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |


Record types: Call (call log), SMS (message log), MMS (multimedia message log), DATA (Internet data usage log), fee log, Vou (voucher recharge log), Mon (monthly log information), web metadata information, and in-roaming records.
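As a minimal sketch of how records like those in Table 1 could be parsed and summarised per GSM (the field names and sample rows below are invented to mirror the table, not Syriatel's actual schema):

```python
import csv
import io
from collections import defaultdict

# Sample CDR rows mirroring Table 1 (field names are illustrative).
raw = """call_type,gsm_a,gsm_b,direction,cell_id,duration_s,date
Call,+963********8,+963********5,Out,C83,56,10/10/2018 23:30:26
Call,+963********5,+963********8,In,C203,56,10/10/2018 23:30:26
SMS,+963********9,+963********3,Out,C322,,10/10/2018 23:59:11
SMS,+963********3,+963********9,In,C164,,10/10/2018 23:59:11
"""

def aggregate_cdrs(fh):
    """Per-GSM counters: number of records and total call seconds."""
    stats = defaultdict(lambda: {"records": 0, "call_seconds": 0})
    for row in csv.DictReader(fh):
        s = stats[row["gsm_a"]]
        s["records"] += 1
        # SMS rows have a Null (empty) duration, so guard before casting.
        if row["call_type"] == "Call" and row["duration_s"]:
            s["call_seconds"] += int(row["duration_s"])
    return dict(stats)

stats = aggregate_cdrs(io.StringIO(raw))
print(stats["+963********8"])  # {'records': 1, 'call_seconds': 56}
```

In a production pipeline the same aggregation would run over the full CDR stream (e.g. in Hadoop, as described later) rather than an in-memory string.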


Detailed data stored in relational databases

The call detail records were linked, via the GSM number, to the customer data stored in the following relational databases: the Customer Management Database, the Customer Complaints Database, the Towers Information Database, and the Mobile Phone Information Database.


Customer services

All services registered by the customer were collected and classified manually by service type (political news, sports news, horoscopes, etc.); these categories are treated as customer attributes. The result is a customer services table; Table 2 shows a sample.

Table 2 Example of customers' services in Syriatel company

| GSM | Economy | Education | Health | Horoscopes | Duration | Sport | ... |
|---|---|---|---|---|---|---|---|
| +963********9 | 0 | 1 | 0 | 0 | 1 | 0 | ... |
| +963********5 | 0 | 0 | 1 | 1 | 0 | 0 | ... |
| +963********8 | 1 | 0 | 0 | 0 | 0 | 0 | ... |
| +963********3 | 0 | 0 | 0 | 0 | 0 | 0 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |
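One way such a table could be produced is by one-hot encoding each GSM's registered service categories. The sketch below uses a subset of the category names from Table 2; the registration data is invented for illustration:

```python
CATEGORIES = ["Economy", "Education", "Health", "Horoscopes", "Sport"]

# Hypothetical registrations: GSM -> set of registered service categories.
registrations = {
    "+963********9": {"Education"},
    "+963********5": {"Health", "Horoscopes"},
    "+963********8": {"Economy"},
    "+963********3": set(),
}

def one_hot_services(regs, categories=CATEGORIES):
    """One row per GSM: 1 if the category is registered, else 0."""
    return {gsm: [int(c in cats) for c in categories]
            for gsm, cats in regs.items()}

rows = one_hot_services(registrations)
print(rows["+963********5"])  # [0, 0, 1, 1, 0]
```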


Customer contract information

Customer contract information was fetched from the CRM system. It contains basic customer information (gender, age, location, ...) and customer subscription information; a single customer may have more than one subscription (two or more GSM numbers) with different subscription types: prepaid, postpaid, 3G, 4G, etc.


Database of cells and sites

The company's location data for cells and sites was stored in a relational database. This data was used to extract spatial features. Table 3 shows a sample.

Table 3 Sample of cells and sites database

| Cell identifier | Site identifier | Longitude | Latitude | ... |
|---|---|---|---|---|
| C147 | S73 | **.******2 | **.******7 | ... |
| C23 | S119 | **.******0 | **.******6 | ... |
| C64 | S14 | **.******1 | **.******0 | ... |
| ... | ... | ... | ... | ... |
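Joining each CDR's cell identifier against Table 3 yields coordinates from which spatial features can be derived. One plausible building block (an assumption here; the text does not specify the distance formula) is the haversine great-circle distance between two sites:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points,
    using a mean Earth radius of 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

# Illustrative coordinates (the real values in Table 3 are masked).
d = haversine_km(33.51, 36.29, 33.52, 36.30)  # two nearby sites
print(round(d, 2))  # ~1.45 km
```

Distances like this between a customer's home-zone antenna and the antennas of their transactions underpin the spatial and navigation features described below.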


Demographics data for customers

Building such a predictive system requires data containing the real demographics, such as gender and age, for each GSM. The registered demographics of the GSM owner are not always reliable, since the real user and the registered owner are sometimes not the same person. Table 4 shows a sample.

Table 4 Demographics data for customers

| Age group | Year range | Percentage (%) |
|---|---|---|
| A | 18–27 | 32 |
| B | 28–39 | 41 |
| C | 40–60 | 27 |


Extraction of features

The features were engineered and extracted based on our research and our experience in the telecom domain. In total, 223 features were extracted for each GSM. These features belong to six categories; each category is described below with examples.

  • Segmentation features T, F, M (3 features)

    Total (T): the total call and Internet duration within a certain period of time (Fig. 2).

    Frequency (F): how frequently services are used within a certain period (Fig. 3).

    Monetary (M): the money spent during a certain period (Fig. 4).

    The remaining classification features (220 features) fall into the subcategories below.

  • Individual behavioral features

    Individual behavior can be defined as how an individual behaves with the services.

    For example:

    Call duration per day: the call duration per day for each GSM.

    Duration per day: the call and session duration per day for each GSM (Figs. 5, 6, 7).

    Entropy of duration: high entropy means the data has high variance and thus contains a lot of information and/or noise.

    Daily outgoing calls: for each GSM, the daily outgoing calls.

    Daily incoming calls at night: for each GSM, the daily incoming calls at night (Fig. 8).

    SMS received daily at work time, etc. This category contains about 200 features.

  • Social behavior features

    Social behavior is behavior among two or more organisms of the same species, and encompasses any behavior in which one member affects another as a result of their interaction.

    Some examples of these features:

    Number of contacts: for each customer, the number of contacts with other customers.

    Transactions received per customer: the number of calls, SMS messages, and Internet sessions received by each customer.

    Transactions sent per contact, etc. This category contains 20 features.

  • Spatial and navigation features

    Features about the spatial navigation of customers:

    Holiday navigation: customer movements on holidays.

    Home zone: the location of the customer's home.

    Antenna location: the location of the antenna.

    Daytime antennas: the antennas used by the customer's transactions during the daytime.

    Workday antennas: the antennas used by the customer's transactions on workdays.

    Vacation antennas: the antennas used by the customer's transactions on vacation, etc. This category contains about 21 features.

  • Time-slot features: counts of transactions per working day (Sunday to Thursday), on holidays, during the day (9:00 to 16:00), and at night (17:00 to 8:00), e.g. the average number of SMS received per day on holidays, etc. (165 features).

  • Types of services registered

    Technical news services, educational services, sports news services, political news services, entertainment services, etc. (13 features).

  • Contract information

    Tariff type and GSM type (2 features).

    The total number of features listed above is 421, but about 201 features belong to more than one category, so the number of distinct classification features is 220. Figures 2, 3, 4, 5, 6, 7 and 8 show the mass distribution of some of these features with respect to loyalty.

Fig. 2 Mass distribution for T in our study

Fig. 3 Mass distribution for F in our study

Fig. 4 Mass distribution for M in our study

Fig. 5 Mass distribution for "Avg dur per day daylight"

Fig. 6 Mass distribution for "Avg dur per day night"

Fig. 7 Mass distribution for "Avg dur per day worktime"

Fig. 8 Avg dur per call
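The segmentation features T, F and M, and the duration-entropy feature described above, could be computed from a per-GSM transaction log roughly as follows (the record layout and values are invented for illustration):

```python
from math import log2
from collections import Counter

# Hypothetical transactions for one GSM over a period:
# (kind, duration_seconds, cost)
txns = [("Call", 56, 3.0), ("Call", 120, 6.0),
        ("DATA", 300, 2.5), ("SMS", 0, 0.5)]

T = sum(d for _, d, _ in txns)   # Total: call/Internet duration in the period
F = len(txns)                    # Frequency: how often services were used
M = sum(c for _, _, c in txns)   # Monetary: money spent in the period

def entropy(values):
    """Shannon entropy of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in counts.values())

H = entropy([d for _, d, _ in txns])  # entropy of durations
print(T, F, M, H)  # 476 4 12.0 2.0
```

Here all four durations are distinct, so the entropy reaches its maximum log2(4) = 2, matching the text's point that high entropy indicates high variance.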


Feature engineering: ways to choose features

Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work well. The most important reasons to use feature selection are:

  • It enables machine learning algorithms to train faster.

  • It reduces the complexity of the model and makes it easy to interpret.

  • It improves the accuracy of the model if the correct subset is selected.

  • It reduces overfitting.

Next, we discuss several methodologies and techniques that can be used to shape the feature space and help models perform better and more efficiently.

Feature (attribute) selection algorithms

Many feature selection algorithms have been discovered and reported in the literature. According to their mathematical models, they fall into three categories: the filter model, the wrapper model, and the embedded model.


Filter model

The filter model depends on the general characteristics of the data and evaluates features without involving any learning algorithm. Filter algorithms include Relief-F, Information Gain, the Gini index, Chi-squared, and Gain Ratio. Entropy (H) is used to measure the homogeneity of a sample; information gain is the decrease in entropy after the data set is split on an attribute.
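As a concrete illustration of these filter criteria, entropy and information gain can be computed directly; a minimal sketch with invented loyalty labels:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """H(S) = -sum p_i * log2(p_i) over the class proportions of `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, attribute):
    """Decrease in entropy after splitting `labels` by `attribute` values."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data: a loyalty label and a binary attribute that predicts it perfectly.
loyal = [1, 1, 0, 0]
attr = ["A", "A", "B", "B"]
print(information_gain(loyal, attr))  # 1.0: the split removes all uncertainty
```

A filter method ranks all candidate features by such a score and keeps the top ones, without ever training the downstream classifier.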


Wrapper model

The wrapper model requires a predefined learning algorithm, and uses its performance as the criterion for evaluating and identifying features.


Embedded model

The embedded model chooses a set of features while training and building a model, then assesses feature importance according to the goal of the learning model. The importance of each feature in the dataset can be obtained from the model's feature importance property: it gives a score for each feature, and the higher the score, the more important or relevant the feature is to the output variable. Feature importance is a built-in attribute of tree-based classifiers.
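A minimal pure-Python sketch of this idea: each feature is credited with the best impurity (Gini) decrease it achieves when a depth-1 stump is fitted, normalised to sum to one, which is the same quantity tree-based classifiers expose as their feature-importance property (e.g. scikit-learn's `feature_importances_`). The data is invented:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def stump_importances(X, y):
    """For each feature, the best weighted Gini decrease over all
    single-threshold splits, normalised to sum to 1."""
    n, d = len(X), len(X[0])
    base = gini(y)
    imp = [0.0] * d
    for j in range(d):
        best = 0.0
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            dec = (base - (len(left) / n) * gini(left)
                        - (len(right) / n) * gini(right))
            best = max(best, dec)
        imp[j] = best
    s = sum(imp)
    return [v / s for v in imp] if s else imp

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0, 5], [1, 3], [2, 5], [3, 3]]
y = [0, 0, 1, 1]
print(stump_importances(X, y))  # [1.0, 0.0]
```

Real tree ensembles accumulate the same impurity decreases over many splits and trees, but the ranking principle is identical.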

Big data analytics builds on several technological advances in memory usage, data processing, and the handling of high-volume data: decreasing costs of storage and CPU power; effective storage management for flexible computation; and distributed computing through flexible parallel processing with frameworks such as Hadoop. Big data frameworks such as the Hadoop ecosystem and NoSQL databases efficiently handle complex queries, analytics, and extract, transform and load (ETL) operations, which are complex in conventional data warehouses. These technological changes have driven a number of improvements over conventional analytics.

Feature selection aims to select the feature subset with the lowest dimensionality that still retains classification accuracy, so as to optimize the chosen evaluation criterion. Feature selection methods can be filter or wrapper methods. Filter methods make a selection based on the characteristics of the dataset itself; the selection is fast but can lead to poor performance. Wrapper methods take a subsequent classification algorithm as the evaluation index; they achieve high classification accuracy but are inefficient because of the large amount of computation. In this work, backward feature elimination and forward feature selection were used.

Backward elimination: In contrast to the forward selection strategy, the backward elimination strategy starts with the complete attribute set as the initial subset and iteratively (and heuristically) removes attributes from that subset until no performance gain can be achieved by removing another attribute.

Forward selection starts with attribute subsets containing exactly one attribute. Additional attributes are then added heuristically until no further performance gain is achieved by adding an attribute. Both methods were run within the Hadoop ecosystem.
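Both greedy strategies can be sketched generically, with the learning algorithm hidden behind a `score` function. The `score` below is a toy stand-in for cross-validated accuracy (it pretends only the hypothetical features "T" and "M" carry signal and penalises subset size); in practice it would train and validate a classifier on each candidate subset:

```python
def forward_selection(features, score):
    """Greedily add the attribute that most improves score(subset);
    stop when no addition helps."""
    selected, best = [], score([])
    while True:
        gains = [(score(selected + [f]), f)
                 for f in features if f not in selected]
        if not gains:
            break
        s, f = max(gains)
        if s <= best:
            break
        selected, best = selected + [f], s
    return selected, best

def backward_elimination(features, score):
    """Start from the full set; greedily drop attributes as long as the
    score does not decrease."""
    selected, best = list(features), score(list(features))
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            reduced = [g for g in selected if g != f]
            s = score(reduced)
            if s >= best:
                selected, best, improved = reduced, s, True
                break
    return selected, best

# Hypothetical evaluator: "T" and "M" are informative, extras cost a penalty.
def score(subset):
    return 0.5 + 0.2 * ("T" in subset) + 0.2 * ("M" in subset) - 0.01 * len(subset)

feats = ["T", "F", "M", "entropy_dur"]
fwd = forward_selection(feats, score)
bwd = backward_elimination(feats, score)
print(fwd, bwd)  # both keep only the informative features T and M
```

Because each candidate subset requires a full train/validate cycle, wrapper searches like these dominate the computational cost, which is why running them on a distributed platform is attractive.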