"Faris Avdagic", "github"=>"FarisDhatim"}" />

Machine Learning for data services optimization

Telecom operators provide data services such as private chat or email synchronization to their users. Users are charged for these services, but the operators are in turn charged by the service providers. A service costs between 0.2€ and 6€ per month per device. However, some users don’t use the services they have subscribed to. If a system could detect these users, their services could be deactivated and the operator would no longer be charged for them. In this post we propose a system based on machine learning which deactivates users in real time and makes better decisions with each iteration.

Why Machine Learning?

A telecom operator has around 3,000,000 service subscribers, so manual deactivation is impossible. In addition, a lot of information is available about each user (a few hundred features). Machine learning can find correlations that we don’t expect, and it scales. The objective of the system is to maximize the number of deactivations while keeping the number of reactivations acceptable: if the system deactivates someone who is actually using the service, the user will complain to customer support and the service will be reactivated.

Exploration

As mentioned above, both the feature space and the number of users are large. The full history of each user is recorded. Available features include, for example, the type of mobile phone, the number of mailboxes configured and the date of the last transaction.

When we started the project, deactivation processes had already begun. Cost-cutting experts had defined “deactivation rules” based on a few features, and these rules had been pretty successful. The challenge was to improve something that was already very good. So we started by clustering the data, and with the help of the experts we were able to interpret the clusters and make decisions. The predictive performance of clustering was not as good as expected, but at least it enriched the training set.
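As a rough illustration of this exploration step, the sketch below clusters users and prints a per-cluster summary that an expert could then interpret. It is not the setup used in the project: the data is synthetic, the two features are stand-ins for "days since last transaction" and "number of mailboxes configured", and the choice of KMeans with two clusters is an assumption made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for user features (hypothetical values):
# column 0 = days since last transaction, column 1 = mailboxes configured.
active = np.column_stack([rng.uniform(0, 10, 200), rng.integers(1, 4, 200)])
dormant = np.column_stack([rng.uniform(150, 400, 200), rng.integers(0, 2, 200)])
X = np.vstack([active, dormant])

# Scale the features so that neither dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)


def cluster_summary(features, labels):
    """Mean of each feature per cluster, in the original units."""
    return {
        int(c): features[labels == c].mean(axis=0).round(1).tolist()
        for c in np.unique(labels)
    }


summary = cluster_summary(X, kmeans.labels_)
for cluster_id, (mean_days, mean_mailboxes) in summary.items():
    print(cluster_id, mean_days, mean_mailboxes)
```

Summarizing clusters in the original units (rather than the scaled ones) is what makes them readable for domain experts, who can then decide which clusters are candidates for deactivation.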

From there, the training set was diverse enough to build a supervised classifier: the feature space had been explored sufficiently, and we had both positive examples (successful deactivations) and negative examples (reactivations).

Iterative Learning process

From a technical point of view, this is a binary classification problem: a suspension is either successful or followed by a reactivation.

Note that there is no way to be sure that a user will never use the services again. We need to define a threshold (a number of days) after which we consider the suspension successful. The reactivation distribution gives a sense of scale: in this case, the median value is 8 days.
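The labelling rule described here can be sketched as a small function. The 30-day threshold, the function name and the convention of returning `None` for suspensions that are still too recent to judge are all assumptions made for this example; the post does not state the threshold actually used.

```python
from datetime import date
from typing import Optional

SUCCESS_THRESHOLD_DAYS = 30  # hypothetical value, not taken from the post


def label_suspension(
    suspended_on: date,
    reactivated_on: Optional[date],
    today: date,
    threshold_days: int = SUCCESS_THRESHOLD_DAYS,
) -> Optional[int]:
    """Return 1 for a successful suspension, 0 for a reactivation,
    or None when not enough time has passed to decide."""
    if reactivated_on is not None:
        if (reactivated_on - suspended_on).days <= threshold_days:
            return 0  # reactivated within the observation window
    if (today - suspended_on).days >= threshold_days:
        return 1  # survived the whole observation window
    return None  # still inside the window, no label yet


print(label_suspension(date(2024, 1, 1), date(2024, 1, 9), date(2024, 3, 1)))
print(label_suspension(date(2024, 1, 1), None, date(2024, 3, 1)))
```

Suspensions that return `None` would simply be left out of the training set until their observation window has elapsed.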

A large part of the models provided by scikit-learn, xgboost and keras (with TensorFlow) were tested empirically. Model parameters were set by cross-validation and the models were compared on a validation set. For parameter optimization we used Hyperopt (“Distributed Asynchronous Hyperparameter Optimization in Python”). To help the model, the data were split by population and a separate model was learned for each sub-population. For example, the predictive model for users of non-BlackBerry phones is independent of the predictive model for users of BlackBerry phones. It sounds contradictory, but it’s possible to use BlackBerry services on non-BlackBerry phones. In doing this, we combined our practical knowledge with machine learning to build a better system, because we knew that these two populations (BlackBerry phone users vs non-BlackBerry phone users) behave differently.
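To make the "one model per sub-population" idea concrete, here is a minimal sketch using scikit-learn. Several things are assumptions for the sake of the example: the data is synthetic, a random forest stands in for whichever model was finally selected, and a plain grid search with cross-validation replaces Hyperopt so that the example stays short.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)


def make_population(n, shift):
    """Synthetic users: label 1 = successful suspension, 0 = reactivation."""
    X = rng.normal(size=(n, 5))
    noise = rng.normal(scale=0.5, size=n)
    y = (X[:, 0] + shift * X[:, 1] + noise > 0).astype(int)
    return X, y


# Hypothetical sub-populations with different feature/label relationships.
populations = {
    "blackberry": make_population(400, shift=1.0),
    "non_blackberry": make_population(400, shift=-1.0),
}

param_grid = {"max_depth": [3, 6], "n_estimators": [50, 100]}
models, validation_scores = {}, {}

for name, (X, y) in populations.items():
    # Hold out a validation set used only to compare the final models.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    # Parameters are chosen by 3-fold cross-validation on the training part.
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid,
        cv=3,
        scoring="roc_auc",
    )
    search.fit(X_train, y_train)
    models[name] = search.best_estimator_
    validation_scores[name] = search.score(X_val, y_val)

print(validation_scores)
```

At prediction time, each user would be routed to the model of the population they belong to.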

As soon as we suspend some users, we observe which ones are reactivated. Successful and unsuccessful suspensions become new training examples and the model is retrained. Dhatim is about cost-cutting automation, so it’s important to build a system which learns iteratively by itself. Below we present what a learning scheme looks like with one specific classifier: a decision tree.
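In code, a loop of this kind might look like the following minimal sketch. It is an illustration under assumptions, not the production scheme: the user pool is synthetic, `observe_outcomes` is a hypothetical stand-in for waiting and watching which users get reactivated, and the batch size and tree depth are arbitrary.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)


def observe_outcomes(X):
    """Hypothetical stand-in for observing suspended users:
    1 = stayed suspended, 0 = reactivated."""
    return (X[:, 0] + rng.normal(scale=0.3, size=len(X)) > 0).astype(int)


# Small initial training set, e.g. obtained from expert rules and clustering.
X_train = rng.normal(size=(50, 3))
y_train = observe_outcomes(X_train)
pool = rng.normal(size=(2000, 3))  # users that have not been suspended yet

BATCH_SIZE = 100
reactivation_rates = []

for _ in range(4):
    # 1. Train on every labelled suspension seen so far.
    model = DecisionTreeClassifier(max_depth=3, random_state=0)
    model.fit(X_train, y_train)

    # 2. Suspend the users the model is most confident about.
    success_col = list(model.classes_).index(1)
    scores = model.predict_proba(pool)[:, success_col]
    chosen = np.argsort(scores)[-BATCH_SIZE:]

    # 3. Observe which suspensions hold and which are reactivated.
    outcomes = observe_outcomes(pool[chosen])
    reactivation_rates.append(float(1 - outcomes.mean()))

    # 4. Feed the outcomes back as new training examples.
    X_train = np.vstack([X_train, pool[chosen]])
    y_train = np.concatenate([y_train, outcomes])
    pool = np.delete(pool, chosen, axis=0)

print(reactivation_rates)
```

Each round both consumes the model's predictions and produces new labels, which is what lets the system keep improving without manual relabelling.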

To conclude, building a system which minimizes the reactivation rate is not the most important goal. What matters most is to build a system which maximizes the number of suspensions while keeping the reactivation rate acceptable. The idea is that even if the reactivation rate is high, the deactivations can be spread over a longer period. In this post, we tried to show how machine learning can be applied to a specific overpayment problem. In our experience, machine learning increases the number of deactivations by 13% on average compared to expert rules.
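One way to read this objective is as a constrained choice of decision threshold: among users ranked by predicted score, suspend the largest group whose reactivation rate stays under a cap. The sketch below illustrates that reading on a toy validation set; the function name, the scores and the cap are hypothetical.

```python
def max_suspensions(scores, stayed_suspended, max_reactivation_rate):
    """Return (score_threshold, n_suspended) for the largest top-scored group
    whose reactivation rate is within the cap, or (None, 0) if there is none."""
    ranked = sorted(zip(scores, stayed_suspended), key=lambda p: p[0], reverse=True)
    best = (None, 0)
    reactivations = 0
    for n, (score, ok) in enumerate(ranked, start=1):
        reactivations += 0 if ok else 1
        # Users with tied scores cannot be separated by a threshold,
        # so only evaluate at the end of each group of equal scores.
        if n < len(ranked) and ranked[n][0] == score:
            continue
        if reactivations / n <= max_reactivation_rate:
            best = (score, n)
    return best


# Toy validation data: model scores and observed outcomes (1 = stayed suspended).
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5]
outcomes = [1, 1, 0, 1, 1, 0]

print(max_suspensions(scores, outcomes, max_reactivation_rate=0.25))  # (0.6, 5)
print(max_suspensions(scores, outcomes, max_reactivation_rate=0.0))   # (0.9, 2)
```

With a 25% cap, the system suspends five users instead of the two it would suspend if no reactivation at all were tolerated, which is the trade-off described above.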
