The wireless industry currently uses many different customer-retention strategies and retention campaigns, including win-back teams, welcome calls, competitive rate plans and loyalty programs. However, even with these comprehensive programs in place, churn continues to plague the wireless industry and today remains a critical business issue for most wireless providers.
A key customer-retention component that has proven effective at proactively addressing voluntary customer churn is predictive churn modeling.
Predictive churn models can be used to increase the effectiveness of traditional retention strategies and campaigns; they do not replace them. Used in conjunction with a wireless provider’s retention strategies and campaigns, predictive churn models can help carriers:
achieve a higher level of retention productivity without increasing staff or workload;
enable new, targeted campaigns that were not feasible in the past;
significantly improve profitability; and
increase market share.
Predictive churn modeling
Predictive churn modeling is the process of using software to analyze and predict the behavior of a group of subscribers relative to churn.
The software looks at recent customer data, such as billing, usage, network quality, demographics, acquisition, features used, customer care and handset used. The software analyzes this data for two major groups of subscribers: those who have churned and those who have not.
The result of this analysis is a predictive churn model: a complex statistical formula that creates a churn score for each subscriber. The model typically can be reviewed to determine the accuracy level attained, the data elements that have shown the most impact on churn and other useful information. A churn list, ranked by churn score, is the major output of the modeling process.
Predictive churn models can vary over time, between providers in the same market, and across regions and segments within a single provider’s market. Therefore, it is important to monitor the performance of these models on a regular basis and, accordingly, fine-tune existing models or create new ones as market conditions change or better data becomes available. The purpose of using predictive churn models is to identify a high percentage of churners before they leave so retention actions may be taken.
How it works
To illustrate how a predictive churn model works, let’s create a fictitious wireless service provider called National Wireless. Let’s assume that National Wireless has 1 million subscribers and a voluntary monthly churn rate of 1.8 percent. This means 18,000 subscribers are churning each month. At National’s average monthly revenue per subscriber of $50, and assuming that each churner could have been retained for an average of 12 more months, National is losing approximately $130 million in future revenue per year (216,000 annual churners at $600 each).
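Under these stated assumptions, the revenue at risk can be worked out directly. The sketch below uses only the figures given for the fictitious National Wireless:

```python
# Back-of-the-envelope revenue-at-risk check for the National Wireless
# example; all input figures come from the article's assumptions.
subscribers = 1_000_000
monthly_churn_rate = 0.018          # 1.8 percent voluntary churn
avg_monthly_revenue = 50            # dollars per subscriber per month
retainable_months = 12              # assumed remaining life if saved

churners_per_month = subscribers * monthly_churn_rate               # 18,000
churners_per_year = churners_per_month * 12                         # 216,000
revenue_lost_per_churner = avg_monthly_revenue * retainable_months  # $600
annual_revenue_at_risk = churners_per_year * revenue_lost_per_churner

print(f"Churners per month: {churners_per_month:,.0f}")
print(f"Annual revenue at risk: ${annual_revenue_at_risk:,.0f}")
```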
Here’s how a predictive churn model works.
Step 1: Recent customer data (e.g., July’s) is obtained from many sources. Calculated fields, such as the last three months’ average usage and the number of days remaining in the contract, also can be created and used. Churn data also is needed for the following month (August), including an indicator of whether each subscriber churned, the deactivation date and, if available, the reason the subscriber deactivated.
Step 2: The data file then is randomly split in two. The first data set, or learning database, is used to create the models; this is the data the churn modeling software will use to learn about subscriber behavior. The remaining data is saved for a blind test conducted later to verify the accuracy of the models.
Step 3: Using the learning database, data analysis is performed and an initial model is built. The resulting model (a complex mathematical formula) will generate a churn score for each subscriber. The higher the score, the more likely a subscriber is to churn.
Step 4: The model is reviewed to determine its level of accuracy and then fine-tuned, if necessary. The model output typically can be examined in several different ways, so an analyst can see the impact specific data variables had on churn, along with several other metrics.
Step 5: The final model is applied to the blind test data. The subscribers in the blind test data are “scored,” and a ranked churn list is created.
Step 6: The churn list then is evaluated, usually by the modeling software, by measuring the actual concentration of churners within a group of subscribers with the highest churn scores. The more churners within this group, the better the model.
Step 7: The churn model then is applied to August’s data in order to predict September’s churners. A new set of churn scores will be calculated, and a new ranked churn list will be created. This new list can be used in various retention programs throughout the month of September.
Step 8: The existing models can be refined, and new models can be built as the market changes over time. The frequency with which this is done depends on various factors, including marketplace dynamics.
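The eight steps above can be sketched end to end. The following is a minimal, standard-library Python illustration, assuming synthetic subscriber records and a hand-rolled logistic-regression scorer; the feature names, coefficients and data volumes are invented for the example and are not from the article:

```python
# Minimal sketch of Steps 1-6: build data, split it, fit a scorer,
# rank a blind-test set by churn score and measure the concentration
# of real churners at the top of the list. Standard library only;
# a real deployment would use actual billing, usage and care data.
import math
import random

random.seed(7)

def make_subscriber():
    """Synthetic 'July' record plus an 'August' churn flag (Step 1)."""
    days_left = random.randint(0, 365)        # days remaining on contract
    usage_drop = random.uniform(-0.5, 1.0)    # fractional drop in minutes
    care_calls = random.randint(0, 6)         # customer-care contacts
    # Hidden ground truth: short contracts, falling usage and frequent
    # care calls all raise the true churn probability.
    z = -3.0 - 0.01 * days_left + 2.0 * usage_drop + 0.4 * care_calls
    churned = random.random() < 1 / (1 + math.exp(-z))
    return [1.0, days_left / 365, usage_drop, care_calls / 6], churned

data = [make_subscriber() for _ in range(4000)]

# Step 2: unbiased split into a learning set and a blind-test set.
random.shuffle(data)
learn, blind = data[:2000], data[2000:]

# Step 3: fit a logistic model by plain stochastic gradient descent.
w = [0.0] * 4

def score(x):
    """Churn score in (0, 1); higher means more likely to churn."""
    return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

for _ in range(150):
    for x, y in learn:
        err = score(x) - (1.0 if y else 0.0)
        for i in range(4):
            w[i] -= 0.05 * err * x[i]

# Steps 5-6: score the blind set, rank by churn score, and measure how
# concentrated the real churners are in the top 10 percent of the list.
ranked = sorted(blind, key=lambda rec: score(rec[0]), reverse=True)
top = ranked[: len(ranked) // 10]
base_rate = sum(y for _, y in blind) / len(blind)
top_rate = sum(y for _, y in top) / len(top)
lift = top_rate / base_rate
print(f"Base rate: {base_rate:.1%}, top-decile rate: {top_rate:.1%}, "
      f"lift: {lift:.1f}x")
```

Steps 7 and 8 would simply re-run the scoring on the next month’s data and periodically refit the weights.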
Improving existing retention
The purpose of a predictive churn model is to predict the likely behavior of a targeted group of subscribers. The retention focus should be placed on the group of subscribers with the highest churn scores, for example, the top 1.5 percent. This group’s size, the campaign size, should be a number that is manageable for the wireless provider from an economic, logistical and human resource standpoint.
In National Wireless’ case, let’s say National can contact 15,000 subscribers per month, or 1.5 percent of the total subscribers. Let’s say the model can successfully identify 22.5 percent of the total number of churners within the top 1.5 percent of the total subscribers with the highest churn score. Given National’s 18,000 churners per month, this campaign would reach 4,050 churners. This model would provide National a campaign list of 15,000 customer names, of which approximately 4,050 would churn and 10,950 would not churn. In effect, this group of 15,000 customers would have a voluntary churn rate of 27 percent, rather than the overall rate of 1.8 percent.
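The arithmetic behind this targeted list can be checked directly; the short sketch below simply restates the figures from the example:

```python
# Reproduces the National Wireless campaign arithmetic from the text.
total_subscribers = 1_000_000
monthly_churners = 18_000            # 1.8 percent voluntary churn
campaign_size = 15_000               # top 1.5 percent of churn scores
capture_rate = 0.225                 # share of all churners the model finds

churners_on_list = monthly_churners * capture_rate          # 4,050
non_churners_on_list = campaign_size - churners_on_list     # 10,950
list_churn_rate = churners_on_list / campaign_size          # 27 percent

print(f"{churners_on_list:,.0f} churners, "
      f"{non_churners_on_list:,.0f} non-churners on the list")
print(f"Campaign-list churn rate: {list_churn_rate:.0%}")
```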
So, while a predictive churn model will not perfectly predict each specific individual’s behavior, it can be particularly effective at predicting the behavior of a targeted group.
Available models
Two basic categories of solutions can provide predictive churn models: 1) in-house developed solutions using data mining/statistical tools; and 2) predictive churn applications.
Typically, with a data mining/statistical tools approach, a tool is purchased and models are developed by a statistician or data-mining expert. The expert must decide which analytical method or methods to use, which usually requires significant trial-and-error modeling to determine the best techniques for predicting wireless churn. It is not unusual to experience a two- to six-month initial model-building phase with this approach.
The expert’s level of model building expertise, familiarity with the data, and knowledge and experience with predictive churn models can have a direct and significant impact on the model’s accuracy and quality, and the ability to quickly build new models on an ongoing basis.
Predictive churn applications have different strengths and weaknesses, but all of them use some type of predictive modeling. A one-size-fits-all predictive churn application uses a single churn model developed by an expert, and that model is embedded in the application software code and used by all of the software vendor’s customers.
A single-algorithm predictive churn application uses a specific, often simple, analysis method and will automatically or semi-automatically build models using this method. Typically, these applications allow a non-expert to build models fairly easily. Certain compromises often are made in the process, however, and the models created do not measure up in accuracy relative to more sophisticated applications.
The most advanced predictive churn applications use leading data mining or statistical engines and modeling tools as a basis for building models. The best of these applications have been developed by a combination of experts in data mining, statistics and wireless churn.
These application types have easy-to-use front ends specifically devoted to building wireless churn models. Thus, a significant portion of the model-building time is eliminated relative to raw data mining tools or statistical tools. These applications also use a combination of modeling methods that take the best of each method to yield superior results. Thus, non-experts can build very accurate models in a relatively short time period. Also, models can be created dynamically as often as market conditions dictate using this approach.
Increasing profitability, retaining customers
Back to National Wireless again. Assume it will use predictive modeling as part of a proactive outbound retention campaign. As a reminder, National Wireless has 1 million subscribers and a voluntary monthly churn rate of 1.8 percent (18,000 churners per month). It has 50 retention specialists who can complete an average of 300 retention calls per month, which means that National’s retention team currently has the capacity to complete 15,000 calls each month.
The obvious question is, “Which 15,000 subscribers should be called this month?” So let’s look at some options for creating this list of 15,000 subscribers.
Option A: Randomly select 15,000 subscribers.
Option B: Use experience and market knowledge to make an intelligent guess: for example, contracts expiring within 60 days, subscribers with monthly revenue of more than $75, subscribers whose minutes of usage have dropped 50 percent since last month, etc.
Option C: Build the list in-house using data-mining tools.
Option D: Use a single-algorithm predictive churn application.
Option E: Use an advanced predictive churn application.
Options C-E would generate a list of customers ranked from the highest churn score to the lowest. The option that can “predict” the most churners within the top 15,000 subscribers would be considered the most accurate.
To determine which option creates the best list, Options B, C, D and E can be evaluated against Option A, the random selection, by following Steps 1-6 as previously described. The focus of the evaluation should be the accuracy of each option within the capacity of the campaign; in National’s case, the top 15,000 churn scores.
In a purely random selection of 15,000 subscribers, given National’s voluntary churn rate of 1.8 percent, you could expect to find 270 churners (15,000 multiplied by 1.8 percent). A churner could be defined as someone who will leave in the next 30 days if not acted upon.
Let’s say billing, customer care, activation and other data for July is used to build models, with August churn data also supplied.
Now, let’s assume Option B is used to create a model, and a ranked churn list is generated. National finds in reviewing the top 15,000 churn scores that 810 of them actually churned in August. Compared with random subscriber selection (Option A), this option has identified three times as many churners.
This ratio of 3-to-1 over random is referred to as “lift.” One could say Option B has a 3-to-1 lift. That means that by calling the same number of 15,000 subscribers, Option B would result in contacting 540 more churners than Option A.
Next, let’s look at the predictive modeling options in relation to Option A (random) and Option B (best guess). Let’s assume Option C returns a lift of 7-to-1. Thus, this list of 15,000 subscribers would include 1,890 churners, which is seven times more than random. This would result in contacting 1,620 more churners than with a random approach.
We’ll further assume that Option D returns a lift of 5-to-1. This list of 15,000 subscribers can be expected to include 1,350 churners (270 at random, times a lift of 5). Option D would result in contacting 1,080 more churners than a random approach.
Finally, let’s assume that Option E returns a lift of 15-to-1. This list of 15,000 subscribers can be expected to include 4,050 churners (15 times more than random). Option E would result in contacting 3,780 more churners than a random approach.
Obviously, in this particular example the best option would be Option E. Using the list of the top 15,000 churn scores for Option E would result in contacting over 2,000 more churners than any other option.
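The lift comparison across the five options reduces to a few lines of arithmetic. The sketch below uses the lift values assumed in the example:

```python
# Expected churners on a 15,000-name list for each option, using the
# lift values assumed in the text (Option A is the random baseline).
campaign_size = 15_000
churn_rate = 0.018
random_baseline = round(campaign_size * churn_rate)   # 270 churners

lifts = {"A (random)": 1, "B (best guess)": 3, "C (data mining)": 7,
         "D (single algorithm)": 5, "E (advanced application)": 15}

expected = {opt: random_baseline * lift for opt, lift in lifts.items()}
for opt, n in expected.items():
    print(f"Option {opt}: {n:,} churners "
          f"({n - random_baseline:,} more than random)")
```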
The financial benefits
In each case, the outbound campaign targets a list of 15,000 subscribers. Let’s assume National is able to retain 40 percent of the likely churners during this proactive campaign. For each of these retained subscribers, let’s also assume:
the additional average length of service achieved would be 12 months;
the average monthly revenue is $50;
the variable cost per month for these incremental subscribers is $15;
and the cost of the intervention is $100.
That means that for each subscriber retained, National would expect an incremental margin of $320 (12 months multiplied by a $35 monthly margin, less the $100 intervention cost).
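Using the assumptions above, the per-subscriber margin can be computed directly. Purely as an illustration (the article does not state a campaign total), the sketch also values one month of an Option E campaign:

```python
# Incremental margin per retained subscriber, per the article's
# assumptions, plus an illustrative monthly total for an Option E
# campaign that retains 40 percent of its 4,050 likely churners.
months_retained = 12
monthly_revenue = 50
monthly_variable_cost = 15
intervention_cost = 100

margin_per_save = (months_retained
                   * (monthly_revenue - monthly_variable_cost)
                   - intervention_cost)

churners_reached = 4_050             # Option E list (15-to-1 lift)
save_rate = 0.40
subscribers_saved = round(churners_reached * save_rate)      # 1,620
campaign_value = subscribers_saved * margin_per_save

print(f"Margin per retained subscriber: ${margin_per_save}")
print(f"Monthly campaign value (Option E): ${campaign_value:,}")
```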
As can be seen, the benefits of predictive churn models can be quite significant. Predictive churn modeling has emerged as a very effective component of a service provider’s overall customer-retention process.
With competition continuing to increase, the ability of a service provider to identify proactively those customers exhibiting the highest propensity to churn, coupled with lifetime value information and optimal intervention recommendations, is a strategic necessity.
John Andrews is a senior account manager for SLP InfoWare in Chicago.