Tom's Ten Data Tips - October 2011

Churn modeling

Churn modeling is the practice of determining a mathematical relation between customer characteristics and the likelihood that a customer will cancel or end a business relationship. These characteristics can be relatively “static” (like gender, or ZIP code) or “behavioral” (e.g.: number of support queries in the last month). A churn model calculates the likelihood (probability) that a customer will cease to do business with you in a given time period.

1. Churn Management Becomes More Important When Market Growth Begins To Taper Off

Managing churn is important in any market, but even more so in saturated or “mature” markets. As long as all competitors are (still) growing their customer base rapidly, churn may not be perceived as an important business phenomenon. The cell phone market looked like that a couple of years ago. But even there growth is starting to taper off. Providers are shifting attention from acquiring more customers, selling more handsets, to increasing the value of existing customers. This is done, for instance, by selling more elaborate calling plans, mobile internet access, etc. Similar developments apply in other (young) markets as well.

2. Churn Is Something Different For Subscription Business Versus Recurring Purchases

Although it makes sense to refer to “churn” for both subscription and recurring purchase scenarios, the differences are fundamental enough to merit some attention. In a subscription business (e.g.: cell phone contracts), the formal ending of the relationship is evident when the customer chooses not to renew the subscription. In a repeat purchase scenario (e.g.: pre-paid cell phones), a historic purchase pattern is discontinued, and then we say the customer has churned. However, fundamentally, you can never know whether the customer is pausing or ending usage. So you’ll need to apply some (perforce arbitrary) heuristic that defines how long a ‘pause’ in usage must last before it is considered churn.
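A minimal sketch of such a heuristic for the repeat purchase scenario. The 90-day cutoff, the function name and the customer records are all hypothetical; the right cutoff depends on your own purchase cycles.

```python
from datetime import date

def is_churned(last_purchase: date, as_of: date, max_pause_days: int = 90) -> bool:
    """Flag a repeat-purchase customer as churned once the pause exceeds the cutoff.

    The 90-day default is an arbitrary, illustrative choice.
    """
    return (as_of - last_purchase).days > max_pause_days

# Hypothetical customers and the date of their most recent purchase.
last_purchases = {
    "A": date(2011, 9, 20),  # bought recently
    "B": date(2011, 5, 1),   # silent for about five months
}

as_of = date(2011, 10, 1)
flags = {customer: is_churned(d, as_of) for customer, d in last_purchases.items()}
print(flags)
```

Rerunning the same flag with a few different cutoffs is a cheap way to see how sensitive your churn counts are to this arbitrary choice.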

3. Get Your “Churn” Definition ‘Right’

There are myriad ways or reasons why a customer at some point in time may cease to do business with you. First of all, it’s important to think about all of these reasons. Then get some conceptual clarity about which reasons you might be able to influence, and for which reasons there is no apparent remedy. Deceased customers, for instance, are a source of “lost business” but there’s probably no realistic way to avoid this. And sometimes, the initiative for closing the business relationship might not originate with the customer, but with the company instead. Such reasons might have to do with poor credit or payment behavior, or fraudulent activities.

If at all possible, every effort should be made to single out these different types of “churners” (customers whose business relationship has ended), and then to build a model on only those that display the type of behavior you are interested in influencing. All other records will ‘pollute’ your models, because their behavior patterns are included in your mining set just the same. The data mining algorithm doesn’t ‘know’ the difference. Since these records are in the “rare” category (you typically have way more customers still with you, compared to churners) you always need to be even more careful in handling data quality in that minority class.

4. It’s Not ‘Just’ About Churn

Churn modeling concerns the prediction of whether someone will abandon your business. In particular for subscription models (like a cell phone contract, gym membership, all kinds of online services, etc.), the prediction really serves as a proxy for customer value. Obviously, if customers terminate their contract, there isn’t much value to be had. But more importantly, even if they do continue their membership, the eventual value of this relationship often depends on their frequency of usage.

In cell phone contracts, those customers who exceed the allowed usage within their contract pay a premium for extra calling minutes or additional text messages (even though these services aren’t more expensive to ‘produce’). Roaming can be a real cash cow. For gym membership, additional value comes from drinks or food that people buy – which they only can buy if they actually come to the gym. Many internet services operate on an almost break-even basis, and only start to generate significant yield when cross- or up-sell opportunities materialize. What these cases have in common is that besides renewal of membership, it’s actually the usage that (mostly) drives customer profitability.

5. Only Compare Gains Charts With Identical Proportions Of Churners

The use of gains charts (or rather: cumulative gains charts) has become more or less standard practice to assess the predictive accuracy of churn models. When you see a gains curve (sometimes also called lift curve) that rises sooner (on the left hand side), you can infer that its predictive accuracy is greater than that of another curve in the same graph that runs lower.

There are some peculiarities with these gains charts, however, and one of them is that they are not “standardized.” What that implies is that if you want to make comparisons across gains curves, you need to account for the a priori proportion of churners (the 0/1 ratio) in the dataset. With only 1% churners, a good model can concentrate nearly all of them in the top few percent of scores; with 30% churners, even a perfect model can capture no more than a third of them in the top 10%. So a gains curve built on a dataset with only 1% churned records may look better than another curve built on 30% churned records, but this can be an illusion! In reality the former (with 1%) may be performing worse than the latter (with 30%), even though its curve runs higher.
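To make the effect of the churn proportion concrete, here is a small sketch. It computes the ceiling that a perfect ranking could reach at a given depth, and rescales an observed gain between random and perfect ranking. The rescaling (`relative_gain`) and all numbers are illustrative assumptions, not a standard taken from this newsletter.

```python
def max_gain(depth: float, churn_rate: float) -> float:
    """Share of all churners a perfect model captures in the top `depth` fraction."""
    return min(1.0, depth / churn_rate)

def relative_gain(observed_gain: float, depth: float, churn_rate: float) -> float:
    """Observed gain rescaled so that random ranking = 0 and perfect ranking = 1.

    Assumes 0 < churn_rate < 1 and 0 < depth < 1.
    """
    ceiling = max_gain(depth, churn_rate)
    return (observed_gain - depth) / (ceiling - depth)

depth = 0.10  # look at the top 10% of scores

# Hypothetical model A: 1% churners, captures 50% of them in the top decile.
rel_a = relative_gain(0.50, depth, churn_rate=0.01)

# Hypothetical model B: 30% churners, captures 30% of them in the top decile.
rel_b = relative_gain(0.30, depth, churn_rate=0.30)

print(f"A: raw gain 0.50, ceiling {max_gain(depth, 0.01):.2f}, relative {rel_a:.2f}")
print(f"B: raw gain 0.30, ceiling {max_gain(depth, 0.30):.2f}, relative {rel_b:.2f}")
```

Model A's curve runs higher at this depth, yet model B comes much closer to what a perfect ranking could achieve on its own data.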

6. Calculate “Break-Even” Tenure As Part Of Churn Analysis

Tenure is the opposite of churn. If the churn rate c per period is roughly constant, average tenure is the inverse of that rate: average tenure = 1/c. A monthly churn rate of 5%, for example, implies an average tenure of 20 months. Customer acquisition cost needs to be offset by exceeding some “break-even” tenure. It might be quite insightful for the business to be informed about this, in particular because churn rates, and therefore average tenure, might be quite different across segments. And since acquisition costs could differ across segments, too, it behooves the analyst to calculate break-even tenure for all relevant customer segments. This often gives the business some feel for sensible business practices.
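A rough sketch of that per-segment calculation. It assumes a constant churn rate per month and ignores discounting, so break-even tenure is simply acquisition cost divided by monthly margin. The segment names and all figures are made up for illustration.

```python
# Hypothetical segments; every figure here is invented for illustration.
segments = {
    "contract": {"monthly_churn": 0.02, "acquisition_cost": 200.0, "monthly_margin": 10.0},
    "prepaid": {"monthly_churn": 0.05, "acquisition_cost": 60.0, "monthly_margin": 5.0},
    "promo": {"monthly_churn": 0.10, "acquisition_cost": 120.0, "monthly_margin": 8.0},
}

def average_tenure(churn_rate: float) -> float:
    """Average tenure in periods, assuming a constant churn rate per period."""
    return 1.0 / churn_rate

def break_even_tenure(acquisition_cost: float, margin_per_period: float) -> float:
    """Periods needed to earn back the acquisition cost (no discounting)."""
    return acquisition_cost / margin_per_period

report = {}
for name, seg in segments.items():
    tenure = average_tenure(seg["monthly_churn"])
    break_even = break_even_tenure(seg["acquisition_cost"], seg["monthly_margin"])
    report[name] = {
        "average_tenure": tenure,
        "break_even": break_even,
        "pays_back": tenure > break_even,
    }
    print(f"{name}: average tenure {tenure:.1f} months, "
          f"break-even {break_even:.1f} months, pays back: {tenure > break_even}")
```

In this made-up example the "promo" segment churns before its acquisition cost is recovered, which is exactly the kind of finding worth reporting to the business.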

7. Every Modeling Algorithm Has Pros And Cons

There have been several studies on churn modeling, comparing various data mining algorithms. In a more generic context, these same data mining algorithms have been compared even more frequently. Yet the outcomes are largely inconclusive. For one thing, familiarity of the analyst with any particular algorithm has a large impact. If you’re most familiar with regression models, then those are likely to perform best. Duh. There are also idiosyncrasies of a particular data set that may sway results in favor of one algorithm over another. Besides some generic findings across algorithms, the practical algorithms of choice for churn modeling have been mostly decision trees and rule-based algorithms, or various types of regression models. Which is best?

From a “purely” predictive viewpoint, regression models often seem to do as well as decision trees, or to outperform them. Then why do people continue to use decision trees? For one thing, their results are more transparent (certainly to the layman), consequently fostering superior learning (see also a previous newsletter on Decision Trees). Secondly, by inserting business knowledge into these models, they can rely on more than ‘merely’ plain available data (so-called “model engineering”), thus outperforming models that can ‘only’ rely on data. Thirdly, the rule-based deployment of decision trees makes them (considerably) easier to maintain if the model needs to be updated. So consider all these pros and cons (and others!) before you settle on your algorithm of choice.

8. Being Able To Predict Churn Doesn’t Mean You Can Prevent It

In some scenarios, it turns out to be eminently possible to predict churn, even though by all reasonable accounts it may seem (nearly) impossible to do much to prevent it. These two things need to be distinguished, and should be monitored accordingly. Doing so requires a methodologically sound empirical design of your churn campaigns. Think data scientists. If you run your business by the seat of the pants, you may or may not do well, but you won’t learn what’s working for you.
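One common way to set up such a design is a randomly assigned control group that receives no retention offer, so that prediction and prevention can be measured separately. The sketch below is only an illustration under that assumption; the function names, the 20% control share and the outcome flags are hypothetical.

```python
import random

def assign_groups(customer_ids, control_share=0.2, seed=42):
    """Randomly split high-risk customers into (treated, control) lists."""
    rng = random.Random(seed)
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    n_control = round(len(shuffled) * control_share)
    return shuffled[n_control:], shuffled[:n_control]

def retention_effect(treated_churned, control_churned):
    """Churn rate in the control group minus churn rate in the treated group.

    Inputs are lists of 0/1 churn flags observed after the campaign.
    A value near zero suggests the campaign did not prevent churn,
    however well the model predicted it.
    """
    def rate(flags):
        return sum(flags) / len(flags)
    return rate(control_churned) - rate(treated_churned)

treated, control = assign_groups(range(100))
print(len(treated), "treated,", len(control), "held out as control")

# Hypothetical outcomes: 1 = churned, 0 = retained.
effect = retention_effect(treated_churned=[1, 0, 0, 0], control_churned=[1, 1, 0, 0])
print("churn prevented (rate difference):", effect)
```

With real campaign sizes you would also want a significance test on the difference before drawing conclusions.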

9. Churn Models Are Intricately Related To LTV

There is an exponential relation between churn rates and Life Time Value (LTV). Because the LTV is a function of the number of years of tenure, you get an exponential function. For instance, a relatively simple model would be: Customer Lifetime Value (CLV) = m * r / (1 + i - r), where m = annual customer profit, r = retention rate, and i = interest (discount) rate (Gupta & Lehmann, 2005).

Because of this exponential relation, and humans’ inability to intuitively grasp the quantitative impact of such parameters (for the same reasons we find logarithms ‘difficult’), people tend to underestimate the effect of churn on LTV. See also a previous newsletter on Life Time Value.
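A short sketch of the simple model CLV = m * r / (1 + i - r) shows how quickly value climbs as retention improves. The profit figure, discount rate and retention rates are made-up inputs for illustration only.

```python
def clv(m: float, r: float, i: float) -> float:
    """Simple customer lifetime value: m * r / (1 + i - r).

    m = annual customer profit, r = annual retention rate, i = discount rate.
    """
    return m * r / (1.0 + i - r)

annual_profit = 100.0   # hypothetical
discount_rate = 0.10    # hypothetical

values = {r: clv(annual_profit, r, discount_rate) for r in (0.80, 0.90, 0.95)}
for r, value in values.items():
    print(f"retention {r:.0%} (churn {1 - r:.0%}): CLV = {value:.2f}")
```

With these inputs, halving churn from 20% to 10% raises CLV from roughly 267 to 450, and halving it again to 5% adds about another 183.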

10. Churn Models Need To Be “Future Proof”

You build a churn model on a data set that has been gathered on the basis of historical data: you relate historical customer profiles to some churn flag which you’ve (also) observed in the past. Just like with direct response models we then assume that past correlations will continue to apply in the future. We test the model on some hold-out sample, and then implement it. But the model shouldn’t just work well on the hold-out sample; rather it should work well on a sample of ‘future’ data. This time shift is (even) more important when predicting churn (as opposed to direct response) because you need to be able to predict sufficiently far out into the future so that you still have a chance to change your imminent ‘fate.’ As usual, we build predictive models in order to (try and) prove them wrong.

Once a model is in place, there will always be replacement/update costs. Hence, a model that has slightly lower immediate predictive accuracy, but longer ‘shelf-life’, might well be preferable over another model that predicts (slightly) better now. In particular when it’s cumbersome or expensive to replace the model. But most importantly, the prediction needs to pertain to a point in time that is sufficiently far out in the future that you have enough opportunity to act upon it and retain the customer.
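The time shift can be sketched as follows: profiles are frozen at a snapshot date, a lead period is left open for the retention action, and the churn flag is taken from an outcome window after that. The 30-day lead, the 60-day window, the snapshot dates and the choice to exclude customers who churn during the lead period are all illustrative assumptions.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

def outcome_window(snapshot: date, lead_days: int = 30,
                   outcome_days: int = 60) -> Tuple[date, date]:
    """Return (start, end) of the period in which churn is labeled; end is exclusive."""
    start = snapshot + timedelta(days=lead_days)
    end = start + timedelta(days=outcome_days)
    return start, end

def churn_label(churn_date: Optional[date], snapshot: date,
                lead_days: int = 30, outcome_days: int = 60) -> Optional[int]:
    """1 = churned inside the outcome window, 0 = did not, None = exclude the record."""
    start, end = outcome_window(snapshot, lead_days, outcome_days)
    if churn_date is None or churn_date >= end:
        return 0      # still a customer throughout the outcome window
    if churn_date < start:
        return None   # churned before the window opened: too late to act on
    return 1

# Build the model on one snapshot, then test it on a later one ('future' data).
train_snapshot = date(2011, 1, 1)
test_snapshot = date(2011, 4, 1)

print(outcome_window(train_snapshot))
print(outcome_window(test_snapshot))
```

Scoring the model on the later snapshot, rather than on a hold-out sample from the same period, gives a first impression of its 'shelf-life'.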

Further reading

Some excellent books on Churn modeling:

Mastering Data Mining: The Art and Science of Customer Relationship Management.
Michael Berry & Gordon Linoff (1999)

ISBN# 0471331236

Managing Customers as Investments – The Strategic Value of Customers in the Long Run
Sunil Gupta & Donald Lehmann (2005)

ISBN# 0131428950

Contact
XLNT Consulting
Tom Breur, Principal

Telephone
+31-6-463 468 75

Address
Langestraat 8-03
5038 SE Tilburg
the Netherlands