Customer Lifetime Value via Retention — in Python, using Pandas and Numpy

On a Subscription Based Example

Published in

Towards Data Science

7 min readNov 8, 2020

Consider these as primers to customer lifetime value: these 3 parts are quite easy to digest and will give you a good intro to LTV. In Part 1, I went through the logical piece, so if you’re interested, go check it out! In (a small) Part 2, I showed an example of how cohort retention adds up to an average weighted lifetime of a customer. And here, in Part 3, I have added a Python blueprint code that you can use and improve upon to extrapolate your customer LTV.

Parts and Contents

Part 1: Estimating Customer Lifetime Value Via Cohort Retention, CLV or LTV as they call it

Part 2: Weighted Cohort Lifetime, As a Sum of Cohort Retention, not a Proof but an Experiment

Part 3: Customer Lifetime Value via Retention — in Python, with Pandas and Numpy

Before I begin here, I would like to re-iterate that LTV is not a financial measure, merely an estimate, however, it allows for much better approximation than straightforward guesswork. One of the most important LTV uses is a benchmark for customer acquisition cost because it allows setting a reasonable cap on the spend.

Purpose

The purpose of a series is to elaborate on a simple technique that will give you an estimate of a customer lifetime value using extrapolation of historical trends. In Python, you can do it with Pandas and NumPy — it rarely gets easier. Therefore, in this post, we will not dig into more advanced techniques that involve individual LTV predictions. I am thinking of writing a separate blog post about that. Lastly, you’d want to take some add-on improvements listed at the bottom into consideration.

Dataset and Assumptions

For this post, we are going to generate dummy monthly subscription data. We’ll need customer id, subscription date, maturity till the subscription was cancelled or till current date otherwise, and average subscription value.

For this post, I will generate 34 months of data. If you do not have this much data, you’d have to make more assumptions about cohort behavior and lifetime. Mapping retention curve and playing with decay rate might be of help for this purpose.

Python Implementation

Dummy Data

Overall, we will follow the logic laid out in Part I of the series: 1) get the cohort matrix; 2) get the marginal retention; 3) extrapolate marginal retention; 4) extrapolate (1) using (3). Estimated lifetime by cohort will be the sum of point (4) by row.

To start, we need a dataset for subscriptions. I picked up some techniques from Cohort Analysis with Python blog post, which has nice ideas of dummy data generation and visualization for cohorts.

Below are several helper functions that we’ll use to generate: a) customer list b) subscription date, maturity and value of the subscription c) bring a and b together in one dataframe.

*Helper functions for customer names, courtesy to* *Fabian Bosler* *and his blog post*

Helper functions for random subscription dates and resulting dataframe

When it all comes together, our random dataframe with 4000 rows looks somewhat like this. The last column is beginning of month for subscription date, which is equivalent to cohorts.

Cohort Retention

Cohort retention is the proportion of acquired customers that reached maturity T. Maturity in this case is the time till subscription was cancelled, or — if the subscription is active till current date, time between subscription start date and today.

A useful function is to plot heatmap for retention.

One of the ways to get there from our data is in 2 steps: 1) pivot the dataframe above to get the count of cohort customers that reached the maturity 2) add up all customers from T+1 to T, because all customer who are active in T+1 were active in T.

Dividing active subscribers in months by cohort size finally gets us to the triangular retention matrix, where each value is between [0,1]. Based on retention matrix below you can already make assumptions about average lifetime and lifetime value, which can be useful for selecting customer groups in marketing campaigns.

Applying heatmap plotting function we’ve created above, we can have a look at how cohorts retain over 34 months.

Cohort Retention as-is, you can see the extrapolated version below and compare

Marginal Retention

Marginal retention is a little different: it takes every column T+1, where T ≠ 1, and divides it by T. As a result, we get a share of previous month’s subscribers that have resubscribed the next month.

Taking an average of the previous N rows to extrapolate the matrix:

You see how marginal retention gets averaged and extrapolated in a matrix. If you expect changes in marginal retention, that’s a good time for adjustements.

Bringing It All Together

After the extrapolated marginal retention and cohort retention matrices are ready, we can extrapolate retention matrix for cohorts.

You can see how cohort retention got extrapolated on a heatmap:

Lifetime and Value

Lifetime of an average customer in a cohort, or an average cohort lifetime, is a sum of retention percentages across maturity.

Further Improvements

Obviously, this method gives you an average lifetime and value of a cohort, so you can compare it with CAC spent for this acquisition time. It can also be helpful if you want to target with LTV “above average”, adjusted for the time of acquisition. However, there might be situations when you want to predict LTV of a particular individual customer. For that, you can use predictive modeling, which I will address later.

Another low-hanging improvement is adding operational margin to subscription value. It will leave you with a better comparison base with Customer Acquisition Cost. The difference will tell you how much is left for the business once the customer is attracted and the first order is delivered.

For those ones of us who are familiar with the concept of discounting in finance (def. from Investopedia: the process of determining the present value of a payment or a stream of payments that is to be received in the future.) — it can be incorporated into your estimates.

Another improvement you could consider is seasonality of your business and considering it in extrapolation you are doing. The seasonality can be spotted when you look at the heatmap: the patterns will be pretty obvious and are likely to form a diagonal line.

Similarly to seasonality, there might be an adjustment to cohort or marginal retention that you would like to introduce based on primary knowledge. For example, if you run a campaign to impact retention or increase engagement, it is possible to include a rate of change in your calculations. You can get some inspiration in A Spreadsheet for Calculating Subscription Lifetime Value, which is a really neat blog post — with a spreadsheet.

Speaking of calculations in retention, as you might have noticed, the code takes simple rolling average with a window of N, if N values are available. Cohort size weights can be used for the weighted average.

Lastly, because half of the cohort and marginal matrices are filled element-wise in my code, calculations might run slow on larger matrices, so there is definitely a room for improvement there.

And finally, I am sharing my Colab notebook for those of you who’re interested in making a copy: https://colab.research.google.com/drive/1Viht58-NSW-IdjBP1ksbrX7_q4DSN5h3?usp=sharing

If you would like to connect with me, you can find me on LinkedIn and Twitter

https://www.linkedin.com/in/areusova/
https://twitter.com/khunreus

Stay well!