The value of a single business line, and its impacts

A services client approached the agency wondering about the lifetime value and annual business impact of customers who originated within a particular business line.

Assumptions and sourcing the data

The client provided five years of transaction data covering every unique customer home address that first entered the business’s customer database through the business line of interest. The data included unique keys, demographics, services used, per-check spend, and transaction location. A key limitation of this analysis was the length of the look-back window the client was able to provide. For many categories of business, five years would be a good sample on which to base a lifetime value (LTV) analysis, but in this category long relationships are common. For many customers, five years represented only a fraction of a relationship that was known to often last decades.

Deliverables

To meet the needs outlined by the client, the agency built an interactive Excel-based dashboard and provided a detailed interpretation of its results.

Process

After an initial pass of data preparation and exploratory data analysis (EDA), each unique customer was matched by address to the client’s existing segmentation system and assigned to a cohort reflecting the year in which they first transacted with the client. This approach made it easy to track the years in which a customer transacted again and to tie each of those years back to the customer's starting year.
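The cohort assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the agency's actual code: the record layout `(customer_key, transaction_year)`, the sample rows, and the function name `build_cohort_counts` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_key, transaction_year).
# The real data also carried demographics, services, and spend.
transactions = [
    ("A", 2015), ("A", 2016), ("A", 2018),
    ("B", 2015),
    ("C", 2016), ("C", 2017),
]

def build_cohort_counts(transactions):
    """Count active customers per (cohort year, years since first transaction)."""
    # A customer's cohort is the earliest year in which they transacted.
    first_year = {}
    for key, year in transactions:
        first_year[key] = min(year, first_year.get(key, year))

    # Collect the distinct customers seen in each cohort/offset cell,
    # so repeat visits within one year are not double counted.
    active = defaultdict(set)
    for key, year in transactions:
        cohort = first_year[key]
        active[(cohort, year - cohort)].add(key)

    return {cell: len(keys) for cell, keys in active.items()}

cohort_counts = build_cohort_counts(transactions)
```

Each key of `cohort_counts` is one cell of a cohort table: the cohort year paired with the number of years since the customer's first transaction.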

 

This table shows the count of customers in each cohort from cohort year zero through year four, giving five total years of observations for the earliest cohort.

With the customers grouped by starting year and counted in each subsequent year, a line-fitting exercise was conducted to determine the function that best represented the data. To balance the risk of overfitting a limited sample against the need to represent the observations adequately, a logarithmic decay curve was selected and fitted. The post-observation projections from this curve can be seen in the table below.
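As a rough illustration of this kind of fit, the sketch below fits a curve of the form retention = a + b * ln(year + 1) by ordinary least squares and projects it forward. The exact functional form and fitting method used in the engagement are not stated, so this form is an assumption, and the retention shares are invented for the example.

```python
import math

def fit_log_decay(years, retention):
    """Least-squares fit of retention = a + b * ln(year + 1)."""
    x = [math.log(t + 1) for t in years]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(retention) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, retention))
    b = sxy / sxx              # slope on the log scale (negative for decay)
    a = mean_y - b * mean_x    # fitted retention at year zero
    return a, b

def predict_retention(a, b, year):
    """Projected share retained, clipped to the range [0, 1]."""
    return min(1.0, max(0.0, a + b * math.log(year + 1)))

# Invented retention shares for years zero through four (illustrative only).
observed_years = [0, 1, 2, 3, 4]
observed_retention = [1.00, 0.70, 0.60, 0.54, 0.50]

a, b = fit_log_decay(observed_years, observed_retention)

# Project years five through nine, beyond the observed window.
projection = {t: predict_retention(a, b, t) for t in range(5, 10)}
```

A two-parameter curve like this is deliberately simple: with only five observed points per cohort at most, a more flexible function would be likely to chase noise.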

In the table above, the grey highlighted cells show the predicted mid-point for future customer retention, projected out to the ninth year after the first visit.

With the logarithmic curve summarizing the relationship between attrition and time, and with basic summary statistics for annual visits per customer and average visit value, it is straightforward to estimate both expected lifetime and lifetime value by customer type.
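One simple way to combine these pieces is shown below. The sketch assumes that expected lifetime is the sum of the retained shares over the projection horizon and that lifetime value is undiscounted; neither detail is stated for the original analysis, and the input figures are invented.

```python
def expected_lifetime(retention_by_year):
    """Expected active years: the sum of the share retained in each year."""
    return sum(retention_by_year)

def lifetime_value(retention_by_year, visits_per_year, value_per_visit):
    """Undiscounted lifetime value over the supplied horizon."""
    annual_value = visits_per_year * value_per_visit
    return expected_lifetime(retention_by_year) * annual_value

# Invented inputs: retention from year zero onward, plus summary statistics.
retention_curve = [1.0, 0.7, 0.6, 0.5]
lifetime_years = expected_lifetime(retention_curve)
ltv = lifetime_value(retention_curve, visits_per_year=2.0, value_per_visit=100.0)
```

Because the horizon is truncated at the last projected year, an estimate built this way understates lifetime value for a business where relationships can last decades.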

Here we can see not only the estimated lifetime of a customer, but also annual charges by customer type, where type reflects use of the service line of interest and of other service lines. As in many businesses, customers who use multiple service lines have both the highest annual charges and the longest lifetime with the company, which gives this multi-service-line group the highest lifetime value to the client. This answers the first of the client's two questions: what is the average value of a customer of the service line of interest? The second question remains: what value do customers of the service line of interest bring to the other service lines? This question is simpler to answer because the data already controls for service line of entry. A simple chart can therefore describe the average annual charges that customers who entered the system via the service line of interest contribute to other service lines.
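The aggregation behind such a chart might look like the following sketch. The record layout, the service line names, and the choice of denominator (every customer-year with any activity) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical charge records: (customer_key, year, service_line, amount).
charges = [
    ("A", 2015, "line_of_interest", 200.0),
    ("A", 2015, "other_line_1", 300.0),
    ("A", 2016, "other_line_1", 100.0),
    ("B", 2015, "other_line_2", 50.0),
]

def average_annual_charges_by_line(charges, exclude_line):
    """Average charges per active customer-year for every line except one."""
    totals = defaultdict(float)
    customer_years = set()
    for key, year, line, amount in charges:
        # Every record counts toward the denominator, including the entry line.
        customer_years.add((key, year))
        if line != exclude_line:
            totals[line] += amount
    n = len(customer_years)
    return {line: total / n for line, total in totals.items()}

other_line_value = average_annual_charges_by_line(charges, "line_of_interest")
```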

Value Service Line of Interest Customers Provide to Other Lines of Service ($)

This chart shows, in dollars, the annual contribution that customers who entered the business through the service line of interest make to other service lines. The client went on to examine what share of those other service lines' total revenues these contributions represent.

Further Study

After the Excel-based tool and initial findings were shared with the client, the agency was asked to identify key indicators for customers likely to attrite between years zero and one, as this was the period of steepest decline. Using demographic data and customer-matched satisfaction survey data, an interpretable classification model based on logistic regression was built in Python. The model identified key features that were strongly associated with first-year attrition, and it prompted additional internal research and change proposals by the client.
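To illustrate why logistic regression suits this kind of question, the sketch below fits one with plain gradient descent and reads the coefficients as odds ratios. The library and features used in the engagement are not stated, so the two features (a satisfaction score scaled to 0-1 and a multi-line flag), the data, and the function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Gradient of the log loss: (predicted probability - label) * input.
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def predict_probability(weights, bias, x):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Invented rows: [satisfaction score (0-1), uses multiple lines (0/1)].
rows = [[0.2, 0], [0.3, 0], [0.4, 1], [0.9, 1],
        [0.8, 0], [0.7, 1], [0.1, 0], [0.6, 1]]
labels = [1, 1, 0, 0, 0, 0, 1, 1]  # 1 = attrited within the first year

weights, bias = fit_logistic(rows, labels)

# exp(weight) is the multiplicative change in the odds of attrition
# for a one-unit increase in that feature.
odds_ratios = [math.exp(w) for w in weights]
```

The appeal of this model family for a client-facing study is that each coefficient maps directly to an odds ratio, so the drivers of first-year attrition can be explained without a separate interpretation layer.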