According to Wikipedia, segmentation is a marketing strategy that involves dividing a broad target market into subsets of consumers, businesses, or countries who have, or are perceived to have, common needs, interests, and priorities, and then designing and implementing strategies to target them. Segmentation allows companies to better target marketing messages (emails, ads) and interventions (price differentiation) to optimize product positioning, monetization, and retention of customers.
Getting segmentation right is crucial for companies. The internet is full of research reports demonstrating that effective segmentation can have a dramatic impact on customer conversion as well as on marketing costs.
Traditionally, the segmentation game is about striking the right balance between the granularity of segments and their manageability. In theory, the more granular your segmentation, the more precisely you can target. However, granularity also comes with costs: overly granular segments can not only create messaging confusion but also drive up marketing costs.
In recent times, as our ability to track every granular digital interaction with customers grows, our ability to decipher what those interactions are cumulatively telling us about customer behavior diminishes. This increases the likelihood that important behavioral signals get ignored, misinterpreted, or over-interpreted, resulting in bad segmentation.
Thankfully, machine learning and big data technologies make it possible to cut out the noise and amplify the signal.
An Example of Data Science Based Segmentation
We started with the hypothesis that customer interactions can predict conversion and retention (or attrition) of customers. We further hypothesized that some interactions are far more important than others in predicting conversion and retention.
To test this hypothesis, we grouped the available customer interaction data into two buckets:
- A large bucket of data (up to the current time minus one month) as input to the statistical model
- The previous month's data as a reference point against which to compare the output of the statistical model
In other words, the first bucket is used to build the “formula”. We then feed the variables from the second bucket into this formula and compare the results against the outcomes observed in that second bucket.
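To make this concrete, here is a minimal sketch, in Python, of what such a two-bucket setup might look like. The file name, the column names (customer_id, month, interaction_1 through interaction_12, converted) and the use of a simple logistic regression are illustrative assumptions, not a description of the exact model we used.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical input: one row per customer per month with anonymized
# interaction counts (interaction_1 .. interaction_12) and an observed
# `converted` flag. All names here are placeholders.
interactions = pd.read_csv("customer_interactions.csv", parse_dates=["month"])

cutoff = interactions["month"].max()                       # most recent month
train = interactions[interactions["month"] < cutoff]       # bucket 1: up to current time minus 1 month
reference = interactions[interactions["month"] == cutoff]  # bucket 2: previous month, for comparison

feature_cols = [f"interaction_{i}" for i in range(1, 13)]

# Build the "formula" from bucket 1. Logistic regression is just one
# reasonable choice of statistical model for this sketch.
model = LogisticRegression(max_iter=1000)
model.fit(train[feature_cols], train["converted"])
```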
The following is an example of the results we got. Customer interactions have been anonymized.
Interaction 3 has an inverse correlation, i.e. a reduction in that type of interaction significantly helps customer conversion. (Without going into specifics, this factor was actually related to an unintended flaw in the customers’ digital journey, which was subsequently addressed through technical changes.) This example also shows how the approach can flag the relative importance of one specific hindrance versus others and can serve as direction for prioritizing their removal. For example, Interactions 10 and 11 were similar obstacles in the customer journey, but they did not affect conversion nearly as much as Interaction 3.
Interactions 8 through 12 failed to register any meaningful impact. While this is easy to notice in the table above, in the absence of a statistical model it was difficult to discriminate between many of these interactions. In other words, they would all have factored into segments even though they register no impact.
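One quick way to surface this kind of signal is to look at the fitted coefficients of the hypothetical model from the sketch above; the variable names are the same placeholders.

```python
# Rank the anonymized interactions by the size of their fitted coefficients.
# A clearly negative coefficient would correspond to an inverse correlation
# (as with Interaction 3); near-zero coefficients correspond to interactions
# with no meaningful impact (as with Interactions 8 through 12).
importance = pd.Series(model.coef_[0], index=feature_cols)
print(importance.sort_values(key=abs, ascending=False))
```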
Based on this model, we could predict customer conversion 83% of the time. In other words, the observed results (both conversion and the lack thereof) and the model’s predictions agreed 83% of the time, which is pretty darn good.
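Continuing the same hypothetical sketch, that agreement figure is simply the accuracy of the model’s predictions against the observed outcomes in the reference bucket.

```python
from sklearn.metrics import accuracy_score

# Compare model predictions against observed outcomes in the reference month.
predicted = model.predict(reference[feature_cols])
agreement = accuracy_score(reference["converted"], predicted)
print(f"Model and observed outcomes agree {agreement:.0%} of the time")
```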
We then used the model to assign a conversion probability to each customer. Here is a randomized example. Based on these probabilities, we can then group customers into three (or more) buckets (a sketch of this step follows the list):
- Top bucket: customers who have a high likelihood of conversion. We can refrain from heavy marketing activity for this segment, as they have already “indicated” that they will convert in a matter of time.
- Middle bucket: where we could target marketing heavily. These are the people sitting on the fence, and we could persuade them to convert.
- Bottom bucket: where the cost to convert is likely not worth the effort required, and we could choose to ignore it. This group of customers also raises broader questions around product positioning and value proposition that, when taken to their logical conclusions, may lead to valuable insights for product management.
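Continuing the hypothetical sketch from earlier, assigning probabilities and cutting them into three buckets might look like this. The equal-sized tercile split is only an illustrative choice; in practice the thresholds would be tuned to marketing budget and campaign reach.

```python
# Score each customer in the reference bucket and cut the probabilities
# into three equal-sized buckets (bottom / middle / top).
scored = reference.copy()
scored["p_convert"] = model.predict_proba(scored[feature_cols])[:, 1]
scored["bucket"] = pd.qcut(
    scored["p_convert"],
    q=[0, 1/3, 2/3, 1],
    labels=["bottom", "middle", "top"],
)
```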
These probabilities, in a way, capture all the intelligence of the 12 factors we used to build the model; therefore, once we factor these probabilities into segments, we do not need to consider those 12 factors again in the segmentation logic.
If one believes that the long-tail rule holds true in marketing, one has to agree that some customer interactions are far more important than others; we just do not definitively know which ones. When this approach is used in conjunction with demographic and other factors in the segmentation scheme, it can lead to a high level of optimization of segments.
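As a rough illustration of that combination, the probability-based bucket could be joined with demographic attributes to form the final segments. The demographics file and its columns (customer_id, region, tenure_band) are hypothetical placeholders.

```python
# Fold the probability-based bucket into a broader segmentation scheme
# alongside demographic attributes (all names are placeholders).
demographics = pd.read_csv("customer_demographics.csv")

segments = scored[["customer_id", "bucket"]].merge(demographics, on="customer_id")
segments["segment"] = (
    segments["region"] + " / "
    + segments["tenure_band"] + " / "
    + segments["bucket"].astype(str)
)
```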
This exercise is not something one would do only once. Each optimization of the segmentation will result in further refinement of interactions, which over a reasonable time will rearrange the weights of the various interactions in the model. The model will then provide more refined insights, completing a closed-loop iterative cycle.