Moovly can now predict who will pay to subscribe within 24 hours with 82% accuracy

Moovly is a successful SaaS startup that helps people create rich multimedia content like animated presentations and professional-looking videos in no time. Moovly has seen exponential registered user growth in the last 12 months coupled with similar revenue trends.

The challenge:

In anticipation of further growth, Moovly sought to understand its user behavior trends in more detail before implementing advanced sales and marketing growth strategies and associated systems.

That's why Moovly came to us with three simple questions:

  1. Can we easily distinguish forever-free users of the platform from paying customers?
  2. How quickly can we predict who will subscribe and who will not?
  3. Can we build a predictive model based on measurable, objective anonymous data to better interact with users and drive conversion?

We asked for and were given three data sets:

  • Anonymous summary of all users (with enough subscribers and non-subscribers), including the dates they opened an account, a summary of activity, etc. We got back 56 metrics in total.
  • Anonymous summary of all projects people ever created at Moovly, with their metrics and details. 40 metrics in total.
  • Anonymous summary of all transactions made (by design this only included paying users - transactions and subscriptions).

What we delivered back was:

  • A prioritized list of the most important metrics that drive the decision to subscribe, clearly showing that only 6 out of the total 277 were relevant (with 97% correlation accuracy).
  • Models to predict the number of days from the first sign up of an anonymous user to converting to a paying subscriber.
  • Marketing strategy to split freemium users into cohorts based on the number of days since opening an account.
  • Models for predicting whether or not an anonymous user will subscribe in a specific given period of time (see 2. above).
  • A consolidation of the above into an easy-to-implement scoring model to improve user interaction and influence conversion.
  • Marketing spend trade-off model to determine optimum profitability and maximum result expectation per marketing spend.

Here is how we did it:

1. We tried to model and predict the new Performance Metric: # Days to Subscription, and it worked.

We first took all users from the training sample and computed the number of days to subscription. For all subscribers this was easy - we know when a person opened a free account, and when that person first subscribed.

This simple exercise gave us a bonus insight: the Moovly product-market fit is so amazing that 49% of all their subscribers opt for paid services in the first two weeks after opening an account!

Then we looked at a group of freemium users who definitely did not subscribe within 6 months of opening an account. Some of these people were using the app a lot, some not so much, but we knew precisely that all of them were 'forever-free'. So, we decided to set their number of days to subscription to a very large number - 210 days (7 months) - which is virtually never.
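The target described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the record layout and the sample dates are made up, and it assumes the users without a subscription date have already been filtered to those observed for at least 6 months.

```python
from datetime import date
from typing import Optional

# The "virtually never" value from the text (7 months).
NEVER_SUBSCRIBED_DAYS = 210


def days_to_subscription(opened: date, subscribed: Optional[date]) -> int:
    """Days from opening a free account to the first subscription.

    Users with no subscription date are treated as 'forever-free'
    and receive the large placeholder value.
    """
    if subscribed is None:
        return NEVER_SUBSCRIBED_DAYS
    return (subscribed - opened).days


# Made-up records for illustration only, not Moovly data.
users = [
    {"id": "a", "opened": date(2015, 1, 1), "subscribed": date(2015, 1, 9)},
    {"id": "b", "opened": date(2015, 1, 1), "subscribed": None},
]
targets = {u["id"]: days_to_subscription(u["opened"], u["subscribed"]) for u in users}
print(targets)  # {'a': 8, 'b': 210}
```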

Then we plugged this new Key Performance Indicator together with 56 other behavior metrics into DataStories and let the magic happen.

It turns out that only three metrics are sufficient to robustly and accurately predict the time to subscription, and so to distinguish between subscribers, who have a low number of days till subscription, and 'forever-free' users, who we assume will not subscribe before day 210:

  1. # logins since opening an account,
  2. # projects created since opening an account,
  3. # logins before the second welcome email.

See example predictions of a sample of 196 users BELOW. The first 97 are paid subscribers, and the rest are non-subscribers.

Predicting "Days to Subscription" helps distinguish subscribers from "forever-free" users with 97.2% correlation accuracy

HOVER over the graph for more info. Note that we can accurately predict the small number of days for subscribers (users #1-#97), albeit with some errors, and very accurately predict the long time to subscription for non-subscribers (users #98-#196).

We can also discover outliers. Automatically.

Note the 'outlying' predictions, which are way off from the truth.

Note the two pink spikes for users #28 and #37, where we predict that these people will only subscribe on days 194 and 188 (which means not at all under our assumptions), while they actually did subscribe on the first day.

DataStories consistently predicted that these users would not subscribe, and at the end of 10,000 cross-validations marked them as outliers.
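How DataStories marks outliers internally is not described here, so the following is only a rough sketch of the general idea: flag a user when the prediction misses the truth by a wide margin in nearly every cross-validation round. The function name, the tolerance, the 95% cut-off and the sample numbers are all assumptions made for illustration.

```python
from typing import Dict, List


def consistent_outliers(
    actual: Dict[str, float],
    predictions: Dict[str, List[float]],
    tolerance: float,
    min_fraction: float = 0.95,
) -> List[str]:
    """Return users whose predictions miss the truth by more than
    `tolerance` in at least `min_fraction` of the validation rounds."""
    flagged = []
    for user, truth in actual.items():
        preds = predictions.get(user, [])
        if not preds:
            continue  # no predictions recorded for this user
        misses = sum(1 for p in preds if abs(p - truth) > tolerance)
        if misses / len(preds) >= min_fraction:
            flagged.append(user)
    return flagged


# Made-up numbers: days to subscription (truth) and per-round predictions.
actual = {"u28": 0, "u5": 3}
predictions = {"u28": [194, 190, 188], "u5": [4, 2, 40]}
print(consistent_outliers(actual, predictions, tolerance=30))  # ['u28']
```

A user who is mispredicted only once in a while (like "u5" above) is not flagged; only a consistent miss counts.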

Surprisingly, our models say these people should never have become paying customers based on their interaction with the product.

We shared this fact with Moovly and discovered that one of the users is the wife of a co-founder and the other is the office neighbor, and that both of them got an automatic subscription on the first day!

It turns out they are special, and we could determine this automatically. Based on the data!

2. We can also predict whether or not the customer will subscribe in the next few weeks (with 73-86% classification accuracy)

Predicting the # days to subscription is interesting but because it requires the total number of logins up to today, it may not provide a good business opportunity to plan far ahead.

To improve predictability we divided clients into cohorts by time passed since opening an account, and built predictive models of whether or not any particular person will become a paying customer in the next couple of weeks.

So first we selected the following three groups of freemium users:

  • 24-hour cohort: freemium users who opened an account in the last 24 hours, and haven't subscribed yet.
  • 3-day cohort: users who opened an account in the last 3 days and haven't subscribed yet.
  • 7-day cohort: users who opened an account in the last week and haven't subscribed yet.

The choice of time periods to define cohorts should be dictated by business needs. The good thing is that the cohorts are non-intersecting. They will form a funnel of non-subscribers going from the 24-hour cohort into the 3-day cohort, then into the 7-day cohort, and so on, until they subscribe or churn.
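A minimal sketch of such a cohort assignment is shown below. It assumes each non-subscriber is placed in the narrowest window that fits the account age, which is one way to keep the cohorts non-intersecting; the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Cohort windows from the text, ordered from narrowest to widest.
COHORTS = [
    ("24-hour", timedelta(hours=24)),
    ("3-day", timedelta(days=3)),
    ("7-day", timedelta(days=7)),
]


def assign_cohort(opened_at: datetime, now: datetime, has_subscribed: bool) -> Optional[str]:
    """Return the narrowest cohort that fits the account age, or None
    for subscribers and for accounts older than the widest window."""
    if has_subscribed:
        return None
    age = now - opened_at
    for name, limit in COHORTS:
        if age <= limit:
            return name
    return None


now = datetime(2015, 6, 10, 12, 0)
print(assign_cohort(datetime(2015, 6, 10, 0, 0), now, False))  # 24-hour
print(assign_cohort(datetime(2015, 6, 8, 12, 0), now, False))  # 3-day
print(assign_cohort(datetime(2015, 6, 4, 0, 0), now, False))   # 7-day
```

Because the windows are checked in order, a user moves from one cohort to the next as the account ages, which matches the funnel described above.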

The business may also have different budgets to convert clients in each cohort (e.g. people who haven't subscribed in a month are less warm for conversion than people who opted to try the product yesterday).

For each cohort we could create precise scoring models (with 73-86% classification accuracy) with at most five metrics!

Out of 277 metrics considered only 3 to 5 were selected as drivers in each cohort. Of course, they may change in time as the business changes, but the process of identifying them with new data only takes 30 minutes with DataStories.

For example, the metric of whether or not the user is subscribed to the newsletter was never selected as a subscription driver. However, if Moovly further customizes the newsletters, or starts a new breed of tutorial letters, it may become a converting factor in the future.

Having to track and focus on three to five metrics of user behavior instead of 277 saves time and money, and brings focus into the busy lives of marketeers!

Then we went ahead and built several types of predictive models using the handful of drivers discovered by DataStories.

More complex algorithms did give us better accuracy, but even the simplest scoring models produced acceptable results! This is what we love the most in our job - going to hell (heavy-duty machine learning and computational intelligence) and back to simplify the problem, all for the pleasure of finding a solution that everyone can understand and use!

Here is the simplest scoring model for a 24-hour cohort. It only needs four metrics!

PLAY with SLIDERS to estimate the HEAT SCORE. The model has 76% classification accuracy and 81% AUC on never-seen hold-out data - GOOD STUFF.

[Interactive sliders: 0 to 5 or more; 0 to 18 or more; 0 to 5 or more]

A good scoring model also gives unique heat score thresholds for each business.

A great thing about the heat score model is that it also helps identify unique score thresholds for your business.

One way to do this is to compute a statistically optimal score threshold, which best splits non-subscribers from subscribers. In the simple example above this threshold is 321.
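One plausible reading of "statistically optimal" is the threshold that maximizes classification accuracy on labeled users. The sketch below takes that reading as an assumption; the function name and the scores are made up and are not the data behind the 321 figure.

```python
from typing import List, Tuple


def best_threshold(scores: List[float], subscribed: List[bool]) -> Tuple[float, float]:
    """Return (threshold, accuracy) for the rule 'predict subscriber
    when score >= threshold' that classifies the most users correctly.
    Ties are resolved in favor of the lowest threshold."""
    if not scores or len(scores) != len(subscribed):
        raise ValueError("scores and subscribed must be non-empty and equal in length")

    def accuracy(threshold: float) -> float:
        correct = sum((s >= threshold) == y for s, y in zip(scores, subscribed))
        return correct / len(scores)

    # Only observed scores need to be tried as candidate thresholds.
    candidates = sorted(set(scores))
    threshold = max(candidates, key=accuracy)
    return threshold, accuracy(threshold)


# Made-up heat scores and outcomes for six users.
scores = [100, 200, 250, 330, 400, 450]
subscribed = [False, False, False, True, False, True]
threshold, acc = best_threshold(scores, subscribed)
print(threshold, round(acc, 3))  # 330 0.833
```

Other criteria (for example, weighting missed subscribers more heavily than false alarms) would give a different threshold, which is why a budget-driven choice can be preferable.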

A better way to find the threshold is to use the available marketing budget as guidance.

HERE is a simple COST-REVENUE model for the 24-hour cohort.

Let's assume that we want to target the top x% of heat score customers with a special marketing campaign. Let's say we send them a personal letter and a bottle of wine, at a cost of $20.00. Let's also assume that if we convert the customer, we will earn $100.00. To HOW MANY customers should we send the bottle of wine?

HOVER over the graph for more info:

The graph illustrates the total revenue, loss, and profit from a special marketing campaign if it is applied to the top 0%, 5%, ... 100% of heat scored users. In our example we look at 480 people who opened an account in the last 24 hours and haven't subscribed yet. Note that the profit will be maximized if we target the top 30% of the users based on their heat score.
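The arithmetic behind such a curve can be sketched as follows, using the $20 cost and $100 revenue from the example. The sketch assumes users are already sorted by heat score (highest first) and that known outcomes on a hold-out set stand in for who would convert; the function name and the 20 sample outcomes are made up, so the optimum here differs from the 30% in the graph.

```python
from typing import List, Tuple


def profit_curve(
    converted_by_rank: List[bool],
    cost_per_contact: float = 20.0,
    revenue_per_conversion: float = 100.0,
    step_percent: int = 5,
) -> List[Tuple[int, float]]:
    """Profit from contacting the top 0%, 5%, ..., 100% of users.

    `converted_by_rank` lists conversion outcomes for users sorted
    by heat score, highest score first.
    """
    n = len(converted_by_rank)
    curve = []
    for pct in range(0, 101, step_percent):
        contacted = n * pct // 100  # number of top-ranked users targeted
        conversions = sum(converted_by_rank[:contacted])
        profit = conversions * revenue_per_conversion - contacted * cost_per_contact
        curve.append((pct, profit))
    return curve


# Made-up outcomes for 20 users ranked by heat score.
outcomes = [True, True, True, False, True, False, False, True, False, False] + [False] * 10
curve = profit_curve(outcomes)
best_pct, best_profit = max(curve, key=lambda point: point[1])
print(best_pct, best_profit)  # 40 340.0
```

With these sample numbers, contacting everyone still makes a small profit (5 conversions at $100 minus 20 gifts at $20), but stopping at the top 40% earns more than three times as much.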

Best yet, we can be smarter and only target people who are "on the fence", i.e. a tad short of getting over the heat score threshold!

What can you do with this information:

  • BEFORE: Can compute conversion rate, but have no control on who will convert.
    AFTER: Score each freemium user early & get reliable estimation of how likely each user will subscribe.
  • BEFORE: Enjoy 4-16% conversion rate from targeting all users.
    AFTER: Enjoy 79-84% conversion rate from targeting the top 20% of heat-score customers.
  • BEFORE: Lack of insights into user intention makes it hard to change user behavior.
    AFTER: Precise heat-scores help identify 'nudge-eable' users who are a tad short of subscribing and target them to change their behavior.
  • BEFORE: "Spray & Pray" is one of the few options to spend marketing budget.
    AFTER: Each new customer gets assigned a heat score. Low and High value customers don’t need to be contacted. Now you know to spend 90%+ of your marketing budget on Medium Heat Score customers. A simple call or bonus gift to these "on the fence" customers can convince them to purchase.

Each new freemium customer gets assigned a Heat Score:

Targeting medium heat score users will give you the biggest bang for the buck. You want to change THEIR behavior.

Your Marketing Team instantly knows which customers to spend time and money acquiring. These customers have double odds of signing up if called and offered a small bonus.

Wondering if DataStories could bring you similar results?

Request a free online demo and see DataStories in action!

Or call us: +32 47 638 84 97