Introduction
Retention is a Report which shows how different Cohorts of users (defined by Starting Events on a Date) retain over time. This type of analysis is typically used to understand the difference between users who retain and those who don't, and to adapt user journeys so that products become stickier.
The main inputs are:
Subject: The Subject whose retention you want to measure. Subjects are custom to your environment, for example User ID or Visitor ID.
Cohort: The Time Unit for Cohort and Retention. The default is "Day".
Retained: The interval of time over which you want to view Retention (Daily, Weekly, Monthly etc.)
Looking Forward: The number of days you want to look forward to see whether users have retained or not. If you want to look forward several weeks or months, be sure to enter a sufficient number of days (e.g. 14, 30, 90).
Measure As: How you want to measure Retention, based on the Retention Types
Retention Based on: If needed, change how a "day" is determined: Calendar Day or 24-hour period.
Then select your "Starting Event(s)" and "Returning Event(s)"
Selecting Your Time Unit for Cohort
In Kubit you first select the Cohort of users who performed the 'Starting Event' during a specific period of time. The report will "bucket" these users based on the first day they did the starting event within the window of time you've selected as well as the interval of time.
Intervals of time determine how your data will be represented
If you select Daily/Weekly/Monthly etc., the user has to perform the Starting Event within the interval of time, and the results will be separated by each day/week/month etc. within the time period.
Example: If you select 1/1 - 1/5 Daily you will see a retention curve or row for each day.
This also means that each curve or row of data will have its own Nth day.
Example: If you select 1/1 - 1/5 Daily then all users who performed the Starting Event on 1/1 will have 1/2 as their Day 1. If they performed it on 1/2 then 1/3 will be their Day 1.
All Time will group all users into a single cohort and they will all have the same Day 1, the day following the last date of the cohort time period.
Example: If you select 1/1 - 1/5 All Time then all users will have their Day 1 as 1/6.
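The Day N bookkeeping in the two examples above can be sketched in a few lines of Python. This is only an illustration, not Kubit's implementation: the function name `day_number`, the interval labels `"daily"` and `"all_time"`, and the year 2024 are assumptions made for the example.

```python
from datetime import date

def day_number(first_start, window_end, interval, event_date):
    """Return N such that event_date is "Day N" for a user (hypothetical helper)."""
    if interval == "daily":
        # Each daily cohort counts from the user's own Starting Event date.
        day_zero = first_start
    elif interval == "all_time":
        # One shared cohort: Day 1 is the day after the selected window ends.
        day_zero = window_end
    else:
        raise ValueError(f"unsupported interval: {interval}")
    return (event_date - day_zero).days

window_end = date(2024, 1, 5)  # selected period 1/1 - 1/5 (year is assumed)

# Daily: a user who started on 1/1 has 1/2 as Day 1; one who started on 1/2 has 1/3.
daily_a = day_number(date(2024, 1, 1), window_end, "daily", date(2024, 1, 2))
daily_b = day_number(date(2024, 1, 2), window_end, "daily", date(2024, 1, 3))

# All Time: every user has 1/6 as Day 1, regardless of when they started.
all_time = day_number(date(2024, 1, 2), window_end, "all_time", date(2024, 1, 6))
print(daily_a, daily_b, all_time)
```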
Retention types
There are multiple ways to measure retention; this section covers the ones supported by Kubit. Some concepts are common to all retention types: the Retention chart displays each cohort as a row identified by Cohort Date, followed by the number of unique users on that initial date, then how many of them came back or churned on the following dates.
Normal Retention
In a nutshell - for each starting cohort, for each remaining date of the time period, count all users who are part of the starting cohort and have at least one returning event on that date.
It is useful when users are expected to return regularly over relatively short time periods.
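As a rough sketch of the counting rule described above (not Kubit's actual query), the following Python computes one cohort row. The data layout is an assumption for illustration: `returns` maps each user to the set of day offsets (1 = Day 1) on which they had a returning event.

```python
def normal_retention(cohort, returns, horizon):
    """Count cohort users with at least one returning event on each Day N."""
    counts = []
    for day in range(1, horizon + 1):
        retained = sum(1 for user in cohort if day in returns.get(user, set()))
        counts.append(retained)
    return counts

cohort = {"a", "b", "c"}                               # users in the starting cohort
returns = {"a": {1, 2, 3}, "b": {1, 3}, "c": {3}}      # hypothetical return days

normal_counts = normal_retention(cohort, returns, horizon=3)
print(normal_counts)  # counts for Day 1, Day 2, Day 3
```

With this sample data, user "b" is not counted on Day 2 but is counted again on Day 3, because each day is evaluated independently.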
Rolling Retention
Rolling retention is called Rolling, because we not only require a user to have been a part of the starting cohort and have a returning event on a particular date X in order to count them on that date, but we also want them to have a returning event on each date between the starting cohort date and X. We expect the user to have been continuously returning, hence - Rolling.
Useful when high stickiness is expected.
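The continuity requirement can be sketched as follows. This is a minimal illustration of the definition given above, under the assumption that `returns` maps each user to the set of day offsets on which they had a returning event; the function name is hypothetical.

```python
def rolling_retention(cohort, returns, horizon):
    """Count users who returned on every day from Day 1 through Day N."""
    counts = []
    for day in range(1, horizon + 1):
        retained = sum(
            1
            for user in cohort
            # The user must have a returning event on each day up to and including `day`.
            if all(d in returns.get(user, set()) for d in range(1, day + 1))
        )
        counts.append(retained)
    return counts

cohort = {"a", "b", "c"}
returns = {"a": {1, 2, 3}, "b": {1, 3}, "c": {3}}      # hypothetical return days

rolling_counts = rolling_retention(cohort, returns, horizon=3)
print(rolling_counts)  # counts for Day 1, Day 2, Day 3
```

Here user "b" drops out from Day 2 onward because they skipped Day 2, even though they came back on Day 3.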
Unbounded Retention
In some businesses users return less regularly, e.g. once a month to pay a bill or restock supplies. When that's the case, Unbounded Retention can help get a more accurate number of retained users, because users will be counted not only on their return date, but also on all dates prior to it.
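A minimal sketch of this rule, assuming the same illustrative layout as a per-cohort row (a `returns` mapping from user to the set of day offsets with a returning event; names are hypothetical): a user is counted on Day N if they have a returning event on Day N or on any later day.

```python
def unbounded_retention(cohort, returns, horizon):
    """Count users with a returning event on Day N or any later day."""
    counts = []
    for day in range(1, horizon + 1):
        retained = sum(
            1 for user in cohort if any(d >= day for d in returns.get(user, set()))
        )
        counts.append(retained)
    return counts

cohort = {"a", "b", "c"}
returns = {"a": {1, 2}, "b": {3}, "c": set()}          # hypothetical return days

unbounded_counts = unbounded_retention(cohort, returns, horizon=3)
print(unbounded_counts)  # counts for Day 1, Day 2, Day 3
```

User "b" only returned on Day 3 but is also counted on Day 1 and Day 2, which is the behavior described above.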
Retention Modes
Retention
Retention shows how different Cohorts of users (defined by Starting Events on a Date) retain over time. This is the most common and default mode of measuring Retention Rates.
Retention Over Time
Retention Over Time plots the trending retention rate for users retained on each interval of time, e.g. Day 1, 3, 7. This is a good way to see whether your Day 1 retention rate is improving from one cohort of subjects to the next.
In this mode the plotted lines are the Daily Retention points (Day 1, Day 3, etc.) rather than the cohort dates used in the default Retention mode.
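One way to picture the difference between the two modes is as a pivot of the same numbers: the default mode has one row per cohort date, while Retention Over Time has one series per Day N. The sketch below assumes a hypothetical `cohort_rates` structure with made-up rates; it is not how Kubit stores results.

```python
def retention_over_time(cohort_rates, days):
    """Turn per-cohort rows into one series per Day N (hypothetical helper)."""
    series = {}
    for n in days:
        series[f"Day {n}"] = [
            (cohort, rates[n]) for cohort, rates in sorted(cohort_rates.items())
        ]
    return series

# Made-up retention rates (%) keyed by cohort date, then by Day N.
cohort_rates = {
    "2024-01-01": {1: 40.0, 3: 25.0, 7: 12.0},
    "2024-01-02": {1: 42.0, 3: 27.0, 7: 15.0},
    "2024-01-03": {1: 45.0, 3: 26.0, 7: 14.0},
}

series = retention_over_time(cohort_rates, days=[1, 3, 7])
print(series["Day 1"])  # Day 1 rate for each cohort date, in date order
```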
Usage Interval
Usage Interval shows the typical Usage Interval of a product feature, e.g. what % of users have generated the returning event by Day X. Typically you'll use Usage Interval to understand what a good Retention timeframe may be: the point where the line crosses the 50% threshold is when half of your Users have typically performed the returning event.
In the above example, 50% of users have returned by Day 4 to perform the return event, which means our Retention report should use 4-day intervals to be consistent with user behavior.
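The threshold reading can be sketched as a cumulative curve plus a lookup for the first day at or above 50%. The data here is invented for illustration and the function names are assumptions; `first_return_day` maps each user who returned to the day offset of their first returning event.

```python
def usage_interval_curve(first_return_day, cohort_size, horizon):
    """Cumulative % of the cohort that has returned by each day 0..horizon."""
    curve = []
    for day in range(horizon + 1):
        returned = sum(1 for d in first_return_day.values() if d <= day)
        curve.append(100 * returned / cohort_size)
    return curve

def threshold_day(curve, threshold=50.0):
    """First day on which the cumulative curve reaches the threshold, or None."""
    for day, pct in enumerate(curve):
        if pct >= threshold:
            return day
    return None

# 10 users in the cohort; 7 of them returned at some point (made-up data).
first_return_day = {"u1": 1, "u2": 2, "u3": 2, "u4": 3, "u5": 4, "u6": 4, "u7": 6}

curve = usage_interval_curve(first_return_day, cohort_size=10, horizon=6)
half_day = threshold_day(curve)
print(curve, half_day)
```

In this invented data set the curve first reaches 50% on Day 4, so a 4-day Retention interval would be a reasonable starting point.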
Usage Interval Over Time
Usage Interval Over Time shows you how the Usage Interval of an event changes over time. Similar to Retention Over Time, you can see whether the Usage Interval of each cohort is getting higher or lower over time.
Measure Churn
Churn is essentially the inverse metric of Retention. If Retention tells us how many users are returning, Churn measures how many users are not coming back. Switching from Retention to Churn is done by a switch and does not require re-execution of existing analyses.
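Because Churn is the complement of Retention, it can be derived from numbers that are already computed, which is consistent with the switch not needing a re-run. A minimal sketch with made-up counts (the function name is hypothetical):

```python
def churn_from_retention(cohort_size, retained_counts):
    """Churned users per day = cohort size minus retained users on that day."""
    return [cohort_size - retained for retained in retained_counts]

cohort_size = 100
retained_counts = [40, 25, 10]          # made-up retained users on Day 1, 2, 3

churned_counts = churn_from_retention(cohort_size, retained_counts)
churn_rates = [100 * churned / cohort_size for churned in churned_counts]
print(churned_counts, churn_rates)
```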
Retention Calculation Windows
In retention there are typically 2 ways to think of a day:
1. A Calendar Day that starts at 00:00 and ends at 23:59.
2. A 24h Window between the "Starting" and "Returning Event(s)" that is individual for each user.
Let's assume the following sequence of events for a user:
1. Start event - 2023-07-01 10:00:00
2. Return event - 2023-07-01 11:00:00
3. Return event - 2023-07-02 09:00:00
4. Return event - 2023-07-03 12:00:00
Strict Calendar Date
Strict calendar date window computes the difference between start and return events based on calendar dates. Let's go through the example above.
Start Event | Return Event | Retained at Day |
2023-07-01 10am | 2023-07-01 11am | Day 0 |
2023-07-01 10am | 2023-07-02 9am | Day 1 |
2023-07-01 10am | 2023-07-03 12pm | Day 2 |
Strict calendar date windows are not all the same length, because the time of the start event makes Day 0 shorter than Day 1, Day 2, etc. In the example above, Day 0 starts at 2023-07-01 10am and ends at 2023-07-01 11:59pm, while Day 1 and Day 2 are each a full 24 hours, from 00:00:00 to 11:59pm.
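The calendar-date rule amounts to dropping the time of day and subtracting dates. The sketch below reproduces the table above; the function name is an assumption and time zones are ignored for simplicity.

```python
from datetime import datetime

def calendar_day_index(start, returned):
    """Day N under Strict Calendar Date: difference between calendar dates."""
    return (returned.date() - start.date()).days

start = datetime(2023, 7, 1, 10, 0)
return_events = [
    datetime(2023, 7, 1, 11, 0),
    datetime(2023, 7, 2, 9, 0),
    datetime(2023, 7, 3, 12, 0),
]

calendar_days = [calendar_day_index(start, r) for r in return_events]
print(calendar_days)  # Day index for each return event, in table order
```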
24 Hour Windows
24 hour windows define each day as an equal interval of exactly 24 hours, counted from each start event's timestamp. Let's go through the example above again, this time using the 24 hour windows calculation.
Start Event | Return Event | Retained at Day |
2023-07-01 10am | 2023-07-01 11am | Day 0 |
2023-07-01 10am | 2023-07-02 9am | Day 0 |
2023-07-01 10am | 2023-07-03 12pm | Day 2 |
The example above shows that each day is computed as exactly 24 hours after the start event. Day 0 starts at 2023-07-01 10:00:00 and ends at 2023-07-02 09:59:59.
That's why the return event at 2023-07-02 09:00:00 is computed as Day 0 rather than Day 1, as it would be with the Strict Calendar Date window.
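The 24-hour rule can be sketched as integer division of the elapsed time by 24 hours. The snippet reproduces the 24 Hour Windows table; the function name is an assumption and time zones are ignored.

```python
from datetime import datetime, timedelta

def window_day_index(start, returned):
    """Day N under 24 Hour Windows: whole 24-hour periods since the start event."""
    return (returned - start) // timedelta(hours=24)

start = datetime(2023, 7, 1, 10, 0)
return_events = [
    datetime(2023, 7, 1, 11, 0),   # 1 hour after the start event
    datetime(2023, 7, 2, 9, 0),    # 23 hours after the start event
    datetime(2023, 7, 3, 12, 0),   # 50 hours after the start event
]

window_days = [window_day_index(start, r) for r in return_events]
print(window_days)  # Day index for each return event, in table order
```

The second return event is the one that differs between the two methods: it falls on the next calendar date but still within the first 24 hours after the start event.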
Overall in Retention
Whenever a Breakdown is specified in Retention you can use the `Overall` toggle to display the cumulative count for all breakdown groups as a separate group. The `Overall` group is displayed among the other Breakdown groups.
Note: `Overall` toggle is switched off by default when Breakdown is selected.
Advanced Retention Features
There are a few advanced parameters which can be used to fine-tune the analysis:
Attribution
Linear means that a User will be assigned to every Cohort in which their Starting Event(s) occurred (inclusive).
First means that a User will be assigned only to the Cohort for the day when their first Starting Event(s) occurred.
When using First attribution and creating a Cohort from this result, Kubit will enforce the "since" date to ensure that erroneous subjects are not included in the Cohort created.
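The difference between the two attribution settings can be illustrated with a small sketch. The function name, the lowercase labels, and the sample dates are assumptions for the example; daily cohorts are assumed.

```python
from datetime import date

def assign_cohorts(start_event_dates, attribution):
    """Return the daily cohorts a user belongs to under each attribution setting."""
    days = sorted(set(start_event_dates))
    if attribution == "linear":
        # The user joins the cohort of every day on which a Starting Event occurred.
        return days
    if attribution == "first":
        # The user joins only the cohort of their earliest Starting Event.
        return days[:1]
    raise ValueError(f"unsupported attribution: {attribution}")

# A user who performed the Starting Event on three occasions across two days.
starts = [date(2024, 1, 3), date(2024, 1, 1), date(2024, 1, 3)]

linear_cohorts = assign_cohorts(starts, "linear")
first_cohorts = assign_cohorts(starts, "first")
print(linear_cohorts, first_cohorts)
```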
Group by
Group by is only applicable when there is a Breakdown specified.
For example, if there was a Breakdown per Country, when Group by = "Starting", Kubit will group users only for the `Starting Event(s)` and ignore the `Returning Event(s)`' Country code.
When Group by = "Global", both `Starting` and `Returning Event(s)` have to have the same Country code.
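As a rough sketch of the two Group by settings for a Country breakdown, the check below decides whether a user counts as retained in the group of their Starting Event's country. The function name and lowercase labels are assumptions for illustration.

```python
def is_retained(start_country, return_countries, group_by):
    """Check retention within the start_country breakdown group (hypothetical helper)."""
    if group_by == "starting":
        # Only the Starting Event's country is used; any returning event counts.
        return len(return_countries) > 0
    if group_by == "global":
        # A returning event must carry the same country as the Starting Event.
        return start_country in return_countries
    raise ValueError(f"unsupported group_by: {group_by}")

# The user started in "US" and later returned from "CA".
starting_result = is_retained("US", ["CA"], "starting")
global_result = is_retained("US", ["CA"], "global")
print(starting_result, global_result)
```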