What is Significance?
Kubit’s Significance Report offers statistical insights to verify if a finding in a Measure, Query, Funnel or Retention is significant (“whether a result is likely due to chance or to some factor of interest”).
We don’t support Significance for other Reports like Path, Data Table or Cohort Compare.
It is a general-purpose tool that doesn’t necessarily require conducting formal A/B tests. You can use this feature in a strict way with your experiment data (A/B test assignment), or compare between any two groups (breakdowns or segments) as if they had different treatments.
The major steps in building a Significance Report are:
Select the Mode you want to use in your analysis
Identify the Control and Variant(s) you are going to analyze
Select the additional parameters of your analysis: one- or two-tail test, p-value and confidence level
Select the Metric whose lift you want to analyze between your Control and Variant(s).
You can select Saved Measures, Queries, Funnels and Retention reports.
Apply any filters if needed and set your date range
Execute!
Once executed you will see all the relevant statistical information.
Significance Modes Summarized
In Kubit's Significance Report you're able to analyze experiment type data in three ways.
Experiment (Optional): Analyze the significance of a metric based on Experiment IDs and Variant IDs currently being shared with Kubit.
This Mode is only available for Customers that have shared specific Experiment type data with Kubit. If you do not have this available don't worry! You are still able to use the other 2 Modes.
Breakdown: Analyze the significance of a metric based on a field and the values within it.
This Mode is best if you have a field that denotes an experiment or different experience that you typically break down by.
Note that the breakdown value must be present in the resulting measure. More about that below.
Segments: Analyze the significance of a metric based on cohorts of users.
This mode is best if you assign users to experiments or different experiences with a single "assignment" event.
Create a Cohort of users who saw that assignment event with specific IDs to segment your control and variant groups.
Experiment Mode
Note that this Mode is only available for customers with the following fields:
Experiment ID or Name
AND a Variant ID or Name
Values are mapped to the User and not a specific event (i.e. an assignment or audience event)
If your experiment data is mapped to a specific event we recommend you use the Segment Mode described below.
In Experiment Mode you will build your analysis by selecting the relevant Experiment ID and related Experiment Variants to isolate the correct audiences.
Kubit will map these fields according to your data and this Mode will be visible once mapped and enabled.
Breakdown Mode
When you're analyzing the lift or impact of metrics and aren't capturing typical experiment information you're still able to measure lift from the other values within your dataset. Some examples include User Type, Platform, Subscription Type, Country etc.
Within Significance you are able to build a report based on the breakdown of a Field value.
If your breakdown value has more than 2 groups you're able to add additional Variants with the "+ Add Variant" option.
Segment Mode
Segment Mode is similar to Breakdown Mode, but instead of a Field you segment users based on their presence in a Saved Cohort.
First you'll want to build and save the Cohorts you want to compare using the Cohort options, read more here.
Cohorts that have not been Saved will not appear in the Segment Mode dropdown.
Once the Cohorts have been built and saved, assign them to the relevant Control and Variant(s) Segments.
Other Selection Options
No matter the Mode you decide to use there are 3 options available to customize the parameters of your analysis.
Hypothesis Test
One-Tail Test
Looks for an effect in one direction (e.g., better or worse).
Two-Tail Test
Looks for an effect in both directions (e.g., different, whether better or worse).
Use a one-tail test for a specific direction and a two-tail test for any difference.
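To make the difference concrete, here is a minimal sketch of how the same test score can lead to different p-values under each option. It assumes a standard-normal test score (a large-sample approximation) and uses a hypothetical helper named p_values; it is not Kubit's implementation.

```python
from statistics import NormalDist

def p_values(z: float) -> dict:
    """Return one- and two-tail p-values for a standard-normal test score z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)      # one-tail: is the variant better?
    lower = std_normal.cdf(z)            # one-tail: is the variant worse?
    two_sided = 2.0 * min(upper, lower)  # two-tail: is it different at all?
    return {"one_tail_greater": upper, "one_tail_less": lower, "two_tail": two_sided}

# Hypothetical test score of 1.8 in favor of the variant
result = p_values(1.8)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With this example score, the one-tail p-value for "better" falls below 0.05 while the two-tail p-value does not, which is why the choice of test should be made before looking at the results.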
P-Value
Description
The p-value tells us how likely it is to see our data, or something more extreme, if the null hypothesis is true.
Range: It goes from 0 to 1.
How to Interpret P-Value
If the p-value is less than the threshold you selected (commonly 0.05), the result is considered statistically significant, and we should consider rejecting the null hypothesis.
Confidence Level
Description
The confidence level sets how sure you want to be that an observed difference between versions A and B is real, not random. It determines the width of the interval reported around the measured difference.
Range: Typically given as a percentage (e.g., 95% confidence level).
How to Interpret Confidence Level
A 95% confidence level means we are 95% confident that the true difference between versions A and B lies within the calculated interval.
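As an illustration, the interval for a difference in means can be sketched from each group's mean, standard deviation and sample size. This sketch assumes a normal approximation and uses a hypothetical function name and made-up numbers; Kubit's exact calculation may differ.

```python
from math import sqrt
from statistics import NormalDist

def diff_confidence_interval(mean_c, sd_c, n_c, mean_v, sd_v, n_v, confidence=0.95):
    """Normal-approximation interval for (variant mean - control mean)."""
    diff = mean_v - mean_c
    # Standard error of the difference between two independent group means
    std_err = sqrt(sd_c**2 / n_c + sd_v**2 / n_v)
    # Two-sided critical value, roughly 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return diff - z_crit * std_err, diff + z_crit * std_err

# Hypothetical control and variant summaries
low, high = diff_confidence_interval(2.0, 1.5, 5000, 2.1, 1.6, 5000)
print(f"95% interval for the difference: [{low:.3f}, {high:.3f}]")
```

If the interval does not include 0, the data are consistent with a real difference at the chosen confidence level. A higher confidence level widens the interval.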
Metric Selection
Once you've established the variants to be analyzed against one another, select the metric you are interested in seeing the impact on. For example, if you are running an A/B test on an experience to improve check out events, then the "Check Out" event count would be the metric you'd select.
A Metric should be relevant to the analysis and/or experiment you've selected and be something that enough users performed to yield a significant result. If you select a metric that is too niche or that too few users perform, there may not be enough data to produce a meaningful result.
Metric Options
Measure: Select from a list of Saved Measures to compare the results of each significance group.
Query: Select from a list of all Queries (that meet the criteria below) to compare the resulting Measure.
The Query cannot be built using Histogram or Impact Analysis modes.
The Query cannot use a Rolling Window.
Funnel: Select from a list of all Funnels (must be in Conversion Mode) to compare the resulting Conversion Rates.
Retention: Select from a list of all Retention curves (must be in Retention Mode) to compare the resulting Retention Rates.
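For Funnel metrics, the glossary lists a Chi-Square statistic as the test score. The sketch below shows a textbook Pearson chi-square test on converted versus not-converted counts for a Control and one Variant. The function name and counts are hypothetical, no continuity correction is applied, and Kubit's exact calculation may differ.

```python
from math import erfc, sqrt

def chi_square_2x2(conv_c, total_c, conv_v, total_v):
    """Pearson chi-square statistic and p-value for two conversion rates."""
    observed = [
        [conv_c, total_c - conv_c],  # control: converted, not converted
        [conv_v, total_v - conv_v],  # variant: converted, not converted
    ]
    grand_total = total_c + total_v
    row_totals = [total_c, total_v]
    col_totals = [conv_c + conv_v, grand_total - conv_c - conv_v]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if both groups shared one conversion rate
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical funnel: 400 of 4000 control users convert, 460 of 4000 variant users
stat, p = chi_square_2x2(conv_c=400, total_c=4000, conv_v=460, total_v=4000)
print(f"chi-square = {stat:.3f}, p-value = {p:.4f}")
```

The test compares the observed counts with what would be expected if both groups converted at the same rate, so it needs enough users in every cell to be reliable.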
Things to Consider
Kubit will overwrite dates in the original query and replace them with the dates selected in the Significance Report you've built. This ensures the metric results are in-line with the significance date range you are interested in.
Any Filters in any part of the source Metric will be applied to the Significance Analysis.
If the Experiment/Breakdown/Segment you've selected has too few users you will see an error noting this and results will not be returned.
Filter Your Analysis
You will be able to Filter based on a Global Filter or Cohort Filter like you do in all other Kubit reports. These filters will be applied before any statistical analysis has been performed.
Selecting Your Date Range
Once you've made all your inputs you'll select the date range that corresponds with the duration of the experiment you want to analyze.
As mentioned above, Kubit will overwrite the dates from a Metric sourced Query/Funnel/Retention report to those of the significance date range. Don't worry, we don't change the underlying Metric report logic.
All results will be displayed as "All Time" and be an aggregation of the entire duration of the date range.
Interpreting Your Results
Kubit will display results in the following way. Hovering over the Variants displayed will show the more detailed metrics outlined in the Glossary of Terms Used section of this article.
You will be able to add these results to a Dashboard and/or Workspace.
Significance FAQ
When using a Report as the Metric, does Significance take the Query/Funnel/Retention Report’s date range into account?
No, the date on a Report used as a Metric is overwritten by the date set in Significance.
If I have a Cohort Filter on my report metric, will that date be overwritten as well?
No, Cohorts are defined separately and will retain their dates.
Why am I getting no results in Significance?
If you’re using a Query or Measure as the metric, check that your report counts something other than the subject of the Significance report. If your Significance subject is users and your Query or Measure counts unique users, every user contributes exactly 1, so the mean will be 1 and you’ll get no results.
Alternatively, check that your sample size is large enough. If the sample size is too small, significance can't be calculated.
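The first case can be illustrated with a small sketch. The data below are hypothetical, and the explanation in the comments is one plausible reading of why no result is returned, not a description of Kubit's internals.

```python
from statistics import mean, pstdev

# Hypothetical data: the subject is users and the metric counts unique users,
# so every user in a group contributes exactly 1.
control = [1] * 500
variant = [1] * 520

control_stats = (mean(control), pstdev(control))
variant_stats = (mean(variant), pstdev(variant))
print("control mean, std dev:", control_stats)
print("variant mean, std dev:", variant_stats)

# Both groups have mean 1 and no spread, so the difference between them is 0
# and its standard error is 0: there is nothing for a test score to measure.
```

Choosing a metric that varies per user, such as an event count, avoids this situation.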
What happens if I have a Breakdown or Segment in the report metric I use for Significance?
Any Breakdowns/Segments in the Group By part of the report metric will be ignored. This is to avoid confusing interactions with the Breakdowns or Segments used in the Significance report.
Glossary of Terms Used
Hypothesis Test | One-Tail Test: Looks for an effect in one direction (e.g., better or worse). Two-Tail Test: Looks for an effect in both directions (e.g., different, whether better or worse). |
P-Value | Description: The p-value tells us how likely it is to see our data, or something more extreme, if the null hypothesis is true. |
Confidence Level | Description: The confidence level indicates the likelihood that observed differences between versions A and B are real, not random. |
Lift | Description: Lift, or relative performance, quantifies how much better one option performs compared to another. |
Lift Confidence Interval | Description: Lift CI shows the potential range of change around a measured value. |
Stat Sig | Description: The p-value tells us how likely it is to see our data, or something more extreme, if the null hypothesis is true. |
Sample Size | Description: Sample size is the number of participants or observations in each group. It affects result reliability. |
Mean | Description: The mean is the average outcome of a measurement (like clicks or purchases) for each group tested. |
Std Deviation | Description: Standard deviation measures how spread out the numbers in a set are. It shows whether the numbers are close to the average (mean) or scattered far from it. |
Delta | Description: Delta percentage, or absolute performance, shows the exact difference in performance between two options. It focuses only on the actual change, not on how it compares to other factors. |
Test Score for Query, Retention and Measures | Test Score (Welch's t-test) |
Test Score for Funnel | Test Score (Chi-Square Statistic) |
Power (# of exp) | Power |
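Several glossary terms can be tied together in a short sketch. The function below computes Delta, Lift and a Welch's t-test score from each group's Mean, Std Deviation and Sample Size. The function name and input numbers are hypothetical, and the formulas are the textbook definitions (Delta as variant minus control, Lift as Delta relative to the control mean), which may not match Kubit's calculation exactly.

```python
from math import sqrt

def welch_summary(mean_c, sd_c, n_c, mean_v, sd_v, n_v):
    """Delta, lift, Welch's t score and degrees of freedom from group summaries."""
    # Squared standard error contributed by each group
    var_c, var_v = sd_c**2 / n_c, sd_v**2 / n_v
    delta = mean_v - mean_c   # absolute difference between the groups
    lift = delta / mean_c     # difference relative to the control mean
    t_score = delta / sqrt(var_c + var_v)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = (var_c + var_v) ** 2 / (var_c**2 / (n_c - 1) + var_v**2 / (n_v - 1))
    return {"delta": delta, "lift": lift, "t_score": t_score, "dof": dof}

# Hypothetical control and variant summaries
summary = welch_summary(mean_c=2.0, sd_c=1.5, n_c=5000, mean_v=2.1, sd_v=1.6, n_v=5000)
print(f"delta = {summary['delta']:.3f}, lift = {summary['lift']:.1%}")
print(f"t score = {summary['t_score']:.3f}, degrees of freedom = {summary['dof']:.0f}")
```

Welch's t-test does not assume that the two groups have equal variances or equal sample sizes, which suits comparisons between segments or breakdown values of different sizes.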