In this article, we'll walk through statistical significance, what it means for your direct mail campaigns, and how we calculate it.
Statistical Significance 101
If data is statistically significant, the result likely reflects a real effect rather than chance.
In ecommerce/retail, statistical significance helps marketers determine whether results from an initiative (such as a direct mail campaign) are attributable to the initiative itself (e.g., the creative, copy, or assets) or merely due to chance.
What Statistical Significance Tells You
Statistical significance is purely about reliability and confidence in the results, not the quality of the results from a business perspective. It tells you:
- How confident you can be that the underlying pattern would show up consistently in similar campaigns
- Whether the campaign results reflect a real pattern or could be due to chance
- The mathematical reliability of your campaign results
What Statistical Significance Does NOT Tell You
Statistical significance says nothing about:
- Whether your campaign was successful or profitable
- Whether the performance difference is large or small
- Whether you should scale or stop the campaign
- Your return on ad spend (ROAS) or incremental ROAS (iROAS)
For a marketing initiative to be considered statistically significant, the number of recipients must be large enough to rule out the possibility that positive results are merely due to chance (random variation).
- We use a common and robust statistical analysis called a two-proportion Z-test to evaluate this. More on this below.
Requirements for a Campaign to be Considered “Statistically Significant”
Only campaigns that have holdout groups will be evaluated for statistical significance.
Campaigns must meet the following requirements:
- At least 1,000 recipients in the treatment group
- At least 1,000 recipients in the holdout group
- Data that is at least 50% mature (at least halfway through the attribution window)
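The eligibility rules above can be sketched as a simple check. This is an illustrative sketch only; the function and parameter names are not the product's actual API.

```python
def eligible_for_significance(treatment_size, holdout_size,
                              days_elapsed, attribution_window_days):
    """Return True if a campaign qualifies for significance analysis.

    Illustrative sketch of the three eligibility rules; names are assumptions.
    """
    return (
        treatment_size >= 1000                            # at least 1,000 in treatment
        and holdout_size >= 1000                          # at least 1,000 in holdout
        and days_elapsed >= attribution_window_days / 2   # at least 50% mature
    )
```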
Why Do We Determine Statistical Significance Based on Conversion Rate?
Conversion rate provides the most reliable foundation for statistical significance testing as it directly measures campaign effectiveness as a percentage of recipients who took action.
Using this standardized metric eliminates variability caused by different campaign sizes and order values, creating a consistent baseline for comparison between treatment and holdout groups.
This approach enables marketers to confidently determine whether performance differences represent genuine marketing effects rather than random variation.
How We Display Statistical Significance In-App
If campaign results are statistically significant, blue highlighted text will appear with a tooltip. Notes:
- The Statistically significant indicator appears whether your campaign outperformed OR underperformed the holdout. Statistical significance is about confidence in the results, not whether those results are good or bad for your business.
- If you do not see the Statistically significant indicator, it could be because:
- The campaign didn't meet the requirements for analysis.
- The results are not statistically significant.
- For automations: the selected date range does not show statistical significance (other date ranges may).
By hovering over the tooltip, the following text will appear.
- Note: The percentage will vary based on the campaign.
If results are not statistically significant, it means:
- Confidence is less than 90%.
- The results could be due to random chance; the pattern may not show up consistently in similar campaigns.
- The pattern isn't reliable enough to make definitive conclusions
This doesn't mean your campaign failed or succeeded. It means the mathematical confidence isn't high enough to be certain about the pattern.
Calculating Statistical Significance
When a campaign has at least 1,000 recipients in the treatment group, at least 1,000 in the holdout group, and data that is at least 50% mature, we automatically perform a statistical analysis called a two-proportion Z-test on the results.
Why this analysis?
The two-proportion Z-test assesses the difference between two population proportions. In this case, the proportions are conversion rates, and we are comparing conversion rates between two independent groups (treatment vs. holdout).
Formula:

Z = (p₁ - p₂) / √[p̂(1 - p̂)(1/n₁ + 1/n₂)]
Where:
- p₁ = conversion rate of test group
- p₂ = conversion rate of holdout group
- n₁ = sample size of test group
- n₂ = sample size of holdout group
- p̂ = pooled proportion = (conversions₁ + conversions₂) / (n₁ + n₂)
Formula Breakdown:
1. Numerator (p₁ - p₂)
This measures the difference between the two observed sample proportions.
2. Denominator: √[p̂(1-p̂) (1/n₁ + 1/n₂)]
The denominator estimates the standard error (the standard deviation of the sampling distribution) of the difference between proportions.
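Putting the numerator and denominator together, the Z-score can be computed in a few lines of Python. This is a minimal sketch; the function name is illustrative, not part of the product.

```python
import math

def two_proportion_z(conv1, n1, conv2, n2):
    """Two-proportion Z-test statistic for conversion rates of two independent groups."""
    p1 = conv1 / n1                       # conversion rate of test group
    p2 = conv2 / n2                       # conversion rate of holdout group
    p_pool = (conv1 + conv2) / (n1 + n2)  # pooled proportion p̂
    # Standard error of the difference between proportions
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se
```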
Steps for Calculating Statistical Significance with an Example Campaign
Example Campaign Variables:
- Test Group Size: 2064
- Test Conversions: 75
- Test CVR: 3.63%
- Holdout Group Size: 900
- Holdout Conversions: 40
- Holdout CVR: 4.44%
Step 1: Calculate the pooled proportion
- Total conversions = 75 + 40 = 115
- Total sample size = 2,064 + 900 = 2,964
- Pooled proportion (p̂) = 115 / 2,964 = 0.0388
Step 2: Calculate the standard error
- SE = √[0.0388 × (1 - 0.0388) × (1/2,064 + 1/900)]
- SE = √[0.0373 × (0.00048 + 0.00111)]
- SE = √[0.0373 × 0.00159]
- SE = √0.0000593
- SE = 0.0077
Step 3: Calculate the Z-score
- Z = (0.0363 - 0.0444) / 0.0077
- Z = -0.0081 / 0.0077
- Z = -1.05
Step 4: Find the p-value
- For a two-tailed test with Z = -1.05
- p-value = 0.2938 (approximately 0.29)
Step 5: Determine the confidence level
- The p-value is 0.29 (or 29%)
- The confidence level is calculated as (1 - p-value)
- So the confidence level is (1 - 0.29) = 0.71 or 71%
Step 6: Determine Outcome
- Since the confidence level is 71%, which is below the 90% threshold, the results are not statistically significant.
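The six steps above can be reproduced end to end in Python; `statistics.NormalDist` from the standard library supplies the normal CDF needed for the two-tailed p-value.

```python
from math import sqrt
from statistics import NormalDist

# Example campaign from the steps above
n1, conv1 = 2064, 75   # treatment group size and conversions
n2, conv2 = 900, 40    # holdout group size and conversions

p1, p2 = conv1 / n1, conv2 / n2                    # conversion rates
p_pool = (conv1 + conv2) / (n1 + n2)               # Step 1: pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))   # Step 2: standard error
z = (p1 - p2) / se                                 # Step 3: Z-score
p_value = 2 * NormalDist().cdf(-abs(z))            # Step 4: two-tailed p-value
confidence = 1 - p_value                           # Step 5: confidence level
significant = confidence >= 0.90                   # Step 6: compare to 90% threshold

print(round(z, 2), round(confidence, 2), significant)  # -1.05 0.71 False
```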