Sample Ratio Mismatch Calculator
Calculate and analyze sample ratio mismatches in your experimental data.
Understanding Sample Ratio Mismatch
Introduction to Sample Ratio Mismatch (SRM)
Sample Ratio Mismatch (SRM) is a critical concept in experimental design, particularly in A/B testing and data analysis. It occurs when the observed ratio of samples in different experimental groups significantly deviates from the expected ratio. This phenomenon serves as an early warning system that something may be wrong with your experiment design, implementation, or data collection process.
According to studies published by major tech companies, roughly 6-10% of online experiments exhibit some level of SRM. When SRM occurs frequently or persistently, it warrants deeper investigation.
Why SRM Matters in Experimental Design
The importance of SRM cannot be overstated in the context of experimental validity. When you encounter an SRM, it typically indicates that:
- Your randomization process might be flawed - Proper randomization is essential for valid experimental conclusions.
- Selection bias may be present - Certain types of users might be systematically excluded from one variant.
- Technical issues could exist - Implementation errors might be affecting how users are assigned or tracked.
- Data collection might be inconsistent - Issues in logging or tracking could be creating discrepancies.
SRM in A/B Testing
In A/B testing, SRM is particularly concerning because it can invalidate your entire experiment. Consider a scenario where you're testing a new website design:
Expected Scenario
- Variant A: 50% of traffic (5,000 visitors)
- Variant B: 50% of traffic (5,000 visitors)
SRM Scenario
- Variant A: 60% of traffic (6,000 visitors)
- Variant B: 40% of traffic (4,000 visitors)
This 60/40 split instead of the intended 50/50 could indicate that some users are systematically being excluded from Variant B, perhaps due to browser compatibility issues or page load failures. If so, any difference in conversion rates might reflect selection bias rather than the design changes themselves.
Statistical Framework for SRM Detection
Detecting SRM requires a statistical approach, most commonly a Chi-Square goodness-of-fit test. This test helps determine if the observed allocation differences are due to random chance or if they indicate a systematic issue.
Chi-Square Test for SRM
The test statistic sums the squared differences between observed and expected counts, scaled by the expected counts:

χ² = Σᵢ (Oᵢ − Eᵢ)² / Eᵢ

where Oᵢ is the observed count and Eᵢ is the expected count for group i. For a two-variant test, the statistic is compared against a chi-square distribution with one degree of freedom.
The resulting p-value indicates the likelihood of seeing this allocation by chance:
- p-value < 0.01: Strong evidence of SRM
- p-value >= 0.01: No significant evidence of SRM
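As a concrete check, here is a minimal sketch of the test in Python using scipy.stats.chisquare (assuming SciPy is available), applied to the 60/40 scenario above:

```python
from scipy.stats import chisquare

# Observed visitor counts from the 60/40 scenario above.
observed = [6000, 4000]

# Expected counts under the intended 50/50 split.
total = sum(observed)
expected = [total * 0.5, total * 0.5]

chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.1f}, p-value = {p_value:.3g}")

if p_value < 0.01:
    print("Strong evidence of SRM -- investigate before trusting results.")
else:
    print("No significant evidence of SRM.")
```

Here chi-square = 400 and the p-value is vanishingly small, so a 60/40 split of 10,000 visitors is essentially impossible under a healthy 50/50 assignment.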
Common Causes of Sample Ratio Mismatch
| Category | Common Causes |
|---|---|
| Experiment Assignment | Flawed randomization algorithms, corrupt user IDs, incorrect bucketing |
| Experiment Execution | Different start times for variations, filter execution delays |
| Technical Issues | JavaScript errors, page load failures, browser compatibility problems |
| Data Collection | Bot traffic, tracking failures, analytics implementation errors |
| External Interference | Direct links shared on social media, overlapping experiments |
Best Practices for Handling SRM
- Early detection - Check for SRM as soon as your experiment starts running
- Regular monitoring - Continue checking throughout the experiment duration
- Segment analysis - Determine if SRM affects specific user segments (browsers, devices); see the sketch after this list
- Root cause investigation - Systematically examine potential causes from the table above
- Document findings - Keep records of SRM incidents and resolutions for future reference
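To make segment analysis concrete, here is a minimal sketch that assumes assignment logs live in a pandas DataFrame; the `variant` and `browser` column names (and the tiny sample) are hypothetical:

```python
import pandas as pd
from scipy.stats import chisquare

# Hypothetical assignment log: one row per user.
assignments = pd.DataFrame({
    "variant": ["A", "B", "A", "B", "A", "A", "B", "A"],
    "browser": ["chrome", "chrome", "safari", "safari",
                "chrome", "safari", "chrome", "safari"],
})

def srm_p_value(counts, expected_share=0.5):
    """Chi-square goodness-of-fit p-value for a two-variant split."""
    total = counts.sum()
    expected = [total * expected_share, total * (1 - expected_share)]
    return chisquare(f_obs=counts, f_exp=expected).pvalue

# Check the overall split first, then each browser segment separately.
overall = assignments["variant"].value_counts().reindex(["A", "B"], fill_value=0)
print("overall:", srm_p_value(overall.to_numpy()))

for browser, group in assignments.groupby("browser"):
    counts = group["variant"].value_counts().reindex(["A", "B"], fill_value=0)
    print(browser + ":", srm_p_value(counts.to_numpy()))
```

An SRM that appears only in one segment (for example, a single browser) points to a technical cause in that segment rather than a flaw in the randomizer itself.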
SRM vs. Natural Variation
It's important to distinguish between statistically significant SRM and natural variation in sample distribution:
Natural Variation
Small differences in allocation (e.g., 50.5% vs 49.5%) usually fall within expected statistical variation.
Significant SRM
Larger, statistically significant differences (e.g., 55% vs 45%) likely indicate an underlying issue. Note that whether a given percentage split is statistically significant depends heavily on the total sample size, as the sketch below shows.
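The following sketch runs the same 50.5%/49.5% split at two illustrative sample sizes:

```python
from scipy.stats import chisquare

# The same percentage imbalance at two (illustrative) sample sizes.
for n in (2_000, 200_000):
    observed = [round(n * 0.505), round(n * 0.495)]
    expected = [n * 0.5, n * 0.5]
    p = chisquare(f_obs=observed, f_exp=expected).pvalue
    print(f"n = {n}: observed = {observed}, p-value = {p:.4g}")
```

At n = 2,000 the split is unremarkable (p ≈ 0.65), while at n = 200,000 the identical percentages are strong evidence of SRM (p ≈ 8e-6).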
Impact on Business Decisions
Ignoring SRM can lead to costly business mistakes. Consider these scenarios:
- False positives - Incorrectly concluding a variation is better when it's not
- False negatives - Missing actual improvements due to biased data
- Wasted resources - Making changes based on invalid test results
- Repeated errors - Propagating flawed experiment designs in future tests
Use our Sample Ratio Mismatch Calculator to quickly determine if your experiment has a statistically significant SRM. Simply input your expected ratio, observed ratio, and sample size to get an immediate assessment.
Advanced SRM Considerations
For more complex experimental designs, consider these additional factors:
- Users vs. Sessions - Always check SRM at the user level first, as session-level analysis can be misleading
- Multi-variant testing - Apply SRM checks to all variants, not just a single comparison; one chi-square test covers any number of groups (see the sketch after this list)
- Time-based analysis - Track SRM patterns over time to detect issues that might appear after experiment launch
- Cross-platform consistency - Ensure consistent assignment across different platforms and devices
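For a multi-variant experiment, the same chi-square test covers all groups at once. A minimal sketch with illustrative counts for a three-variant test and an intended 40/30/30 split:

```python
from scipy.stats import chisquare

# Illustrative counts for a three-variant test with an intended 40/30/30 split.
observed = [4150, 2980, 2870]
shares = [0.40, 0.30, 0.30]

total = sum(observed)
expected = [total * s for s in shares]

result = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {result.statistic:.2f}, p-value = {result.pvalue:.4f}")
```

With these counts the p-value is about 0.003, so the allocation as a whole shows SRM even though each share is within a few points of its target. A failed joint test is usually followed by inspecting each variant's deviation to locate the problem.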
Conclusion
Sample Ratio Mismatch is more than a statistical anomaly: it is a critical indicator of experiment health. By understanding, detecting, and addressing SRM, you can protect the validity of your experiments and the reliability of your business decisions. Remember that while small random imbalances occur naturally, persistent or statistically significant SRM requires investigation and resolution to maintain data integrity.
What is Sample Ratio Mismatch?
Sample Ratio Mismatch (SRM) occurs when the observed ratio of samples in different groups significantly differs from the expected ratio. This can indicate issues with randomization or data collection in experiments.
- Indicates potential randomization issues
- Can affect experiment validity
- Should be monitored in A/B tests
- Requires statistical testing
Detecting SRM
- Chi-Square Test - Most common method
- Z-Test - For large samples
- Visual Inspection - Initial screening
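For large samples, a z-test on the observed share gives the same answer as the chi-square test (for two groups, z² equals the chi-square statistic). A minimal sketch, with illustrative counts:

```python
import math
from scipy.stats import norm

# Two-variant test with an intended 50/50 split; counts are illustrative.
n_a, n_b = 5180, 4820
n = n_a + n_b
p0 = 0.5  # expected share for variant A

# One-sample proportion z-test (normal approximation to the binomial).
z = (n_a - n * p0) / math.sqrt(n * p0 * (1 - p0))
p_value = 2 * norm.sf(abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.4g}")
```

Here z = 3.60 and p ≈ 0.0003, flagging the 51.8%/48.2% split as a likely SRM.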
Interpreting Results
Interpretation Guidelines
- p-value < α: Significant mismatch
- p-value ≥ α: No significant mismatch
- Consider sample size impact
- Check for systematic bias
Common Examples
Example 1: No Significant Mismatch
Expected: 0.5, Observed: 0.48, n=1000
Result: Not significant (p > 0.05)
Example 2: Significant Mismatch
Expected: 0.5, Observed: 0.35, n=1000
Result: Significant (p < 0.05)
Example 3: Small Sample Size
Expected: 0.5, Observed: 0.45, n=100
Result: Not significant (p > 0.05)
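These three examples can be reproduced with the chi-square test described above; a minimal sketch (counts are derived from the stated shares):

```python
from scipy.stats import chisquare

# (expected share, observed share, total sample size) from the examples above.
examples = [
    (0.5, 0.48, 1000),  # Example 1
    (0.5, 0.35, 1000),  # Example 2
    (0.5, 0.45, 100),   # Example 3
]

for i, (exp_share, obs_share, n) in enumerate(examples, start=1):
    observed = [round(n * obs_share), n - round(n * obs_share)]
    expected = [n * exp_share, n * (1 - exp_share)]
    p = chisquare(f_obs=observed, f_exp=expected).pvalue
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"Example {i}: p = {p:.4g} ({verdict})")
```

Running this gives p ≈ 0.21, p ≈ 2e-21, and p ≈ 0.32 respectively, matching the stated results. Note that Examples 1 and 3 have similar deviations yet both stay non-significant; the smaller sample in Example 3 simply has less power to detect a real mismatch.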