Analyze experiment
Type in aggregated stats or upload a CSV — for conversion rates or continuous metrics, with optional stratification.
Frequently asked questions
What is statistical significance in an A/B test?
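Statistical significance means the observed difference between your control and variant would be unlikely if there were no true difference between them. It's typically measured with a p-value: the probability of seeing a difference at least as large as the observed one when the null hypothesis is true. A p-value below your chosen threshold (commonly 0.05) indicates the result is statistically significant.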
How is the p-value calculated for a conversion-rate A/B test?
For binary conversion data, AB SHARK uses a two-proportion z-test by default. The test statistic compares the observed difference in conversion rates against the standard error under the null hypothesis of equal rates, and the p-value is the tail probability of seeing a result at least as extreme.
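As a rough illustration of the test described above, here is a minimal Python sketch of a two-sided two-proportion z-test with a pooled standard error. The function name, argument names, and example counts are hypothetical, and this is not necessarily how AB SHARK implements it internally.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test using a pooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis of equal conversion rates.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: probability of |Z| at least as large as observed.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 200/2000 conversions in control, 250/2000 in variant.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```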
When should I stop my A/B test?
Use a sample-size calculator to fix your sample size before you start, then run the test until you reach it. Peeking at results and stopping as soon as a test crosses p < 0.05 inflates your false-positive rate. Use AB SHARK's planner to compute the required n given your baseline rate, minimum detectable effect (MDE), alpha, and power.
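For a sense of how the required n depends on those inputs, here is a sketch of one common normal-approximation formula for comparing two proportions. It is an assumption that this resembles what a planner computes; the function name and parameters are hypothetical, and the MDE is taken as an absolute difference in rates.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate per-arm n for a two-sided test of two proportions."""
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / mde_abs ** 2)


# Hypothetical plan: 10% baseline rate, detect a 2-point absolute lift.
n = sample_size_per_arm(baseline=0.10, mde_abs=0.02)
print(n)
```

Halving the MDE roughly quadruples the required sample size, which is why small effects need long tests.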
What is a confidence interval in A/B testing?
A confidence interval is a range of plausible values for the true effect size (such as conversion lift). A 95% confidence interval means that if the experiment were repeated many times, 95% of such intervals would contain the true effect.
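As an illustration, the sketch below computes a Wald (normal-approximation) confidence interval for the absolute difference in conversion rates. The Wald interval is one common choice, assumed here for simplicity; the function name and example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald interval for the difference in conversion rates (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Unpooled standard error: each arm contributes its own variance.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical data: 200/2000 conversions in control, 250/2000 in variant.
low, high = diff_confidence_interval(200, 2000, 250, 2000)
print(f"95% CI for the lift: [{low:.4f}, {high:.4f}]")
```

An interval that excludes zero corresponds to a statistically significant difference at the matching alpha level.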
Is AB SHARK free to use?
Yes. AB SHARK is a free, open-source A/B test analyzer. There's no signup, no account, and no usage limits. You can run as many analyses as you want directly in your browser.
Can AB SHARK analyze continuous metrics like revenue per user?
Yes. Upload per-visitor numeric data or paste summary statistics (mean, standard deviation, n) and AB SHARK runs Welch's t-test, returning the mean difference, confidence interval, and p-value. Variance uses the Bessel-corrected sample estimate (dividing by n - 1), so the variance estimate is not biased downward in small samples.
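The core of Welch's t-test can be sketched from summary statistics alone. The Python below computes the mean difference, the t-statistic, and the Welch-Satterthwaite degrees of freedom. To stay within the standard library, the p-value uses a normal approximation, which is an assumption that is only reasonable for large samples; an exact result would use the t distribution. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def welch_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t-statistic and degrees of freedom from summary statistics."""
    # Per-arm variance of the mean; sd is the Bessel-corrected sample SD.
    var_a = sd_a ** 2 / n_a
    var_b = sd_b ** 2 / n_b
    diff = mean_b - mean_a
    t_stat = diff / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    # Large-sample normal approximation to the two-sided p-value (assumption).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return diff, t_stat, df, p_approx


# Hypothetical revenue-per-user summary for two arms of 500 users each.
diff, t_stat, df, p_approx = welch_from_summary(10.0, 4.0, 500, 10.6, 4.5, 500)
print(f"diff = {diff:.2f}, t = {t_stat:.3f}, df = {df:.1f}, p ~ {p_approx:.4f}")
```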
What is post-stratification and when should I use it?
Post-stratification reweights each variant's per-stratum estimates by the pooled distribution of a stratifying variable (e.g., device, country, user tenure). It reduces variance when strata are predictive of the outcome and removes bias when arms have imbalanced strata distributions. AB SHARK supports it for both binary and continuous outcomes: provide a stratum column in your CSV.
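A small sketch can make the reweighting concrete. The example below uses hypothetical counts in which both arms convert identically within each stratum, but the arms have different device mixes, so the naive rates differ while the post-stratified rates agree. The data layout and function name are assumptions for illustration only.

```python
def post_stratified_rates(strata):
    """strata maps stratum name -> {"A": (conversions, n), "B": (conversions, n)}."""
    total = sum(n for arms in strata.values() for _, n in arms.values())
    estimates = {"A": 0.0, "B": 0.0}
    for arms in strata.values():
        # Weight = this stratum's share of all users, pooled across arms.
        weight = sum(n for _, n in arms.values()) / total
        for arm, (conv, n) in arms.items():
            estimates[arm] += weight * conv / n
    return estimates


# Hypothetical data: 10% conversion on desktop and 5% on mobile in both arms,
# but arm A is mostly desktop and arm B is mostly mobile.
strata = {
    "desktop": {"A": (100, 1000), "B": (30, 300)},
    "mobile": {"A": (15, 300), "B": (50, 1000)},
}
naive_a = (100 + 15) / (1000 + 300)
naive_b = (30 + 50) / (300 + 1000)
adjusted = post_stratified_rates(strata)
print(f"naive: A={naive_a:.4f}, B={naive_b:.4f}")
print(f"post-stratified: A={adjusted['A']:.4f}, B={adjusted['B']:.4f}")
```

The naive comparison suggests arm A is better purely because of its device mix; weighting both arms by the same pooled mix removes that artifact.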
How is the sample size for a continuous-metric A/B test calculated?
AB SHARK uses a Welch-style sample-size formula that takes the baseline standard deviation, the minimum detectable effect (MDE) in absolute or relative units, alpha, and target power. The planner returns the per-arm sample size and (optionally) the duration in days from a daily-traffic input.
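The sketch below shows one standard normal-approximation formula for a two-sided comparison of means, assuming both arms share the baseline standard deviation and the MDE is given in absolute units. Whether the planner uses exactly this form is an assumption; the function names and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def continuous_sample_size(sd, mde_abs, alpha=0.05, power=0.80):
    """Approximate per-arm n, assuming equal variance in both arms."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / mde_abs ** 2)


def duration_days(n_per_arm, daily_traffic, arms=2):
    """Days needed to fill all arms from total daily traffic."""
    return ceil(arms * n_per_arm / daily_traffic)


# Hypothetical plan: SD of 20.0, detect an absolute difference of 1.0.
# A relative MDE would first be converted: mde_abs = baseline_mean * mde_rel.
n = continuous_sample_size(sd=20.0, mde_abs=1.0)
days = duration_days(n, daily_traffic=1000)
print(n, days)
```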