A/B test analysis framework

A reusable framework for analyzing controlled experiments using frequentist, Bayesian, and sequential methods, with multiple comparison correction and power analysis.

Bayesian inference | Sequential testing | Hypothesis testing | Power analysis | Experimentation | Product analytics

Interactive dashboard

Five-page Streamlit application for end-to-end experiment analysis

Overview

Three experiments with conversion and revenue metrics
Summary table with significance decisions
Forest plot of effect sizes

Frequentist

Z-test for proportions and Welch's t-test
Confidence intervals and effect sizes
Multiple comparison correction (Bonferroni, Holm, BH)
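
As a rough illustration of the proportion test and effect size listed above, the sketch below implements a pooled two-sided z-test and Cohen's h using only the Python standard library. The function names and the counts are hypothetical and are not taken from the project's code or data.

```python
from math import asin, sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled SE)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under H0 both arms share one rate, so the standard error is pooled.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def cohens_h(p_a, p_b):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * asin(sqrt(p_b)) - 2 * asin(sqrt(p_a))


# Hypothetical counts, chosen only for illustration.
z, p = two_proportion_ztest(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
h = cohens_h(0.10, 0.13)
print(f"z = {z:.3f}, p = {p:.4f}, Cohen's h = {h:.3f}")
```

Cohen's h is reported alongside the p-value because it measures the size of the difference independently of sample size.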

Bayesian

Beta-Binomial posterior distributions
P(B > A) and expected lift with credible intervals
Posterior sampling with 100K draws
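
A minimal sketch of the Beta-Binomial calculation described above, assuming a flat Beta(1, 1) prior on each arm's conversion rate. It uses the standard library's `random.betavariate` for portability; a vectorized sampler would be the more usual choice in practice. The function name, seed, and counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=0):
    """Monte Carlo estimate of P(B > A), mean relative lift, and a 95% credible interval."""
    rng = random.Random(seed)
    wins = 0
    lifts = []
    for _ in range(draws):
        # Conjugate update of a Beta(1, 1) prior: Beta(1 + successes, 1 + failures).
        a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += b > a
        lifts.append(b / a - 1)
    lifts.sort()
    # Equal-tailed 95% credible interval from the sorted posterior draws.
    interval = (lifts[int(0.025 * draws)], lifts[int(0.975 * draws)])
    return wins / draws, sum(lifts) / draws, interval


# Hypothetical counts, chosen only for illustration.
prob, lift, interval = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"P(B > A) = {prob:.3f}, expected lift = {lift:.1%}")
```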

Power calculator

Sample size estimation given baseline, MDE, alpha, power
Interactive sensitivity curves
MDE vs. sample size trade-off chart
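
The sample size estimate can be sketched with the standard normal-approximation formula for comparing two proportions. This version assumes the MDE is an absolute difference in conversion rate and that the test is two-sided; the project may define either differently. The function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.80):
    """Per-arm n needed to detect an absolute lift `mde` over `baseline`."""
    p1, p2 = baseline, baseline + mde
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde ** 2)


# Hypothetical inputs: 10% baseline, 2 percentage point MDE.
n = sample_size_per_arm(baseline=0.10, mde=0.02)
print(n)
```

Because the required n scales with 1 / MDE², halving the MDE roughly quadruples the sample size, which is the trade-off the chart above visualizes.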

Report

Structured summary for each experiment with frequentist and Bayesian verdicts
Sequential monitoring with O'Brien-Fleming spending function boundaries
Exportable results table with all metrics, p-values, and credible intervals
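
To illustrate the sequential monitoring item above, the sketch below evaluates the Lan-DeMets O'Brien-Fleming-type alpha spending function, α(t) = 2(1 − Φ(z_{1−α/2} / √t)), at a set of interim looks. This is an assumption about which spending form is meant, and the look schedule is hypothetical. The sketch stops at the spending schedule: turning spent alpha into z-score boundaries needs numerical integration over the joint distribution of the interim statistics, which is not shown here.

```python
from math import sqrt
from statistics import NormalDist


def obf_alpha_spent(information_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t in (0, 1]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(information_fraction)))


# Hypothetical schedule: four equally spaced looks.
looks = [0.25, 0.50, 0.75, 1.00]
cumulative = [obf_alpha_spent(t) for t in looks]
# Alpha newly available at each look is the increment in cumulative spend.
incremental = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

for t, c, inc in zip(looks, cumulative, incremental):
    print(f"t = {t:.2f}  cumulative alpha = {c:.5f}  incremental = {inc:.5f}")
```

Almost no alpha is spent at the early looks, so early stopping requires very strong evidence and the final analysis keeps nearly the full nominal alpha.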

Launch live app

$ pip install -r requirements.txt && streamlit run app.py

Key results

Three controlled experiments, each analyzed with both frequentist and Bayesian methods

3

Experiments analyzed

p<.001

Email subject line test

0.996

P(B > A), website redesign

2 / 3

Significant experiments

Methodology

The framework analyzes three experiments (website redesign, pricing change, email subject line) with conversion and revenue metrics. The frequentist path uses z-tests for proportions and Welch's t-tests for continuous outcomes, reports Cohen's h and Cohen's d as effect sizes, and corrects for multiple comparisons with the Bonferroni, Holm, and Benjamini-Hochberg (FDR) procedures. The Bayesian path uses a Beta-Binomial conjugate model with uninformative priors and draws 100K posterior samples to estimate P(B > A) and expected lift. Sequential monitoring applies O'Brien-Fleming spending function boundaries, so interim looks can stop an experiment early while the overall type I error rate stays controlled.
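
The three multiple comparison corrections named above can be sketched as adjusted p-value calculations in plain Python. The function names and the example p-values are hypothetical, and the project may rely on a library routine instead.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Step-down Holm adjustment (controls the family-wise error rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiplier shrinks from m to 1; the running max keeps the result monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Step-up Benjamini-Hochberg adjustment (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, pvals[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, not the project's results.
pvals = [0.001, 0.02, 0.30]
print(bonferroni(pvals), holm(pvals), benjamini_hochberg(pvals))
```

Holm is never less powerful than Bonferroni at the same family-wise error rate, while Benjamini-Hochberg controls a weaker criterion (FDR) and therefore tends to reject more hypotheses.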

Power analysis

Sample size, MDE, alpha, power

Frequentist tests

Z-test, t-test, correction

Bayesian inference

Beta-Binomial, posterior, P(B>A)

Decision report

Sequential monitoring, verdicts