Bayesian A/B Testing
Instead of 'is this significant?' it answers 'what's the chance B beats A, and by how much?' — the question teams actually have.
- Term: Bayesian A/B Testing
- Reports: P(variant best), expected loss, credible intervals
- Contrast: Frequentist p-value / significance
- Strength: Intuitive outputs, graceful early reads
Definition in plain terms
Bayesian A/B testing frames an experiment as updating beliefs with evidence: a prior is combined with the observed data to produce a posterior distribution, and the test reports intuitive quantities read off that posterior: the probability that variant B beats A, the expected size of the difference, and the 'expected loss' (the average amount given up if the chosen variant turns out to be the worse one). It answers the question teams actually ask ('what's the chance B is better, and by how much?') rather than the frequentist question ('how surprising would this data be if there were no difference?').
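As a rough illustration of those three quantities, here is a minimal sketch for conversion-rate tests. It assumes a Beta-Binomial model with a flat Beta(1, 1) prior and estimates everything by Monte Carlo sampling; the function name `compare_variants`, the visitor counts, and the number of draws are all hypothetical, and real platforms may compute these figures differently.

```python
import random


def compare_variants(conv_a, n_a, conv_b, n_b, prior=(1.0, 1.0),
                     draws=50_000, seed=0):
    """Summarize a two-variant conversion test from posterior samples.

    Assumes each variant's conversion rate has a Beta(prior) prior, so the
    posterior is Beta(prior_alpha + conversions, prior_beta + non-conversions).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    alpha0, beta0 = prior
    b_wins = 0
    diff_sum = 0.0
    loss_ship_a = 0.0  # rate given up when we ship A but B was better
    loss_ship_b = 0.0  # rate given up when we ship B but A was better
    for _ in range(draws):
        rate_a = rng.betavariate(alpha0 + conv_a, beta0 + n_a - conv_a)
        rate_b = rng.betavariate(alpha0 + conv_b, beta0 + n_b - conv_b)
        if rate_b > rate_a:
            b_wins += 1
        diff_sum += rate_b - rate_a
        loss_ship_a += max(rate_b - rate_a, 0.0)
        loss_ship_b += max(rate_a - rate_b, 0.0)
    return {
        "p_b_beats_a": b_wins / draws,
        "expected_diff": diff_sum / draws,
        "expected_loss_ship_a": loss_ship_a / draws,
        "expected_loss_ship_b": loss_ship_b / draws,
    }


# Made-up data: A converts 120 of 2,400 visitors, B converts 150 of 2,400.
result = compare_variants(conv_a=120, n_a=2400, conv_b=150, n_b=2400)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The two expected-loss figures are in conversion-rate units, so they can be compared directly against whatever difference the team considers too small to matter.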
The mechanics
The practical contrasts with frequentist testing: Bayesian outputs are directly interpretable (a 95% probability-to-beat says, given the model and prior, what people often wrongly take a p-value to say), early looks and ongoing monitoring are handled more gracefully (the posterior simply updates as data arrives, with less of the rigid peeking penalty, though stopping at the first favorable reading can still inflate wrong calls, so decision rules still matter), and 'expected loss' supports risk-based stopping. The honest caveats: the prior is a real choice (a strong prior sways small samples, so it is usually set weak or uninformative), Bayesian methods aren't immune to bias or underpowering (a confident-looking posterior on tiny data is still tiny data), and the framework is a different lens, not a license to skip rigor. With adequate data and weak priors, both schools, done well, usually reach the same decision.
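To make the caveat about priors concrete, the sketch below compares posterior means under a weak and a strong prior, first on a tiny sample and then on a large one. It assumes the same Beta-Binomial setup commonly used for conversion rates; the helper `posterior_mean` and both priors are hypothetical choices for illustration only.

```python
def posterior_mean(prior_alpha, prior_beta, conversions, visitors):
    """Mean of the Beta posterior after observing conversions out of visitors."""
    return (prior_alpha + conversions) / (prior_alpha + prior_beta + visitors)


weak_prior = (1, 1)       # flat: worth about 2 pseudo-visitors
strong_prior = (50, 950)  # hypothetical: worth 1,000 pseudo-visitors at 5%

# Tiny sample: 3 conversions from 10 visitors (30% observed).
small_weak = posterior_mean(*weak_prior, 3, 10)      # 4/12, about 0.333
small_strong = posterior_mean(*strong_prior, 3, 10)  # 53/1010, about 0.052

# Large sample: 3,000 conversions from 10,000 visitors (still 30% observed).
large_weak = posterior_mean(*weak_prior, 3000, 10000)      # about 0.300
large_strong = posterior_mean(*strong_prior, 3000, 10000)  # about 0.277

print(f"10 visitors:     weak={small_weak:.3f}  strong={small_strong:.3f}")
print(f"10,000 visitors: weak={large_weak:.3f}  strong={large_strong:.3f}")
```

On ten visitors the strong prior almost entirely decides the answer; on ten thousand the data dominates, although a prior this strong still pulls the estimate down slightly.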
When it matters
Bayesian testing fits teams that want decision-shaped outputs (probability and expected loss map cleanly onto 'ship or not'), continuous-monitoring contexts, and stakeholders who misread p-values (the Bayesian number means what they think it means). Frequentist methods remain the regulated-research default and many platforms' native mode. The mature stance is method-agnostic: the discipline — adequate data, honest priors or honest alpha, pre-set decision rules, effect sizes that matter — outranks the school. Pick the lens whose outputs your team will read correctly.
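One way to picture a pre-set decision rule is as a small function agreed on before the test starts. The sketch below is an assumption-laden illustration, not a standard: the `DecisionRule` fields, the `decide` helper, and every threshold shown are hypothetical and would need to be chosen by the team for its own traffic and risk tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRule:
    """Thresholds fixed before the experiment starts (all values hypothetical)."""
    min_visitors_per_arm: int  # guard against acting on tiny samples
    min_prob_to_beat: float    # required probability that the winner is better
    max_expected_loss: float   # tolerated loss, in conversion-rate units


def decide(rule, visitors_per_arm, p_b_beats_a,
           expected_loss_ship_b, expected_loss_ship_a):
    """Map posterior summaries onto 'ship B', 'keep A', or 'keep collecting'."""
    if visitors_per_arm < rule.min_visitors_per_arm:
        return "keep collecting"
    if (p_b_beats_a >= rule.min_prob_to_beat
            and expected_loss_ship_b <= rule.max_expected_loss):
        return "ship B"
    if (1 - p_b_beats_a >= rule.min_prob_to_beat
            and expected_loss_ship_a <= rule.max_expected_loss):
        return "keep A"
    return "keep collecting"


rule = DecisionRule(min_visitors_per_arm=2000,
                    min_prob_to_beat=0.95,
                    max_expected_loss=0.001)

print(decide(rule, 2400, 0.97, 0.0001, 0.0126))  # ship B
print(decide(rule, 500, 0.99, 0.0001, 0.0200))   # keep collecting
```

Writing the rule down first is what keeps continuous monitoring honest: the posterior can be checked at any time, but the thresholds do not move once results start arriving.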
Origin & history
Bayesian inference descends from Thomas Bayes' 18th-century theorem (published posthumously, 1763); its application to online A/B testing was popularized in the 2010s by experimentation platforms (VWO's Bayesian engine, 2015) and writers seeking more interpretable, peeking-tolerant alternatives to frequentist significance.
Common questions
- What is Bayesian A/B testing?
- An approach reporting the probability one variant beats another and the expected cost of choosing wrong — updating beliefs with data.
- How does it differ from frequentist testing?
- It outputs directly interpretable probabilities and expected loss, rather than p-values and significance against a null hypothesis.
- Is Bayesian testing better?
- Different, not automatically better — its outputs are more intuitive, but it still needs adequate data, honest priors, and pre-set decision rules.
Resources & people to follow
- Book: Trustworthy Online Controlled Experiments, by Kohavi, Tang & Xu
- Reference: VWO / Dynamic Yield, Bayesian engine documentation
- Reference: RGM analysis (method-agnostic rigor beats school loyalty)
Related training
- Module: CRO & experimentation