Background and Context. Probability (p) values are widely used in social science research and evaluation. However, as summarized in a recent statement from the American Statistical Association, they are often misused or misinterpreted. Statistical experts have noted that relying on “bright lines,” such as p < .05, may lead to misinterpretation of results. In addition, p-values alone do not provide information about the size of an effect; they do not indicate the probability that a given hypothesis is true; and they can be misconstrued when researchers are not transparent about reporting all the analyses they have conducted (e.g., p-hacking or cherry-picking). Despite cautions against overreliance on p-values, many researchers lack the knowledge and training to confidently implement alternative statistical approaches, even when a p-value is not the most appropriate tool for a given research question or dataset, or when a more efficient and effective approach is available.
Recent computational and software advances have made Bayesian methods the primary alternative to p-values. Bayesian methods use probabilistic inference (e.g., “there is a 90% chance that the program reduced costs”) rather than a strict cutoff value like p < .05. This approach can be more meaningful and intuitive, particularly to lay audiences without formal statistical training. Bayesian model-building is a transparent process that requires researchers to specify their decisions and make them available for others to evaluate. Finally, Bayesian methods provide an opportunity to incorporate and quantify different types of uncertainty (e.g., model uncertainty, researcher uncertainty), whereas frequentist approaches, such as p-values, account only for sampling uncertainty.
In addition to addressing some of the limitations of p-values, Bayesian methods have other advantages. For example, they can be used to improve precision in studies with small samples and low power. Bayesian methods also allow researchers to incorporate existing information (either from prior studies or from the dataset at hand) and to update conclusions as new data become available.
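As a hypothetical illustration of this updating process (not drawn from the meeting materials), a simple conjugate Beta-Binomial model shows how a prior belief about a program's success rate is revised as new outcome data arrive; all numbers below are made up for the sketch:

```python
from math import sqrt

# Hypothetical example: estimating a program's success rate.
# Prior belief: a weakly informative Beta(2, 2), centered at 0.5.
a, b = 2.0, 2.0

# First (invented) study: 18 successes, 12 failures among 30 participants.
successes, failures = 18, 12
a, b = a + successes, b + failures  # conjugate Beta-Binomial update

# A later (invented) study adds 40 observations: 25 successes, 15 failures.
# The posterior from the first study serves as the prior for the second.
a, b = a + 25, b + 15

# Posterior mean and standard deviation of the success rate.
mean = a / (a + b)
var = (a * b) / ((a + b) ** 2 * (a + b + 1))
print(round(mean, 3), round(sqrt(var), 3))
```

Because the Beta prior is conjugate to the Binomial likelihood, each update is just arithmetic on the counts, which makes the "prior plus data yields posterior" logic easy to demonstrate to non-statistical audiences.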
Bayesian methods also pose challenges, including intensive model-building and model-checking processes, selection of appropriate priors, and translation of probabilistic inferences into yes/no decisions. It is essential to understand the underlying assumptions, tradeoffs, validity, and generalizability of results in a Bayesian framework, and the circumstances under which it may be more appropriate than a frequentist approach.
Meeting Topics and Goals. OPRE’s 2017 research methods meeting encouraged attendees to question their assumptions around traditional frequentist approaches and explore the leading alternative, Bayesian methods. Speakers shared their expertise in understanding when and when not to use p-values, the ideas underlying Bayesian methods and associated advantages and drawbacks, examples of successful applications of Bayesian analysis in social science research and evaluation, and using Bayesian approaches for decision-making as part of a broader conversation about responsibly communicating research findings. The meeting included presentations and discussions on the following questions:
The goals of the meeting were to:
9:00 – 9:30
Naomi Goldstein (Deputy Assistant Secretary for the Office of Planning, Research, and Evaluation)
9:30 – 10:45
Slide Deck: Doctor, It Hurts When I p
Ronald Wasserstein (American Statistical Association)
Slide Deck: A Brief Introduction to Bayesian Statistics
David Kaplan (University of Wisconsin–Madison)
11:00 – 12:00
Slide Deck: Plausible Priors Precede Persuasive Posteriors
Mariel Finucane (Mathematica Policy Research)
Slide Deck: Bayesian Model Specification (Or At Least Some of What Can Be Said About This Topic in 25 Minutes)
David Draper (University of California–Santa Cruz)
1:30 – 2:15
Slide Deck: The Failure of Null Hypothesis Significance Testing When Studying Incremental Changes – What to Do About It?
Andrew Gelman (Columbia University)
2:15 – 3:15
Slide Deck: Making Bayesian Analyses Accessible through Visualization: Case Study, Meta–Evaluation of the Health Care Innovation Awards
Nikki Freeman (RTI International)
Slide Deck: Applications: Health Care Provider Performance Assessment
Susan Paddock (RAND Corporation)
How can we use Bayesian methods to better support decision making? How can researchers and policymakers responsibly communicate research findings, regardless of statistical methodology?
Stuart Buck (Laura and John Arnold Foundation), moderator
Donald Berry (MD Anderson Cancer Center, University of Texas)
Gregory Campbell (GCStat Consulting LLC)
Timothy Day (Center for Medicare and Medicaid Innovation)
Jacob Alex Klerman (Abt Associates)
Slide Deck: The Right Tool for the Job: A Bayesian Meta-Regression of Employment and Training Studies
Lauren Vollmer (Mathematica Policy Research)
Slide Deck: On the Utility of Bayesian Model Averaging for Optimizing Prediction: Two Case Studies
David Kaplan (University of Wisconsin–Madison)
Slide Deck: Bayesian Inference for Sample Surveys
Trivellore Raghunathan (University of Michigan)
What can research funders do to facilitate the use of Bayesian methods? How can Bayesian results be incorporated into evidence reviews? What are specific challenges to using Bayesian methods in social policy research? How can we introduce Bayesian approaches to unfamiliar audiences?
Robin Ghertner (Office of the Assistant Secretary for Planning and Evaluation), moderator
Scott Cody (Project Evident)
Molly Irwin (Department of Labor, Chief Evaluation Office)
Renee Mentnech (Center for Medicare and Medicaid Innovation)
David Rindskopf (CUNY Graduate Center)
Moving Beyond Statistical Significance: The BASIE (BAyeSian Interpretation of Estimates) Framework for Interpreting Findings from Impact Evaluations