Building Strong Evidence in Challenging Contexts: Alternatives to Traditional Randomized Controlled Trials

Meeting Topic

Randomized controlled trials (RCTs) are considered the “gold standard” for estimating program impacts. In the social sciences, traditional RCTs usually involve randomly assigning participants (or other units) to either an intervention group, which receives the intervention, or a comparison group, which does not. The strong preference for RCTs stems from the fact that the counterfactual condition can be clearly defined, and creating the counterfactual through randomization allows for causal inference.
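
To illustrate this basic logic, the sketch below uses simulated data with hypothetical values (it is not drawn from any study discussed at the meeting): because assignment is random, the comparison group's average outcome stands in for the counterfactual, and the impact can be estimated as a simple difference in mean outcomes.

```python
# A minimal, hypothetical sketch (simulated data only) of the core RCT logic:
# random assignment makes the comparison group's mean outcome a valid estimate
# of the counterfactual, so the impact is estimated as a difference in means.
import numpy as np

rng = np.random.default_rng(seed=0)
n = 1_000              # number of simulated participants (assumed even)
true_impact = 2.0      # assumed true program effect

# Randomly assign half of the participants to the intervention group.
treated = rng.permutation(np.repeat([1, 0], n // 2))

# Simulate outcomes: a common baseline plus the effect for treated participants.
baseline = rng.normal(loc=10.0, scale=3.0, size=n)
outcome = baseline + true_impact * treated

# Difference in mean outcomes between the two randomized groups.
impact_estimate = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"Estimated impact: {impact_estimate:.2f} (true impact: {true_impact})")
```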

In some cases, however, a traditional RCT may not be the most appropriate design, either because it does not truly address the research question or because it presents challenges that make the desired research infeasible to carry out. For example, there may be very few cases to randomize (e.g., small, specialized target populations, or community-level and place-based interventions), there may be ethical concerns about withholding services from a portion of a population, or community leadership may not agree to randomization. Unfortunately, the heavy reliance on traditional RCTs for assessing program impacts can result in research disparities, with little rigorous evidence on populations that are difficult to study but in great need of evidence-based services, or on promising interventions that are not suitable for RCTs.

If we wish to rigorously assess program impacts in these contexts, one challenge is identifying or creating an appropriate counterfactual condition and a method for assigning participants or other study units to it. Some alternatives to traditional RCTs do involve random assignment (e.g., stepped wedge design, single case design), but the method and timing of randomization are different and create new considerations for understanding the treatment differential and interpreting the results. Other options are non-experimental and capitalize on other sources of variation (e.g., comparative interrupted time series, regression discontinuity designs); each of these approaches requires careful decisions in selecting the counterfactual condition and ensuring that it will result in a fair comparison group. Finally, other alternatives involve using statistical analysis (e.g., Bayesian statistics, simulations) to maximize what we can learn from the available information.
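
As one concrete example of a non-experimental alternative, the sketch below illustrates a sharp regression discontinuity design using simulated data and a hypothetical eligibility cutoff (again, not based on any study presented at the meeting): assignment to the program is determined by whether a score crosses the cutoff, and the impact is estimated as the jump in outcomes at that cutoff.

```python
# A hypothetical sketch (simulated data) of a sharp regression discontinuity
# design: units receive the program when a score crosses a cutoff, and the
# impact is estimated as the discontinuity in outcomes at that cutoff.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=1)
n = 2_000
cutoff = 50.0
score = rng.uniform(0, 100, size=n)          # assignment (running) variable
treated = (score >= cutoff).astype(float)    # sharp assignment rule
true_impact = 4.0

# Outcomes vary smoothly with the score, plus the program effect above the cutoff.
outcome = 20 + 0.3 * score + true_impact * treated + rng.normal(0, 5, size=n)

# Local linear regression within a bandwidth around the cutoff, allowing
# different slopes on each side; the treatment coefficient is the estimated jump.
bandwidth = 10.0
in_window = np.abs(score - cutoff) <= bandwidth
centered = score[in_window] - cutoff
X = np.column_stack([treated[in_window], centered, treated[in_window] * centered])
X = sm.add_constant(X)
fit = sm.OLS(outcome[in_window], X).fit()
print(f"Estimated impact at the cutoff: {fit.params[1]:.2f} "
      f"(true impact: {true_impact})")
```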

The starting point for any study should be to establish the research questions and then match them to a rigorous design that is best able to address them. Like traditional RCTs, alternative approaches require careful consideration of the counterfactual condition(s), the policy context, and the timing of delivery. These alternative strategies can move us forward in understanding program impacts in settings or populations that are quite challenging to study. To use these approaches appropriately, however, it is essential to understand their underlying assumptions and tradeoffs, the internal validity they can support, and the generalizability of their results.

This meeting focused on understanding what kinds of research questions can be addressed using alternatives to traditional RCTs, the special considerations involved with these types of approaches, and the tradeoffs between using alternative and traditional impact designs and analyses. Speakers shared their experience and knowledge around innovative, applied examples of alternatives to traditional RCTs and the theoretical and statistical models underlying those designs. The meeting included presentations and discussions on the following questions:

  • Understanding alternative designs. What are some alternatives to traditional RCTs for assessing program impacts, when is it appropriate to use them, and what are some strategies to strengthen causal inference when using alternative designs?
  • Counterfactual condition(s). What should be considered with regard to identifying the counterfactual condition(s), and how can methods for establishing those conditions be crafted to maximize the comparability of the treatment and comparison groups?
  • Combining information from multiple studies. What are some approaches to combining limited information to draw conclusions, in cases such as small samples, limited or short-term outcome data, or data from a single community?
  • Working with communities to design evaluations. How can we work with communities to create research designs that balance scientific rigor with community needs and preferences, in order to ensure data integrity, accurate interpretation of the results, and utility of the findings? And how can we leverage these designs to study special populations, such as people in tightly knit communities and highly vulnerable people?
  • Alternative designs and broader research agendas. How can we incorporate alternative designs into broader research agendas, how does the policy context influence the research design selection process, and how can they contribute to improving the evidence base?

The meeting convened federal staff and researchers with an interest in exploring and using alternative research designs and analyses. The ultimate goals of the meeting were to 1) better understand the different types of alternatives to traditional RCTs and their assumptions and tradeoffs; 2) identify the benefits and challenges involved in using such approaches; and 3) promote capacity, utilization, and innovative uses of alternative research designs and analyses.

The meeting was held on September 22 and 23, 2016, at the Holiday Inn – Capitol in Washington, D.C. Participants and speakers came from federal and state government, research firms, and academia.

Agenda and Presentations

Day 1: Thursday, September 22

Welcome and Opening Remarks

9:00 – 9:30

Mark Fucello, Director of the Division of Economic Independence, Office of Planning, Research and Evaluation

Setting the Stage

9:30 – 10:00

Slide Deck: Building strong evidence in challenging contexts 
Nancy Whitesell, University of Colorado, Denver

Roundtable Discussion – Working with Communities to Design Research and Evaluations

10:00 – 11:45
Moderator:
Aleta Meyer, Office of Planning, Research and Evaluation

Panelists:
Domestic violence example: Lisa Goodman (Boston College) and Deborah Heimel (REACH Beyond Domestic Violence, Inc.)
Tribal example: Nancy Whitesell (University of Colorado, Denver) and Alicia Mousseau (University of Colorado, Denver)
Diverse urban populations example: Bowen Chung (University of California, Los Angeles) and Aziza Lucas Wright (Charles R. Drew University of Medicine and Science)

Working with Small Samples

1:15 – 3:00
Moderator:
Nicole Deterding, Office of Planning, Research and Evaluation

Slide Deck: Single case research design in early childhood contexts 
Jennifer Ledford, Vanderbilt University

Slide Deck: Designing controlled trials with the power of optimization 
Nathan Kallus, Cornell University

Slide Deck: Bayesian analysis for small subgroups: Intuitive inferences with heightened precision 
Mariel Finucane, Mathematica Policy Research

Slide Deck: The challenges of developing a counterfactual for place-based initiatives: Promise Zones 
Calvin Johnson, Department of Housing and Urban Development

Alternative Forms of Randomization

3:15 – 5:00
Moderator:
Anna Solmeyer, Office of Planning, Research and Evaluation

Slide Deck: Stepped wedge designs and the Washington state EPT trial 
James Hughes, University of Washington

Slide Deck: The extended family of randomized roll-out designs, including stepped wedge and dynamic wait listed designs 
Hendricks Brown, Northwestern University

Slide Deck: A hybrid double randomized client preference trial: Benefits and challenges 
Gerald August, University of Minnesota

Slide Deck: Leveraging lotteries for school value-added 
Joshua Angrist, Massachusetts Institute of Technology

Day 2: Friday, September 23

When Randomization is Not Possible

9:00 – 10:45
Moderator:
Anupa Bir, RTI International

Slide Deck: Regression discontinuity and extensions 
Coady Wing, Indiana University

Slide Deck: Using simulated instruments for addressing descriptive and causal policy questions 
Vivian Wong, University of Virginia

Slide Deck: A short comparative interrupted time-series analysis of the impacts of Jobs-Plus 
Howard Bloom, MDRC

Slide Deck: Recent advances in non-experimental comparison group designs 
Elizabeth Stuart, Johns Hopkins University

Federal Efforts and Future Directions

11:00 – 12:45
Moderator:
Nicole Constance, Office of Planning, Research and Evaluation

Slide Deck: The Social Innovation Fund: Utilizing a range of research designs 
Lily Zandniapour, Corporation for National and Community Service

Slide Deck: The NIH Collaboratory: Working with grantees and stakeholders to strengthen research 
Elizabeth DeLong, Duke University

Slide Deck: What Works Clearinghouse standards for alternative designs 
Jonathan Jacobson, Department of Education

Slide Deck: Considerations for including non-experimental evidence in systematic reviews 
John Deke, Mathematica Policy Research

Meeting Products

Meeting Briefs and Resource Documents

Building Strong Evidence in Challenging Contexts: Alternatives to Traditional Randomized Controlled Trials


Recommended Reading List
