Randomized controlled trials (RCTs) are considered the “gold standard” for estimating program impacts. In the social sciences, traditional RCTs usually involve randomly assigning participants (or other units) to an intervention or comparison group, so that each unit either receives the intervention or does not. The strong preference for RCTs stems from the fact that the counterfactual condition can be clearly defined, and creating the counterfactual through randomization supports causal inference.
In some cases, however, a traditional RCT may not be the most appropriate design, either because it does not truly address the research question or because practical challenges make it infeasible to carry out the desired research. For example, there may be very few cases to randomize (e.g., small, specialized target populations, or community-level and place-based interventions), there may be ethical concerns about withholding services from a portion of a population, or community leadership may not agree to randomization. Unfortunately, the heavy reliance on traditional RCTs for assessing program impacts can create research disparities, leaving little rigorous evidence on populations that are difficult to study but in great need of evidence-based services, or on promising interventions that are not suited to RCTs.
If we wish to rigorously assess program impacts in these contexts, one challenge is identifying or creating an appropriate counterfactual condition and a method for assigning participants or other study units to it. Some alternatives to traditional RCTs do involve random assignment (e.g., stepped wedge design, single case design), but the method and timing of randomization are different and create new considerations for understanding the treatment differential and interpreting the results. Other options are non-experimental and capitalize on other sources of variation (e.g., comparative interrupted time series, regression discontinuity designs); each of these approaches requires careful decisions in selecting the counterfactual condition and ensuring that it will result in a fair comparison group. Finally, other alternatives involve using statistical analysis (e.g., Bayesian statistics, simulations) to maximize what we can learn from the available information.
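To make the distinction concrete, consider the stepped wedge design mentioned above: every cluster eventually receives the intervention, but the period at which each cluster crosses over from control to treatment is randomized. The brief Python sketch below is illustrative only (the site names, number of periods, and step sizes are hypothetical and not drawn from the meeting materials); it shows one way such a rollout schedule might be generated.

# Minimal sketch of a stepped wedge rollout schedule (hypothetical sites/periods).
import random

clusters = ["site_A", "site_B", "site_C", "site_D", "site_E", "site_F"]
n_periods = 4  # period 0 is an all-control baseline

random.seed(42)
random.shuffle(clusters)  # randomize the order in which clusters cross over

step_size = len(clusters) // (n_periods - 1)  # clusters crossing over at each step
schedule = {}
for i, cluster in enumerate(clusters):
    crossover_period = 1 + i // step_size  # period at which treatment begins
    schedule[cluster] = ["control" if p < crossover_period else "treatment"
                         for p in range(n_periods)]

for cluster, arms in schedule.items():
    print(cluster, arms)

Because each cluster contributes both control and treatment periods, the comparison comes from the staggered timing of crossover rather than from a group that never receives the intervention, which is what creates the new considerations for interpreting the treatment differential noted above.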
The starting point for any study should be to establish the research questions and then match them to a rigorous design that is best able to address them. Like traditional RCTs, alternative approaches require careful consideration of the counterfactual condition(s), policy context, and timing of delivery. These alternative strategies can move us forward in understanding program impacts in settings or populations that are challenging to study. To use them appropriately, however, it is essential to understand their underlying assumptions and tradeoffs, their internal validity, and the generalizability of their results.
This meeting focused on understanding what kinds of research questions can be addressed using alternatives to traditional RCTs, the special considerations involved in these approaches, and the tradeoffs between using alternative and traditional impact designs and analyses. Speakers shared their experience and knowledge around innovative, applied examples of alternatives to traditional RCTs and the theoretical and statistical models underlying those designs. The meeting included presentations and discussions organized around these questions.
The meeting convened federal staff and researchers with an interest in exploring and using alternative research designs and analyses. The ultimate goals of the meeting were to 1) better understand the different types of alternatives to traditional RCTs and their assumptions and tradeoffs; 2) identify the benefits and challenges involved in using such approaches; and 3) promote capacity, utilization, and innovative uses of alternative research designs and analyses.
The meeting was held on September 22 and 23, 2016, at The Holiday Inn – Capitol in Washington, D.C. Participants and speakers came from federal and state government, research firms, and academia.
9:00 – 9:30
Mark Fucello, Director of the Division of Economic Independence, Office of Planning, Research and Evaluation
9:30 – 10:00
Slide Deck: Building strong evidence in challenging contexts
Nancy Whitesell, University of Colorado, Denver
10:00 – 11:45
Moderator
Aleta Meyer, Office of Planning, Research and Evaluation
Panelists:
Domestic violence example: Lisa Goodman (Boston College) and Deborah Heimel (REACH Beyond Domestic Violence, Inc.)
Tribal example: Nancy Whitesell (University of Colorado, Denver) and Alicia Mousseau (University of Colorado, Denver)
Diverse urban populations example: Bowen Chung (University of California, Los Angeles) and Aziza Lucas Wright (Charles R. Drew University of Medicine and Science)
1:15 – 3:00
Moderator
Nicole Deterding, Office of Planning, Research and Evaluation
Slide Deck: Single case research design in early childhood contexts
Jennifer Ledford, Vanderbilt University
Slide Deck: Designing controlled trials with the power of optimization
Nathan Kallus, Cornell University
Slide Deck: Bayesian analysis for small subgroups: Intuitive inferences with heightened precision
Mariel Finucane, Mathematica Policy Research
Slide Deck: The challenges of developing a counterfactual for place-based initiatives: Promise Zones
Calvin Johnson, Department of Housing and Urban Development
3:15 – 5:00
Moderator
Anna Solmeyer, Office of Planning, Research and Evaluation
Slide Deck: Stepped wedge designs and the Washington state EPT trial
James Hughes, University of Washington
Slide Deck: The extended family of randomized roll-out designs, including stepped wedge and dynamic wait listed designs
Hendricks Brown, Northwestern University
Slide Deck: A hybrid double randomized client preference trial: Benefits and challenges
Gerald August, University of Minnesota
Slide Deck: Leveraging lotteries for school value-added
Joshua Angrist, Massachusetts Institute of Technology
9:00 – 10:45
Moderator
Anupa Bir, RTI International
Slide Deck: Regression discontinuity and extensions
Coady Wing, Indiana University
Slide Deck: Using simulated instruments for addressing descriptive and causal policy questions
Vivian Wong, University of Virginia
Slide Deck: A short comparative interrupted time-series analysis of the impacts of Jobs-Plus
Howard Bloom, MDRC
Slide Deck: Recent advances in non-experimental comparison group designs
Elizabeth Stuart, Johns Hopkins University
11:00 – 12:45
Moderator
Nicole Constance, Office of Planning, Research and Evaluation
Slide Deck: The Social Innovation Fund: Utilizing a range of research designs
Lily Zandniapour, Corporation for National and Community Service
Slide Deck: The NIH Collaboratory: Working with grantees and stakeholders to strengthen research
Elizabeth DeLong, Duke University
Slide Deck: What Works Clearinghouse standards for alternative designs
Jonathan Jacobson, Department of Education
Slide Deck: Considerations for including non-experimental evidence in systematic reviews
John Deke, Mathematica Policy Research