Leveraging rapid learning evaluations for German development cooperation
Rigorous impact evaluations remain relatively rare in German development cooperation. Program managers and policymakers fail to introduce them in a systematic manner, even though they acknowledge that evaluation findings are crucial for evidence-based programming. We believe that a centralized, top-down approach to introducing impact evaluations is misguided because program managers lack the incentives to adopt them. One solution is to make rigorous impact evaluations more attractive for practitioners by focusing on concrete program design questions and producing results more quickly. This article introduces rapid learning evaluations as a tool to iteratively improve programs through structured testing of design and implementation alternatives.
By Kevin Hempel and Jonathan Stöterau | July 2021
Development organizations and policymakers are under increasing scrutiny regarding the impact of their programs. Moreover, program managers are looking for better evidence on what works and what does not. Against this background, rigorous impact evaluations have gained importance in development cooperation over the past 15 years. This development was largely driven by international organizations (e.g. World Bank), agencies in Anglo-Saxon countries (e.g. UK Aid and MCC) as well as research organizations (e.g. J-PAL, IPA, 3ie). It gained further momentum in 2019, when three researchers in the field were awarded the Nobel Prize in Economics.
In German development cooperation, even though significant demand for better evidence exists, the introduction of rigorous impact evaluations is still limited (Faust, 2018). As noted by the German Institute for Development Evaluation (DEval) in a recent study, German development organizations have mostly used rigorous impact evaluation on an ad hoc basis, in part because “there are no incentive systems to systematically encourage the application of rigorous methods and use of the resulting findings” (DEval, 2019, p.2).
Challenges of traditional impact evaluation
Several factors constrain the systematic introduction of rigorous impact evaluations and the translation of findings into policymaking in German development cooperation. We believe that five key issues contribute to limited interest among program managers:
1. Lack of operational relevance: Impact evaluations often address questions that do not relate to practical concerns of policymakers and program managers. Traditional impact evaluations commonly assess the overall impact of a program or entire policies. But they often cannot uncover which design element of a program is driving its impact (the “black box”).
2. Time lag to get results: Many typical impact evaluations take a long time to implement and often deliver results no earlier than two or three years after the program has been implemented. As a result, the evaluation findings arrive too late to feed lessons learned into the design of the program itself or of follow-up phases.
3. High cost and effort: Impact evaluations rely on detailed data on expected outcomes, which is often costly and difficult to collect. These costs are rarely foreseen in budgets of German development cooperation programs. Program managers may be unwilling or unable to shift funds away from programmatic activities and towards evaluation. Even if external research organizations are contracted, planning and implementing the evaluation often puts additional workload on program staff.
4. Limited feasibility: Impact evaluations require a valid comparison group to estimate the impact of programs. This usually requires that beneficiaries are selected in a specific and transparent manner (e.g. through random assignment or fixed criteria) and that some individuals are excluded from receiving services or put on a waitlist. This can complicate and delay program implementation or raise ethical concerns. Moreover, many German development programs implement several smaller interventions (e.g. for different target groups or with different activities). Even if programs are large, each one of these activities is often too small to be rigorously evaluated.
5. Lack of enabling environment: A specific challenge for German development cooperation has been an insufficient support structure to carry out rigorous impact evaluations. First, there has not been a strong political mandate by BMZ for its implementing organizations to prioritize impact evaluations. Second, many program managers are still not familiar with impact evaluations (and their potential benefits). This reduces the bottom-up demand to generate and share evidence. Finally, central structures, funds and experts are missing to guide and support program managers in setting up impact evaluations.
As a result of these five challenges, program managers in German development organizations often lack the incentives and mandate to conduct rigorous impact evaluations of their programs. For them, impact evaluations imply additional workload and challenges without clear benefits in terms of practical recommendations. At the same time, program managers lack readily available, credible, and applicable evidence for designing their programs. Hence, they often must take decisions based on untested (and maybe misleading) assumptions – likely limiting the effectiveness of their programs.
Introducing rapid learning evaluations
Given the challenges of introducing traditional impact evaluations in German development cooperation, especially through a centralized, top-down approach, we propose putting more emphasis on a complementary approach: rapid learning evaluations. The core idea is to conduct small-scale rigorous impact evaluations to iteratively test design options throughout the program cycle. These evaluations do not seek to measure whether a program’s ultimate objectives are achieved, but rather to provide evidence for optimizing specific aspects of program design and delivery. Similar approaches have been a long-standing tool in the private sector for product testing, e.g. in the form of A/B testing on websites. The results provide program managers with timely, credible guidance for choosing among program design and implementation alternatives.
For instance, consider a program to improve employment prospects for youth through job matching in career fairs. The program manager may face several options to encourage firms and youth to participate. She could use different outreach channels (SMS, email, etc.) or different wordings of the invitation. To identify the best options, she can randomly assign youth and/or firms to different communication channels in the first career fairs that are held. The channel(s) that deliver the highest attendance can then directly be prioritized in subsequent fairs. These subsequent fairs can in turn be used to test other implementation options.
Figure 1: Example of a rapid learning evaluation design
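To make the career fair example more concrete, the following is a minimal Python sketch of how such a test could be set up and read. It is an illustration under stated assumptions, not the authors' method: the function names, the two channels and all numbers are hypothetical, and the comparison uses a standard two-proportion z-test with a normal approximation, which is only one of several reasonable ways to compare attendance rates.

```python
import math
import random


def assign_channels(participant_ids, channels, seed=0):
    """Randomly assign each participant to one outreach channel.

    Participants are shuffled and then dealt out round-robin, so group
    sizes differ by at most one. The seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: channels[i % len(channels)] for i, pid in enumerate(shuffled)}


def attendance_rates(assignment, attended):
    """Return the share of invited participants who attended, per channel."""
    invited, showed_up = {}, {}
    for pid, channel in assignment.items():
        invited[channel] = invited.get(channel, 0) + 1
        showed_up[channel] = showed_up.get(channel, 0) + (1 if pid in attended else 0)
    return {channel: showed_up[channel] / invited[channel] for channel in invited}


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference in proportions (pooled standard error).

    x1, x2 are attendee counts and n1, n2 the numbers invited per channel.
    Returns (z statistic, p-value).
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("attendance is identical and extreme in both groups")
    z = (x1 / n1 - x2 / n2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical pilot: 200 invited youth, split across two channels.
    assignment = assign_channels(range(200), ["sms", "email"], seed=1)
    # Hypothetical monitoring data: 60 of 100 attended after an SMS,
    # 45 of 100 after an email.
    z, p = two_proportion_z_test(60, 100, 45, 100)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

In practice, the attendance counts would come from the program's own monitoring data, and the better-performing channel would be carried into the next fair while a new design question is tested.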
The need for a novel approach to generate evidence has been recognized by various international development experts and research organizations. The idea has been advanced under different names, such as “Nimble RCTs”, “Rapid Fire Testing” or “Rapid-Cycle Evaluation”. These approaches are receiving increasing attention on the international scene, for instance in the context of applying behavioral science to development cooperation programs. Another interesting example is the NGO Educate!, which has introduced rapid impact evaluation to adapt its skills development programs to distance learning during the COVID-19 pandemic.
Rapid learning evaluations can mitigate some core disadvantages of traditional impact evaluations that limit their use in German development cooperation, e.g.:
1. Operational focus: Rapid learning evaluations typically address more concrete questions about program design and implementation. For example, rather than assessing the impact of an entire job training program, they might just compare two different training curricula in terms of enrollment and completion rates.
2. Shorter timeframe: While traditional impact evaluations typically assess long-term outcomes (e.g. incomes), rapid learning evaluations assess outputs (e.g. take-up and use of services) and short-term outcomes (e.g. changes in behavior). Rapid learning evaluations consequently yield results in a shorter timeframe.
3. Lower cost and effort: Given their focus on short-term results, rapid learning evaluations can often rely on program monitoring or administrative data, whereas traditional impact evaluations often require large-scale surveys to measure long-term outcomes. Moreover, small design elements are easier to adapt for random testing.
4. Direct feedback loops for iterative adaptation: Rapid learning evaluations deliver results within the project cycle. This allows program managers to (iteratively) adopt the most successful elements in subsequent programming (e.g. at program mid-term or when planning a follow-up phase). As a result, the findings are of immediate benefit to the program manager, which is an important incentive mechanism.
5. No need for pure control group: Traditional impact evaluations typically rely on a “pure” control group that is excluded from the program to measure impact. Rapid learning evaluations, instead, compare different groups of beneficiaries who receive alternative versions of the program. This eliminates some ethical and practical issues related to pure control group evaluation designs.
6. Academic relevance: Rapid learning evaluations are also of interest for partnering research institutions. They allow researchers to test their (behavioral) theories in real-world situations. This increases incentives for researchers to cooperate with programs and contribute their time and methodological expertise.
Given these features, we believe that rapid learning evaluations can address key challenges that often inhibit evidence-based program design and policymaking in real-world situations. The approach of rapid learning and iterative program design is well suited for (German) development cooperation, where many programs face a large number of design questions. Rapid learning evaluations make it possible to address these questions one by one and improve program design iteratively over the program cycle.
The way forward
Rapid learning evaluations are almost non-existent in German development cooperation so far. The first step to introduce them would be for German development organizations to identify programs in their portfolio to pilot the approach. Ideally, these programs should have the following features:
- Currently in a phase of (re-)design or piloting.
- Face clear operational questions that can be tested in the short run.
- Can implement design alternatives simultaneously (or sequentially).
- Ideally allow for random assignment of beneficiaries to these alternatives.
- Serve a sufficient number of beneficiaries to allow for credible results.
- Have an M&E system that can deliver quality data on outputs and/or short-term outcomes.
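What counts as a "sufficient number of beneficiaries" depends on how large a difference between design alternatives the program wants to be able to detect. The sketch below is a rough back-of-the-envelope calculation, not part of the authors' proposal: it assumes two alternatives compared on a yes/no outcome (e.g. attendance), uses the standard normal-approximation formula for comparing two proportions, and the function name and example rates are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of beneficiaries needed in each of two groups.

    p1, p2: expected outcome rates under the two design alternatives
            (assumptions the program team has to supply).
    alpha:  two-sided significance level.
    power:  probability of detecting the difference if it is real.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2.
    """
    if p1 == p2:
        raise ValueError("the two rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical: email invitations yield 45% attendance, SMS is hoped to reach 60%.
    print(sample_size_per_arm(0.45, 0.60))
    # A smaller expected difference requires a much larger group.
    print(sample_size_per_arm(0.45, 0.50))
```

Under these assumed rates, detecting a 15 percentage point difference needs roughly 170 beneficiaries per alternative, while a 5 point difference needs several times more. This is one reason why rapid learning evaluations are easier to run on outputs with large expected differences, such as take-up.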
In a second step, the selected programs could go through small-scale evaluability assessments by internal or external experts. Such assessments would help to ensure the feasibility of incorporating a rapid learning evaluation into the shortlisted program(s). The evaluability assessment should also outline the core evaluation design and next steps in the process, including a commitment by program managers to act upon the findings.
Without a doubt, introducing this new kind of evaluation is not a silver bullet and comes with its own challenges and limitations. In particular, the simultaneous implementation and testing of different program design alternatives can increase the complexity of the intervention and therefore requires sufficient institutional capacity. Moreover, rapid learning evaluations cannot answer questions about long-term impacts. They are therefore a complement to traditional impact evaluations, not a substitute. The latter remain important and need to be systematically introduced in German development cooperation as well.
In sum, rapid learning evaluations constitute an additional option in the evaluation toolbox and could support evidence-based policymaking in German development cooperation. Most importantly, rapid learning evaluations offer a clear benefit – and thus incentive – to program managers as they provide timely, practical recommendations on how to make programs (more) effective.
About the authors:
Kevin Hempel is the Founder and Managing Director of Prospera Consulting, a boutique consulting firm working towards stronger policies and programmes to facilitate the labour market integration of disadvantaged groups. You can follow him on Twitter @KevHempel.
Jonathan Stöterau is currently an Economist at the World Bank Finance, Competitiveness & Innovation Global Practice, where he is working on impact evaluations of programs to support entrepreneurship, firm innovation and technology adoption in Europe and Central Asia.