Youth Development Notes

Evaluating Youth Interventions

Youth development projects aim to improve the lives and livelihoods of young people around the world. Interventions for youth are often multi-sectoral in nature, ranging from job- and life-skills development to programs for better health and nutrition. Rigorous impact evaluation is key to producing the knowledge base required by policymakers and practitioners to choose among different options, and implement the most cost-effective projects. This note outlines some approaches to producing evidence of what works in the context of youth development projects, and looks at expanding the set of outcome indicators to more fully capture the effects of these projects on the welfare of young people around the world.

Today's youth (15-24) constitute the largest cohort ever to enter the transition to adulthood. Nearly 90% live in developing countries, and the challenges they face--low quality education, lack of marketable skills, high rates of unemployment, crime, early pregnancy, social exclusion, and the highest rates of new HIV/AIDS infections--are costly to themselves and to society at large. Client demand for policy advice on how to tap the enormous potential of youth is large and growing. This series aims to share research findings and lessons from the field to address these important cross-sectoral topics.

Volume II, Number 5, June 2007
www.worldbank.org/childrenandyouth

the challenge of incorporating impact evaluation into youth projects

". . . few solid evaluations of youth programs in developing countries unambiguously identify the causality from policy to program to effect . . . many (youth) programs fall into the promising but unproven camp . . ."
--World Development Report 2007 (WDR07), Development and the Next Generation

The evaluation of youth development projects poses special challenges, both conceptual and logistical, particularly if they are multi-sectoral. Youth development projects are often diffuse in nature and scope, extend over a long period of time, vary widely across applications, and have outcomes across a range of sectors. These challenges must be addressed in an evaluation to ensure that causality is well established and that outcomes are adequately measured. For example, when looking at the effects of a youth intervention on employment, we know that obtaining a job is also a function of health and schooling. Alternatively, we may want to know if discouraging girls from early marriage is more effective when girls are in school. We can use impact evaluation to isolate the impact of any one component of a youth intervention, test the optimal combination of interventions in different contexts, or look at potential spillover effects across populations.

Although evaluations are usually narrowly defined, the multifaceted nature of youth transitions means that interventions can have unexpected outcomes. Recent evaluations of youth programs use a wider range of outcome indicators to capture these different impacts. For example, interventions focusing on education have also been shown to affect risky behavior: conditional cash transfers may reduce alcohol use and smoking, early child development may reduce crime, violence and teen pregnancy, and additional schooling may lower the incidence of teen pregnancy and HIV/AIDS (1).

When considering the evaluation of youth projects, there are also a number of logistical considerations to keep in mind. Young people are exceptionally mobile, and it is important to make provisions to track individuals in the evaluation sample over time. Similarly, when interviewing minors, issues of parental consent are important, while at the same time providing the necessary safeguards to protect the young person's privacy.

The remainder of this note outlines key aspects of an impact evaluation design, and considers a number of issues that are unique to the evaluation of youth development projects.

elements of effective impact evaluation design

An impact evaluation design allows us to isolate the effect of a youth development program on a given outcome, or to test the optimal combination of interventions in different contexts. Impact evaluation helps us understand "what is the effect of X on Y?" For example: what is the effect of a youth training program on employment? Ideally, this would be estimated by comparing the employment status of an individual with and without the training program at the same point in time. Given that we will never observe the same individual in two different states at the same time, impact evaluation must attempt to construct a plausible alternative for comparison, or counterfactual: that is, "what would have happened to the youth without the training program?" As depicted in Figure 1, the program impact is the difference between the observed outcome (the continuous line) and an estimate of the outcome had no program been offered (the dashed line--i.e. the counterfactual).

Counterfactuals are estimated using control groups, that is, a group of individuals who do not participate in a program. Identifying a valid counterfactual is critical to good impact evaluation. Typically, identifying the control group entails determining why one group of individuals was treated and the other was not. Doing this retrospectively can be challenging, especially if the two groups were not randomly selected. There may be unobserved differences between those in the treatment group and those in the control group that affect the outcome, and this will confound measurement of the impact of treatment. When working prospectively in the planning phase of an intervention, one can either explicitly select--or preferably randomly assign--individuals into treatment and control groups.

Identifying a control group

By knowing Who is eligible, Where the intervention will go and When the intervention will be delivered, we can identify a control group that can be used to estimate a valid counterfactual for the estimation of a program's impact. By working within the context of program planning and operations, we can minimize the ethical concerns that may arise from denying treatment to the control group. For example, in the early stages of program implementation, budgetary and logistical constraints usually limit the number of eligible youth or groups that can receive the intervention. Everyone who is eligible will receive the intervention, just not all at the same time. When a project cannot go everywhere at the same time, managers must use some rule to determine where the project will begin and how it will scale up. Provided we understand the scaling-up rules, the individuals who do not receive the intervention in the early stages can provide valid controls for those who do.

Suppose that 100 localities are identified as the areas of highest youth unemployment, but budgetary and logistical constraints only permit coverage of a training program in 50 localities during the first year. One fair way to assign the benefit is to give each locality an equal chance of receiving it; for example, by using a lottery to select the localities that will receive the intervention this year. In that case, the localities that will receive the program in the future serve as a counterfactual control group for the localities that receive the program in the first year. On average, there will be no differences (observed or unobserved) between the two groups before the program is rolled out, and assignment to treatment and control groups is by design unrelated to any characteristics of the localities. Therefore, differences in outcomes between the two groups following program implementation can be attributed to the causal effect of the program, since the only difference between the groups is that one received training and the other did not.

When randomization is not possible, other good options for identifying valid control groups can be found using program eligibility rules. For example, interventions are often targeted to groups or individuals that meet certain criteria, such as poverty: those with incomes just below the threshold are eligible, while those just above are ineligible. Arguably, pre-intervention differences between two individuals with incomes on either side of the threshold are very small, and differences in outcomes after the intervention can be largely attributed to the intervention itself.

Box 2. A Counterfeit Counterfactual

A commonly used counterfactual that may produce misleading results is the comparison of the same individual before and after the intervention. For example, say that the youth employment rate in a given community is 20%. A training program enters the community and trains young people in employable skills. After the training, it is observed that youth employment has increased to 60%. Was the training program a success? If, for example, positive economic growth also affected youth employment over the same period, the contribution of the program to the increase in employment may be only minor or zero. On the other hand, in the case of an economic recession over the same period, youth employment would have been much lower, and the simple change in employment rates will underestimate the true impact of the program. Thus, just comparing an outcome for the same individual before and after the introduction of a program may lead us to erroneous conclusions about a program's success.

Identifying relevant outcomes and indicators

Program activities produce outputs, and the resulting changes observed in the beneficiaries are the outcomes.
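The lottery-based rollout and the "counterfeit counterfactual" of Box 2 can be illustrated with a small simulation. This is a hypothetical sketch: the number of localities, the baseline employment level, the time trend, and the treatment effect are all invented for illustration, not drawn from any actual program.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical setup: 100 localities; a lottery assigns 50 to receive
# training in year one (treatment) and 50 to wait (control).
localities = list(range(100))
random.shuffle(localities)
treated, control = set(localities[:50]), set(localities[50:])

TRUE_EFFECT = 10.0   # assumed causal effect on employment (pct. points)
TIME_TREND = 5.0     # economy-wide improvement affecting everyone
BASELINE = 40.0      # assumed pre-program employment rate

def employment_after(loc):
    """Post-program employment rate for one locality."""
    noise = random.gauss(0, 5)  # idiosyncratic variation
    return BASELINE + noise + TIME_TREND + (TRUE_EFFECT if loc in treated else 0)

outcomes = {loc: employment_after(loc) for loc in localities}

treated_mean = mean(outcomes[l] for l in treated)
control_mean = mean(outcomes[l] for l in control)

# Difference in means between randomized groups recovers roughly the
# true effect, because the time trend affects both groups equally.
impact_estimate = treated_mean - control_mean

# A simple before/after comparison in treated localities absorbs the
# time trend into the "program effect" (the counterfeit counterfactual).
before_after = treated_mean - BASELINE

print(round(impact_estimate, 1), round(before_after, 1))
```

With these assumed numbers, the before/after comparison overstates the program's effect by roughly the size of the time trend, while the randomized comparison does not.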
For example, in the case of vocational training, an outcome is employable skills, while an output is receiving the training. Outcomes are observed characteristics of the beneficiary, and not of the program; and whether short or long term, they should have measurable proxies or indicators.

Any evaluation should have some idea of how and why the intervention leads to the expected outcomes. Impact evaluation should include a review of program implementation, or a "process evaluation," to understand this chain of events. Some programs do not work because planned activities are not carried out as planned. When a program is poorly implemented, there may not be a great need to delve deeply into all the hypothesized causal links in the chain.

The selection of relevant outcome indicators is a critical step in the design of an impact evaluation, and should be guided by the logical framework that connects program activities to direct outcomes. These direct outcomes may in turn lead to other, more indirect outcomes. Examples of measurable outcomes are described in Box 1.

Frequently, the diversity of objectives of an intervention makes selecting valid indicators difficult. For example, projects that aim at providing skills may have a direct impact on competencies and employment, but may also have equally important indirect impacts on reducing risk behaviors. It is necessary to anticipate both direct and indirect outcomes, keeping in mind that direct outcomes may not always be the most relevant from a social and policy perspective.

Box 1. Outcome indicators

Let's consider a job training program in a post-conflict setting characterized by inter-ethnic conflicts, designed to increase employment skills, as well as reduce inter-ethnic conflict and risky behaviors, while promoting tolerance and civic participation. Direct and indirect expected outcomes to consider are as follows:

Direct outcomes
1. Competencies in training program skills (e.g., basic business, finance, and accounting knowledge)
2. Business activities (e.g., size, profitability, employment, youth employed, revenues and revenue growth, return on investment, etc.)
3. Credit and capital activities (sources of credit and equity raised)
4. Economic status (e.g., employment, wages, days employed, average earnings, asset ownership)
5. General skill competency (e.g., literacy test score, numeracy test score, English language skills test score, computer skills test score)

Indirect outcomes
6. Risky behaviors (e.g., school absenteeism or dropout, inactivity - neither in school nor in work, substance use, early sexual initiation, unsafe sexual practices, criminal behavior)
7. Violent behavior (e.g., hostility, participation in fights, carrying a weapon, participation in riots or violent protest, attitudes towards the use of violence)
8. Ethnic and religious attitudes (e.g., ethnic and religious tolerance, ability to articulate another ethnic group's point of view)
9. Political and community participation (e.g., membership in community groups, civic participation, participation in peaceful protests, political extremism)

Note: These outcome indicators have been defined for the impact evaluation of the Post Conflict Fund for Kosovo Youth Development (in particular for the Business Development for Young Entrepreneurs component). For further information, please contact Silvia Paruzzolo (sparuzzolo@worldbank.org).

Collecting data for the evaluation of youth development programs

The success and reliability of an evaluation rests heavily on the quality of the data used. Since primary data collection can represent the lion's share of an evaluation budget (2), data collection strategies need to be carefully considered. Samples should be representative of the target population and include sufficient sample sizes to detect the desired effect size (power calculations can help determine required sample sizes). Survey methods should also be carefully considered, especially when collecting sensitive data. For example, in the case of risk behavior outcomes, experience suggests that audio or computer-assisted self-interviewing (CASI) can help young people to discuss candidly a range of sensitive and potentially embarrassing subjects.1 Compared to face-to-face interviews, both audio-CASI and self-administered questionnaires have been shown to provide better prevalence estimates of youth risk behavior in culturally conservative societies, especially for particularly stigmatized or legally sanctioned behaviors (4).

Figure 1. Outcome level, outcome change and program effect (impact)
[Figure: outcome plotted against time (before, during and after the program). The continuous line shows the outcome status with the program, rising from the pre-program outcome level to the post-program outcome level; the dashed line shows the outcome status without the program. The gap between the two lines after the program is the program effect (impact).]
Source: Adapted from Rossi et al. (2004)

Measuring all the program's costs and benefits

Finally, for the purpose of informing policy decisions, an evaluation is not complete until one considers the costs of the program. Impact is only one criterion for program selection. The program must be effective in both a statistical or clinical sense and an economic sense; the most effective program may not be the most cost-effective one. An intervention may have a profound impact on participants, but if it is extremely expensive, it may not make sense to implement or continue it. It may be preferable to select a program that has a smaller impact but is much less costly.

This highlights the importance of measuring all of the costs and all of the benefits of a given program. Some benefits, and some costs, may not become apparent until some time after the intervention. And as noted above, a program's benefits may be unrelated to its original goals. Similarly, the program may have social costs as well as financial costs, and all of the resources used will have shadow costs--that is, even volunteers to a program have potential alternative uses, and it is the job of the evaluation to determine whether the intervention presents the best feasible use of these scarce resources. Finally, the program's average costs may not be a good indicator of marginal costs--that is, what it will cost to scale up the program.

References and Recommended Reading

(1) World Bank. 2006. World Development Report 2007: Development and the Next Generation. New York: Oxford University Press.
(2) Rossi P.H., Lipsey M.W., Freeman H.E. 2004. Evaluation: A Systematic Approach (7th ed.). Thousand Oaks, Calif.: SAGE Publications.
(3) Baker J. 2000. Evaluating the Impacts of Development Projects on Poverty: A Handbook for Practitioners. Washington, D.C.: The World Bank.
(4) MacMillan H.L. 1999. "Computer survey technology: a window on sensitive issues." CMAJ Specialty Spotlight.
· Duflo E., Glennerster R., Kremer M. 2006. "Using Randomization in Development Economics Research: A Toolkit." Also downloadable at http://www.povertyactionlab.com/papers/Using%20Randomization%20in%20Development%20Economics.pdf.
· The Poverty Action Lab webpage (http://www.povertyactionlab.com/).
· The World Bank impact evaluation webpage.

1. With audio-CASI, prerecorded questions are presented through headphones and on a computer screen. Answers are given using numbered keys on a computer keyboard.
This obviates the need for interviewers but, given the costs of the technology, may not reduce overall survey costs.

Children & Youth Unit, Human Development Network, The World Bank
www.worldbank.org/childrenandyouth

This note was prepared by Silvia Paruzzolo, M&E Specialist (HDNCY), Sebastian Martinez, Economist (AFTRL), Luisa Sigrid Vivo, Economist (HDNVP), Linda McGinnis, Lead Economist (HDNCY), Mattias Lundberg, Senior Economist (HDNCY) and Paul Gertler, Professor of Economics at the University of California, Berkeley. Photo credit: Ray Witlin. The views expressed in these notes are those of the authors and do not necessarily reflect the views of the World Bank or their respective institutions.
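As a closing illustration of the power calculations mentioned in the data collection section, the standard normal-approximation formula for comparing two proportions can be sketched as follows. The employment rates, significance level, and power target are assumed for illustration and are not taken from any particular evaluation.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-arm sample size needed to detect the difference between two
    proportions with a two-sided test, using the normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value, two-sided test
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control     # minimum detectable effect
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical example: detecting a rise in youth employment from 40% to 50%
# with 80% power at the 5% significance level.
n = sample_size_two_proportions(0.40, 0.50)
print(n)  # per-arm sample size
```

Note how the required sample size grows rapidly as the minimum detectable effect shrinks (it scales with the inverse square of the effect), which is why the target effect size should be fixed before fieldwork begins.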