Detecting an impact

What do we mean by impact?
“An impact evaluation assesses changes in the well-being of individuals, households, communities or firms that can be attributed to a particular project, program or policy.”
Impact Evaluation in Practice, second edition

Is it really so simple?
[Figure: IMPACT is the gap between what happens with the program and what would have happened without the program]

Increasing preschool enrolment
TREATMENT: 53%
CONTROL: 50%
3 percentage points?
NOT REALLY

We measure things with error
Bad measurement? NO! Chance

What do you mean by error?

We’re dealing with random variables
X = 1 if heads, X = 0 if tails
X ∈ [0.28kg, 635kg]

What does this mean for us?
We’re not just estimating a single number, or point estimate. We’re going to estimate an interval that takes this error into account.

Agenda
1. What is a confidence interval? Statistical significance?
2. When can we say with confidence that we have detected an impact?

We’re dealing with random variables
X = 1 if heads, X = 0 if tails
X ∈ [0.28kg, 635kg]
This randomness generates a distribution

Histogram
[Figure: histogram. x-axis: Weight; y-axis: Number of people]

We can make the bins smaller
[Figure: histogram with smaller bins. x-axis: Weight; y-axis: Number of people]

We can graph probabilities instead
[Figure: probability distribution. x-axis: Weight; y-axis: Fraction of people]
These bins should add up to 1.
With a smooth curve, the area under the curve should add up to 1.
Area to the left of the red line at 70kg: probability that weight is less than 70kg.

The sample average is also a random variable
Sample A: Draw a random sample from the population of 12-year-olds.
Calculate the sample average for weight: $\bar{X}_A$
Sample B: Draw another random sample from the population of 12-year-olds.
Calculate the sample average for weight: $\bar{X}_B$
Will $\bar{X}_A$ and $\bar{X}_B$ be identical?

Let’s look at the distribution of sample averages
[Figure: distribution of sample averages. x-axis: Weight; y-axis: Fraction of people]
It will be approximately a normal distribution.

Why is this good news?
We understand the properties of the normal distribution.

How does the normal distribution help us?
Suppose we want to know the average weight in a population.

We draw a sample and calculate the average
Sample average is 50kg. The true average is unknown.
Q: How large would the true average have to be for us to draw an average of 50kg or below only 5% of the time?
The normal distribution allows us to calculate this: UPPER BOUND

We draw a sample and calculate the average
Q: How small would the true average have to be for us to draw an average of 50kg or above only 5% of the time?
The normal distribution allows us to calculate this: LOWER BOUND

We can create an interval around the sample average
[Figure: 90% confidence interval from LOWER BOUND to UPPER BOUND around 50kg]
If the true average were at the UPPER BOUND, we would draw an average of 50kg or below only 5% of the time.
If the true average were at the LOWER BOUND, we would draw an average of 50kg or above only 5% of the time.
Values within the interval are not statistically distinguishable.

How confident should we be?
90% confidence interval: significant at the 10% level
95% confidence interval: significant at the 5% level
99% confidence interval: significant at the 1% level

Why did you just tell me all of this?
How is this related to estimating impact?
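As a rough illustration of the interval construction above, the sketch below computes 90%, 95% and 99% confidence intervals around the 50kg sample average using the normal distribution. The standard error of 2kg and the function name confidence_interval are assumptions made for illustration; neither comes from the slides.

```python
from statistics import NormalDist


def confidence_interval(sample_mean, std_error, level=0.90):
    """Two-sided confidence interval under a normal approximation.

    std_error is the standard error of the sample average (assumed known here).
    """
    tail = (1 - level) / 2                  # probability left in each tail
    z = NormalDist().inv_cdf(1 - tail)      # critical value, e.g. about 1.645 for 90%
    return sample_mean - z * std_error, sample_mean + z * std_error


# Hypothetical inputs: sample average 50 kg (from the slides), standard error 2 kg (assumed).
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(50.0, 2.0, level)
    print(f"{level:.0%} confidence interval: {low:.2f} kg to {high:.2f} kg")

# Check the slide's definition of the UPPER BOUND for the 90% interval:
# if the true average sat at the upper bound, an average of 50 kg or below
# would be drawn only about 5% of the time.
low90, high90 = confidence_interval(50.0, 2.0, 0.90)
prob_at_or_below_50 = NormalDist(mu=high90, sigma=2.0).cdf(50.0)
print(f"P(average <= 50 kg | true average = upper bound) = {prob_at_or_below_50:.3f}")
```

Under these assumptions, asking for more confidence widens the interval: the 99% interval contains the 95% interval, which contains the 90% interval.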
TREATMENT: 53% (random variable)
CONTROL: 50% (random variable)
Difference: 3 percentage points? (random variable)

We want to know if zero is in our confidence interval
[Figure: 95% confidence interval from LOWER BOUND to UPPER BOUND around 3 percentage points]
Values within the interval are not statistically distinguishable.
If zero lies outside the confidence interval, 0 and 3 can be statistically distinguished.
If zero lies in the confidence interval, 0 and 3 cannot be statistically distinguished.

What do we want to see here?
TREATMENT: 53%
CONTROL: 50%
3 percentage points (95% CI 2.54 to 4.1)

Other stats providing similar information
t-statistic: We typically want this to be above 1.96 in absolute value.
standard error: The point estimate divided by this gives you the t-stat.
p-value: We typically want this to be 0.05 or below.

THANK YOU
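The three statistics listed above (t-statistic, standard error, p-value) and the "is zero in the confidence interval?" check can be tied together in a short sketch. The 3 percentage point estimate is from the slides; the standard errors of 1.0 and 2.0 percentage points and the function name impact_summary are hypothetical, and the p-value uses a normal approximation, which is a simplification that is only reasonable in large samples.

```python
from statistics import NormalDist


def impact_summary(point_estimate, std_error, level=0.95):
    """Summarise an estimated impact under a normal approximation."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)     # about 1.96 for 95%
    t_stat = point_estimate / std_error                  # point estimate / standard error
    p_value = 2 * (1 - std_normal.cdf(abs(t_stat)))      # two-sided p-value
    low = point_estimate - z_crit * std_error
    high = point_estimate + z_crit * std_error
    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "ci": (low, high),
        "zero_in_ci": low <= 0 <= high,
    }


# Hypothetical standard errors (in percentage points) for the same 3-point estimate.
for se in (1.0, 2.0):
    s = impact_summary(3.0, se)
    print(
        f"SE={se}: t={s['t_stat']:.2f}, p={s['p_value']:.4f}, "
        f"95% CI=({s['ci'][0]:.2f}, {s['ci'][1]:.2f}), zero in CI: {s['zero_in_ci']}"
    )
```

With the smaller assumed standard error the t-statistic exceeds 1.96, the p-value falls below 0.05 and zero lies outside the interval; with the larger one all three checks point the other way. The three statistics carry the same information about whether an impact has been detected.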