Introduction
- Program evaluation, or causal inference
- Estimation of the “treatment effect” of some intervention (typically binary)
- Examples:
- effect of job training on wages
- effect of advertising on purchase behavior
- effect of distributing mosquito nets on children’s school attendance
- Difficulty: treatment is an endogenous decision
- selection bias, omitted variable bias
- especially in observational data (as opposed to experimental data)
Overview
- Introduce Rubin’s causal model (potential outcome framework)
- Generalization of the linear regression model: Nonparametric
- Solutions to the selection bias
- Randomized controlled trial
- Matching
- Instrumental Variable Estimation
- Difference-in-differences
- Regression Discontinuity Design
- Note: IV estimation in the program evaluation framework involves the local average treatment effect (LATE), which is beyond the scope of this course.
Reference
- Angrist and Pischke:
- Mostly Harmless Econometrics: for advanced undergraduate to graduate students
- Mastering 'Metrics: good for undergraduate students who have taken an econometrics course
- Ito: Data Bunseki no Chikara (in Japanese)
Program Evaluation
Framework
- \(Y_{i}\): observed outcome for person \(i\)
- \(D_{i}\): treatment status \[D_{i}=\begin{cases}
1 & \text{treated (treatment group)}\\
0 & \text{not treated (control group)}
\end{cases}\]
- Define potential outcomes (both are defined for every person, whichever group she is actually in)
- \(Y_{1i}\): outcome for \(i\) if she is treated
- \(Y_{0i}\): outcome for \(i\) if she is not treated
- With this, we can write \[\begin{aligned}
Y_{i} & =D_{i}Y_{1i}+(1-D_{i})Y_{0i}\\
 & =\begin{cases}
Y_{1i} & \text{if } D_{i}=1\\
Y_{0i} & \text{if } D_{i}=0
\end{cases}\end{aligned}\]
Key Points
- Point 1: Fundamental problem of program evaluation
- We can observe \((Y_{i},D_{i})\), but never both \(Y_{0i}\) and \(Y_{1i}\) for the same person.
- The unobserved potential outcome is the counterfactual outcome.
- Point 2: Stable Unit Treatment Value Assumption (SUTVA)
- The potential outcomes (and hence the treatment effect) of a person do not depend on the treatment status of other people.
- Rules out externalities and general equilibrium effects.
- Example: if everyone took the job training, the equilibrium wage would change, which would affect each individual's outcome.
Parameters of Interest
- Define the individual treatment effect \(Y_{1i}-Y_{0i}\)
- Key: this allows for heterogeneous effects across people
- The individual treatment effect cannot be identified due to the fundamental problem.
- Instead, we focus on average effects
- Average treatment effect: \(ATE=E[Y_{1i}-Y_{0i}]\)
- Average treatment effect on treated: \(ATT=E[Y_{1i}-Y_{0i}|D_{i}=1]\)
- Average treatment effect on untreated: \(ATU=E[Y_{1i}-Y_{0i}|D_{i}=0]\)
- Average treatment effect conditional on covariates \(X_{i}\): \(ATE(x)=E[Y_{1i}-Y_{0i}|X_{i}=x]\)
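The three unconditional parameters can differ once effects are heterogeneous and treatment is self-selected. The following is a minimal simulation sketch, not part of the original material: the data-generating process (the `ability` variable, the coefficients, and the selection rule) is entirely hypothetical, and both potential outcomes are visible only because the data are simulated.

```python
import random
from statistics import mean

random.seed(0)

def simulate_unit():
    """Draw (y0, y1, d) for one person under a hypothetical DGP."""
    ability = random.gauss(0, 1)                      # unobserved trait (assumption)
    y0 = 10 + 2 * ability + random.gauss(0, 1)        # potential outcome if untreated
    y1 = y0 + 1 + 0.5 * ability                       # heterogeneous individual effect
    d = 1 if ability + random.gauss(0, 1) > 0 else 0  # self-selected treatment
    return y0, y1, d

units = [simulate_unit() for _ in range(20_000)]

# Averages of the individual effect Y_1i - Y_0i over different groups.
ate = mean(y1 - y0 for y0, y1, d in units)
att = mean(y1 - y0 for y0, y1, d in units if d == 1)
atu = mean(y1 - y0 for y0, y1, d in units if d == 0)
share_treated = mean(d for _, _, d in units)

print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}")
# ATE is the treated-share-weighted average of ATT and ATU.
print(f"check: {share_treated * att + (1 - share_treated) * atu:.3f}")
```

In this sketch people with larger gains are more likely to select into treatment, so the effect on the treated exceeds the effect on the untreated.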
Relation to Regression Analysis
- Assume
- a linear (parametric) structure in \(Y_{0i}\), and
- a constant (homogeneous) treatment effect, \[\begin{aligned}
Y_{0i} & =\beta_{0}+\epsilon_{i}\\
Y_{1i}-Y_{0i} & =\beta_{1}\end{aligned}\]
- Then \[Y_{i}=\beta_{0}+\beta_{1}D_{i}+\epsilon_{i}\]
- The program evaluation framework is nonparametric in nature.
- In practice, however, estimation of treatment effects often relies on a parametric specification.
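To see where the regression form comes from, one can substitute the two assumptions into the observed-outcome equation. The derivation below is a sketch that uses only the symbols already defined in this section:
\[\begin{aligned}
Y_{i} & =Y_{0i}+(Y_{1i}-Y_{0i})D_{i}\\
 & =\beta_{0}+\beta_{1}D_{i}+\epsilon_{i}
\end{aligned}\]
Taking conditional expectations gives \[E[Y_{i}|D_{i}=1]-E[Y_{i}|D_{i}=0]=\beta_{1}+E[\epsilon_{i}|D_{i}=1]-E[\epsilon_{i}|D_{i}=0]\] so the simple comparison recovers \(\beta_{1}\) only if \(\epsilon_{i}\) is mean-independent of \(D_{i}\).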
Selection Bias
- Compare average outcomes between the treatment and control groups
- Does this tell you the average treatment effect? Not in general! \[\begin{aligned}
\underbrace{E[Y_{i}|D_{i}=1]-E[Y_{i}|D_{i}=0]}_{\text{simple comparison}}= & E[Y_{1i}|D_{i}=1]-E[Y_{0i}|D_{i}=0]\\
= & \underbrace{E[Y_{1i}-Y_{0i}|D_{i}=1]}_{ATT}\\
 & +\underbrace{E[Y_{0i}|D_{i}=1]-E[Y_{0i}|D_{i}=0]}_{\text{selection bias}}\end{aligned}\]
- The bias term \(E[Y_{0i}|D_{i}=1]-E[Y_{0i}|D_{i}=0]\)
- is not zero in general: those who choose to take the job training might do well even without it
- cannot be estimated directly, because \(E[Y_{0i}|D_{i}=1]\), the outcome of people in the treatment group had they NOT been treated, is never observed (counterfactual).
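The decomposition into ATT plus selection bias can be checked numerically. The sketch below uses a hypothetical job-training example in which workers with higher unobserved ability are more likely to enroll; every number and variable name is an illustrative assumption, and the counterfactual terms are computable only because the data are simulated.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical job-training data: (Y_0i, Y_1i, D_i) for each worker.
workers = []
for _ in range(20_000):
    ability = random.gauss(0, 1)
    wage0 = 20 + 3 * ability + random.gauss(0, 1)  # Y_0i: wage without training
    wage1 = wage0 + 2                              # Y_1i: constant effect of 2
    trained = ability + random.gauss(0, 1) > 0     # D_i: endogenous choice
    workers.append((wage0, wage1, trained))

# Observed outcomes: Y_1i for the treated, Y_0i for the controls.
obs_treated = [w1 for w0, w1, d in workers if d]
obs_control = [w0 for w0, w1, d in workers if not d]
simple_comparison = mean(obs_treated) - mean(obs_control)

# These two terms use counterfactuals that real data would not contain.
att = mean(w1 - w0 for w0, w1, d in workers if d)
selection_bias = mean(w0 for w0, w1, d in workers if d) - mean(obs_control)

print(f"simple comparison     = {simple_comparison:.3f}")
print(f"ATT + selection bias  = {att + selection_bias:.3f}")
print(f"selection bias alone  = {selection_bias:.3f}")
```

Under these assumptions the simple comparison overstates the effect of training, because trained workers would have earned more even without it.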
Solutions
- Randomized controlled trial (A/B test):
- Assign treatment \(D_{i}\) randomly
- Matching (regression):
- Use observed characteristics of individuals to control for selection bias
- Instrumental variable
- Use a variable that affects treatment status but affects the outcome only through the treatment
- Difference-in-differences
- Use panel data to control for individual heterogeneity with fixed effects.
- Regression discontinuity design
- Exploit the as-good-as-random assignment around a threshold.
- Others: bound approach, synthetic control method, regression kink design, etc.
RCT Framework
What is an RCT?
- RCT: Randomized controlled trial
- Measure the effect of a “treatment” by
- randomly assigning the treatment to some subjects (treatment group)
- measuring outcomes of subjects in both the treatment and “control” groups
- taking the difference in outcomes between the two groups as the “treatment” effect
- Originated in clinical trials, which measure the effects of medicine.
Examples
- Development economics: Esther Duflo “Social experiments to fight poverty”
- Health economics: Amy Finkelstein “Randomized evaluations & the power of evidence”
Framework and Identification
- Key assumption: Treatment \(D_{i}\) is independent of the potential outcomes \((Y_{0i},Y_{1i})\) \[D_{i}\perp(Y_{0i},Y_{1i})\]
- Under this assumption, \[\begin{aligned}
E[Y_{1i}|D_{i}=1] & =E[Y_{1i}|D_{i}=0]=E[Y_{1i}]\\
E[Y_{0i}|D_{i}=1] & =E[Y_{0i}|D_{i}=0]=E[Y_{0i}]\end{aligned}\]
- The selection bias is zero, and the ATT equals the ATE. Thus, \[\begin{aligned}
\underbrace{E[Y_{i}|D_{i}=1]-E[Y_{i}|D_{i}=0]}_{\text{simple comparison}}= & \underbrace{E[Y_{1i}-Y_{0i}|D_{i}=1]}_{ATT}=\underbrace{E[Y_{1i}-Y_{0i}]}_{ATE}\end{aligned}\]
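The identification argument can be illustrated with a small simulation in which treatment is assigned by a coin flip. This is a sketch under assumed values: the outcome equations and the `ability` variable are hypothetical, chosen so that effects are heterogeneous yet assignment is independent of the potential outcomes.

```python
import random
from statistics import mean

random.seed(2)

# Hypothetical population: (Y_0i, Y_1i, D_i) with randomized D_i.
people = []
for _ in range(20_000):
    ability = random.gauss(0, 1)
    y0 = 20 + 3 * ability + random.gauss(0, 1)
    y1 = y0 + 2 + ability              # heterogeneous effect, population ATE = 2
    d = random.random() < 0.5          # coin-flip assignment, unrelated to ability
    people.append((y0, y1, d))

# Simple comparison uses only observed outcomes.
simple_comparison = (mean(y1 for y0, y1, d in people if d)
                     - mean(y0 for y0, y1, d in people if not d))
# The next two use counterfactuals, available only in simulated data.
ate = mean(y1 - y0 for y0, y1, d in people)
selection_bias = (mean(y0 for y0, y1, d in people if d)
                  - mean(y0 for y0, y1, d in people if not d))

print(f"simple comparison = {simple_comparison:.3f}")
print(f"ATE               = {ate:.3f}")
print(f"selection bias    = {selection_bias:.3f}")
```

With random assignment the selection bias is close to zero up to sampling noise, so the simple comparison lands near the ATE even though individual effects differ.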
Estimation
- The difference in sample averages is a consistent estimator of the ATT (which equals the ATE under random assignment) \[\frac{\frac{1}{N}\sum_{i=1}^{N}Y_{i}\cdot\mathbf{1}\{D_{i}=1\}}{\frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\{D_{i}=1\}}-\frac{\frac{1}{N}\sum_{i=1}^{N}Y_{i}\cdot\mathbf{1}\{D_{i}=0\}}{\frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\{D_{i}=0\}}\]
- You can also run a linear regression of \(Y_i\) on \(D_i\) along with other covariates \(X_i\), where \(\beta_1\) measures the treatment effect \[ Y_i = \beta_0 + \beta_1 D_i + \gamma' X_i + \epsilon_i\]
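Without covariates, the difference in sample averages and the OLS slope from regressing \(Y_i\) on a constant and \(D_i\) are numerically identical. The sketch below shows this with two small helper functions; `diff_in_means` and `ols_slope` are hypothetical names introduced here, and the simulated data (true effect of 2) are an assumption for illustration.

```python
import random
from statistics import mean

def diff_in_means(y, d):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return mean(treated) - mean(control)

def ols_slope(y, d):
    """Slope from regressing y on a constant and d: cov(d, y) / var(d)."""
    y_bar, d_bar = mean(y), mean(d)
    cov = sum((di - d_bar) * (yi - y_bar) for yi, di in zip(y, d))
    var = sum((di - d_bar) ** 2 for di in d)
    return cov / var

random.seed(3)
d = [random.randint(0, 1) for _ in range(5_000)]    # randomized treatment
y = [10 + 2 * di + random.gauss(0, 1) for di in d]  # true effect = 2 by construction

print(f"difference in means = {diff_in_means(y, d):.4f}")
print(f"OLS slope on D      = {ols_slope(y, d):.4f}")
```

Adding covariates \(X_i\) changes the estimate slightly in finite samples; in a randomized experiment their main role is to reduce residual variance rather than to remove bias.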
Health Insurance Experiment
Example: RAND Health Insurance Experiment (HIE)
- Taken from Angrist and Pischke (2014, Sec 1.1)
- 1974-1982; 3,958 people, ages 14-61
- Randomly assigned to one of 14 insurance plans.
- No insurance premium
- Different provisions related to cost sharing
- 4 categories
- Free
- Co-insurance: Pay 25-50% of costs
- Deductible: Pay 95% of costs, up to $150 per person ($450 per family)
- Catastrophic coverage: Pay 95% of health costs, capped as a proportion of income (or at $1,000 per family, if that was lower). Approximates “no insurance”
First step: Balance Check
- Differences in demographic characteristics and baseline health across groups are statistically insignificant
- This is consistent with the assignment of health insurance plans being random.
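A balance check of this kind can be sketched as a two-sample t-test on a baseline covariate. The example below is illustrative only: the ages are simulated rather than taken from the HIE data, the 50/50 assignment is a simplification of the actual 14-plan design, and `welch_t` is a hypothetical helper name.

```python
import math
import random
from statistics import mean, variance

def welch_t(a, b):
    """Two-sample t-statistic allowing unequal variances (Welch)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

random.seed(4)
# Hypothetical baseline covariate (age), measured before treatment.
ages = [random.uniform(14, 61) for _ in range(4_000)]
assigned = [random.random() < 0.5 for _ in ages]   # random plan assignment

treat_ages = [a for a, d in zip(ages, assigned) if d]
control_ages = [a for a, d in zip(ages, assigned) if not d]
t_stat = welch_t(treat_ages, control_ages)
print(f"t-statistic for difference in mean age: {t_stat:.2f}")
```

A t-statistic small in absolute value (roughly below 2) means the difference in the covariate is not statistically significant at the 5% level, which is what one expects when assignment is random. In practice the check is repeated for each baseline characteristic.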
Results of RAND HIE
- Health insurance (HI) increases health spending (Panel A)
- But, HI has no statistically significant effect on health outcomes