1 Introduction

1.1 Introduction

  • Difference-in-differences (DID)
    • Exploit the panel data structure to estimate the causal effect.
  • Consider that
    • Treatment vs. control group comparison: biased by selection
    • Before vs. after comparison: biased by the time trend
  • DID combines those two comparisons to draw a causal conclusion.

1.2 DID in Figure (on screen)

1.3 Plan of the Lecture

  • Formal Framework
  • Implementation in a regression framework
  • Parallel Trend Assumption

1.4 Reference

2 Framework

2.1 Framework

  • Consider two periods: \(t=1,2\). Treatment implemented at \(t=2\).
  • \(Y_{it}\): observed outcome for person \(i\) in period \(t\)
  • \(G_{i}\): dummy for treatment group
  • \(D_{it}\): treatment status
    • \(D_{it}=1\) if \(t=2\) and \(G_{i}=1\)
  • potential outcomes
    • \(Y_{it}(1)\): outcome for \(i\) when she is treated
    • \(Y_{it}(0)\): outcome for \(i\) when she is not treated
  • With this, we can write \[\begin{aligned} Y_{it} & =D_{it}Y_{it}(1)+(1-D_{it})Y_{it}(0)\end{aligned}\]
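
The notation above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `treatment_status` and `observed_outcome` and the numbers are hypothetical and do not come from the lecture.

```python
def treatment_status(g, t):
    """D_it = 1 only for the treatment group (g = 1) in period t = 2."""
    return 1 if (g == 1 and t == 2) else 0


def observed_outcome(d, y1, y0):
    """Switching equation: Y_it = D_it * Y_it(1) + (1 - D_it) * Y_it(0)."""
    return d * y1 + (1 - d) * y0


# Hypothetical potential outcomes for one person in period 2.
y1, y0 = 15.0, 12.0
treated = observed_outcome(treatment_status(1, 2), y1, y0)    # reveals Y(1)
untreated = observed_outcome(treatment_status(0, 2), y1, y0)  # reveals Y(0)
```

Only one of the two potential outcomes is ever observed for a given person and period, which is why the ATT needs an identifying assumption.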

2.2 Identification

  • Goal: ATT at \(t=2\) \[E[Y_{i2}(1)-Y_{i2}(0)|G_{i}=1]=E[Y_{i2}(1)|G_{i}=1]-E[Y_{i2}(0)|G_{i}=1]\]

  • What we observe

                            | Pre-period (\(t=1\))     | Post-period (\(t=2\))
    Treatment (\(G_{i}=1\)) | \(E[Y_{i1}(0)|G_{i}=1]\) | \(E[Y_{i2}(1)|G_{i}=1]\)
    Control (\(G_{i}=0\))   | \(E[Y_{i1}(0)|G_{i}=0]\) | \(E[Y_{i2}(0)|G_{i}=0]\)
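  • Under what assumptions can we identify the ATT? The counterfactual \(E[Y_{i2}(0)|G_{i}=1]\) is not observed.

    • Simple treatment-control comparison at \(t=2\) works if \(E[Y_{i2}(0)|G_{i}=1]=E[Y_{i2}(0)|G_{i}=0]\) (no selection bias).
    • Before-after comparison works if \(E[Y_{i2}(0)|G_{i}=1]=E[Y_{i1}(0)|G_{i}=1]\) (no time trend).
    • Is there another (more reasonable) assumption?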

2.3 Parallel Trend Assumption

  • Assumption: \[E[Y_{i2}(0)-Y_{i1}(0)|G_{i}=0]=E[Y_{i2}(0)-Y_{i1}(0)|G_{i}=1]\]
    • Change in the outcome without treatment is the same across two groups.
  • Then, adding and subtracting \(E[Y_{i1}(0)|G_{i}=1]\), \[\begin{aligned} \underbrace{E[Y_{i2}(1)-Y_{i2}(0)|G_{i}=1]}_{ATT}= & E[Y_{i2}(1)|G_{i}=1]-E[Y_{i2}(0)|G_{i}=1]\\ = & E[Y_{i2}(1)|G_{i}=1]-E[Y_{i1}(0)|G_{i}=1]\\ & -\underbrace{(E[Y_{i2}(0)|G_{i}=1]-E[Y_{i1}(0)|G_{i}=1])}_{=E[Y_{i2}(0)-Y_{i1}(0)|G_{i}=0]\ (parallel\ trend)}\end{aligned}\]

  • Thus, \[\begin{aligned} ATT= & E[Y_{i2}(1)-Y_{i1}(0)|G_{i}=1]-E[Y_{i2}(0)-Y_{i1}(0)|G_{i}=0]\end{aligned}\] which is why this is called “difference-in-differences”.
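
A small numeric illustration may help to see why DID works where the two naive comparisons fail. The population means below are made-up numbers chosen so that the parallel trend assumption holds by construction; all variable names are hypothetical.

```python
# Hypothetical population means (illustrative numbers, not from the lecture).
base_treat = 10.0  # E[Y_i1(0) | G=1]
base_ctrl = 7.0    # E[Y_i1(0) | G=0]
trend = 2.0        # common change in Y(0) between periods (parallel trend)
att = 3.0          # true ATT at t=2

# The four observable cell means.
treat_pre = base_treat                 # E[Y_i1(0) | G=1]
treat_post = base_treat + trend + att  # E[Y_i2(1) | G=1]
ctrl_pre = base_ctrl                   # E[Y_i1(0) | G=0]
ctrl_post = base_ctrl + trend          # E[Y_i2(0) | G=0]

simple_comparison = treat_post - ctrl_post  # 6.0 = ATT + selection bias (3.0)
before_after = treat_post - treat_pre       # 5.0 = ATT + time trend (2.0)
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)  # 3.0 = ATT
```

The simple comparison is contaminated by the level gap between the groups and the before-after comparison by the common trend; differencing twice removes both.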

3 Estimation

3.1 Estimation Approach

  1. Plug-in estimator

  2. Regression estimators

3.2 Plug-in Estimator

  • Remember that the ATT is \[\begin{aligned} ATT= & E[Y_{i2}(1)-Y_{i1}(0)|G_{i}=1]-E[Y_{i2}(0)-Y_{i1}(0)|G_{i}=0]\end{aligned}\]

  • Replace the expectations with sample averages: \[\begin{aligned} \widehat{ATT}= & \left\{ \bar{y}(t=2,G=1)-\bar{y}(t=1,G=1)\right\} \\ & -\left\{ \bar{y}(t=2,G=0)-\bar{y}(t=1,G=0)\right\} \end{aligned}\] where \(\bar{y}(t,G)\) is the sample average for group \(G\) in period \(t\).

  • Easy to make a \(2\times2\) table!
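
The plug-in estimator can be sketched as follows. This is a minimal illustration on made-up data; the record layout `(g, t, y)` and the function names `cell_means` and `did_plugin` are assumptions for this example only.

```python
from collections import defaultdict


def cell_means(records):
    """Average y within each (group, period) cell; records are (g, t, y)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, t, y in records:
        sums[(g, t)] += y
        counts[(g, t)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def did_plugin(records):
    """Change in the treatment group minus change in the control group."""
    m = cell_means(records)
    return (m[(1, 2)] - m[(1, 1)]) - (m[(0, 2)] - m[(0, 1)])


# Hypothetical data: two observations per cell of the 2x2 table.
records = [
    (1, 1, 10.0), (1, 1, 12.0), (1, 2, 15.0), (1, 2, 17.0),
    (0, 1, 7.0), (0, 1, 9.0), (0, 2, 9.0), (0, 2, 11.0),
]
table = cell_means(records)
att_hat = did_plugin(records)  # (16 - 11) - (10 - 8) = 3.0
```

Here `table` holds the four entries of the \(2\times2\) table, and `att_hat` is the difference of the two row-wise changes.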

3.3 Example: Card and Krueger (1994)

image

3.4 Regression Estimators

  • Run the following regression \[y_{it}=\alpha_{0}+\alpha_{1}G_{i}+\alpha_{2}T_{t}+\alpha_{3}D_{it}+\beta X_{it}+\epsilon_{it}\]

    • \(G_{i}\): dummy for treatment group

    • \(T_{t}\): dummy for the treatment period (\(T_{t}=1\) if \(t=2\))

    • \(D_{it}=G_{i}\times T_{t}\), so \(\alpha_{3}\) captures the ATT.

  • The regression framework can incorporate covariates \(X_{it}\), which is important for controlling for observed confounding factors.
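
As a rough check that the regression reproduces the plug-in estimator, the sketch below fits the model without covariates (\(\beta X_{it}\) is omitted) on made-up data. The helper `ols` is a hypothetical pure-Python solver written for this example; in practice one would use a statistics package.

```python
def ols(X, y):
    """OLS via the normal equations (X'X) b = X'y, solved by Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Hypothetical data: (g, t, y) with two observations per cell.
records = [
    (1, 1, 10.0), (1, 1, 12.0), (1, 2, 15.0), (1, 2, 17.0),
    (0, 1, 7.0), (0, 1, 9.0), (0, 2, 9.0), (0, 2, 11.0),
]
# Regressors: constant, G_i, T_t, D_it = G_i * T_t (no covariates X_it).
X = [[1.0, g, float(t == 2), g * float(t == 2)] for g, t, _ in records]
y = [yi for _, _, yi in records]
alpha0, alpha1, alpha2, alpha3 = ols(X, y)
```

Because the model is saturated in the four group-period cells, `alpha3` equals the difference of cell-mean changes, (16 - 11) - (10 - 8) = 3, up to floating-point error.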

3.5 Regression Estimators with FEs

  • With panel data, include individual and time fixed effects (FEs): \[y_{it}=\alpha D_{it}+\beta X_{it}+\epsilon_{i}+\epsilon_{t}+\epsilon_{it}\] where \(\epsilon_{i}\) is the individual FE and \(\epsilon_{t}\) is the time FE.
  • Do not forget to use cluster-robust standard errors!
    • See Bertrand, Duflo, and Mullainathan (2004, QJE) for the standard error issues.
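
The sketch below illustrates the FE estimator and a cluster-robust standard error on a made-up balanced two-period panel. It rests on several assumptions not stated in the lecture: covariates are omitted, clustering is at the individual level, and the variance is the plain sandwich formula without the small-sample corrections that statistical packages usually apply. The function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def demean_two_way(panel, key):
    """Within transformation, valid for a balanced panel:
    x_it - mean_i(x) - mean_t(x) + grand mean."""
    by_unit, by_time = defaultdict(list), defaultdict(list)
    for row in panel:
        by_unit[row["i"]].append(row[key])
        by_time[row["t"]].append(row[key])

    def mean(values):
        return sum(values) / len(values)

    grand = mean([row[key] for row in panel])
    return [row[key] - mean(by_unit[row["i"]]) - mean(by_time[row["t"]]) + grand
            for row in panel]


def twfe_did(panel):
    """Two-way FE estimate of alpha and its unit-clustered standard error."""
    d = demean_two_way(panel, "d")
    y = demean_two_way(panel, "y")
    sdd = sum(di * di for di in d)
    alpha = sum(di * yi for di, yi in zip(d, y)) / sdd
    # Sum the score d_it * residual_it within each unit (the cluster).
    score = defaultdict(float)
    for row, di, yi in zip(panel, d, y):
        score[row["i"]] += di * (yi - alpha * di)
    se = sqrt(sum(s * s for s in score.values())) / sdd
    return alpha, se


# Hypothetical units: (treatment group dummy, outcome at t=1, outcome at t=2).
units = {"A": (1, 10.0, 16.0), "B": (1, 12.0, 16.0),
         "C": (0, 7.0, 9.0), "D": (0, 9.0, 12.0)}
panel = [{"i": i, "t": t, "d": g * (t == 2), "y": y}
         for i, (g, y1, y2) in units.items()
         for t, y in ((1, y1), (2, y2))]
alpha_hat, se_hat = twfe_did(panel)
```

With two periods, `alpha_hat` equals the mean change among treated units minus the mean change among controls (5 - 2.5 = 2.5 here). Summing the scores within each unit before squaring is what allows the errors of the same unit to be correlated over time.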

4 Parallel Trend

4.1 Discussions on Parallel Trend

  • The parallel trend assumption can be violated in various situations.
  • Most critical issue: treatment may depend on time-varying factors.
    • DID can only deal with time-invariant factors.
  • Self-selection: participants in worker training programs often experience a decrease in earnings before they enter the program.
  • Targeting: policies may be targeted at units that are currently performing best (or worst).

4.3 Example (Fig 5.2 from Mastering Metrics)

image


4.5 Other Diagnostics: Placebo test

  • Placebo test using other periods as the treatment period. \[y_{it}=\sum_{\tau}\gamma_{\tau}G_{i}\times I_{t,\tau}+\mu_{i}+\nu_{t}+\epsilon_{it}\]
    • The estimates of \(\gamma_{\tau}\) should be close to zero up to the beginning of treatment (Fig 5.2.4 of Angrist and Pischke)
  • Placebo test using a different dependent variable that should not be affected by the policy.
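
The period-by-period placebo test can be sketched as follows. Two assumptions are made here that the slide does not spell out. First, one period is omitted as the reference, since the interactions \(G_{i}\times I_{t,\tau}\) over all periods sum to \(G_{i}\), which is collinear with \(\mu_{i}\). Second, the panel is balanced, in which case each \(\gamma_{\tau}\) from the FE regression coincides with the treated-minus-control gap in period \(\tau\) minus the same gap in the reference period. The data and function names are hypothetical.

```python
from collections import defaultdict


def group_gap(records):
    """Treated-minus-control mean outcome by period; records are (g, t, y)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, t, y in records:
        sums[(g, t)] += y
        counts[(g, t)] += 1
    periods = sorted({t for _, t, _ in records})
    return {t: sums[(1, t)] / counts[(1, t)] - sums[(0, t)] / counts[(0, t)]
            for t in periods}


def event_study(records, ref):
    """gamma_tau for every period except the reference period `ref`."""
    gap = group_gap(records)
    return {t: gap[t] - gap[ref] for t in gap if t != ref}


# Hypothetical balanced panel: four periods, treatment starts at t = 3.
treated_paths = [[9.0, 10.0, 14.0, 15.0], [11.0, 12.0, 16.0, 17.0]]
control_paths = [[6.0, 7.0, 8.0, 9.0], [8.0, 9.0, 10.0, 11.0]]
records = (
    [(1, t, y) for path in treated_paths for t, y in enumerate(path, start=1)]
    + [(0, t, y) for path in control_paths for t, y in enumerate(path, start=1)]
)
gamma = event_study(records, ref=2)  # reference: last pre-treatment period
```

In this made-up example the pre-treatment coefficient `gamma[1]` is zero, consistent with parallel trends, while `gamma[3]` and `gamma[4]` pick up the treatment effect of 3.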

5 Research Strategy

5.1 Research Strategy using DID

  • Ishise et al. (2019)
    1. How to find a research question
    2. What outcome dataset to look for
    3. What policy to look for (except for examples 1 and 2).