1 Introduction

1.1 Introduction

Regression Discontinuity Design
- Exploit the discontinuous change in treatment status to estimate the causal effect.
Examples:
- Test-score threshold for college admission.
- Age-based eligibility for a policy.
- Geographic boundary of two regions.
Pros: Strong internal validity
- Assumption for identification is weak.
Cons: Very little external validity
- What we estimate is the effect on people at the boundary.

1.2 Idea in Figure

1.3 Reference

Angrist and Pischke “Mostly harmless econometrics” Chapter 6
R packages: https://sites.google.com/site/rdpackages/rdrobust

2 Framework

2.1 Framework

\(Y_{i}\): observed outcome for person \(i\)
Define potential outcomes
- \(Y_{1i}\): outcome for \(i\) when she is treated (treatment group)
- \(Y_{0i}\): outcome for \(i\) when she is not treated (control group)
\(D_{i}\): treatment status is deterministically determined (sharp RD design) \[D_{i}=\mathbf{1}\{W_{i}\geq\bar{W}\}\]
- \(W_{i}\): running variable (forcing variable).
- If assignment is instead probabilistic, with the probability of treatment jumping at the threshold, we have a fuzzy RD design.
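
As a minimal sketch of this setup, the simulation below draws a running variable, assigns treatment by the sharp rule, and builds the observed outcome as \(Y_{i}=D_{i}Y_{1i}+(1-D_{i})Y_{0i}\). The threshold, the functional forms, and the effect size are all made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
w_bar = 0.0                      # threshold (hypothetical value)
w = rng.uniform(-1.0, 1.0, n)    # running variable W_i

# Potential outcomes; the treatment effect of 2.0 is an arbitrary choice.
y0 = 1.0 + 0.5 * w + rng.normal(0.0, 1.0, n)
y1 = y0 + 2.0

# Sharp RD: treatment is a deterministic function of the running variable.
d = (w >= w_bar).astype(int)

# Only one of the two potential outcomes is observed for each person.
y = d * y1 + (1 - d) * y0

# Share treated below and above the threshold.
print(d[w < w_bar].mean(), d[w >= w_bar].mean())  # prints 0.0 1.0
```

Nobody below the threshold is treated and everybody above it is, which is what distinguishes the sharp design from the fuzzy one.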

2.2 Example: Incumbent Advantage

Consider two-candidate elections.
- \(D_{i}\): dummy for being the incumbent in the election
- \(Y_{i}\): whether the candidate wins the election
- \(W_{i}\): the vote share in the previous election.
The incumbent status is defined as \[D_{i}=\mathbf{1}\{W_{i}\geq0.5\}\]
Idea of RD:
- Suppose that you won with 51%.
- You are similar to the candidate who lost with 49% (main assumption of RD).
- If we focus on such candidates, \(D_{i}\) is as if it were randomly assigned.

2.3 Framework cont.d

Note that, because \(D_{i}\) is a function of \(W_{i}\) alone, \(D_{i}=\mathbf{1}\{W_{i}\geq\bar{W}\}\) implies unconfoundedness \[(Y_{1i},Y_{0i})\perp D_{i}|W_{i}\]
But the overlap assumption does not hold \[P(D_{i}=1|W_{i}=w)=\begin{cases} 1 & \text{if }w\geq\bar{W}\\ 0 & \text{if }w<\bar{W} \end{cases}\]
To compare people with and without treatment, we need to rely on some sort of extrapolation around the threshold.

2.4 Linear approach

Suppose for a moment that \[\begin{aligned} Y_{1i} & =\rho+Y_{0i}\\ E[Y_{0i}|W_{i}=w] & =\alpha+\beta w\end{aligned}\]
This leads to the regression \[Y_{i}=\alpha+\beta W_{i}+\rho D_{i}+\eta_{i}\]
- \(\rho\) is the causal effect.
This approach relies on linear extrapolation, which may be a poor approximation.
- What if \(E[Y_{0i}|W_{i}=w]\) is nonlinear?
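
To illustrate the concern, here is a small simulation sketch (simulated data, arbitrary parameter values) that fits the linear regression in two cases: one where \(E[Y_{0i}|W_{i}=w]\) is linear with a true effect of 1, and one where it is a smooth nonlinear function with no treatment effect at all. The logistic shape in the second case is an assumption chosen only to make the point visible.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
w_bar = 0.5
w = rng.uniform(0.0, 1.0, n)
d = (w >= w_bar).astype(float)

def ols_rho(y):
    """Coefficient on D_i from regressing Y_i on a constant, W_i, and D_i."""
    X = np.column_stack([np.ones(n), w, d])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]

noise = rng.normal(0.0, 0.1, n)

# Case 1: E[Y0|W] is linear and the true effect is rho = 1.
y_linear = 0.2 + 0.8 * w + 1.0 * d + noise
# Case 2: E[Y0|W] is smooth but nonlinear and the true effect is 0.
y_nonlinear = 1.0 / (1.0 + np.exp(-20.0 * (w - w_bar))) + noise

rho_linear = ols_rho(y_linear)
rho_nonlinear = ols_rho(y_nonlinear)
print(rho_linear)      # close to the true value 1
print(rho_nonlinear)   # clearly positive although the true effect is 0
```

In the second case the linear specification mistakes the steep but continuous part of the curve for a jump at the threshold.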

image

2.5 A more general approach

Allowing for nonlinear effect of the running variable \(W_{i}\) \[Y_{i}=f(W_{i})+\rho\mathbf{1}\{W_{i}\geq\bar{W}\}+\eta_{i}\]
A function \(f(\cdot)\) might be a \(p\)th order polynomial. \[f(W_{i})=\beta_{1}W_{i}+\beta_{2}W_{i}^{2}+\cdots+\beta_{p}W_{i}^{p}\]
- nonparametric approach later.

2.6 Implementation in Regression

Consider \[\begin{aligned} E[Y_{0i}|W_{i}=w] & =f_{0}(w-\bar{W})\\ E[Y_{1i}|W_{i}=w] & =\rho+f_{1}(w-\bar{W})\end{aligned}\]
- \(\tilde{W}_{i}=W_{i}-\bar{W}\) is a normalization.
Then, with \(p\)th order polynomials for \(f_{0}\) and \(f_{1}\), the regression equation is (see page 255 in Angrist and Pischke) \[\begin{aligned} Y_{i} & =\alpha+\beta_{01}\tilde{W}_{i}+\cdots+\beta_{0p}\tilde{W}_{i}^{p}\\ & \quad+\rho D_{i}+\beta_{1}^{*}D_{i}\tilde{W}_{i}+\cdots+\beta_{p}^{*}D_{i}\tilde{W}_{i}^{p}+\eta_{i}\end{aligned}\]
- \(\rho\) is the causal effect.
- The interaction terms allow the polynomial to differ on each side of the threshold.
When running the regression, we need to focus on the sample around the threshold.
- How close the observations should be to the threshold (the bandwidth) can be chosen by a statistical procedure.
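
The regression with interactions can be sketched as follows. The helper `rd_estimate`, the simulated data, and the fixed bandwidth `h = 0.3` are all hypothetical choices for illustration; in particular, the bandwidth here is picked by hand rather than by a data-driven procedure.

```python
import numpy as np

def rd_estimate(y, w, w_bar, h, p=1):
    """Estimate rho from a p-th order polynomial with interactions,
    using only observations with |W_i - w_bar| <= h."""
    keep = np.abs(w - w_bar) <= h
    wt = w[keep] - w_bar                    # normalized running variable
    d = (wt >= 0).astype(float)             # treatment dummy D_i
    powers = [wt ** k for k in range(1, p + 1)]
    X = np.column_stack(
        [np.ones(wt.size)] + powers + [d] + [d * x for x in powers]
    )
    coef, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
    return coef[p + 1]                      # coefficient on D_i

# Simulated data: smooth nonlinear E[Y0|W] and a true jump of 1 at w_bar = 0.
rng = np.random.default_rng(2)
n = 8000
w = rng.uniform(-1.0, 1.0, n)
rho_true = 1.0
y = np.sin(2.0 * w) + rho_true * (w >= 0) + rng.normal(0.0, 0.5, n)

rho_linear = rd_estimate(y, w, w_bar=0.0, h=0.3, p=1)
rho_quadratic = rd_estimate(y, w, w_bar=0.0, h=0.3, p=2)
print(rho_linear, rho_quadratic)
```

Because the running variable is centered at the threshold, the coefficient on \(D_{i}\) is the jump at \(\bar{W}\) itself rather than at \(W_{i}=0\).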

3 Example

3.1 Mastering Metrics Sec 4.1: Effects of the minimum age drinking law

image

4 Formal Analysis

4.1 Formal Identification Analysis

Key: continuity assumptions: Both \(E[Y_{1i}|W_{i}=w]\) and \(E[Y_{0i}|W_{i}=w]\) are continuous at the threshold \(w=\bar{W}\).
- This assumption is not directly testable (because we cannot observe \(Y_{1i}\) below the threshold or \(Y_{0i}\) above it).
- We will discuss several approaches to validating it.
To see how this works, notice that \[\begin{aligned} E[Y_{i}|W_{i}=w]= & E[Y_{0i}|W_{i}=w]\\ & +\mathbf{1}\{w\geq\bar{W}\}\left(E[Y_{1i}|W_{i}=w]-E[Y_{0i}|W_{i}=w]\right)\end{aligned}\]
Taking the limit of \(w\) to \(\bar{W}\) from below and above \[\begin{aligned} \lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\uparrow\bar{W}}E[Y_{0i}|W_{i}=w]=E[Y_{0i}|W_{i}=\bar{W}]\\ \lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\downarrow\bar{W}}E[Y_{1i}|W_{i}=w]=E[Y_{1i}|W_{i}=\bar{W}]\end{aligned}\]
- Notice that we use continuity in the second equalities!

So, we have \[E[Y_{1i}-Y_{0i}|W_{i}=\bar{W}]=\lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w]-\lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w]\]
- LHS: Average treatment effect at the threshold
- RHS: We can observe from the data.
  - Conditional expectation near the threshold.
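
As a worked check, consider the linear model of Section 2.4, in which \(E[Y_{i}|W_{i}=w]=\alpha+\beta w+\rho\mathbf{1}\{w\geq\bar{W}\}\). This special case is only an illustration; the identification result needs continuity, not linearity. \[\begin{aligned} \lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w] & =\alpha+\beta\bar{W}+\rho\\ \lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w] & =\alpha+\beta\bar{W}\end{aligned}\]
The difference of the two limits is \(\rho\), which in this model equals \(E[Y_{1i}-Y_{0i}|W_{i}=\bar{W}]\).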

5 Validation of Assumptions

5.1 Validation of Assumptions

The key assumption: both \(E[Y_{1i}|W_{i}=w]\) and \(E[Y_{0i}|W_{i}=w]\) are continuous at the threshold \(w=\bar{W}\).
This is not directly testable because we cannot observe \(Y_{1i}\) below the threshold or \(Y_{0i}\) above it.
There are two common approaches that support this assumption:
1. Covariate test
2. Density test (no bunching in the running variable).

5.2 Covariate Test

The underlying idea of RDD: comparing outcomes right above and right below \(\bar{W}\) provides a comparison of treated and control agents who are similar due to the assumed continuity in conditional distributions.
If this is a valid comparison, then we would expect that covariates \(X\) also change smoothly as we pass through the threshold.

Run the RDD with the covariate \(X\) as the outcome.
If we find a discontinuity, it suggests that the conditional expectations of the potential outcomes given \(W\) may not be continuous either.
If \(X\) has a direct effect on \(Y\), the discontinuity in \(E[X_{i}|W_{i}]\) at \(\bar{W}\) will confound the treatment effect.
Example:
- \(Y\): hours worked
- \(D\): older-than-65 discounts
- \(W\): age
- \(X\): social security benefit (non-work income)
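
A covariate test can be sketched by reusing a local linear RD with the covariate as the dependent variable. Everything below is simulated: the helper `rd_jump`, the two covariates, the bandwidth of 5 years, and the jump of 3 are hypothetical and not taken from any data set.

```python
import numpy as np

def rd_jump(v, w, w_bar, h):
    """Local linear estimate of the jump in E[v | W = w] at w_bar."""
    keep = np.abs(w - w_bar) <= h
    wt = w[keep] - w_bar
    d = (wt >= 0).astype(float)
    X = np.column_stack([np.ones(wt.size), wt, d, d * wt])
    coef, *_ = np.linalg.lstsq(X, v[keep], rcond=None)
    return coef[2]

rng = np.random.default_rng(3)
n = 10000
age = rng.uniform(55.0, 75.0, n)

# Hypothetical covariates: one smooth in age, one that jumps at 65.
x_smooth = 12.0 - 0.05 * (age - 65.0) + rng.normal(0.0, 1.0, n)
x_jump = 5.0 + 3.0 * (age >= 65.0) + rng.normal(0.0, 1.0, n)

jump_smooth = rd_jump(x_smooth, age, w_bar=65.0, h=5.0)
jump_discont = rd_jump(x_jump, age, w_bar=65.0, h=5.0)
print(jump_smooth)    # near 0: no evidence against the design
print(jump_discont)   # near 3: this covariate changes at the threshold
```

A covariate like the second one is a warning sign, since any effect it has on \(Y\) would be attributed to the treatment.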

5.3 Density Test, or No Bunching

Agents may manipulate the running variable if they know the institutional details.
- If schools scoring lower than \(w = 50\) on standardized tests get labeled as dysfunctional, we might see many schools scoring just above 50.
In this case, we observe bunching around the threshold.
- Agents are “manipulating” treatment assignment around the threshold.
- Density of \(W_{i}\) is discontinuous at \(\bar{W}\)
We would then expect that \(E[Y_{1i}|W_{i}=w]\) would also be discontinuous.
McCrary (2008) suggests a test of the null hypothesis that the density of \(W_{i}\) is continuous at \(\bar{W}\).
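
The idea behind a density check can be illustrated with a much cruder device than McCrary's test: compare the number of observations in a narrow window just below the threshold with the number just above it. The sketch below is not McCrary's (2008) procedure; the helper `bunching_z`, the window width, and the simulated manipulation are assumptions made for illustration.

```python
import numpy as np

def bunching_z(w, w_bar, h):
    """z-statistic comparing counts just above and just below w_bar.

    If the density is continuous and roughly flat near w_bar, each
    observation in the window falls on either side with probability 1/2."""
    n_below = np.sum((w >= w_bar - h) & (w < w_bar))
    n_above = np.sum((w >= w_bar) & (w < w_bar + h))
    return (n_above - n_below) / np.sqrt(n_above + n_below)

rng = np.random.default_rng(4)
scores = rng.uniform(0.0, 100.0, 5000)

# Hypothetical manipulation: 70% of schools in [47, 50) gain 3 points.
manipulated = scores.copy()
movers = (scores >= 47.0) & (scores < 50.0) & (rng.uniform(size=scores.size) < 0.7)
manipulated[movers] += 3.0

z_clean = bunching_z(scores, w_bar=50.0, h=3.0)
z_manip = bunching_z(manipulated, w_bar=50.0, h=3.0)
print(z_clean)   # moderate in absolute value
print(z_manip)   # large and positive: excess mass just above 50
```

A large statistic signals that the density of \(W_{i}\) jumps at \(\bar{W}\), which is the pattern the formal test is designed to detect.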

5.4 Bunching Estimation

Bunching itself is an interesting economic phenomenon. It can be used to analyze a different question.

5.5 Example: Ito and Sallee (2018, REStat)

image

6 Empirical Paper

6.1 Empirical Paper: Health Demand

“The Effect of Patient Cost Sharing on Utilization, Health, and Risk Protection” by Hitoshi Shigeoka (2014, AER)

6.2 Policy Issue: Medical Expenditure

Medical expenditures are rising.
- due to an aging population and coverage expansion
- acute fiscal challenge to governments!
Current expenditure on health as a share of GDP in 2018, according to OECD Health Statistics 2019:
- U.S.A. (16.9%), Switzerland (12.2%), Germany (11.2%), France (11.2%), Sweden (11.0%), Japan (10.9%)...
One main strategy is higher patient cost sharing, that is, requiring patients to pay a larger share of the cost of care.
Question: how does patient cost sharing affect
- utilization (demand elasticity)?
- health?
- risk protection (out-of-pocket expenditures)?

6.3 Background and Cross-sectional Data

All Japanese citizens are mandatorily covered by health insurance.
The paper uses the sharp reduction in cost sharing for patients at age 70 in Japan.
The data sources are the Patient Survey and the Comprehensive Survey of Living Conditions (CSLC), 1984-2008.
Advantages
- There are no confounding factors at age 70. We can isolate the effect of patient cost sharing.
- Medical providers have no incentive to differentiate prices by the patients’ insurance type.
- We can separate inpatient and outpatient.

6.5 Identification Strategy

Standard RD designs.
Basic estimation equation for the CSLC is \[Y_{iat}=f(a)+\beta Post70_{iat}+X_{iat}^{\prime}\gamma+\varepsilon_{iat}.\]
- \(Y_{iat}\): a measure of morbidity or out-of-pocket medical expenditure
- \(f(a)\): a smooth function of age.
- \(X_{iat}\): a set of individual covariates
- \(Post70_{iat}\): \(=1\) if individual \(i\) is over 70, and \(=0\) otherwise.
The Patient Survey and the mortality data cover only individuals who are present in medical institutions or deceased, respectively, so outcomes are measured at the age-time level rather than the individual level.
As in Card, Dobkin, and Maestas (2004), the basic estimation equation for the Patient Survey and mortality data is \[\log(Y_{at})=f(a)+\beta Post70_{at}+\mu_{at}.\]
We dealt with heaping, seasonality, and the catch-up effect.
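
The aggregate specification can be sketched on simulated age cells. Nothing below uses the paper's data: the monthly age grid, the quadratic form of \(f(a)\) with separate coefficients on each side of 70, and the jump of 0.10 in logs are all assumptions made for illustration, and the time dimension is dropped for simplicity.

```python
import numpy as np

rng = np.random.default_rng(5)

# Age cells in months from 65 to 75 (hypothetical grid), centered at 70.
age = np.arange(65 * 12, 75 * 12) / 12.0
a = age - 70.0
post70 = (a >= 0).astype(float)

beta_true = 0.10   # made-up jump in log counts, not an estimate from the paper
log_y = (8.0 + 0.03 * a + 0.002 * a ** 2 + beta_true * post70
         + rng.normal(0.0, 0.01, a.size))

# f(a): quadratic in age, allowed to differ on each side of 70 (an assumption).
X = np.column_stack(
    [np.ones(a.size), a, a ** 2, post70, post70 * a, post70 * a ** 2]
)
coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
beta_hat = coef[3]

# A jump of beta in logs corresponds to a 100 * (exp(beta) - 1) percent change.
pct_change = 100.0 * (np.exp(beta_hat) - 1.0)
print(beta_hat, pct_change)
```

With a log outcome, \(\beta\) is read as an approximate proportional change in the count at age 70.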

6.6 Results: Outpatient Visits

10.3% increase in overall visits. The implied elasticity is \(-0.18\).
Sharp drop, by one day, in the time elapsed since the last visit.
The effect is heterogeneous across institutions, genders, and diagnoses.

image

6.7 Results: Inpatient Admissions

Left: 8.2% increase in overall admissions. The implied elasticity is \(-0.16\).
Right: Surge (increase by 12.0%) in admissions with surgery.
From robustness checks, the implied elasticity is around \(-0.2\).

image

6.8 Benefits: Health Outcomes

We cannot find significant discontinuity in mortality.
This result is expected because health is a stock (Grossman 1972).
There is no discontinuity in morbidity (self-reported health).
The available health measures here are limited, so we may underestimate the benefit.

image

6.9 Benefits: Risk Reduction

Another benefit is a lower risk of unexpected out-of-pocket medical spending.
We use a nonparametric estimator for quantile treatment effects.
Patients at the right tail of the distribution in particular benefit substantially.

image

6.10 Discussion

Price Elasticities
- We cannot distinguish own- from cross-price effects.
- However, for some diagnosis groups, cross-price effects should be nearly zero.
- The overall effect of the price change for the groups is an approximately 10 percent increase in visits.
Cost-Benefit Analysis
- Imposing many assumptions, we speculate that the welfare gain of risk protection from lower patient cost sharing is comparable to the total social cost.
- We cannot include welfare gains from health improvements.

Program Evaluation (Causal Inference) 2: Regression Discontinuity Design

Instructor: Yuta Toyama

Last updated: 2020-06-22

1 Introduction

1.1 Introduction

1.2 Idea in Figure

1.3 Reference

2 Framework

2.1 Framework

2.2 Example: Incumbent Advantage

2.3 Framework cont.d

2.4 Linear approach

2.5 A more general approach

2.6 Implementation in Regression

3 Example

3.1 Mastering Metrics Sec 4.1: Effects of the minimum age drinking law

4 Formal Analysis

4.1 Formal Identification Analysis

5 Validation of Assumptions

5.1 Validation of Assumptions

5.2 Covariate Test

5.3 Density Test, or No Bunching

5.4 Bunching Estimation

5.5 Example: Ito and Sallee (2018, REStat)

6 Empirical Paper

6.1 Empirical Paper: Health Demand

6.2 Policy Issue: Medical Expenditure

6.3 Background and Cross-sectional Data

6.4 Cost Sharing and Out-of-Pocket Medical Expenditure

6.5 Identification Strategy

6.6 Results: Outpatient Visits

6.7 Results: Inpatient Admissions

6.8 Benefits: Health Outcomes

6.9 Benefits: Risk Reduction

6.10 Discussion