class: center, middle, inverse, title-slide

# Regression Discontinuity 1: Framework and Application
### Instructor: Yuta Toyama
### Last updated: 2021-07-15

---
class: title-slide-section, center, middle
name: logistics

# Introduction

---
## Introduction

- **Regression Discontinuity Design (RDD)**
  - Exploit the discontinuous change in treatment status to estimate the causal effect.

- Example:
  - Threshold of test score for college admission.
  - Eligibility for a policy based on age.
  - Geographic boundary between two regions.

---

.middle[
.center[
<img src="figure_table/fig_Manen.png" height="600">
]
]

---
## RD Idea

---
## Course Plan and Reference

- Plan
  - Framework
  - Estimation
  - Application: Shigeoka (2014)
  - Implementation in R

- Reference
  - Angrist and Pischke "Mostly Harmless Econometrics" Chapter 6
  - R packages: [https://sites.google.com/site/rdpackages/rdrobust](https://sites.google.com/site/rdpackages/rdrobust)

---
class: title-slide-section, center, middle
name: logistics

# Framework

---
## Framework

- `\(Y_{i}\)`: observed outcome for person `\(i\)`
<br/><br/>
- Define *potential outcomes*
  - `\(Y_{1i}\)`: outcome for `\(i\)` when she is treated (treatment group)
  - `\(Y_{0i}\)`: outcome for `\(i\)` when she is not treated (control group)
<br/><br/>
- `\(D_{i}\)`: treatment status, determined deterministically by the running variable (sharp RD design)
`$$D_{i}=\mathbf{1}\{W_{i}\geq\bar{W}\}$$`
  - `\(W_{i}\)`: running variable (forcing variable).
  - Probabilistic assignment is also possible (**fuzzy RD design**)

---
## Example: Incumbent Advantage

- Consider two-candidate elections.
  - `\(D_{i}\)`: dummy for being the incumbent in the election
  - `\(Y_{i}\)`: whether the candidate wins the election
  - `\(W_{i}\)`: the vote share in the previous election.
<br/><br/>
- The incumbent status is defined as
`$$D_{i}=\mathbf{1}\{W_{i}\geq0.5\}$$`
<br/>
- Idea of RD:
  - Suppose that you won with 51%.
  - You are similar to the candidate who lost with 49% (main assumption of RD).
  - If you focus on these candidates, `\(D_{i}\)` is as if it were randomly assigned.

---
## Framework (cont'd)

- Note that `\(D_{i}=\mathbf{1}\{W_{i}\geq\bar{W}\}\)` implies unconfoundedness
`$$(Y_{1i},Y_{0i})\perp D_{i}|W_{i}$$`
<br/>
- But the overlap assumption does not hold
`\begin{aligned}
P(D_{i}=1|W_{i}=w)=\begin{cases}
1 & \text{if}\ w\geq\bar{W}\\
0 & \text{if}\ w<\bar{W}
\end{cases}
\end{aligned}`
- To compare people with and without treatment, we need to rely on some sort of extrapolation around the threshold.

---
## Linear approach

- Suppose for a moment that
`\begin{aligned}
Y_{1i} & =\rho+Y_{0i}\\
E[Y_{0i}|W_{i}=w] & =\alpha_{0}+\beta_{0}w
\end{aligned}`
- This leads to a regression
`$$Y_{i}=\alpha+\beta W_{i}+\rho D_{i}+\eta_{i}$$`
  - `\(\rho\)` is the causal effect.
<br/><br/>
- This approach relies on linear extrapolation, which may be a poor approximation.
  - What if `\(E[Y_{0i}|W_{i}=w]\)` is nonlinear?

---

.middle[
.center[
<img src="figure_table/Figure6_1_1.png" height="630">
]
]

---
## A more general approach

- Allow for a nonlinear effect of the running variable `\(W_{i}\)`
`$$Y_{i}=f(W_{i})+\rho\mathbf{1}\{W_{i}\geq\bar{W}\}+\eta_{i}$$`
<br/>
- The function `\(f(\cdot)\)` might be a `\(p\)`th-order polynomial.
`$$f(W_{i})=\beta_{1}W_{i}+\beta_{2}W_{i}^{2}+\cdots+\beta_{p}W_{i}^{p}$$`
- A nonparametric approach is covered later.

---
## Implementation in Regression

- Consider
`\begin{aligned}
E[Y_{0i}|W_{i}=w] & =f_{0}(w-\bar{W})\\
E[Y_{1i}|W_{i}=w] & =\rho+f_{1}(w-\bar{W})
\end{aligned}`
where `\(\tilde{W}_{i}=W_{i}-\bar{W}\)` is a normalization.
- Then the regression equation is
`\begin{aligned}
Y_{i} & =\alpha+\beta_{01}\tilde{W}_{i}+\cdots+\beta_{0p}\tilde{W}_{i}^{p}\\
 & +\rho D_{i}+\beta_{1}^{*}D_{i}\tilde{W}_{i}+\cdots+\beta_{p}^{*}D_{i}\tilde{W}_{i}^{p}+\eta_{i}
\end{aligned}`

---

- When running the regression, we need to focus on the sample around the threshold.
- How close the sample should be to the threshold can be determined by a statistical procedure.
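---
## Sketch: Sharp RD with Local Linear Fits

A minimal sketch of the interacted regression with `\(p=1\)`, assuming simulated data and a hand-picked bandwidth `h`; the helper names `ols_line` and `sharp_rd` are hypothetical. With a full set of interactions, the coefficient on `\(D_{i}\)` equals the gap between two separate linear fits evaluated at `\(\tilde{W}_{i}=0\)`, which is what the code computes using only the Python standard library.

```python
import random


def ols_line(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def sharp_rd(w, y, cutoff, h):
    """Estimate the jump at the cutoff using observations within h of it.

    The running variable is centered at the cutoff, so each intercept is
    the fitted value at the threshold on that side.
    """
    left = [(a - cutoff, b) for a, b in zip(w, y) if cutoff - h <= a < cutoff]
    right = [(a - cutoff, b) for a, b in zip(w, y) if cutoff <= a <= cutoff + h]
    a0, _ = ols_line(*zip(*left))   # control side: intercept at W~ = 0
    a1, _ = ols_line(*zip(*right))  # treated side: intercept at W~ = 0
    return a1 - a0


# Simulated data (assumption): true jump rho = 2 at the cutoff 0.5.
random.seed(0)
w = [random.random() for _ in range(5000)]
y = [1 + 3 * a + 2 * (a >= 0.5) + random.gauss(0, 0.5) for a in w]

rho_hat = sharp_rd(w, y, cutoff=0.5, h=0.2)
print(round(rho_hat, 2))
```

The fixed `h = 0.2` is only for illustration; in applied work the bandwidth is usually chosen by a data-driven procedure.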
---
class: title-slide-section, center, middle
name: logistics

# Example

---
## Effects of the minimum legal drinking age law

.middle[
.center[
<img src="figure_table/MMfig41.png" height="500">
]
]

---

.middle[
.center[
<img src="figure_table/MMfig42.png" height="600">
]
]

---

.middle[
.center[
<img src="figure_table/MMfig44.png" height="600">
]
]

---

.middle[
.center[
<img src="figure_table/MMfig45.png" height="600">
]
]

---
class: title-slide-section, center, middle
name: logistics

# Validation of Assumptions

---
## Validation of Assumptions for RD

- Key assumption 1: SUTVA. No spillover of treatment across the threshold.
  - See next slide.

- Key assumption 2: Continuity of potential outcomes at the threshold.

---
## Violation of SUTVA

.middle[
.center[
<img src="figure_table/fig_STUVA_violate.png" height="500">
]
]

---
## Continuity of Potential Outcomes

- The key assumption: both `\(E[Y_{1i}|W_{i}=w]\)` and `\(E[Y_{0i}|W_{i}=w]\)` are continuous at the threshold `\(w=\bar{W}\)`.
<br/><br/>
- This is not directly testable because we cannot observe `\(Y_{1i}\)` below the threshold (or `\(Y_{0i}\)` above it).
<br/><br/>
- There are two common approaches that support this assumption:
  1. Covariate test
  2. Density test (no bunching in the running variable).

---
## Covariate Test

- The underlying idea of RDD: Comparing outcomes right above and right below `\(\bar{W}\)` provides a comparison of treated and control agents who are similar due to the assumed continuity in conditional distributions.
<br/><br/>
- If this is a valid comparison, then we would expect that covariates `\(X\)` also change smoothly as we pass through the threshold.

---

<br/><br/>
- Run the RD regression with the covariate `\(X\)` as the outcome.
<br/><br/>
- If we find a discontinuity, it suggests that the conditional expectations of the potential outcomes given `\(W\)` may not be continuous either.
<br/><br/>
- If `\(X\)` has a direct effect on `\(Y\)`, the discontinuity in `\(E[X_{i}|W_{i}]\)` at `\(\bar{W}\)` will confound the treatment effect.
<br/><br/>
- Example:
  - `\(Y\)`: hours worked,
  - `\(D\)`: older-than-65 discounts,
  - `\(W\)`: age, `\(X\)`: social security benefit (non-work income)

---
## Density Test, or No Bunching

- Manipulation is possible if agents know the institutional details.
  - If schools scoring lower than `\(w = 50\)` on standardized tests get labeled as dysfunctional, we might see many schools scoring just above 50.
<br/><br/>
- In this case, we observe **bunching** around the threshold.
  - Agents are "manipulating" treatment assignment around the threshold.
  - The density of `\(W_{i}\)` is discontinuous at `\(\bar{W}\)`.
<br/><br/>
- We would expect that `\(E[Y_{1i}|W_{i}=w]\)` would also be discontinuous.
<br/><br/>
- McCrary (2008) suggests a test of the null hypothesis that the density of `\(W_{i}\)` is continuous at `\(\bar{W}\)`.

---
## Digression: Bunching Analysis

- Bunching itself is an interesting economic phenomenon. It can be used to analyze a different question.

---
## Example: Ito and Sallee (2018, REStat)

.middle[
.center[
<img src="figure_table/ItoSallee.png" height="500">
]
]

---
class: title-slide-section, center, middle
name: logistics

# Empirical Paper

---
## Empirical Paper: Health Demand

- "The Effect of Patient Cost Sharing on Utilization, Health, and Risk Protection" by Hitoshi Shigeoka (2014, AER)

---
## Policy Issue: Medical Expenditure

- Medical expenditures are rising.
  - due to an aging population and coverage expansion
  - acute fiscal challenge to governments!
<br/><br/>
- Current expenditure on health (as a share of GDP) in 2018 according to OECD Health Statistics 2019
  - U.S.A. (16.9%), Switzerland (12.2%), Germany (11.2%), France (11.2%), Sweden (11.0%), Japan (10.9%), ...
<br/><br/>
- One main strategy is higher patient cost sharing, that is, requiring patients to pay a larger share of the cost of care.
<br/><br/>
- Question: how does patient cost sharing affect
  - utilization (demand elasticity)?
  - health?
  - risk protection (out-of-pocket expenditures)?
---
## Background and Cross-sectional Data

- All Japanese citizens are mandatorily covered by health insurance.
<br/><br/>
- Use a sharp reduction in cost sharing for patients aged 70 and over in Japan.
<br/><br/>
- The sources are the Patient Survey and the Comprehensive Survey of Living Conditions (CSLC), 1984-2008.
<br/><br/>
- Advantages
  - There are no confounding factors at age 70. We can isolate the effect of patient cost sharing.
  - Medical providers do not have an incentive to differentiate prices by the patients' insurance type.
  - We can separate inpatient and outpatient care.

---
## Cost Sharing and Out-of-Pocket Medical Expenditure

- In sum, the patient's share of costs is 30% for ages `\(\leq\)` 69 and 10% for ages `\(\geq\)` 70.
<br/><br/>
- Out-of-pocket medical expenditure for inpatient admissions can reach 27% for a 69-year-old.
<br/><br/>
- However, at age 70, it would be reduced to 8.6%.
<br/><br/>
- We need to take the stop-loss into account.

---

.middle[
.center[
<img src="figure_table/ShigeokaFig1.png" height="550">
]
]

---

.middle[
.center[
<img src="figure_table/ShigeokaTab2.png" height="500">
]
]

---
## Identification Strategy

- Standard RD design.
<br/>
- Basic estimation equation for the CSLC is
`$$Y_{iat}=f(a)+\beta Post70_{iat}+X_{iat}^{\prime}\gamma+\varepsilon_{iat}.$$`
  - `\(Y_{iat}\)`: a measure of morbidity or out-of-pocket medical expenditure
  - `\(f(a)\)`: a smooth function of age.
  - `\(X_{iat}\)`: a set of individual covariates
  - `\(Post70_{iat}\)`: `\(=1\)` if individual `\(i\)` is aged 70 or over.
<br/>
- The Patient Survey and mortality data cover only individuals who are present in medical institutions or deceased, respectively.
<br/>
- As in Card, Dobkin, and Maestas (2004), the basic estimation equation for the Patient Survey and mortality data is
`$$\log(Y_{at})=f(a)+\beta Post70_{at}+\mu_{at}.$$`

---
## Results: Outpatient Visits

- 10.3% increase in overall visits. The implied elasticity is `\(-0.18\)`.
<br/><br/>
- Sharp drop, by one day, in the time since the last visit.
<br/><br/>
- The effect is heterogeneous across institutions, genders, and diagnoses.

---

.middle[
.center[
<img src="figure_table/ShigeokaFig2A.png" height="530">
]
]

---

.middle[
.center[
<img src="figure_table/ShigeokaFig2B.png" height="530">
]
]

---
## Results: Inpatient Admissions

- Left: 8.2% increase in overall admissions. The implied elasticity is `\(-0.16\)`.
<br/><br/>
- Right: Surge (increase by 12.0%) in admissions with surgery.
<br/><br/>
- From robustness checks, the implied elasticity is around `\(-0.2\)`.

---

.middle[
.center[
<img src="figure_table/ShigeokaFig4A.png" height="530">
]
]

---

.middle[
.center[
<img src="figure_table/ShigeokaFig4B.png" height="530">
]
]

---
## Benefits: Health Outcomes

- We cannot find a significant discontinuity in mortality.
<br/><br/>
- This result is expected because health is a stock (Grossman 1972).
<br/><br/>
- There is no discontinuity in morbidity (self-reported health).
<br/><br/>
- The available health measures here are limited, so we would likely underestimate the benefit.

---

.middle[
.center[
<img src="figure_table/ShigeokaFig6.png" height="530">
]
]

---
## Benefits: Risk Reduction

- Another benefit is a lower risk of unexpected out-of-pocket medical spending.
<br/><br/>
- We use a nonparametric estimator for quantile treatment effects.
<br/><br/>
- Patients in the right tail of the distribution in particular benefit substantially.

---

.middle[
.center[
<img src="figure_table/ShigeokaFig7A.png" height="530">
]
]

---

.middle[
.center[
<img src="figure_table/ShigeokaFig7B.png" height="530">
]
]

---
## Discussion

- Price Elasticities
  - We cannot distinguish own- from cross-price effects.
  - However, for some diagnosis groups, cross-price effects should be nearly zero.
  - The overall effect of the price change for these groups is an approximately 10 percent increase in visits.
<br/><br/>
- Cost-Benefit Analysis
  - Imposing many assumptions, we speculate that the welfare gain of risk protection from lower patient cost sharing is comparable to the total social cost.
  - We cannot include welfare gains from health improvements.

---
class: title-slide-section, center, middle
name: logistics

# Appendix: Formal Analysis

---
## Formal Identification Analysis

- Key continuity assumption: both `\(E[Y_{1i}|W_{i}=w]\)` and `\(E[Y_{0i}|W_{i}=w]\)` are continuous at the threshold `\(w=\bar{W}\)`.
  - This assumption is not directly testable (because we cannot observe `\(Y_{1i}\)` below the threshold).
  - We discuss several validation approaches.
- To see how this works, notice that
`\begin{aligned}
E[Y_{i}|W_{i}=w]= & E[Y_{0i}|W_{i}=w]\\
 & +\mathbf{1}\{w\geq\bar{W}\}\left(E[Y_{1i}|W_{i}=w]-E[Y_{0i}|W_{i}=w]\right)
\end{aligned}`

---

<br/><br/>
- Taking the limit of `\(w\)` to `\(\bar{W}\)` from above and below
`\begin{aligned}
\lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\uparrow\bar{W}}E[Y_{0i}|W_{i}=w]=E[Y_{0i}|W_{i}=\bar{W}]\\
\lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\downarrow\bar{W}}E[Y_{1i}|W_{i}=w]=E[Y_{1i}|W_{i}=\bar{W}]
\end{aligned}`
- Notice that we use continuity in the second equalities!

---

<br/>
- Remember that
`\begin{aligned}
\lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\uparrow\bar{W}}E[Y_{0i}|W_{i}=w]=E[Y_{0i}|W_{i}=\bar{W}]\\
\lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w] & =\lim_{w\downarrow\bar{W}}E[Y_{1i}|W_{i}=w]=E[Y_{1i}|W_{i}=\bar{W}]
\end{aligned}`
- So, we have
`$$E[Y_{1i}-Y_{0i}|W_{i}=\bar{W}]=\lim_{w\downarrow\bar{W}}E[Y_{i}|W_{i}=w]-\lim_{w\uparrow\bar{W}}E[Y_{i}|W_{i}=w]$$`
  - LHS: Average treatment effect at the threshold
  - RHS: We can estimate it from the data.
    - Conditional expectation near the threshold.
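---
## Sketch: The Identification Result in Numbers

To illustrate the identification result numerically, the sketch below compares the mean of `\(Y_{i}\)` just above and just below the threshold as the window shrinks. The conditional mean functions are hypothetical and noise-free (an assumption for illustration, not taken from any data set), and `one_sided_gap` is an illustrative helper name.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def one_sided_gap(w, y, cutoff, h):
    """Mean of Y within h above the cutoff minus mean of Y within h below it."""
    above = [b for a, b in zip(w, y) if cutoff <= a < cutoff + h]
    below = [b for a, b in zip(w, y) if cutoff - h <= a < cutoff]
    return mean(above) - mean(below)


# Hypothetical conditional means (assumption, not from the slides):
#   E[Y0 | W = w] = w^2   and   E[Y1 | W = w] = w^2 + 1 + w.
# Both are continuous at 0.5, so the effect at the threshold is 1 + 0.5 = 1.5.
cutoff = 0.5
w = [i / 10000 for i in range(10001)]
y = [a ** 2 + (1 + a) * (a >= cutoff) for a in w]

# Shrinking the window mimics taking the limits from above and from below.
gaps = {h: one_sided_gap(w, y, cutoff, h) for h in (0.2, 0.05, 0.01)}
for h, gap in gaps.items():
    print(h, round(gap, 3))
```

With wide windows the gap also picks up the slope of the conditional means; it approaches the effect at the threshold only as the window shrinks, which is why the result is stated in terms of limits.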