Regression Discontinuity Design – Sharp
Exploits a sharp cutoff in treatment assignment to estimate causal effects near the threshold.
Quick Reference
- When to Use
- When treatment is assigned by a rule based on whether a continuous running variable crosses a known cutoff, the running variable is observed and measured accurately, and there is no evidence that units manipulate it.
- Key Assumption
- Continuity: potential outcomes and all background characteristics are continuous at the cutoff. The only thing that jumps at the cutoff is treatment status. This requires that units cannot precisely manipulate the running variable.
- Common Mistake
- Using high-order global polynomials instead of local linear/quadratic regression (use rdrobust), or not reporting the McCrary density test for manipulation of the running variable.
- Estimated Time
- 3 hours with tutorial lab
One-Line Implementation
Stata: rdrobust y x, c(0)
R: rdrobust(y = df$Y, x = df$X, c = 0)
Python: rdrobust(Y, X, c = 0)
Motivating Example
Does winning an election make a politician more likely to win the next election? You might think so — incumbents have name recognition, fundraising advantages, and can deliver benefits to their districts. But testing this is hard. Politicians who win elections are different from those who lose: they may be more charismatic, better funded, or from more favorable districts. Simply comparing incumbents to non-incumbents conflates the effect of incumbency with all these pre-existing differences.
David Lee found an elegant solution (Lee, 2008). In a two-candidate race, the candidate who gets 50.1% of the vote wins, while the candidate who gets 49.9% loses. But these candidates are nearly identical — they ran in similar districts, spent similar amounts, and had similar qualities. The 0.2 percentage point difference in vote share that separates them is essentially random noise.
Lee's insight: by comparing candidates who barely won to candidates who barely lost, you get something close to a randomized experiment — but created naturally by the sharp cutoff rule that determines who wins. This approach is a regression discontinuity design.
The finding: barely winning an election increases the probability of winning the next election by about 45 percentage points. The incumbency advantage is real and enormous.
A. Overview
What RDD Does
A regression discontinuity design exploits a rule that assigns treatment based on whether a continuous variable (the running variable) falls above or below a known cutoff. Units just above and just below the cutoff are almost identical on average — the only systematic difference between them is whether they received treatment. By comparing outcomes on either side of the cutoff, you can estimate the causal effect of treatment.
Why It Works
The key intuition is that units very close to the cutoff are "as good as randomly assigned" to treatment. A student who scores 71 on an exam and receives a scholarship is not meaningfully different from a student who scores 69 and does not. Their tiny score difference is essentially noise — influenced by whether they slept well, guessed correctly on one question, or had a good day. By zooming in on this narrow window around the cutoff, you approximate the conditions of a randomized experiment.
Sharp vs. Fuzzy
In a sharp RDD, the treatment is a deterministic function of the running variable: everyone above the cutoff is treated, everyone below is not. There is no discretion, no exceptions, no partial compliance. Examples:
- Scholarship awarded if test score ≥ 80 (everyone above gets it, no one below does)
- Politician wins if vote share > 50%
- Firm is regulated if revenue exceeds a threshold
In a fuzzy RDD, crossing the cutoff changes the probability of treatment but does not determine it perfectly. See the Fuzzy RDD page for that case. The fuzzy design is closely related to instrumental variables, where the cutoff serves as the instrument for actual treatment.
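The sharp/fuzzy distinction can be made concrete with a small simulation. This is an illustrative Python sketch: the cutoff at 0 and the fuzzy treatment probabilities (20% below, 80% above) are assumed values, not from any real design.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 10_000)  # running variable, assumed cutoff at 0

# Sharp design: treatment is a deterministic function of the running variable.
d_sharp = (x >= 0).astype(int)

# Fuzzy design: crossing the cutoff raises the probability of treatment
# (here from an assumed 20% to 80%) without determining it perfectly.
p = np.where(x >= 0, 0.8, 0.2)
d_fuzzy = rng.binomial(1, p)

# Sharp: treatment rate jumps from exactly 0 to exactly 1 at the cutoff.
print(d_sharp[x < 0].mean(), d_sharp[x >= 0].mean())
# Fuzzy: treatment rate jumps, but compliance is imperfect on both sides.
print(d_fuzzy[x < 0].mean(), d_fuzzy[x >= 0].mean())
```

In the sharp case the jump in treatment probability at the cutoff is exactly one; in the fuzzy case it is the (smaller) jump in compliance rates, which is why fuzzy RDD rescales the outcome jump by it, as in IV.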
When to Use RDD
- Treatment is assigned by a rule based on a continuous running variable and a known cutoff
- The running variable is observed and measured accurately
- There is no evidence that units manipulate the running variable to sort around the cutoff
- You have enough observations near the cutoff
When NOT to Use RDD
- The running variable is discrete with few values (e.g., age in whole years) — the "as good as random" argument is weaker
- There is strong evidence of manipulation (e.g., students who know the cutoff score target exactly that score)
- You want to estimate the effect for the entire population, not just for units near the cutoff
- The cutoff is not sharp (treatment leaks across the boundary) — consider fuzzy RDD instead
The Taxonomy Position
RDD is a design-based method. Its credibility comes from the institutional rule that generates the cutoff, not from a model of how confounders relate to outcomes. Unlike OLS, which requires the researcher to correctly specify all confounders, RDD leverages a specific institutional feature that produces quasi-random variation. Thistlethwaite and Campbell (1960) introduced the idea in the context of merit-based scholarships (Thistlethwaite & Campbell, 1960). Lee and Lemieux (2010) describe RDD as "the most credible and transparent non-experimental strategy for estimating causal effects." This characterization reflects RDD's strong internal validity when a sharp cutoff rule exists and manipulation is absent; it does not imply RDD is universally preferred, since the design is only applicable in settings with an institutional threshold and the resulting estimates are local to the cutoff.
B. Identification
Assumption 1: Continuity of Potential Outcomes at the Cutoff
Plain language: Everything about units — their potential outcomes, their background characteristics, their unobserved traits — changes smoothly as the running variable crosses the cutoff. The only thing that jumps at the cutoff is treatment status. If you could graph every characteristic against the running variable, you would see smooth curves everywhere except in the treatment indicator, which jumps from 0 to 1.
Formally: E[Y(0) | X = x] and E[Y(1) | X = x] are continuous in x at the cutoff x = c, where Y(0) and Y(1) are the potential outcomes without and with treatment.
This continuity condition is the core identifying assumption. If it holds, any discontinuity in the outcome at the cutoff can be attributed to the treatment.
Assumption 2: No Manipulation (No Precise Sorting)
Plain language: Units cannot precisely control their running variable to sort around the cutoff. If students who would score 69 can strategically boost their score to 71 to get a scholarship, the students just above and below the cutoff are no longer comparable — the ones above are those who tried harder or had the resources to manipulate their score.
"No manipulation" does not mean units are unaware of the cutoff or do not try to cross it. It means they cannot precisely control the running variable. In elections, candidates try hard to win, but they cannot guarantee they will get exactly 50.1% of the vote. The residual uncertainty in the running variable is what creates the quasi-random assignment.
Assumption 3: Treatment Assignment is Sharp
Plain language: Treatment is a deterministic function of the running variable. Everyone above the cutoff is treated; no one below is treated. If there are exceptions — people above the cutoff who do not receive treatment, or people below who do — you have a fuzzy RDD, which requires different estimation methods (essentially, IV/2SLS at the cutoff).
What RDD Identifies
Under these assumptions, the RDD estimates the local average treatment effect at the cutoff:

τ = E[Y(1) − Y(0) | X = c] = lim(x↓c) E[Y | X = x] − lim(x↑c) E[Y | X = x]

This quantity is the expected treatment effect for units whose running variable is exactly at the threshold.
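The two one-sided limits can be approximated crudely by average outcomes in a narrow window on each side of the cutoff. A minimal Python sketch under an assumed linear DGP with a true effect of 3 (all numbers here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.uniform(-1, 1, n)           # running variable, cutoff at 0
tau = 3.0                           # assumed true effect at the cutoff
d = (x >= 0).astype(int)
y = 1.0 + 2.0 * x + tau * d + rng.normal(0, 1, n)

# Approximate the one-sided limits with means in a narrow window around c = 0.
h = 0.05
above = y[(x >= 0) & (x < h)].mean()
below = y[(x < 0) & (x > -h)].mean()
print(above - below)  # close to tau = 3, up to noise plus a small slope bias of order h
```

Note that a raw difference in window means still carries a small bias from the slope of the regression function (of order h); fitting local linear regressions on each side, as in Section D, removes that first-order bias.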
C. Visual Intuition
The Jump at the Cutoff
The running variable (e.g., vote share) varies continuously across units.
Notice what happens as you increase the polynomial order above 2: the fit becomes erratic near the cutoff, and the estimate becomes unstable. This instability is why Gelman and Imbens (2019) recommend against global high-order polynomials. Stick with local linear (order 1) or local quadratic (order 2).
Regression Discontinuity Design
Explore how RDD identifies a causal effect at a cutoff. The DGP is Y = f(R) + 3.0·D + ε, where D = 1(R ≥ 0) and f(R) includes polynomial terms controlled by the curvature parameter.
Estimation Results (local n = 84)
| Estimator | β̂ | Bias |
|---|---|---|
| Global OLS (D only) | 6.919 | +3.919 |
| Global OLS (D + R) | 2.859 | -0.141 |
| Local linear RD | 3.505 | +0.505 |
| True β | 3.000 | — |
Adjustable parameters: the number of observations to generate, the causal jump at the cutoff R = 0, the curvature of the conditional expectation (0 = linear, 1 = strong nonlinearity), the width of the local window |R| < h around the cutoff, and the standard deviation of the error term.
D. Mathematical Derivation
Don't worry about the notation yet — here's what this means in words: The RDD treatment effect is the difference in the limits of the conditional expectation function from the right and left of the cutoff.
Setup. Let X be the running variable with cutoff c. Treatment is:

D = 1(X ≥ c)
The observed outcome is:

Y = (1 − D)·Y(0) + D·Y(1)
Step 1: Define the estimand.
The causal effect at the cutoff is:

τ = E[Y(1) − Y(0) | X = c]
Under the continuity assumption, E[Y(0) | X = x] and E[Y(1) | X = x] are continuous at x = c. This continuity means:

τ = lim(x↓c) E[Y | X = x] − lim(x↑c) E[Y | X = x]

Why? Because just above the cutoff, everyone is treated, so E[Y | X = x] = E[Y(1) | X = x] for x ≥ c. Just below, no one is treated, so E[Y | X = x] = E[Y(0) | X = x] for x < c.
Step 2: Local polynomial estimation.
To estimate the limits from each side, fit separate polynomials above and below the cutoff, using only observations within a bandwidth h of c:

For observations with X_i ≥ c, fit: Y_i = α₊ + β₊(X_i − c) + ... + ε_i

For observations with X_i < c, fit: Y_i = α₋ + β₋(X_i − c) + ... + ε_i

The RDD estimate is τ̂ = α̂₊ − α̂₋, the difference in the two intercepts at the cutoff.
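Step 2 can be sketched in a few lines: fit a separate least-squares line on each side of the cutoff within the bandwidth and take the difference of the intercepts. This minimal Python sketch uses a simulated DGP; the true jump of 2.5, the bandwidth, and the `local_linear_rd` helper are illustrative assumptions, and it omits rdrobust's bias correction and data-driven bandwidth.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.uniform(-1, 1, n)
tau = 2.5  # assumed true jump at the cutoff
y = 1.0 + 1.5 * x + 0.8 * x**2 + tau * (x >= 0) + rng.normal(0, 1, n)

def local_linear_rd(y, x, c=0.0, h=0.2):
    """Sharp RD estimate: separate OLS lines on each side within bandwidth h;
    the estimate is the difference in intercepts at the cutoff."""
    xc = x - c
    right = (xc >= 0) & (xc <= h)
    left = (xc < 0) & (xc >= -h)
    # np.polyfit with degree 1 returns [slope, intercept]
    a_plus = np.polyfit(xc[right], y[right], 1)[1]
    a_minus = np.polyfit(xc[left], y[left], 1)[1]
    return a_plus - a_minus

print(local_linear_rd(y, x))  # close to the assumed tau = 2.5
```

The intercepts are the fitted values of the two regression lines exactly at the cutoff, so their difference estimates the jump τ.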
Step 3: Bandwidth selection.
The bandwidth h controls a bias-variance tradeoff:
- Smaller h: less bias (observations closer to the cutoff are more comparable) but more variance (fewer observations used)
- Larger h: more observations, but greater risk that the polynomial misapproximates the true regression function
Calonico et al. (2014) developed a data-driven bandwidth selector that minimizes the asymptotic mean squared error of the estimator (Calonico et al., 2014). This selector is what rdrobust uses by default.
Step 4: Bias-corrected inference.
The standard local polynomial estimator has a bias of order h^(p+1) (where p is the order of the local polynomial). Calonico et al. (2014) proposed a bias-corrected estimator with robust confidence intervals that account for this bias. This correction is the "robust" in rdrobust.
E. Implementation
The rdrobust Ecosystem
The standard approach for RDD estimation is the rdrobust package, developed by Calonico, Cattaneo, and Titiunik (2014) and collaborators. It is available in Stata, R, and Python.
library(rdrobust)
library(rddensity)
# Basic sharp RDD estimate
rd <- rdrobust(y = df$y, x = df$x, c = 0)
summary(rd)
# RDD plot
rdplot(y = df$y, x = df$x, c = 0, p = 1,
x.label = "Running Variable",
y.label = "Outcome",
title = "Sharp RDD Plot")
# Manipulation test
density_test <- rddensity(X = df$x, c = 0)
summary(density_test)
rdplotdensity(density_test, df$x)
# With covariates
rd_cov <- rdrobust(y = df$y, x = df$x, c = 0,
covs = cbind(df$age, df$female))
summary(rd_cov)
# Sensitivity: vary bandwidth
for (h in c(5, 10, 15, 20)) {
cat("\nBandwidth =", h, "\n")
print(summary(rdrobust(y = df$y, x = df$x, c = 0, h = h)))
}

Understanding the rdrobust Output
The key numbers to report from rdrobust:
- Conventional estimate: The local polynomial point estimate with conventional standard error
- Bias-corrected estimate: The point estimate after bias correction
- Robust confidence interval: Uses the bias-corrected estimate with a standard error that accounts for the additional variability introduced by the bias correction (preferred for inference)
- Bandwidth: The data-driven optimal bandwidth used
- Effective number of observations: How many observations fall within the bandwidth on each side
Bandwidth Referee Roulette
You are a referee evaluating an RDD paper. Pick a bandwidth and see how the treatment effect estimate changes. The DGP has a known jump of τ = 2.5 at X = 0 with curvature in the conditional expectation function. Can you find the sweet spot between bias and variance?
Adjustable parameters: the width of the estimation window on each side of the cutoff, the total number of observations, and the true discontinuity jump at the cutoff.
Estimation Results (one illustrative run)

| Quantity | Value |
|---|---|
| RD estimate (τ̂) | 2.508 |
| Standard error | 0.310 |
| N in bandwidth | 202 |
| Bias (τ̂ − τ) | +0.008 |
| Your bandwidth (h) | 1.5 |
| MSE-optimal (h*) | 1.22 |
Referee Verdict: Goldilocks Zone
Your bandwidth is in the Goldilocks zone. It balances bias and variance well, using enough observations for precision while staying close enough to the cutoff to avoid bias from the nonlinear conditional expectation function.
Your bandwidth is 1.23x the MSE-optimal bandwidth (h* = 1.22).
Why this matters: Bandwidth selection is the key researcher degree of freedom in RDD. Too narrow and your estimate is noisy; too wide and you introduce bias from the functional form. The MSE-optimal bandwidth (Imbens & Kalyanaraman, 2012) balances squared bias and variance. Good practice: report estimates across a range of bandwidths and show sensitivity. Referees should be suspicious if results only hold at one specific bandwidth.
F. Diagnostics
F.1 McCrary Density Test (Manipulation Check)
An essential diagnostic for any RDD. If units can precisely sort around the cutoff, you will see a discontinuity in the density of the running variable at the cutoff. The McCrary test (and its modern successor rddensity) formally tests for this sorting behavior.
- If the density is smooth through the cutoff: no evidence of manipulation (good)
- If there is a jump in density: possible manipulation (serious threat to identification)
Example: If students know the scholarship cutoff is 80, and you see an excess of students scoring exactly 80-82 with a deficit at 78-79, that bunching is evidence of manipulation.
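A crude version of this check can be simulated: compare how many observations fall just above versus just below the cutoff. This Python sketch assumes a hypothetical scenario in which 15% of students within 2 points below the cutoff push themselves just above it; the `bin_ratio` helper is an illustrative bunching check, not the actual McCrary/rddensity local-polynomial density test.

```python
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(70, 10, 20_000)  # smooth running variable (no manipulation)
cutoff = 80

# Manipulated version: an assumed 15% of units scoring within 2 points
# below the cutoff strategically land just above it instead.
manipulated = scores.copy()
near_below = (manipulated >= cutoff - 2) & (manipulated < cutoff)
push = near_below & (rng.uniform(size=manipulated.size) < 0.15)
manipulated[push] = cutoff + rng.uniform(0, 1, push.sum())

def bin_ratio(r, c, w=1.0):
    """Crude bunching check: ratio of counts just above vs just below the cutoff.
    (The real McCrary/rddensity test fits local polynomials to the density.)"""
    above = ((r >= c) & (r < c + w)).sum()
    below = ((r >= c - w) & (r < c)).sum()
    return above / below

print(bin_ratio(scores, cutoff))       # roughly 1: smooth density through the cutoff
print(bin_ratio(manipulated, cutoff))  # well above 1: excess mass just above the cutoff
```

In practice, use rddensity (shown in Section E) rather than a two-bin count; the formal test accounts for the shape of the density on each side.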
F.2 Covariate Balance (Placebo Tests)
Run the RDD on each pre-determined covariate (age, gender, prior achievement, etc.) as the outcome. If the running variable is as-good-as-randomly assigned near the cutoff, there should be no jump in any pre-determined covariate. This test is the RDD analog of the "balance table" in a randomized experiment.
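The mechanics are the same estimation run with a covariate in place of the outcome. A Python sketch under assumed DGPs (the covariate "age" is smooth in the running variable by construction, and `rd_jump` is an illustrative local-linear helper, not rdrobust):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
x = rng.uniform(-1, 1, n)                # running variable, cutoff at 0
age = 30 + 5 * x + rng.normal(0, 3, n)   # pre-determined covariate: smooth in x
y = 2.0 * (x >= 0) + x + rng.normal(0, 1, n)  # outcome with an assumed jump of 2

def rd_jump(outcome, x, c=0.0, h=0.2):
    """Difference in local-linear intercepts at the cutoff."""
    xc = x - c
    r, l = (xc >= 0) & (xc <= h), (xc < 0) & (xc >= -h)
    return np.polyfit(xc[r], outcome[r], 1)[1] - np.polyfit(xc[l], outcome[l], 1)[1]

print(rd_jump(y, x))    # actual outcome: jump close to 2
print(rd_jump(age, x))  # pre-determined covariate: jump close to 0
```

A significant jump in a pre-determined covariate would suggest the units on either side of the cutoff are not comparable, undermining the continuity assumption.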
F.3 Bandwidth Sensitivity
Re-estimate the treatment effect using a range of bandwidths (e.g., 0.5x, 0.75x, 1x, 1.25x, 1.5x, 2x the optimal bandwidth). If the estimate is stable across bandwidths, that stability is reassuring. If it swings wildly, the result may be fragile. For a broader discussion of robustness testing across specifications, see sensitivity analysis.
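The bandwidth sweep is a short loop. This Python sketch uses an assumed DGP with a true jump of 2.5 and a cubic term, so very wide bandwidths pick up visible bias; `rd_jump` is an illustrative local-linear helper, not rdrobust.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000
x = rng.uniform(-1, 1, n)
# Assumed DGP: true jump 2.5, nonlinear conditional expectation via the cubic term.
y = 2.5 * (x >= 0) + x + 2.0 * x**3 + rng.normal(0, 1, n)

def rd_jump(y, x, h):
    """Local-linear RD estimate at cutoff 0 with bandwidth h."""
    r, l = (x >= 0) & (x <= h), (x < 0) & (x >= -h)
    return np.polyfit(x[r], y[r], 1)[1] - np.polyfit(x[l], y[l], 1)[1]

for h in (0.05, 0.1, 0.2, 0.4, 0.8):
    print(f"h = {h:.2f}  estimate = {rd_jump(y, x, h):.2f}")
# Narrow bandwidths are noisier; the widest bandwidth is biased downward
# because the linear fit misapproximates the cubic regression function.
```

Estimates that are stable across the moderate bandwidths but drift at the widest one illustrate exactly the fragility this diagnostic is meant to reveal.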
F.4 Placebo Cutoffs
Estimate the treatment effect at fake cutoffs where no treatment actually occurs (e.g., at the median of the running variable, or at percentiles far from the real cutoff). You should find no significant effects at these placebo cutoffs. If you do, something besides the treatment is causing discontinuities.
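A placebo-cutoff sweep reuses the same estimator at thresholds where nothing happens. In this Python sketch the real cutoff is assumed to be 0 and the placebo cutoffs ±0.5; `rd_jump` is an illustrative local-linear helper.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30_000
x = rng.uniform(-1, 1, n)
y = 3.0 * (x >= 0) + 1.5 * x + rng.normal(0, 1, n)  # assumed true jump of 3 at x = 0

def rd_jump(y, x, c, h=0.15):
    """Local-linear RD estimate at an arbitrary cutoff c."""
    xc = x - c
    r, l = (xc >= 0) & (xc <= h), (xc < 0) & (xc >= -h)
    return np.polyfit(xc[r], y[r], 1)[1] - np.polyfit(xc[l], y[l], 1)[1]

for c in (-0.5, 0.0, 0.5):  # placebo cutoffs on either side of the real one at 0
    print(f"cutoff = {c:+.1f}  jump = {rd_jump(y, x, c):+.2f}")
# Only the true cutoff should show a jump; placebo cutoffs should be near zero.
```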
F.5 Donut Hole
Remove observations very close to the cutoff (e.g., within 1 unit) and re-estimate. If the result is driven entirely by the handful of observations right at the boundary, it may be fragile or reflect manipulation by units who scored exactly at the cutoff.
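The donut-hole check is a one-parameter variation on the same estimator: drop a band around the cutoff and extrapolate the two fitted lines back to it. A Python sketch with an assumed DGP (true jump 2, donut radius 1); `rd_jump` is an illustrative helper.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
x = rng.uniform(-10, 10, n)
y = 2.0 * (x >= 0) + 0.3 * x + rng.normal(0, 1, n)  # assumed true jump of 2 at x = 0

def rd_jump(y, x, h=4.0, donut=0.0):
    """Local-linear RD estimate at cutoff 0, excluding |x| < donut."""
    keep = np.abs(x) >= donut
    xk, yk = x[keep], y[keep]
    r = (xk >= 0) & (xk <= h)
    l = (xk < 0) & (xk >= -h)
    return np.polyfit(xk[r], yk[r], 1)[1] - np.polyfit(xk[l], yk[l], 1)[1]

print(rd_jump(y, x))             # full sample: close to 2
print(rd_jump(y, x, donut=1.0))  # drop |x| < 1: similar estimate if the result is robust
```

The donut estimate is noisier (the intercepts are extrapolated from farther away), so expect a wider confidence interval, but the point estimate should not move much if the result is not driven by units right at the boundary.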
Interpreting Your Results
What to Report
- The RDD plot with the fitted lines and data points
- The point estimate, robust confidence interval, and p-value from rdrobust
- The optimal bandwidth and the effective number of observations on each side
- The McCrary density test results
- Covariate balance tests at the cutoff
- Sensitivity of the estimate to bandwidth choice
How to Interpret the Estimate
The estimate is the average treatment effect for units whose running variable is at the cutoff. It is a local effect: extrapolating it to units far from the threshold requires additional assumptions.
Common Misstatements to Avoid
- Avoid claiming the effect applies to all units — it is local to the cutoff
- Avoid using global high-order polynomials and presenting the approach as best practice
- Include the McCrary test
- Remember that the effective sample size is the number within the bandwidth, not the total sample
G. What Can Go Wrong
Manipulation of the Running Variable
What you want: the running variable is as-good-as-random near the cutoff (e.g., election vote share).
Healthy diagnostic: McCrary test p = 0.34 with a smooth density at the cutoff; the RDD estimate is valid.
Global High-Order Polynomial
What you want: local linear regression within the optimal bandwidth (the rdrobust default).
Healthy result: estimate = 5.0 (SE = 1.2); stable, interpretable, with low sensitivity to distant observations.
Discrete Running Variable
What you want: a continuous running variable (e.g., vote share to many decimal places), so that observations near the cutoff are truly comparable and the continuity assumption is plausible.
H. Practice
H.1 Concept Checks
A scholarship is awarded to all students who score 80 or above on a standardized test. A researcher estimates that the scholarship increases college enrollment by 15 percentage points using an RDD. For which students does this estimate identify the effect?
You are running an RDD and the McCrary density test has a p-value of 0.001. What does this suggest?
A researcher estimates an RDD using a 5th-order global polynomial and finds a large, significant treatment effect. When they switch to local linear regression with rdrobust, the effect shrinks and becomes insignificant. What is the likely explanation?
You run covariate balance tests at the cutoff and find that age, income, and gender show no discontinuity, but race shows a significant jump (p = 0.02). Should you be concerned?
In Lee's (2008) incumbency study, why is the vote share margin a good running variable for RDD?
H.2 Guided Exercise
You are analyzing a scholarship program that awards funding to students who score 80 or above on an exam. Fill in the blanks.
You have test scores for 2,000 students. Running rdrobust with the cutoff at 80, you obtain: Conventional estimate: 12.5 (SE = 3.1) Bias-corrected (robust) estimate: 11.8 (robust SE = 3.6) Robust 95% CI: [4.7, 18.9] Optimal bandwidth: 7.2 points Effective N (left of cutoff): 312 Effective N (right of cutoff): 298 McCrary test p-value: 0.41
H.3 Error Detective
Read the analysis below carefully and identify the errors.
Select all errors you can find:
H.5 You Are the Referee
Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.
Paper Summary
The authors study whether winning a close mayoral election affects city-level economic growth. The running variable is the winner's vote margin. Using a sharp RDD with rdrobust, they compare cities where the eventual mayor won by less than 5 percentage points. They find that cities where a pro-business candidate barely won grew 1.3 percentage points faster over the next 4 years than cities where the opposing candidate barely won. The McCrary test shows no manipulation (p = 0.62). Covariate balance tests show no discontinuity in city size, income, or industry composition at the cutoff.
Key Table
| Variable | RDD Estimate | Robust SE | Robust 95% CI |
|---|---|---|---|
| GDP growth (4-year) | 1.30 | 0.52 | [0.28, 2.32] |
| Optimal bandwidth | 4.8 pp | | |
| Effective N (left) | 142 | | |
| Effective N (right) | 156 | | |
| McCrary p-value | 0.62 | | |
| Covariate balance tests | All p > 0.1 | | |
Authors' Identification Claim
In close elections, the winning margin is effectively random, making this design equivalent to a randomized experiment. The McCrary test and covariate balance support this claim.
I. Swap-In: When to Use Something Else
- Fuzzy RDD: When the cutoff induces a jump in the probability of treatment but not a deterministic shift — fuzzy RDD uses the threshold as an instrument for treatment receipt.
- Difference-in-differences: When there is no sharp threshold but a policy change creates treated and untreated groups with temporal variation.
- IV / 2SLS: When the source of exogenous variation is not a threshold on a running variable but a discrete instrument (e.g., lottery, draft number).
- Matching: When there is no threshold but rich observational data allow for matching on pre-treatment covariates.
- RDD with covariates: When the bandwidth is narrow and precision is low, adding pre-determined covariates can improve efficiency without threatening identification.
J. Reviewer Checklist
Critical Reading Checklist
Paper Library
Foundational (8)
Thistlethwaite, D. L., & Campbell, D. T. (1960). Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.
This paper introduced the regression discontinuity design. Thistlethwaite and Campbell proposed comparing units just above and just below a cutoff score to estimate causal effects, reasoning that units near the cutoff are essentially randomly assigned. The idea lay dormant for decades before being rediscovered by economists.
Lee, D. S. (2008). Randomized Experiments from Non-random Selection in U.S. House Elections.
Lee formalized the conditions under which an RDD is 'as good as' a randomized experiment—namely, when agents cannot precisely manipulate the running variable around the cutoff. Applied to U.S. House elections, this paper established the modern theoretical foundation for sharp RDD.
Imbens, G. W., & Lemieux, T. (2008). Regression Discontinuity Designs: A Guide to Practice.
This practical guide covers all the key steps of implementing an RDD: choosing bandwidth, testing for manipulation, selecting polynomial order, and conducting robustness checks. It is the standard how-to reference for applied researchers.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.
This paper developed bias-corrected confidence intervals for RDD that solve a fundamental problem: conventional methods undersmooth or oversmooth the local polynomial fit. Their 'rdrobust' software package has become the standard tool for RDD estimation.
Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2024). A Practical Introduction to Regression Discontinuity Designs: Extensions.
This follow-up volume to Cattaneo, Idrobo, and Titiunik's first book covers extensions of the regression discontinuity framework, including multi-score designs, geographic RDD, kink designs, and discrete running variables. It provides practical guidance and software implementations for these more advanced settings, making it an essential companion for applied researchers going beyond the standard sharp RDD.
Gelman, A., & Imbens, G. W. (2019). Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs.
Gelman and Imbens showed that using high-order global polynomials in RDD leads to noisy estimates, sensitivity to the degree of polynomial, and poor coverage of confidence intervals. They recommended local linear or quadratic fits with appropriate bandwidth selection instead, fundamentally changing best practice for RDD estimation.
Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2020). A Practical Introduction to Regression Discontinuity Designs: Foundations.
A concise, modern guide to RDD that covers the foundations, estimation, inference, and validation. Designed for practitioners and pairs perfectly with the rdrobust software package. Complements the Extensions volume (Cattaneo, Idrobo, and Titiunik, 2024) which covers more advanced settings.
Cattaneo, M. D., Titiunik, R., & Vazquez-Bare, G. (2019). Power Calculations for Regression-Discontinuity Designs.
Provides methods and software for power calculations in RDD, essential for study design and determining adequate sample sizes near the cutoff. The associated rdsampsi command enables researchers to plan appropriately powered RDD studies before data collection.
Application (6)
Angrist, J. D., & Lavy, V. (1999). Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement.
This celebrated paper exploited a discontinuity in class size created by Maimonides' ancient rule (maximum 40 students per class) to estimate the effect of class size on test scores. It is one of the earliest and most elegant RDD applications in economics.
McCrary, J. (2008). Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test.
McCrary developed the standard test for whether agents are manipulating the running variable to sort around the cutoff. If the density of the running variable shows a discontinuity at the cutoff, the RDD is compromised. This density test is now a routine validity check in all RDD papers.
Chava, S., & Roberts, M. R. (2008). How Does Financing Impact Investment? The Role of Debt Covenants.
Chava and Roberts used an RDD around debt covenant thresholds to study how covenant violations affect firm investment. This paper is an important early application of RDD in corporate finance, where accounting-based thresholds create natural discontinuities.
Flammer, C. (2015). Does Corporate Social Responsibility Lead to Superior Financial Performance? A Regression Discontinuity Approach.
Flammer applied an RDD to shareholder votes on CSR proposals, comparing firms where proposals barely passed versus barely failed. She found that adopting CSR proposals led to improved financial performance. This paper is one of the best-known RDD applications in a top management journal.
Dell, M. (2010). The Persistent Effects of Peru's Mining Mita.
Uses a geographic RDD exploiting the historical boundary of the mita forced labor system in Peru to estimate the persistent effect of colonial institutions on economic outcomes centuries later. Demonstrates how RDD can exploit spatial discontinuities, not just score-based cutoffs.
Cunat, V., Gine, M., & Guadalupe, M. (2012). The Vote Is Cast: The Effect of Corporate Governance on Shareholder Value.
Uses close shareholder votes on governance proposals as an RDD to estimate the effect of governance changes on firm value. Demonstrates how the RDD framework can be applied in corporate finance and governance research, complementing Flammer (2015) on CSR proposals.
Survey (2)
Lee, D. S., & Lemieux, T. (2010). Regression Discontinuity Designs in Economics.
This comprehensive survey covers the theory, practice, and applications of RDD in economics. It discusses both sharp and fuzzy designs, graphical analysis, specification testing, and common pitfalls. Essential reading for anyone starting with RDD.
Cattaneo, M. D., & Titiunik, R. (2022). Regression Discontinuity Designs.
A recent survey covering the state of the art in RDD methodology, including extensions to fuzzy designs, geographic RDD, and multi-cutoff designs. Provides guidance on current recommended practices and is an excellent entry point to the modern RDD literature.