Regression Discontinuity Design – Fuzzy
When crossing the cutoff changes the probability of treatment (not a guarantee), use fuzzy RDD — essentially IV at the cutoff.
Quick Reference
- When to Use: When crossing the cutoff changes the probability of treatment but does not determine it perfectly, so there is noncompliance at the threshold. Essentially IV at the cutoff.
- Key Assumptions: The running variable cannot be precisely manipulated, the cutoff is a valid instrument (exclusion restriction holds at the threshold), and the first stage (jump in treatment probability at the cutoff) is sufficiently strong.
- Common Mistake: Applying sharp RDD methods when there is substantial noncompliance at the threshold, or not reporting and testing the first-stage jump in treatment probability.
- Estimated Time: 2.5 hours
One-Line Implementation
Stata: rdrobust y x, c(0) fuzzy(treatment)
R: rdrobust(y = df$y, x = df$x, c = 0, fuzzy = df$treatment)
Python: rdrobust(y=df['y'], x=df['x'], c=0, fuzzy=df['treatment'])
Motivating Example: Maimonides' Rule and Class Size
The 12th-century scholar Maimonides wrote that a class should not exceed 40 students. Israel adopted this as policy: when enrollment in a grade exceeds 40, the school must split the grade into two classes. At 41 students, predicted average class size therefore drops from 40 to 20.5.
Angrist and Lavy (1999) exploited this rule to estimate the effect of class size on student achievement. But here is the complication: the rule creates a discontinuity in expected class size, not a perfect deterministic assignment. Some schools do not comply perfectly — they might get exemptions, combine classes, or handle the split differently. Crossing the 40-student threshold changes the probability of being in a small class, but does not guarantee it.
(Angrist & Lavy, 1999)
This setting is a fuzzy RDD. The running variable (enrollment) crosses a cutoff (40), and the probability of treatment (small class) jumps, but not from 0 to 1. It jumps from, say, 0.1 to 0.7. To handle this non-compliance, you use the cutoff as an instrument for actual treatment.
A. Overview: Sharp vs. Fuzzy RDD
Sharp RDD (Quick Review)
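In a sharp RDD, treatment $D_i$ is a deterministic function of the running variable $X_i$:
$$D_i = \mathbf{1}\{X_i \geq c\}$$
Everyone at or above the cutoff $c$ is treated; everyone below is not. The treatment effect is identified by the jump in the outcome at $X_i = c$.
Fuzzy RDD
In a fuzzy RDD, the cutoff changes the probability of treatment, but not deterministically:
$$\lim_{x \downarrow c} P(D_i = 1 \mid X_i = x) \neq \lim_{x \uparrow c} P(D_i = 1 \mid X_i = x)$$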
There is a jump in the first stage (the probability of treatment) at the cutoff, but it is less than one. Some units above the cutoff do not receive treatment, and some below do.
The Key Insight: Fuzzy RDD = IV at the Cutoff
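The fuzzy RDD estimator is the ratio of two jumps:
$$\tau_{FRD} = \frac{\lim_{x \downarrow c} E[Y_i \mid X_i = x] - \lim_{x \uparrow c} E[Y_i \mid X_i = x]}{\lim_{x \downarrow c} E[D_i \mid X_i = x] - \lim_{x \uparrow c} E[D_i \mid X_i = x]}$$
The numerator is the jump in the outcome $Y_i$ (the reduced form) and the denominator is the jump in the probability of treatment $D_i$ (the first stage), where the limits are taken as the running variable approaches the cutoff $c$ from above ($x \downarrow c$) and from below ($x \uparrow c$).
This ratio is exactly the Wald/IV estimator (the same logic underlying instrumental variables estimation), with the instrument being the indicator for crossing the cutoff: $Z_i = \mathbf{1}\{X_i \geq c\}$.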
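The ratio of jumps can be computed by hand on simulated data. The following is a minimal sketch, not the estimator that rdrobust implements: it assumes a uniform kernel, a hand-picked bandwidth, and a made-up data-generating process (true effect 5, treatment probability 0.1 below and 0.7 above the cutoff). The helper names `boundary_fit` and `jump` are illustrative.

```python
import random

def boundary_fit(xs, vs):
    """Intercept of the OLS line v = a + b*x, where x is centered at the cutoff."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    return mv - b * mx

def jump(x, v, c, h):
    """Local linear jump in v at cutoff c (uniform kernel, bandwidth h)."""
    above = [(xi - c, vi) for xi, vi in zip(x, v) if 0 <= xi - c < h]
    below = [(xi - c, vi) for xi, vi in zip(x, v) if -h < xi - c < 0]
    fit = lambda side: boundary_fit([p[0] for p in side], [p[1] for p in side])
    return fit(above) - fit(below)

# Simulated data (assumed DGP, for illustration only).
rng = random.Random(42)
c, h, n = 0.0, 0.5, 20000
x = [rng.uniform(-1, 1) for _ in range(n)]
d = [1.0 if rng.random() < (0.7 if xi >= c else 0.1) else 0.0 for xi in x]
y = [5.0 * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

reduced_form = jump(x, y, c, h)   # jump in the outcome
first_stage = jump(x, d, c, h)    # jump in treatment probability
wald = reduced_form / first_stage
print(f"reduced form {reduced_form:.2f}, first stage {first_stage:.2f}, Wald {wald:.2f}")
```

With this design the first-stage jump should land near 0.6 and the Wald ratio near the true effect of 5, up to sampling noise.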
B. Identification
The Three Requirements
Fuzzy RDD requires:
- Relevance: The cutoff causes a jump in the probability of treatment. This jump is the first stage, and it must be meaningfully large.
- Continuity of potential outcomes: $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ are continuous in $x$ at the cutoff $x = c$. Informally: absent the treatment jump, the outcome would not jump at the cutoff.
- No manipulation: Units cannot precisely control the running variable to sort above or below the cutoff.
Monotonicity
Additionally, fuzzy RDD requires a monotonicity assumption: crossing the cutoff can only increase (or only decrease) the probability of treatment for any individual. There are no "defiers" — units for whom crossing the cutoff would reduce their treatment probability.
Formal Estimand
The fuzzy RDD identifies:
$$\tau_{FRD} = E\bigl[Y_i(1) - Y_i(0) \mid \text{complier},\ X_i = c\bigr]$$
This estimand is the treatment effect for the subpopulation of compliers (those whose treatment status would change if they moved from just below to just above the cutoff), evaluated right at the cutoff.
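One way to see why the ratio of jumps recovers the complier effect is sketched below. It glosses over regularity conditions and introduces notation for illustration only: $D_i^{+}$ and $D_i^{-}$ denote the treatment status unit $i$ would have just above and just below the cutoff $c$, so compliers are the units with $D_i^{+} > D_i^{-}$. The second line uses continuity at the cutoff, and the third uses monotonicity (so $D_i^{+} - D_i^{-}$ is either 0 or 1).

```latex
\begin{align*}
Y_i &= Y_i(0) + D_i \,\bigl(Y_i(1) - Y_i(0)\bigr) \\
\lim_{x \downarrow c} E[Y_i \mid X_i = x] - \lim_{x \uparrow c} E[Y_i \mid X_i = x]
  &= E\bigl[(D_i^{+} - D_i^{-})\,(Y_i(1) - Y_i(0)) \mid X_i = c\bigr] \\
  &= E\bigl[Y_i(1) - Y_i(0) \mid D_i^{+} > D_i^{-},\ X_i = c\bigr]
     \cdot P\bigl(D_i^{+} > D_i^{-} \mid X_i = c\bigr) \\
\lim_{x \downarrow c} E[D_i \mid X_i = x] - \lim_{x \uparrow c} E[D_i \mid X_i = x]
  &= P\bigl(D_i^{+} > D_i^{-} \mid X_i = c\bigr)
\end{align*}
```

Dividing the outcome jump by the treatment jump cancels the complier share and leaves the average effect for compliers at the cutoff.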
C. Visual Intuition
Imagine two graphs stacked vertically. The top graph shows the outcome (test scores) as a function of the running variable (enrollment). You see a cloud of data points with a fitted curve on each side of the cutoff. At the cutoff, the outcome might show a jump (the reduced form).
The bottom graph shows the treatment (actual class size) as a function of enrollment. Here, you clearly see a jump at 40: the probability of being in a small class increases sharply. This jump is the first stage.
The fuzzy RDD estimate is the ratio of the top jump to the bottom jump. If test scores jump by 3 points and the probability of small-class treatment jumps by 0.6, the fuzzy RDD estimate is $3 / 0.6 = 5$ points: the causal effect of small classes for compliers at the cutoff.
If the bottom graph shows no jump (flat first stage), you have no instrument and cannot identify the treatment effect. The strength of the first stage is critical.
Fuzzy RDD as Local IV
The fuzzy RDD estimate equals the jump in the outcome divided by the jump in treatment probability at the cutoff — exactly a Wald IV estimate applied locally. A weak first stage (small treatment jump) inflates the variance of the estimate.
Computed Results
- Fuzzy RDD / Wald Estimate: 5.00
- Approx. First-Stage F-stat: 18.0
- Approx. Std. Error of Estimate: 0.71
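The claim that a weak first stage inflates the variance of the estimate can be checked with a small Monte Carlo. This is a sketch under assumed settings (uniform kernel, bandwidth 0.5, 200 replications of 800 units, true effect 5); the function names and the two first-stage designs are illustrative choices, not taken from the tool above. The spread is summarized by the interquartile range because the ratio has heavy tails when the denominator is close to zero.

```python
import random

def boundary_fit(xs, vs):
    """Intercept of the OLS line v = a + b*x, where x is centered at the cutoff."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    return mv - b * mx

def jump(x, v, c, h):
    """Local linear jump in v at cutoff c (uniform kernel, bandwidth h)."""
    above = [(xi - c, vi) for xi, vi in zip(x, v) if 0 <= xi - c < h]
    below = [(xi - c, vi) for xi, vi in zip(x, v) if -h < xi - c < 0]
    fit = lambda side: boundary_fit([p[0] for p in side], [p[1] for p in side])
    return fit(above) - fit(below)

def wald_draws(p_above, p_below, reps=200, n=800, h=0.5, seed=1):
    """Sorted Wald estimates across simulated samples for a given first stage."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        d = [1.0 if rng.random() < (p_above if xi >= 0 else p_below) else 0.0 for xi in x]
        y = [5.0 * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
        out.append(jump(x, y, 0.0, h) / jump(x, d, 0.0, h))
    return sorted(out)

def iqr(sorted_vals):
    """Interquartile range of an already sorted list."""
    k = len(sorted_vals)
    return sorted_vals[(3 * k) // 4] - sorted_vals[k // 4]

iqr_strong = iqr(wald_draws(0.75, 0.15))  # first-stage jump of 0.60
iqr_weak = iqr(wald_draws(0.25, 0.15))    # first-stage jump of 0.10
print(f"IQR of Wald estimates: strong {iqr_strong:.2f}, weak {iqr_weak:.2f}")
```

The outcome noise is the same in both designs; only the size of the denominator differs, which is what drives the wider spread in the weak case.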
Fuzzy RDD: When Compliance Isn't Perfect
DGP: 500 units, R ~ Uniform(-1,1). Above cutoff: P(D=1) = 0.80, below: P(D=1) = 0.10. Y = 5.0·D + 2·R + ε. Bandwidth h = 0.50, local n = 269.
Estimation Results
| Estimator | β̂ | SE | 95% CI | Bias |
|---|---|---|---|---|
| Sharp RD (ITT, target: 3.50) | 3.962 | 0.561 | [2.86, 5.06] | +0.462 |
| OLS on D | 6.563 | 0.160 | [6.25, 6.88] | +1.563 |
| Fuzzy RD (IV) | 5.635 | 0.797 | [4.07, 7.20] | +0.635 |
| True β | 5.000 | — | — | — |
Why the difference?
The sharp RD estimate of 3.962 is the intent-to-treat effect (reduced form). Because crossing the cutoff raises take-up only from 10% to 80%, the causal effect on those who comply is larger: 5.635 = reduced form / first stage = 3.962 / 0.703, where 0.703 is the estimated first-stage jump (the population jump is 0.80 - 0.10 = 0.70). The scaled ratio is the LATE for compliers at the cutoff.
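The comparison between the three estimators can be reproduced in a few lines. The sketch below follows the stated DGP (500 units, R ~ Uniform(-1,1), take-up 0.80 above and 0.10 below, Y = 5.0·D + 2·R + ε, h = 0.50) but makes assumptions the text does not state: ε is standard normal, the kernel is uniform, and the naive OLS is run on the full sample. The seed is arbitrary, so the numbers will not match the table.

```python
import random

def boundary_fit(xs, vs):
    """Intercept of the OLS line v = a + b*x, where x is centered at the cutoff."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    return mv - b * mx

def jump(x, v, c, h):
    """Local linear jump in v at cutoff c (uniform kernel, bandwidth h)."""
    above = [(xi - c, vi) for xi, vi in zip(x, v) if 0 <= xi - c < h]
    below = [(xi - c, vi) for xi, vi in zip(x, v) if -h < xi - c < 0]
    fit = lambda side: boundary_fit([p[0] for p in side], [p[1] for p in side])
    return fit(above) - fit(below)

rng = random.Random(7)
n, h, beta = 500, 0.5, 5.0
r = [rng.uniform(-1, 1) for _ in range(n)]
d = [1.0 if rng.random() < (0.80 if ri >= 0 else 0.10) else 0.0 for ri in r]
y = [beta * di + 2.0 * ri + rng.gauss(0, 1) for di, ri in zip(d, r)]  # assumed: eps ~ N(0, 1)

itt = jump(r, y, 0.0, h)          # reduced form: what a sharp RD would report
first_stage = jump(r, d, 0.0, h)  # jump in take-up at the cutoff
fuzzy = itt / first_stage         # Wald / fuzzy RD estimate

# Naive OLS of Y on D alone equals the difference in group means.
treated = [yi for yi, di in zip(y, d) if di == 1.0]
control = [yi for yi, di in zip(y, d) if di == 0.0]
ols = sum(treated) / len(treated) - sum(control) / len(control)

print(f"ITT {itt:.2f} | first stage {first_stage:.2f} | fuzzy {fuzzy:.2f} | OLS {ols:.2f}")
```

The naive OLS is pushed above the true effect because treated units sit disproportionately above the cutoff, where R (and therefore Y) is higher regardless of treatment.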
D. Mathematical Derivation
Don't worry about the notation yet — here's what this means in words: The fuzzy RDD estimate divides the jump in the outcome by the jump in treatment probability at the cutoff, just like an IV Wald estimator applied locally.
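Define the instrument $Z_i = \mathbf{1}\{X_i \geq c\}$. The local Wald estimator is:
$$\hat{\tau}_{FRD} = \frac{\hat{\mu}_{Y+} - \hat{\mu}_{Y-}}{\hat{\mu}_{D+} - \hat{\mu}_{D-}}$$
where hats denote estimates of the following limits:
$$\mu_{Y+} = \lim_{x \downarrow c} E[Y_i \mid X_i = x], \qquad \mu_{Y-} = \lim_{x \uparrow c} E[Y_i \mid X_i = x]$$
$$\mu_{D+} = \lim_{x \downarrow c} E[D_i \mid X_i = x], \qquad \mu_{D-} = \lim_{x \uparrow c} E[D_i \mid X_i = x]$$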
In practice, both limits are estimated using local polynomial regressions within a bandwidth of the cutoff. The most common approach is local linear regression, weighted by a kernel function.
Implementation via 2SLS within the bandwidth:
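For observations within bandwidth $h$ of the cutoff $c$, estimate:
First stage:
$$D_i = \alpha_0 + \pi Z_i + \alpha_1 (X_i - c) + \alpha_2 Z_i (X_i - c) + \nu_i$$
Second stage:
$$Y_i = \beta_0 + \tau \hat{D}_i + \beta_1 (X_i - c) + \beta_2 Z_i (X_i - c) + \varepsilon_i$$
The coefficient $\tau$ is the fuzzy RDD estimate. The slope interaction uses the exogenous cutoff indicator $Z_i$ rather than $D_i$, because $D_i (X_i - c)$ would be a second endogenous regressor requiring its own first stage. The coefficient $\pi$ from the first stage is the jump in treatment probability.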
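The two-stage procedure can be written out by hand to confirm that it reproduces the ratio of jumps. The sketch below is illustrative only: the data are simulated under an assumed DGP, the kernel is uniform, and `ols` is a small hypothetical helper that solves the normal equations. It returns point estimates only; standard errors from a manual second-stage OLS are not valid and are not computed here.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * w for v, w in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Simulated data within the bandwidth (assumed DGP, true effect 5).
rng = random.Random(0)
c, h = 0.0, 0.5
data = []
for _ in range(4000):
    x = rng.uniform(-1, 1)
    z = 1.0 if x >= c else 0.0
    d = 1.0 if rng.random() < (0.8 if z else 0.1) else 0.0
    y = 5.0 * d + 2.0 * x + rng.gauss(0, 1)
    if abs(x - c) < h:
        data.append((x - c, z, d, y))

exog = [[1.0, z, xc, z * xc] for xc, z, d, y in data]  # 1, Z, (X - c), Z*(X - c)
D = [d for _, _, d, _ in data]
Y = [y for _, _, _, y in data]

first = ols(exog, D)     # first[1] is the jump in treatment probability
reduced = ols(exog, Y)   # reduced[1] is the jump in the outcome
d_hat = [sum(b * v for b, v in zip(first, row)) for row in exog]
second = ols([[1.0, dh, row[2], row[3]] for dh, row in zip(d_hat, exog)], Y)

tau_2sls = second[1]
tau_wald = reduced[1] / first[1]
print(f"2SLS {tau_2sls:.4f} vs ratio of jumps {tau_wald:.4f}")
```

Because the model is just-identified and both stages share the same controls, the two numbers agree up to floating-point error.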
Bandwidth selection: Imbens and Kalyanaraman (2012) and Calonico et al. (2014) provide data-driven bandwidth selectors that balance bias and variance.
(Calonico et al., 2014)
E. Implementation
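library(rdrobust)   # local polynomial RD estimation and RD plots
library(rddensity)  # density-based manipulation test

# Fuzzy RDD: crossing the cutoff instruments for small-class status.
# Note: rdrobust treats x >= c as above the cutoff. With integer enrollment and
# a split that starts at 41, c = 40.5 may match the rule more closely.
frd <- rdrobust(y = df$test_score, x = df$enrollment, c = 40,
                fuzzy = df$small_class)
summary(frd)

# First stage: jump in treatment probability at the cutoff
first_stage <- rdrobust(y = df$small_class, x = df$enrollment, c = 40)
summary(first_stage)

# RD plot of the outcome (reduced form), 20 bins on each side
rdplot(y = df$test_score, x = df$enrollment, c = 40, nbins = c(20, 20))

# Manipulation test: density of the running variable at the cutoff
manip <- rddensity(X = df$enrollment, c = 40)
summary(manip)
F. Diagnostics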
First-Stage Strength
The first stage must show a clear, statistically significant jump in treatment probability at the cutoff. Report:
- The size of the jump (coefficient on the cutoff indicator in the first-stage regression)
- The effective F-statistic
- A plot of treatment probability against the running variable
If the first stage is weak (small jump), the fuzzy RDD estimate will be imprecise and potentially biased, just like weak-instrument IV. Sensitivity analysis across different bandwidths can help assess the robustness of the result.
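A rough first-stage check can be computed from the two local linear fits. The sketch below reports the jump in treatment probability and the squared t-ratio of that jump, using homoskedastic variances on each side and a uniform kernel. It is an approximation for intuition, not the effective F-statistic that dedicated software reports; the simulated take-up rates and helper names are assumptions.

```python
import random

def intercept_and_var(xs, vs):
    """Boundary intercept of v = a + b*x and its homoskedastic variance."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    a = mv - b * mx
    s2 = sum((v - a - b * x) ** 2 for x, v in zip(xs, vs)) / (n - 2)
    return a, s2 * (1.0 / n + mx ** 2 / sxx)

def first_stage_f(x, d, c, h):
    """Jump in treatment probability at c and the squared t-ratio of that jump."""
    above = [(xi - c, di) for xi, di in zip(x, d) if 0 <= xi - c < h]
    below = [(xi - c, di) for xi, di in zip(x, d) if -h < xi - c < 0]
    a1, v1 = intercept_and_var(*zip(*above))
    a0, v0 = intercept_and_var(*zip(*below))
    jump = a1 - a0
    return jump, jump ** 2 / (v1 + v0)

def simulate(p_below, p_above, n=2000, seed=3):
    """Running variable and treatment with a given take-up on each side of 0."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    d = [1.0 if rng.random() < (p_above if xi >= 0 else p_below) else 0.0 for xi in x]
    return x, d

jump_strong, f_strong = first_stage_f(*simulate(0.15, 0.75), c=0.0, h=0.5)
jump_weak, f_weak = first_stage_f(*simulate(0.20, 0.28), c=0.0, h=0.5)
print(f"strong: jump {jump_strong:.2f}, F {f_strong:.1f}")
print(f"weak:   jump {jump_weak:.2f}, F {f_weak:.1f}")
```

A jump of 0.60 yields a large statistic at this sample size, while a jump of 0.08 yields a small one, which is the situation where the ratio estimate becomes unreliable.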
Manipulation Test
Use the McCrary (2008) density test or the Cattaneo et al. (2020) test to check whether units bunch on one side of the cutoff. If the density is discontinuous at $c$, units may be manipulating the running variable.
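As a quick sanity check before running a formal density test, the number of observations in a narrow window on each side of the cutoff can be compared. The sketch below is a crude stand-in, not the McCrary or Cattaneo et al. procedure: it assumes the density is roughly flat within the window, so that each observation is equally likely to fall on either side, and applies an exact binomial test. The window width and the simulated sorting behavior are hypothetical.

```python
import math
import random

def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for H0: K ~ Binomial(n, 0.5)."""
    lo = min(k, n - k)
    tail = sum(math.comb(n, j) for j in range(lo + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def count_check(x, c, w):
    """Counts in (c - w, c) and [c, c + w), with the binomial p-value."""
    above = sum(1 for xi in x if 0 <= xi - c < w)
    below = sum(1 for xi in x if -w < xi - c < 0)
    return below, above, binomial_two_sided_p(above, above + below)

rng = random.Random(11)
smooth = [rng.uniform(-1, 1) for _ in range(20000)]
# Hypothetical sorting: half of the units within 0.02 below the cutoff move just above it.
manipulated = [-xi if -0.02 < xi < 0 and rng.random() < 0.5 else xi for xi in smooth]

below_s, above_s, p_smooth = count_check(smooth, 0.0, 0.05)
below_m, above_m, p_manip = count_check(manipulated, 0.0, 0.05)
print(f"smooth:      {below_s} below, {above_s} above, p = {p_smooth:.3f}")
print(f"manipulated: {below_m} below, {above_m} above, p = {p_manip:.2e}")
```

A formal density test remains preferable because it allows the density to slope through the cutoff rather than assuming it is flat.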
(McCrary, 2008)
Covariate Balance at the Cutoff
Pre-determined covariates should not jump at the cutoff. Run the RDD specification replacing the outcome with each covariate. Significant jumps suggest either manipulation or a violation of the continuity assumption.
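The placebo regression can be sketched by reusing the same local linear jump with a covariate in place of the outcome. The example below is illustrative: the covariate `baseline_score` is simulated to vary smoothly with the running variable, the kernel is uniform, the bandwidth is hand-picked, and the t-ratio uses homoskedastic variances.

```python
import random

def intercept_and_var(xs, vs):
    """Boundary intercept of v = a + b*x and its homoskedastic variance."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    a = mv - b * mx
    s2 = sum((v - a - b * x) ** 2 for x, v in zip(xs, vs)) / (n - 2)
    return a, s2 * (1.0 / n + mx ** 2 / sxx)

def placebo_jump(x, w, c, h):
    """Jump in covariate w at cutoff c and its t-ratio (uniform kernel)."""
    above = [(xi - c, wi) for xi, wi in zip(x, w) if 0 <= xi - c < h]
    below = [(xi - c, wi) for xi, wi in zip(x, w) if -h < xi - c < 0]
    a1, v1 = intercept_and_var(*zip(*above))
    a0, v0 = intercept_and_var(*zip(*below))
    return a1 - a0, (a1 - a0) / (v1 + v0) ** 0.5

rng = random.Random(5)
x = [rng.uniform(-1, 1) for _ in range(4000)]
# Pre-determined covariate: smooth in x, with no jump by construction.
baseline_score = [50.0 + 4.0 * xi + rng.gauss(0, 1) for xi in x]

jump_cov, t_cov = placebo_jump(x, baseline_score, c=0.0, h=0.5)
print(f"placebo jump {jump_cov:.3f}, t-ratio {t_cov:.2f}")
```

In applied work the same call would be repeated for each pre-determined covariate, keeping in mind that running many tests raises the chance of a spurious rejection.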
Bandwidth Sensitivity
Show that results are robust to different bandwidth choices: the optimal bandwidth, half the bandwidth, and twice the bandwidth. The rdrobust package produces bias-corrected estimates that are less sensitive to bandwidth choice.
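A bandwidth sensitivity table amounts to re-running the estimator in a loop. The sketch below uses a hand-rolled ratio of local linear jumps with a uniform kernel on simulated data; the baseline bandwidth `h0` is a stand-in for a data-driven choice and is an assumption. Because the simulated outcome is linear in the running variable, widening the window adds no bias here; with curvature in real data it could.

```python
import random

def boundary_fit(xs, vs):
    """Intercept of the OLS line v = a + b*x, where x is centered at the cutoff."""
    n = len(xs)
    mx, mv = sum(xs) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (v - mv) for x, v in zip(xs, vs)) / sxx
    return mv - b * mx

def jump(x, v, c, h):
    """Local linear jump in v at cutoff c (uniform kernel, bandwidth h)."""
    above = [(xi - c, vi) for xi, vi in zip(x, v) if 0 <= xi - c < h]
    below = [(xi - c, vi) for xi, vi in zip(x, v) if -h < xi - c < 0]
    fit = lambda side: boundary_fit([p[0] for p in side], [p[1] for p in side])
    return fit(above) - fit(below)

# Simulated data (assumed DGP, true effect 5).
rng = random.Random(21)
c, n = 0.0, 20000
x = [rng.uniform(-1, 1) for _ in range(n)]
d = [1.0 if rng.random() < (0.75 if xi >= c else 0.15) else 0.0 for xi in x]
y = [5.0 * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

h0 = 0.25  # stand-in for an MSE-optimal bandwidth (assumption)
estimates = {}
for label, h in (("half", h0 / 2), ("baseline", h0), ("double", 2 * h0)):
    estimates[label] = jump(x, y, c, h) / jump(x, d, c, h)
    print(f"{label:>8} (h = {h:.3f}): fuzzy RD estimate {estimates[label]:.2f}")
```

Estimates that stay close to one another across the three bandwidths support the result; large swings suggest it depends on the choice of window.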
Interpreting Results
- The fuzzy RDD estimate is the LATE for compliers at the cutoff — the causal effect of treatment for units whose treatment status changes due to the cutoff rule.
- This estimate is not the ATE for the full population. Compliers at the cutoff may be a special group.
- The reduced form (the jump in outcome at the cutoff, without dividing by the first stage) is the ITT analog for RDD. It is generally worth reporting.
- If the first-stage jump is large (close to 1), the fuzzy RDD approaches the sharp RDD, and the LATE approaches a local ATE.
- Present the results graphically. The RD plot is often the most convincing evidence, because readers can see the discontinuity for themselves.
G. What Can Go Wrong
| Problem | What It Does | How to Fix It |
|---|---|---|
| Weak first stage | Imprecise and biased estimate | Report effective F-stat; consider whether the design is viable |
| Manipulation of running variable | Units sort above/below cutoff, invalidating the design | McCrary/density test; covariate balance checks |
| Using sharp RDD when fuzzy is needed | Understates the treatment effect (gives ITT, not LATE) | Check for non-compliance; use fuzzy specification |
| Global polynomial | High-order polynomials produce misleading estimates at the cutoff | Use local linear regression with data-driven bandwidth |
| Extrapolating away from cutoff | Estimates are only valid at the cutoff | Report the estimate as local; discuss external validity |
| Donut hole needed | Observations exactly at the cutoff are unusual | Try excluding observations within a small window of the cutoff |
Weak First Stage at the Cutoff
Sound design: The cutoff induces a large jump in treatment probability (from 0.15 to 0.75).
Fuzzy RDD estimate = 4.8 (SE = 1.2). First-stage jump = 0.60, effective F = 42. Precise and reliable estimate of the LATE for compliers at the cutoff.
Manipulation of the Running Variable
Sound design: Students cannot precisely control their entrance exam scores near the scholarship cutoff.
McCrary density test p = 0.43. Smooth density across the cutoff. Covariate balance tests show no jumps. The quasi-random assignment near the cutoff is credible.
Using a Global High-Order Polynomial Instead of Local Regression
Sound design: Local linear regression within an MSE-optimal bandwidth of 5 points around the cutoff.
Fuzzy RDD estimate = 5.2 (robust bias-corrected SE = 1.8). Estimate is driven by observations close to the cutoff where the design is most credible.
In Angrist and Lavy's class size study, suppose the reduced form (jump in test scores at enrollment = 40) is 2.5 points, and the first stage (jump in the probability of having a small class) is 0.50. What is the fuzzy RDD estimate, and what does it represent?
H. Practice
A scholarship is awarded to students scoring above 80 on an entrance exam. However, some students above 80 decline the scholarship, and some below 80 receive it through appeals. A researcher runs a standard (sharp) RDD, comparing average outcomes just above and below 80. What does this estimate?
In a fuzzy RDD, the first stage shows that crossing the cutoff increases the probability of treatment from 0.20 to 0.28 — a jump of only 0.08. The effective F-statistic is 4.2. What should you be concerned about?
A researcher studies the effect of remedial math classes on college GPA. Students scoring below 60 on a placement test are assigned to remediation, but some below 60 skip it and some above 60 voluntarily enroll. She estimates the fuzzy RDD and finds an effect of 0.4 GPA points for compliers at the cutoff. Can she generalize this to all students?
A researcher fits a fourth-degree global polynomial on both sides of the cutoff to estimate a fuzzy RDD. A reviewer insists she should use local linear regression with data-driven bandwidth instead. Why does the reviewer prefer this approach?
Fuzzy RDD: Medicaid Eligibility and Health Outcomes
A health economist studies whether Medicaid enrollment improves health outcomes. Medicaid eligibility is determined by an income threshold: households earning below 138% of the federal poverty level (FPL) are eligible. However, take-up is imperfect — some eligible households do not enroll, and some ineligible households obtain coverage through other programs. The researcher uses income relative to the 138% FPL cutoff as the running variable.
Read the analysis below carefully and identify the errors.
Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.
Paper Summary
A study examines whether receiving need-based financial aid improves college graduation rates. Students with family income below \$50,000 are eligible for a grant, but take-up is imperfect: only about 65% of eligible students actually receive the aid (some fail to complete paperwork), and about 10% of ineligible students receive aid through appeals. The authors use a fuzzy RDD with family income as the running variable and the \$50,000 threshold as the cutoff. They report a fuzzy RDD estimate that receiving aid increases 6-year graduation rates by 18 percentage points.
Key Table
| Statistic | Estimate | Robust SE | p-value |
|---|---|---|---|
| Aid received (fuzzy RDD) | 0.180 | 0.065 | 0.006 |
| First-stage jump | 0.550 | | |
| Effective F-statistic | 31.2 | | |
| Bandwidth (MSE-optimal) | \$8,200 | | |
| McCrary density test p | 0.03 | | |
| N (within bandwidth) | 4,200 | | |
Authors' Identification Claim
The \$50,000 income threshold creates a quasi-random assignment of financial aid eligibility near the cutoff. The fuzzy RDD accounts for imperfect compliance by instrumenting actual aid receipt with eligibility status.
I. Swap-In: When to Use Something Else
- Sharp RDD: When compliance at the cutoff is perfect — every unit above the threshold receives treatment, every unit below does not.
- IV / 2SLS: When the source of exogenous variation is not a running-variable threshold but a discrete instrument. Fuzzy RDD is a special case of IV where the instrument is the indicator for crossing the cutoff.
- Difference-in-differences: When there is temporal variation in treatment adoption rather than a threshold-based assignment rule.
- Matching: When there is no threshold-based assignment but rich pre-treatment covariates support a selection-on-observables strategy.
J. Reviewer Checklist
Critical Reading Checklist
Paper Library
Foundational (6)
Hahn, J., Todd, P., & Van der Klaauw, W. (2001). Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design.
This paper provided the formal econometric framework for both sharp and fuzzy regression discontinuity designs. For the fuzzy case, it showed that the treatment effect can be identified as the ratio of the discontinuity in the outcome to the discontinuity in the treatment probability, analogous to a Wald estimator.
Imbens, G. W., & Lemieux, T. (2008). Regression Discontinuity Designs: A Guide to Practice.
Imbens and Lemieux provided a comprehensive practical guide to implementing RDD, covering bandwidth selection, functional form, and graphical analysis. Their treatment of fuzzy RDD as a local IV estimator clarified the interpretation and implementation for applied researchers.
Lee, D. S., & Lemieux, T. (2010). Regression Discontinuity Designs in Economics.
Lee and Lemieux wrote the definitive survey of RDD methods in economics, covering both sharp and fuzzy designs, validity tests, and extensions. This paper is the standard reference for understanding the econometric theory and practical implementation of RDD.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.
Calonico, Cattaneo, and Titiunik developed bias-corrected confidence intervals for RDD that address the problem of conventional confidence intervals being invalid when using optimal bandwidth selectors. Their rdrobust software package has become the standard tool for implementing RDD in practice.
Dong, Y., & Lewbel, A. (2015). Identifying the Effect of Changing the Policy Threshold in Regression Discontinuity Models.
Dong and Lewbel extended fuzzy RDD by showing how to identify the effect of changing the policy threshold, not just the effect of treatment at the existing cutoff. This approach allows researchers to evaluate counterfactual policies that shift the eligibility boundary, broadening the policy relevance of fuzzy RDD estimates.
Battistin, E., & Rettore, E. (2008). Ineligibles and Eligible Non-Participants as a Double Comparison Group in Regression-Discontinuity Designs.
Battistin and Rettore addressed the problem of imperfect compliance in fuzzy RDD by proposing a double comparison group strategy that uses both ineligible units and eligible non-participants to bound treatment effects. Their framework clarified how partial compliance affects identification and offered practical tools for strengthening fuzzy RDD inference.
Application (4)
Van der Klaauw, W. (2002). Estimating the Effect of Financial Aid Offers on College Enrollment: A Regression-Discontinuity Approach.
Van der Klaauw applied a fuzzy RDD to study how financial aid offers affect college enrollment decisions, exploiting discontinuities in an aid assignment rule where eligibility changes at GPA thresholds but compliance is imperfect. This paper is one of the earliest and most influential applications of fuzzy RDD.
Angrist, J. D., & Lavy, V. (1999). Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement.
Angrist and Lavy exploited a rule that caps class sizes at 40 students, creating discontinuities in class size as enrollment crosses multiples of 40. The imperfect compliance with the rule makes this a fuzzy RDD. This paper is one of the most widely taught examples of the fuzzy RDD approach.
Flammer, C. (2015). Does Corporate Social Responsibility Lead to Superior Financial Performance? A Regression Discontinuity Approach.
Flammer used a sharp RDD design based on close shareholder votes on CSR proposals, where proposals that barely pass versus barely fail are essentially randomly determined. Published in Management Science, it demonstrated the power of RDD for studying corporate governance questions.
Cunat, V., Gine, M., & Guadalupe, M. (2012). The Vote Is Cast: The Effect of Corporate Governance on Shareholder Value.
Cunat, Gine, and Guadalupe used a fuzzy RDD around the majority threshold in shareholder governance proposals to estimate the causal effect of governance provisions on firm value. This paper is a leading example of fuzzy RDD applied to corporate governance and finance.
Survey (1)
Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2020). A Practical Introduction to Regression Discontinuity Designs: Foundations.
A practical and accessible guide to implementing regression discontinuity designs, covering both sharp and fuzzy cases with worked examples and code. Part of the Cambridge Elements series, it provides step-by-step guidance on bandwidth selection, estimation, and inference using the rdrobust toolkit.