Lab: Bunching Estimation from Scratch
Implement bunching estimation step by step. Simulate an income distribution with bunching at a tax kink, fit a polynomial counterfactual, estimate excess mass, compute the implied behavioral elasticity, and assess robustness to polynomial order.
Overview
Bunching estimation exploits the fact that individuals cluster at kink points or notch points in tax schedules, benefit formulas, or regulatory thresholds. The excess mass at these points — relative to a smooth counterfactual distribution — reveals behavioral responses. This method was formalized by Saez (2010) and has become a standard tool in public finance and labor economics.
What you will learn:
- How bunching at a kink point reveals behavioral responses to incentives
- How to construct a polynomial counterfactual distribution
- How to estimate the excess mass (bunching) at the kink
- How to convert excess mass into a behavioral elasticity
- How to assess sensitivity to polynomial order and bin width
- How to run a placebo test at a non-kink point
Prerequisites: OLS regression, basic understanding of tax policy and labor supply elasticities.
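Both the simulation (Step 1) and the elasticity calculation (Step 4) rest on the Saez (2010) kink formula. With an iso-elastic earnings response of elasticity $e$, the marginal buncher, whose counterfactual income is $z^* + \Delta z^*$, satisfies

$$
z^* + \Delta z^* = z^*\left(\frac{1-\tau_0}{1-\tau_1}\right)^{e}
\quad\Longrightarrow\quad
\frac{\Delta z^*}{z^*} \approx e\,\frac{\tau_1 - \tau_0}{1-\tau_1},
$$

where $\tau_0$ and $\tau_1$ are the marginal tax rates below and above the kink; the approximation holds for small rate changes. This is the `dz_star` expression the code uses throughout.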
Step 1: Simulate Income Distribution with Bunching
We simulate a taxable income distribution where individuals bunch at a kink point in the tax schedule. Below the kink, the marginal tax rate is low; above the kink, the rate jumps.
set.seed(42)
n <- 50000
# Tax schedule parameters
kink <- 20000 # Kink point (dollars)
tau_below <- 0.15 # Marginal tax rate below kink
tau_above <- 0.40 # Marginal tax rate above kink
delta_tau <- tau_above - tau_below # Tax rate jump = 0.25
# True behavioral elasticity
true_elasticity <- 0.3
# Simulate counterfactual income (what people would earn without the kink)
# Log-normal distribution centered around the kink
income_star <- rlnorm(n, meanlog = log(22000), sdlog = 0.35)
# Bunching: individuals with income_star in (kink, kink + dz*) bunch to kink
# dz* = kink * elasticity * (delta_tau / (1 - tau_above))
# This is the range of counterfactual incomes that bunch at the kink
dz_star <- kink * true_elasticity * (delta_tau / (1 - tau_above))
cat("Theoretical bunching range (dz*):", round(dz_star, 0), "dollars\n")
# Apply bunching: people with counterfactual income in (kink, kink + dz*] move to the kink
bunchers <- income_star > kink & income_star <= kink + dz_star
income_observed <- income_star
# Bunchers cluster near the kink rather than exactly at it, so add small noise
income_observed[bunchers] <- kink + rnorm(sum(bunchers), mean = 0, sd = 200)
df <- data.frame(income = income_observed, income_star = income_star)
cat("Kink point:", kink, "\n")
cat("Tax rate below kink:", tau_below, "\n")
cat("Tax rate above kink:", tau_above, "\n")
cat("Number of bunchers:", sum(bunchers), "\n")
cat("Fraction bunching:", round(mean(bunchers), 4), "\n")

Expected output:
| Parameter | Value |
|---|---|
| Kink point | $20,000 |
| Tax rate below kink | 15% |
| Tax rate above kink | 40% |
| Bunching range (dz*) | ~$2,500 |
| Number of bunchers | ~6,500–6,800 |
| Fraction bunching | ~0.13 |
The simulated data should show a clear spike in the income distribution at the kink point, with a "missing mass" in the region just above the kink.
Step 2: Create Histogram and Identify Bunching
Bin the income data and visualize the distribution to see the bunching at the kink.
# Create bins
bin_width <- 500 # $500 bins
bins <- seq(10000, 35000, by = bin_width)
bin_mids <- bins[-length(bins)] + bin_width / 2
# Count observations in each bin. Drop the ~10% of incomes outside the
# binning range first: hist() errors when breaks do not span the data.
inc_in_range <- df$income[df$income >= min(bins) & df$income <= max(bins)]
counts <- hist(inc_in_range, breaks = bins, plot = FALSE)$counts
bin_df <- data.frame(
bin_mid = bin_mids,
count = counts,
rel_income = bin_mids - kink # Distance from kink
)
# Plot. barplot() returns the bar centers, which we need so the vertical
# line lands at the kink in the plot's own x-coordinates.
bar_pos <- barplot(bin_df$count, names.arg = bin_df$bin_mid / 1000,
                   xlab = "Income (thousands $)", ylab = "Count",
                   main = "Income Distribution with Bunching at Kink",
                   col = ifelse(abs(bin_df$bin_mid - kink) <= bin_width, "red", "steelblue"))
abline(v = bar_pos[which.min(abs(bin_df$bin_mid - kink))], lty = 2)
cat("Bin at kink point (count):", bin_df$count[which.min(abs(bin_df$bin_mid - kink))], "\n")
cat("Average bin count (away from kink):",
    round(mean(bin_df$count[abs(bin_df$rel_income) > 3000]), 1), "\n")

Expected output:
The histogram should show a clear spike at the kink point ($20,000), with the bunching bin containing 3–5 times more observations than a typical bin away from the kink. There should also be a visible "hole" — missing mass — in the bins just above the kink, where individuals would have been located had they not bunched.
Step 3: Fit the Polynomial Counterfactual
The key step in bunching estimation is fitting a smooth polynomial to the bin counts, excluding the bins near the kink. This polynomial represents the counterfactual distribution — what the income distribution would look like without the kink.
# Define excluded region around the kink (wide enough to cover both the
# bunching spike and the missing mass above the kink)
excl_lower <- kink - 1000 # Exclude bins from $1,000 below the kink...
excl_upper <- kink + 3000 # ...to $3,000 above it (covers the missing mass)
# Mark excluded bins
bin_df$excluded <- bin_df$bin_mid >= excl_lower & bin_df$bin_mid <= excl_upper
# Fit a polynomial of order 7 on the non-excluded bins. The orthogonal basis
# from poly() avoids the severe ill-conditioning of raw powers of dollar
# incomes (with raw = TRUE, lm() may drop terms as numerically collinear).
poly_order <- 7
fit_data <- bin_df[!bin_df$excluded, ]
poly_model <- lm(count ~ poly(bin_mid, poly_order), data = fit_data)
# Predict counterfactual for ALL bins (including excluded)
bin_df$counterfactual <- predict(poly_model, newdata = bin_df)
# Excess mass: observed minus counterfactual, summed over a NARROW bunching
# window around the kink (the red bins in the plot). Summing over the whole
# excluded region would net the spike against the missing mass above the
# kink and mechanically give an excess of roughly zero.
bunching_bins <- abs(bin_df$bin_mid - kink) <= bin_width
excess_mass <- sum(bin_df$count[bunching_bins]) - sum(bin_df$counterfactual[bunching_bins])
# Normalized excess mass: b = excess / counterfactual count of the bin at the kink
cf_at_kink <- bin_df$counterfactual[which.min(abs(bin_df$bin_mid - kink))]
b_normalized <- excess_mass / cf_at_kink
cat("=== Bunching Estimates ===\n")
cat("Excess mass (raw count):", round(excess_mass, 0), "\n")
cat("Counterfactual at kink:", round(cf_at_kink, 1), "\n")
cat("Normalized excess mass (b):", round(b_normalized, 3), "\n")

Expected output:
| Statistic | Value |
|---|---|
| Excess mass (raw) | ~4,800–5,500 |
| Counterfactual at kink | ~1,300–1,450 |
| Normalized excess mass (b) | ~3.4–4.0 |
The normalized excess mass tells you how many "counterfactual bin heights" of individuals are bunching at the kink. A value of 4 means the bunching window holds four extra bins' worth of individuals relative to what the counterfactual predicts.
What does the 'normalized excess mass' (b) measure in bunching estimation?
Step 4: Compute the Elasticity
The key output of bunching estimation is the behavioral elasticity. At a kink point, the elasticity is derived from the normalized excess mass and the change in marginal tax rates.
# At a kink: dz* = z* * e * (delta_tau / (1 - tau_above))
# Normalized excess mass b ~ dz* / bin_width (approximately)
# So: e ~ b * bin_width / (z* * delta_tau / (1 - tau_above))
# More precisely, estimate dz* from excess mass
# dz* is the width of the region from which bunchers come
dz_star_hat <- excess_mass / cf_at_kink * bin_width
# Elasticity: e = dz* / (z* * (delta_tau / (1 - tau_above)))
elasticity_hat <- dz_star_hat / (kink * (delta_tau / (1 - tau_above)))
cat("=== Elasticity Estimate ===\n")
cat("Estimated dz*:", round(dz_star_hat, 0), "dollars\n")
cat("True dz*:", round(dz_star, 0), "dollars\n")
cat("\nEstimated elasticity:", round(elasticity_hat, 3), "\n")
cat("True elasticity:", true_elasticity, "\n")
cat("\nTax schedule parameters used:\n")
cat(" z* (kink point):", kink, "\n")
cat(" delta_tau (rate jump):", delta_tau, "\n")
cat(" 1 - tau_above:", 1 - tau_above, "\n")

Expected output:
| Parameter | Estimate | True Value |
|---|---|---|
| dz* (bunching range) | ~1,700–2,100 | ~2,500 |
| Elasticity | ~0.20–0.26 | 0.30 |
The estimate typically lands somewhat below the true elasticity of 0.30. The culprit is the diffuse bunching built into the simulation: because bunchers scatter around the kink, the bunching window unavoidably overlaps part of the missing mass just above it, which attenuates the measured excess. The interpretation is unchanged: an elasticity of 0.30 means a 1% increase in the net-of-tax rate raises reported taxable income by 0.30%.
Step 5: Robustness to Polynomial Order
The polynomial counterfactual is the most sensitive part of the estimation. Different polynomial orders can produce different estimates.
# Try polynomial orders 3 through 10
results <- data.frame(poly_order = integer(), excess = numeric(),
b_norm = numeric(), elasticity = numeric())
for (p in 3:10) {
# Orthogonal basis avoids ill-conditioned raw powers of dollar incomes
poly_p <- lm(count ~ poly(bin_mid, p), data = fit_data)
cf_p <- predict(poly_p, newdata = bin_df)
excess_p <- sum(bin_df$count[bunching_bins]) - sum(cf_p[bunching_bins])
cf_kink_p <- cf_p[which.min(abs(bin_df$bin_mid - kink))]
b_p <- excess_p / cf_kink_p
dz_p <- excess_p / cf_kink_p * bin_width
e_p <- dz_p / (kink * (delta_tau / (1 - tau_above)))
results <- rbind(results, data.frame(
poly_order = p, excess = round(excess_p, 0),
b_norm = round(b_p, 3), elasticity = round(e_p, 3)
))
}
cat("=== Robustness to Polynomial Order ===\n")
print(results)
cat("\nTrue elasticity:", true_elasticity, "\n")
cat("Preferred specification: order 7 (standard in the literature)\n")

Expected output:
| Polynomial Order | Excess Mass | b (normalized) | Elasticity |
|---|---|---|---|
| 3 | ~4,500–5,500 | ~3.2–4.1 | ~0.19–0.26 |
| 5 | ~4,900–5,400 | ~3.5–3.9 | ~0.21–0.25 |
| 7 | ~4,900–5,400 | ~3.5–3.9 | ~0.21–0.25 |
| 9 | ~4,900–5,400 | ~3.5–3.9 | ~0.21–0.25 |
| 10 | ~4,900–5,400 | ~3.5–3.9 | ~0.21–0.25 |
The estimates should be relatively stable across polynomial orders 5–10, with orders 3–4 potentially underfitting (a low-order counterfactual cannot track the curvature of the income distribution near its mode) and very high orders potentially overfitting noise in the bin counts. Order 7 is conventional in the literature.
Why is the bunching estimator sensitive to the polynomial order used for the counterfactual?
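The counterfactual also depends on the width of the excluded region, a choice the takeaways flag as equally important. The sketch below is self-contained (it re-creates the Step 1 DGP with the same assumed parameters) and re-estimates the elasticity while varying the upper edge of the excluded region. Estimates should be stable as long as the excluded region covers the missing mass, which here ends near kink + dz* = $22,500; the values 2500, 3000, and 4000 are illustrative choices.

```r
# Robustness of the elasticity to the excluded-region width.
set.seed(42)
n <- 50000
kink <- 20000; tau_below <- 0.15; tau_above <- 0.40
delta_tau <- tau_above - tau_below
dz_true <- kink * 0.3 * delta_tau / (1 - tau_above)  # true elasticity 0.3
income <- rlnorm(n, meanlog = log(22000), sdlog = 0.35)
bunch <- income > kink & income <= kink + dz_true
income[bunch] <- kink + rnorm(sum(bunch), 0, 200)

bin_width <- 500
bins <- seq(10000, 35000, by = bin_width)
inc <- income[income >= min(bins) & income <= max(bins)]
bin_df <- data.frame(
  bin_mid = bins[-length(bins)] + bin_width / 2,
  count = hist(inc, breaks = bins, plot = FALSE)$counts
)
window <- abs(bin_df$bin_mid - kink) <= bin_width  # narrow bunching window

res <- numeric(0)
for (upper in c(2500, 3000, 4000)) {
  excluded <- bin_df$bin_mid >= kink - 1000 & bin_df$bin_mid <= kink + upper
  fit <- lm(count ~ poly(bin_mid, 7), data = bin_df[!excluded, ])
  cf <- predict(fit, newdata = bin_df)
  excess <- sum(bin_df$count[window]) - sum(cf[window])
  cf_kink <- cf[which.min(abs(bin_df$bin_mid - kink))]
  e_hat <- (excess / cf_kink * bin_width) / (kink * delta_tau / (1 - tau_above))
  res <- c(res, e_hat)
  cat("Excluded region up to kink +", upper, "-> elasticity:", round(e_hat, 3), "\n")
}
```

Try shrinking the upper edge to 2000: the excluded region then fails to cover the missing-mass "hole", the polynomial is dragged down by near-empty bins, and the estimate becomes unstable.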
Step 6: Placebo Test
A placebo test checks whether the method spuriously detects bunching at a non-kink point. If the method finds significant bunching where none should exist, the results are unreliable.
# Placebo: test for bunching at $15,000 (no kink there)
placebo_kink <- 15000
bin_df$excluded_placebo <- bin_df$bin_mid >= placebo_kink - 1000 &
  bin_df$bin_mid <= placebo_kink + 3000
# Fit on bins outside BOTH the placebo region and the true-kink region, so
# the known bunching at $20,000 cannot distort the placebo counterfactual
fit_placebo <- bin_df[!bin_df$excluded_placebo & !bin_df$excluded, ]
poly_placebo <- lm(count ~ poly(bin_mid, poly_order), data = fit_placebo)
bin_df$cf_placebo <- predict(poly_placebo, newdata = bin_df)
# Excess mass over a narrow window around the placebo point
placebo_window <- abs(bin_df$bin_mid - placebo_kink) <= bin_width
excess_placebo <- sum(bin_df$count[placebo_window]) -
  sum(bin_df$cf_placebo[placebo_window])
cf_at_placebo <- bin_df$cf_placebo[which.min(abs(bin_df$bin_mid - placebo_kink))]
b_placebo <- excess_placebo / cf_at_placebo
cat("=== Placebo Test at $15,000 ===\n")
cat("Excess mass (placebo):", round(excess_placebo, 0), "\n")
cat("Normalized b (placebo):", round(b_placebo, 3), "\n")
cat("\nComparison:\n")
cat(" True kink ($20,000): b =", round(b_normalized, 3), "\n")
cat(" Placebo ($15,000): b =", round(b_placebo, 3), "\n")
cat("\nPlacebo should be close to zero (no bunching at $15,000).\n")

Expected output:
| Location | Normalized Excess Mass (b) | Significant? |
|---|---|---|
| True kink ($20,000) | ~3.4–4.0 | Yes |
| Placebo ($15,000) | ~-0.1 to 0.1 | No |
The placebo test should find no significant bunching at $15,000, confirming that the method is detecting a real behavioral response at the true kink point rather than an artifact of the estimation procedure.
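A single placebo point can pass by luck, so it is worth sweeping several. This self-contained sketch (same assumed DGP as Step 1) repeats the placebo at three non-kink incomes chosen for illustration; each fit excludes both the placebo window and the true-kink region.

```r
# Placebo sweep: normalized excess mass at several non-kink incomes.
set.seed(42)
n <- 50000
kink <- 20000; delta_tau <- 0.25; tau_above <- 0.40
dz_true <- kink * 0.3 * delta_tau / (1 - tau_above)
income <- rlnorm(n, meanlog = log(22000), sdlog = 0.35)
bunch <- income > kink & income <= kink + dz_true
income[bunch] <- kink + rnorm(sum(bunch), 0, 200)

bin_width <- 500
bins <- seq(10000, 35000, by = bin_width)
inc <- income[income >= min(bins) & income <= max(bins)]
bin_df <- data.frame(
  bin_mid = bins[-length(bins)] + bin_width / 2,
  count = hist(inc, breaks = bins, plot = FALSE)$counts
)
kink_region <- bin_df$bin_mid >= kink - 1000 & bin_df$bin_mid <= kink + 3000

res_b <- numeric(0)
for (p in c(14000, 15000, 27000)) {
  excl_p <- bin_df$bin_mid >= p - 1000 & bin_df$bin_mid <= p + 3000
  # Exclude both the placebo window and the true-kink region from the fit
  fit <- lm(count ~ poly(bin_mid, 7), data = bin_df[!excl_p & !kink_region, ])
  cf <- predict(fit, newdata = bin_df)
  win <- abs(bin_df$bin_mid - p) <= bin_width
  b_p <- (sum(bin_df$count[win]) - sum(cf[win])) /
    cf[which.min(abs(bin_df$bin_mid - p))]
  res_b <- c(res_b, b_p)
  cat("Placebo at", p, ": b =", round(b_p, 3), "\n")
}
```

All three normalized excess masses should be small relative to the b of roughly 3.4–4.0 found at the true kink.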
Step 7: Bootstrap Standard Errors
Bunching estimates require bootstrap standard errors because the counterfactual polynomial introduces estimation uncertainty that analytical formulas do not capture.
# Bootstrap standard errors for the elasticity
n_boot <- 200
boot_elasticities <- numeric(n_boot)
for (b_iter in 1:n_boot) {
  # Resample individuals, then drop resampled incomes outside the bin range
  # (hist() errors when breaks do not span the data)
  boot_sample <- sample(df$income, n, replace = TRUE)
  boot_in_range <- boot_sample[boot_sample >= min(bins) & boot_sample <= max(bins)]
  boot_counts <- hist(boot_in_range, breaks = bins, plot = FALSE)$counts
  boot_df <- data.frame(bin_mid = bin_mids, count = boot_counts)
  boot_df$excluded <- boot_df$bin_mid >= excl_lower & boot_df$bin_mid <= excl_upper
  boot_fit <- boot_df[!boot_df$excluded, ]
  boot_poly <- lm(count ~ poly(bin_mid, poly_order), data = boot_fit)
  boot_cf <- predict(boot_poly, newdata = boot_df)
  # Excess over the narrow bunching window around the kink
  boot_window <- abs(boot_df$bin_mid - kink) <= bin_width
  boot_excess <- sum(boot_df$count[boot_window]) - sum(boot_cf[boot_window])
  boot_cf_kink <- boot_cf[which.min(abs(boot_df$bin_mid - kink))]
  boot_dz <- boot_excess / boot_cf_kink * bin_width
  boot_elasticities[b_iter] <- boot_dz / (kink * (delta_tau / (1 - tau_above)))
}
cat("=== Bootstrap Results (", n_boot, "replications) ===\n")
cat("Mean elasticity:", round(mean(boot_elasticities), 3), "\n")
cat("SE:", round(sd(boot_elasticities), 3), "\n")
cat("95% CI: [", round(quantile(boot_elasticities, 0.025), 3), ",",
round(quantile(boot_elasticities, 0.975), 3), "]\n")
cat("\nTrue elasticity:", true_elasticity, "\n")

Expected output:
| Statistic | Value |
|---|---|
| Mean elasticity (bootstrap) | ~0.21–0.25 |
| Bootstrap SE | ~0.005–0.02 |
| 95% CI | ~[0.20, 0.26] |
| True elasticity | 0.30 |
The bootstrap interval is tight, but it will typically sit below the true elasticity of 0.30: resampling captures sampling uncertainty in the counterfactual fit, not the attenuation that diffuse bunching introduces into the point estimate itself.
Step 8: Exercises
- Computing the elasticity from bunching. At a tax kink, the marginal tax rate jumps from 25% to 45% at an income level of $30,000. You estimate a normalized excess mass of b = 4.0 with a histogram bin width of $500. Using the formulas from Step 4, compute the implied dz* and the elasticity.
- Bin width sensitivity. Re-estimate the model with bin widths of $250 and $1,000 and compare to the $500 baseline. Narrower bins provide more resolution but noisier counts.
- Notch vs. kink. A notch is a discontinuous jump in the average tax rate (total tax liability), not the marginal rate, and typically produces much larger bunching. Modify the simulation to create a notch and compare.
- Diffuse bunching. In practice, optimization frictions mean that bunching is not sharp: individuals cluster in a region around the kink rather than exactly at it. Increase the noise added to bunchers and see how it affects the estimates.
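As a hedged starting point for the notch exercise (the notch size and the flat marginal rate are assumptions, not part of the lab above): a notch makes a range of incomes just above the threshold strictly dominated, because take-home pay there falls below its level at the threshold, so even frictionless individuals in that range relocate to the notch point.

```r
# Mechanical (dominated-region) bunching at a hypothetical notch.
set.seed(42)
n <- 50000
notch_point <- 20000
tau <- 0.15        # assumed flat marginal rate
notch_size <- 3000 # assumed $3,000 jump in tax liability at the threshold
income <- rlnorm(n, meanlog = log(22000), sdlog = 0.35)
# Dominated region: z - tau*z - notch_size < z* - tau*z*
#   =>  z < z* + notch_size / (1 - tau)
z_dominated <- notch_point + notch_size / (1 - tau)
bunch <- income > notch_point & income <= z_dominated
n_bunchers <- sum(bunch)
income[bunch] <- notch_point + rnorm(n_bunchers, 0, 200)
cat("Dominated region width:", round(z_dominated - notch_point), "dollars\n")
cat("Notch bunchers:", n_bunchers, "(vs roughly 6,600 in the kink simulation)\n")
# Note: behavioral responses extend the bunching range beyond the dominated
# region, so this sketch captures only the mechanical part of notch bunching.
```

From here, rebuild the histogram and counterfactual exactly as in Steps 2–3 and compare the spike to the kink case.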
Key Takeaways
- Bunching estimation exploits the clustering of individuals at kink points or notch points in tax schedules or regulatory thresholds
- The key estimation step is fitting a smooth polynomial to the non-bunching bins to construct the counterfactual distribution
- The normalized excess mass measures the local behavioral response relative to the counterfactual density
- The behavioral elasticity is derived from the excess mass, the kink location, and the change in marginal tax rates
- Results are sensitive to the polynomial order and the excluded region; always report robustness to these choices
- Placebo tests at non-kink points help validate that the method is detecting real bunching, not estimation artifacts
- Bootstrap standard errors are necessary because the counterfactual polynomial introduces estimation uncertainty
- Optimization frictions (real-world adjustment costs) can attenuate observed bunching and bias elasticity estimates downward