MethodAtlas
Lab tutorial · 6 min read · 90 minutes hands-on

Lab: Bunching Estimation from Scratch

Implement bunching estimation step by step. Simulate an income distribution with bunching at a tax kink, fit a polynomial counterfactual, estimate excess mass, compute the implied behavioral elasticity, and assess robustness to polynomial order.

Languages: Python, R, Stata
Dataset: Taxable income distribution at EITC kink (simulated)

Overview

Bunching estimation exploits the fact that individuals cluster at kink points or notch points in tax schedules, benefit formulas, or regulatory thresholds. The excess mass at these points — relative to a smooth counterfactual distribution — reveals behavioral responses. This method was formalized by Saez (2010) and has become a standard tool in public finance and labor economics.

What you will learn:

  • How bunching at a kink point reveals behavioral responses to incentives
  • How to construct a polynomial counterfactual distribution
  • How to estimate the excess mass (bunching) at the kink
  • How to convert excess mass into a behavioral elasticity
  • How to assess sensitivity to polynomial order and bin width
  • How to run a placebo test at a non-kink point

Prerequisites: OLS regression, basic understanding of tax policy and labor supply elasticities.


Step 1: Simulate Income Distribution with Bunching

We simulate a taxable income distribution where individuals bunch at a kink point in the tax schedule. Below the kink, the marginal tax rate is low; above the kink, the rate jumps.

set.seed(42)
n <- 50000

# Tax schedule parameters
kink <- 20000          # Kink point (dollars)
tau_below <- 0.15      # Marginal tax rate below kink
tau_above <- 0.40      # Marginal tax rate above kink
delta_tau <- tau_above - tau_below  # Tax rate jump = 0.25

# True behavioral elasticity
true_elasticity <- 0.3

# Simulate counterfactual income (what people would earn without the kink)
# Log-normal distribution centered around the kink
income_star <- rlnorm(n, meanlog = log(22000), sdlog = 0.35)

# Bunching: individuals with income_star in (kink, kink + dz*) bunch to kink
# dz* = kink * elasticity * (delta_tau / (1 - tau_above))
# This is the range of counterfactual incomes that bunch at the kink
dz_star <- kink * true_elasticity * (delta_tau / (1 - tau_above))
cat("Theoretical bunching range (dz*):", round(dz_star, 0), "dollars\n")

# Apply bunching: people in (kink, kink + dz*) move to the kink
income_observed <- ifelse(
income_star > kink & income_star <= kink + dz_star,
kink,  # They bunch at the kink
income_star
)

# Add small noise to bunchers (they cluster near kink, not exactly at it)
bunchers <- income_star > kink & income_star <= kink + dz_star
income_observed[bunchers] <- kink + rnorm(sum(bunchers), mean = 0, sd = 200)

df <- data.frame(income = income_observed, income_star = income_star)

cat("Kink point:", kink, "\n")
cat("Tax rate below kink:", tau_below, "\n")
cat("Tax rate above kink:", tau_above, "\n")
cat("Number of bunchers:", sum(bunchers), "\n")
cat("Fraction bunching:", round(mean(bunchers), 4), "\n")

Expected output:

Parameter | Value
Kink point | $20,000
Tax rate below kink | 15%
Tax rate above kink | 40%
Bunching range (dz*) | ~$2,500
Number of bunchers | ~6,000–7,000 (≈13% of the sample)

The simulated data should show a clear spike in the income distribution at the kink point, with a "missing mass" in the region just above the kink.
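For readers following along in Python (one of the lab's listed languages), the same simulation can be sketched with NumPy; variable names mirror the R block above, and the seed is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Tax schedule parameters, matching the R block above
kink = 20_000
tau_below, tau_above = 0.15, 0.40
delta_tau = tau_above - tau_below      # 0.25
true_elasticity = 0.3

# Counterfactual income: log-normal centered near the kink
income_star = rng.lognormal(mean=np.log(22_000), sigma=0.35, size=n)

# Theoretical bunching range: dz* = z* * e * (delta_tau / (1 - tau_above)) = 2,500
dz_star = kink * true_elasticity * (delta_tau / (1 - tau_above))

# Individuals with counterfactual income in (kink, kink + dz*] bunch,
# clustering near the kink with a little noise
bunchers = (income_star > kink) & (income_star <= kink + dz_star)
income_obs = income_star.copy()
income_obs[bunchers] = kink + rng.normal(0.0, 200.0, size=bunchers.sum())

print(f"dz* = {dz_star:.0f}, bunchers = {bunchers.sum()}")
```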


Step 2: Create Histogram and Identify Bunching

Bin the income data and visualize the distribution to see the bunching at the kink.

# Create bins
bin_width <- 500  # $500 bins
bins <- seq(10000, 35000, by = bin_width)
bin_mids <- bins[-length(bins)] + bin_width / 2

# Count observations in each bin
counts <- hist(df$income, breaks = bins, plot = FALSE)$counts

bin_df <- data.frame(
bin_mid = bin_mids,
count = counts,
rel_income = bin_mids - kink  # Distance from kink
)

# Plot
# Plot (capture the bar midpoints returned by barplot so the kink line
# lands on the correct bar; bar indices are not valid x-coordinates)
bp <- barplot(bin_df$count, names.arg = bin_df$bin_mid / 1000,
      xlab = "Income (thousands $)", ylab = "Count",
      main = "Income Distribution with Bunching at Kink",
      col = ifelse(abs(bin_df$bin_mid - kink) <= bin_width, "red", "steelblue"))
abline(v = bp[which.min(abs(bin_df$bin_mid - kink))], lty = 2)

cat("Bin at kink point (count):", bin_df$count[which.min(abs(bin_df$bin_mid - kink))], "\n")
cat("Average bin count (away from kink):",
  round(mean(bin_df$count[abs(bin_df$rel_income) > 3000]), 1), "\n")

Expected output:

The histogram should show a clear spike at the kink point ($20,000), with the bunching bin containing 3–5 times more observations than a typical bin away from the kink. There should also be a visible "hole" — missing mass — in the bins just above the kink, where individuals would have been located had they not bunched.
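The binning step can be sketched in NumPy as follows (note that `np.histogram` uses left-closed bins, whereas R's `hist` defaults to right-closed; with $500 bins the difference is cosmetic). For brevity this sketch places bunchers exactly at the kink, without noise:

```python
import numpy as np

rng = np.random.default_rng(42)
kink, bin_width = 20_000, 500

# Re-create the bunching data (bunchers placed exactly at the kink)
income = rng.lognormal(np.log(22_000), 0.35, size=50_000)
mask = (income > kink) & (income <= kink + 2_500)
income[mask] = kink

edges = np.arange(10_000, 35_000 + bin_width, bin_width)
counts, _ = np.histogram(income, bins=edges)
bin_mids = edges[:-1] + bin_width / 2

near = np.abs(bin_mids - kink) <= bin_width   # the bins straddling the kink
away = np.abs(bin_mids - kink) > 3_000
print(f"count near kink: {counts[near].sum()}")
print(f"typical bin count: {counts[away].mean():.1f}")
```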


Step 3: Fit the Polynomial Counterfactual

The key step in bunching estimation is fitting a smooth polynomial to the bin counts, excluding the bins near the kink. This polynomial represents the counterfactual distribution — what the income distribution would look like without the kink.

# Define excluded region around the kink
excl_lower <- kink - 1000  # Exclude bins $1,000 below kink
excl_upper <- kink + 3000  # Exclude bins $3,000 above kink (includes missing mass)

# Mark excluded bins
bin_df$excluded <- bin_df$bin_mid >= excl_lower & bin_df$bin_mid <= excl_upper

# Fit polynomial of order 7 on non-excluded bins
poly_order <- 7
fit_data <- bin_df[!bin_df$excluded, ]

poly_model <- lm(count ~ poly(bin_mid, poly_order, raw = TRUE), data = fit_data)

# Predict counterfactual for ALL bins (including excluded)
bin_df$counterfactual <- predict(poly_model, newdata = bin_df)

# Excess mass: observed minus counterfactual, summed over the bunching
# window (the spike bins within one bin width of the kink). Summing over
# the ENTIRE excluded region would net the spike against the missing mass
# just above the kink and come out near zero.
bunching_bins <- abs(bin_df$rel_income) <= bin_width
excess_mass <- sum(bin_df$count[bunching_bins]) - sum(bin_df$counterfactual[bunching_bins])

# Normalized excess mass: b = excess / counterfactual count at the kink bin,
# i.e., b is measured in units of counterfactual bin heights
cf_at_kink <- bin_df$counterfactual[which.min(abs(bin_df$bin_mid - kink))]
b_normalized <- excess_mass / cf_at_kink

cat("=== Bunching Estimates ===\n")
cat("Excess mass (raw count):", round(excess_mass, 0), "\n")
cat("Counterfactual at kink:", round(cf_at_kink, 1), "\n")
cat("Normalized excess mass (b):", round(b_normalized, 3), "\n")

Expected output:

Statistic | Value
Excess mass (raw) | ~4,500–5,500
Counterfactual at kink | ~1,200–1,500
Normalized excess mass (b) | ~3.0–4.5

The normalized excess mass tells you how many counterfactual bin heights' worth of individuals are bunching at the kink. A value of 4 means the excess mass equals four times the number of individuals the counterfactual predicts for a single bin at the kink.
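The counterfactual step can be sketched in NumPy (a simplified illustration, not the full integration-constraint procedure used in the literature). One subtlety worth flagging: the excess is summed over the narrow spike window only, because summing observed minus counterfactual over the whole excluded region would net the spike against the missing mass just above the kink:

```python
import numpy as np

rng = np.random.default_rng(42)
kink, bin_width = 20_000, 500

# Simulated bunching data, as in the earlier steps (no noise, for brevity)
income = rng.lognormal(np.log(22_000), 0.35, size=50_000)
mask = (income > kink) & (income <= kink + 2_500)
income[mask] = kink

edges = np.arange(10_000, 35_000 + bin_width, bin_width)
counts, _ = np.histogram(income, bins=edges)
bin_mids = edges[:-1] + bin_width / 2

# Exclude $1,000 below to $3,000 above the kink from the fit
excluded = (bin_mids >= kink - 1_000) & (bin_mids <= kink + 3_000)

# Degree-7 polynomial; rescale x so the fit is numerically well conditioned
x = (bin_mids - kink) / 1_000
coefs = np.polyfit(x[~excluded], counts[~excluded], deg=7)
counterfactual = np.polyval(coefs, x)

# Excess over the spike window only (bins within one bin width of the kink)
window = np.abs(bin_mids - kink) <= bin_width
excess = counts[window].sum() - counterfactual[window].sum()
cf_at_kink = counterfactual[np.argmin(np.abs(bin_mids - kink))]
b = excess / cf_at_kink
print(f"excess = {excess:.0f}, b = {b:.2f}")
```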

Concept Check

What does the 'normalized excess mass' (b) measure in bunching estimation?


Step 4: Compute the Elasticity

The key output of bunching estimation is the behavioral elasticity. At a kink point, the elasticity is derived from the normalized excess mass and the change in marginal tax rates.

# At a kink: dz* = z* * e * (delta_tau / (1 - tau_above))
# Normalized excess mass b ~ dz* / bin_width (approximately)
# So: e ~ b * bin_width / (z* * delta_tau / (1 - tau_above))

# More precisely, estimate dz* from excess mass
# dz* is the width of the region from which bunchers come
dz_star_hat <- excess_mass / cf_at_kink * bin_width

# Elasticity: e = dz* / (z* * (delta_tau / (1 - tau_above)))
elasticity_hat <- dz_star_hat / (kink * (delta_tau / (1 - tau_above)))

cat("=== Elasticity Estimate ===\n")
cat("Estimated dz*:", round(dz_star_hat, 0), "dollars\n")
cat("True dz*:", round(dz_star, 0), "dollars\n")
cat("\nEstimated elasticity:", round(elasticity_hat, 3), "\n")
cat("True elasticity:", true_elasticity, "\n")
cat("\nTax schedule parameters used:\n")
cat("  z* (kink point):", kink, "\n")
cat("  delta_tau (rate jump):", delta_tau, "\n")
cat("  1 - tau_above:", 1 - tau_above, "\n")

Expected output:

Parameter | Estimate | True Value
dz* (bunching range) | ~$1,800–2,500 | ~$2,500
Elasticity | ~0.22–0.30 | 0.30

The estimated elasticity should be close to the true value of 0.30, though it can fall somewhat below it: with sharply concentrated bunching, part of the missing mass lies inside the spike window and attenuates the measured excess. The elasticity itself means that a 1% increase in the net-of-tax rate raises reported taxable income by about 0.3%.
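As a sanity check on the formula itself, independent of any estimation noise, plugging the simulation's theoretical dz* back into the mapping recovers the true elasticity exactly:

```python
# Worked example with the simulation's theoretical values
kink = 20_000
delta_tau = 0.40 - 0.15                      # jump in the marginal rate
net_of_tax_change = delta_tau / (1 - 0.40)   # ~0.417

dz_star = 2_500                              # theoretical bunching range from Step 1
elasticity = dz_star / (kink * net_of_tax_change)
print(round(elasticity, 3))                  # recovers the true value, 0.3
```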


Step 5: Robustness to Polynomial Order

The polynomial counterfactual is the most sensitive part of the estimation. Different polynomial orders can produce different estimates.

# Try polynomial orders 3 through 10
results <- data.frame(poly_order = integer(), excess = numeric(),
                    b_norm = numeric(), elasticity = numeric())

for (p in 3:10) {
poly_p <- lm(count ~ poly(bin_mid, p, raw = TRUE), data = fit_data)
cf_p <- predict(poly_p, newdata = bin_df)
excess_p <- sum(bin_df$count[bunching_bins]) - sum(cf_p[bunching_bins])
cf_kink_p <- cf_p[which.min(abs(bin_df$bin_mid - kink))]
b_p <- excess_p / cf_kink_p
dz_p <- excess_p / cf_kink_p * bin_width
e_p <- dz_p / (kink * (delta_tau / (1 - tau_above)))

results <- rbind(results, data.frame(
  poly_order = p, excess = round(excess_p, 0),
  b_norm = round(b_p, 3), elasticity = round(e_p, 3)
))
}

cat("=== Robustness to Polynomial Order ===\n")
print(results)
cat("\nTrue elasticity:", true_elasticity, "\n")
cat("Preferred specification: order 7 (standard in the literature)\n")

Expected output:

Polynomial Order | Excess Mass | b (normalized) | Elasticity
3 | ~3,500 | ~3.5 | ~0.28
5 | ~3,800 | ~3.8 | ~0.30
7 | ~3,900 | ~3.9 | ~0.31
9 | ~3,800 | ~3.8 | ~0.30
10 | ~3,900 | ~3.9 | ~0.31

The estimates should be relatively stable across polynomial orders 5–10, with orders 3–4 potentially underfitting and very high orders potentially overfitting. Order 7 is conventional in the literature.
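The same robustness loop can be sketched in NumPy (using the spike-window excess; exact values differ run to run, so focus on the stability of b across orders rather than any single number):

```python
import numpy as np

rng = np.random.default_rng(42)
kink, bin_width = 20_000, 500

# Simulated bunching data, as in the earlier steps
income = rng.lognormal(np.log(22_000), 0.35, size=50_000)
mask = (income > kink) & (income <= kink + 2_500)
income[mask] = kink

edges = np.arange(10_000, 35_000 + bin_width, bin_width)
counts, _ = np.histogram(income, bins=edges)
bin_mids = edges[:-1] + bin_width / 2
x = (bin_mids - kink) / 1_000               # rescaled for stable fitting

excluded = (bin_mids >= kink - 1_000) & (bin_mids <= kink + 3_000)
window = np.abs(bin_mids - kink) <= bin_width

b_by_order = {}
for deg in range(3, 11):
    coefs = np.polyfit(x[~excluded], counts[~excluded], deg)
    cf = np.polyval(coefs, x)
    excess = counts[window].sum() - cf[window].sum()
    b_by_order[deg] = excess / cf[np.argmin(np.abs(bin_mids - kink))]

for deg, b in b_by_order.items():
    print(f"order {deg}: b = {b:.2f}")
```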

Concept Check

Why is the bunching estimator sensitive to the polynomial order used for the counterfactual?


Step 6: Placebo Test

A placebo test checks whether the method spuriously detects bunching at a non-kink point. If the method finds significant bunching where none should exist, the results are unreliable.

# Placebo: test for bunching at $15,000 (no kink there)
placebo_kink <- 15000
excl_lower_p <- placebo_kink - 1000
excl_upper_p <- placebo_kink + 3000

bin_df$excluded_placebo <- bin_df$bin_mid >= excl_lower_p &
                          bin_df$bin_mid <= excl_upper_p

fit_placebo <- bin_df[!bin_df$excluded_placebo, ]
poly_placebo <- lm(count ~ poly(bin_mid, poly_order, raw = TRUE),
                 data = fit_placebo)
bin_df$cf_placebo <- predict(poly_placebo, newdata = bin_df)

# Excess over a spike-sized window at the placebo point
# (not the whole excluded region, mirroring the main estimate)
placebo_window <- abs(bin_df$bin_mid - placebo_kink) <= bin_width
excess_placebo <- sum(bin_df$count[placebo_window]) -
                sum(bin_df$cf_placebo[placebo_window])

cf_at_placebo <- bin_df$cf_placebo[which.min(abs(bin_df$bin_mid - placebo_kink))]
b_placebo <- excess_placebo / cf_at_placebo

cat("=== Placebo Test at $15,000 ===\n")
cat("Excess mass (placebo):", round(excess_placebo, 0), "\n")
cat("Normalized b (placebo):", round(b_placebo, 3), "\n")
cat("\nComparison:\n")
cat("  True kink ($20,000): b =", round(b_normalized, 3), "\n")
cat("  Placebo ($15,000):   b =", round(b_placebo, 3), "\n")
cat("\nPlacebo should be close to zero (no bunching at $15,000).\n")

Expected output:

Location | Normalized Excess Mass (b) | Significant?
True kink ($20,000) | ~3.5–4.5 | Yes
Placebo ($15,000) | ~−0.1 to 0.1 | No

The placebo test should find no significant bunching at $15,000, confirming that the method is detecting a real behavioral response at the true kink point rather than an artifact of the estimation procedure.


Step 7: Bootstrap Standard Errors

Bunching estimates require bootstrap standard errors because the counterfactual polynomial introduces estimation uncertainty that analytical formulas do not capture.

# Bootstrap standard errors for the elasticity
n_boot <- 200
boot_elasticities <- numeric(n_boot)

for (b_iter in 1:n_boot) {
# Resample individuals
boot_sample <- sample(df$income, n, replace = TRUE)
boot_counts <- hist(boot_sample, breaks = bins, plot = FALSE)$counts

boot_df <- data.frame(bin_mid = bin_mids, count = boot_counts)
boot_df$excluded <- boot_df$bin_mid >= excl_lower & boot_df$bin_mid <= excl_upper

boot_fit <- boot_df[!boot_df$excluded, ]
boot_poly <- lm(count ~ poly(bin_mid, poly_order, raw = TRUE), data = boot_fit)
boot_cf <- predict(boot_poly, newdata = boot_df)

# Excess over the spike window at the kink
boot_window <- abs(boot_df$bin_mid - kink) <= bin_width
boot_excess <- sum(boot_df$count[boot_window]) -
               sum(boot_cf[boot_window])
boot_cf_kink <- boot_cf[which.min(abs(boot_df$bin_mid - kink))]
boot_dz <- boot_excess / boot_cf_kink * bin_width
boot_elasticities[b_iter] <- boot_dz / (kink * (delta_tau / (1 - tau_above)))
}

cat("=== Bootstrap Results (", n_boot, "replications) ===\n")
cat("Mean elasticity:", round(mean(boot_elasticities), 3), "\n")
cat("SE:", round(sd(boot_elasticities), 3), "\n")
cat("95% CI: [", round(quantile(boot_elasticities, 0.025), 3), ",",
  round(quantile(boot_elasticities, 0.975), 3), "]\n")
cat("\nTrue elasticity:", true_elasticity, "\n")

Expected output:

Statistic | Value
Mean elasticity (bootstrap) | ~0.22–0.32
Bootstrap SE | ~0.01–0.05
95% CI | roughly [0.20, 0.35]
True elasticity | 0.30

The 95% confidence interval should be tight and lie close to the true elasticity of 0.30. With sharply concentrated bunching, the point estimate can sit slightly below the truth (part of the missing mass falls inside the bunching window), so a CI just under 0.30 is not by itself a red flag.
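A compact NumPy version of the bootstrap, resampling incomes, re-binning, and refitting the polynomial on each draw (a sketch that reuses the spike-window excess; 200 replications, as above):

```python
import numpy as np

rng = np.random.default_rng(0)
kink, bin_width, deg = 20_000, 500, 7
edges = np.arange(10_000, 35_000 + bin_width, bin_width)
bin_mids = edges[:-1] + bin_width / 2
x = (bin_mids - kink) / 1_000
excluded = (bin_mids >= kink - 1_000) & (bin_mids <= kink + 3_000)
window = np.abs(bin_mids - kink) <= bin_width

# Simulated bunching data, as in the earlier steps
income = rng.lognormal(np.log(22_000), 0.35, size=50_000)
mask = (income > kink) & (income <= kink + 2_500)
income[mask] = kink

def estimate_elasticity(sample):
    """Bin, fit the polynomial counterfactual, and map excess mass to e."""
    counts, _ = np.histogram(sample, bins=edges)
    coefs = np.polyfit(x[~excluded], counts[~excluded], deg)
    cf = np.polyval(coefs, x)
    excess = counts[window].sum() - cf[window].sum()
    dz = excess / cf[np.argmin(np.abs(bin_mids - kink))] * bin_width
    return dz / (kink * (0.25 / 0.60))       # delta_tau / (1 - tau_above)

boots = np.array([
    estimate_elasticity(rng.choice(income, size=income.size, replace=True))
    for _ in range(200)
])
print(f"mean = {boots.mean():.3f}, SE = {boots.std(ddof=1):.3f}")
print(f"95% CI = [{np.quantile(boots, 0.025):.3f}, {np.quantile(boots, 0.975):.3f}]")
```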


Step 8: Exercises

Guided Exercise

Computing the Elasticity from Bunching

At a tax kink, the marginal tax rate jumps from 25% to 45% at an income level of $30,000. You estimate a normalized excess mass of b = 4.0. The bin width used in the histogram is $500.

What is the change in marginal tax rate (delta_tau)?

What is the estimated dz* (bunching range) in dollars? (dz* = b * bin_width)

What is the estimated elasticity? Use e = dz* / (z* * delta_tau / (1 - tau_above)). Round to 2 decimal places.
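Once you have worked through the three questions, a quick snippet (using the exercise's hypothetical numbers) can check your answers:

```python
# Exercise inputs
z_star = 30_000                  # kink location
tau_below, tau_above = 0.25, 0.45
b, bin_width = 4.0, 500

delta_tau = tau_above - tau_below                               # (1) rate change
dz_star = b * bin_width                                         # (2) bunching range
elasticity = dz_star / (z_star * delta_tau / (1 - tau_above))   # (3) elasticity
print(round(delta_tau, 2), dz_star, round(elasticity, 2))
```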

  1. Bin width sensitivity. Re-estimate the model with bin widths of $250, $500, and $1,000. How do the estimates change? Narrower bins provide more resolution but noisier counts.

  2. Notch vs. kink. A notch is a discontinuous jump in total tax liability (and hence the average tax rate) at the threshold, rather than a change in the marginal rate. Bunching at a notch is typically much larger. Modify the simulation to create a notch and compare.

  3. Diffuse bunching. In practice, optimization frictions mean that bunching is not sharp — individuals cluster in a region around the kink rather than exactly at it. Try increasing the noise for bunchers and see how it affects the estimates.


Key Takeaways

  • Bunching estimation exploits the clustering of individuals at kink points or notch points in tax schedules or regulatory thresholds
  • The key estimation step is fitting a smooth polynomial to the non-bunching bins to construct the counterfactual distribution
  • The normalized excess mass measures the local behavioral response relative to the counterfactual density
  • The behavioral elasticity is derived from the excess mass, the kink location, and the change in marginal tax rates
  • Results are sensitive to the polynomial order and the excluded region; always report robustness to these choices
  • Placebo tests at non-kink points help validate that the method is detecting real bunching, not estimation artifacts
  • Bootstrap standard errors are necessary because the counterfactual polynomial introduces estimation uncertainty
  • Optimization frictions (real-world adjustment costs) can attenuate observed bunching and bias elasticity estimates downward