MethodAtlas
Design-Based · Modern

Bunching Estimation

Identifies behavioral responses from excess mass at notch or kink points in budget sets, estimating elasticities from distributional distortions.

Quick Reference

When to Use
When agents bunch at a threshold in the assignment variable and you want to estimate the elasticity of behavioral response.
Key Assumption
The counterfactual distribution (without the threshold) is smooth and can be approximated by a polynomial fitted to the observed distribution excluding the bunching region.
Common Mistake
Confusing notches (discrete jumps) with kinks (slope changes). Notch bunching implies larger elasticities. Also, ignoring optimization frictions that attenuate observed bunching.
Estimated Time
3 hours

One-Line Implementation

Stata: bunching income, kink(threshold) binwidth(500) poly(7) bandwidth(10000)
R: library(bunching); bunchit(z_vector = income, zstar = threshold, binwidth = 500, poly = 7, t0 = t_below, t1 = t_above)
Python: # Manual: fit polynomial to histogram bins excluding bunching region, integrate excess mass


Motivating Example

An economist wants to estimate how much taxpayers adjust their reported income in response to marginal tax rate changes. The US income tax schedule has kink points -- thresholds where the marginal tax rate jumps discretely. For example, at the boundary of the 10% and 12% brackets, the marginal rate increases by 2 percentage points. If taxpayers are responsive, they should cluster their reported income just below the kink to avoid the higher rate.

Here is the problem: the economist cannot run an experiment. She cannot randomly assign taxpayers to different kink points. She cannot compare taxpayers at different income levels, because high-income and low-income taxpayers differ in countless unobservable ways. A simple regression of income on tax rates would be hopelessly confounded by ability, education, occupation, and preferences.

The approach, developed by Saez (2010), solves this problem by exploiting the shape of the income distribution around the kink. If taxpayers are unresponsive to taxes, the income distribution should be smooth through the kink point -- there is no reason for a discontinuity in density. But if taxpayers respond by reducing their reported income to stay below the kink, we should see an excess mass of taxpayers bunching at the kink relative to a smooth counterfactual density.

The key insight is that the amount of excess bunching is proportional to the behavioral elasticity. More bunching means taxpayers are more responsive. The researcher estimates the counterfactual density by fitting a polynomial to the observed income histogram, excluding the bins near the kink, and then interpolating through the excluded region. The difference between the observed density and the counterfactual in the bunching region measures the excess mass, from which the elasticity is recovered.

This approach has been extended from kinks (where the marginal rate changes) to notches (where the average tax rate jumps discretely), where the incentive to bunch is much stronger and the implied elasticities can be identified more precisely (Kleven & Waseem, 2013).


A. Overview

What Bunching Estimation Does

Bunching estimation identifies behavioral responses to policy thresholds by measuring the excess mass of observations clustered at a kink or notch point relative to a smooth counterfactual density. The method proceeds in four steps:

  1. Bin the data: create a histogram of the running variable (e.g., reported income) with fine bins
  2. Exclude the bunching region: remove bins near the threshold where bunching occurs
  3. Fit a counterfactual: estimate a polynomial through the remaining bins to approximate the density that would have prevailed without the threshold
  4. Compute excess mass: integrate the difference between the observed density and the counterfactual over the bunching region
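These four steps can be sketched directly with NumPy. The snippet below is a minimal manual implementation on synthetic lognormal incomes with an assumed isoelastic response to a hypothetical kink; the threshold, tax rates, bin width, and sample size are all illustrative, and a production analysis should use a vetted package (see the Implementation section).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic counterfactual incomes and an assumed isoelastic response to a
# hypothetical kink at zstar (all parameter values are illustrative).
zstar, binwidth = 10_000, 500
t0, t1, e_true = 0.10, 0.20, 0.4
z0 = rng.lognormal(mean=9.2, sigma=0.4, size=200_000)

dz = zstar * (((1 - t0) / (1 - t1)) ** e_true - 1)  # marginal buncher's distance
z = z0.copy()
z[(z0 > zstar) & (z0 <= zstar + dz)] = zstar        # bunchers pile at the kink
above = z0 > zstar + dz
z[above] = z0[above] * ((1 - t1) / (1 - t0)) ** e_true  # intensive-margin shading

# Step 1: bin the data into a fine histogram
edges = np.arange(5_000, 20_000 + binwidth, binwidth)
counts, _ = np.histogram(z, bins=edges)
centers = edges[:-1] + binwidth / 2

# Step 2: exclude the bunching region (here +/- 2 bins around zstar)
excluded = np.abs(centers - zstar) <= 2 * binwidth

# Step 3: fit a polynomial counterfactual to the non-excluded bins
x = (centers - zstar) / 1_000        # rescale for numerical stability
coefs = np.polyfit(x[~excluded], counts[~excluded], deg=7)
cf = np.polyval(coefs, x)

# Step 4: sum the observed-minus-counterfactual gap over the bunching window
B = float(np.sum((counts - cf)[excluded]))   # excess mass
b = B / float(np.polyval(coefs, 0.0))        # normalized by cf height at zstar
print(f"B = {B:.0f}, b = {b:.2f}")
```

With bunching injected into the data, the estimated excess mass B comes out large and positive, and b is on the order of the true income response in bins.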

The excess mass B -- or the normalized excess mass b = B / h_0(z^*), where h_0(z^*) is the counterfactual density height at the threshold -- is the key estimand. The implied elasticity is then:

\hat{e} = \frac{b}{z^* \left[ \log(1 - t_0) - \log(1 - t_1) \right]}

where t_0 and t_1 are the marginal tax rates below and above the kink.
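Given an estimate of b expressed in the same units as z (so that b approximates the marginal buncher's income response), the mapping to the elasticity is a one-liner. A small helper with illustrative numbers:

```python
import math

def kink_elasticity(b, zstar, t0, t1):
    """Implied compensated elasticity at a kink where the marginal rate
    rises from t0 to t1; b is the normalized excess mass in the same
    units as z (i.e. b ~ the marginal buncher's income response)."""
    return b / (zstar * (math.log(1 - t0) - math.log(1 - t1)))

# Illustrative: $482 of bunching at a $10,000 kink where the rate rises 10% -> 20%
print(round(kink_elasticity(b=482, zstar=10_000, t0=0.10, t1=0.20), 3))  # ~0.409
```

Note that some papers instead normalize b by the bin width, which changes the units; the formula above assumes b is in dollars of z.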

Kinks vs. Notches

Different types of thresholds produce different bunching patterns:

  • Kink point: the marginal tax rate changes (the slope of the tax schedule changes). Bunching is typically modest because the incentive to adjust is proportional to the tax rate change. The income distribution shows a spike but no "missing mass" above the kink.

  • Notch point: the average tax rate jumps discontinuously (the level of the tax schedule jumps). The incentive to bunch is much stronger -- there is a region above the notch where no rational agent should locate (the "dominated region"). The income distribution shows both a spike at the notch and a "hole" (missing mass) above it. Kleven and Waseem (2013) developed the formal framework for notch bunching.

The Counterfactual Density

The counterfactual density is the distribution that would have prevailed in the absence of the threshold. It is estimated by fitting a polynomial of order p (typically p = 5 to 9) to the binned frequency counts, excluding bins within the bunching window [z^* - \delta_L, z^* + \delta_R]. The polynomial is then used to interpolate through the excluded region, and an integration constraint ensures that the total mass is preserved -- the excess mass at the threshold must equal the "missing mass" above it.
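The integration constraint can be imposed iteratively: fit the polynomial, compute the implied excess mass, inflate the observed bins above the threshold so the bunchers' mass is "returned" to the upper distribution, and refit until the excess mass converges. The sketch below is a simplified version in the spirit of the iterative adjustment in Chetty et al. (2011), not the exact published algorithm; the toy data and polynomial degree are illustrative.

```python
import numpy as np

def constrained_cf(counts, centers, zstar, excl, deg=7, n_iter=200, tol=1e-6):
    """Polynomial counterfactual with an integration constraint: observed
    bins above zstar are inflated until the implied excess mass stabilizes,
    so that total mass is preserved (simplified sketch)."""
    base = counts.astype(float)
    upper = (centers > zstar) & ~excl       # bins that absorb the missing mass
    x = (centers - zstar) / (centers.max() - centers.min())
    B = 0.0
    for _ in range(n_iter):
        adj = base.copy()
        adj[upper] *= 1.0 + B / base[upper].sum()  # return bunchers' mass above zstar
        coefs = np.polyfit(x[~excl], adj[~excl], deg=deg)
        cf = np.polyval(coefs, x)
        B_new = float(np.sum((base - cf)[excl]))   # excess mass in the window
        if abs(B_new - B) < tol:
            return cf, B_new
        B = B_new
    return cf, B

# Toy check: linear baseline density plus a spike of 800 at the threshold bin
centers = np.arange(5_250.0, 15_250.0, 500.0)
counts = 3_000.0 - 0.15 * (centers - 5_000.0)
counts[np.isclose(centers, 10_250.0)] += 800.0
excl = np.abs(centers - 10_000.0) <= 750.0
cf, B = constrained_cf(counts, centers, 10_000.0, excl, deg=3)
print(round(B))
```

Because the upper bins are shifted up before refitting, the converged excess mass is smaller than the naive (unconstrained) estimate of 800.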

How It Differs from RDD

Bunching estimation and the regression kink design (RKD) both exploit kink points, but they answer different questions with different identifying assumptions:

  • RKD estimates the causal effect of a treatment on an outcome using the kink in the treatment assignment function. It requires that the conditional expectation of the outcome is smooth through the kink.
  • Bunching estimates the elasticity of the running variable itself (e.g., reported income) with respect to the tax rate change at the kink. It requires that the density of the running variable would be smooth in the absence of the kink.


When to Use

  1. Agents face a clear threshold in their budget set. Tax kinks, notch points, regulatory cutoffs, benefit eligibility thresholds -- any setting where the incentive structure changes discretely at a known point.

  2. The running variable is continuous and finely measured. Income, revenue, asset holdings, or other variables measured with sufficient precision to construct a histogram with fine bins.

  3. You want to estimate a behavioral elasticity. Bunching directly identifies how responsive agents are to changes in incentive structures, which is the key policy parameter for welfare analysis and optimal policy design.

  4. You have a large sample. Bunching requires enough observations in each bin to construct a reliable density estimate. Administrative data (tax records, social insurance registers) are ideal.

Do NOT Use Bunching When:

  1. The threshold is not known to agents. If agents are unaware of the kink or notch, they cannot respond to it, and bunching tells you about information, not preferences.

  2. The running variable is coarsely measured or discrete. If income is reported in broad brackets or round numbers only, the histogram is too coarse to estimate a smooth counterfactual density.

  3. Round-number bunching confounds the threshold. If the kink point coincides with a salient round number, behavioral bunching at the kink and round-number bunching are confounded.

  4. You want to estimate effects on a different outcome. Bunching estimates the elasticity of the running variable. If you want the causal effect on, say, labor hours or firm investment, you need RDD or RKD instead.

Connection to Other Methods

Bunching estimation is related to several other causal inference methods:

  • Regression Kink Design (RKD): Both exploit kink points, but RKD estimates the causal effect of a kink-assigned treatment on an outcome (e.g., the effect of unemployment insurance generosity on unemployment duration), while bunching estimates the elasticity of the running variable itself (e.g., how much reported income responds to the tax rate change). RKD requires a smooth conditional outcome expectation; bunching requires a smooth counterfactual density.

  • Regression Discontinuity (RDD): RDD exploits a cutoff where treatment status jumps. In bunching, the "treatment" is the change in incentives, and the response is the distortion of the density. The McCrary density test in RDD checks for manipulation (bunching) as a threat; in bunching estimation, the density distortion is the estimand.

  • Difference-in-Differences (DiD): Some bunching applications use a DiD-like comparison, exploiting tax reforms that change the location or size of kink points. By comparing bunching before and after a reform, researchers can difference out round-number effects and other non-behavioral density features.

  • Instrumental Variables (IV): Bunching can be viewed as a distributional version of IV. The kink or notch is the "instrument" -- an exogenous change in the budget set -- and the excess mass is the "first stage" response in the density of the running variable.


B. Identification

For bunching estimation to provide valid elasticity estimates, three key assumptions must hold (Saez, 2010).

Assumption 1: Smooth Counterfactual Density

Plain language: In the absence of the threshold, the density of the running variable would be smooth through the kink or notch point. There would be no "natural" pile-up or gap at the threshold -- any excess mass or missing mass is caused by behavioral responses to the threshold.

Formally: The counterfactual density h_0(z) is continuously differentiable in a neighborhood of the threshold z^*. This rules out round-number bunching, institutional features, or other non-behavioral reasons for density discontinuities at the threshold.

This assumption is tested by checking for bunching at placebo thresholds -- points in the income distribution where no tax rate change occurs. If similar bunching appears at placebo kinks, the identifying assumption is likely violated.

Assumption 2: No Other Behavioral Responses at the Threshold

Plain language: The threshold affects behavior only through the mechanism of interest (e.g., the tax rate change). There is no other policy, regulation, or incentive that kicks in at the same threshold and could independently cause bunching.

For example, if a tax kink coincides with an eligibility threshold for a government transfer program, bunching could reflect responses to the transfer, the tax change, or both. The elasticity estimate would be confounded.

Assumption 3: Known Functional Form of Response

Plain language: The standard bunching estimator assumes a specific model of individual behavior -- typically isoelastic preferences -- to map from the observed excess mass to the structural elasticity. Blomquist and Newey (2017) showed that the bunching estimator identifies the elasticity only under these functional form restrictions. Without them, the same amount of bunching is consistent with a range of elasticities.

This assumption is the most debated in the bunching literature. Sensitivity analysis should explore how the implied elasticity changes under alternative preference specifications.


C. Visual Intuition

Adjust the elasticity and tax rate change to see how bunching emerges at the kink point. The counterfactual polynomial estimates what the density would look like without the threshold; the excess mass between the observed and counterfactual densities identifies the behavioral response.

Interactive Simulation

Bunching at a Tax Kink: Excess Mass and Elasticity

Income ~ LogNormal(10, 0.5). Threshold at z* = 22,026. Tax rate change = 10pp. Elasticity = 0.4. Friction = 30%. N = 5,000.

[Figure: income histogram (16k-27k) with the observed histogram, the counterfactual density overlay, and the excess mass shaded at z*.]

Estimation Results

Estimator | β̂ | SE | 95% CI | Bias
Excess mass (B) | 116.411 | 10.789 | [95.26, 137.56] | +0.000
Normalized excess mass (b) | 1.410 | 0.141 | [1.13, 1.69] | +0.000
Implied elasticity | 13.380 | 2.007 | [9.45, 17.31] | +12.980
True β | 0.400 | -- | -- | --
Simulation controls:

  • N = 5,000 -- number of taxpayers
  • e = 0.4 -- taxable income elasticity (0 = no response, 2 = very elastic)
  • Tax change = 10% -- percentage point increase in marginal rate at the kink
  • Friction = 0.3 -- fraction of agents unable to adjust (0 = frictionless, 1 = no bunching)

Why the difference?

With elasticity e = 0.4 and a tax rate change of 10 percentage points, taxpayers with incomes slightly above the threshold z* reduce their reported income to z*. This creates visible excess mass (bunching) at the threshold. The counterfactual polynomial (fitted to bins outside the bunching region and interpolated through it) provides the baseline density. The normalized excess mass b = 1.41 maps to an implied elasticity of 13.38.
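The bunching mechanism in this simulation can be reproduced in a few lines. The sketch below assumes the same lognormal incomes and an isoelastic bunching rule, with a fixed fraction of agents unable to adjust; all parameter values mirror the panel and are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameters mirroring the panel above (illustrative)
n, e, friction = 5_000, 0.4, 0.30
t0, t1 = 0.10, 0.20                      # a 10pp rate increase at the kink
zstar = float(np.exp(10))                # ~22,026, as in the panel

z0 = rng.lognormal(mean=10, sigma=0.5, size=n)   # counterfactual incomes
dz = zstar * (((1 - t0) / (1 - t1)) ** e - 1)    # marginal buncher's distance

z = z0.copy()
would_bunch = (z0 > zstar) & (z0 <= zstar + dz)
can_adjust = rng.random(n) > friction            # 30% face binding frictions
z[would_bunch & can_adjust] = zstar              # only unconstrained agents bunch

share_at_kink = float(np.mean(z == zstar))
print(f"{share_at_kink:.3%} of agents locate exactly at z*")
```

Raising the friction parameter shrinks the spike at z* without changing the underlying elasticity -- which is exactly why the naive estimator is attenuated by frictions.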


D. Mathematical Derivation

The Bunching Estimator

Don't worry about the notation yet -- here's what this means in words: this derivation links the excess mass at a kink point to the compensated elasticity of taxable income, under isoelastic preferences.

Setup. Consider a kink point at z^* where the marginal tax rate increases from t_0 to t_1 (with t_1 > t_0). Under the new (higher) marginal rate, an agent who would have chosen income z_0 > z^* in the absence of the kink may reduce income to exactly z^* if the utility gain from bunching exceeds the utility cost of the income reduction.

Step 1: Define the marginal buncher. The marginal buncher is the agent who, absent the kink, would have chosen income z^* + \Delta z^* and is just indifferent between bunching at z^* and staying at z^* + \Delta z^*. All agents with counterfactual income in [z^*, z^* + \Delta z^*] will bunch at z^*.

Step 2: Compute excess mass. The excess mass at the kink is:

B = \int_{z^*}^{z^* + \Delta z^*} h_0(z)\, dz \approx h_0(z^*) \cdot \Delta z^*

The normalized excess mass is b = B / h_0(z^*) \approx \Delta z^*.

Step 3: Link to the elasticity. Under isoelastic preferences with compensated elasticity e, the marginal buncher satisfies:

\Delta z^* = z^* \left[ \left(\frac{1 - t_0}{1 - t_1}\right)^e - 1 \right]

For a small tax rate change, this simplifies to:

b \approx \Delta z^* \approx e \cdot z^* \cdot \frac{t_1 - t_0}{1 - t_0}

Step 4: Solve for the elasticity.

\hat{e} = \frac{b}{z^*} \cdot \frac{1 - t_0}{t_1 - t_0}

Equivalently:

\hat{e} = \frac{b}{\log(1 - t_0) - \log(1 - t_1)} \cdot \frac{1}{z^*}

This expression is the standard bunching elasticity estimator for kink points. The estimator is a sample analog: estimate b from the data, plug in the known tax rates, and compute \hat{e}.
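A quick numeric check of Steps 3-4, using a small illustrative kink: the exact isoelastic condition and its small-change approximation should nearly coincide, and plugging b = Δz* back into either version of the estimator should recover the assumed elasticity.

```python
import math

# Illustrative small kink: rate rises 10% -> 12% at zstar = 10,000; e = 0.4
zstar, t0, t1, e = 10_000, 0.10, 0.12, 0.4

# Step 3: exact isoelastic condition vs. the small-change approximation
dz_exact = zstar * (((1 - t0) / (1 - t1)) ** e - 1)
dz_approx = e * zstar * (t1 - t0) / (1 - t0)
print(round(dz_exact, 1), round(dz_approx, 1))   # ~90.3 vs ~88.9

# Step 4: plugging b = dz back into either estimator recovers e ~ 0.4
b = dz_exact
e_hat1 = (b / zstar) * (1 - t0) / (t1 - t0)
e_hat2 = b / (zstar * (math.log(1 - t0) - math.log(1 - t1)))
print(round(e_hat1, 3), round(e_hat2, 3))
```

The two estimates differ slightly because the linearized formula is only exact in the limit of a small tax change; for large notch-style changes the gap widens.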

Don't worry about the notation yet -- here's what this means in words: this derivation shows how notch bunching differs from kink bunching and why notches produce stronger responses, following Kleven and Waseem (2013).

Setup. At a notch, the average tax rate (not just the marginal rate) jumps at z^*. An agent earning z^* + \epsilon pays discretely more in taxes than an agent earning z^* - \epsilon. This jump creates a "dominated region" above z^* where no rational agent should locate.

Step 1: Define the dominated region. For agents just above z^*, the discrete tax increase means that earning z^* + \epsilon yields strictly less after-tax income than earning z^*. The dominated region extends from z^* to z^* + \Delta z^*_{notch}, where \Delta z^*_{notch} solves the indifference condition. The upper bound of the dominated region depends on the elasticity and the notch size.

Step 2: Key difference from kinks. At a kink, agents smoothly shade their income down to z^*. At a notch, ALL agents in the dominated region should jump to exactly z^* (or above the dominated region). This produces:

  • A sharp spike at z^* (bunching)
  • A "hole" (missing mass) just above z^*, extending through the dominated region
  • The excess mass must equal the missing mass (integration constraint)

Step 3: Elasticity from notch bunching. The elasticity is identified from the width of the dominated region, which is observed as the upper bound of the missing-mass region. The formula involves inverting the indifference condition and is detailed in Kleven and Waseem (2013).

Key advantage of notches: Because the dominated region is directly observable in the data, notch bunching provides a tighter bound on the elasticity than kink bunching. The integration constraint -- excess mass equals missing mass -- serves as an internal consistency check.
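For intuition, the width of the strictly dominated region is easy to compute in the simplest case of a pure lump-sum notch -- a fixed extra tax dT for crossing z^*. The full Kleven-Waseem response \Delta z^*_{notch} additionally depends on the elasticity; the values below are illustrative.

```python
# Dominated region above a pure lump-sum notch: crossing zstar triggers an
# extra tax dT on top of a proportional rate t. For z in (zstar, z_D), an
# agent supplies more effort but takes home strictly less than at zstar,
# so locating there is dominated for any agent who dislikes effort.
zstar, t, dT = 50_000, 0.05, 1_000   # illustrative values

z_D = zstar + dT / (1 - t)           # solves zstar*(1-t) = z_D*(1-t) - dT
print(round(z_D, 2))                 # ~51,052.63
```

In data, the empirical analog of z_D is the upper edge of the observed missing-mass region, which is what makes notch elasticities identifiable from the hole's width.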


E. Implementation

Bunching Estimation: Step by Step

library(bunching)

# ---- Step 1: Set parameters ----
zstar    <- 10000   # kink point (threshold)
binwidth <- 500     # bin width
poly     <- 7       # polynomial order

# ---- Step 2: Run bunching estimator ----
result <- bunchit(
  z_vector  = df$income,
  zstar     = zstar,
  binwidth  = binwidth,
  bins_l    = 20,    # bins to the left of zstar in the estimation window
  bins_r    = 20,    # bins to the right of zstar
  poly      = poly,
  t0        = 0.10,  # marginal tax rate below the kink (illustrative)
  t1        = 0.12,  # marginal tax rate above the kink (illustrative)
  notch     = FALSE  # kink design, not a notch
)

# ---- Step 3: Extract key results ----
summary(result)
# B     = excess mass
# b     = normalized excess mass
# e     = implied elasticity
# B_se  = bootstrap SE

F. Diagnostics

F.1 Placebo Tests at Non-Kink Points

Apply the same bunching estimation procedure at income levels where no kink or notch exists. If you find significant "bunching" at placebo points, the smooth counterfactual assumption may be violated -- round-number effects, institutional features, or other non-behavioral factors are producing density irregularities that the polynomial cannot capture.

F.2 Integration Constraint (Notch Designs)

For notch bunching, the excess mass at the threshold must equal the missing mass above it. This constraint is an internal consistency check: agents who bunch at the notch must come from somewhere (the dominated region above the notch). If excess mass exceeds missing mass, the model is misspecified -- perhaps the dominated region boundaries are wrong, or there is bunching from agents who were initially below the threshold.

F.3 Polynomial Order Sensitivity

Plot the estimated excess mass B^\hat{B} as a function of the polynomial order pp for p = 3, 4, 5, 6, 7, 8, 9, 10. The estimate should stabilize for moderate p values. If the excess mass changes substantially as p increases, the counterfactual is sensitive to functional form and the estimate is fragile. Low-order polynomials (p below 5) may be too inflexible; high-order polynomials (p above 9) may overfit.

F.4 Bunching Window Sensitivity

The choice of the excluded region [zδL,z+δR][z^* - \delta_L, z^* + \delta_R] affects the estimate. Too narrow: the polynomial is fit using bins that are part of the bunching response, biasing the counterfactual downward and understating excess mass. Too wide: the polynomial has fewer data points to fit and the interpolation is less precise. Report results for a range of window widths.

F.5 Visual Inspection

Always produce two plots:

  1. Histogram with counterfactual overlay: the observed binned density as bars, the polynomial counterfactual as a dashed line, and the bunching region shaded. The excess mass should be visually apparent.
  2. Sensitivity plots: excess mass and implied elasticity as functions of polynomial order and bunching window width.
# Sensitivity to polynomial order
poly_range <- 5:9
B_by_poly <- sapply(poly_range, function(p) {
  res <- bunchit(z_vector = df$income, zstar = zstar,
                 binwidth = binwidth, bins_l = 20,
                 bins_r = 20, poly = p,
                 t0 = 0.10, t1 = 0.12,  # rates as in the main specification
                 notch = FALSE)
  res$B
})

plot(poly_range, B_by_poly, type = "b", pch = 19,
     xlab = "Polynomial Order", ylab = "Excess Mass (B)",
     main = "Sensitivity to Polynomial Order")

# Placebo test at a non-kink point (no rate change at 15,000)
placebo_zstar <- 15000
placebo <- bunchit(z_vector = df$income,
                   zstar = placebo_zstar,
                   binwidth = binwidth, bins_l = 20,
                   bins_r = 20, poly = 7,
                   t0 = 0.10, t1 = 0.12, notch = FALSE)
cat("Placebo B:", placebo$B, "\n")

Reading the Output

The key outputs from a bunching estimation are:

Statistic | Symbol | Interpretation
Excess mass | \hat{B} | Number of "extra" observations at the threshold above what a smooth density would predict
Normalized excess mass | \hat{b} | \hat{B} divided by the counterfactual density height at the threshold; comparable across settings
Implied elasticity | \hat{e} | The compensated elasticity of the running variable with respect to the net-of-tax rate
Bootstrap SE | SE(\hat{B}) | Standard error from resampling residuals of the polynomial fit

What to Report

A well-reported bunching analysis should include:

  1. The histogram with counterfactual overlay and bunching region shaded
  2. Excess mass B^\hat{B} (or normalized b^\hat{b}) with bootstrap confidence interval
  3. Implied elasticity e^\hat{e} with standard error
  4. Polynomial order used and sensitivity across orders 5-9
  5. Bunching window boundaries and sensitivity to window width
  6. Bin width used and sensitivity
  7. Placebo tests at non-kink points
  8. Discussion of frictions -- are the estimates naive or friction-corrected?

G. What Can Go Wrong

Assumption Failure Demo

Round-Number Bunching Confounds the Threshold

A tax kink at $12,750 -- a non-round number with no special salience. Any excess density at this threshold is plausibly a behavioral response to the tax rate change.

Normalized excess mass b = 1.2 (SE = 0.3). Placebo tests at nearby round numbers show no comparable bunching. The implied elasticity of 0.25 is credibly identified.

Assumption Failure Demo

Ignoring Optimization Frictions

Bunching estimation with a friction-corrected model following Kleven and Waseem (2013), which accounts for the fact that many agents face adjustment costs and cannot perfectly locate at the kink

Frictionless (naive) elasticity: 0.10. Friction-corrected structural elasticity: 0.35. The friction model estimates that 60% of agents face positive adjustment costs. The naive estimator understates the true behavioral response by more than 3x.

Assumption Failure Demo

Wrong Polynomial Order Distorts the Counterfactual

Polynomial of order 7 fits the non-bunching region well, and results are stable across orders 5-9

B = 450 (SE = 85) with p = 7. Sensitivity: B = 420 (p=5), 445 (p=6), 450 (p=7), 455 (p=8), 460 (p=9). The estimate is robust.

Assumption Failure Demo

Bunching Window Too Narrow -- Counterfactual Contaminated

Bunching window of +/- 3 bins around the threshold, chosen to include all visible bunching. The polynomial is fit to bins outside this region, which show a smooth density.

B = 450 (SE = 85). The counterfactual polynomial fits the non-bunching region well (R-squared = 0.97). Visual inspection confirms the excluded region captures all excess mass.


H. Practice

H.1 Concept Checks

Concept Check

A researcher estimates normalized excess mass of b = 2.0 at a kink where the marginal tax rate increases from 15% to 25%. She computes the elasticity as e = b / (t1 - t0) = 2.0 / 0.10 = 20. Is this calculation correct?

Concept Check

Self-employed taxpayers show much more bunching at tax kinks than wage earners. Does this necessarily mean the self-employed have a higher elasticity of taxable income?

Concept Check

A researcher applies bunching estimation at a tax notch and finds large excess mass at the threshold but no visible 'hole' (missing mass) above it. She reports a large implied elasticity. What is wrong?

H.2 Guided Exercise

Guided Exercise

Interpreting Bunching Output from an EITC Study

You study bunching at the first EITC kink point, where the marginal tax rate changes from -40% (subsidy) to 0% (phase-in range to flat range) at \$7,340 of earned income. You use IRS administrative data (2010-2015, N = 8.2 million returns), \$250 bins, a 7th-order polynomial, and a bunching window of +/- \$1,500. Your output:

Statistic | Value
Excess mass (B) | 142,500
Counterfactual height | 48,200
Normalized excess mass | 2.96
Bootstrap SE (B) | 18,300
Polynomial order | 7
Bin width | 250

Sensitivity (polynomial order): p=5: B=138,000 | p=6: B=141,200 | p=7: B=142,500 | p=8: B=143,100 | p=9: B=143,800
Sensitivity (window width): +/-1000: B=128,400 | +/-1500: B=142,500 | +/-2000: B=148,200 | +/-2500: B=151,000
Placebo at 10,000 (no kink): B = 8,200 (SE = 12,400, p = 0.51)
Placebo at 15,000 (no kink): B = -2,100 (SE = 11,800, p = 0.86)

What is the normalized excess mass, and is it statistically significant?

Is the estimate robust to polynomial order and bunching window width? What do the placebo tests tell you?

Compute the implied elasticity. The net-of-tax rate changes from 1.40 (40% subsidy) to 1.00 (0% rate) at the kink.

The sample includes both self-employed and wage earners. Among the self-employed only, b = 8.4 and the implied elasticity is 4.0. Among wage earners, b = 0.3 and the elasticity is 0.14. What does this tell you about frictions?

H.3 Error Detective

Error Detective

Read the analysis below carefully and identify the errors.

A public finance researcher studies bunching at a state income tax kink where the marginal rate increases from 5% to 7% at $50,000 of taxable income. She uses $1,000 bins, a 3rd-order polynomial, and a bunching window of +/- $1,000 (1 bin on each side). She finds: "We estimate normalized excess mass of b = 4.2 (SE = 0.8, p < 0.001). The implied elasticity of taxable income is 0.85. Bunching is clearly visible in the histogram, confirming that taxpayers are highly responsive to marginal tax rate changes." She does not report polynomial sensitivity, bunching window sensitivity, or placebo tests.

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

A health economist studies bunching at a Medicare billing threshold. Hospitals receive higher reimbursement for patients classified as DRG weight >= 1.5 compared to < 1.5. She applies the Saez (2010) kink bunching methodology and estimates: "Using the bunching estimator, we find significant excess mass of hospital cases just above the 1.5 DRG threshold (b = 3.1, SE = 0.6). This implies a price elasticity of case classification of 0.55." She uses a 7th-order polynomial and reports sensitivity to polynomial order (stable from 5-9).

Select all errors you can find:

H.4 You Are the Referee

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors estimate the elasticity of taxable income using bunching at a state income tax kink where the marginal rate increases from 4% to 6% at \$75,000 of adjusted gross income. Using state tax returns for 2015-2019 (N = 1.8 million returns), they estimate normalized excess mass of b = 1.85 and an implied compensated elasticity of 0.42. They use \$500 bins, a 4th-order polynomial, and a bunching window of +/- \$2,000. They do not report sensitivity to polynomial order, window width, or bin size. No placebo tests are conducted.

Key Table

Variable | Coefficient | SE | p-value
Excess mass (B) | 12,400 | 2,800 | <0.001
Normalized b | 1.85 | 0.42 | <0.001
Implied elasticity | 0.42 | 0.10 | <0.001
Polynomial order | 4 | -- | --
Bin width | 500 | -- | --
Window | +/- 2,000 | -- | --
N (returns) | 1,800,000 | -- | --

Authors' Identification Claim

The authors argue that the kink in the tax schedule creates a well-defined change in the marginal tax rate, and that excess bunching at the kink reveals taxpayers' behavioral response to the rate change.


I. Swap-In: When to Use Something Else

  • Regression Kink Design (RKD): when you want to estimate the causal effect of a kink-determined treatment on a downstream outcome (e.g., the effect of unemployment insurance on job search duration), rather than the elasticity of the running variable itself.

  • Regression Discontinuity (RDD): when treatment status (not just the incentive slope) changes at the threshold. RDD estimates the local treatment effect; bunching estimates the behavioral elasticity. If you have a notch, you can use RDD for the outcome effect and bunching for the elasticity -- they complement each other.

  • Instrumental Variables (IV): when you want to estimate a behavioral parameter but do not have a density-based approach -- for example, using variation in tax rates across jurisdictions as instruments for reported income.

  • Structural estimation: when frictions are severe and you need a fully parametric model of the decision problem to identify the elasticity. The Kleven and Waseem (2013) model is a hybrid: it combines reduced-form bunching with a structural friction distribution.


J. Reviewer Checklist

Critical Reading Checklist

Paper Library

Foundational (3)

Saez, E. (2010). Do Taxpayers Bunch at Kink Points?

American Economic Journal: Economic Policy. DOI: 10.1257/pol.2.3.180

Saez introduced the modern bunching methodology by examining taxpayer responses to kink points in the US income tax schedule, where marginal tax rates change discretely. He showed how to estimate the compensated elasticity of reported income from the excess mass of taxpayers at kink points relative to a smooth counterfactual density fitted by polynomial. The paper established the standard empirical approach: bin the data, fit a polynomial excluding the bunching region, and compute the excess mass. He found modest elasticities overall but sharp bunching among the self-employed near the first EITC kink.

Kleven, H. J., & Waseem, M. (2013). Using Notches to Uncover Optimization Frictions and Structural Elasticities: Theory and Evidence from Pakistan.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjt004

Kleven and Waseem extended bunching estimation from kinks to notches -- discrete jumps in the tax schedule where the average tax rate changes discontinuously. They developed a structural framework that distinguishes between frictionless and frictional bunching, showing that optimization frictions attenuate observed bunching and cause the naive estimator to understate the true elasticity. Their model identifies both the structural elasticity and the friction distribution from the observed bunching pattern. Applied to Pakistan's income tax notches, they demonstrated that frictions are empirically important and that ignoring them substantially biases elasticity estimates downward.

Blomquist, S., & Newey, W. K. (2017). The Bunching Estimator Cannot Identify the Taxable Income Elasticity.

NBER Working Paper No. 24136. DOI: 10.3386/w24136

Blomquist and Newey provide a critical examination of the identification assumptions underlying bunching estimation. They show that the standard bunching estimator identifies the elasticity only under strong assumptions about the functional form of the counterfactual density and the distribution of preferences. Without these assumptions, the amount of bunching is consistent with a range of elasticities. The paper sparked an important methodological debate about what bunching can and cannot identify, and motivated subsequent work on tightening identification in bunching designs.

Application (1)

Chetty, R., Friedman, J. N., Olsen, T., & Pistaferri, L. (2011). Adjustment Costs, Firm Responses, and Micro vs. Macro Labor Supply Elasticities: Evidence from Danish Tax Records.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjr013

Chetty, Friedman, Olsen, and Pistaferri used Danish administrative tax data to reconcile the gap between micro and macro labor supply elasticities using bunching methods. They showed that adjustment frictions explain why micro estimates from bunching at tax kinks are small: many workers cannot freely adjust hours, so observed bunching understates the frictionless elasticity. They estimated that accounting for frictions raises the implied elasticity substantially. The paper is a landmark application of bunching to the micro-macro elasticity puzzle and introduced key methods for dealing with frictions in bunching designs.

Survey (1)

Kleven, H. J. (2016). Bunching.

Annual Review of Economics. DOI: 10.1146/annurev-economics-080315-015234

Kleven provides a comprehensive survey of the bunching methodology, covering both kink and notch designs, the role of optimization frictions, and extensions to multiple applications beyond taxation. The survey unifies the theoretical frameworks from Saez (2010) and Kleven and Waseem (2013), discusses practical implementation issues (polynomial order, bandwidth, bin width), and catalogs the growing literature applying bunching to estimate behavioral elasticities in public finance, labor economics, and regulation. Essential reading for anyone starting with bunching methods.

Tags

design-based · density · elasticity · public-finance