MethodAtlas

Bunching Estimation

Identifies behavioral responses from excess mass at notch or kink points in budget sets, estimating elasticities from distributional distortions.

When to Use: When agents bunch at a threshold in the assignment variable and you want to estimate the elasticity of behavioral response.
Assumption: The counterfactual distribution (without the threshold) is smooth and can be approximated by a polynomial fitted to the observed distribution excluding the bunching region.
Mistake: Confusing notches (discrete jumps) with kinks (slope changes). Notches create much stronger incentives to bunch, so applying the kink formula to notch bunching overstates the elasticity. Also, ignoring optimization frictions that attenuate observed bunching.

One-Line Implementation

R: library(bunching); bunchit(z_vector = income, zstar = threshold, binwidth = 500, bins_l = 20, bins_r = 20, poly = 7, t0 = 0, t1 = 0.25)
Stata: bunching income, kink(threshold) binwidth(500) poly(7) bandwidth(10000)
Python: # Manual: fit polynomial to histogram bins excluding bunching region, integrate excess mass


Motivating Example: Taxpayer Responses at Income Tax Kinks

An economist wants to estimate how much taxpayers adjust their reported income in response to marginal tax rate changes. The US income tax schedule has kinks: thresholds where the marginal tax rate jumps discretely. For example, at the boundary of the 10% and 12% brackets, the marginal rate increases by 2 percentage points. If taxpayers are responsive, they should cluster their reported income just below the kink to avoid the higher rate.

Here is the problem: the economist cannot run an experiment. She cannot randomly assign taxpayers to different kink points. She cannot compare taxpayers at different income levels, because high-income and low-income taxpayers differ in countless unobservable ways. A simple regression of income on tax rates would be hopelessly confounded by ability, education, occupation, and preferences.

The bunching approach, developed by Saez (2010), solves this problem by exploiting the shape of the income distribution around the kink. If taxpayers are unresponsive to taxes, the income distribution should be smooth through the kink point, because nothing else would create a discontinuity in the density there. But if taxpayers respond by reducing their reported income to stay below the kink, we should see an excess mass of taxpayers bunching at the kink relative to a smooth counterfactual density.

The key insight is that the amount of excess bunching is proportional to the behavioral elasticity. More bunching means taxpayers are more responsive. The researcher estimates the counterfactual density by fitting a polynomial to the observed income histogram, excluding the bins near the kink, and then interpolating through the excluded region. The difference between the observed density and the counterfactual in the bunching region measures the excess mass, from which the elasticity is recovered.

This approach has been extended from kinks (where the marginal rate changes) to notches (where the average tax rate jumps discretely), where the incentive to bunch is much stronger and the implied elasticities can be identified more precisely (Kleven & Waseem, 2013).


A. Overview

What Bunching Estimation Does

Bunching estimation identifies behavioral responses to policy thresholds by measuring the excess mass of observations clustered at a kink or notch point relative to a smooth counterfactual density. The method proceeds in four steps:

  1. Bin the data: create a histogram of the running variable (e.g., reported income) with fine bins
  2. Exclude the bunching region: remove bins near the threshold where bunching occurs
  3. Fit a counterfactual: estimate a polynomial through the remaining bins to approximate the density that would have prevailed without the threshold
  4. Compute excess mass: integrate the difference between the observed density and the counterfactual over the bunching region
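The four steps can be sketched in Python on simulated data. This is a minimal illustration rather than the procedure of any particular package: the lognormal population, the 5,000 simulated bunchers, the exclusion window of one bin on each side, and the fifth-order polynomial are all assumptions chosen for the example, and no integration constraint is imposed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: kink at z* = 10,000 and bins of width 500.
zstar, binwidth = 10_000, 500

# Simulated incomes: a smooth lognormal population plus bunchers placed at the kink.
smooth = rng.lognormal(mean=np.log(12_000), sigma=0.5, size=200_000)
bunchers = rng.normal(loc=zstar, scale=100, size=5_000)
income = np.concatenate([smooth, bunchers])

# Step 1: bin the data, centring one bin on the kink (10 bins on each side).
edges = np.arange(zstar - 10.5 * binwidth, zstar + 11.5 * binwidth, binwidth)
counts, _ = np.histogram(income, bins=edges)
centers = (edges[:-1] + edges[1:]) / 2
rel = (centers - zstar) / binwidth  # bin position relative to the kink, in bins

# Step 2: exclude the bunching region (the kink bin and one bin on either side).
excluded = np.abs(rel) <= 1

# Step 3: fit a polynomial counterfactual to the remaining bins.
coefs = np.polyfit(rel[~excluded], counts[~excluded], deg=5)
counterfactual = np.polyval(coefs, rel)

# Step 4: excess mass = observed minus counterfactual counts in the excluded bins.
B = float(np.sum(counts[excluded] - counterfactual[excluded]))
h0 = float(np.polyval(coefs, 0.0))  # counterfactual height at the kink
b = B / h0                          # normalized excess mass, measured in bins
print(f"B = {B:.0f}, b = {b:.2f} bins")
```

Because the bunchers are simulated, the estimate of B can be compared with the known number added (5,000), which is a convenient way to check a hand-rolled implementation before applying it to real data.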

The excess mass $B$, or the normalized excess mass $b = B / h_0(z^*)$, where $h_0(z^*)$ is the counterfactual density height at the threshold, is the key estimand. Because $B \approx h_0(z^*) \cdot \Delta z^*$, the normalized excess mass approximates the bunching range directly: $b \approx \Delta z^*$. The implied elasticity is then:

$$\hat{e} = \frac{\Delta z^*}{z^* \cdot [\log(1 - t_0) - \log(1 - t_1)]}$$

where $\Delta z^* \approx b$ is the bunching range in income units, $z^*$ is the kink point, and $t_0$ and $t_1$ are the marginal tax rates below and above the kink. Note: when working with binned data, $b = B / h_0$ is in bins and must be multiplied by the bin width to convert to income units before applying this formula.
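The unit conversion is easy to get wrong, so a short worked example may help. The function name `kink_elasticity` and all input values below are hypothetical; the function simply applies the log formula above after converting the normalized excess mass from bins to income units.

```python
import math

def kink_elasticity(b_bins: float, binwidth: float, zstar: float,
                    t0: float, t1: float) -> float:
    """Log-form kink elasticity; b_bins is the normalized excess mass in bins."""
    delta_z = b_bins * binwidth  # convert bins to income units
    return delta_z / (zstar * (math.log(1 - t0) - math.log(1 - t1)))

# Hypothetical inputs: b = 2 bins of width 500 at a kink of 10,000,
# with the marginal rate rising from 0% to 25%.
e_hat = kink_elasticity(b_bins=2.0, binwidth=500, zstar=10_000, t0=0.0, t1=0.25)
print(round(e_hat, 3))  # 0.348
```

Forgetting the bin-width conversion here would divide 2 rather than 1,000 by the denominator and shrink the estimate by a factor of 500.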

Kinks vs. Notches

Different types of thresholds produce different bunching patterns:

  • Kink point: the marginal tax rate changes (the slope of the tax schedule changes). Bunching is typically modest because the incentive to adjust is proportional to the tax rate change. The income distribution shows a spike at the kink and reduced density (missing mass) in the region just above it — agents from this region have relocated to the kink point. Unlike a notch, the density above a kink is reduced but not zero.

  • Notch point: the average tax rate jumps discontinuously (the level of the tax schedule jumps). The incentive to bunch is much stronger — there is a region above the notch where no rational agent should locate (the "dominated region"). The income distribution shows both a spike at the notch and a "hole" (missing mass) above it. Kleven and Waseem (2013) developed the formal framework for notch bunching.

The Counterfactual Density

The counterfactual density is the distribution that would have prevailed in the absence of the threshold. It is estimated by fitting a polynomial of order $p$ (typically $p = 5$ to $9$) to the binned frequency counts, excluding bins within the bunching window $[z^* - \delta_L, z^* + \delta_R]$. The polynomial is then used to interpolate through the excluded region, and an integration constraint ensures that the total mass is preserved. For notch designs, this takes the form of a strict testable condition: the excess mass at the threshold must equal the missing mass in the dominated region above it. For kink designs, mass conservation is imposed as part of the polynomial fitting procedure.

How It Differs from RKD

Bunching estimation and the regression kink design (RKD) both exploit kink points, but they answer different questions with different identifying assumptions:

  • RKD estimates the causal effect of a treatment on an outcome using the kink in the treatment assignment function. It requires that the conditional expectation of the outcome is smooth through the kink.
  • Bunching estimates the elasticity of the running variable itself (e.g., reported income) with respect to the tax rate change at the kink. It requires that the density of the running variable would be smooth in the absence of the kink.


When to Use

  1. Agents face a clear threshold in their budget set. Tax kinks, notch points, regulatory cutoffs, benefit eligibility thresholds — any setting where the incentive structure changes discretely at a known point.

  2. The running variable is continuous and finely measured. Income, revenue, asset holdings, or other variables measured with sufficient precision to construct a histogram with fine bins.

  3. You want to estimate a behavioral elasticity. Bunching directly identifies how responsive agents are to changes in incentive structures, which is the key policy parameter for welfare analysis and optimal policy design.

  4. You have a large sample. Bunching requires enough observations in each bin to construct a reliable density estimate. Administrative data (tax records, social insurance registers) are ideal.

Do NOT Use Bunching When:

  1. The threshold is not known to agents. If agents are unaware of the kink or notch, they cannot respond to it, and bunching tells you about information, not preferences.

  2. The running variable is coarsely measured or discrete. If income is reported in broad brackets or round numbers only, the histogram is too coarse to estimate a smooth counterfactual density.

  3. Round-number bunching confounds the threshold. If the kink point coincides with a salient round number, behavioral bunching at the kink and round-number bunching are confounded.

  4. You want to estimate effects on a different outcome. Bunching estimates the elasticity of the running variable. If you want the causal effect on, say, labor hours or firm investment, you need RDD or RKD instead.

Connection to Other Methods

Bunching estimation is related to several other causal inference methods:

  • Regression Kink Design (RKD): Both exploit kink points, but RKD estimates the causal effect of a kink-assigned treatment on an outcome (e.g., the effect of unemployment insurance generosity on unemployment duration), while bunching estimates the elasticity of the running variable itself (e.g., how much reported income responds to the tax rate change). RKD requires a smooth conditional outcome expectation; bunching requires a smooth counterfactual density.

  • Regression Discontinuity (RDD): RDD exploits a cutoff where treatment status jumps. In bunching, the "treatment" is the change in incentives, and the response is the distortion of the density. The McCrary density test in RDD checks for manipulation (bunching) as a threat; in bunching estimation, the density distortion is the estimand.

  • Difference-in-Differences (DiD): Some bunching applications use a DiD-like comparison, exploiting tax reforms that change the location or size of kink points. By comparing bunching before and after a reform, researchers can difference out round-number effects and other non-behavioral density features.

  • Instrumental Variables (IV): Bunching can be viewed as a distributional version of IV. The kink or notch is the "instrument" — an exogenous change in the budget set — and the excess mass is the "first stage" response in the density of the running variable.


B. Identification

For bunching estimation to provide valid elasticity estimates, three key assumptions must hold (Saez, 2010).

Assumption 1: Smooth Counterfactual Density

Plain language: In the absence of the threshold, the density of the running variable would be smooth through the kink or notch point. There would be no "natural" pile-up or gap at the threshold — any excess mass or missing mass is caused by behavioral responses to the threshold.

Formally: The counterfactual density $h_0(z)$ is continuously differentiable in a neighborhood of the threshold $z^*$. This rules out round-number bunching, institutional features, or other non-behavioral reasons for density discontinuities at the threshold.

This assumption is tested by checking for bunching at placebo thresholds — points in the income distribution where no tax rate change occurs. If similar bunching appears at placebo kinks, the identifying assumption is likely violated.

Assumption 2: No Other Behavioral Responses at the Threshold

Plain language: The threshold affects behavior only through the mechanism of interest (e.g., the tax rate change). There is no other policy, regulation, or incentive that kicks in at the same threshold and could independently cause bunching.

For example, if a tax kink coincides with an eligibility threshold for a government transfer program, bunching could reflect responses to the transfer, the tax change, or both. The elasticity estimate would be confounded.

Assumption 3: Known Functional Form of Response

Plain language: The standard bunching estimator assumes a specific model of individual behavior — typically isoelastic preferences — to map from the observed excess mass to the structural elasticity. Blomquist et al. (2021) showed that the bunching estimator identifies the elasticity only under these functional form restrictions. Without them, the same amount of bunching is consistent with a range of elasticities.

This assumption is the most debated in the bunching literature. Sensitivity analysis should explore how the implied elasticity changes under alternative preference specifications.


C. Visual Intuition

As the elasticity or the tax rate change increases, bunching at the kink point becomes more pronounced. The counterfactual polynomial estimates what the density would look like without the threshold; the excess mass between the observed and counterfactual densities identifies the behavioral response.



D. Mathematical Derivation

The Bunching Estimator

In words: this derivation links the excess mass at a kink point to the compensated elasticity of taxable income, under isoelastic preferences.

Setup. Consider a kink point at $z^*$ where the marginal tax rate increases from $t_0$ to $t_1$ (with $t_1 > t_0$). Under the new (higher) marginal rate, an agent who would have chosen income $z_0 > z^*$ in the absence of the kink may reduce income to exactly $z^*$ if the utility gain from bunching exceeds the utility cost of the income reduction.

Step 1: Define the marginal buncher. The marginal buncher is the agent who, absent the kink, would have chosen income $z^* + \Delta z^*$ and is just indifferent between bunching at $z^*$ and staying at $z^* + \Delta z^*$. All agents with counterfactual income in $[z^*, z^* + \Delta z^*]$ will bunch at $z^*$.

Step 2: Compute excess mass. The excess mass at the kink is:

$$B = \int_{z^*}^{z^* + \Delta z^*} h_0(z)\, dz \approx h_0(z^*) \cdot \Delta z^*$$

The normalized excess mass is $b = B / h_0(z^*) \approx \Delta z^*$.

Step 3: Link to the elasticity. Under isoelastic preferences with compensated elasticity $e$, the marginal buncher satisfies:

$$\Delta z^* = z^* \left[ \left(\frac{1 - t_0}{1 - t_1}\right)^e - 1 \right]$$

For a small tax rate change, this expression simplifies to:

$$b \approx \Delta z^* \approx e \cdot z^* \cdot \frac{t_1 - t_0}{1 - t_0}$$

Step 4: Solve for the elasticity.

$$\hat{e} = \frac{b}{z^*} \cdot \frac{1 - t_0}{t_1 - t_0}$$

Without the small-tax-change approximation, inverting the bunching range formula exactly gives $\hat{e} = \log(1 + b / z^*) / [\log(1 - t_0) - \log(1 - t_1)]$. When the bunching range is small relative to the kink point, $\log(1 + b / z^*) \approx b / z^*$, and this becomes:

$$\hat{e} = \frac{b}{\log(1 - t_0) - \log(1 - t_1)} \cdot \frac{1}{z^*}$$

This log form and the Step 4 expression coincide when the tax rate change is small (since $\frac{t_1 - t_0}{1 - t_0} \approx \log(1 - t_0) - \log(1 - t_1)$ to first order). The log version is preferred because it remains accurate for larger tax rate changes. This expression is the standard bunching elasticity estimator for kink points. The estimator is a sample analog: estimate $b$ from the data, plug in the known tax rates, and compute $\hat{e}$.
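A small numerical comparison shows how the approximations behave. The sketch below uses hypothetical values, with $b$ already expressed in income units; `e_exact` inverts the Step 3 bunching range formula with $\Delta z^* = b$ and no further approximation, while `e_linear` and `e_log` are the two closed forms discussed above.

```python
import math

def e_linear(b, zstar, t0, t1):
    """Small-tax-change approximation: e = (b / z*) * (1 - t0) / (t1 - t0)."""
    return (b / zstar) * (1 - t0) / (t1 - t0)

def e_log(b, zstar, t0, t1):
    """Log form: e = (b / z*) / [log(1 - t0) - log(1 - t1)]."""
    return (b / zstar) / (math.log(1 - t0) - math.log(1 - t1))

def e_exact(b, zstar, t0, t1):
    """Exact inversion of dz = z* [((1 - t0) / (1 - t1))**e - 1] with dz = b."""
    return math.log(1 + b / zstar) / (math.log(1 - t0) - math.log(1 - t1))

# Hypothetical inputs: kink at 50,000 and a bunching range of 1,000.
zstar, b = 50_000, 1_000
for t0, t1 in [(0.20, 0.22), (0.20, 0.40)]:
    print(f"t0={t0}, t1={t1}: linear={e_linear(b, zstar, t0, t1):.3f}, "
          f"log={e_log(b, zstar, t0, t1):.3f}, exact={e_exact(b, zstar, t0, t1):.3f}")
```

For the two-point rate change the three values are close; for the twenty-point change the linear approximation drifts away from the log and exact forms, which is the reason the log version is usually reported.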

In words: this derivation shows how notch bunching differs from kink bunching and why notches produce stronger responses, following Kleven and Waseem (2013).

Setup. At a notch, the average tax rate (not just the marginal rate) jumps at $z^*$. An agent earning $z^* + \epsilon$ pays discretely more in taxes than an agent earning $z^* - \epsilon$. This jump creates a "dominated region" above $z^*$ where no rational agent should locate.

Step 1: Define the dominated region. For agents just above $z^*$, the discrete tax increase means that earning $z^* + \epsilon$ yields strictly less after-tax income than earning $z^*$. The width of this strictly dominated range depends only on the size of the notch. The full missing-mass region is wider: it extends from $z^*$ to $z^* + \Delta z^*_{notch}$, where $\Delta z^*_{notch}$ solves the marginal buncher's indifference condition, and its upper bound depends on the elasticity and the notch size.

Step 2: Key difference from kinks. At a kink, agents smoothly shade their income down to $z^*$. At a notch, all agents who would otherwise locate in this region should move to $z^*$, leaving it empty. This behavior produces:

  • A sharp spike at $z^*$ (bunching)
  • A "hole" (missing mass) just above $z^*$, extending through the dominated region
  • The excess mass must equal the missing mass (integration constraint)

Step 3: Elasticity from notch bunching. The elasticity is identified from the upper bound of the missing-mass region, $z^* + \Delta z^*_{notch}$, which marks the counterfactual income of the marginal buncher. The formula involves inverting the indifference condition and is detailed in Kleven and Waseem (2013).

Key advantage of notches: Because the missing-mass region is directly visible in the data, notch bunching provides a tighter bound on the elasticity than kink bunching. The integration constraint (excess mass equals missing mass) serves as an internal consistency check.
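To make the notion of a dominated range concrete, the sketch below assumes a proportional notch: the tax rate on all income rises from $t$ to $t + \Delta t$ once income exceeds $z^*$. Under that assumption, the range in which earning more strictly lowers after-tax income follows from the notch size alone, with width $z^* \Delta t / (1 - t - \Delta t)$. The function names and parameter values are hypothetical.

```python
def after_tax(z: float, zstar: float, t: float, dt: float) -> float:
    """After-tax income under a proportional notch at zstar (assumed schedule)."""
    rate = t + dt if z > zstar else t
    return (1 - rate) * z

def dominated_width(zstar: float, t: float, dt: float) -> float:
    """Width of the range above zstar that yields less after-tax income than zstar."""
    return zstar * dt / (1 - t - dt)

# Hypothetical notch: the rate jumps from 10% to 12% at an income of 10,000.
zstar, t, dt = 10_000, 0.10, 0.02
width = dominated_width(zstar, t, dt)
upper = zstar + width
print(f"dominated range: ({zstar}, {upper:.2f})")

# Just inside the range an agent is worse off than at the notch; just outside, better off.
print(after_tax(upper - 1, zstar, t, dt) < after_tax(zstar, zstar, t, dt))
print(after_tax(upper + 1, zstar, t, dt) > after_tax(zstar, zstar, t, dt))
```

Observations that remain inside this range in real data cannot be rationalized by any preferences, which is why they are informative about optimization frictions.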


E. Implementation

Bunching Estimation: Step by Step

# Requires: bunching
# bunching: R package for kink/notch bunching estimation (Mavrokonstantis)
library(bunching)

# --- Step 1: Set estimation parameters ---
# Define the threshold, bin width, and polynomial order for the counterfactual
zstar    <- 10000   # kink point (threshold where marginal rate changes)
binwidth <- 500     # bin width for the income histogram
poly     <- 7       # polynomial order for counterfactual density (robustness: try 5-9)

# --- Step 2: Run bunching estimator ---
# bunchit() bins the data, fits a polynomial counterfactual excluding the
# bunching region, and computes the excess mass and implied elasticity
result <- bunchit(
  z_vector  = df$income,
  zstar     = zstar,
  binwidth  = binwidth,
  bins_l    = 20,    # bins to the left of zstar used in the analysis
  bins_r    = 20,    # bins to the right of zstar used in the analysis
  poly      = poly,
  t0        = 0,     # marginal tax rate below the kink
  t1        = 0.25,  # marginal tax rate above the kink
  notch     = FALSE  # set TRUE for notch designs where average rate jumps
)

# --- Step 3: Extract and interpret key results ---
# bunchit() returns a list; the key elements are:
result$B     # excess mass (count of bunchers above the counterfactual density)
result$b     # normalized excess mass (B / counterfactual height at threshold)
result$e     # implied elasticity of taxable income w.r.t. net-of-tax rate
result$B_sd  # bootstrap standard error of the excess mass estimate

F. Diagnostics

F.1 Placebo Tests at Non-Kink Points

Apply the same bunching estimation procedure at income levels where no kink or notch exists. If you find significant "bunching" at placebo points, the smooth counterfactual assumption may be violated — round-number effects, institutional features, or other non-behavioral factors are producing density irregularities that the polynomial cannot capture.

F.2 Integration Constraint (Notch Designs)

For notch bunching, the excess mass at the threshold must equal the missing mass above it. This constraint is an internal consistency check: agents who bunch at the notch must come from somewhere (the dominated region above the notch). If excess mass exceeds missing mass, the model is misspecified — perhaps the dominated region boundaries are wrong, or there is bunching from agents who were initially below the threshold.
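The constraint can be checked directly once observed and counterfactual bin counts are available. The sketch below uses made-up counts and assumes that bins at or below the notch (relative bin 0) hold the excess mass while bins above it hold the hole; the function name `mass_balance` is hypothetical.

```python
def mass_balance(rel_bins, observed, counterfactual):
    """Return (excess, missing) mass for a notch located at relative bin 0."""
    rows = list(zip(rel_bins, observed, counterfactual))
    excess = sum(o - c for r, o, c in rows if r <= 0)   # spike at and below the notch
    missing = sum(c - o for r, o, c in rows if r > 0)   # hole above the notch
    return excess, missing

# Hypothetical counts inside the excluded window around a notch.
rel_bins       = [-2, -1, 0, 1, 2, 3, 4]
observed       = [1040, 1190, 1900, 520, 640, 790, 930]
counterfactual = [1000, 1000, 1000, 1000, 1000, 1000, 1000]

excess, missing = mass_balance(rel_bins, observed, counterfactual)
ratio = excess / missing
print(excess, missing, round(ratio, 3))  # 1130 1120 1.009
```

A ratio close to one is consistent with the constraint; a ratio well above one suggests the upper bound of the hole has been set too low or that some bunchers come from elsewhere.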

F.3 Polynomial Order Sensitivity

Plot the estimated excess mass $\hat{B}$ as a function of the polynomial order $p$ for $p = 3, 4, \ldots, 10$. The estimate should stabilize for moderate values of $p$. If the excess mass changes substantially as $p$ increases, the counterfactual is sensitive to functional form and the estimate is fragile. Low-order polynomials ($p$ below 5) may be too inflexible; high-order polynomials ($p$ above 9) may overfit.

F.4 Bunching Window Sensitivity

The choice of the excluded region $[z^* - \delta_L, z^* + \delta_R]$ affects the estimate. Too narrow: the polynomial is fit using bins that still contain bunchers, which biases the counterfactual upward near the threshold and understates excess mass. Too wide: the polynomial has fewer data points to fit and the interpolation is less precise. Report results for a range of window widths.
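The effect of the window can be illustrated on simulated data in which the bunchers are deliberately diffuse. Everything in this sketch is an assumption for illustration: the lognormal population, 5,000 bunchers spread with a standard deviation of 300 around the kink, symmetric windows measured in bins, and a fifth-order polynomial.

```python
import numpy as np

rng = np.random.default_rng(2)
zstar, binwidth = 10_000, 500  # hypothetical kink and bin width

# Simulated incomes: smooth population plus diffuse bunchers around the kink.
income = np.concatenate([
    rng.lognormal(mean=np.log(12_000), sigma=0.5, size=200_000),
    rng.normal(loc=zstar, scale=300, size=5_000),
])

# 21 bins of width 500, with the middle bin centred on the kink.
edges = np.arange(zstar - 10.5 * binwidth, zstar + 11.5 * binwidth, binwidth)
counts, _ = np.histogram(income, bins=edges)
rel = np.arange(-10, 11)  # bin index relative to the kink bin

def excess_mass(half_window: int, deg: int = 5) -> float:
    """Excess mass when bins with |rel| <= half_window are excluded from the fit."""
    excluded = np.abs(rel) <= half_window
    coefs = np.polyfit(rel[~excluded], counts[~excluded], deg)
    fitted = np.polyval(coefs, rel)
    return float(np.sum(counts[excluded] - fitted[excluded]))

results = {w: excess_mass(w) for w in range(0, 4)}
for w, B in results.items():
    print(f"window +/- {w} bins: B = {B:.0f}")
```

With a window of zero bins, many bunchers fall into bins used for the fit, so the estimate falls short of the 5,000 simulated bunchers; widening the window to two bins recovers most of them.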

F.5 Visual Inspection

It is good practice to produce two plots:

  1. Histogram with counterfactual overlay: the observed binned density as bars, the polynomial counterfactual as a dashed line, and the bunching region shaded. The excess mass should be visually apparent.
  2. Sensitivity plots: excess mass and implied elasticity as functions of polynomial order and bunching window width.
# Sensitivity to polynomial order
poly_range <- 5:9
B_by_poly <- sapply(poly_range, function(p) {
  res <- bunchit(z_vector = df$income, zstar = zstar,
                 binwidth = binwidth, bins_l = 20, bins_r = 20,
                 poly = p, t0 = 0, t1 = 0.25, notch = FALSE)
  res$B
})

plot(poly_range, B_by_poly, type = "b", pch = 19,
     xlab = "Polynomial Order", ylab = "Excess Mass (B)",
     main = "Sensitivity to Polynomial Order")

# Placebo test at a non-kink point
# bins_l = 8 keeps the fit window (11000 to 25000) clear of the true kink at 10000
placebo_zstar <- 15000
placebo <- bunchit(z_vector = df$income,
                   zstar = placebo_zstar,
                   binwidth = binwidth, bins_l = 8, bins_r = 20,
                   poly = 7, t0 = 0, t1 = 0.25, notch = FALSE)
cat("Placebo B:", placebo$B, "\n")

Reading the Output

The key outputs from a bunching estimation are:

| Statistic | Symbol | Interpretation |
| --- | --- | --- |
| Excess mass | $\hat{B}$ | Number of "extra" observations at the threshold above what a smooth density would predict |
| Normalized excess mass | $\hat{b}$ | $\hat{B}$ divided by the counterfactual density height at the threshold. Comparable across settings |
| Implied elasticity | $\hat{e}$ | The compensated elasticity of the running variable with respect to the net-of-tax rate |
| Bootstrap SE | $SE(\hat{B})$ | Standard error from resampling residuals of the polynomial fit |
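The bootstrap standard error in the last row can be sketched as follows. This is one simple way to implement a residual bootstrap, and packaged estimators may differ in details; the binned counts, the exclusion window, the polynomial order, and the 200 replications are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical binned data: 21 bins centred on the kink (relative bin 0).
rel = np.arange(-10, 11)
smooth = 7000 - 15 * rel**2 + 40 * rel        # smooth counterfactual counts
counts = rng.poisson(smooth).astype(float)
counts[rel == 0] += 5000                      # bunchers piled into the kink bin
excluded = np.abs(rel) <= 1
deg = 5

def excess_mass(y):
    """Fit the polynomial outside the window; return (B, fitted counts)."""
    coefs = np.polyfit(rel[~excluded], y[~excluded], deg)
    fitted = np.polyval(coefs, rel)
    return float(np.sum(y[excluded] - fitted[excluded])), fitted

B_hat, fitted = excess_mass(counts)
resid = (counts - fitted)[~excluded]          # residuals from the non-excluded bins

boot = []
for _ in range(200):
    y_star = counts.copy()
    # Rebuild the non-excluded bins from the fit plus resampled residuals.
    y_star[~excluded] = fitted[~excluded] + rng.choice(resid, size=resid.size, replace=True)
    boot.append(excess_mass(y_star)[0])

se_B = float(np.std(boot, ddof=1))
print(f"B = {B_hat:.0f}, bootstrap SE = {se_B:.0f}")
```

The standard error here reflects only uncertainty in the fitted counterfactual; the observed counts inside the excluded window are held fixed across replications.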

What to Report

A well-reported bunching analysis should include:

  1. The histogram with counterfactual overlay and bunching region shaded
  2. Excess mass $\hat{B}$ (or normalized $\hat{b}$) with bootstrap confidence interval
  3. Implied elasticity $\hat{e}$ with standard error
  4. Polynomial order used and sensitivity across orders 5-9
  5. Bunching window boundaries and sensitivity to window width
  6. Bin width used and sensitivity
  7. Placebo tests at non-kink points
  8. Discussion of frictions — are the estimates naive or friction-corrected?

G. What Can Go Wrong


Round-Number Bunching Confounds the Threshold

A tax kink at $12,750 — a non-round number with no special salience. Any excess density at this threshold is plausibly a behavioral response to the tax rate change.

Normalized excess mass b = 1.2 (SE = 0.3). Placebo tests at nearby round numbers show no comparable bunching. The implied elasticity of 0.25 is credibly identified.


Ignoring Optimization Frictions

Bunching estimation with a friction-corrected model following Kleven and Waseem (2013), which accounts for the fact that many agents face adjustment costs and cannot perfectly locate at the kink.

Frictionless (naive) elasticity: 0.10. Friction-corrected structural elasticity: 0.35. The friction model estimates that 60% of agents face positive adjustment costs. The naive estimator understates the true behavioral response by more than 3x.


Wrong Polynomial Order Distorts the Counterfactual

A polynomial of order 7 fits the non-bunching region well, and results are stable across orders 5-9.

B = 450 (SE = 85) with p = 7. Sensitivity: B = 420 (p=5), 445 (p=6), 450 (p=7), 455 (p=8), 460 (p=9). The estimate is robust.


Bunching Window Too Narrow — Counterfactual Contaminated

Bunching window of +/- 3 bins around the threshold, chosen to include all visible bunching. The polynomial is fit to bins outside this region, which show a smooth density.

B = 450 (SE = 85). The counterfactual polynomial fits the non-bunching region well (R-squared = 0.97). Visual inspection confirms the excluded region captures all excess mass.


H. Practice

H.1 Concept Checks

Concept Check

A researcher estimates normalized excess mass of b = 2.0 at a kink where the marginal tax rate increases from 15% to 25%. She computes the elasticity as e = b / (t1 - t0) = 2.0 / 0.10 = 20. Is this calculation correct?

Concept Check

Self-employed taxpayers show much more bunching at tax kinks than wage earners. Does this necessarily mean the self-employed have a higher elasticity of taxable income?

Concept Check

A researcher applies bunching estimation at a tax notch and finds large excess mass at the threshold but no visible 'hole' (missing mass) above it. She reports a large implied elasticity. What is wrong?

H.2 Guided Exercise


Interpreting Bunching Output from an EITC Study

You study bunching at the first EITC kink point for filers with two qualifying children, where the marginal tax rate changes from -40% (subsidy) to 0% (phase-in range to flat range) at $7,340 of earned income. You use IRS administrative data (2010-2015, N = 8.2 million returns), $250 bins, a 7th-order polynomial, and a bunching window of +/- $1,500. Your output:

| Statistic | Value |
| --- | --- |
| Excess mass (B) | 142,500 |
| Counterfactual height | 48,200 |
| Normalized excess mass | 2.96 |
| Bootstrap SE (B) | 18,300 |
| Polynomial order | 7 |
| Bin width | 250 |

Sensitivity (polynomial order):

Sensitivity (window width):

Placebo at 10,000 (no kink): B = 8,200 (SE = 12,400, p = 0.51)
Placebo at 15,000 (no kink): B = -2,100 (SE = 11,800, p = 0.86)

What is the normalized excess mass, and is it statistically significant?

Is the estimate robust to polynomial order and bunching window width? What do the placebo tests tell you?

Compute the implied elasticity. The net-of-tax rate changes from 1.40 (40% subsidy) to 1.00 (0% rate) at the kink.

The sample includes both self-employed and wage earners. Among the self-employed only, b = 8.4 and the implied elasticity is 4.0. Among wage earners, b = 0.3 and the elasticity is 0.14. What does this tell you about frictions?

H.3 Error Detective


Read the analysis below carefully and identify the errors.

A public finance researcher studies bunching at a state income tax kink where the marginal rate increases from 5% to 7% at $50,000 of taxable income. She uses $1,000 bins, a 3rd-order polynomial, and a bunching window of +/- $1,000 (1 bin on each side). She finds:

"We estimate normalized excess mass of b = 4.2 (SE = 0.8, p < 0.001). The implied elasticity of taxable income is 0.85. Bunching is clearly visible in the histogram, confirming that taxpayers are highly responsive to marginal tax rate changes."

She does not report polynomial sensitivity, bunching window sensitivity, or placebo tests.


Error Detective

Read the analysis below carefully and identify the errors.

A health economist studies bunching at a Medicare billing threshold. Hospitals receive higher reimbursement for patients classified as DRG weight >= 1.5 compared to < 1.5. She applies the Saez (2010) kink bunching methodology and estimates:

"Using the bunching estimator, we find significant excess mass of hospital cases just above the 1.5 DRG threshold (b = 3.1, SE = 0.6). This estimate implies a price elasticity of case classification of 0.55."

She uses a 7th-order polynomial and reports sensitivity to polynomial order (stable from 5-9).


H.4 You Are the Referee

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors estimate the elasticity of taxable income using bunching at a state income tax kink where the marginal rate increases from 4% to 6% at $75,000 of adjusted gross income. Using state tax returns for 2015-2019 (N = 1.8 million returns), they estimate normalized excess mass of b = 1.85 and an implied compensated elasticity of 0.42. They use $500 bins, a 4th-order polynomial, and a bunching window of +/- $2,000. They do not report sensitivity to polynomial order, window width, or bin size. No placebo tests are conducted.

Key Table

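| Variable | Coefficient | SE | p-value |
| --- | --- | --- | --- |
| Excess mass (B) | 12,400 | 2,800 | <0.001 |
| Normalized b | 1.85 | 0.42 | <0.001 |
| Implied elasticity | 0.42 | 0.10 | <0.001 |
| Polynomial order | 4 | | |
| Bin width | 500 | | |
| Window | +/- 2,000 | | |
| N (returns) | 1,800,000 | | |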

Authors' Identification Claim

The authors argue that the kink in the tax schedule creates a well-defined change in the marginal tax rate, and that excess bunching at the kink reveals taxpayers' behavioral response to the rate change.


I. Swap-In: When to Use Something Else

  • Regression Kink Design (RKD): when you want to estimate the causal effect of a kink-determined treatment on a downstream outcome (e.g., the effect of unemployment insurance on job search duration), rather than the elasticity of the running variable itself.

  • Regression Discontinuity (RDD): when treatment status (not just the incentive slope) changes at the threshold. RDD estimates the local treatment effect; bunching estimates the behavioral elasticity. If you have a notch, you can use RDD for the outcome effect and bunching for the elasticity — they complement each other.

  • Instrumental Variables (IV): when you want to estimate a behavioral parameter but do not have a density-based approach — for example, using variation in tax rates across jurisdictions as instruments for reported income.

  • Structural estimation: when frictions are severe and you need a fully parametric model of the decision problem to identify the elasticity. The Kleven and Waseem (2013) model is a hybrid: it combines reduced-form bunching with a structural friction distribution.



Paper Library

Foundational (3)

Blomquist, S., Newey, W. K., Kumar, A., & Liang, C.-Y. (2021). On Bunching and Identification of the Taxable Income Elasticity. Journal of Political Economy. DOI: 10.1086/714446

Blomquist, Newey, Kumar, and Liang provide a critical examination of the identification assumptions underlying bunching estimation. They show that the standard bunching estimator identifies the elasticity only under strong assumptions about the functional form of the counterfactual density and the distribution of preferences. Without these assumptions, the amount of bunching is consistent with a range of elasticities. The paper sparks an important methodological debate about what bunching can and cannot identify, and motivates subsequent work on tightening identification in bunching designs.

Kleven, H. J., & Waseem, M. (2013). Using Notches to Uncover Optimization Frictions and Structural Elasticities: Theory and Evidence from Pakistan. Quarterly Journal of Economics. DOI: 10.1093/qje/qjt004

Kleven and Waseem extend bunching estimation from kinks to notches -- discrete jumps in the tax schedule where the average tax rate changes discontinuously. They develop a structural framework that distinguishes between frictionless and frictional bunching, showing that optimization frictions attenuate observed bunching and cause the naive estimator to understate the true elasticity. Their model identifies both the structural elasticity and the friction distribution from the observed bunching pattern. Applied to Pakistan's income tax notches, they demonstrate that frictions are empirically important and that ignoring them substantially biases elasticity estimates downward.

Saez, E. (2010). Do Taxpayers Bunch at Kink Points? American Economic Journal: Economic Policy. DOI: 10.1257/pol.2.3.180

Saez introduces the modern bunching methodology by examining taxpayer responses to kink points in the US income tax schedule, where marginal tax rates change discretely. He shows how to estimate the compensated elasticity of reported income from the excess mass of taxpayers at kink points relative to a smooth counterfactual density fitted by polynomial. The paper establishes the standard empirical approach: bin the data, fit a polynomial excluding the bunching region, and compute the excess mass. He finds modest elasticities overall but sharp bunching among the self-employed near the first EITC kink.

Application (1)

Chetty, R., Friedman, J. N., Olsen, T., & Pistaferri, L. (2011). Adjustment Costs, Firm Responses, and Micro vs. Macro Labor Supply Elasticities: Evidence from Danish Tax Records. Quarterly Journal of Economics. DOI: 10.1093/qje/qjr013

Chetty, Friedman, Olsen, and Pistaferri use Danish administrative tax data to reconcile the gap between micro and macro labor supply elasticities using bunching methods. They show that adjustment frictions explain why micro estimates from bunching at tax kinks are small: many workers cannot freely adjust hours, so observed bunching understates the frictionless elasticity. They estimate that accounting for frictions raises the implied elasticity substantially. The paper is a landmark application of bunching to the micro-macro elasticity puzzle and introduces key methods for dealing with frictions in bunching designs.

Survey (1)

Kleven, H. J. (2016). Bunching. Annual Review of Economics. DOI: 10.1146/annurev-economics-080315-015234

Kleven provides a comprehensive survey of the bunching methodology, covering both kink and notch designs, the role of optimization frictions, and extensions to multiple applications beyond taxation. The survey unifies the theoretical frameworks from Saez (2010) and Kleven and Waseem (2013), discusses practical implementation issues (polynomial order, bandwidth, bin width), and catalogs the growing literature applying bunching to estimate behavioral elasticities in public finance, labor economics, and regulation. Essential reading for anyone starting with bunching methods.
