MethodAtlas
Design-Based · Modern

Bunching Estimation

Identifies behavioral responses from excess mass at notch or kink points in budget sets, estimating elasticities from distributional distortions.

Quick Reference

When to Use
When agents bunch at a threshold in the assignment variable and you want to estimate the elasticity of behavioral response.
Key Assumption
The counterfactual distribution (without the threshold) is smooth and can be approximated by a polynomial fitted to the observed distribution excluding the bunching region.
Common Mistake
Confusing notches (discrete jumps) with kinks (slope changes). Notch bunching implies larger elasticities. Also, ignoring optimization frictions that attenuate observed bunching.
Estimated Time
3 hours

One-Line Implementation

Stata: bunching income, kink(threshold) binwidth(500) poly(7) bandwidth(10000)
R: library(bunching); bunchit(z_vector = income, zstar = threshold, binwidth = 500, poly = 7, t0 = t_below, t1 = t_above)
Python: # Manual: fit polynomial to histogram bins excluding bunching region, integrate excess mass


Motivating Example

An economist wants to estimate how much taxpayers adjust their reported income in response to marginal tax rate changes. The US income tax schedule has kink points -- thresholds where the marginal tax rate jumps discretely. For example, at the boundary of the 10% and 12% brackets, the marginal rate increases by 2 percentage points. If taxpayers are responsive, they should cluster their reported income just below the kink to avoid the higher rate.

Here is the problem: the economist cannot run an experiment. She cannot randomly assign taxpayers to different kink points. She cannot compare taxpayers at different income levels, because high-income and low-income taxpayers differ in countless unobservable ways. A simple regression of income on tax rates would be hopelessly confounded by ability, education, occupation, and preferences.

The approach, developed by Saez (2010), solves this problem by exploiting the shape of the income distribution around the kink. If taxpayers are unresponsive to taxes, the income distribution should be smooth through the kink point -- there is no reason for a discontinuity in density. But if taxpayers respond by reducing their reported income to stay below the kink, we should see an excess mass of taxpayers bunching at the kink relative to a smooth counterfactual density.

The key insight is that the amount of excess bunching is proportional to the behavioral elasticity. More bunching means taxpayers are more responsive. The researcher estimates the counterfactual density by fitting a polynomial to the observed income histogram, excluding the bins near the kink, and then interpolating through the excluded region. The difference between the observed density and the counterfactual in the bunching region measures the excess mass, from which the elasticity is recovered.

This approach has been extended from kinks (where the marginal rate changes) to notches (where the average tax rate jumps discretely), where the incentive to bunch is much stronger and the implied elasticities can be identified more precisely (Kleven & Waseem, 2013).


A. Overview

What Bunching Estimation Does

Bunching estimation identifies behavioral responses to policy thresholds by measuring the excess mass of observations clustered at a kink or notch point relative to a smooth counterfactual density. The method proceeds in four steps:

  1. Bin the data: create a histogram of the running variable (e.g., reported income) with fine bins
  2. Exclude the bunching region: remove bins near the threshold where bunching occurs
  3. Fit a counterfactual: estimate a polynomial through the remaining bins to approximate the density that would have prevailed without the threshold
  4. Compute excess mass: integrate the difference between the observed density and the counterfactual over the bunching region
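These four steps can be sketched directly with NumPy. The snippet below is a minimal manual implementation on synthetic lognormal incomes with an assumed isoelastic response to a hypothetical kink; the threshold, tax rates, bin width, and sample size are all illustrative, and a production analysis should use a vetted package (see the Implementation section).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic counterfactual incomes and an assumed isoelastic response to a
# hypothetical kink at zstar (all parameter values are illustrative).
zstar, binwidth = 10_000, 500
t0, t1, e_true = 0.10, 0.20, 0.4
z0 = rng.lognormal(mean=9.2, sigma=0.4, size=200_000)

dz = zstar * (((1 - t0) / (1 - t1)) ** e_true - 1)  # marginal buncher's distance
z = z0.copy()
z[(z0 > zstar) & (z0 <= zstar + dz)] = zstar        # bunchers pile at the kink
above = z0 > zstar + dz
z[above] = z0[above] * ((1 - t1) / (1 - t0)) ** e_true  # intensive-margin shading

# Step 1: bin the data into a fine histogram
edges = np.arange(5_000, 20_000 + binwidth, binwidth)
counts, _ = np.histogram(z, bins=edges)
centers = edges[:-1] + binwidth / 2

# Step 2: exclude the bunching region (here +/- 2 bins around zstar)
excluded = np.abs(centers - zstar) <= 2 * binwidth

# Step 3: fit a polynomial counterfactual to the non-excluded bins
x = (centers - zstar) / 1_000        # rescale for numerical stability
coefs = np.polyfit(x[~excluded], counts[~excluded], deg=7)
cf = np.polyval(coefs, x)

# Step 4: sum the observed-minus-counterfactual gap over the bunching window
B = float(np.sum((counts - cf)[excluded]))   # excess mass
b = B / float(np.polyval(coefs, 0.0))        # normalized by cf height at zstar
print(f"B = {B:.0f}, b = {b:.2f}")
```

With bunching injected into the data, the estimated excess mass B comes out large and positive, and b is on the order of the true income response in bins.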

The excess mass B -- or the normalized excess mass b = B / h_0(z^*), where h_0(z^*) is the counterfactual density height at the threshold -- is the key estimand. The implied elasticity is then:

\hat{e} = \frac{b}{z^* \left[ \log(1 - t_0) - \log(1 - t_1) \right]}

where t_0 and t_1 are the marginal tax rates below and above the kink.
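Given an estimate of b expressed in the same units as z (so that b approximates the marginal buncher's income response), the mapping to the elasticity is a one-liner. A small helper with illustrative numbers:

```python
import math

def kink_elasticity(b, zstar, t0, t1):
    """Implied compensated elasticity at a kink where the marginal rate
    rises from t0 to t1; b is the normalized excess mass in the same
    units as z (i.e. b ~ the marginal buncher's income response)."""
    return b / (zstar * (math.log(1 - t0) - math.log(1 - t1)))

# Illustrative: $482 of bunching at a $10,000 kink where the rate rises 10% -> 20%
print(round(kink_elasticity(b=482, zstar=10_000, t0=0.10, t1=0.20), 3))  # ~0.409
```

Note that some papers instead normalize b by the bin width, which changes the units; the formula above assumes b is in dollars of z.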

Kinks vs. Notches

Different types of thresholds produce different bunching patterns:

  • Kink point: the marginal tax rate changes (the slope of the tax schedule changes). Bunching is typically modest because the incentive to adjust is proportional to the tax rate change. The income distribution shows a spike but no "missing mass" above the kink.

  • Notch point: the average tax rate jumps discontinuously (the level of the tax schedule jumps). The incentive to bunch is much stronger -- there is a region above the notch where no rational agent should locate (the "dominated region"). The income distribution shows both a spike at the notch and a "hole" (missing mass) above it. Kleven and Waseem (2013) developed the formal framework for notch bunching.

The Counterfactual Density

The counterfactual density is the distribution that would have prevailed in the absence of the threshold. It is estimated by fitting a polynomial of order p (typically p = 5 to 9) to the binned frequency counts, excluding bins within the bunching window [z^* - \delta_L, z^* + \delta_R]. The polynomial is then used to interpolate through the excluded region, and an integration constraint ensures that the total mass is preserved -- the excess mass at the threshold must equal the "missing mass" above it.
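The integration constraint can be imposed iteratively: fit the polynomial, compute the implied excess mass, inflate the observed bins above the threshold so the bunchers' mass is "returned" to the upper distribution, and refit until the excess mass converges. The sketch below is a simplified version in the spirit of the iterative adjustment in Chetty et al. (2011), not the exact published algorithm; the toy data and polynomial degree are illustrative.

```python
import numpy as np

def constrained_cf(counts, centers, zstar, excl, deg=7, n_iter=200, tol=1e-6):
    """Polynomial counterfactual with an integration constraint: observed
    bins above zstar are inflated until the implied excess mass stabilizes,
    so that total mass is preserved (simplified sketch)."""
    base = counts.astype(float)
    upper = (centers > zstar) & ~excl       # bins that absorb the missing mass
    x = (centers - zstar) / (centers.max() - centers.min())
    B = 0.0
    for _ in range(n_iter):
        adj = base.copy()
        adj[upper] *= 1.0 + B / base[upper].sum()  # return bunchers' mass above zstar
        coefs = np.polyfit(x[~excl], adj[~excl], deg=deg)
        cf = np.polyval(coefs, x)
        B_new = float(np.sum((base - cf)[excl]))   # excess mass in the window
        if abs(B_new - B) < tol:
            return cf, B_new
        B = B_new
    return cf, B

# Toy check: linear baseline density plus a spike of 800 at the threshold bin
centers = np.arange(5_250.0, 15_250.0, 500.0)
counts = 3_000.0 - 0.15 * (centers - 5_000.0)
counts[np.isclose(centers, 10_250.0)] += 800.0
excl = np.abs(centers - 10_000.0) <= 750.0
cf, B = constrained_cf(counts, centers, 10_000.0, excl, deg=3)
print(round(B))
```

Because the upper bins are shifted up before refitting, the converged excess mass is smaller than the naive (unconstrained) estimate of 800.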

How It Differs from RDD

Bunching estimation and the regression kink design (RKD) both exploit kink points, but they answer different questions with different identifying assumptions:

  • RKD estimates the causal effect of a treatment on an outcome using the kink in the treatment assignment function. It requires that the conditional expectation of the outcome is smooth through the kink.
  • Bunching estimates the elasticity of the running variable itself (e.g., reported income) with respect to the tax rate change at the kink. It requires that the density of the running variable would be smooth in the absence of the kink.


When to Use

  1. Agents face a clear threshold in their budget set. Tax kinks, notch points, regulatory cutoffs, benefit eligibility thresholds -- any setting where the incentive structure changes discretely at a known point.

  2. The running variable is continuous and finely measured. Income, revenue, asset holdings, or other variables measured with sufficient precision to construct a histogram with fine bins.

  3. You want to estimate a behavioral elasticity. Bunching directly identifies how responsive agents are to changes in incentive structures, which is the key policy parameter for welfare analysis and optimal policy design.

  4. You have a large sample. Bunching requires enough observations in each bin to construct a reliable density estimate. Administrative data (tax records, social insurance registers) are ideal.

Do NOT Use Bunching When:

  1. The threshold is not known to agents. If agents are unaware of the kink or notch, they cannot respond to it, and bunching tells you about information, not preferences.

  2. The running variable is coarsely measured or discrete. If income is reported in broad brackets or round numbers only, the histogram is too coarse to estimate a smooth counterfactual density.

  3. Round-number bunching confounds the threshold. If the kink point coincides with a salient round number, behavioral bunching at the kink and round-number bunching are confounded.

  4. You want to estimate effects on a different outcome. Bunching estimates the elasticity of the running variable. If you want the causal effect on, say, labor hours or firm investment, you need RDD or RKD instead.

Connection to Other Methods

Bunching estimation is related to several other causal inference methods:

  • Regression Kink Design (RKD): Both exploit kink points, but RKD estimates the causal effect of a kink-assigned treatment on an outcome (e.g., the effect of unemployment insurance generosity on unemployment duration), while bunching estimates the elasticity of the running variable itself (e.g., how much reported income responds to the tax rate change). RKD requires a smooth conditional outcome expectation; bunching requires a smooth counterfactual density.

  • Regression Discontinuity (RDD): RDD exploits a cutoff where treatment status jumps. In bunching, the "treatment" is the change in incentives, and the response is the distortion of the density. The McCrary density test in RDD checks for manipulation (bunching) as a threat; in bunching estimation, the density distortion is the estimand.

  • Difference-in-Differences (DiD): Some bunching applications use a DiD-like comparison, exploiting tax reforms that change the location or size of kink points. By comparing bunching before and after a reform, researchers can difference out round-number effects and other non-behavioral density features.

  • Instrumental Variables (IV): Bunching can be viewed as a distributional version of IV. The kink or notch is the "instrument" -- an exogenous change in the budget set -- and the excess mass is the "first stage" response in the density of the running variable.


B. Identification

For bunching estimation to provide valid elasticity estimates, three key assumptions must hold (Saez, 2010).

Assumption 1: Smooth Counterfactual Density

Plain language: In the absence of the threshold, the density of the running variable would be smooth through the kink or notch point. There would be no "natural" pile-up or gap at the threshold -- any excess mass or missing mass is caused by behavioral responses to the threshold.

Formally: The counterfactual density h_0(z) is continuously differentiable in a neighborhood of the threshold z^*. This rules out round-number bunching, institutional features, or other non-behavioral reasons for density discontinuities at the threshold.

This assumption is tested by checking for bunching at placebo thresholds -- points in the income distribution where no tax rate change occurs. If similar bunching appears at placebo kinks, the identifying assumption is likely violated.

Assumption 2: No Other Behavioral Responses at the Threshold

Plain language: The threshold affects behavior only through the mechanism of interest (e.g., the tax rate change). There is no other policy, regulation, or incentive that kicks in at the same threshold and could independently cause bunching.

For example, if a tax kink coincides with an eligibility threshold for a government transfer program, bunching could reflect responses to the transfer, the tax change, or both. The elasticity estimate would be confounded.

Assumption 3: Known Functional Form of Response

Plain language: The standard bunching estimator assumes a specific model of individual behavior -- typically isoelastic preferences -- to map from the observed excess mass to the structural elasticity. Blomquist and Newey (2017) showed that the bunching estimator identifies the elasticity only under these functional form restrictions. Without them, the same amount of bunching is consistent with a range of elasticities.

This assumption is the most debated in the bunching literature. Sensitivity analysis should explore how the implied elasticity changes under alternative preference specifications.


C. Visual Intuition

Adjust the elasticity and tax rate change to see how bunching emerges at the kink point. The counterfactual polynomial estimates what the density would look like without the threshold; the excess mass between the observed and counterfactual densities identifies the behavioral response.

Interactive Simulation

Bunching at a Tax Kink: Excess Mass and Elasticity

Income ~ LogNormal(10, 0.5). Threshold at z* = 22,026. Tax rate change = 10pp. Elasticity = 0.4. Friction = 30%. N = 5,000.

[Figure: income histogram (16k-27k) with the observed histogram, the counterfactual density overlay, and the excess mass shaded at z*.]

Estimation Results

Estimator | β̂ | SE | 95% CI | Bias
Excess mass (B) | 116.411 | 10.789 | [95.26, 137.56] | +0.000
Normalized excess mass (b) | 1.410 | 0.141 | [1.13, 1.69] | +0.000
Implied elasticity | 13.380 | 2.007 | [9.45, 17.31] | +12.980
True β | 0.400 | -- | -- | --
Simulation controls:

  • N = 5,000 -- number of taxpayers
  • e = 0.4 -- taxable income elasticity (0 = no response, 2 = very elastic)
  • Tax change = 10% -- percentage point increase in marginal rate at the kink
  • Friction = 0.3 -- fraction of agents unable to adjust (0 = frictionless, 1 = no bunching)

Why the difference?

With elasticity e = 0.4 and a tax rate change of 10 percentage points, taxpayers with incomes slightly above the threshold z* reduce their reported income to z*. This creates visible excess mass (bunching) at the threshold. The counterfactual polynomial (fitted to bins outside the bunching region and interpolated through it) provides the baseline density. The normalized excess mass b = 1.41 maps to an implied elasticity of 13.38.
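The bunching mechanism in this simulation can be reproduced in a few lines. The sketch below assumes the same lognormal incomes and an isoelastic bunching rule, with a fixed fraction of agents unable to adjust; all parameter values mirror the panel and are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameters mirroring the panel above (illustrative)
n, e, friction = 5_000, 0.4, 0.30
t0, t1 = 0.10, 0.20                      # a 10pp rate increase at the kink
zstar = float(np.exp(10))                # ~22,026, as in the panel

z0 = rng.lognormal(mean=10, sigma=0.5, size=n)   # counterfactual incomes
dz = zstar * (((1 - t0) / (1 - t1)) ** e - 1)    # marginal buncher's distance

z = z0.copy()
would_bunch = (z0 > zstar) & (z0 <= zstar + dz)
can_adjust = rng.random(n) > friction            # 30% face binding frictions
z[would_bunch & can_adjust] = zstar              # only unconstrained agents bunch

share_at_kink = float(np.mean(z == zstar))
print(f"{share_at_kink:.3%} of agents locate exactly at z*")
```

Raising the friction parameter shrinks the spike at z* without changing the underlying elasticity -- which is exactly why the naive estimator is attenuated by frictions.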


D. Mathematical Derivation

The Bunching Estimator

Don't worry about the notation yet -- here's what this means in words: this derivation links the excess mass at a kink point to the compensated elasticity of taxable income, under isoelastic preferences.

Setup. Consider a kink point at z^* where the marginal tax rate increases from t_0 to t_1 (with t_1 > t_0). Under the new (higher) marginal rate, an agent who would have chosen income z_0 > z^* in the absence of the kink may reduce income to exactly z^* if the utility gain from bunching exceeds the utility cost of the income reduction.

Step 1: Define the marginal buncher. The marginal buncher is the agent who, absent the kink, would have chosen income z^* + \Delta z^* and is just indifferent between bunching at z^* and staying at z^* + \Delta z^*. All agents with counterfactual income in [z^*, z^* + \Delta z^*] will bunch at z^*.

Step 2: Compute excess mass. The excess mass at the kink is:

B = \int_{z^*}^{z^* + \Delta z^*} h_0(z)\, dz \approx h_0(z^*) \cdot \Delta z^*

The normalized excess mass is b = B / h_0(z^*) \approx \Delta z^*.

Step 3: Link to the elasticity. Under isoelastic preferences with compensated elasticity e, the marginal buncher satisfies:

\Delta z^* = z^* \left[ \left(\frac{1 - t_0}{1 - t_1}\right)^e - 1 \right]

For a small tax rate change, this simplifies to:

b \approx \Delta z^* \approx e \cdot z^* \cdot \frac{t_1 - t_0}{1 - t_0}

Step 4: Solve for the elasticity.

\hat{e} = \frac{b}{z^*} \cdot \frac{1 - t_0}{t_1 - t_0}

Equivalently:

\hat{e} = \frac{b}{\log(1 - t_0) - \log(1 - t_1)} \cdot \frac{1}{z^*}

This expression is the standard bunching elasticity estimator for kink points. The estimator is a sample analog: estimate b from the data, plug in the known tax rates, and compute \hat{e}.
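A quick numeric check of Steps 3-4, using a small illustrative kink: the exact isoelastic condition and its small-change approximation should nearly coincide, and plugging b = Δz* back into either version of the estimator should recover the assumed elasticity.

```python
import math

# Illustrative small kink: rate rises 10% -> 12% at zstar = 10,000; e = 0.4
zstar, t0, t1, e = 10_000, 0.10, 0.12, 0.4

# Step 3: exact isoelastic condition vs. the small-change approximation
dz_exact = zstar * (((1 - t0) / (1 - t1)) ** e - 1)
dz_approx = e * zstar * (t1 - t0) / (1 - t0)
print(round(dz_exact, 1), round(dz_approx, 1))   # ~90.3 vs ~88.9

# Step 4: plugging b = dz back into either estimator recovers e ~ 0.4
b = dz_exact
e_hat1 = (b / zstar) * (1 - t0) / (t1 - t0)
e_hat2 = b / (zstar * (math.log(1 - t0) - math.log(1 - t1)))
print(round(e_hat1, 3), round(e_hat2, 3))
```

The two estimates differ slightly because the linearized formula is only exact in the limit of a small tax change; for large notch-style changes the gap widens.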

Don't worry about the notation yet -- here's what this means in words: this derivation shows how notch bunching differs from kink bunching and why notches produce stronger responses, following Kleven and Waseem (2013).

Setup. At a notch, the average tax rate (not just the marginal rate) jumps at z^*. An agent earning z^* + \epsilon pays discretely more in taxes than an agent earning z^* - \epsilon. This jump creates a "dominated region" above z^* where no rational agent should locate.

Step 1: Define the dominated region. For agents just above z^*, the discrete tax increase means that earning z^* + \epsilon yields strictly less after-tax income than earning z^*. The dominated region extends from z^* to z^* + \Delta z^*_{notch}, where \Delta z^*_{notch} solves the indifference condition. The upper bound of the dominated region depends on the elasticity and the notch size.

Step 2: Key difference from kinks. At a kink, agents smoothly shade their income down to z^*. At a notch, ALL agents in the dominated region should jump to exactly z^* (or above the dominated region). This produces:

  • A sharp spike at z^* (bunching)
  • A "hole" (missing mass) just above z^*, extending through the dominated region
  • The excess mass must equal the missing mass (integration constraint)

Step 3: Elasticity from notch bunching. The elasticity is identified from the width of the dominated region, which is observed as the upper bound of the missing-mass region. The formula involves inverting the indifference condition and is detailed in Kleven and Waseem (2013).

Key advantage of notches: Because the dominated region is directly observable in the data, notch bunching provides a tighter bound on the elasticity than kink bunching. The integration constraint -- excess mass equals missing mass -- serves as an internal consistency check.
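For intuition, the width of the strictly dominated region is easy to compute in the simplest case of a pure lump-sum notch -- a fixed extra tax dT for crossing z^*. The full Kleven-Waseem response \Delta z^*_{notch} additionally depends on the elasticity; the values below are illustrative.

```python
# Dominated region above a pure lump-sum notch: crossing zstar triggers an
# extra tax dT on top of a proportional rate t. For z in (zstar, z_D), an
# agent supplies more effort but takes home strictly less than at zstar,
# so locating there is dominated for any agent who dislikes effort.
zstar, t, dT = 50_000, 0.05, 1_000   # illustrative values

z_D = zstar + dT / (1 - t)           # solves zstar*(1-t) = z_D*(1-t) - dT
print(round(z_D, 2))                 # ~51,052.63
```

In data, the empirical analog of z_D is the upper edge of the observed missing-mass region, which is what makes notch elasticities identifiable from the hole's width.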


E. Implementation

Bunching Estimation: Step by Step

library(bunching)

# ---- Step 1: Set parameters ----
zstar    <- 10000   # kink point (threshold)
binwidth <- 500     # bin width
poly     <- 7       # polynomial order

# ---- Step 2: Run bunching estimator ----
result <- bunchit(
  z_vector  = df$income,
  zstar     = zstar,
  binwidth  = binwidth,
  bins_l    = 20,    # bins to the left of zstar in the estimation window
  bins_r    = 20,    # bins to the right of zstar
  poly      = poly,
  t0        = 0.10,  # marginal tax rate below the kink (illustrative)
  t1        = 0.12,  # marginal tax rate above the kink (illustrative)
  notch     = FALSE  # kink design, not a notch
)

# ---- Step 3: Extract key results ----
summary(result)
# B     = excess mass
# b     = normalized excess mass
# e     = implied elasticity
# B_se  = bootstrap SE

F. Diagnostics

F.1 Placebo Tests at Non-Kink Points

Apply the same bunching estimation procedure at income levels where no kink or notch exists. If you find significant "bunching" at placebo points, the smooth counterfactual assumption may be violated -- round-number effects, institutional features, or other non-behavioral factors are producing density irregularities that the polynomial cannot capture.

F.2 Integration Constraint (Notch Designs)

For notch bunching, the excess mass at the threshold must equal the missing mass above it. This constraint is an internal consistency check: agents who bunch at the notch must come from somewhere (the dominated region above the notch). If excess mass exceeds missing mass, the model is misspecified -- perhaps the dominated region boundaries are wrong, or there is bunching from agents who were initially below the threshold.

F.3 Polynomial Order Sensitivity

Plot the estimated excess mass B^\hat{B} as a function of the polynomial order pp for p = 3, 4, 5, 6, 7, 8, 9, 10. The estimate should stabilize for moderate p values. If the excess mass changes substantially as p increases, the counterfactual is sensitive to functional form and the estimate is fragile. Low-order polynomials (p below 5) may be too inflexible; high-order polynomials (p above 9) may overfit.

F.4 Bunching Window Sensitivity

The choice of the excluded region [zδL,z+δR][z^* - \delta_L, z^* + \delta_R] affects the estimate. Too narrow: the polynomial is fit using bins that are part of the bunching response, biasing the counterfactual downward and understating excess mass. Too wide: the polynomial has fewer data points to fit and the interpolation is less precise. Report results for a range of window widths.

F.5 Visual Inspection

Always produce two plots:

  1. Histogram with counterfactual overlay: the observed binned density as bars, the polynomial counterfactual as a dashed line, and the bunching region shaded. The excess mass should be visually apparent.
  2. Sensitivity plots: excess mass and implied elasticity as functions of polynomial order and bunching window width.
# Sensitivity to polynomial order
poly_range <- 5:9
B_by_poly <- sapply(poly_range, function(p) {
  res <- bunchit(z_vector = df$income, zstar = zstar,
                 binwidth = binwidth, bins_l = 20,
                 bins_r = 20, poly = p,
                 t0 = 0.10, t1 = 0.12,  # rates as in the main specification
                 notch = FALSE)
  res$B
})

plot(poly_range, B_by_poly, type = "b", pch = 19,
     xlab = "Polynomial Order", ylab = "Excess Mass (B)",
     main = "Sensitivity to Polynomial Order")

# Placebo test at a non-kink point (no rate change at 15,000)
placebo_zstar <- 15000
placebo <- bunchit(z_vector = df$income,
                   zstar = placebo_zstar,
                   binwidth = binwidth, bins_l = 20,
                   bins_r = 20, poly = 7,
                   t0 = 0.10, t1 = 0.12, notch = FALSE)
cat("Placebo B:", placebo$B, "\n")

Reading the Output

The key outputs from a bunching estimation are:

Statistic | Symbol | Interpretation
Excess mass | \hat{B} | Number of "extra" observations at the threshold above what a smooth density would predict
Normalized excess mass | \hat{b} | \hat{B} divided by the counterfactual density height at the threshold; comparable across settings
Implied elasticity | \hat{e} | The compensated elasticity of the running variable with respect to the net-of-tax rate
Bootstrap SE | SE(\hat{B}) | Standard error from resampling residuals of the polynomial fit

What to Report

A well-reported bunching analysis should include:

  1. The histogram with counterfactual overlay and bunching region shaded
  2. Excess mass B^\hat{B} (or normalized b^\hat{b}) with bootstrap confidence interval
  3. Implied elasticity e^\hat{e} with standard error
  4. Polynomial order used and sensitivity across orders 5-9
  5. Bunching window boundaries and sensitivity to window width
  6. Bin width used and sensitivity
  7. Placebo tests at non-kink points
  8. Discussion of frictions -- are the estimates naive or friction-corrected?

G. What Can Go Wrong

Assumption Failure Demo

Round-Number Bunching Confounds the Threshold

A tax kink at $12,750 -- a non-round number with no special salience. Any excess density at this threshold is plausibly a behavioral response to the tax rate change.

Normalized excess mass b = 1.2 (SE = 0.3). Placebo tests at nearby round numbers show no comparable bunching. The implied elasticity of 0.25 is credibly identified.

Assumption Failure Demo

Ignoring Optimization Frictions

Bunching estimation with a friction-corrected model following Kleven and Waseem (2013), which accounts for the fact that many agents face adjustment costs and cannot perfectly locate at the kink

Frictionless (naive) elasticity: 0.10. Friction-corrected structural elasticity: 0.35. The friction model estimates that 60% of agents face positive adjustment costs. The naive estimator understates the true behavioral response by more than 3x.

Assumption Failure Demo

Wrong Polynomial Order Distorts the Counterfactual

Polynomial of order 7 fits the non-bunching region well, and results are stable across orders 5-9

B = 450 (SE = 85) with p = 7. Sensitivity: B = 420 (p=5), 445 (p=6), 450 (p=7), 455 (p=8), 460 (p=9). The estimate is robust.

Assumption Failure Demo

Bunching Window Too Narrow -- Counterfactual Contaminated

Bunching window of +/- 3 bins around the threshold, chosen to include all visible bunching. The polynomial is fit to bins outside this region, which show a smooth density.

B = 450 (SE = 85). The counterfactual polynomial fits the non-bunching region well (R-squared = 0.97). Visual inspection confirms the excluded region captures all excess mass.


H. Practice

H.1 Concept Checks

Concept Check

A researcher estimates normalized excess mass of b = 2.0 at a kink where the marginal tax rate increases from 15% to 25%. She computes the elasticity as e = b / (t1 - t0) = 2.0 / 0.10 = 20. Is this calculation correct?

Concept Check

Self-employed taxpayers show much more bunching at tax kinks than wage earners. Does this necessarily mean the self-employed have a higher elasticity of taxable income?

Concept Check

A researcher applies bunching estimation at a tax notch and finds large excess mass at the threshold but no visible 'hole' (missing mass) above it. She reports a large implied elasticity. What is wrong?

H.2 Guided Exercise

Guided Exercise

Interpreting Bunching Output from an EITC Study

You study bunching at the first EITC kink point, where the marginal tax rate changes from -40% (subsidy) to 0% (phase-in range to flat range) at \$7,340 of earned income. You use IRS administrative data (2010-2015, N = 8.2 million returns), \$250 bins, a 7th-order polynomial, and a bunching window of +/- \$1,500. Your output:

Statistic | Value
Excess mass (B) | 142,500
Counterfactual height | 48,200
Normalized excess mass | 2.96
Bootstrap SE (B) | 18,300
Polynomial order | 7
Bin width | 250

Sensitivity (polynomial order): p=5: B=138,000 | p=6: B=141,200 | p=7: B=142,500 | p=8: B=143,100 | p=9: B=143,800
Sensitivity (window width): +/-1000: B=128,400 | +/-1500: B=142,500 | +/-2000: B=148,200 | +/-2500: B=151,000
Placebo at 10,000 (no kink): B = 8,200 (SE = 12,400, p = 0.51)
Placebo at 15,000 (no kink): B = -2,100 (SE = 11,800, p = 0.86)

What is the normalized excess mass, and is it statistically significant?

Is the estimate robust to polynomial order and bunching window width? What do the placebo tests tell you?

Compute the implied elasticity. The net-of-tax rate changes from 1.40 (40% subsidy) to 1.00 (0% rate) at the kink.

The sample includes both self-employed and wage earners. Among the self-employed only, b = 8.4 and the implied elasticity is 4.0. Among wage earners, b = 0.3 and the elasticity is 0.14. What does this tell you about frictions?

H.3 Error Detective

Error Detective

Read the analysis below carefully and identify the errors.

A public finance researcher studies bunching at a state income tax kink where the marginal rate increases from 5% to 7% at $50,000 of taxable income. She uses $1,000 bins, a 3rd-order polynomial, and a bunching window of +/- $1,000 (1 bin on each side). She finds: "We estimate normalized excess mass of b = 4.2 (SE = 0.8, p < 0.001). The implied elasticity of taxable income is 0.85. Bunching is clearly visible in the histogram, confirming that taxpayers are highly responsive to marginal tax rate changes." She does not report polynomial sensitivity, bunching window sensitivity, or placebo tests.

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

A health economist studies bunching at a Medicare billing threshold. Hospitals receive higher reimbursement for patients classified as DRG weight >= 1.5 compared to < 1.5. She applies the Saez (2010) kink bunching methodology and estimates: "Using the bunching estimator, we find significant excess mass of hospital cases just above the 1.5 DRG threshold (b = 3.1, SE = 0.6). This implies a price elasticity of case classification of 0.55." She uses a 7th-order polynomial and reports sensitivity to polynomial order (stable from 5-9).

Select all errors you can find:

H.4 You Are the Referee

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors estimate the elasticity of taxable income using bunching at a state income tax kink where the marginal rate increases from 4% to 6% at \$75,000 of adjusted gross income. Using state tax returns for 2015-2019 (N = 1.8 million returns), they estimate normalized excess mass of b = 1.85 and an implied compensated elasticity of 0.42. They use \$500 bins, a 4th-order polynomial, and a bunching window of +/- \$2,000. They do not report sensitivity to polynomial order, window width, or bin size. No placebo tests are conducted.

Key Table

Variable | Coefficient | SE | p-value
Excess mass (B) | 12,400 | 2,800 | <0.001
Normalized b | 1.85 | 0.42 | <0.001
Implied elasticity | 0.42 | 0.10 | <0.001
Polynomial order | 4 | -- | --
Bin width | 500 | -- | --
Window | +/- 2,000 | -- | --
N (returns) | 1,800,000 | -- | --

Authors' Identification Claim

The authors argue that the kink in the tax schedule creates a well-defined change in the marginal tax rate, and that excess bunching at the kink reveals taxpayers' behavioral response to the rate change.


I. Swap-In: When to Use Something Else

  • Regression Kink Design (RKD): when you want to estimate the causal effect of a kink-determined treatment on a downstream outcome (e.g., the effect of unemployment insurance on job search duration), rather than the elasticity of the running variable itself.

  • Regression Discontinuity (RDD): when treatment status (not just the incentive slope) changes at the threshold. RDD estimates the local treatment effect; bunching estimates the behavioral elasticity. If you have a notch, you can use RDD for the outcome effect and bunching for the elasticity -- they complement each other.

  • Instrumental Variables (IV): when you want to estimate a behavioral parameter but do not have a density-based approach -- for example, using variation in tax rates across jurisdictions as instruments for reported income.

  • Structural estimation: when frictions are severe and you need a fully parametric model of the decision problem to identify the elasticity. The Kleven and Waseem (2013) model is a hybrid: it combines reduced-form bunching with a structural friction distribution.


J. Reviewer Checklist

Critical Reading Checklist

Paper Library

Foundational (3)

Saez, E. (2010). Do Taxpayers Bunch at Kink Points?

American Economic Journal: Economic Policy. DOI: 10.1257/pol.2.3.180

Saez introduced the modern bunching methodology by examining taxpayer responses to kink points in the US income tax schedule, where marginal tax rates change discretely. He showed how to estimate the compensated elasticity of reported income from the excess mass of taxpayers at kink points relative to a smooth counterfactual density fitted by polynomial. The paper established the standard empirical approach: bin the data, fit a polynomial excluding the bunching region, and compute the excess mass. He found modest elasticities overall but sharp bunching among the self-employed near the first EITC kink.

Kleven, H. J., & Waseem, M. (2013). Using Notches to Uncover Optimization Frictions and Structural Elasticities: Theory and Evidence from Pakistan.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjt004

Kleven and Waseem extended bunching estimation from kinks to notches -- discrete jumps in the tax schedule where the average tax rate changes discontinuously. They developed a structural framework that distinguishes between frictionless and frictional bunching, showing that optimization frictions attenuate observed bunching and cause the naive estimator to understate the true elasticity. Their model identifies both the structural elasticity and the friction distribution from the observed bunching pattern. Applied to Pakistan's income tax notches, they demonstrated that frictions are empirically important and that ignoring them substantially biases elasticity estimates downward.

Blomquist, S., & Newey, W. K. (2017). The Bunching Estimator Cannot Identify the Taxable Income Elasticity.

NBER Working Paper No. 24136. DOI: 10.3386/w24136

Blomquist and Newey provide a critical examination of the identification assumptions underlying bunching estimation. They show that the standard bunching estimator identifies the elasticity only under strong assumptions about the functional form of the counterfactual density and the distribution of preferences. Without these assumptions, the amount of bunching is consistent with a range of elasticities. The paper sparked an important methodological debate about what bunching can and cannot identify, and motivated subsequent work on tightening identification in bunching designs.

Application (1)

Chetty, R., Friedman, J. N., Olsen, T., & Pistaferri, L. (2011). Adjustment Costs, Firm Responses, and Micro vs. Macro Labor Supply Elasticities: Evidence from Danish Tax Records.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjr013

Chetty, Friedman, Olsen, and Pistaferri used Danish administrative tax data to reconcile the gap between micro and macro labor supply elasticities using bunching methods. They showed that adjustment frictions explain why micro estimates from bunching at tax kinks are small: many workers cannot freely adjust hours, so observed bunching understates the frictionless elasticity. They estimated that accounting for frictions raises the implied elasticity substantially. The paper is a landmark application of bunching to the micro-macro elasticity puzzle and introduced key methods for dealing with frictions in bunching designs.

Survey (1)

Kleven, H. J. (2016). Bunching.

Annual Review of Economics. DOI: 10.1146/annurev-economics-080315-015234

Kleven provides a comprehensive survey of the bunching methodology, covering both kink and notch designs, the role of optimization frictions, and extensions to multiple applications beyond taxation. The survey unifies the theoretical frameworks from Saez (2010) and Kleven and Waseem (2013), discusses practical implementation issues (polynomial order, bandwidth, bin width), and catalogs the growing literature applying bunching to estimate behavioral elasticities in public finance, labor economics, and regulation. Essential reading for anyone starting with bunching methods.

Tags

design-based · density · elasticity · public-finance