Lee Bounds for Attrition
When point identification fails — especially due to differential attrition — informative bounds can still be useful.
When You Cannot Pin It Down
Sometimes you have to admit that the data cannot tell you exactly what the answer is, only that it lies within a range. This situation is the world of partial identification, and if you find it frustrating, you are in good company. But it turns out that knowing an effect is "between 0.05 and 0.25" is often far more useful, and far more honest, than reporting a precise but biased point estimate.
The most common setting where partial identification arises in applied economics is sample selection: the outcome you care about is only observed for a non-random subset of your sample. The classic example is wages. You want to estimate the effect of a job training program on wages, but wages are only observed for people who are employed. If the training program itself changes who is employed (as it almost certainly does), then comparing wages among the employed is contaminated by selection.
Sample selection is not a minor technical issue. It threatens the validity of any study where the treatment affects whether the outcome is observed.
Why It Matters
If you ignore sample selection and simply compare outcomes among observed units, your treatment effect estimate is biased — potentially severely. Lee bounds give you honest, assumption-lean bounds on the true effect, letting you report what the data can actually support rather than a precise but misleading point estimate. Reviewers increasingly expect attrition analysis in experimental work, and Lee bounds are a standard tool for it.
Why Point Identification Fails with Differential Attrition
Consider a randomized experiment evaluating a job training program. You randomly assign 1,000 people to training and 1,000 to control. After six months, you observe:
- Training group: 700 employed (70%), wages observed
- Control group: 600 employed (60%), wages observed
You want to estimate the effect of training on wages. But here is the problem: the 700 employed in the treatment group and the 600 employed in the control group are different populations. Training caused 100 more people to be employed. Those 100 "marginal" workers — people who would not have been employed without the training — are probably different from the "always-employed" workers (e.g., lower skill, lower potential wages).
When you compare average wages among the employed, you are comparing:
- Treatment: a mixture of always-employed workers and newly employed workers
- Control: only always-employed workers
The comparison is contaminated by the composition change. You cannot separate the effect of training on wages from the effect of training on who is observed. Randomization guarantees balance in the full sample, but it says nothing about the selected subsample.
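A small simulation makes the contamination concrete. The data-generating process below is hypothetical: the true wage effect of training is set to zero, training raises employment by pulling in lower-skill workers, and the naive employed-only comparison is nonetheless clearly negative.

```r
# Hypothetical simulation: zero true wage effect, yet the naive
# employed-only comparison is biased by differential selection.
set.seed(42)
n     <- 100000
skill <- rnorm(n)                       # latent skill drives wages AND employment
treat <- rbinom(n, 1, 0.5)              # random assignment
# Training lowers the employment threshold, pulling in lower-skill workers
employed <- as.numeric(skill + 0.3 * treat > qnorm(0.4))  # ~60% in control
wage  <- 15 + 2 * skill                 # training has NO direct effect on wages
naive <- mean(wage[treat == 1 & employed == 1]) -
  mean(wage[treat == 0 & employed == 1])
round(naive, 2)  # negative: marginal workers drag down the treated mean
```

Despite a true effect of exactly zero, the naive difference is negative, because the treated employed pool contains the lower-skill marginal workers that the control employed pool does not.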
Two Approaches: Heckman vs. Bounds
The Heckman Approach
The traditional solution to sample selection is the Heckman (1979) model.
The Heckman selection model requires:
- A selection equation (a model of who is observed)
- An exclusion restriction (a variable that affects selection but not the outcome)
- Joint normality of the error terms
These are strong requirements. The exclusion restriction is often hard to justify: what affects employment but not wages? Joint normality is a functional-form assumption that may not hold. In practice, Heckman estimates can be fragile and sensitive to specification.
The Bounds Approach
The alternative, pioneered by Charles Manski and applied powerfully by David Lee, is to give up on point identification and instead bound the treatment effect using weaker assumptions (Manski, 2003).
The key insight: if you are willing to assume less, you learn less — but what you learn is more credible. A wide but honest bound beats a precise but questionable point estimate.
Lee (2009) Bounds: The Method
Lee (2009) developed a practical, widely used method for bounding treatment effects in the presence of sample selection.
The Monotonicity Assumption
Lee bounds require a single key assumption: monotonicity. The treatment must affect selection in one direction only:
S_i(1) ≥ S_i(0) for all i,
where S_i(d) is the selection indicator (1 = observed, 0 = not observed) under treatment status d.
In the job training example: training can only increase (or leave unchanged) the probability of employment for every individual. No one who would have been employed without training becomes unemployed because of training.
The Trimming Procedure
Under monotonicity, the treatment group contains everyone the control group contains plus some extra individuals who were "brought in" by the treatment. To make the groups comparable, we need to remove those extra individuals. The question is: which ones?
We do not know. But we can construct the best and worst cases.
Step 1: Compute the selection rates, p1 = P(S = 1 | D = 1) and p0 = P(S = 1 | D = 0).
Under monotonicity with the treatment increasing selection: p1 ≥ p0.
Step 2: Compute the trimming proportion, q = (p1 - p0) / p1.
This quantity is the fraction of the treatment group's observed sample that was "brought in" by the treatment.
Step 3: Trim the treatment group to make it comparable:
- Upper bound: Remove the bottom q fraction of the treated group's outcome distribution. The remaining treated individuals have the highest outcomes. Comparing them to the control group gives an upper bound on the treatment effect for always-observed individuals.
- Lower bound: Remove the top q fraction of the treated group's outcome distribution. Comparing the remaining (lowest) outcomes to the control group gives a lower bound.
Don't worry about the notation yet — here's what this means in words: Under monotonicity, the extra individuals brought into the sample by treatment could be at any point in the outcome distribution. The worst case (lower bound) is that they are at the top; the best case (upper bound) is that they are at the bottom.
Under monotonicity, we can partition the treatment group's observed sample into two types:
- Always-observed (S(1) = S(0) = 1): individuals who would be observed regardless of treatment status.
- Compliers (S(1) = 1, S(0) = 0): individuals brought into the sample by treatment.
The control group's observed sample contains only always-observed types (under monotonicity). So the ideal comparison is E[Y(1) | always-observed] - E[Y(0) | always-observed].
We observe E[Y(0) | always-observed] = E[Y | D = 0, S = 1] directly from the control group. But the treatment group's observed sample mixes always-observed and compliers. We do not know which individuals are compliers.
The worst case for the treatment effect is that compliers have the highest outcomes (so removing them from the top gives the lowest remaining mean). The best case is that compliers have the lowest outcomes (so removing them from the bottom gives the highest remaining mean).
Formally:
Lower bound: E[Y | D = 1, S = 1, Y ≤ y_{1-q}] - E[Y | D = 0, S = 1]
Upper bound: E[Y | D = 1, S = 1, Y ≥ y_q] - E[Y | D = 0, S = 1]
where y_q is the q-th quantile of the treated outcome distribution.
These bounds are sharp — they are the tightest possible bounds given only the monotonicity assumption and random assignment. No additional restriction can narrow them without additional assumptions.
A Worked Example
Return to our training program:
- Treatment: 700 of 1,000 employed (70%)
- Control: 600 of 1,000 employed (60%)
Step 1: p1 = 700/1000 = 0.70, p0 = 600/1000 = 0.60
Step 2: q = (0.70 - 0.60) / 0.70 ≈ 0.143
So 14.3% of the treatment group's employed workers were "brought in" by the program.
Step 3: Trim the treatment group.
Suppose the average wage in the control group is $15/hour. The average wage in the full treatment group is $15.50/hour.
- Trim the bottom 14.3% of the treatment wage distribution. The remaining 85.7% have an average wage of $16.20. Upper bound = $16.20 - $15.00 = $1.20/hour.
- Trim the top 14.3% of the treatment wage distribution. The remaining 85.7% have an average wage of $14.80. Lower bound = $14.80 - $15.00 = -$0.20/hour.
The Lee bounds are [-$0.20, $1.20]. The training program's effect on wages (for always-employed workers) is somewhere in this range. Notice the bounds include zero — we cannot rule out that training has no wage effect, even though the naive comparison shows a positive difference.
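The arithmetic of the worked example can be checked in a few lines of base R. Note that the trimmed means ($16.20 and $14.80) are stipulated numbers from the example, not computed from data:

```r
p1 <- 700 / 1000                 # treated observation rate
p0 <- 600 / 1000                 # control observation rate
q  <- (p1 - p0) / p1             # trimming proportion, ~0.143
upper <- 16.20 - 15.00           # trimmed-from-below mean minus control mean
lower <- 14.80 - 15.00           # trimmed-from-above mean minus control mean
c(lower = lower, upper = upper, q = round(q, 3))
```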
Attrition and Bounds
When the two observation rates are equal, the bounds collapse to a point: there is no differential selection. As the gap between the rates widens, the bounds expand; this widening is the price of not knowing who was "brought in" by the treatment.
Tightening the Bounds
Lee bounds can be wide, especially when differential attrition is large. Two main strategies help:
1. Condition on Pre-Treatment Covariates
If you have baseline covariates that predict the outcome, compute Lee bounds within covariate cells and then average. Within-cell outcome distributions are less dispersed, so the bounds within each cell are tighter. The overall bounds (a weighted average across cells) are tighter than the unconditional bounds.
This tightening works because trimming removes a fixed fraction of the outcome distribution. If the within-cell distribution has less spread, trimming removes less extreme values, producing tighter bounds.
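A minimal sketch of the covariate-cell strategy in base R. The function name and the choice to weight cells by the number of observed controls are illustrative assumptions, and the sketch assumes p1 ≥ p0 within every cell:

```r
# Sketch: tighten Lee bounds by trimming within covariate cells, then
# averaging across cells (cell weights: observed control counts).
lee_bounds_by_cell <- function(y, treat, sel, cell) {
  cells <- unique(cell)
  res <- sapply(cells, function(g) {
    i  <- cell == g
    p1 <- mean(sel[i & treat == 1])
    p0 <- mean(sel[i & treat == 0])
    q  <- (p1 - p0) / p1                       # cell-specific trimming proportion
    y1 <- y[i & treat == 1 & sel == 1]
    y0 <- y[i & treat == 0 & sel == 1]
    up <- mean(y1[y1 >= quantile(y1, q)]) - mean(y0)
    lo <- mean(y1[y1 <= quantile(y1, 1 - q)]) - mean(y0)
    c(lo = lo, up = up, w = sum(i & treat == 0 & sel == 1))
  })
  w <- res["w", ] / sum(res["w", ])            # normalized cell weights
  c(lower = sum(w * res["lo", ]), upper = sum(w * res["up", ]))
}
```

Because within-cell outcome distributions are less dispersed, each cell's trimmed means move less, so the weighted-average bounds are typically tighter than the unconditional ones.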
2. Reduce Differential Attrition
The most effective strategy is prevention. Minimize differential attrition through:
- Intensive follow-up and tracking
- Administrative data linkage (which eliminates survey non-response)
- Incentive payments for survey completion
- Short, simple outcome measures that maximize response
Every percentage point of differential attrition widens the bounds. A 2-percentage-point differential is far more manageable than a 15-percentage-point differential. Accounting for expected attrition in your power analysis at the design stage helps ensure the resulting bounds remain informative.
Lee Bounds vs. Manski Worst-Case Bounds
It is important not to confuse Lee bounds with Manski's "worst-case" bounds, which use no assumptions at all beyond the support of the outcome:
| | Manski Bounds | Lee Bounds |
|---|---|---|
| Assumption | Only that Y is bounded in [y_min, y_max] | Monotonicity of selection |
| Width | Often very wide (depends on outcome range) | Narrower (depends on differential attrition) |
| Impute missing outcomes as | Extreme values (y_min or y_max) | Trimmed quantiles from observed data |
| When useful | As a baseline, or when monotonicity is not credible | When monotonicity is plausible |
Manski bounds are often uninformatively wide because they impute the worst possible outcomes for missing data. Lee bounds are typically much tighter because monotonicity restricts the set of possible imputations.
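For comparison, Manski's worst-case bounds take only a few lines. The sketch below assumes a known outcome support [y_min, y_max] and imputes missing outcomes at the extremes; the function name is illustrative:

```r
# Sketch: Manski worst-case bounds on E[Y(1)] - E[Y(0)] under a known
# outcome support [y_min, y_max]; missing outcomes imputed at the extremes.
manski_bounds <- function(y, treat, sel, y_min, y_max) {
  p1 <- mean(sel[treat == 1]); p0 <- mean(sel[treat == 0])
  m1 <- mean(y[treat == 1 & sel == 1]); m0 <- mean(y[treat == 0 & sel == 1])
  # E[Y(d)] = P(observed) * observed mean + P(missing) * imputed value
  e1_lo <- p1 * m1 + (1 - p1) * y_min; e1_hi <- p1 * m1 + (1 - p1) * y_max
  e0_lo <- p0 * m0 + (1 - p0) * y_min; e0_hi <- p0 * m0 + (1 - p0) * y_max
  c(lower = e1_lo - e0_hi, upper = e1_hi - e0_lo)
}
```

The width of these bounds is (1 - p1 + 1 - p0) * (y_max - y_min), which grows with both the missing shares and the outcome range, which is why they are often uninformative.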
When to Use Lee Bounds
| Setting | Use Lee Bounds? | Why |
|---|---|---|
| RCT with differential attrition | Yes | The standard tool for this setting |
| RCT with symmetric attrition | Consider as robustness check | Equal attrition rates do not rule out differential selection on unobservables |
| Natural experiment (e.g., DiD) with outcome only for selected sample | Yes, if monotonicity holds | Same logic applies to quasi-experiments |
| Study where treatment reduces observation (e.g., mortality) | Yes, but reverse the direction | If treatment reduces selection, trim the control group |
| Observational study with selection | Possible, but requires more caution | Lee bounds assume random assignment; additional assumptions needed |
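For the reversed case in the table above, where treatment reduces observation, the roles flip and the control group is trimmed. A minimal sketch (hypothetical function name; assumes monotonicity in the opposite direction, p0 ≥ p1):

```r
# Sketch: Lee bounds when treatment REDUCES observation (p0 > p1),
# so the observed control sample is the one with "extra" individuals.
reverse_lee_bounds <- function(y, treat, sel) {
  p1 <- mean(sel[treat == 1]); p0 <- mean(sel[treat == 0])
  stopifnot(p0 >= p1)                 # this sketch covers only the reversed case
  q  <- (p0 - p1) / p0                # fraction of observed controls to trim
  y1 <- y[treat == 1 & sel == 1]
  y0 <- y[treat == 0 & sel == 1]
  # Lower bound: drop the BOTTOM q of controls (raises the control mean)
  lower <- mean(y1) - mean(y0[y0 >= quantile(y0, q)])
  # Upper bound: drop the TOP q of controls (lowers the control mean)
  upper <- mean(y1) - mean(y0[y0 <= quantile(y0, 1 - q)])
  c(lower = lower, upper = upper)
}
```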
How to Do It: Code
# Lee (2009) bounds with bootstrapped confidence intervals
# Requires: base R only (no additional packages)
lee_bounds <- function(y, treatment, selection, alpha = 0.05, n_boot = 1000) {
# --- Step 1: Compute observation (selection) rates by group ---
p1 <- mean(selection[treatment == 1]) # P(observed | treated)
p0 <- mean(selection[treatment == 0]) # P(observed | control)
# --- Step 2: Compute the trimming proportion ---
# Under monotonicity with treatment (weakly) increasing selection, the
# treated group is trimmed. If selection is higher in the control group,
# swap the groups and trim the control distribution instead.
if (p1 < p0) {
stop("p0 > p1: treatment appears to reduce selection; swap groups and trim the control group.")
}
q <- (p1 - p0) / p1  # fraction of the observed treated sample to trim
# --- Step 3: Extract observed outcomes for each group ---
y1 <- y[treatment == 1 & selection == 1]
y0 <- y[treatment == 0 & selection == 1]
# --- Step 4: Core trimming function ---
compute_bounds <- function(y1, y0, q) {
if (q < 1e-10) {
# No differential attrition: bounds collapse to a point
return(c(lower = mean(y1) - mean(y0), upper = mean(y1) - mean(y0)))
}
# Upper bound: trim the bottom q fraction (remove lowest outcomes)
cutoff_low <- quantile(y1, q)
upper <- mean(y1[y1 >= cutoff_low]) - mean(y0)
# Lower bound: trim the top q fraction (remove highest outcomes)
cutoff_high <- quantile(y1, 1 - q)
lower <- mean(y1[y1 <= cutoff_high]) - mean(y0)
c(lower = lower, upper = upper)
}
# Point estimates of the bounds
bounds <- compute_bounds(y1, y0, q)
# --- Step 5: Bootstrap confidence intervals ---
# Resample treatment and control groups separately to preserve design
n1 <- sum(treatment == 1)
n0 <- sum(treatment == 0)
boot_bounds <- replicate(n_boot, {
# Draw bootstrap samples within each group
idx1 <- sample(which(treatment == 1), n1, replace = TRUE)
idx0 <- sample(which(treatment == 0), n0, replace = TRUE)
# Recompute selection rates and bounds on bootstrap sample
b_y1 <- y[idx1[selection[idx1] == 1]]
b_y0 <- y[idx0[selection[idx0] == 1]]
b_p1 <- mean(selection[idx1])
b_p0 <- mean(selection[idx0])
b_q <- max(0, (b_p1 - b_p0) / b_p1)
compute_bounds(b_y1, b_y0, b_q)
})
# Percentile-based CIs for each bound endpoint
ci_lower <- quantile(boot_bounds["lower", ], c(alpha/2, 1-alpha/2))
ci_upper <- quantile(boot_bounds["upper", ], c(alpha/2, 1-alpha/2))
# --- Step 6: Return results ---
list(
lower = bounds["lower"],
upper = bounds["upper"],
ci_lower = ci_lower,
ci_upper = ci_upper,
trimming_proportion = q,
p_treated = p1,
p_control = p0
)
}
# --- Usage ---
result <- lee_bounds(
y = df$wage,
treatment = df$treatment,
selection = df$employed # 1 = observed, 0 = attrited
)
# Print the bounds and their confidence intervals
cat(sprintf("Lee Bounds: [%.3f, %.3f]\n", result$lower, result$upper))
cat(sprintf("95%% CI for lower bound: [%.3f, %.3f]\n",
result$ci_lower[1], result$ci_lower[2]))
cat(sprintf("95%% CI for upper bound: [%.3f, %.3f]\n",
result$ci_upper[1], result$ci_upper[2]))

Manual Implementation
# Requires: base R (no additional packages)
# Manual Lee bounds — step-by-step walkthrough
# --- Step 1: Compute selection (observation) rates ---
# p1 = P(observed | treated), p0 = P(observed | control)
p1 <- mean(df$employed[df$treatment == 1])
p0 <- mean(df$employed[df$treatment == 0])
# --- Step 2: Compute the trimming proportion ---
# q = fraction of the treated group "brought in" by treatment
q <- (p1 - p0) / p1
cat("Trimming proportion:", q, "\n")
# --- Step 3: Extract observed outcomes and find trimming cutoffs ---
y1 <- df$wage[df$treatment == 1 & df$employed == 1] # treated wages
y0 <- df$wage[df$treatment == 0 & df$employed == 1] # control wages
# Quantiles define where to cut the treated distribution
cutoff_lower <- quantile(y1, q) # for upper bound: trim below this
cutoff_upper <- quantile(y1, 1 - q) # for lower bound: trim above this
# --- Step 4: Compute bounds ---
# Upper bound: remove bottom q% (assume extra workers have lowest wages)
upper <- mean(y1[y1 >= cutoff_lower]) - mean(y0)
cat("Upper bound:", upper, "\n")
# Lower bound: remove top q% (assume extra workers have highest wages)
lower <- mean(y1[y1 <= cutoff_upper]) - mean(y0)
cat("Lower bound:", lower, "\n")

How to Report Lee Bounds
A well-reported Lee bounds analysis includes:
- The attrition or selection rates for treatment and control groups.
- A test for differential attrition (is the difference statistically significant?).
- The monotonicity assumption, stated explicitly and justified for your setting.
- The bounds with confidence intervals (bootstrapped).
- Comparison with the naive estimate (ignoring selection).
- Whether covariates were used to tighten bounds.
Example write-up:
Employment rates are 70% in the treatment group and 60% in the control group (p < 0.001), indicating that the training program increased employment. Because wages are only observed for employed individuals, the naive comparison of treatment and control wages is contaminated by differential selection. Following Lee (2009), we compute bounds on the wage effect under the assumption that training weakly increases employment for all individuals (monotonicity). The trimming proportion is 14.3%. The Lee bounds for the treatment effect on hourly wages are [-$0.20, $1.20] (95% CI for the lower bound: [-$0.85, $0.45]; 95% CI for the upper bound: [$0.55, $1.85]). The bounds include zero, so we cannot reject that training has no effect on wages for always-employed workers. Conditioning on baseline covariates (age, gender, education) tightens the bounds to [$0.05, $0.95].
Concept Check
In an RCT of a tutoring program, 90% of treatment students take the end-of-year exam, compared to 80% of control students. The monotonicity assumption for Lee bounds requires that tutoring weakly increases exam-taking for every student: no student who would have taken the exam without tutoring skips it because of tutoring. Under that assumption, the trimming proportion is q = (0.90 - 0.80) / 0.90 ≈ 0.111.
Paper Library
Foundational (8)
Gerard, F., Rokkanen, M., & Rothe, C. (2020). Bounds on Treatment Effects in Regression Discontinuity Designs with a Manipulated Running Variable.
Gerard, Rokkanen, and Rothe study regression-discontinuity settings in which the running variable is manipulated, so conventional point identification fails. They show that treatment effects are still partially identified and derive sharp bounds under a general model in which the extent of manipulation is learned from the data.
Heckman, J. J. (1979). Sample Selection Bias as a Specification Error.
Heckman introduces the two-step estimator for correcting sample selection bias using the inverse Mills ratio. The paper shows that selection bias can be treated as an omitted variable problem, where the omitted variable is the conditional expectation of the error term given selection. One of the most cited papers in econometrics.
Horowitz, J. L., & Manski, C. F. (2000). Nonparametric Analysis of Randomized Experiments with Missing Covariate and Outcome Data.
Horowitz and Manski extend the bounding approach to experiments with missing data on both covariates and outcomes. They show how to construct valid bounds under different assumptions about the missing data mechanism, providing a principled alternative to complete-case analysis and imputation.
Imbens, G. W., & Manski, C. F. (2004). Confidence Intervals for Partially Identified Parameters.
Imbens and Manski develop methods for constructing valid confidence intervals when parameters are only partially identified—that is, when the data and assumptions narrow the parameter to a set rather than a point. This paper provides the inferential foundation for reporting uncertainty around bounds estimates, including Lee bounds.
Lee, D. S. (2009). Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects.
Lee develops sharp nonparametric bounds on treatment effects in the presence of sample selection, requiring only a monotonicity assumption (that treatment affects selection in one direction). These bounds are widely used to address attrition and selective sample composition in randomized experiments.
Manski, C. F. (1990). Nonparametric Bounds on Treatment Effects.
Manski introduces the partial identification approach to treatment effects, showing that even without strong assumptions, one can bound causal effects using the observed data. His worst-case bounds framework lays the theoretical foundation for Lee's sharper bounds under the monotonicity assumption.
Manski, C. F. (2003). Partial Identification of Probability Distributions.
Manski's monograph provides a comprehensive treatment of partial identification, showing how to derive informative bounds on parameters of interest when point identification is not possible. This book formalizes and extends his earlier work on bounding treatment effects and is the standard reference for the theoretical framework underlying Lee bounds.
Semenova, V. (2025). Generalized Lee Bounds.
Semenova generalizes Lee bounds to allow for covariates and machine learning estimation of nuisance functions, improving the tightness of bounds while maintaining their nonparametric validity. This paper connects the Lee bounds literature to the modern machine learning causal inference literature.
Application (2)
Angrist, J., Bettinger, E., & Kremer, M. (2006). Long-Term Educational Consequences of Secondary School Vouchers: Evidence from Administrative Records in Colombia.
Angrist, Bettinger, and Kremer use administrative records to study the long-term effects of Colombia's PACES school voucher lottery, finding that vouchers increase secondary school completion rates by 15-20% and raise college admissions test scores by 0.2 standard deviations. They correct for differential test-taking rates between lottery winners and losers using bounding methods. The paper demonstrates how administrative data and lottery-based instruments enable credible long-term policy evaluation.
Kline, P., & Walters, C. R. (2016). Evaluating Public Programs with Close Substitutes: The Case of Head Start.
Kline and Walters develop a semi-parametric selection model to evaluate Head Start in the presence of close substitute preschool programs, estimating both average and marginal treatment effects. They find that Head Start's effects vary substantially with the quality of available alternatives, and that the program passes a cost-benefit test for the average participant. The paper demonstrates how accounting for alternative program availability changes the interpretation of experimental treatment effects.