Tutorial · 90 minutes

Lab: Quantile Treatment Effects from Scratch

Implement quantile treatment effects step by step. Simulate heterogeneous treatment effects across the earnings distribution, estimate conditional QTEs via quantile regression, compute unconditional QTEs via RIF regression, and compare with OLS ATE.

Method: Quantile Treatment Effects (QTE)

Languages: Python, R, Stata

Dataset: Job training program earnings (simulated)

Overview

The average treatment effect (ATE) summarizes a treatment's impact with a single number — the mean difference. But a treatment can affect different parts of the outcome distribution differently. A job training program might help low earners substantially while barely affecting high earners, or it might compress the distribution by raising the floor without changing the ceiling. Quantile treatment effects (QTEs) reveal these distributional patterns.
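To make the contrast concrete, here is a minimal sketch comparing two hypothetical programs that share the same mean effect. It is not part of the lab's dataset: the baseline distribution, the effect sizes, and the names `y1_a` and `y1_b` are assumptions chosen only for illustration.

```r
set.seed(1)
n <- 100000
y0 <- rnorm(n, mean = 20000, sd = 5000)   # hypothetical untreated earnings

# Program A: everyone gains 1,000
y1_a <- y0 + 1000
# Program B: gains shrink with baseline rank, from about 2,000 at the
# bottom of the distribution to about 0 at the top
rank0 <- rank(y0) / n
y1_b <- y0 + 2000 * (1 - rank0)

probs <- c(0.10, 0.50, 0.90)
ate <- c(A = mean(y1_a - y0), B = mean(y1_b - y0))
qte <- rbind(
  A = quantile(y1_a, probs) - quantile(y0, probs),
  B = quantile(y1_b, probs) - quantile(y0, probs)
)
print(round(ate))   # mean effects of the two programs
print(round(qte))   # quantile differences at the 10th, 50th, 90th percentiles
```

Both programs have an ATE of about 1,000, yet program B's quantile differences fall from roughly 1,800 at the 10th percentile to roughly 200 at the 90th. The mean alone cannot distinguish the two.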

What you will learn:

Why the ATE can miss important heterogeneity across the outcome distribution
How to estimate conditional QTEs using quantile regression
How to estimate unconditional QTEs using RIF (Recentered Influence Function) regression
The critical difference between conditional and unconditional quantile effects
How to test for treatment effect heterogeneity across quantiles
How to interpret a QTE curve

Prerequisites: OLS regression, basic understanding of quantiles and distributions.

Step 1: Simulate Heterogeneous Treatment Effects

We simulate a job training program where the treatment effect varies across the earnings distribution: large gains for low earners, modest gains for high earners.

library(quantreg)

set.seed(42)
n <- 5000

# Covariates
educ <- round(rnorm(n, mean = 12, sd = 2))
exper <- pmax(round(rnorm(n, mean = 10, sd = 5)), 0)
female <- rbinom(n, 1, 0.5)

# Treatment assignment (randomized experiment)
treat <- rbinom(n, 1, 0.5)

# Potential outcomes WITHOUT treatment
# Base earnings depend on education, experience, and unobserved heterogeneity.
# The intercept of 8.5 puts median base earnings near $16,000, so baseline
# variation is large relative to the treatment effect and adding the effect
# does not flip the ordering of earners who differ only in u.
u <- rnorm(n)  # Unobserved ability/luck
earnings_0 <- exp(8.5 + 0.08 * educ + 0.03 * exper - 0.15 * female + 0.4 * u)

# Treatment effect: HETEROGENEOUS across the distribution
# Larger effects for low earners (those with low u), smaller for high earners.
# Effect in levels: about 5,500 at the 10th percentile of u and about 1,500
# at the 90th. pnorm(u) is uniform on (0, 1), so the mean effect is 3,500.
te_individual <- pmax(6000 - 5000 * pnorm(u), 500)

# Potential outcomes WITH treatment
earnings_1 <- earnings_0 + te_individual

# Observed outcome
earnings <- ifelse(treat == 1, earnings_1, earnings_0)

df <- data.frame(earnings, treat, educ, exper, female, u)

# True ATE and true unconditional QTEs (known here only because both
# potential outcomes are simulated)
true_ate <- mean(te_individual)
cat("True ATE:", round(true_ate, 0), "\n")
for (q in c(0.10, 0.25, 0.50, 0.75, 0.90)) {
  qte_q <- quantile(earnings_1, q) - quantile(earnings_0, q)
  cat(sprintf("True QTE at q=%.2f: %.0f\n", q, qte_q))
}

Expected output:

Quantile	True QTE
0.10	~4,000–6,000
0.25	~3,500–5,000
0.50	~2,500–4,000
0.75	~1,500–3,000
0.90	~800–2,000
ATE	~3,000–4,000

The treatment effect is largest at the bottom of the distribution (low earners benefit most) and smallest at the top. The ATE masks this important heterogeneity.
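A true QTE computed as a difference of quantiles matches the effect experienced by people at that rank only when treatment does not reorder individuals. The sketch below illustrates this under assumed, hypothetical baselines (`y0_wide`, `y0_narrow`). It is a sanity check worth running on any simulation design, not a description of the lab's data.

```r
set.seed(2)
n <- 100000
u <- rnorm(n)
te <- 6000 - 3000 * pnorm(u)      # individual effect, declining in u

# Difference between treated and untreated quantiles at probabilities p
qte <- function(y1, y0, p) unname(quantile(y1, p) - quantile(y0, p))

# Case 1: baseline spread is wide relative to the effects, ranks are preserved
y0_wide <- 20000 + 8000 * u
# Case 2: baseline spread is tiny relative to the effects, ranks reverse
y0_narrow <- 100 + 10 * u

p <- c(0.10, 0.90)
qte_wide <- qte(y0_wide + te, y0_wide, p)
qte_narrow <- qte(y0_narrow + te, y0_narrow, p)
te_at_rank <- 6000 - 3000 * p     # effect for someone at rank p of u

print(round(rbind(te_at_rank, qte_wide, qte_narrow)))
```

In the wide-baseline case the QTEs track the individual effects at each rank (declining across quantiles). In the narrow-baseline case the effect dominates the outcome, low-`u` individuals end up at the top of the treated distribution, and the QTE curve slopes upward even though every individual effect still declines in `u`.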

Step 2: OLS Average Treatment Effect

Start with the standard ATE estimate for comparison.

# OLS ATE (simple difference in means for RCT)
ate_simple <- mean(df$earnings[df$treat == 1]) - mean(df$earnings[df$treat == 0])
cat("ATE (simple difference):", round(ate_simple, 0), "\n")

# OLS with controls
ols_model <- lm(earnings ~ treat + educ + exper + female, data = df)
cat("ATE (OLS with controls):", round(coef(ols_model)["treat"], 0), "\n")
cat("True ATE:", round(true_ate, 0), "\n")
cat(sprintf("\nThe ATE tells us the program raises earnings by ~$%.0f on average.\n",
            coef(ols_model)["treat"]))
cat("But does it help everyone equally? QTEs will answer this.\n")

Expected output:

Estimator	ATE Estimate	True ATE
Simple difference	~3,000–4,000	~3,500
OLS with controls	~3,000–4,000	~3,500

The ATE estimate is a single summary number. It conceals the fact that the treatment effect varies substantially across the earnings distribution.
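As a small aside on why the two estimators agree so closely in a randomized experiment: regressing the outcome on the treatment dummy alone reproduces the difference in means exactly, and adding controls mainly affects precision. The sketch below uses a separate, hypothetical outcome `y` (not the lab's `df`) and a simple unequal-variance standard error.

```r
set.seed(3)
n <- 2000
treat <- rbinom(n, 1, 0.5)
y <- 15000 + 3500 * treat + rnorm(n, sd = 8000)   # hypothetical outcome

diff_means <- mean(y[treat == 1]) - mean(y[treat == 0])
ols_coef <- unname(coef(lm(y ~ treat))["treat"])

# Standard error of the difference in means (unequal variances)
se_diff <- sqrt(var(y[treat == 1]) / sum(treat == 1) +
                var(y[treat == 0]) / sum(treat == 0))

cat("Difference in means:", round(diff_means), "\n")
cat("OLS coefficient:    ", round(ols_coef), "\n")
cat("95% CI:", round(diff_means - 1.96 * se_diff), "to",
    round(diff_means + 1.96 * se_diff), "\n")
```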

Step 3: Conditional Quantile Treatment Effects

Conditional QTEs estimate how the treatment shifts the conditional quantile of earnings (conditional on covariates) at each quantile level. This uses standard quantile regression.

# Conditional quantile regression at multiple quantiles
taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
cqte_results <- data.frame(tau = numeric(), coeff = numeric(),
                           se = numeric(), lower = numeric(), upper = numeric())

for (tau in taus) {
  qr_model <- rq(earnings ~ treat + educ + exper + female, data = df, tau = tau)
  # Bootstrap standard errors (200 replications)
  s <- summary(qr_model, se = "boot", R = 200)

  coeff <- s$coefficients["treat", "Value"]
  se <- s$coefficients["treat", "Std. Error"]

  cqte_results <- rbind(cqte_results, data.frame(
    tau = tau, coeff = coeff, se = se,
    lower = coeff - 1.96 * se, upper = coeff + 1.96 * se
  ))
}

cat("=== Conditional Quantile Treatment Effects ===\n")
cat(sprintf("%-6s %10s %10s %10s\n", "Tau", "CQTE", "SE", "OLS ATE"))
for (i in 1:nrow(cqte_results)) {
  cat(sprintf("%-6.2f %10.0f %10.0f %10.0f\n",
              cqte_results$tau[i], cqte_results$coeff[i],
              cqte_results$se[i], coef(ols_model)["treat"]))
}

cat("\nThe CQTE is larger at lower quantiles: the treatment\n")
cat("helps low earners more than high earners (conditional on X).\n")

Expected output:

Quantile (tau)	CQTE	SE	OLS ATE
0.10	~4,000–5,500	~500–800	~3,500
0.25	~3,500–4,500	~400–600	~3,500
0.50	~3,000–4,000	~400–600	~3,500
0.75	~2,000–3,000	~400–700	~3,500
0.90	~1,000–2,000	~600–1,000	~3,500

The conditional QTE is monotonically decreasing across quantiles: the treatment helps low earners (conditional on covariates) more than high earners. The ATE of approximately 3,500 averages over these effects and therefore masks the heterogeneity.
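Quantile regression works by minimizing the check (pinball) loss rather than squared error. As a minimal sketch of that idea, the base-R code below shows that minimizing the average check loss over a single constant recovers the sample quantile. The helper names `check_loss` and `quantile_by_loss` and the exponential outcome are illustrative assumptions, and this is not how `rq()` is implemented internally.

```r
# Check (pinball) loss: rho_tau(r) = r * (tau - 1{r < 0})
check_loss <- function(r, tau) r * (tau - (r < 0))

set.seed(4)
y <- rexp(5001, rate = 1 / 20000)   # hypothetical skewed outcome

quantile_by_loss <- function(y, tau) {
  # Minimize the average check loss over a constant q
  optimize(function(q) mean(check_loss(y - q, tau)),
           interval = range(y))$minimum
}

taus_demo <- c(0.10, 0.50, 0.90)
by_loss <- sapply(taus_demo, quantile_by_loss, y = y)
by_sort <- unname(quantile(y, taus_demo))
print(round(cbind(tau = taus_demo, by_loss, by_sort)))
```

Replacing the constant `q` with a linear index in the covariates gives the conditional quantile regression problem; the coefficient on `treat` is then the shift in the tau-th conditional quantile.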

Concept Check

What is the difference between a conditional quantile treatment effect (CQTE) and an unconditional quantile treatment effect (UQTE)?

They are the same thing: both estimate the treatment effect at different quantiles of the outcome.
The CQTE estimates the effect at quantiles of Y|X (the conditional distribution), while the UQTE estimates the effect at quantiles of Y (the marginal distribution). They answer different questions.
The CQTE is biased while the UQTE is unbiased.
The UQTE requires instrumental variables while the CQTE does not.

Step 4: Unconditional QTEs via RIF Regression

The unconditional QTE estimates the effect at quantiles of the marginal distribution of Y. We use the Recentered Influence Function (RIF) approach of Firpo et al. (2009).
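Before the full implementation, a minimal sketch of two properties of the RIF may help. It uses a hypothetical outcome `y` with a single binary regressor; the variable names and parameter values are assumptions for illustration only.

```r
set.seed(5)
n <- 20000
treat <- rbinom(n, 1, 0.5)
y <- rnorm(n, mean = 20000 + 3000 * treat, sd = 6000)  # hypothetical outcome

tau <- 0.25
q_tau <- unname(quantile(y, tau))
dens <- density(y)
f_q <- approx(dens$x, dens$y, xout = q_tau)$y   # density estimate at q_tau

rif <- q_tau + (tau - (y <= q_tau)) / f_q

# Property 1: the RIF averages to the quantile it is built around
mean_rif <- mean(rif)

# Property 2: with a single binary regressor, the OLS slope on the RIF is the
# gap in the share of observations at or below q_tau, rescaled by 1 / f_q
rif_slope <- unname(coef(lm(rif ~ treat))["treat"])
share_gap <- mean(y[treat == 0] <= q_tau) - mean(y[treat == 1] <= q_tau)

cat("q_tau:", round(q_tau), " mean(RIF):", round(mean_rif), "\n")
cat("RIF slope:", round(rif_slope), " share gap / f_q:", round(share_gap / f_q), "\n")
```

The second property is why RIF-OLS is usually read as a first-order approximation: the slope converts a shift in the CDF at `q_tau` into a shift in the quantile using the pooled density, so it can drift from the exact quantile difference when the treatment effect is large.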

# RIF regression for unconditional quantile effects
# RIF(Y; q_tau) = q_tau + (tau - I(Y <= q_tau)) / f_Y(q_tau)
# where q_tau is the tau-th quantile of Y and f_Y is the density of Y

uqte_results <- data.frame(tau = numeric(), uqte = numeric(),
                           se = numeric())

# Kernel density of earnings (the same for every tau, so estimate it once)
dens <- density(df$earnings, n = 1024)

for (tau in taus) {
  # Step 1: Estimate the unconditional quantile q_tau
  q_tau <- unname(quantile(df$earnings, tau))

  # Step 2: Evaluate the density estimate at q_tau
  f_q <- approx(dens$x, dens$y, xout = q_tau)$y

  # Step 3: Compute the RIF for each observation
  rif <- q_tau + (tau - as.numeric(df$earnings <= q_tau)) / f_q

  # Step 4: Run OLS with RIF as the dependent variable
  df$rif <- rif
  rif_model <- lm(rif ~ treat + educ + exper + female, data = df)

  # Note: these OLS standard errors treat q_tau and f_q as known;
  # bootstrap the whole procedure if you need inference that accounts for them
  uqte_results <- rbind(uqte_results, data.frame(
    tau = tau,
    uqte = unname(coef(rif_model)["treat"]),
    se = summary(rif_model)$coef["treat", "Std. Error"]
  ))
}

cat("=== Unconditional QTEs (RIF Regression) ===\n")
cat(sprintf("%-6s %10s %10s %10s\n", "Tau", "UQTE", "CQTE", "OLS ATE"))
for (i in 1:nrow(uqte_results)) {
  cat(sprintf("%-6.2f %10.0f %10.0f %10.0f\n",
              uqte_results$tau[i], uqte_results$uqte[i],
              cqte_results$coeff[i], coef(ols_model)["treat"]))
}

Expected output:

Quantile	UQTE (RIF)	CQTE	OLS ATE
0.10	~4,500–6,000	~4,500	~3,500
0.25	~3,500–5,000	~4,000	~3,500
0.50	~3,000–4,000	~3,500	~3,500
0.75	~2,000–3,000	~2,500	~3,500
0.90	~800–2,000	~1,500	~3,500

Both CQTE and UQTE show declining treatment effects across quantiles, but the magnitudes may differ because they measure effects at different points — conditional vs. unconditional quantiles.
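The gap between the two estimands can be large when covariates explain much of the outcome. The following sketch is a deliberately stylized case with two well-separated groups and an effect that declines in within-group rank. The covariate `group`, the units, and the effect function are hypothetical, and both potential outcomes are used directly so that no estimation error is involved.

```r
set.seed(6)
n <- 100000
group <- rep(c(0, 1), each = n / 2)     # hypothetical covariate: low/high skill
e <- rnorm(n)                           # drives rank within each group
y0 <- 20 * group + e                    # the two groups do not overlap
y1 <- y0 + 2 * (1 - pnorm(e))           # effect declines in within-group rank

tau <- 0.60
# Conditional QTE: quantile gap within each group
cqte <- sapply(c(0, 1), function(g) {
  unname(quantile(y1[group == g], tau) - quantile(y0[group == g], tau))
})
# Unconditional QTE: quantile gap in the pooled distribution
uqte <- unname(quantile(y1, tau) - quantile(y0, tau))

cat("CQTE at tau = 0.60 (group 0, group 1):", round(cqte, 2), "\n")
cat("UQTE at tau = 0.60:", round(uqte, 2), "\n")
```

Within each group the 60th percentile gains about 0.8, but the 60th percentile of the pooled distribution sits at roughly the 20th percentile of the high group, where the gain is about 1.6. Same tau, different people, different answer.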

Concept Check

For policy purposes, a government wants to know: 'Does this job training program help the bottom 10% of earners?' Which estimand should they use?

The conditional QTE at tau = 0.10, because it conditions on observable characteristics.
The unconditional QTE at tau = 0.10, because it measures the effect at the 10th percentile of the overall earnings distribution, which directly answers the policy question.
The OLS ATE, because it gives the overall average effect.
Neither: you need a different method entirely.

Step 5: Test for Treatment Effect Heterogeneity

Is the treatment effect truly heterogeneous across quantiles, or is the pattern just noise? We test whether the QTEs are significantly different from each other.

# Point estimates at the two ends of the distribution
qr_10 <- rq(earnings ~ treat + educ + exper + female, data = df, tau = 0.10)
qr_90 <- rq(earnings ~ treat + educ + exper + female, data = df, tau = 0.90)

cat("CQTE at tau = 0.10:", coef(qr_10)["treat"], "\n")
cat("CQTE at tau = 0.90:", coef(qr_90)["treat"], "\n")
cat("Difference:", coef(qr_10)["treat"] - coef(qr_90)["treat"], "\n")

# Formal test: is the CQTE at 0.10 different from the CQTE at 0.90?
# For fits at different taus, anova() runs a Wald test of equal slopes.
# joint = FALSE reports one test per coefficient, so read the `treat` row.
cat("\nWald test, tau = 0.10 vs tau = 0.90:\n")
print(anova(qr_10, qr_90, joint = FALSE))

# Same test across all five quantiles at once
qr_process <- rq(earnings ~ treat + educ + exper + female,
                 data = df, tau = taus)
wald_test <- anova(qr_process, joint = FALSE)
cat("\nWald test for equality of QTEs across quantiles:\n")
print(wald_test)

# Effect on the (conditional) interquartile range
qr_25 <- rq(earnings ~ treat + educ + exper + female, data = df, tau = 0.25)
qr_75 <- rq(earnings ~ treat + educ + exper + female, data = df, tau = 0.75)
iqr_effect <- coef(qr_75)["treat"] - coef(qr_25)["treat"]
cat("\nEffect on IQR (q75 - q25):", round(iqr_effect, 0), "\n")
cat("Negative = treatment compresses the distribution\n")

Expected output:

Test	Statistic	p-value	Interpretation
CQTE(0.10) vs CQTE(0.90)	z ~ 3–5	< 0.01	Significant heterogeneity
Effect on IQR	~-1,500 to -3,000	—	Treatment compresses distribution

The test should reject the null of equal QTEs: the treatment effect is significantly larger at the 10th percentile than at the 90th percentile. The negative IQR effect confirms that the treatment compresses the earnings distribution — it is an equalizing intervention.
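For unconditional QTEs in a randomized experiment, a similar comparison can be sketched in base R by bootstrapping the difference between two quantile gaps. This is an alternative illustration, not the `quantreg` Wald test used above. The data-generating process, the helper `qte_gap`, and the choice of 500 replications are all assumptions.

```r
set.seed(7)
n <- 4000
treat <- rbinom(n, 1, 0.5)
u <- rnorm(n)
y0 <- exp(9.7 + 0.45 * u)                       # hypothetical baseline earnings
y <- y0 + treat * (6000 - 5000 * pnorm(u))      # hypothetical declining effect

# QTE(lo) - QTE(hi), each QTE being a treated-minus-control quantile gap
qte_gap <- function(y, treat, lo = 0.10, hi = 0.90) {
  q1 <- quantile(y[treat == 1], c(lo, hi))
  q0 <- quantile(y[treat == 0], c(lo, hi))
  unname((q1[1] - q0[1]) - (q1[2] - q0[2]))
}

gap_hat <- qte_gap(y, treat)

# Nonparametric bootstrap: resample rows, recompute the gap
boot_gaps <- replicate(500, {
  idx <- sample.int(n, replace = TRUE)
  qte_gap(y[idx], treat[idx])
})

se_gap <- sd(boot_gaps)
z_stat <- gap_hat / se_gap
p_value <- 2 * pnorm(-abs(z_stat))
cat("QTE(0.10) - QTE(0.90):", round(gap_hat), " SE:", round(se_gap),
    " z:", round(z_stat, 2), " p:", signif(p_value, 2), "\n")
```

The normal approximation for the z statistic is itself an assumption; a percentile interval from `boot_gaps` is a reasonable cross-check.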

Step 6: The QTE Curve

Plot the full QTE curve across all quantiles to visualize the heterogeneity pattern.

# Estimate CQTEs at a fine grid of quantiles
tau_grid <- seq(0.05, 0.95, by = 0.05)
qte_curve <- data.frame(tau = numeric(), qte = numeric(),
                        lower = numeric(), upper = numeric())

for (tau in tau_grid) {
  qr <- rq(earnings ~ treat + educ + exper + female, data = df, tau = tau)
  s <- summary(qr, se = "boot", R = 200)
  coeff <- s$coefficients["treat", "Value"]
  se <- s$coefficients["treat", "Std. Error"]

  qte_curve <- rbind(qte_curve, data.frame(
    tau = tau, qte = coeff,
    lower = coeff - 1.96 * se, upper = coeff + 1.96 * se
  ))
}

# Plot the point estimates, the pointwise 95% band, and the OLS ATE
plot(qte_curve$tau, qte_curve$qte, type = "l", lwd = 2, col = "blue",
     xlab = "Quantile (tau)", ylab = "Treatment Effect ($)",
     main = "Quantile Treatment Effect Curve",
     ylim = range(c(qte_curve$lower, qte_curve$upper)))
polygon(c(qte_curve$tau, rev(qte_curve$tau)),
        c(qte_curve$lower, rev(qte_curve$upper)),
        col = rgb(0, 0, 1, 0.1), border = NA)
abline(h = coef(ols_model)["treat"], col = "red", lty = 2, lwd = 2)
legend("topright", legend = c("CQTE", "95% CI", "OLS ATE"),
       col = c("blue", rgb(0, 0, 1, 0.3), "red"),
       lty = c(1, NA, 2), lwd = c(2, NA, 2),
       fill = c(NA, rgb(0, 0, 1, 0.1), NA), border = NA)

Expected output:

The QTE curve should show a monotonically declining pattern: large positive effects at low quantiles (around $4,000–6,000 at the 10th percentile) declining to smaller positive effects at high quantiles (around $1,000–2,000 at the 90th percentile). The OLS ATE (horizontal dashed line) runs through the middle, showing how the single average masks the heterogeneity.

Step 7: Comparison Summary

# Final summary table
# Note: the "True" column is the unconditional QTE (quantile gap between the
# simulated potential outcomes). It is the natural benchmark for the UQTE rows;
# the CQTE rows target a different estimand and need not match it exactly.
cat("================================================================\n")
cat("                    TREATMENT EFFECT SUMMARY                     \n")
cat("================================================================\n")
cat(sprintf("%-30s %10s %10s\n", "Estimand", "Estimate", "True"))
cat("----------------------------------------------------------------\n")
cat(sprintf("%-30s %10.0f %10.0f\n", "OLS ATE",
            coef(ols_model)["treat"], true_ate))
cat("----------------------------------------------------------------\n")

for (i in 1:nrow(cqte_results)) {
  true_qte <- quantile(earnings_1, cqte_results$tau[i]) -
    quantile(earnings_0, cqte_results$tau[i])
  cat(sprintf("%-30s %10.0f %10.0f\n",
              paste0("CQTE at tau=", cqte_results$tau[i]),
              cqte_results$coeff[i], true_qte))
}

cat("----------------------------------------------------------------\n")
for (i in 1:nrow(uqte_results)) {
  true_qte <- quantile(earnings_1, uqte_results$tau[i]) -
    quantile(earnings_0, uqte_results$tau[i])
  cat(sprintf("%-30s %10.0f %10.0f\n",
              paste0("UQTE (RIF) at tau=", uqte_results$tau[i]),
              uqte_results$uqte[i], true_qte))
}

cat("================================================================\n")
cat("\nKey finding: the program reduces earnings inequality by\n")
cat("helping low earners substantially more than high earners.\n")

Step 8: Exercises

Guided Exercise

Interpreting a QTE Curve

You estimate a QTE curve for a minimum wage increase on worker earnings. The UQTE at tau = 0.10 is $800 (significant), at tau = 0.50 is $200 (insignificant), and at tau = 0.90 is -$50 (insignificant). The OLS ATE is $300.

Uniform treatment effects. Modify the simulation so that the treatment effect is the same for everyone (no heterogeneity). Re-estimate the QTE curve. It should be approximately flat, confirming that the method does not spuriously detect heterogeneity.
Endogenous treatment. In many applications, treatment is not randomly assigned. Implement an IV quantile regression (Chernozhukov and Hansen (2005)) to estimate QTEs with endogenous treatment.
Unconditional QTE via CDF inversion. An alternative to RIF regression is to estimate the entire counterfactual CDFs for treated and control groups and then invert them. Implement this approach and compare.

Key Takeaways

The average treatment effect (ATE) is a single summary that can mask important heterogeneity across the outcome distribution
Conditional quantile treatment effects (CQTEs) estimate the effect at quantiles of Y|X (the conditional distribution); unconditional QTEs (UQTEs) estimate the effect at quantiles of Y (the marginal distribution)
For policy questions about specific population segments (e.g., the bottom 10% of earners), unconditional QTEs are more directly relevant
RIF regression provides a straightforward way to estimate unconditional QTEs using standard OLS on a transformed dependent variable
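A declining QTE curve indicates the treatment raises the lower quantiles more than the upper ones, compressing the distribution; reading this as "low earners gain more than high earners" additionally assumes that treatment does not reorder individuals
Always test for treatment effect heterogeneity: if the QTE curve is statistically flat, the ATE is an adequate summary of the distributional effect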
QTE curves should be plotted with confidence bands to distinguish genuine heterogeneity from noise
The distinction between conditional and unconditional quantile effects is fundamental and often confused in applied work
