When should I use Marginal Treatment Effects (MTE)?

When you have IV and want to understand how the treatment effect varies with the unobserved propensity to select into treatment, or when you need to extrapolate from LATE to policy-relevant treatment effects.

What is the key assumption of Marginal Treatment Effects (MTE)?

A valid instrument plus a threshold-crossing selection model: D = 1[P(Z) >= U_D] where U_D is the unobserved resistance. This threshold-crossing structure is equivalent to Imbens-Angrist monotonicity (Vytlacil 2002).

What is the most common mistake with Marginal Treatment Effects (MTE)?

Assuming LATE equals ATE. LATE captures the effect for compliers only. MTE reveals that treatment effects can vary systematically with selection propensity.

Marginal Treatment Effects (MTE)

Unifies IV/LATE, ATE, and ATT as weighted averages of the MTE curve -- the treatment effect as a function of unobserved resistance to treatment.

One-Line Implementation

R: ivmte(data = df, outcome = 'y', target = 'ate', m0 = ~ u + x1, m1 = ~ u + x1, propensity = treatment ~ instrument + x1)

Stata: mtefe y (treatment = instrument), polynomial(2) mte(u_D)

Python: # Manual: estimate E[Y|P(Z)=p] nonparametrically, then MTE(u) = d/dp E[Y|P=p] at p=u

Motivating Example: Expanding a Job Training Program

A government is considering expanding a subsidized job training program. An earlier randomized encouragement design estimated a local average treatment effect (LATE) of $3,200 in annual earnings gains for compliers, those induced to participate by the randomized encouragement letter.

The policy question is: if the program is expanded by relaxing eligibility rules, will the next group of participants benefit as much as the original compliers? The program director assumes yes and budgets accordingly.

But the original compliers were people at the margin of participation: those who needed only a gentle push (the encouragement letter) to enroll. The expansion targets people who did not enroll even with encouragement. These individuals have higher unobserved resistance to treatment: perhaps they face greater barriers, have lower expected returns, or are less motivated.

The MTE framework, developed by Heckman and Vytlacil (2005), reveals that the treatment effect varies systematically with the propensity to participate. When MTE declines with unobserved resistance, as it often does when individuals self-select based on expected gains, the LATE of $3,200 overstates the benefit for the next marginal participant. The policy-relevant treatment effect (PRTE) for the expansion might be only $1,800.

Without MTE, the program director would have over-predicted benefits by 78%. With MTE, she can compute the correct marginal return and design the expansion accordingly.

A. Overview

What Marginal Treatment Effects Does

The MTE framework starts with a fundamental insight: in a world with treatment effect heterogeneity, different causal estimands — ATE, ATT, LATE — are all weighted averages of the same underlying object: the MTE curve.

The MTE is defined as:

$$\text{MTE}(x, u_D) = E[Y(1) - Y(0) \mid X = x, U_D = u_D]$$

where $U_D$ is the unobserved component of the selection decision, normalized to be uniform on $[0, 1]$. An individual selects into treatment when the propensity score $P(Z)$ exceeds their unobserved resistance:

$$D = \mathbf{1}[P(Z) \geq U_D]$$

Individuals with low $u_D$ (low resistance) are eager participants — they select into treatment even when $P(Z)$ is low. Individuals with high $u_D$ (high resistance) are reluctant — they participate only when $P(Z)$ is very high.

The MTE curve traces how the treatment effect varies across this spectrum of unobserved resistance.

The Unifying Framework

Every conventional treatment effect parameter is a weighted average of MTE:

$$\Delta^j = \int_0^1 \text{MTE}(u) \cdot \omega^j(u) \, du$$

where $\omega^j(u)$ is the weight function specific to estimand $j$:

Estimand	Weight function $\omega(u)$	Interpretation
ATE	$\omega^{ATE}(u) = 1$ (uniform)	Averages over all resistance levels equally
ATT	$\omega^{ATT}(u) \propto P(P(Z) \geq u)$	Overweights eager participants (low $u_D$ ); proportional to $(1-u)$ when $P(Z)$ is uniform on $[0,1]$ (the normalized density on $[0,1]$ is $2(1-u)$ , integrating to 1)
ATU (average treatment effect on the untreated)	$\omega^{ATU}(u) \propto P(P(Z) < u)$	Overweights reluctant non-participants (high $u_D$ ); proportional to $u$ when $P(Z)$ is uniform on $[0,1]$ (normalized: $2u$ )
LATE	$\omega^{LATE}(u) = \frac{1}{p_1 - p_0}$ for $u \in [p_0, p_1]$	Uniform over the complier margin
Policy-Relevant Treatment Effect (PRTE)	$\omega^{PRTE}(u)$ depends on the policy	Weights determined by who the policy moves

When MTE is flat (constant in $u_D$), all estimands are equal: LATE = ATE = ATT = ATU. This equality is the case of no essential heterogeneity. When MTE slopes downward (positive selection on gains), eager participants benefit more and ATT > ATE > ATU; LATE takes the average height of the curve over the complier margin $[p_0, p_1]$, so it can lie above or below ATE depending on where that margin sits.
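The weighted-average formula can be checked numerically. The sketch below is an illustration rather than an estimator: it assumes a hypothetical linear curve MTE(u) = 0.5 - 0.4u, a propensity score that is uniform on [0, 1] (so the ATT and ATU weights reduce to 2(1 - u) and 2u), and a hypothetical complier margin [0.3, 0.6]. The names `mte` and `integrate` are illustrative.

```python
import numpy as np

def mte(u):
    """Hypothetical linear MTE curve that declines in resistance u."""
    return 0.5 - 0.4 * u

def integrate(values, grid):
    """Trapezoid rule on a fixed grid."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

u = np.linspace(0.0, 1.0, 10_001)

# Weight functions from the table, assuming P(Z) is uniform on [0, 1].
p0, p1 = 0.3, 0.6  # hypothetical complier margin
weights = {
    "ATE": np.ones_like(u),
    "ATT": 2.0 * (1.0 - u),
    "ATU": 2.0 * u,
    "LATE": np.where((u >= p0) & (u <= p1), 1.0 / (p1 - p0), 0.0),
}

# Each estimand is the integral of MTE(u) times its weight function.
estimands = {name: integrate(mte(u) * w, u) for name, w in weights.items()}

for name, value in estimands.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the ordering ATT > ATE > ATU appears directly, and the LATE is close to the MTE at the midpoint of the complier interval because the assumed curve is linear.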

How It Differs from Standard IV

Standard IV with a single instrument identifies a single number: the LATE for the complier subpopulation defined by that instrument. Different instruments identify different LATEs for different complier groups. MTE goes further by recovering the entire curve of treatment effects as a function of $u_D$ , from which any target parameter can be computed as a weighted average.

Common Confusions

Frequently asked questions about MTE

Q: Does MTE require a continuous instrument? Not strictly. Brinch et al. (2017) showed that MTE can be estimated with a discrete instrument, though the identified region is limited to the support of $P(Z)$ . With a binary instrument, only the average of MTE over $[P(Z=0), P(Z=1)]$ — i.e. the LATE for the complier subpopulation defined by $Z$ — is point-identified without further restrictions. Recovering the MTE curve itself requires either more values of $Z$ or shape restrictions on the curve (a polynomial in $u_D$ , monotonicity, or rank similarity).
Q: Is MTE the same as CATE? No. The conditional average treatment effect (CATE) conditions on observed covariates: $E[Y(1) - Y(0) | X = x]$. MTE conditions on unobserved resistance: $E[Y(1) - Y(0) | X = x, U_D = u]$. MTE captures heterogeneity along the dimension of selection into treatment, which is central for policy extrapolation. CATE methods (e.g., causal forests) do not address the selection dimension.
Q: When does LATE equal ATE? When MTE is flat, meaning there is no essential heterogeneity and the treatment effect does not vary with unobserved resistance. In practice, this condition is testable: a flat MTE implies that $E[Y | P(Z) = p]$ is linear in $p$, so if nonlinear terms in $P(Z)$ add no explanatory power to the outcome equation, MTE is approximately flat.
Q: What is the "support problem"? The MTE is only identified over the support of the propensity score $P(Z)$ . If $P(Z)$ ranges from 0.3 to 0.7, MTE is identified only for $u_D \in [0.3, 0.7]$ . Computing ATE requires the MTE over the entire $[0, 1]$ interval. Without full support, ATE is only partially identified, and bounds must be used (Mogstad et al., 2018).
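To make the support problem concrete, the sketch below computes crude worst-case bounds on ATE when the MTE is known only on [0.3, 0.7]. It is not the linear-programming method of Mogstad et al. (2018); it simply assumes the MTE lies between two a priori limits (`mte_min`, `mte_max`, both hypothetical) wherever it is not identified. The curve 0.5 - 0.4u and the helper `ate_bounds` are also assumptions of the example.

```python
import numpy as np

def ate_bounds(mte_fn, support, mte_min, mte_max, n_grid=4001):
    """Worst-case ATE bounds when MTE is identified only on `support`.

    Outside the support the MTE is assumed to lie in [mte_min, mte_max].
    """
    lo, hi = support
    u = np.linspace(lo, hi, n_grid)
    vals = mte_fn(u)
    # Identified part: integral of MTE over the support (trapezoid rule).
    identified = float(np.sum((vals[1:] + vals[:-1]) * np.diff(u)) / 2.0)
    # Share of the unit interval where the MTE is not identified.
    unidentified_mass = 1.0 - (hi - lo)
    return (identified + unidentified_mass * mte_min,
            identified + unidentified_mass * mte_max)

lower, upper = ate_bounds(lambda u: 0.5 - 0.4 * u, (0.3, 0.7),
                          mte_min=0.0, mte_max=1.0)
print(f"ATE is bounded in [{lower:.2f}, {upper:.2f}]")
```

With 60% of the unit interval unidentified, the bounds are wide; tighter bounds require either wider support or shape restrictions on the curve.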

B. Identification

For MTE to be identified, three conditions must hold (Heckman & Vytlacil, 2005):

Assumption 1: Valid Instrument

Plain language: The instrument $Z$ affects the outcome $Y$ only through its effect on treatment $D$ . The instrument is relevant (it shifts $P(Z)$ ), exogenous (independent of potential outcomes and unobserved resistance), and satisfies the exclusion restriction.

Formally: $(Y(0), Y(1), U_D) \perp Z | X$ and $P(Z)$ is a nontrivial function of $Z$ .

This assumption is the same requirement as for standard IV/LATE identification. The MTE framework does not weaken the instrument validity requirements.

Assumption 2: Threshold-Crossing Selection Model

Plain language: Treatment take-up is determined by a threshold-crossing rule: an individual participates when the propensity score exceeds their unobserved resistance. This rule means there exists a single latent index that governs selection, and the instrument operates through this index.

Formally: $D = \mathbf{1}[P(Z) \geq U_D]$ where $U_D \sim \text{Uniform}(0, 1)$ after normalization. The selection equation can be derived from a latent utility model: $D = \mathbf{1}[\mu_D(Z) - V \geq 0]$ where $V$ represents unobserved costs/resistance, and $U_D = F_V(V)$ is the CDF transformation.
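The normalization can be verified by simulation. The sketch below assumes a logistic distribution for $V$ and a hypothetical index $\mu_D(Z) = -0.2 + 0.8 Z$; neither choice comes from the text. It checks that the latent-utility rule and the normalized rule select the same individuals, and that $U_D = F_V(V)$ is uniform.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def logistic_cdf(v):
    """CDF of the standard logistic distribution, F_V in this sketch."""
    return 1.0 / (1.0 + np.exp(-v))

z = rng.normal(size=n)        # hypothetical instrument
v = rng.logistic(size=n)      # unobserved cost V, drawn independently of Z
mu_d = -0.2 + 0.8 * z         # hypothetical index mu_D(Z)

# Latent-utility form: D = 1[mu_D(Z) - V >= 0]
d_latent = (mu_d - v >= 0)

# Normalized form: apply F_V to both sides of mu_D(Z) >= V
u_d = logistic_cdf(v)         # U_D = F_V(V)
p_z = logistic_cdf(mu_d)      # P(Z) = F_V(mu_D(Z))
d_normalized = (p_z >= u_d)

print("share of units where both rules agree:", np.mean(d_latent == d_normalized))
print("mean of U_D:", u_d.mean())
print("treated share vs. mean P(Z):", d_latent.mean(), p_z.mean())
```

Because $F_V$ is strictly increasing, applying it to both sides of the inequality leaves the selection decision unchanged, which is why the two rules coincide.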

Assumption 3: Monotonicity of $P(Z)$ in $Z$

Plain language: Increasing the instrument value (weakly) increases the probability of treatment for all individuals. There are no "defiers" — individuals for whom a higher $Z$ reduces participation.

Formally: This condition is the standard IV monotonicity assumption, embedded in the threshold-crossing model. Because $D = \mathbf{1}[P(Z) \geq U_D]$ and $P(Z)$ is the same function for everyone, a change in $Z$ that raises $P(Z)$ can only move individuals into treatment, never out of it, so monotonicity is automatically satisfied.
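A short simulation illustrates why defiers cannot exist under the threshold-crossing model. It assumes a binary instrument with hypothetical propensity scores P(Z=0) = 0.3 and P(Z=1) = 0.6, and classifies individuals by their potential treatment status under each instrument value.

```python
import numpy as np

rng = np.random.default_rng(1)
u_d = rng.uniform(size=200_000)   # unobserved resistance U_D
p0, p1 = 0.3, 0.6                 # hypothetical P(Z=0) and P(Z=1)

d0 = p0 >= u_d                    # treatment status if Z = 0
d1 = p1 >= u_d                    # treatment status if Z = 1

groups = {
    "always-takers": d0 & d1,     # U_D <= p0
    "compliers": ~d0 & d1,        # p0 < U_D <= p1
    "never-takers": ~d0 & ~d1,    # U_D > p1
    "defiers": d0 & ~d1,          # impossible when p0 <= p1
}
shares = {name: float(mask.mean()) for name, mask in groups.items()}

for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

The complier share is approximately p1 - p0, which is also the width of the interval over which the LATE weights are positive.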

The Local IV Identification Strategy

The key identification result is:

$$\text{MTE}(x, u) = \frac{\partial E[Y \mid X = x, P(Z) = p]}{\partial p} \bigg|_{p = u}$$

The derivative of the conditional expectation of $Y$ with respect to the propensity score, evaluated at $p = u$ , gives the MTE at $u_D = u$ . Intuitively: a small increase in $P(Z)$ from $p$ to $p + dp$ induces participation by the marginal group with $U_D \approx p$ . The corresponding change in the average outcome reveals the treatment effect for this marginal group.

The local-IV identity means that variation in $P(Z)$ across different values of $Z$ traces out the MTE curve. Richer variation in the instrument (more values of $Z$ , wider support of $P(Z)$ ) identifies the MTE over a larger portion of $[0, 1]$ .

Estimation Procedure

The Local IV Approach

The workhorse estimation procedure proceeds in three steps:

Step 1: Estimate the propensity score. Estimate $\hat{P}(Z) = P(D = 1 | Z, X)$ using a probit or logit model.

Step 2: Estimate $E[Y | P(Z) = p]$ as a function of $p$ . Use either:

Parametric approach: regress $Y$ on $X$ and a polynomial in $\hat{P}$, for example $\hat{P}$, $\hat{P}^2$, and $\hat{P}^3$ with coefficients $\gamma_1, \gamma_2, \gamma_3$. Treatment $D$ does not enter this regression, because the target is $E[Y | X, P]$ rather than $E[Y | X, D, P]$.
Semiparametric approach: local polynomial regression of $Y$ on $\hat{P}$

Step 3: Differentiate to obtain MTE. Compute $\text{MTE}(u) = \partial E[Y | P = p] / \partial p$ evaluated at $p = u$:

For the parametric approach: $\text{MTE}(u) = \hat{\gamma}_1 + 2\hat{\gamma}_2 u + 3\hat{\gamma}_3 u^2$
For the semiparametric approach: the slope of the local polynomial fit at $p = u$, or numerical differentiation of the fitted curve
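The three steps can be sketched end to end on simulated data. Everything below is an assumption of the example: the data-generating process (true MTE(u) = 0.5 - 0.4u), the variable names, a logit first stage fitted by a hand-written Newton-Raphson routine, and a quadratic polynomial in the estimated propensity score with no covariates and no treatment indicator in the outcome regression.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# --- Simulated data; every name and parameter value here is hypothetical ---
z = rng.normal(size=n)                       # instrument
u_d = rng.uniform(size=n)                    # unobserved resistance U_D
p_true = 1.0 / (1.0 + np.exp(-1.5 * z))      # true propensity score P(Z)
d = (p_true >= u_d).astype(float)            # D = 1[P(Z) >= U_D]
y0 = rng.normal(scale=0.5, size=n)           # untreated outcome Y(0)
gain = 0.5 - 0.4 * u_d + rng.normal(scale=0.5, size=n)  # Y(1) - Y(0)
y = y0 + d * gain

def fit_logit(X, d, n_iter=25):
    """Logit coefficients by Newton-Raphson (plain numpy, no regularization)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (d - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Step 1: estimate the propensity score.
X_sel = np.column_stack([np.ones(n), z])
beta_hat = fit_logit(X_sel, d)
p_hat = 1.0 / (1.0 + np.exp(-X_sel @ beta_hat))

# Step 2: regress Y on a polynomial in the estimated propensity score.
P = np.column_stack([np.ones(n), p_hat, p_hat**2])
gamma, *_ = np.linalg.lstsq(P, y, rcond=None)

# Step 3: differentiate the fitted polynomial with respect to p.
def mte_hat(u):
    return gamma[1] + 2.0 * gamma[2] * u

for u in (0.2, 0.5, 0.8):
    print(f"MTE({u}): estimated {mte_hat(u):.3f}, true {0.5 - 0.4 * u:.3f}")
```

The sketch omits inference. Because the propensity score is itself estimated, standard errors for the MTE curve are usually obtained by bootstrapping all three steps together.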

C. Visual Intuition

The central object of the MTE framework is the MTE curve — a plot of $\text{MTE}(u_D)$ against $u_D \in [0, 1]$ . The shape of this curve reveals the nature of selection into treatment:

Flat MTE: No essential heterogeneity. Treatment effects are homogeneous across the selection dimension. LATE = ATE = ATT. Standard IV is sufficient.
Downward-sloping MTE: Positive selection on gains. Individuals who are most eager to participate (low $u_D$) benefit most. ATT > ATE > ATU. LATE exceeds ATE when the complier margin sits at low $u_D$ and falls below ATE when it sits at high $u_D$.
Upward-sloping MTE: Negative selection on gains (rare). Reluctant participants benefit more. ATT < ATE < ATU.
U-shaped or inverse-U MTE: Non-monotonic heterogeneity. The relationship between selection and gains is complex.

The weight functions determine how each estimand aggregates the MTE curve:

ATE weights are uniform: every point on the MTE curve receives equal weight
ATT weights tilt toward low $u_D$ : the treated population is disproportionately composed of eager participants
LATE weights are concentrated on the complier interval $[P(Z=z_0), P(Z=z_1)]$ : only the margin shifted by the instrument matters
PRTE weights depend on the specific policy: a program expansion to the next 10% weights the portion of the MTE curve corresponding to those marginal participants
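The PRTE weights can be computed from the distribution of propensity scores before and after the policy. The sketch below uses the weight formula $\omega^{PRTE}(u) = [F_P(u) - F_{P^*}(u)] / (E[P^*] - E[P])$ from Heckman and Vytlacil (2005), where $P^*$ is the propensity score under the new policy. The baseline distribution, the 10-percentage-point shift, and the MTE curve are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def mte(u):
    """Hypothetical MTE curve."""
    return 0.5 - 0.4 * u

def ecdf(sample, grid):
    """Empirical CDF: share of the sample at or below each grid value."""
    return np.searchsorted(np.sort(sample), grid, side="right") / sample.size

def integrate(values, grid):
    """Trapezoid rule on a fixed grid."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

p_base = rng.uniform(0.2, 0.7, size=50_000)    # baseline propensity scores
p_policy = np.clip(p_base + 0.10, 0.0, 1.0)    # expansion: +10 percentage points

u = np.linspace(0.0, 1.0, 2001)

# The policy moves people with resistance between their old and new propensity.
weights = (ecdf(p_base, u) - ecdf(p_policy, u)) / (p_policy.mean() - p_base.mean())

prte = integrate(mte(u) * weights, u)
weight_mass = integrate(weights, u)

print(f"PRTE: {prte:.3f}; weights integrate to {weight_mass:.3f}")
```

The weights are positive only where the policy actually changes treatment status, which is why the PRTE can differ from both the LATE and the ATE.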

D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: The marginal treatment effect is identified as the derivative of E[Y|P(Z)=p] with respect to the propensity score p, evaluated at p = u_D.

Setup. Under the threshold-crossing model, treatment selection is $D_i = \mathbf{1}[P(Z_i) \geq U_{D,i}]$ , where $P(Z) = P(D=1|Z)$ is the propensity score and $U_D \sim \text{Uniform}(0,1)$ captures unobserved resistance to treatment.

Step 1: Conditional expectation. The observed outcome conditional on the propensity score is:

$$E[Y \mid P(Z) = p] = E[Y(0)] + \int_0^p \text{MTE}(u) \, du$$

This integral relationship shows that $E[Y | P = p]$ is a running sum of the MTE curve from 0 to $p$ .
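The intermediate steps behind this expression can be written out as follows. This is a sketch that suppresses covariates and relies on the assumptions stated in the Identification section: independence of $Z$ from $(Y(0), Y(1), U_D)$ for the second line, and the uniform normalization of $U_D$ for the third.

```latex
\begin{aligned}
E[Y \mid P(Z) = p]
  &= E[Y(0) + D \, (Y(1) - Y(0)) \mid P(Z) = p] \\
  &= E[Y(0)] + E\big[(Y(1) - Y(0)) \cdot \mathbf{1}[U_D \leq p]\big] \\
  &= E[Y(0)] + \int_0^p E[Y(1) - Y(0) \mid U_D = u] \, du \\
  &= E[Y(0)] + \int_0^p \text{MTE}(u) \, du
\end{aligned}
```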

Step 2: Differentiate to recover MTE. Taking the derivative with respect to $p$ :

$$\text{MTE}(u_D) = \frac{\partial E[Y \mid P(Z) = p]}{\partial p} \bigg|_{p = u_D}$$

The MTE at any point $u_D$ is the slope of $E[Y | P = p]$ evaluated at $p = u_D$ . Intuitively, a small increase in $P(Z)$ induces the next marginal person (with $U_D = p$ ) to take treatment, and the change in the average outcome reveals their treatment effect.

Step 3: Estimands as weighted integrals. Any treatment effect parameter $\Delta$ can be written as:

$$\Delta = \int_0^1 \text{MTE}(u) \cdot \omega_\Delta(u) \, du$$

where $\omega_\Delta(u)$ is a weight function specific to the estimand:

ATE: $\omega_{ATE}(u) = 1$ (uniform)
ATT: $\omega_{ATT}(u) \propto 1 - F_P(u)$ (tilted toward eager participants; simplifies to $1 - u$ when $P(Z)$ is uniform)
LATE: $\omega_{LATE}(u) \propto \mathbf{1}[p_0 \leq u \leq p_1]$ (concentrated on compliers)

The key identification result of Heckman and Vytlacil (2005) is that the MTE is recovered as the derivative of the conditional expectation of $Y$ with respect to the propensity score $p = P(Z)$ .

Begin with the outcome equation under the threshold-crossing model. Because $D = \mathbf{1}[P(Z) \geq U_D]$ , the conditional expectation of $Y$ given $X = x$ and $P(Z) = p$ is:

$$E[Y \mid X = x, P(Z) = p] = E[Y(0) \mid X = x] + \int_0^{p} \text{MTE}(x, u) \, du$$

The integral accumulates treatment effects for all individuals whose unobserved resistance $U_D$ falls below the threshold $p$ — these individuals are the ones induced into treatment. Differentiating both sides with respect to $p$ yields:

$$\text{MTE}(x, p) = \frac{\partial E[Y \mid X = x, P(Z) = p]}{\partial p}$$

Intuitively, a marginal increase in the propensity score from $p$ to $p + dp$ induces participation by individuals with $U_D \approx p$ . The resulting change in the conditional expectation of $Y$ reveals the treatment effect for this marginal group — precisely the MTE evaluated at $u_D = p$ .

This derivative-based identification strategy is the foundation of the local IV estimator: estimate $E[Y | X, P]$ as a smooth function of $P$ (parametrically or semiparametrically), then differentiate to recover the MTE curve.
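A semiparametric version of this idea is a local linear regression of $Y$ on the propensity score, whose slope at each evaluation point estimates the MTE there. The sketch below is a simplified illustration: it treats the propensity score as known, omits covariates, and uses an Epanechnikov kernel with an arbitrary bandwidth of 0.2. The simulated data and the helper `local_linear_slope` are assumptions of the example.

```python
import numpy as np

def local_linear_slope(p, y, p_eval, bandwidth):
    """Slope of a kernel-weighted linear fit of y on p, centered at p_eval."""
    t = (p - p_eval) / bandwidth
    w = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)  # Epanechnikov kernel
    X = np.column_stack([np.ones_like(p), p - p_eval])
    XtW = X.T * w
    coef = np.linalg.solve(XtW @ X, XtW @ y)
    return float(coef[1])   # derivative of E[Y | P = p] at p_eval

# --- Simulated data; true MTE(u) = 0.5 - 0.4u (hypothetical) ---
rng = np.random.default_rng(3)
n = 200_000
p = rng.uniform(0.05, 0.95, size=n)   # propensity score, treated as known
u_d = rng.uniform(size=n)             # unobserved resistance
d = p >= u_d                          # threshold-crossing selection
y = rng.normal(scale=0.3, size=n) + d * (0.5 - 0.4 * u_d)

grid = np.array([0.25, 0.50, 0.75])
mte_hat = np.array([local_linear_slope(p, y, g, bandwidth=0.2) for g in grid])
mte_true = 0.5 - 0.4 * grid

for g, est, truth in zip(grid, mte_hat, mte_true):
    print(f"u = {g:.2f}: estimated MTE {est:.3f}, true MTE {truth:.3f}")
```

In applied work the bandwidth matters a great deal, since a derivative is being estimated; reporting results for several bandwidths is a common robustness check.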

E. Implementation

# Requires: ivmte
# Using the ivmte package (Mogstad, Santos, Torgovitsky)
library(ivmte)

# --- Step 1: Estimate the MTE via ivmte ---
# ivmte() estimates Marginal Treatment Effects using IV-like moments.
# target = "ate" extracts the Average Treatment Effect from the MTE curve.
# m0/m1 = parametric specifications for E[Y(0)|X,U] and E[Y(1)|X,U],
#   where u is the unobserved resistance to treatment (propensity score rank).
#   I(u^2) allows the MTE curve to be nonlinear in u.
# ivlike = the IV-like moment equation; for a TSLS-type moment the
#   instruments are listed after the | in the formula.
# propensity = first-stage equation for the propensity score P(D=1|Z,X).
# The excluded instrument enters through ivlike and propensity; this sketch
#   does not pass it as a separate argument.
mte_fit <- ivmte(
  data = df,
  target = "ate",
  m0 = ~ x1 + x2 + u + I(u^2),
  m1 = ~ x1 + x2 + u + I(u^2),
  ivlike = y ~ treatment + x1 + x2 | instrument + x1 + x2,
  propensity = treatment ~ instrument + x1 + x2
)
# Output: the ATE implied by the fitted specification, reported as a point
# estimate or as bounds when the specification is only partially identified.
# A declining MTE curve means those most likely to take treatment
# benefit the most (positive selection into treatment).
print(mte_fit)

F. Diagnostics

Test for Essential Heterogeneity

The first diagnostic question is whether MTE is flat. If it is, LATE = ATE and the MTE framework adds nothing beyond standard IV. Because MTE is the derivative of $E[Y \mid X, P(Z) = p]$ with respect to $p$, a flat MTE is equivalent to that conditional expectation being linear in $p$. Test by adding higher-order terms in $P(Z)$ to the outcome equation:

$$Y = \alpha + \gamma_1 P(Z) + \gamma_2 P(Z)^2 + \gamma_3 P(Z)^3 + X'\theta + \varepsilon$$

A joint F-test on $(\gamma_2, \gamma_3)$ tests whether MTE varies with $u_D$. Rejection implies essential heterogeneity, so LATE and ATE generally differ.
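One way to carry out such a test is sketched below. It relies on the fact that a flat MTE makes $E[Y | P(Z) = p]$ linear in $p$, so the sketch compares a linear fit in $p$ with a cubic fit and computes the classical F statistic for the two added terms. This linearity form of the test, the simulated data, and the function names are assumptions of the example; the propensity score is treated as known and covariates are omitted.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def linearity_f_stat(p, y):
    """F statistic for H0: the coefficients on p^2 and p^3 are both zero."""
    n = y.size
    X_restricted = np.column_stack([np.ones(n), p])
    X_unrestricted = np.column_stack([np.ones(n), p, p**2, p**3])
    rss_r, rss_u = rss(X_restricted, y), rss(X_unrestricted, y)
    q = 2  # number of restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (n - X_unrestricted.shape[1]))

def simulate(slope, n=50_000, seed=0):
    """Hypothetical data with MTE(u) = 0.5 + slope * u."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.05, 0.95, size=n)
    u_d = rng.uniform(size=n)
    d = p >= u_d
    y = rng.normal(scale=0.3, size=n) + d * (0.5 + slope * u_d)
    return p, y

f_flat = linearity_f_stat(*simulate(slope=0.0))
f_declining = linearity_f_stat(*simulate(slope=-0.8, seed=1))

print(f"F statistic, flat MTE: {f_flat:.2f}")
print(f"F statistic, declining MTE: {f_declining:.2f}")
```

For two restrictions and a large sample, the 5% critical value is roughly 3.0. With an estimated propensity score, bootstrapped critical values are a safer choice than the classical F distribution.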

Propensity Score Support

The MTE is identified only over the support of $P(Z)$ . Report:

The range of $\hat{P}(Z)$ (e.g., the 1st and 99th percentiles)
What fraction of $[0, 1]$ is covered
Whether the target parameter (e.g., ATE, which requires full support) can be point-identified or only bounded

If the support is narrow, consider using the partial identification approach of Mogstad et al. (2018).
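The support summary takes only a few lines to compute. The sketch below assumes `p_hat` holds estimated propensity scores (simulated here) and uses the 1st and 99th percentiles suggested above; the helper name `support_report` is hypothetical.

```python
import numpy as np

def support_report(p_hat, lower_pct=1.0, upper_pct=99.0):
    """Trimmed range of the propensity score and the share of [0, 1] it covers."""
    lo, hi = np.percentile(p_hat, [lower_pct, upper_pct])
    return {"lower": float(lo), "upper": float(hi), "coverage": float(hi - lo)}

rng = np.random.default_rng(5)
p_hat = rng.uniform(0.3, 0.7, size=10_000)   # stand-in for estimated scores

report = support_report(p_hat)
print(f"P(Z) support: [{report['lower']:.2f}, {report['upper']:.2f}], "
      f"covering {report['coverage']:.0%} of the unit interval")
```

A coverage well below 100% signals that ATE, ATT, and ATU rest on extrapolation or must be bounded.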

Visual Inspection of the MTE Curve

Plot the estimated MTE curve with confidence bands. Look for:

The slope: is MTE declining, rising, or flat?
Confidence band width: is the MTE precisely estimated?
Boundary behavior: are the endpoints of the identified region reliable?

Compare Estimands

Compute ATE, ATT, ATU, and LATE from the estimated MTE curve. Large differences signal important treatment effect heterogeneity and indicate that LATE should not be interpreted as a general treatment effect.

Interpreting Your Results

Reading the Output

The key outputs from an MTE analysis are:

Output	Interpretation
MTE curve	Plot of treatment effect vs. unobserved resistance. Slope reveals selection patterns.
ATE	Population-average treatment effect — uniform weight on MTE
ATT	Effect on the treated — overweights eager participants
LATE	Effect for compliers — concentrated on the instrument's margin
PRTE	Effect for the specific policy change — weights determined by the policy
Essential heterogeneity test	F-test for a flat MTE; rejection means LATE and ATE generally differ
P(Z) support	Range over which MTE is identified

What to Report

A well-reported MTE analysis should include:

The MTE curve with pointwise confidence bands
ATE, ATT, LATE computed from the MTE, with standard errors
PRTE for the specific policy change under consideration
Essential heterogeneity test (F-test for a flat MTE)
Propensity score support — range and coverage fraction
Sensitivity to polynomial order and bandwidth
First-stage diagnostics for the propensity score model
Discussion of the threshold-crossing model's plausibility

Example write-up

"We estimate the marginal treatment effect of job training on annual earnings using the local IV approach of Heckman and Vytlacil (2005). The propensity score, estimated via probit using random assignment to encouragement as the instrument, ranges from 0.15 to 0.82. The MTE curve declines from 0.48 at $u_D = 0.15$ to 0.12 at $u_D = 0.82$ (Figure 3), indicating positive selection on gains: individuals most inclined to participate benefit most. The test for essential heterogeneity rejects a flat MTE ( $F = 8.42$ , $p = 0.004$ ). ATE = 0.28 (SE = 0.08), ATT = 0.41 (SE = 0.06), LATE = 0.35 (SE = 0.07). The PRTE for expanding the program by 10 percentage points is 0.18 (SE = 0.09) — substantially below the LATE, confirming that naive extrapolation from LATE would overstate the benefits of expansion."

G. What Can Go Wrong

LATE != ATE When Treatment Effects Are Heterogeneous

MTE is flat: treatment effect does not vary with unobserved resistance. All estimands agree.

LATE = 0.35, ATE = 0.35, ATT = 0.35. The MTE curve is a horizontal line. LATE generalizes perfectly to the entire population.

Insufficient Propensity Score Support

The instrument generates wide variation in P(Z), covering most of [0, 1]. MTE is identified over a broad range.

P(Z) ranges from 0.08 to 0.91. The MTE curve is estimated precisely over [0.08, 0.91], covering 83% of the unit interval. ATE can be computed with minimal extrapolation, and bounds are tight.

Violated Threshold-Crossing Model

Selection follows a threshold-crossing rule: individuals compare their propensity score P(Z) to their private resistance U_D and participate when P(Z) >= U_D.

MTE(0.2) = 0.38 (SE = 0.07), MTE(0.5) = 0.25 (SE = 0.05), MTE(0.8) = 0.11 (SE = 0.08). The MTE curve declines smoothly in u_D, consistent with positive selection. Parametric (normal) and semiparametric (local IV) estimates agree within 0.03 at all evaluation points.

H. Practice

H.1 Concept Checks

Concept Check

The ATE weight function is uniform over [0, 1], but the LATE weight function is peaked at the complier margin. Why does this mean LATE can differ from ATE?

Because LATE uses a smaller sample size than ATE.
Because the ATE weight integrates the entire MTE curve equally, while the LATE weight concentrates on the interval [P(Z=z_0), P(Z=z_1)], so they differ whenever MTE is non-constant.
Because LATE is biased but ATE is not.
Because ATE includes never-takers and always-takers, who have zero treatment effect.

Concept Check

When does LATE equal ATE?

When the instrument is strong (high first-stage F-statistic).
When the MTE curve is flat: there is no essential heterogeneity, meaning the treatment effect does not vary with unobserved resistance to treatment.
When the sample is large enough for the central limit theorem to apply.
When there are no always-takers or never-takers in the population.

Concept Check

A researcher estimates the MTE curve using an instrument whose propensity score P(Z) ranges from 0.30 to 0.65. She then computes ATE = 0.25 by extrapolating the MTE curve to cover [0, 1]. What is wrong with this approach?

Nothing is wrong: parametric extrapolation is standard practice.
The MTE is only identified over [0.30, 0.65]. Computing ATE requires the MTE over [0, 1], so the estimate relies on untestable parametric extrapolation for 65% of the curve.
She should use a different instrument with wider support.
She should trim the sample to units with P(Z) in [0.30, 0.65] and compute a trimmed ATE.

H.2 Guided Exercise

Guided Exercise

Interpreting an MTE Analysis of Returns to College

You study the returns to college education using proximity to a four-year college as an instrument, following the MTE approach. The propensity score (probability of attending college) is estimated via probit and ranges from 0.12 to 0.78. You estimate a quadratic MTE curve and compute target parameters.

Your output:

Parameter	Estimate	SE
MTE at u_D = 0.15	0.52	0.08
MTE at u_D = 0.40	0.38	0.06
MTE at u_D = 0.60	0.25	0.07
MTE at u_D = 0.75	0.15	0.11

ATE (over identified region)	0.33	0.05
ATT	0.44	0.04
LATE (proximity IV)	0.38	0.06

Essential heterogeneity F-test: F = 6.8, p = 0.009

Sensitivity (polynomial order):
Linear MTE: ATE = 0.31, ATT = 0.42
Quadratic MTE: ATE = 0.33, ATT = 0.44
Cubic MTE: ATE = 0.34, ATT = 0.45

H.3 Error Detective

Error Detective

Read the analysis below carefully and identify the errors.

A labor economist uses MTE to evaluate a job training program. She has a binary instrument (random encouragement letter) and estimates:

"The propensity score ranges from 0.22 (no encouragement) to 0.48 (encouragement). We estimate MTE using a quadratic polynomial in P(Z). The MTE curve slopes downward from 0.45 at u_D = 0.22 to 0.28 at u_D = 0.48. We compute ATE = 0.18, ATT = 0.52, and PRTE = 0.24 for a 20-percentage-point program expansion.

We conclude that the program should be expanded because even the PRTE of 0.24 is economically significant."

She does not report confidence intervals, the essential heterogeneity test, or sensitivity to polynomial order.

Select all errors you can find:

ATE relies on extrapolation beyond the identified region (ATE computation)

No confidence intervals or standard errors reported (Missing uncertainty measures)

No essential heterogeneity test (Missing diagnostic)

No sensitivity to polynomial order (Missing robustness check)

Error Detective

Read the analysis below carefully and identify the errors.

A health economist estimates MTE of a medication on health outcomes using physician prescribing tendency as an instrument. She reports:

"Using physician prescribing tendency as an instrument, we estimate the propensity score and compute the MTE curve. The MTE is positive and roughly constant at 0.30 across the identified region [0.15, 0.85]. The essential heterogeneity test is insignificant (F = 1.2, p = 0.31). We report ATE = 0.30 and LATE = 0.30.

To compute the PRTE for expanding insurance coverage, we extrapolate the MTE beyond the identified region into [0.85, 1.0] and find PRTE = 0.28 for a 10-percentage-point coverage expansion."

She computes PRTE by extending the flat MTE to u_D values beyond 0.85.

Select all errors you can find:

PRTE extrapolation beyond the support is unjustified even with a flat MTE (PRTE computation)

Flat MTE makes the elaborate MTE machinery unnecessary (Overall approach)

Physician prescribing tendency exclusion restriction not discussed (Identification)

H.4 You Are the Referee

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors study the returns to a subsidized vocational training program using the MTE framework. They use distance to the nearest training center as an instrument for participation, arguing it satisfies the exclusion restriction. The propensity score, estimated via probit, ranges from 0.18 to 0.62. They estimate a linear MTE curve and find a downward slope, concluding that eager participants benefit more than reluctant ones. They then extrapolate the MTE to the full [0, 1] interval and report ATE = 0.23, ATT = 0.32, and LATE = 0.24. They recommend expanding the program based on the positive ATE.

Key Table

Variable	Coefficient	SE	p-value
MTE intercept	0.42	0.09	<0.001
MTE slope	-0.38	0.18	0.035
ATE (extrapolated)	0.23	0.07	0.001
ATT	0.32	0.05	<0.001
LATE	0.24	0.06	<0.001
P(Z) support	[0.18, 0.62]
Essential heterog. F	4.5		0.035
N	3,800

Authors' Identification Claim

The authors argue that distance to the training center shifts participation without directly affecting earnings, and that the MTE framework allows them to recover the full curve of treatment effects and extrapolate to the population ATE.

I. Swap-In: When to Use Something Else

IV/2SLS: when you only need the LATE and do not need to extrapolate to other target parameters. If essential heterogeneity is not a concern (or not testable), standard IV is simpler and more robust.
Matching: when selection is on observables and you want ATE or ATT. Matching addresses a different selection problem (selection on $X$ ) than MTE (selection on $U_D$ ).
Causal Forests: when you want to estimate heterogeneous treatment effects as a function of observed covariates. Causal forests estimate $\text{CATE}(x)$ but do not address selection on unobservables.
Heckman Selection Model: when the selection issue is sample selection (observing outcomes only for a non-random subsample) rather than treatment effect heterogeneity along the selection dimension. The Heckman model and MTE share the same structural foundation but answer different questions.

Limitations

Requires sufficient variation in the propensity score. With a binary instrument and no further restrictions, only the average MTE over the interval $[P(Z=0), P(Z=1)]$ (the LATE) is identified, not the shape of the curve within it. A narrow support means ATE and ATT cannot be point-identified without parametric extrapolation.
Threshold-crossing model is restrictive. The model $D = \mathbf{1}[P(Z) \geq U_D]$ assumes a single latent index governs selection. This assumption may not hold with multiple treatments, strategic interactions, or complex decision processes.
Computationally intensive. Semiparametric MTE estimation requires local polynomial regression and numerical differentiation, which can be sensitive to bandwidth and polynomial order choices.
Requires a valid instrument. All the standard IV assumptions (relevance, exogeneity, exclusion restriction) must hold. MTE does not relax these requirements; it adds the threshold-crossing structure.
Precision can be poor. The MTE involves estimating a derivative (a second-order object), which is inherently noisier than estimating a conditional mean (a first-order object). Confidence bands on the MTE curve can be wide, especially near the boundaries of the propensity score support.

J. Reviewer Checklist

Critical Reading Checklist

Is the instrument valid? Are relevance, exogeneity, and the exclusion restriction discussed?
Is the propensity score support reported? What fraction of [0,1] is covered?
Is the MTE curve plotted with confidence bands?
Is the essential heterogeneity test reported?
Are multiple target parameters (ATE, ATT, LATE, PRTE) compared?
Is the threshold-crossing model plausible for this setting?
Are results robust to polynomial order and bandwidth?
If PRTE is computed, is the policy change clearly defined?

Paper Library

Foundational (3)

Brinch, C. N., Mogstad, M., & Wiswall, M. (2017). Beyond LATE with a Discrete Instrument.

Journal of Political Economy. DOI: 10.1086/692712

Brinch, Mogstad, and Wiswall show how to estimate the MTE curve semiparametrically even with a discrete (binary or multivalued) instrument, which is a common case in practice. They demonstrate that the local IV approach can be implemented with discrete instruments by imposing additive separability between observed covariates and unobserved heterogeneity along with a parametric structure on the MTE. Applied to the quantity-quality tradeoff of children using twin births and sibling sex composition as instruments for family size, they find that MTE varies with unobserved resistance to having additional children, demonstrating how discrete instruments can recover policy-relevant heterogeneity beyond LATE.

Heckman, J. J., & Vytlacil, E. (2005). Structural Equations, Treatment Effects, and Econometric Policy Evaluation.

Econometrica. DOI: 10.1111/j.1468-0262.2005.00594.x

Heckman and Vytlacil use the marginal treatment effect (MTE) to connect the treatment-effects literature with structural econometric policy evaluation. A central result is that commonly used treatment-effect parameters (ATE, ATT, LATE, PRTE) can be expressed as weighted averages of the MTE curve, with each estimand using a different weight function. The framework shows how IV estimates with different instruments recover different weighted averages of the same underlying MTE, providing the theoretical foundation for understanding instrument-dependent variation in treatment-effect estimates.

Mogstad, M., Santos, A., & Torgovitsky, A. (2018). Using Instrumental Variables for Inference about Policy Relevant Treatment Parameters.

Econometrica. DOI: 10.3982/ECTA15463

Mogstad, Santos, and Torgovitsky develop a framework for using instrumental variables to conduct inference on policy-relevant treatment effects under weaker assumptions than full MTE identification. They show that even when the MTE is only partially identified (due to limited support of the propensity score), informative bounds on ATE, ATT, and PRTE can be derived by combining the identified portion of the MTE with shape restrictions. Their approach uses linear programming to compute sharp bounds on the target parameter given the data and assumptions. The paper provides the R package ivmte for implementation and demonstrates that useful policy conclusions can be drawn even without point-identifying the entire MTE curve.

Application (1)

Cornelissen, T., Dustmann, C., Raute, A., & Schonberg, U. (2016). From LATE to MTE: Alternative Methods for the Evaluation of Policy Interventions.

Labour Economics. DOI: 10.1016/j.labeco.2016.06.004

Cornelissen, Dustmann, Raute, and Schonberg provide an accessible methodological guide to MTE estimation, covering the theoretical foundations and practical steps for moving from LATE to the full marginal treatment effect curve. The paper explains how to use local instrumental variables to trace out how treatment effects vary with individuals' unobserved propensity to participate. It serves as a tutorial for applied researchers seeking to implement MTE methods, with clear exposition of identification, estimation, and interpretation.
