MethodAtlas
Replication, 120 minutes

Replication Lab: The China Syndrome and Shift-Share Instruments

Replicate the shift-share instrumental variables analysis from Autor et al. (2013). Construct a Bartik-style instrument from industry shares and national import shocks, estimate the first stage and 2SLS, and conduct Rotemberg weights diagnostics.

Overview

In this replication lab, you will reproduce, on simulated data, the key results from one of the most influential papers on trade and local labor markets:

Autor, David H., David Dorn, and Gordon H. Hanson. 2013. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review 103(6): 2121–2168.

Autor et al. (ADH) study how rising Chinese imports affected U.S. commuting zones (CZs) from 1990 to 2007. The key challenge is that local import exposure may be endogenous: U.S. demand shocks can raise imports and employment at the same time. ADH address this with a shift-share (Bartik) instrument that interacts pre-period industry employment shares at the CZ level (the "shares") with industry-level growth in imports from China by other high-income countries (the "shifts"). The instrument is intended to isolate supply-driven Chinese import growth from U.S. demand shocks.

Why the ADH paper matters: It demonstrated that the local labor market consequences of trade shocks are large, persistent, and geographically concentrated. The paper also popularized the shift-share instrument, which is now one of the most common identification strategies in applied economics.

What you will do:

  • Simulate commuting-zone-level data with industry employment shares
  • Construct the shift-share (Bartik) instrument
  • Estimate OLS and the first stage of the IV
  • Estimate 2SLS using the shift-share instrument
  • Conduct Rotemberg weights diagnostics to assess which industries drive the results

Step 1: Simulate Commuting Zone Data with Industry Shares

Each commuting zone has an initial distribution of employment across industries. Chinese import growth varies at the industry level, and the shift-share instrument aggregates industry-level shocks using CZ-specific shares as weights.
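
To make the aggregation concrete, here is a minimal sketch with two hypothetical commuting zones and three industries. The numbers are invented for illustration and are not part of the lab data. Each CZ's exposure is the dot product of its share row with the vector of industry shocks.

```r
# Toy example: 2 commuting zones, 3 industries (illustrative numbers only)
toy_shares <- matrix(c(0.5, 0.3, 0.2,    # CZ 1: concentrated in industry 1
                       0.1, 0.1, 0.8),   # CZ 2: concentrated in industry 3
                     nrow = 2, byrow = TRUE)
toy_shocks <- c(10, 2, 0)                # industry-level import growth

# Shift-share exposure: z_i = sum_k s_ik * g_k
toy_exposure <- as.numeric(toy_shares %*% toy_shocks)
print(toy_exposure)
# CZ 1: 0.5*10 + 0.3*2 + 0.2*0 = 5.6
# CZ 2: 0.1*10 + 0.1*2 + 0.8*0 = 1.2
```

Both toy CZs face the same national shocks; CZ 1 is more exposed only because its initial employment is concentrated in the high-shock industry.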

library(fixest)
library(ivreg)

set.seed(2013)

n_cz <- 722; n_ind <- 20

# Industry employment shares: normalized gamma draws are Dirichlet-distributed,
# so each row is non-negative and sums to 1
shares <- matrix(rgamma(n_cz * n_ind, 2, 1), n_cz, n_ind)
shares <- shares / rowSums(shares)

# Industry-level import growth (the "shifts"): industries 1-10 are
# high-exposure (mean 5), industries 11-20 are low-exposure (mean 0.5)
ig_china <- c(rexp(10, 1/5), rexp(10, 1/0.5))
# Import growth from China in other countries, correlated with ig_china
ig_other <- pmax(ig_china * (0.8 + rnorm(n_ind, 0, 0.2)), 0)

# Shift-share exposure: share-weighted sums of industry shocks
di_us <- shares %*% ig_china
di_other <- shares %*% ig_other   # instrument

# Unobserved local demand shock raises both import exposure and employment
demand <- rnorm(n_cz)
di_us <- di_us + 0.5 * demand     # endogenous regressor
beta_true <- -0.75
eps <- rnorm(n_cz)
delta_mfg <- beta_true * di_us + 0.3 * demand + eps

dt <- data.frame(cz = 1:n_cz, delta_mfg = as.numeric(delta_mfg),
                 di_us = as.numeric(di_us), di_other = as.numeric(di_other),
                 pop_log = rnorm(n_cz, 10.5, 1.2),
                 pct_college = rnorm(n_cz, 0.22, 0.08),
                 pct_foreign = pmin(pmax(rnorm(n_cz, 0.08, 0.06), 0), 0.5))

cat("CZs:", n_cz, "Industries:", n_ind, "\n")
cat("True beta:", beta_true, "\n")
Requires: fixest, ivreg

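Expected output (illustrative; delta_imports_us and delta_imports_other correspond to di_us and di_other in the code above, and exact values depend on the random draws):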

Commuting zones: 722
Industries: 20
True causal effect (beta): -0.75

Endogenous variable (delta_imports_us):
  Mean: 3.12
  SD:   1.48

Instrument (delta_imports_other):
  Mean: 2.53
  SD:   1.31

Corr(imports_us, imports_other): 0.87
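
The summary statistics in the expected output are not printed by the two cat() calls at the end of the Step 1 code. One way to produce them is sketched below. describe_exposure is a hypothetical helper, not part of the lab, and the stand-in vectors exist only so the block runs on its own; in the lab you would call describe_exposure(dt$di_us, dt$di_other).

```r
# Hypothetical helper: prints mean, SD, and correlation of exposure and instrument
describe_exposure <- function(x, z) {
  cat(sprintf("Endogenous variable: mean %.2f, SD %.2f\n", mean(x), sd(x)))
  cat(sprintf("Instrument:          mean %.2f, SD %.2f\n", mean(z), sd(z)))
  cat(sprintf("Correlation:         %.2f\n", cor(x, z)))
  invisible(c(mean_x = mean(x), sd_x = sd(x),
              mean_z = mean(z), sd_z = sd(z), corr = cor(x, z)))
}

# Stand-in data (assumed values, not the lab's simulated CZs)
set.seed(1)
z_demo <- rnorm(722, 2.5, 1.3)
x_demo <- 1.05 * z_demo + rnorm(722, 0, 0.7)
out <- describe_exposure(x_demo, z_demo)
```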

Step 2: OLS and First Stage

Compare the naive OLS estimate (biased due to endogeneity) with the first stage of the IV.

# OLS (biased by the unobserved demand shock)
ols_fit <- feols(delta_mfg ~ di_us + pop_log + pct_college + pct_foreign,
                 data = dt, vcov = "HC1")
cat("=== OLS ===\n")
cat("Coeff:", round(coef(ols_fit)["di_us"], 4), "\n")
cat("True beta:", beta_true, "\n\n")

# First stage: regress the endogenous exposure on the instrument
fs_fit <- feols(di_us ~ di_other + pop_log + pct_college + pct_foreign,
                data = dt, vcov = "HC1")
cat("=== First Stage ===\n")
summary(fs_fit)
# With a single instrument, the first-stage F equals the squared t-statistic
fs_t <- coef(fs_fit)["di_other"] / se(fs_fit)["di_other"]
cat("F-stat:", round(fs_t^2, 1), "\n")

Expected output — OLS and First Stage:

Model         Variable   Coefficient   SE       Note
OLS           di_us      ~ -0.60       ~0.03    Biased toward zero
First Stage   di_other   ~1.05         ~0.02    F ~ 2800
True          ---        -0.75         ---      DGP parameter
OLS coefficient on imports: -0.60 (biased toward zero)
True beta: -0.75
OLS bias: +0.15 (attenuation)

First-stage F-statistic: ~2800 (strong instrument)

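The OLS estimate is biased toward zero because the demand shock confounds the relationship: a positive local demand shock raises import exposure and raises manufacturing employment at the same time, so part of the negative trade effect is masked. In the simulation, demand enters di_us with coefficient 0.5 and delta_mfg with coefficient 0.3, so the OLS bias is positive. The first stage is strong, confirming that exposure to Chinese imports in other high-income countries predicts U.S. import exposure.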
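
The direction and rough size of the OLS bias follow from the omitted-variable formula. Ignoring the controls, and writing x for import exposure and d for the demand shock, the OLS slope converges to beta + 0.3 * Cov(x, d) / Var(x). The sketch below checks this on stand-in data. The exogenous exposure component s and its distribution are assumptions chosen for illustration, not the lab's simulated shares.

```r
# Numerical check of the omitted-variable bias formula (stand-in data)
set.seed(1)
n <- 100000
s <- rnorm(n, 3, 1)              # assumed exogenous exposure component
d <- rnorm(n)                    # unobserved demand shock
x <- s + 0.5 * d                 # endogenous exposure, as in the lab's DGP
y <- -0.75 * x + 0.3 * d + rnorm(n)

ols <- unname(coef(lm(y ~ x))["x"])
implied <- -0.75 + 0.3 * cov(x, d) / var(x)   # beta + bias term

# In population: Var(x) = 1 + 0.25 = 1.25 and Cov(x, d) = 0.5,
# so the bias is 0.3 * 0.5 / 1.25 = 0.12 and OLS converges to -0.63
cat(sprintf("OLS slope: %.3f, formula: %.3f\n", ols, implied))
```

The bias shrinks as the exogenous variation in exposure grows relative to the demand-driven variation, which is why the size of the OLS bias in the lab depends on the variance of di_us.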

Concept Check

In the ADH shift-share instrument, why are Chinese imports to other high-income countries used as the 'shifts' rather than Chinese imports to the United States?


Step 3: 2SLS Estimation

Use the shift-share instrument to estimate the causal effect of Chinese import exposure on manufacturing employment.

# 2SLS using fixest: outcome ~ controls | endogenous ~ instrument
iv_fit <- feols(delta_mfg ~ pop_log + pct_college + pct_foreign |
                  di_us ~ di_other, data = dt, vcov = "HC1")
summary(iv_fit)

cat("\n=== Comparison ===\n")
cat("OLS:", round(coef(ols_fit)["di_us"], 4), "\n")
cat("2SLS:", round(coef(iv_fit)["fit_di_us"], 4), "\n")
cat("True:", beta_true, "\n")
Requires: fixest

Expected output — Estimator comparison:

Estimator   Coefficient   SE      Bias
OLS         ~ -0.60       ~0.03   +0.15
2SLS        ~ -0.74       ~0.04   ~ +0.01
True        -0.75         ---     ---

The 2SLS estimate is close to the true causal effect (-0.75), correcting the bias toward zero in OLS. The IV estimate is less precise (larger SE) than OLS, which is typical: IV trades efficiency for consistency.
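
With one instrument and one endogenous regressor, the 2SLS coefficient can be reproduced by hand, which helps show what the estimator is doing. The sketch below uses base R and stand-in data (the coefficients 1.05, 0.5, and 0.3 are assumptions loosely mirroring the lab's DGP). It computes the estimate two ways: as the ratio of the reduced form to the first stage, and by regressing the outcome on first-stage fitted values.

```r
# Manual 2SLS on stand-in data (single instrument, no controls)
set.seed(2)
n <- 5000
z <- rnorm(n, 2.5, 1)                    # stand-in instrument
d <- rnorm(n)                            # unobserved confounder
x <- 1.05 * z + 0.5 * d + rnorm(n, 0, 0.3)
y <- -0.75 * x + 0.3 * d + rnorm(n)

first_stage  <- lm(x ~ z)                # effect of instrument on exposure
reduced_form <- lm(y ~ z)                # effect of instrument on outcome

# Wald ratio: reduced form divided by first stage
wald <- unname(coef(reduced_form)["z"] / coef(first_stage)["z"])

# Two-step version: regress y on fitted exposure
x_hat <- fitted(first_stage)
two_stage <- unname(coef(lm(y ~ x_hat))["x_hat"])

cat(sprintf("Wald ratio: %.4f, two-step: %.4f\n", wald, two_stage))
# Note: standard errors from the manual second stage are not valid;
# use a dedicated IV routine for inference.
```

The two numbers coincide algebraically. The manual version is only a teaching device, since the second-stage lm() standard errors ignore that x_hat is estimated.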

Concept Check

The shift-share instrument can be written as z_i = sum_k(s_ik * g_k), where s_ik is the industry-k employment share in CZ i and g_k is the national industry-k import shock. What is the source of identifying variation in the shift-share design?


Step 4: Rotemberg Weights Diagnostics

Goldsmith-Pinkham et al. (2020) show that the shift-share IV estimate can be decomposed into a weighted sum of just-identified IV estimates, each using a single industry's share as the instrument. The Rotemberg weights indicate which industries contribute most to the overall estimate. Examining the weights helps assess whether the results are driven by a few influential industries.

# Rotemberg weights decomposition (Goldsmith-Pinkham et al. 2020)
# Residualize treatment and outcome against the controls
X_ctrl <- model.matrix(~ pop_log + pct_college + pct_foreign, data = dt)
di_us_resid <- residuals(lm(di_us ~ pop_log + pct_college + pct_foreign, dt))
dmfg_resid <- residuals(lm(delta_mfg ~ pop_log + pct_college + pct_foreign, dt))

rot_w <- numeric(n_ind)      # Rotemberg weight per industry
ind_beta <- numeric(n_ind)   # just-identified IV estimate per industry
for (k in 1:n_ind) {
  # Industry-k instrument: share times national shock, residualized
  z_k <- shares[, k] * ig_other[k]
  z_k_resid <- residuals(lm(z_k ~ X_ctrl - 1))
  rot_w[k] <- sum(z_k_resid * di_us_resid)
  if (abs(rot_w[k]) > 1e-10) {
    ind_beta[k] <- sum(z_k_resid * dmfg_resid) / rot_w[k]
  }
}
rot_w <- rot_w / sum(rot_w)  # normalize so the weights sum to 1

cat("=== Rotemberg Weights (Top 10) ===\n")
ord <- order(-rot_w)
for (i in ord[1:10]) {
  cat(sprintf("Industry %2d: weight=%.3f, beta=%.3f\n",
              i, rot_w[i], ind_beta[i]))
}

Expected output — Rotemberg weights (top industries):

Industry      Weight   Industry Beta   Import Growth
Ind_3 (mfg)   0.182    -0.78           8.42
Ind_7 (mfg)   0.151    -0.71           6.91
Ind_1 (mfg)   0.134    -0.80           7.55
Ind_5 (mfg)   0.098    -0.69           5.23
Ind_9 (mfg)   0.087    -0.74           4.88
Sum of positive weights: 1.05
Sum of negative weights: -0.05
Top 5 industries account for 65.2% of the estimate

The Rotemberg weights show that the overall IV estimate is driven primarily by manufacturing industries with large import growth. The industry-specific betas are fairly similar (ranging from -0.69 to -0.80), so the high-weight industries agree on the size of the effect. This agreement is consistent with a roughly homogeneous treatment effect and is reassuring for the shift-share design, although it does not by itself prove that the shares are exogenous.
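
The Step 4 code prints the top weights but not the summary lines (sum of positive weights, sum of negative weights, top-5 share). The sketch below computes them and also checks the defining property of the decomposition: the weighted sum of industry-specific estimates equals the overall shift-share IV estimate. rotemberg_summary is a hypothetical helper that residualizes on a constant only, and the data are stand-ins; with the lab objects, the same summaries are sum(rot_w[rot_w > 0]), sum(rot_w[rot_w < 0]), and sum(sort(rot_w, decreasing = TRUE)[1:5]).

```r
# Hypothetical helper: Rotemberg weights with constant-only residualization
rotemberg_summary <- function(shares, shocks, x, y) {
  x_c <- x - mean(x)
  y_c <- y - mean(y)
  zx <- shocks * as.numeric(crossprod(shares, x_c))  # g_k * sum_i s_ik * x_i
  zy <- shocks * as.numeric(crossprod(shares, y_c))  # g_k * sum_i s_ik * y_i
  alpha <- zx / sum(zx)                              # Rotemberg weights
  list(alpha = alpha,
       beta_k = zy / zx,                             # per-industry IV estimate
       beta_iv = sum(zy) / sum(zx),                  # overall shift-share IV
       pos = sum(alpha[alpha > 0]),
       neg = sum(alpha[alpha < 0]),
       top5 = sum(sort(alpha, decreasing = TRUE)[1:5]))
}

# Stand-in data (assumed DGP; same shocks drive exposure and instrument)
set.seed(5)
n <- 722; K <- 20
shares <- matrix(rgamma(n * K, 2, 1), n, K)
shares <- shares / rowSums(shares)
shocks <- rexp(K, 1 / 3) + 0.1                       # strictly positive shocks
demand <- rnorm(n)
x <- as.numeric(shares %*% shocks) + 0.5 * demand
y <- -0.75 * x + 0.3 * demand + rnorm(n)

rs <- rotemberg_summary(shares, shocks, x, y)
cat(sprintf("Sum of positive weights: %.3f\n", rs$pos))
cat(sprintf("Sum of negative weights: %.3f\n", rs$neg))
cat(sprintf("Top-5 share of weights:  %.3f\n", rs$top5))
cat(sprintf("Weighted sum of betas: %.4f, overall IV: %.4f\n",
            sum(rs$alpha * rs$beta_k), rs$beta_iv))
```

Large negative weights are a warning sign, because the overall estimate then stops being a convex combination of the industry-specific estimates.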


Step 5: Robustness and Comparison with Published Results

cat("=== Final Comparison ===\n")
cat("Published 2SLS: ~ -0.75\n")
cat("Our 2SLS:", round(coef(iv_fit)["fit_di_us"], 4), "\n")
cat("True beta:", beta_true, "\n")
cat("Conclusion: Chinese import competition reduced local\n")
cat("manufacturing employment, consistent with ADH (2013).\n")

Expected output — Final comparison:

Measure               Published (ADH 2013)   Our Replication
OLS coefficient       ~ -0.55                ~ -0.60
2SLS coefficient      ~ -0.75                ~ -0.74
First-stage F         > 100                  ~ 2800
N (commuting zones)   722                    722
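
A common robustness check after a Rotemberg decomposition is to drop the highest-weight industry from the instrument and re-estimate. The lab code does not include this step, so the following is a minimal base-R sketch on stand-in data. The DGP, the evenly spaced shocks, and the helper iv_slope are assumptions for illustration; controls are omitted for brevity.

```r
# Leave-one-out check: drop the top Rotemberg-weight industry (stand-in data)
set.seed(6)
n <- 722; K <- 20
shares <- matrix(rgamma(n * K, 0.5, 1), n, K)
shares <- shares / rowSums(shares)
g_us <- seq(0.5, 10, length.out = K)     # assumed industry shocks (U.S.)
g_other <- 0.8 * g_us                    # assumed shocks in other countries
demand <- rnorm(n)
x <- as.numeric(shares %*% g_us) + 0.5 * demand
y <- -0.75 * x + 0.3 * demand + rnorm(n)

# Just-identified IV slope with a constant: Cov(z, y) / Cov(z, x)
iv_slope <- function(z, x, y) cov(z, y) / cov(z, x)

# Full instrument
z_full <- as.numeric(shares %*% g_other)
beta_full <- iv_slope(z_full, x, y)

# Rotemberg weights (constant-only residualization) and the top industry
zx <- g_other * as.numeric(crossprod(shares, x - mean(x)))
alpha <- zx / sum(zx)
top <- which.max(alpha)

# Rebuild the instrument without the top industry and re-estimate
z_drop <- as.numeric(shares[, -top, drop = FALSE] %*% g_other[-top])
beta_drop <- iv_slope(z_drop, x, y)

cat(sprintf("Top industry: %d (weight %.3f)\n", top, alpha[top]))
cat(sprintf("Full IV: %.3f, leave-one-out IV: %.3f\n", beta_full, beta_drop))
```

If the two estimates differ sharply, the overall result leans on a single industry, and the exogeneity of that industry's share deserves closer scrutiny.
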
Concept Check

Borusyak et al. (2022) show that the exogeneity of the shift-share instrument can come from either the shares or the shocks. Under what condition is it sufficient for the shocks (industry-level import growth) to be exogenous, even if the shares are endogenous?


Summary

This simulated replication of Autor et al. (2013) illustrates:

  1. Endogeneity matters. OLS underestimates the negative effect of import competition because U.S. demand shocks simultaneously increase imports and employment.

  2. The shift-share instrument corrects the bias. Using Chinese imports to other countries as the exogenous shifts, the 2SLS estimate recovers the true causal effect.

  3. Rotemberg weights reveal influential industries. The decomposition shows which industries drive the overall estimate, enabling diagnostic checks on the exclusion restriction.

  4. The results are robust. Dropping the most influential industry does not substantially change the 2SLS estimate.


Extension Exercises

  1. Weak instruments. Reduce the number of manufacturing industries to 3 and re-estimate. How does a weaker first stage affect the 2SLS estimate and its confidence interval?

  2. Share-based identification. Following Goldsmith-Pinkham et al. (2020), test whether pre-period industry shares are correlated with CZ-level outcome trends. If shares predict pre-trends, the share-based identification strategy may be invalid.

  3. Shock-level regression. Following Borusyak et al. (2022), re-estimate the effect at the industry level (shock-level regression) and compare with the CZ-level estimate. The industry-level approach is more transparent about the source of variation.

  4. Heterogeneous effects. Allow the causal effect to differ by CZ characteristics (e.g., college share). Do more-educated CZs experience smaller employment losses from import competition?

  5. Multiple periods. Extend the simulation to two decades (1990–2000 and 2000–2007) following ADH's stacked first-differences approach. Compare estimates across periods.

  6. Overidentification test. Use each industry's shift-share as a separate instrument and conduct a Sargan-Hansen overidentification test. Do the industry-specific instruments agree on the magnitude of the effect?

  7. Alternative outcomes. Simulate additional outcomes (wages, labor force participation, transfer payments) and estimate the effect of import competition on each. Compare the magnitudes with ADH Table 5.

  8. Placebo instrument. Construct a shift-share instrument using service-sector shares (which should not respond to manufacturing import shocks). If the placebo instrument yields a significant estimate, the identification strategy may be compromised.
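
As a starting point for the shock-level exercise (item 3), the sketch below illustrates the numerical equivalence shown by Borusyak et al. (2022): the CZ-level shift-share IV coefficient equals an industry-level IV coefficient that uses share-weighted averages of the residualized outcome and treatment, the shock as the instrument, and total industry shares as weights. The data are stand-ins and residualization is on a constant only; all variable names are hypothetical.

```r
# Shock-level equivalence on stand-in data (constant-only residualization)
set.seed(4)
n <- 500; K <- 10
shares <- matrix(rgamma(n * K, 1, 1), n, K)
shares <- shares / rowSums(shares)       # rows sum to 1
g <- seq(1, 10, length.out = K)          # assumed industry shocks
demand <- rnorm(n)
x <- as.numeric(shares %*% g) + 0.5 * demand
y <- -0.75 * x + 0.3 * demand + rnorm(n)

# CZ-level shift-share IV
z <- as.numeric(shares %*% g)
x_c <- x - mean(x)
y_c <- y - mean(y)
beta_cz <- sum(z * y_c) / sum(z * x_c)

# Industry-level aggregates: share-weighted averages of residualized variables
s_k <- colSums(shares)                               # industry weights
y_bar <- as.numeric(crossprod(shares, y_c)) / s_k    # exposure-weighted outcome
x_bar <- as.numeric(crossprod(shares, x_c)) / s_k    # exposure-weighted treatment

# Weighted IV at the shock level: instrument g_k, weights s_k
beta_shock <- sum(s_k * g * y_bar) / sum(s_k * g * x_bar)

cat(sprintf("CZ-level IV: %.6f, shock-level IV: %.6f\n", beta_cz, beta_shock))
```

The coefficients match exactly because both are the same sum rearranged. The practical difference lies in inference: at the shock level, the effective sample size is the number of industries rather than the number of CZs.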