MethodAtlas

Shift-Share / Bartik Instruments

Uses national-level shocks interacted with local-level exposure to construct instruments for endogenous variables.

When to Use: When you can decompose variation in an endogenous variable into national/industry shocks times local exposure shares, and use the interaction as an instrument for the endogenous variable.
Assumption: Either the shares are exogenous (Goldsmith-Pinkham et al.: treat shares as instruments) or the shocks are exogenous (Borusyak et al.: treat shocks as instruments). The appropriate interpretation depends on the research setting.
Mistake: Not testing which interpretation (shares vs. shocks exogeneity) is appropriate, or not reporting Rotemberg weights to assess which industries drive the estimate. A single dominant industry can make the instrument effectively a single-shock instrument.

One-Line Implementation

R: feols(y ~ controls | 0 | exposure ~ shift_share, data = df, vcov = 'hetero')
Stata: ivregress 2sls y controls (exposure = shift_share), first vce(robust)
Python: df['shift_share'] = shares @ shocks; IV2SLS(dependent=df['y'], exog=df[['const','controls']], endog=df['exposure'], instruments=df[['shift_share']]).fit(cov_type='robust')


Motivating Example: The China Trade Shock

Between 1990 and 2007, China's manufacturing exports exploded. Some US labor markets were devastated. Others barely noticed. Why? Because local economies have different industry compositions. A city dominated by furniture manufacturing was hammered by Chinese competition. A city dominated by software development was not.

Autor et al. (2013) wanted to estimate the causal effect of Chinese import competition on US manufacturing employment. The challenge is obvious: local employment changes are driven by many factors beyond Chinese trade, and areas that are declining for other reasons might also be more exposed to import competition.

Their solution was a shift-share instrument, also known as a Bartik instrument. This approach extends the instrumental variables framework by constructing the instrument from two components. The idea: construct a predicted measure of local Chinese import exposure by interacting:

  • National shocks ("shifts"): the growth in Chinese imports in each industry nationwide
  • Local shares: the share of each industry in each local labor market's initial employment

The resulting instrument $Z_l = \sum_k s_{lk} \cdot g_k$ varies across locations because different places have different industry mixes, but the variation comes from national import growth, which is plausibly unrelated to local labor demand shocks.

This construction is widely used in applied economics — trade, immigration, fiscal policy, technology adoption. But for decades, researchers lacked clarity about exactly what makes the instrument valid. Recent work has clarified this question substantially.


A. Overview

A shift-share instrument takes the form:

$$Z_l = \sum_{k=1}^{K} s_{lk} \cdot g_k$$

where:

  • $l$ indexes locations (or other cross-sectional units)
  • $k$ indexes industries (or other categories)
  • $s_{lk}$ is the share of industry $k$ in location $l$ (typically measured in a base period)
  • $g_k$ is the shift (national growth rate or shock in industry $k$)

The instrument exploits the idea that national shocks ($g_k$) affect different locations differently because of their pre-existing industry compositions ($s_{lk}$).
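As a minimal sketch of the construction, the following plain-Python snippet computes the instrument for a toy example. The shares and shocks are made-up numbers chosen only for illustration, and the function name `shift_share` is hypothetical.

```python
# Toy data (hypothetical): 3 locations, 2 industries.
shares = [
    [0.8, 0.2],  # location 0: mostly industry 0
    [0.5, 0.5],  # location 1: balanced mix
    [0.1, 0.9],  # location 2: mostly industry 1
]
shocks = [10.0, 2.0]  # national shift g_k for each industry


def shift_share(shares, shocks):
    """Return Z_l = sum_k s_lk * g_k for every location l."""
    return [sum(s * g for s, g in zip(row, shocks)) for row in shares]


instrument = shift_share(shares, shocks)
for location, z in enumerate(instrument):
    print(f"location {location}: Z = {z:.2f}")
```

Location 0 receives the largest instrument value because most of its employment sits in the industry hit by the larger national shock, even though every location faces the same pair of shocks.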

Two Frameworks

The breakthrough in recent scholarship has been recognizing that shift-share instruments can be justified through two fundamentally different sets of assumptions:

1. Shares-based identification (Goldsmith-Pinkham et al., 2020): The shares $s_{lk}$ are the source of exogenous variation. The instrument is a GMM-style estimator using the shares as individual instruments, with the shocks serving as weights. This framework requires the shares to be uncorrelated with unobserved determinants of the outcome; that is, initial industry composition must be exogenous.

2. Shocks-based identification (Borusyak et al., 2022): The shocks $g_k$ are the source of exogenous variation. This identification requires the shocks to be as-good-as-randomly assigned, but allows the shares to be endogenous. This approach is the more common justification in trade and immigration settings, where national-level changes (like China's industrial policy) are plausibly exogenous to any individual local labor market.

Common Confusions

"Are these two frameworks in conflict?" No. They are complementary. They identify the same parameter under different assumptions. The question is: in your setting, is it more plausible that the shares are exogenous or that the shocks are exogenous? The answer determines which diagnostics to run.

"Can I just run 2SLS and not worry about this distinction?" Technically yes: the first-stage regression and 2SLS machinery are the same regardless of interpretation. But you need to know why your instrument is valid so you can test the right assumptions. Under the shares interpretation, it is important to check that shares are balanced (uncorrelated with local covariates). Under the shocks interpretation, it is important to check that shocks are as-if-random.

"What about the exclusion restriction?" The exclusion restriction requires that the shift-share instrument affects the outcome only through the endogenous variable. In the China shock example, the instrument should affect local employment only through its effect on local import competition, not through other channels. This restriction is violated if, for example, areas with high manufacturing shares also experience technology shocks that affect employment independently of trade.

"How many industries do I need?" Under the Borusyak et al. (2022) framework, you need many shocks (large $K$) so that the law of large numbers kicks in and the instrument's exogeneity holds on average. If $K$ is small, each individual shock has too much influence, and the as-if-random assumption is harder to justify.


B. Identification

The Estimating Equation

The typical setup is:

Second stage: $Y_l = \alpha + \beta X_l + \gamma' \mathbf{W}_l + \varepsilon_l$

First stage: $X_l = \pi_0 + \pi_1 Z_l + \pi_2' \mathbf{W}_l + u_l$

where $X_l$ is the endogenous variable (e.g., change in local import exposure), $Z_l$ is the shift-share instrument, and $\mathbf{W}_l$ are controls.
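With a single instrument and no controls, the two-stage setup collapses to a ratio of covariances. The sketch below illustrates that special case in plain Python. The data are invented so that $Y_l$ is an exact linear function of $X_l$, and the helper names are hypothetical. Real applications need controls and standard errors from a regression package.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance; the normalization cancels in the IV ratio."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b)) / (len(a) - 1)


def first_stage_slope(x, z):
    """OLS slope of X on Z, the pi_1 coefficient in the first stage."""
    return covariance(z, x) / covariance(z, z)


def iv_estimate(y, x, z):
    """Just-identified IV without controls: cov(Z, Y) / cov(Z, X)."""
    return covariance(z, y) / covariance(z, x)


# Hypothetical data: Y = 1 - 0.5 * X exactly, so the IV ratio recovers -0.5.
z = [8.4, 6.0, 2.8, 5.0]
x = [4.0, 3.5, 1.0, 2.0]
y = [1.0 - 0.5 * x_l for x_l in x]

pi_hat = first_stage_slope(x, z)
beta_hat = iv_estimate(y, x, z)
print(f"first-stage slope = {pi_hat:.3f}, IV estimate = {beta_hat:.3f}")
```

The same ratio is what a 2SLS routine computes in this no-controls case. The first-stage slope is reported separately because a slope near zero signals a weak instrument.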

Under the Shares Interpretation

Goldsmith-Pinkham et al. (2020) show that the shift-share instrumental variables (IV) estimator is numerically equivalent to a Generalized Method of Moments (GMM) estimator using the $K$ individual shares $\{s_{l1}, \ldots, s_{lK}\}$ as instruments:

$$\hat{\beta}^{SSIV} = \hat{\beta}^{GMM}(\alpha_1, \ldots, \alpha_K)$$

where $\alpha_k$ are Rotemberg weights that reflect each industry's contribution to identification. Key implication: a few industries typically drive the result. It is important to check which industries have the largest Rotemberg weights and verify that their shares are plausibly exogenous.

Diagnostics:

  • Report Rotemberg weights (which industries matter most)
  • Check balance: regress shares on local covariates for the high-weight industries
  • Over-identification test: if $K > 1$, you have multiple instruments and can test over-identifying restrictions
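The balance check in the list above can be sketched as a bivariate regression of one high-weight industry's share on a single pre-period covariate. The numbers and the `balance_check` helper below are hypothetical. A real check would use the full covariate set and report standard errors.

```python
def balance_check(share, covariate):
    """Regress an industry share on a covariate; return (slope, correlation)."""
    n = len(share)
    s_bar = sum(share) / n
    c_bar = sum(covariate) / n
    cross = sum((s - s_bar) * (c - c_bar) for s, c in zip(share, covariate))
    c_var = sum((c - c_bar) ** 2 for c in covariate)
    s_var = sum((s - s_bar) ** 2 for s in share)
    slope = cross / c_var
    correlation = cross / (c_var * s_var) ** 0.5
    return slope, correlation


# Hypothetical data: base-period share of one high-weight industry and a
# pre-period covariate (for example, college share) in five locations.
top_industry_share = [0.10, 0.20, 0.30, 0.40, 0.50]
college_share = [0.30, 0.25, 0.35, 0.20, 0.40]

slope, correlation = balance_check(top_industry_share, college_share)
print(f"slope = {slope:.3f}, correlation = {correlation:.3f}")
```

A slope that is large relative to its sampling uncertainty would suggest the share is related to local characteristics, which weakens the case for treating that share as exogenous.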

Under the Shocks Interpretation

Borusyak et al. (2022) show that the shift-share IV can be recast as an estimator at the industry level. The key assumption is that shocks are as-good-as-randomly assigned:

$$E[g_k \cdot \bar{\varepsilon}_k] = 0$$

where $\bar{\varepsilon}_k = \sum_l s_{lk} \varepsilon_l / \sum_l s_{lk}$ is the exposure-weighted average of local residuals for industry $k$.

Diagnostics:

  • Check balance: regress shocks on industry-level characteristics
  • Examine whether shocks are correlated with pre-period trends
  • Verify that $K$ is large enough for the asymptotic approximation
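One way to gauge whether $K$ is effectively large is the inverse Herfindahl index of the industry-level exposure weights, which Borusyak et al. (2022) report as an effective sample size. The sketch below assumes equal weighting of locations, and the function names are hypothetical.

```python
def shock_weights(shares):
    """Total exposure to each industry, normalized to sum to one."""
    n_industries = len(shares[0])
    totals = [sum(row[k] for row in shares) for k in range(n_industries)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def effective_number_of_shocks(shares):
    """Inverse Herfindahl index of the normalized industry weights."""
    return 1.0 / sum(w * w for w in shock_weights(shares))


# Hypothetical shares: exposure spread evenly over four industries versus
# exposure concentrated almost entirely in one industry.
even = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
concentrated = [[0.97, 0.01, 0.01, 0.01], [0.97, 0.01, 0.01, 0.01]]

print(effective_number_of_shocks(even))
print(effective_number_of_shocks(concentrated))
```

When the effective number is far below the nominal $K$, a handful of shocks carry the identifying variation and the large-$K$ approximation deserves more scrutiny.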

C. Visual Intuition

Imagine a map of the United States. Each local labor market has a pie chart showing its industry composition. Now imagine that China's rise disproportionately affects certain industries (shown in red). The areas with the largest red slices are the most "exposed" — they receive the largest values of the shift-share instrument.

The identifying variation comes from comparing outcomes in areas with large red slices (high exposure) to areas with small red slices (low exposure), where the redness of each industry is determined by national-level Chinese import growth.


D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: The shift-share IV estimator is equivalent to using each industry share as a separate instrument, weighted by how much that industry contributes to the identifying variation.

Define the shift-share instrument:

$$Z_l = \sum_k s_{lk} g_k = \mathbf{s}_l' \mathbf{g}$$

The 2SLS estimator using $Z_l$ as the instrument for $X_l$ can be written as follows. For exposition, we present the no-controls Wald-ratio form, in which $Y_l$ and $X_l$ are demeaned; with controls $\mathbf{W}_l$ (as defined above), $Y_l$ and $X_l$ are first residualized on $\mathbf{W}_l$ and the instrument enters a standard 2SLS framework:

$$\hat{\beta}^{SSIV} = \frac{\sum_l Z_l Y_l}{\sum_l Z_l X_l} = \frac{\sum_l \left(\sum_k s_{lk} g_k\right) Y_l}{\sum_l \left(\sum_k s_{lk} g_k\right) X_l}$$

Goldsmith-Pinkham et al. (2020) show this equals:

$$\hat{\beta}^{SSIV} = \sum_k \alpha_k \hat{\beta}_k$$

where $\hat{\beta}_k$ is the just-identified IV estimate using $s_{lk}$ alone as the instrument, and $\alpha_k$ are the Rotemberg weights:

$$\alpha_k = \frac{g_k \sum_l s_{lk} X_l}{\sum_{k'} g_{k'} \sum_l s_{lk'} X_l}$$

where $X_l$ is the endogenous variable, residualized on the controls (demeaned in the no-controls case), so that the denominator of $\alpha_k$ equals the denominator of the ratio form above. These weights sum to one but can be negative (for industries where the shock $g_k$ and the share-level first-stage covariance $\sum_l s_{lk} X_l$ have opposite signs, given a positive denominator).
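The decomposition can be checked numerically. The sketch below works in the no-controls case and computes each weight as $g_k \sum_l s_{lk} X_l$ (with $X_l$ demeaned) divided by the sum of those terms over industries, which is the form implied by the ratio estimator. All data and function names are hypothetical.

```python
def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def exposure_weighted_sum(shares, values, k):
    """Return sum_l s_lk * values_l for industry k."""
    return sum(row[k] * v for row, v in zip(shares, values))


def rotemberg_decomposition(shares, shocks, x, y):
    """Return (alphas, betas, beta_ssiv) for the no-controls case."""
    x_d, y_d = demean(x), demean(y)
    n_industries = len(shocks)
    sx = [exposure_weighted_sum(shares, x_d, k) for k in range(n_industries)]
    sy = [exposure_weighted_sum(shares, y_d, k) for k in range(n_industries)]
    denominator = sum(g * v for g, v in zip(shocks, sx))
    alphas = [g * v / denominator for g, v in zip(shocks, sx)]
    betas = [sy_k / sx_k for sy_k, sx_k in zip(sy, sx)]  # share-by-share IV
    beta_ssiv = sum(g * v for g, v in zip(shocks, sy)) / denominator
    return alphas, betas, beta_ssiv


# Hypothetical data: 5 locations, 3 industries.
shares = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
    [0.3, 0.1, 0.6],
]
shocks = [5.0, 1.0, -2.0]
x = [3.0, 1.5, -0.5, 2.0, 0.5]
y = [-1.0, -0.2, 0.6, -0.9, 0.1]

alphas, betas, beta_ssiv = rotemberg_decomposition(shares, shocks, x, y)
weighted_sum = sum(a * b for a, b in zip(alphas, betas))
print("Rotemberg weights:", [round(a, 3) for a in alphas])
print("weighted sum of industry estimates:", round(weighted_sum, 4))
print("shift-share IV estimate:", round(beta_ssiv, 4))
```

The last two printed numbers agree up to floating-point error, which is the content of the decomposition: the overall estimate is a weighted sum of the industry-by-industry IV estimates.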

Under the Borusyak et al. (2022) shocks interpretation, the estimator can be rewritten at the industry level:

$$\hat{\beta}^{SSIV} = \frac{\sum_k \hat{s}_k g_k \bar{Y}_k}{\sum_k \hat{s}_k g_k \bar{X}_k}$$

where $\hat{s}_k = \sum_l s_{lk}$ and $\bar{Y}_k = \sum_l s_{lk} Y_l / \hat{s}_k$ is the exposure-weighted average outcome for industry $k$ (with $\bar{X}_k$ defined analogously). This expression is just a weighted IV regression at the industry level, which makes the exogeneity requirement on $g_k$ transparent.
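The equivalence between the location-level and industry-level forms can also be verified on toy data. This sketch assumes no controls and equal location weights, so $Y_l$ and $X_l$ are simply demeaned before aggregation. The data and function names are hypothetical.

```python
def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def location_level_iv(shares, shocks, x, y):
    """Ratio form: sum_l Z_l Y_l / sum_l Z_l X_l with demeaned Y and X."""
    z = [sum(s * g for s, g in zip(row, shocks)) for row in shares]
    x_d, y_d = demean(x), demean(y)
    numerator = sum(z_l * y_l for z_l, y_l in zip(z, y_d))
    denominator = sum(z_l * x_l for z_l, x_l in zip(z, x_d))
    return numerator / denominator


def industry_level_iv(shares, shocks, x, y):
    """Same estimator after collapsing outcomes to exposure-weighted averages."""
    x_d, y_d = demean(x), demean(y)
    numerator = 0.0
    denominator = 0.0
    for k, g_k in enumerate(shocks):
        s_hat = sum(row[k] for row in shares)
        y_bar = sum(row[k] * v for row, v in zip(shares, y_d)) / s_hat
        x_bar = sum(row[k] * v for row, v in zip(shares, x_d)) / s_hat
        numerator += s_hat * g_k * y_bar
        denominator += s_hat * g_k * x_bar
    return numerator / denominator


# Hypothetical data: 5 locations, 3 industries.
shares = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
    [0.3, 0.1, 0.6],
]
shocks = [5.0, 1.0, -2.0]
x = [3.0, 1.5, -0.5, 2.0, 0.5]
y = [-1.0, -0.2, 0.6, -0.9, 0.1]

by_location = location_level_iv(shares, shocks, x, y)
by_industry = industry_level_iv(shares, shocks, x, y)
print(round(by_location, 4), round(by_industry, 4))
```

Both routes give the same number because the industry-level form only reorders the double sum over locations and industries.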


E. Implementation

# Requires: fixest, bartik.weight
library(fixest)
# Install from GitHub: devtools::install_github("paulgp/bartik-weight")
library(bartik.weight)

# --- Step 1: Construct the Shift-Share Instrument ---
# Z_l = sum_k (share_lk * shock_k) — inner product of shares and shocks.
# share_lk = location l's employment share in industry k (from a base period).
# shocks = national-level growth rates by industry (excluding location l, ideally).
# The instrument captures predicted local exposure to national industry shocks.
df$shift_share <- as.matrix(df[, share_cols]) %*% shocks

# --- Step 2: Two-Stage Least Squares ---
# feols() IV syntax: outcome ~ controls | FE | endogenous ~ instrument.
# The shift-share instrument is excluded from the second stage.
# vcov = "HC1" provides heteroskedasticity-robust standard errors.
est <- feols(outcome ~ controls | 0 | exposure ~ shift_share,
           data = df, vcov = "HC1")
# Check the first-stage F-statistic; F > 10 is only the conventional rule of thumb.
summary(est)

# --- Step 3: Rotemberg Weights Decomposition ---
# Decomposes the overall IV estimate into industry-level contributions.
# alpha_k = Rotemberg weight for industry k: how much industry k drives
# the aggregate estimate. Negative weights indicate extrapolation.
# If a few industries dominate, the estimate is fragile.
bw_result <- bw(
  master = df, y = "outcome", x = "exposure",
  controls = control_cols, weight = "weight_col",
  local = df, Z = share_cols,
  global = shocks_df, G = "shock_growth"
)
summary(bw_result)

# --- Step 4: Inspect Top-Weight Industries ---
# Industries with the largest Rotemberg weights drive the estimate.
# These high-weight industries must individually pass exogeneity and
# balance tests; if they fail, the overall instrument is suspect.
# Keep whole rows so each weight stays attached to its industry.
top_weights <- head(bw_result[order(bw_result$alpha, decreasing = TRUE), ], 5)
print(top_weights)

F. Diagnostics

  1. First-stage F-statistic. As with any IV, check for weak instruments. F > 10 is the traditional threshold, though more recent guidance from Lee et al. (2022) suggests stricter thresholds.

  2. Rotemberg weights. Report the top 5-10 industries by Rotemberg weight. Check whether these industries are plausibly exogenous. If one industry dominates the instrument, your results depend heavily on that industry.

  3. Balance checks. Under shares-based identification: regress shares on pre-period covariates. Under shocks-based identification: regress shocks on industry-level characteristics.

  4. Over-identification test. Under the shares interpretation, you have $K$ instruments but estimate only one parameter. The Sargan-Hansen J-test checks whether the over-identifying restrictions hold. A rejection suggests that at least some shares are not valid instruments.

  5. Leave-one-industry-out. Re-estimate removing the top Rotemberg weight industry. If results change dramatically, they are fragile.

  6. Pre-trend tests. If you have multiple periods, check whether the shift-share instrument predicts outcome changes in the pre-period.
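The leave-one-industry-out check can be sketched as follows. This version drops each industry in turn by zeroing its shock in the instrument, which is one simple way to remove an industry and is an assumption of the sketch. The data and function names are hypothetical, and there are no controls.

```python
def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ssiv(shares, shocks, x, y):
    """No-controls shift-share IV: sum_l Z_l Y_l / sum_l Z_l X_l (demeaned)."""
    z = [sum(s * g for s, g in zip(row, shocks)) for row in shares]
    x_d, y_d = demean(x), demean(y)
    numerator = sum(z_l * y_l for z_l, y_l in zip(z, y_d))
    denominator = sum(z_l * x_l for z_l, x_l in zip(z, x_d))
    return numerator / denominator


def leave_one_industry_out(shares, shocks, x, y):
    """Re-estimate with each industry removed from the instrument in turn."""
    estimates = []
    for k in range(len(shocks)):
        reduced = [0.0 if j == k else g for j, g in enumerate(shocks)]
        estimates.append(ssiv(shares, reduced, x, y))
    return estimates


# Hypothetical data: 5 locations, 3 industries.
shares = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
    [0.3, 0.1, 0.6],
]
shocks = [5.0, 1.0, -2.0]
x = [3.0, 1.5, -0.5, 2.0, 0.5]
y = [-1.0, -0.2, 0.6, -0.9, 0.1]

baseline = ssiv(shares, shocks, x, y)
loo = leave_one_industry_out(shares, shocks, x, y)
print("baseline:", round(baseline, 4))
for k, estimate in enumerate(loo):
    print(f"dropping industry {k}:", round(estimate, 4))
```

If the estimates move a lot when a single industry is dropped, the result depends heavily on that industry, which is the fragility the diagnostic is meant to reveal.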

Interpreting Your Results

When reporting shift-share IV results, it is important to be clear about which identification framework you are invoking. State explicitly: "We interpret the shift-share instrument under the [shares/shocks]-based framework of [Goldsmith-Pinkham et al. (2020)/Borusyak et al. (2022)]."

Report the Rotemberg weights and discuss which industries drive identification. If 80% of the identifying variation comes from three industries, the reader needs to evaluate whether those three industries satisfy the exogeneity conditions.


G. What Can Go Wrong


Rotemberg Weights Reveal Identification Driven by a Single Industry

A researcher studies the effect of immigration on native wages using a shift-share instrument where national immigrant inflows by origin country are the shocks and local baseline immigrant shares by origin are the exposure shares. They compute Rotemberg weights and find that the top 5 origin countries account for 38% of the identifying variation, with no single country exceeding 12%.

The diversified identification source supports the Borusyak et al. (2022) shocks-based interpretation: with many origin countries contributing, the as-if-random assignment of shocks averages out idiosyncratic concerns. Leave-one-out estimates removing each top-5 country range from -0.28 to -0.35, tightly bracketing the baseline estimate of -0.31.


Using Contemporaneous Shares Instead of Baseline Shares

A researcher constructs a Bartik instrument for local exposure to technology shocks using 1990 Census industry employment shares as weights, applied to 2000-2015 national technology adoption growth rates.

The baseline 1990 shares are predetermined relative to the 2000-2015 outcome period. Under the shares-based interpretation, the researcher can credibly argue that 1990 industry composition reflects historical comparative advantage and is plausibly uncorrelated with 2000-2015 local demand shocks.


Exclusion Restriction Violated Through Correlated Sectoral Shocks

A researcher studies the effect of Chinese import competition on local manufacturing employment using the Autor et al. (2013) shift-share instrument. They carefully address the concern that Chinese import growth in an industry might be correlated with US domestic technology shocks in the same industry by instrumenting with Chinese exports to other high-income countries.

Using Chinese exports to 8 other high-income countries as an alternative measure of Chinese supply shocks isolates the supply-driven component of Chinese trade growth. The instrument captures China's comparative advantage development rather than US demand conditions. The IV estimate is -0.73 jobs per 1,000 workers per $1,000 import exposure.


H. Practice

Concept Check

You construct a shift-share instrument for local labor market exposure to automation using national-level robot adoption rates as shocks and baseline industry employment shares as weights. The Rotemberg weight decomposition reveals that the automobile manufacturing industry accounts for 45% of the identifying variation. What should you do?

Concept Check

In a shift-share (Bartik) instrument, what is the key distinction between the 'shares' and 'shocks' identification strategies?

Guided Exercise

Shift-Share Instrument: Import Competition and Manufacturing Employment

Autor et al. (2013) study how Chinese import growth affected US manufacturing employment. Their shift-share instrument assigns each US commuting zone exposure to Chinese imports based on the zone's pre-existing industry mix (shares) and the national growth of Chinese imports in each industry (shocks). A zone with many textile workers gets high 'China shock' exposure because textile imports surged nationally.

What are the two ingredients of a shift-share instrument?

Under the 'shocks-based' identification framework, what must be true for the instrument to be valid?

Why do researchers typically measure industry shares in a 'pre-period' well before the outcome period?

What is a Rotemberg weight, and why is it useful in shift-share analyses?

Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of automation exposure on local unemployment. They construct a shift-share instrument: `Z_l = sum_k s_lk * g_k`, where `s_lk` is the share of industry k in location l's employment (measured in 2010) and `g_k` is the national growth in robot installations in industry k from 2010 to 2020. They run 2SLS with state fixed effects and find that a one-standard-deviation increase in automation exposure raises local unemployment by 1.8 percentage points (p = 0.01, F-stat = 18). They write: "Our shift-share instrument is valid because national robot adoption rates are plausibly exogenous to any individual local labor market."


Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of trade liberalization on firm productivity in Brazilian municipalities. They construct a Bartik instrument using 1991 industry shares and tariff reductions from 1991 to 2010. The Rotemberg weight analysis shows that the top 3 industries (footwear, textiles, and food processing) account for 72% of the identifying variation. Balance tests show that 1991 footwear shares are strongly correlated with pre-period education levels (p = 0.003) and urbanization (p = 0.01). The researcher reports: "We control for education and urbanization in the second stage, which addresses the balance test failure."


Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors study the causal effect of immigration on native wages in 722 US commuting zones from 1990 to 2010. They construct a shift-share instrument using 1980 immigrant shares by origin country as weights and decadal national immigrant inflows by origin country as shocks. The first-stage F-statistic is 32. They estimate that a 1 percentage point increase in the immigrant share reduces native wages by 0.8% (SE = 0.3%). They invoke the shocks-based framework of Borusyak et al. (2022) and report balance tests showing that origin-country-level shocks are uncorrelated with 1970 origin-country characteristics.

Key Table

Variable          OLS             IV (shift-share)
Immigrant share   -0.15 (0.08)    -0.80 (0.30)
F-stat                            32
Top Rotemberg weights:
  Mexico:           0.41
  Philippines:      0.09
  China:            0.08
  India:            0.07
  Vietnam:          0.06

Authors' Identification Claim

Under the shocks-based framework, the exogeneity of origin-country immigration flows to local US labor demand conditions ensures that the shift-share instrument isolates supply-driven variation in local immigrant concentration.


I. Swap-In: When to Use Something Else

  • Standard IV: When a single instrument with a clear exclusion restriction is available — the shift-share structure adds complexity that is only warranted when the instrument is inherently composed of shares and shocks.
  • Difference-in-differences: When the shock creates a clear before/after comparison for exposed versus unexposed regions, and parallel trends is directly defensible.
  • Synthetic control: When few regions are heavily exposed and constructing a data-driven counterfactual from donor regions is feasible.
  • OLS with controls: When the exposure variable is exogenous conditional on observables, so that controlling for observed confounders is sufficient and no IV structure is needed.

J. Reviewer Checklist

Critical Reading Checklist


Paper Library

Foundational (5)

Adao, R., Kolesar, M., & Morales, E. (2019). Shift-Share Designs: Theory and Inference.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjz025

Adao, Kolesar, and Morales show that standard errors in shift-share regressions are too small when computed with conventional clustering because residuals are correlated across regions that share similar industry compositions. They propose an inference procedure that accounts for this dependence.

Bartik, T. J. (1991). Who Benefits from State and Local Economic Development Policies?

W.E. Upjohn Institute for Employment Research. DOI: 10.17848/9780585223940

Bartik introduces the shift-share instrument—constructing predicted local employment growth from national industry growth rates interacted with initial local industry composition. This 'Bartik instrument' has become one of the most widely used instruments in labor and urban economics.

Borusyak, K., Hull, P., & Jaravel, X. (2022). Quasi-Experimental Shift-Share Research Designs.

Review of Economic Studies. DOI: 10.1093/restud/rdab030

Borusyak, Hull, and Jaravel provide an alternative framework where identification comes from the exogeneity of the shocks rather than the shares. They show that with many independent shocks, the instrument is valid even if shares are endogenous, greatly expanding the range of credible applications.

Goldsmith-Pinkham, P., Sorkin, I., & Swift, H. (2020). Bartik Instruments: What, When, Why, and How.

American Economic Review. DOI: 10.1257/aer.20181047

Goldsmith-Pinkham, Sorkin, and Swift provide a rigorous econometric framework for shift-share instruments, showing that the Bartik instrument can be decomposed into a weighted sum of individual share-based instruments. In their framework, identification rests on exogeneity of the initial shares rather than the shocks.

Jaeger, D. A., Ruist, J., & Stuhler, J. (2018). Shift-Share Instruments and the Impact of Immigration.

NBER Working Paper No. 24285. DOI: 10.3386/w24285

Jaeger, Ruist, and Stuhler highlight a threat to shift-share instruments in immigration research: serial correlation in immigrant inflows can bias estimates if past immigration affects current outcomes through channels other than current immigration. This paper raises important concerns about the exclusion restriction.

Application (3)

Autor, D. H., Dorn, D., & Hanson, G. H. (2013). The China Syndrome: Local Labor Market Effects of Import Competition in the United States.

American Economic Review. DOI: 10.1257/aer.103.6.2121

Autor, Dorn, and Hanson use a shift-share instrument to study how Chinese import competition affected U.S. local labor markets, instrumenting U.S. import exposure with Chinese exports to other high-income countries. This paper is one of the most influential and widely discussed shift-share applications.

Blanchard, O. J., & Katz, L. F. (1992). Regional Evolutions.

Brookings Papers on Economic Activity. DOI: 10.2307/2534556

Blanchard and Katz study regional labor market adjustment in the United States, analyzing how local employment shocks affect wages, unemployment, and migration. They construct a predicted-employment instrument using national industry growth interacted with local industry shares—the approach the subsequent literature calls the Bartik or shift-share instrument.

Card, D. (2001). Immigrant Inflows, Native Outflows, and the Local Labor Market Impacts of Higher Immigration.

Journal of Labor Economics. DOI: 10.1086/209979

Card uses a shift-share instrument based on historical settlement patterns of immigrant groups to predict current immigration flows to U.S. cities. This 'enclave instrument' has been adopted in hundreds of subsequent immigration studies and is a classic example of the shift-share approach.
