MethodAtlas
Design-Based · Modern

Shift-Share / Bartik Instruments

Uses national-level shocks interacted with local-level exposure to construct instruments for endogenous variables.

Quick Reference

When to Use
When you can decompose variation in an endogenous variable into national/industry shocks times local exposure shares, and use the interaction as an instrument for the endogenous variable.
Key Assumption
Either the shares are exogenous (Goldsmith-Pinkham et al. — treat shares as instruments) or the shocks are exogenous (Borusyak et al. — treat shocks as instruments). The appropriate interpretation depends on the research setting.
Common Mistake
Not testing which interpretation (shares vs. shocks exogeneity) is appropriate, or not reporting Rotemberg weights to assess which industries drive the estimate. A single dominant industry can make the instrument effectively a single-shock instrument.
Estimated Time
2.5 hours

One-Line Implementation

Stata: ivregress 2sls y controls (exposure = shift_share), first vce(robust)
R: feols(y ~ controls | 0 | exposure ~ shift_share, data = df, vcov = 'hetero')
Python: df['shift_share'] = shares @ shocks; IV2SLS(dependent=df['y'], exog=df[['const','controls']], endog=df['exposure'], instruments=df[['shift_share']]).fit(cov_type='robust')

Motivating Example

Between 1990 and 2007, China's manufacturing exports exploded. Some US labor markets were devastated. Others barely noticed. Why? Because local economies have different industry compositions. A city dominated by furniture manufacturing was hammered by Chinese competition. A city dominated by software development was not.

Autor et al. (2013) wanted to estimate the causal effect of Chinese import competition on US manufacturing employment. The challenge is obvious: local employment changes are driven by many factors beyond Chinese trade, and areas that are declining for other reasons might also be more exposed to import competition.

Their solution was a shift-share instrument, also known as a Bartik instrument. This approach extends the instrumental variables framework by constructing the instrument from two components. The idea: construct a predicted measure of local Chinese import exposure by interacting:

  • National shocks ("shifts"): the growth in Chinese imports in each industry nationwide
  • Local shares: the share of each industry in each local labor market's initial employment

The resulting instrument $Z_l = \sum_k s_{lk} \cdot g_k$ varies across locations because different places have different industry mixes, but the variation comes from national import growth, which is plausibly unrelated to local labor demand shocks.

This construction is widely used in applied economics — trade, immigration, fiscal policy, technology adoption. But for decades, researchers lacked clarity about exactly what makes the instrument valid. Recent work has clarified this question substantially.

A. Overview

A shift-share instrument takes the form:

$$Z_l = \sum_{k=1}^{K} s_{lk} \cdot g_k$$

where:

  • $l$ indexes locations (or other cross-sectional units)
  • $k$ indexes industries (or other categories)
  • $s_{lk}$ is the share of industry $k$ in location $l$ (typically measured in a base period)
  • $g_k$ is the shift (national growth rate or shock in industry $k$)

The instrument exploits the idea that national shocks ($g_k$) affect different locations differently because of their pre-existing industry compositions ($s_{lk}$).
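As a minimal sketch of this construction, the instrument is a matrix-vector product of the share matrix and the shock vector. The three locations, four industries, and all numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: 3 locations (rows) and 4 industries (columns).
# Each row of `shares` sums to 1 and plays the role of s_lk.
shares = np.array([
    [0.70, 0.10, 0.10, 0.10],   # location dominated by industry 0
    [0.10, 0.10, 0.10, 0.70],   # location dominated by industry 3
    [0.25, 0.25, 0.25, 0.25],   # diversified location
])

# National shock g_k for each industry (illustrative values only).
shocks = np.array([5.0, 2.0, 1.0, 0.0])

# Z_l = sum_k s_lk * g_k for every location at once.
Z = shares @ shocks
print(Z)
```

The location concentrated in the hardest-hit industry receives the largest instrument value, and the location concentrated in the unaffected industry receives the smallest.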

Two Frameworks

The breakthrough in recent scholarship has been recognizing that shift-share instruments can be justified through two fundamentally different sets of assumptions:

1. Shares-based identification (Goldsmith-Pinkham et al., 2020): The shares $s_{lk}$ are the source of exogenous variation. The shift-share IV estimator is equivalent to a GMM estimator that uses the shares as individual instruments, with the shocks serving as weights. This framework requires the shares to be uncorrelated with unobserved determinants of the outcome; in other words, initial industry composition must be exogenous.

2. Shocks-based identification (Borusyak et al., 2022): The shocks $g_k$ are the source of exogenous variation. This identification requires the shocks to be as-good-as-randomly assigned, but allows the shares to be endogenous. This approach is the more common justification in trade and immigration settings, where national-level changes (like China's industrial policy) are plausibly exogenous to any individual local labor market.

Common Confusions

"Are these two frameworks in conflict?" No. They are complementary. They identify the same parameter under different assumptions. The question is: in your setting, is it more plausible that the shares are exogenous or that the shocks are exogenous? The answer determines which diagnostics to run.

"Can I just run 2SLS and not worry about this distinction?" Technically yes — the first-stage regression and 2SLS machinery are the same regardless of interpretation. But you need to know why your instrument is valid so you can test the right assumptions. Under the shares interpretation, it is important to check that shares are balanced (uncorrelated with local covariates). Under the shocks interpretation, it is important to check that shocks are as-if-random.

"What about the exclusion restriction?" The exclusion restriction requires that the shift-share instrument affects the outcome only through the endogenous variable. In the China shock example, the instrument should affect local employment only through its effect on local import competition, not through other channels. This restriction is violated if, for example, areas with high manufacturing shares also experience technology shocks that affect employment independently of trade.

"How many industries do I need?" Under the Borusyak et al. framework, you need many shocks (large $K$) so that the law of large numbers kicks in and the instrument's exogeneity holds on average. If $K$ is small, each individual shock has too much influence, and the as-if-random assumption is harder to justify.

B. Identification

The Estimating Equation

The typical setup is:

Second stage: $Y_l = \alpha + \beta X_l + \gamma' \mathbf{W}_l + \varepsilon_l$

First stage: $X_l = \pi_0 + \pi_1 Z_l + \pi_2' \mathbf{W}_l + u_l$

where $X_l$ is the endogenous variable (e.g., change in local import exposure), $Z_l$ is the shift-share instrument, and $\mathbf{W}_l$ are controls.
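The two equations can be illustrated with a hand-rolled 2SLS on simulated data. This is a sketch under assumed values: the data-generating process, sample sizes, and coefficients are invented, and the `ols` helper is hypothetical rather than part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_loc, n_ind = 2000, 20                      # assumed sizes, for illustration only

# Simulated shares (rows sum to 1) and national shocks.
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(scale=3.0, size=n_ind)
Z = shares @ shocks                          # shift-share instrument Z_l

W = rng.normal(size=n_loc)                   # observed control W_l
confounder = rng.normal(size=n_loc)          # unobserved, drives both X and Y
X = 1.5 * Z + 0.5 * W + confounder + rng.normal(size=n_loc)
Y = 2.0 * X + 0.3 * W + 1.5 * confounder + rng.normal(size=n_loc)  # true beta = 2


def ols(y, design):
    """Least-squares coefficients of y on the columns of design."""
    return np.linalg.lstsq(design, y, rcond=None)[0]


const = np.ones(n_loc)

# First stage: X on (1, Z, W); keep the fitted values.
first_stage = np.column_stack([const, Z, W])
X_hat = first_stage @ ols(X, first_stage)

# Second stage: Y on (1, X_hat, W); the coefficient on X_hat is the 2SLS estimate.
beta_2sls = ols(Y, np.column_stack([const, X_hat, W]))[1]

# Naive OLS for comparison: biased upward by the confounder in this design.
beta_ols = ols(Y, np.column_stack([const, X, W]))[1]

print(f"OLS: {beta_ols:.3f}   2SLS: {beta_2sls:.3f}   truth: 2.000")
```

The manual second stage reproduces the 2SLS point estimate only. Its naive standard errors are not valid, so inference should come from a dedicated IV routine.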

Under the Shares Interpretation

Goldsmith-Pinkham et al. show that the shift-share IV estimator is numerically equivalent to a GMM estimator using the $K$ individual shares $\{s_{l1}, \ldots, s_{lK}\}$ as instruments:

$$\hat{\beta}^{SSIV} = \hat{\beta}^{GMM}(\alpha_1, \ldots, \alpha_K)$$

where the $\alpha_k$ are Rotemberg weights. Each weight scales with the shock $g_k$ and with how strongly share $k$ covaries with the endogenous variable, so it reflects that industry's contribution to identification. Key implication: a few industries typically drive the result. It is important to check which industries have the largest Rotemberg weights and verify that their shares are plausibly exogenous.

Diagnostics:

  • Report Rotemberg weights (which industries matter most)
  • Check balance: regress shares on local covariates for the high-weight industries
  • Over-identification test: if $K > 1$, you have multiple instruments and can test over-identifying restrictions
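The balance check in the list above can be sketched as a regression of one high-weight industry's share on local covariates, with heteroskedasticity-robust t-statistics. The function name `ols_with_robust_t`, the simulated covariates, and the coefficient values are all hypothetical.

```python
import numpy as np


def ols_with_robust_t(y, covariates):
    """OLS of y on an intercept plus covariates; returns (coef, HC1 t-stats)."""
    n = len(y)
    A = np.column_stack([np.ones(n), covariates])
    bread = np.linalg.inv(A.T @ A)
    coef = bread @ (A.T @ y)
    resid = y - A @ coef
    # HC1 sandwich: (A'A)^-1 [sum_i e_i^2 a_i a_i'] (A'A)^-1 * n / (n - p)
    meat = A.T @ (A * resid[:, None] ** 2)
    cov = bread @ meat @ bread * n / (n - A.shape[1])
    return coef, coef / np.sqrt(np.diag(cov))


# Illustrative data: the share loads on covariate 0 but not on covariate 1.
rng = np.random.default_rng(2)
n_loc = 1000
covariates = rng.normal(size=(n_loc, 2))     # e.g., education, urbanization
share_k = 0.20 + 0.05 * covariates[:, 0] + 0.02 * rng.normal(size=n_loc)

coef, t_stats = ols_with_robust_t(share_k, covariates)
print("coefficients:", coef.round(3))
print("robust t-stats:", t_stats.round(2))
```

A large t-statistic on a covariate signals that the share is not balanced on it, which weakens the case for treating that share as exogenous.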

Under the Shocks Interpretation

Borusyak et al. show that the shift-share IV can be recast as an estimator at the industry level. The key assumption is that shocks are as-good-as-randomly assigned:

$$E[g_k \cdot \bar{\varepsilon}_k] = 0$$

where $\bar{\varepsilon}_k = \sum_l s_{lk} \varepsilon_l / \sum_l s_{lk}$ is the exposure-weighted average of local residuals for industry $k$.

Diagnostics:

  • Check balance: regress shocks on industry-level characteristics
  • Examine whether shocks are correlated with pre-period trends
  • Verify that $K$ is large enough for the asymptotic approximation
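One way to gauge whether $K$ is large enough is the inverse Herfindahl index of the average exposure weights, a summary commonly reported in the shocks-based literature. The sketch below assumes a location-by-industry share matrix; the function name `effective_number_of_shocks` is hypothetical.

```python
import numpy as np


def effective_number_of_shocks(shares):
    """Inverse Herfindahl index of industry-level exposure weights.

    shares: array of shape (locations, industries) holding s_lk.
    """
    s_k = shares.sum(axis=0)          # total exposure to each industry
    s_k = s_k / s_k.sum()             # normalize weights to sum to one
    return 1.0 / np.sum(s_k ** 2)


# Evenly spread exposure across 4 industries gives an effective count of 4.
even = np.full((10, 4), 0.25)
# Exposure concentrated in one industry gives an effective count near 1.
concentrated = np.tile([0.97, 0.01, 0.01, 0.01], (10, 1))

print(effective_number_of_shocks(even))
print(effective_number_of_shocks(concentrated))
```

A value far below the nominal $K$ indicates that a handful of industries carry most of the exposure, so the many-shocks approximation deserves extra scrutiny.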

C. Visual Intuition

Imagine a map of the United States. Each local labor market has a pie chart showing its industry composition. Now imagine that China's rise disproportionately affects certain industries (shown in red). The areas with the largest red slices are the most "exposed" — they receive the largest values of the shift-share instrument.

The identifying variation comes from comparing outcomes in areas with large red slices (high exposure) to areas with small red slices (low exposure), where the redness of each industry is determined by national-level Chinese import growth.

Interactive Simulation

Shift-Share Instrument Construction

See how national industry shocks interact with local industry shares to produce different levels of predicted exposure across locations. Adjust the shocks and see how the instrument changes for each location.

[Interactive chart: simulated instrument value by location. Sliders set three industry shocks (range -5 to 5) and one count parameter (range 20 to 200).]
Interactive Simulation

Why Shift-Share IV?

DGP: 100 local areas, 5 national industries. Treatment depends on Bartik instrument (relevance = 1.5) + local confounders. Y = 2.0·D + 1.5·confounder + ε. First-stage F = 4.6.

[Scatter plot: Outcome (Y) against Bartik Instrument (Z). Each point is an area (size = exposure); the line shows the reduced form (Y on Z).]

Estimation Results

Estimator                  β̂       SE      95% CI          Bias
Naive OLS                  3.203   0.103   [3.00, 3.41]    +1.203
OLS + controls (closest)   2.007   0.107   [1.80, 2.22]    +0.007
Shift-Share IV             1.855   0.809   [0.27, 3.44]    -0.145
True β                     2.000
  • Cross-sectional units (local areas): 100
  • The causal effect of the local treatment on outcome: 2.0
  • How strongly the Bartik instrument predicts treatment: 1.5
  • Direct effect of Bartik Z on Y bypassing D (should be 0): 0.0

Why the difference?

Naive OLS is biased (+1.20) because local confounders (e.g., policies, geography) simultaneously affect both the local economic treatment and the outcome. The shift-share IV (Bartik instrument) isolates national industry-level shocks weighted by predetermined local exposure shares, purging local confounders. The IV estimate (1.855) is much closer to the true effect of 2.0 than naive OLS, although its standard error (0.809) is large. Warning: the first-stage F-statistic is 4.6, indicating a weak instrument. Increase relevance to strengthen the first stage.

D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: The shift-share IV estimator is equivalent to using each industry share as a separate instrument, weighted by how much that industry contributes to the identifying variation.

Define the shift-share instrument:

$$Z_l = \sum_k s_{lk} g_k = \mathbf{s}_l' \mathbf{g}$$

The 2SLS estimator using $Z_l$ as the instrument for $X_l$ can be written as follows. For exposition, we present the no-controls Wald-ratio form, with $Y_l$ and $X_l$ measured as deviations from their sample means; with controls $\mathbf{W}_l$ (as defined above), $Y_l$ and $X_l$ are first residualized on $\mathbf{W}_l$ and the instrument enters a standard 2SLS framework:

$$\hat{\beta}^{SSIV} = \frac{\sum_l Z_l Y_l}{\sum_l Z_l X_l} = \frac{\sum_l \left(\sum_k s_{lk} g_k\right) Y_l}{\sum_l \left(\sum_k s_{lk} g_k\right) X_l}$$

Goldsmith-Pinkham et al. (2020) show this equals:

$$\hat{\beta}^{SSIV} = \sum_k \alpha_k \hat{\beta}_k$$

where $\hat{\beta}_k = \sum_l s_{lk} Y_l / \sum_l s_{lk} X_l$ is the just-identified IV estimate using $s_{lk}$ alone as the instrument, and $\alpha_k$ are the Rotemberg weights:

$$\alpha_k = \frac{g_k \sum_l s_{lk} X_l}{\sum_{k'} g_{k'} \sum_l s_{lk'} X_l}$$

where $X_l$ is the endogenous variable (demeaned, or residualized on $\mathbf{W}_l$ when controls are included). Substituting $\hat{\beta}_k$ shows that the terms $\sum_l s_{lk} X_l$ cancel, leaving $\sum_l Z_l Y_l / \sum_l Z_l X_l$. These weights sum to one but can be negative (for industries where $g_k \sum_l s_{lk} X_l$ has the opposite sign from the denominator).
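The decomposition can be checked numerically. The sketch below uses simulated data and the no-controls form, with weights built from the demeaned endogenous variable (the no-controls analogue of residualizing on controls). All data and parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_loc, n_ind = 500, 8                        # assumed sizes

shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(size=n_ind)
Z = shares @ shocks

confounder = rng.normal(size=n_loc)
X = 1.5 * Z + confounder + rng.normal(size=n_loc)
Y = 2.0 * X + 1.5 * confounder + rng.normal(size=n_loc)

# Demean X and Y (the no-controls version of residualizing on controls).
Xr = X - X.mean()
Yr = Y - Y.mean()

# Shift-share IV estimate in Wald-ratio form.
beta_ssiv = (Z @ Yr) / (Z @ Xr)

# Industry-by-industry pieces: sum_l s_lk X_l and sum_l s_lk Y_l.
share_x = shares.T @ Xr
share_y = shares.T @ Yr

beta_k = share_y / share_x                            # just-identified IV per share
alpha = shocks * share_x / np.sum(shocks * share_x)   # Rotemberg weights

print("weights sum to:", alpha.sum())
print("weighted sum of beta_k:", np.sum(alpha * beta_k))
print("shift-share IV estimate:", beta_ssiv)
```

Some of the `alpha` values may be negative, and an individual `beta_k` can be large when its share barely covaries with `X`; the weighted sum still reproduces the overall estimate.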

Under the Borusyak et al. (2022) shocks interpretation, the estimator can be rewritten at the industry level:

$$\hat{\beta}^{SSIV} = \frac{\sum_k \hat{s}_k g_k \bar{Y}_k}{\sum_k \hat{s}_k g_k \bar{X}_k}$$

where $\hat{s}_k = \sum_l s_{lk}$ and $\bar{Y}_k = \sum_l s_{lk} Y_l / \hat{s}_k$ is the exposure-weighted average outcome for industry $k$, with $\bar{X}_k$ defined analogously. This expression is just a weighted IV regression at the industry level, which makes the exogeneity requirement on $g_k$ transparent.
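The industry-level rewriting can also be verified on simulated data. In this sketch the location-level and industry-level formulas are computed separately and compared; the data-generating values are invented, and $Y_l$ and $X_l$ are demeaned as in the no-controls form.

```python
import numpy as np

rng = np.random.default_rng(3)
n_loc, n_ind = 400, 12                       # assumed sizes

shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(size=n_ind)
Z = shares @ shocks

confounder = rng.normal(size=n_loc)
X = 1.5 * Z + confounder + rng.normal(size=n_loc)
Y = 2.0 * X + 1.5 * confounder + rng.normal(size=n_loc)
Xr, Yr = X - X.mean(), Y - Y.mean()

# Location-level estimator: sum_l Z_l Y_l / sum_l Z_l X_l.
beta_location = (Z @ Yr) / (Z @ Xr)

# Industry-level ingredients.
s_hat = shares.sum(axis=0)                   # s_hat_k = sum_l s_lk
Y_bar = (shares.T @ Yr) / s_hat              # exposure-weighted mean outcome
X_bar = (shares.T @ Xr) / s_hat              # exposure-weighted mean treatment

# Industry-level estimator: weights s_hat_k, instrument g_k.
beta_industry = np.sum(s_hat * shocks * Y_bar) / np.sum(s_hat * shocks * X_bar)

print(beta_location, beta_industry)
```

The two numbers agree up to floating-point error, because both formulas rearrange the same double sum over locations and industries.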

E. Implementation

library(fixest)
# Install from GitHub: devtools::install_github("jjchern/bartik.weight")
library(bartik.weight)

# Step 1: Construct shift-share instrument
# Z_l = sum_k (share_lk * shock_k): inner product of shares and shocks
# shocks must be a numeric vector ordered like share_cols
df$shift_share <- as.numeric(as.matrix(df[, share_cols]) %*% shocks)

# Step 2: Two-stage least squares using fixest
# outcome ~ controls | fixed_effects | endogenous ~ instrument
est <- feols(outcome ~ controls | 0 | exposure ~ shift_share,
           data = df, vcov = "HC1")
summary(est)

# Step 3: Rotemberg weights (Goldsmith-Pinkham et al. 2020)
# Decomposes the IV estimate into industry-level contributions
# alpha_k shows how much each industry drives the overall estimate
bw_result <- bw(
  master = df, y = "outcome", x = "exposure",
  controls = control_cols, weight = "weight_col",
  local = df, Z = share_cols,
  global = shocks_df, G = "shock_growth"
)
summary(bw_result)

# Step 4: Examine top 5 industries by Rotemberg weight
# These industries must pass exogeneity / balance tests
# Sorting rows (not just the alpha vector) keeps the industry identifiers
top_weights <- head(bw_result[order(bw_result$alpha, decreasing = TRUE), ], 5)
print(top_weights)

F. Diagnostics

  1. First-stage F-statistic. As with any IV, check for weak instruments. F > 10 is the traditional threshold, though more recent guidance (Lee et al., 2022) suggests stricter thresholds.

  2. Rotemberg weights. Report the top 5-10 industries by Rotemberg weight. Check whether these industries are plausibly exogenous. If one industry dominates, your results depend heavily on that industry.

  3. Balance checks. Under shares-based identification: regress shares on pre-period covariates. Under shocks-based identification: regress shocks on industry-level characteristics.

  4. Over-identification test. Under the shares interpretation, you have KK instruments but estimate only one parameter. The Sargan-Hansen J-test checks whether the over-identifying restrictions hold. A rejection suggests that at least some shares are not valid instruments.

  5. Leave-one-industry-out. Re-estimate removing the top Rotemberg weight industry. If results change dramatically, they are fragile.

  6. Pre-trend tests. If you have multiple periods, check whether the shift-share instrument predicts outcome changes in the pre-period.
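Two of the diagnostics above, the first-stage F-statistic and the leave-one-industry-out check, can be sketched as follows. The helper names are hypothetical, the F-statistic is the homoskedastic version for a single instrument, and dropped industries are simply omitted without renormalizing the remaining shares (one of several possible conventions).

```python
import numpy as np


def iv_estimate(Z, X, Y):
    """Wald-ratio IV estimate with an intercept (X and Y demeaned)."""
    return (Z @ (Y - Y.mean())) / (Z @ (X - X.mean()))


def first_stage_f(Z, X):
    """Homoskedastic first-stage F for one instrument (squared t-statistic)."""
    Zr, Xr = Z - Z.mean(), X - X.mean()
    pi_hat = (Zr @ Xr) / (Zr @ Zr)
    resid = Xr - pi_hat * Zr
    sigma2 = (resid @ resid) / (len(X) - 2)
    return pi_hat ** 2 * (Zr @ Zr) / sigma2


def leave_one_industry_out(shares, shocks, X, Y):
    """Re-estimate the IV coefficient dropping each industry in turn."""
    n_ind = shares.shape[1]
    estimates = np.empty(n_ind)
    for k in range(n_ind):
        keep = np.arange(n_ind) != k
        estimates[k] = iv_estimate(shares[:, keep] @ shocks[keep], X, Y)
    return estimates


# Simulated data with a true effect of 2.0 (all values are assumptions).
rng = np.random.default_rng(4)
n_loc, n_ind = 2000, 20
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(scale=3.0, size=n_ind)
Z = shares @ shocks
confounder = rng.normal(size=n_loc)
X = 1.5 * Z + confounder + rng.normal(size=n_loc)
Y = 2.0 * X + 1.5 * confounder + rng.normal(size=n_loc)

f_stat = first_stage_f(Z, X)
loo = leave_one_industry_out(shares, shocks, X, Y)
print(f"first-stage F: {f_stat:.1f}")
print(f"leave-one-out range: [{loo.min():.3f}, {loo.max():.3f}]")
```

In applied work the industry to drop first is the one with the largest Rotemberg weight. Looping over all industries, as here, shows the full range of sensitivity.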

Interpreting Your Results

When reporting shift-share IV results, it is important to be clear about which identification framework you are invoking. State explicitly: "We interpret the shift-share instrument under the [shares/shocks]-based framework of [Goldsmith-Pinkham et al./Borusyak et al.]."

Report the Rotemberg weights and discuss which industries drive identification. If 80% of the identifying variation comes from three industries, the reader needs to evaluate whether those three industries satisfy the exogeneity conditions.

G. What Can Go Wrong

Assumption Failure Demo

Rotemberg Weights Reveal Identification Driven by a Single Industry

Researcher studies the effect of immigration on native wages using a shift-share instrument where national immigrant inflows by origin country are the shocks and local baseline immigrant shares by origin are the exposure shares. They compute Rotemberg weights and find that the top 5 origin countries account for 38% of the identifying variation, with no single country exceeding 12%.

The diversified identification source supports the Borusyak et al. (2022) shocks-based interpretation: with many origin countries contributing, the as-if-random assignment of shocks averages out idiosyncratic concerns. Leave-one-out estimates removing each top-5 country range from -0.28 to -0.35, tightly bracketing the baseline estimate of -0.31.

Assumption Failure Demo

Using Contemporaneous Shares Instead of Baseline Shares

Researcher constructs a Bartik instrument for local exposure to technology shocks using 1990 Census industry employment shares as weights, applied to 2000-2015 national technology adoption growth rates.

The baseline 1990 shares are predetermined relative to the 2000-2015 outcome period. Under the shares-based interpretation, the researcher can credibly argue that 1990 industry composition reflects historical comparative advantage and is plausibly uncorrelated with 2000-2015 local demand shocks.

Assumption Failure Demo

Exclusion Restriction Violated Through Correlated Sectoral Shocks

Researcher studies the effect of Chinese import competition on local manufacturing employment using the Autor et al. (2013) shift-share instrument. They carefully address the concern that Chinese import growth in an industry might be correlated with US domestic technology shocks in the same industry by instrumenting with Chinese exports to other high-income countries.

Using Chinese exports to 8 other high-income countries as an alternative measure of Chinese supply shocks isolates the supply-driven component of Chinese trade growth. The instrument captures China's comparative advantage development rather than US demand conditions. The IV estimate is -0.73 jobs per 1,000 workers per $1,000 import exposure.

H. Practice

Concept Check

You construct a shift-share instrument for local labor market exposure to automation using national-level robot adoption rates as shocks and baseline industry employment shares as weights. The Rotemberg weight decomposition reveals that the automobile manufacturing industry accounts for 45% of the identifying variation. What should you do?

Guided Exercise

Shift-Share Instrument: Import Competition and Manufacturing Employment

Autor et al. (2013) study how Chinese import growth affected US manufacturing employment. Their shift-share instrument assigns each US commuting zone exposure to Chinese imports based on the zone's pre-existing industry mix (shares) and the national growth of Chinese imports in each industry (shocks). A zone with many textile workers gets high 'China shock' exposure because textile imports surged nationally.

What are the two ingredients of a shift-share instrument?

Under the 'shocks-based' identification framework, what must be true for the instrument to be valid?

Why do researchers typically measure industry shares in a 'pre-period' well before the outcome period?

What is a Rotemberg weight, and why is it useful in shift-share analyses?

Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of automation exposure on local unemployment. They construct a shift-share instrument: Z_l = sum_k s_lk * g_k, where s_lk is the share of industry k in location l's employment (measured in 2010) and g_k is the national growth in robot installations in industry k from 2010 to 2020. They run 2SLS with state fixed effects and find that a one-standard-deviation increase in automation exposure raises local unemployment by 1.8 percentage points (p = 0.01, F-stat = 18). They write: "Our shift-share instrument is valid because national robot adoption rates are plausibly exogenous to any individual local labor market."

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of trade liberalization on firm productivity in Brazilian municipalities. They construct a Bartik instrument using 1991 industry shares and tariff reductions from 1991 to 2010. The Rotemberg weight analysis shows that the top 3 industries (footwear, textiles, and food processing) account for 72% of the identifying variation. Balance tests show that 1991 footwear shares are strongly correlated with pre-period education levels (p = 0.003) and urbanization (p = 0.01). The researcher reports: "We control for education and urbanization in the second stage, which addresses the balance test failure."

Select all errors you can find:

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors study the causal effect of immigration on native wages in 722 US commuting zones from 1990 to 2010. They construct a shift-share instrument using 1980 immigrant shares by origin country as weights and decadal national immigrant inflows by origin country as shocks. The first-stage F-statistic is 32. They estimate that a 1 percentage point increase in the immigrant share reduces native wages by 0.8% (SE = 0.3%). They invoke the shocks-based framework of Borusyak et al. (2022) and report balance tests showing that origin-country-level shocks are uncorrelated with 1970 origin-country characteristics.

Key Table

Variable          OLS             IV (shift-share)
Immigrant share   -0.15 (0.08)    -0.80 (0.30)
F-stat            --              32
Top Rotemberg weights:
  Mexico:           0.41
  Philippines:      0.09
  China:            0.08
  India:            0.07
  Vietnam:          0.06

Authors' Identification Claim

Under the shocks-based framework, the exogeneity of origin-country immigration flows to local US labor demand conditions ensures that the shift-share instrument isolates supply-driven variation in local immigrant concentration.

I. Swap-In: When to Use Something Else

  • Standard IV: When a single instrument with a clear exclusion restriction is available — the shift-share structure adds complexity that is only warranted when the instrument is inherently composed of shares and shocks.
  • Difference-in-differences: When the shock creates a clear before/after comparison for exposed versus unexposed regions, and parallel trends is directly defensible.
  • Synthetic control: When few regions are heavily exposed and constructing a data-driven counterfactual from donor regions is feasible.
  • OLS with controls: When the exposure variable is exogenous conditional on observables and the primary concern is confounding rather than endogeneity — no IV structure is needed.

J. Reviewer Checklist

Critical Reading Checklist


Paper Library

Foundational (4)

Bartik, T. J. (1991). Who Benefits from State and Local Economic Development Policies?

W.E. Upjohn Institute for Employment Research. DOI: 10.17848/9780585223940

Bartik introduced the shift-share instrument—constructing predicted local employment growth from national industry growth rates interacted with initial local industry composition. This 'Bartik instrument' has become one of the most widely used instruments in labor and urban economics.

Goldsmith-Pinkham, P., Sorkin, I., & Swift, H. (2020). Bartik Instruments: What, When, Why, and How.

American Economic Review. DOI: 10.1257/aer.20181047

This paper provided the first rigorous econometric framework for shift-share instruments, showing that the Bartik instrument can be decomposed into a weighted sum of individual share-based instruments. They clarified that, under this framework, identification rests on exogeneity of the initial shares rather than the shocks.

Borusyak, K., Hull, P., & Jaravel, X. (2022). Quasi-Experimental Shift-Share Research Designs.

Review of Economic Studies. DOI: 10.1093/restud/rdab030

Borusyak, Hull, and Jaravel provided an alternative framework where identification comes from the exogeneity of the shocks rather than the shares. They showed that with many independent shocks, the instrument is valid even if shares are endogenous, greatly expanding the range of credible applications.

Adao, R., Kolesar, M., & Morales, E. (2019). Shift-Share Designs: Theory and Inference.

Quarterly Journal of Economics. DOI: 10.1093/qje/qjz025

Adao, Kolesar, and Morales showed that standard errors in shift-share regressions are too small when computed with conventional clustering because residuals are correlated across regions that share similar industry compositions. They proposed an inference procedure that accounts for this dependence.

Application (4)

Autor, D. H., Dorn, D., & Hanson, G. H. (2013). The China Syndrome: Local Labor Market Effects of Import Competition in the United States.

American Economic Review. DOI: 10.1257/aer.103.6.2121

Autor, Dorn, and Hanson used a shift-share instrument to study how Chinese import competition affected U.S. local labor markets, instrumenting U.S. import exposure with Chinese exports to other high-income countries. This paper is one of the most influential and widely discussed shift-share applications.

Blanchard, O. J., & Katz, L. F. (1992). Regional Evolutions.

Brookings Papers on Economic Activity. DOI: 10.2307/2534556

Blanchard and Katz used the Bartik shift-share instrument to study regional labor market adjustment in the United States, analyzing how local employment shocks affect wages, unemployment, and migration. This paper is one of the earliest and most influential applications of the shift-share IV strategy.

Card, D. (2001). Immigrant Inflows, Native Outflows, and the Local Labor Market Impacts of Higher Immigration.

Journal of Labor Economics. DOI: 10.1086/209979

Card used a shift-share instrument based on historical settlement patterns of immigrant groups to predict current immigration flows to U.S. cities. This 'enclave instrument' has been adopted in hundreds of subsequent immigration studies and is a classic example of the shift-share approach.

Greenland, A., & Loualiche, E. (2024). Financial Implications of Supply Chain Disruptions: Evidence from the Japanese Tsunami.

Management Science. DOI: 10.1287/mnsc.2023.4855

Greenland and Loualiche used a shift-share instrument based on pre-existing supplier linkages and the geographic incidence of the 2011 Japanese tsunami to identify the causal effects of supply chain disruptions on U.S. firms' stock returns and real outcomes. The paper illustrates how the Bartik-style approach extends naturally to settings where firm-level exposure shares interact with exogenous shocks, providing a clean identification strategy in management and finance research.

Survey (1)

Jaeger, D. A., Ruist, J., & Stuhler, J. (2018). Shift-Share Instruments and the Impact of Immigration.

NBER Working Paper No. 24285. DOI: 10.3386/w24285

Jaeger, Ruist, and Stuhler highlighted a threat to shift-share instruments in immigration research: serial correlation in immigrant inflows can bias estimates if past immigration affects current outcomes through channels other than current immigration. This paper raised important concerns about the exclusion restriction.

Tags

design-based, instrument, exposure