Shift-Share / Bartik Instruments
Uses national-level shocks interacted with local-level exposure to construct instruments for endogenous variables.
One-Line Implementation
R:
feols(y ~ controls | 0 | exposure ~ shift_share, data = df, vcov = 'hetero')
Stata:
ivregress 2sls y controls (exposure = shift_share), first vce(robust)
Python:
df['shift_share'] = shares @ shocks
IV2SLS(dependent=df['y'], exog=df[['const','controls']], endog=df['exposure'], instruments=df[['shift_share']]).fit(cov_type='robust')
Motivating Example: The China Trade Shock
Between 1990 and 2007, China's manufacturing exports exploded. Some US labor markets were devastated. Others barely noticed. Why? Because local economies have different industry compositions. A city dominated by furniture manufacturing was hammered by Chinese competition. A city dominated by software development was not.
Autor et al. (2013) wanted to estimate the causal effect of Chinese import competition on US manufacturing employment. The challenge is obvious: local employment changes are driven by many factors beyond Chinese trade, and areas that are declining for other reasons might also be more exposed to import competition.
Their solution was a shift-share instrument, also known as a Bartik instrument. This approach extends the instrumental variables framework by constructing the instrument from two components. The idea is to construct a predicted measure of local Chinese import exposure by interacting:
- National shocks ("shifts"): the growth in Chinese imports in each industry nationwide
- Local shares: the share of each industry in each local labor market's initial employment
The resulting instrument varies across locations because different places have different industry mixes, but the variation comes from national import growth, which is plausibly unrelated to local labor demand shocks.
This construction is widely used in applied economics — trade, immigration, fiscal policy, technology adoption. But for decades, researchers lacked clarity about exactly what makes the instrument valid. Recent work has clarified this question substantially.
A. Overview
A shift-share instrument takes the form:
$$Z_\ell = \sum_{k=1}^{K} s_{\ell k} \, g_k$$
where:
- $\ell$ indexes locations (or other cross-sectional units)
- $k = 1, \dots, K$ indexes industries (or other categories)
- $s_{\ell k}$ is the share of industry $k$ in location $\ell$ (typically measured in a base period)
- $g_k$ is the shift (national growth rate or shock in industry $k$)
The instrument exploits the idea that national shocks ($g_k$) affect different locations differently because of their pre-existing industry compositions ($s_{\ell k}$).
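As a minimal sketch of this formula, the instrument is a matrix-vector product of the share matrix and the shock vector. The three locations, two industries, and all numbers below are hypothetical toy values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical toy data: 3 locations (rows) by 2 industries (columns).
shares = np.array([
    [0.8, 0.2],   # location 0: mostly industry 0
    [0.5, 0.5],   # location 1: evenly split
    [0.1, 0.9],   # location 2: mostly industry 1
])
shocks = np.array([10.0, 1.0])  # national shock g_k for each industry

# Z_l = sum_k s_lk * g_k, computed for every location at once.
shift_share = shares @ shocks
print(shift_share)
```

Location 0 receives the largest instrument value because most of its employment sits in the industry with the large national shock.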
Two Frameworks
The breakthrough in recent scholarship has been recognizing that shift-share instruments can be justified through two fundamentally different sets of assumptions:
1. Shares-based identification (Goldsmith-Pinkham et al., 2020): The shares are the source of exogenous variation. The shift-share IV estimator is numerically equivalent to a GMM estimator that uses the individual industry shares as instruments, with the shocks entering only through the weight matrix. This framework requires the shares to be uncorrelated with unobserved determinants of the outcome; that is, initial industry composition must be exogenous.
2. Shocks-based identification (Borusyak et al., 2022): The shocks are the source of exogenous variation. This identification requires the shocks to be as-good-as-randomly assigned, but allows the shares to be endogenous. This approach is the more common justification in trade and immigration settings, where national-level changes (like China's industrial policy) are plausibly exogenous to any individual local labor market.
Common Confusions
"Are these two frameworks in conflict?" No. They are complementary. They identify the same parameter under different assumptions. The question is: in your setting, is it more plausible that the shares are exogenous or that the shocks are exogenous? The answer determines which diagnostics to run.
"Can I just run 2SLS and not worry about this distinction?" Technically yes: the first-stage regression and 2SLS machinery are the same regardless of interpretation. But you need to know why your instrument is valid so you can test the right assumptions. Under the shares interpretation, it is important to check that shares are balanced (uncorrelated with local covariates). Under the shocks interpretation, it is important to check that shocks are as-if-random.
"What about the exclusion restriction?" The exclusion restriction requires that the shift-share instrument affects the outcome only through the endogenous variable. In the China shock example, the instrument should affect local employment only through its effect on local import competition, not through other channels. This restriction is violated if, for example, areas with high manufacturing shares also experience technology shocks that affect employment independently of trade.
"How many industries do I need?" Under the Borusyak et al. (2022) framework, you need many shocks (large $K$, the number of industries) so that the law of large numbers kicks in and the instrument's exogeneity holds on average. If $K$ is small, each individual shock has too much influence, and the as-if-random assumption is harder to justify.
B. Identification
The Estimating Equation
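The typical setup is:
Second stage:
$$y_\ell = \beta x_\ell + W_\ell' \gamma + \varepsilon_\ell$$
First stage:
$$x_\ell = \pi Z_\ell + W_\ell' \delta + u_\ell$$
where $y_\ell$ is the outcome, $x_\ell$ is the endogenous variable (e.g., change in local import exposure), $Z_\ell$ is the shift-share instrument, and $W_\ell$ are controls.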
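To make the two stages concrete, here is a minimal sketch of 2SLS computed by hand on simulated data. Everything about the data-generating process (the true coefficient of 2.0, the unobserved confounder `u`, the Dirichlet shares) is an assumption made for illustration, and the helper names `two_sls` and `ols` are hypothetical. In practice a packaged estimator would also supply standard errors.

```python
import numpy as np

def two_sls(y, x, z, controls):
    """Return the 2SLS coefficient on x, using z as the excluded instrument."""
    const = np.ones(len(y))
    W = np.column_stack([const, controls])   # included exogenous regressors
    Z = np.column_stack([z, W])              # full instrument set
    X = np.column_stack([x, W])              # endogenous variable plus controls
    # First stage: fitted values of X from a projection on Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the first-stage fitted values.
    coef = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    return coef[0]

def ols(y, x, controls):
    """Return the OLS coefficient on x, for comparison."""
    X = np.column_stack([x, np.ones(len(y)), controls])
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_loc, n_ind = 20_000, 20
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)   # each row sums to one
shocks = rng.normal(scale=5.0, size=n_ind)
z = shares @ shocks                                   # shift-share instrument
w = rng.normal(size=n_loc)                            # observed control
u = rng.normal(size=n_loc)                            # unobserved confounder
x = 1.0 * z + 0.5 * w + u + rng.normal(size=n_loc)    # first stage
y = 2.0 * x + 0.3 * w - 1.5 * u + rng.normal(size=n_loc)

beta_iv = two_sls(y, x, z, w)
beta_ols = ols(y, x, w)
print(f"2SLS: {beta_iv:.3f}   OLS: {beta_ols:.3f}")
```

Because `u` raises `x` and lowers `y`, OLS is biased downward in this simulation, while the 2SLS estimate should land close to the assumed true value of 2.0.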
Under the Shares Interpretation
Goldsmith-Pinkham et al. (2020) show that the shift-share instrumental variables (IV) estimator is numerically equivalent to a Generalized Method of Moments (GMM) estimator using the individual shares as instruments:
$$\hat\beta_{\text{Bartik}} = \sum_{k=1}^{K} \hat\alpha_k \hat\beta_k$$
where $\hat\beta_k$ is the just-identified IV estimate that uses the share $s_{\ell k}$ alone as the instrument, and $\hat\alpha_k$ are Rotemberg weights that reflect each industry's contribution to identification. Key implication: a few industries typically drive the result. It is important to check which industries have the largest Rotemberg weights and verify that their shares are plausibly exogenous.
Diagnostics:
- Report Rotemberg weights (which industries matter most)
- Check balance: regress shares on local covariates for the high-weight industries
- Over-identification test: if $K > 1$ (more than one industry share), you have multiple instruments and can test over-identifying restrictions
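The first diagnostic can be sketched numerically. The snippet below computes Rotemberg weights and the per-industry just-identified estimates in the simplest case: no controls and no regression weights, so variables are only demeaned. The function name `rotemberg_weights` and the simulated data are hypothetical; the `bartik.weight` package handles controls and weights and should be preferred for real work.

```python
import numpy as np

def rotemberg_weights(y, x, shares, shocks):
    """No-controls, unweighted Rotemberg weights and just-identified estimates."""
    y_c = y - y.mean()
    x_c = x - x.mean()
    sx = shares.T @ x_c          # sum_l s_lk * x_l (demeaned), one entry per industry
    sy = shares.T @ y_c          # sum_l s_lk * y_l (demeaned)
    alpha = shocks * sx / np.sum(shocks * sx)   # Rotemberg weight alpha_k
    beta_k = sy / sx                            # IV estimate using share k alone
    return alpha, beta_k

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(0)
n_loc, n_ind = 500, 8
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(size=n_ind)
z = shares @ shocks
x = z + rng.normal(size=n_loc)
y = 2.0 * x + rng.normal(size=n_loc)

alpha, beta_k = rotemberg_weights(y, x, shares, shocks)

# Overall shift-share IV estimate (Wald ratio with demeaned variables).
beta_bartik = (z @ (y - y.mean())) / (z @ (x - x.mean()))

print("weights sum to:", alpha.sum())
print("weighted sum of beta_k:", alpha @ beta_k, " Bartik IV:", beta_bartik)
print("industries ranked by weight:", np.argsort(-alpha))
```

The weighted sum of the per-industry estimates reproduces the overall estimate exactly, which is what makes the ranking of `alpha` a useful guide to which shares need the closest scrutiny.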
Under the Shocks Interpretation
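Borusyak et al. (2022) show that the shift-share IV can be recast as an estimator at the industry level. The key assumption is that shocks are as-good-as-randomly assigned:
$$\mathbb{E}[g_k \mid \bar\varepsilon_k] = \mu \quad \text{for all } k$$
where $\bar\varepsilon_k = \sum_\ell s_{\ell k} \varepsilon_\ell / \sum_\ell s_{\ell k}$ is the exposure-weighted average of local residuals for industry $k$.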
Diagnostics:
- Check balance: regress shocks on industry-level characteristics
- Examine whether shocks are correlated with pre-period trends
- Verify that $K$ (the number of shocks) is large enough for the asymptotic approximation
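Two of these checks can be sketched at the industry level. The first function regresses shocks on an industry characteristic, weighting by average exposure; the second summarizes how concentrated exposure is through the inverse Herfindahl index of the exposure weights, one common way to express an "effective" number of shocks. The function names, the simulated data, and the choice of HC0 robust standard errors are assumptions made for illustration.

```python
import numpy as np

def shock_balance(shocks, characteristic, s_k):
    """Exposure-weighted regression of shocks on one industry characteristic.

    Returns the slope and its t-statistic (HC0 robust variance).
    """
    shocks = np.asarray(shocks, dtype=float)
    root_w = np.sqrt(np.asarray(s_k, dtype=float))
    X = np.column_stack([np.ones_like(shocks), characteristic])
    Xw, yw = X * root_w[:, None], shocks * root_w
    coef = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    resid = yw - Xw @ coef
    bread = np.linalg.inv(Xw.T @ Xw)
    meat = (Xw * resid[:, None] ** 2).T @ Xw
    se = np.sqrt(np.diag(bread @ meat @ bread))
    return coef[1], coef[1] / se[1]

def effective_num_shocks(s_k):
    """Inverse Herfindahl index of the (normalized) exposure weights."""
    s = np.asarray(s_k, dtype=float)
    s = s / s.sum()
    return 1.0 / np.sum(s ** 2)

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(1)
n_loc, n_ind = 500, 200
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
s_k = shares.mean(axis=0)              # average exposure to each industry
shocks = rng.normal(size=n_ind)
pre_char = rng.normal(size=n_ind)      # hypothetical pre-period characteristic

slope, t_stat = shock_balance(shocks, pre_char, s_k)
print(f"balance slope: {slope:.3f}, t-stat: {t_stat:.2f}")
print(f"effective number of shocks: {effective_num_shocks(s_k):.1f}")
```

A slope far from zero would suggest the shocks are predictable from industry characteristics, and an effective number of shocks far below the nominal industry count would suggest that a handful of industries carry most of the exposure.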
C. Visual Intuition
Imagine a map of the United States. Each local labor market has a pie chart showing its industry composition. Now imagine that China's rise disproportionately affects certain industries (shown in red). The areas with the largest red slices are the most "exposed" — they receive the largest values of the shift-share instrument.
The identifying variation comes from comparing outcomes in areas with large red slices (high exposure) to areas with small red slices (low exposure), where the redness of each industry is determined by national-level Chinese import growth.
D. Mathematical Derivation
Don't worry about the notation yet — here's what this means in words: The shift-share IV estimator is equivalent to using each industry share as a separate instrument, weighted by how much that industry contributes to the identifying variation.
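Define the shift-share instrument:
$$Z_\ell = \sum_{k=1}^{K} s_{\ell k} \, g_k$$
The 2SLS estimator using $Z_\ell$ as the instrument for the endogenous variable $x_\ell$ can be written as follows. For exposition, we present the no-controls Wald-ratio form, in which $y_\ell^\perp$ and $x_\ell^\perp$ denote the demeaned outcome and endogenous variable; with controls $W_\ell$ (as defined above), $y_\ell$ and $x_\ell$ are first residualized on $W_\ell$ and the instrument enters a standard 2SLS framework:
$$\hat\beta = \frac{\sum_\ell Z_\ell \, y_\ell^\perp}{\sum_\ell Z_\ell \, x_\ell^\perp}$$
Goldsmith-Pinkham et al. (2020) show this equals:
$$\hat\beta = \sum_{k=1}^{K} \hat\alpha_k \hat\beta_k$$
where $\hat\beta_k = \sum_\ell s_{\ell k} \, y_\ell^\perp \big/ \sum_\ell s_{\ell k} \, x_\ell^\perp$ is the just-identified IV estimate using $s_{\ell k}$ alone as the instrument, and $\hat\alpha_k$ are the Rotemberg weights:
$$\hat\alpha_k = \frac{g_k \sum_\ell s_{\ell k} \, x_\ell^\perp}{\sum_{k'} g_{k'} \sum_\ell s_{\ell k'} \, x_\ell^\perp}$$
where $x_\ell^\perp$ is the residualized (here, demeaned) endogenous variable, so $\sum_\ell s_{\ell k} \, x_\ell^\perp$ measures the first-stage covariance between share $k$ and $x_\ell$. These weights sum to one but can be negative (when the denominator is positive, for industries where the shock $g_k$ and the share-level first-stage covariance have opposite signs).
Under the Borusyak et al. (2022) shocks interpretation, the estimator can be rewritten at the industry level:
$$\hat\beta = \frac{\sum_k s_k \, g_k \, \bar y_k^\perp}{\sum_k s_k \, g_k \, \bar x_k^\perp}$$
where $s_k = \sum_\ell s_{\ell k}$ and $\bar y_k^\perp = \sum_\ell s_{\ell k} \, y_\ell^\perp / s_k$ is the exposure-weighted average outcome for industry $k$ (with $\bar x_k^\perp$ defined analogously). This expression is just a weighted IV regression at the industry level, which makes the exogeneity requirement on $g_k$ transparent.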
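The industry-level rewriting can be checked numerically. The sketch below, on simulated data with no controls, computes the location-level Wald ratio and the industry-level ratio built from exposure-weighted averages, and confirms they coincide. All data and variable names are illustrative assumptions.

```python
import numpy as np

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(0)
n_loc, n_ind = 300, 15
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(size=n_ind)
z = shares @ shocks
x = z + rng.normal(size=n_loc)
y = 2.0 * x + rng.normal(size=n_loc)

# Demean the outcome and the endogenous variable (no-controls case).
y_c = y - y.mean()
x_c = x - x.mean()

# Location-level estimator: Wald ratio using the shift-share instrument.
beta_location = (z @ y_c) / (z @ x_c)

# Industry-level estimator: exposure weights s_k and exposure-weighted averages.
s_k = shares.sum(axis=0)
y_bar = (shares.T @ y_c) / s_k
x_bar = (shares.T @ x_c) / s_k
beta_industry = np.sum(s_k * shocks * y_bar) / np.sum(s_k * shocks * x_bar)

print(beta_location, beta_industry)
```

The two numbers agree up to floating-point error because summing over locations first and over industries first are just two orderings of the same double sum.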
E. Implementation
# Requires: fixest, bartik.weight
library(fixest)
# Install from GitHub: devtools::install_github("paulgp/bartik-weight")
library(bartik.weight)
# --- Step 1: Construct the Shift-Share Instrument ---
# Z_l = sum_k (share_lk * shock_k) — inner product of shares and shocks.
# share_lk = location l's employment share in industry k (from a base period).
# shocks = national-level growth rates by industry (excluding location l, ideally).
# The instrument captures predicted local exposure to national industry shocks.
df$shift_share <- as.matrix(df[, share_cols]) %*% shocks
# --- Step 2: Two-Stage Least Squares ---
# feols() IV syntax: outcome ~ controls | FE | endogenous ~ instrument.
# The shift-share instrument is excluded from the second stage.
# vcov = "HC1" provides heteroskedasticity-robust standard errors.
est <- feols(outcome ~ controls | 0 | exposure ~ shift_share,
             data = df, vcov = "HC1")
# Check the first-stage F-statistic: F > 10 indicates a strong instrument.
summary(est)
# --- Step 3: Rotemberg Weights Decomposition ---
# Decomposes the overall IV estimate into industry-level contributions.
# alpha_k = Rotemberg weight for industry k: how much industry k drives
# the aggregate estimate. Negative weights indicate extrapolation.
# If a few industries dominate, the estimate is fragile.
bw_result <- bw(
  master = df, y = "outcome", x = "exposure",
  controls = control_cols, weight = "weight_col",
  local = df, Z = share_cols,
  global = shocks_df, G = "shock_growth"
)
summary(bw_result)
# --- Step 4: Inspect Top-Weight Industries ---
# Industries with the largest Rotemberg weights drive the estimate.
# These high-weight industries must individually pass exogeneity and
# balance tests — if they fail, the overall instrument is suspect.
# Keep whole rows so the industry identifiers stay attached to the weights.
top_weights <- head(bw_result[order(bw_result$alpha, decreasing = TRUE), ], 5)
print(top_weights)
F. Diagnostics
- First-stage F-statistic. As with any IV, check for weak instruments. F > 10 is the traditional threshold, though more recent guidance from Lee et al. (2022) suggests stricter thresholds.
- Rotemberg weights. Report the top 5-10 industries by Rotemberg weight. Check whether these industries are plausibly exogenous. If one industry dominates the instrument, your results depend heavily on that industry.
- Balance checks. Under shares-based identification: regress shares on pre-period covariates. Under shocks-based identification: regress shocks on industry-level characteristics.
- Over-identification test. Under the shares interpretation, you have $K$ instruments (one per industry share) but estimate only one parameter. The Sargan-Hansen J-test checks whether the over-identifying restrictions hold. A rejection suggests that at least some shares are not valid instruments.
- Leave-one-industry-out. Re-estimate removing the top Rotemberg weight industry. If results change dramatically, they are fragile.
- Pre-trend tests. If you have multiple periods, check whether the shift-share instrument predicts outcome changes in the pre-period.
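The leave-one-industry-out check can be sketched as a loop that rebuilds the instrument without each industry in turn and re-estimates. This is a simplified no-controls version on simulated data; the helper names `wald_iv` and `leave_one_out` are hypothetical, and a real analysis would re-run the full 2SLS specification with controls.

```python
import numpy as np

def wald_iv(y, x, z):
    """Just-identified IV estimate with no controls (Wald ratio)."""
    z_c = z - z.mean()
    return (z_c @ y) / (z_c @ x)

def leave_one_out(y, x, shares, shocks):
    """IV estimates from instruments rebuilt without each industry in turn."""
    full = shares @ shocks
    return np.array([
        wald_iv(y, x, full - shares[:, k] * shocks[k])  # drop industry k's term
        for k in range(len(shocks))
    ])

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(0)
n_loc, n_ind = 2_000, 12
shares = rng.dirichlet(np.ones(n_ind), size=n_loc)
shocks = rng.normal(scale=5.0, size=n_ind)
z = shares @ shocks
u = rng.normal(size=n_loc)                      # unobserved confounder
x = z + u + rng.normal(size=n_loc)
y = 2.0 * x - 1.5 * u + rng.normal(size=n_loc)

baseline = wald_iv(y, x, z)
loo = leave_one_out(y, x, shares, shocks)
print(f"baseline: {baseline:.3f}")
print(f"leave-one-out range: {loo.min():.3f} to {loo.max():.3f}")
```

A narrow leave-one-out range around the baseline suggests no single industry is driving the estimate; a wide range points to the fragility described in the list above.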
Interpreting Your Results
When reporting shift-share IV results, it is important to be clear about which identification framework you are invoking. State explicitly: "We interpret the shift-share instrument under the [shares/shocks]-based framework of [Goldsmith-Pinkham et al. (2020)/Borusyak et al. (2022)]."
Report the Rotemberg weights and discuss which industries drive identification. If 80% of the identifying variation comes from three industries, the reader needs to evaluate whether those three industries satisfy the exogeneity conditions.
G. What Can Go Wrong
Rotemberg Weights Reveal Identification Driven by a Single Industry
A researcher studies the effect of immigration on native wages using a shift-share instrument where national immigrant inflows by origin country are the shocks and local baseline immigrant shares by origin are the exposure shares. They compute Rotemberg weights and find that the top 5 origin countries account for 38% of the identifying variation, with no single country exceeding 12%.
The diversified identification source supports the Borusyak et al. (2022) shocks-based interpretation: with many origin countries contributing, the as-if-random assignment of shocks averages out idiosyncratic concerns. Leave-one-out estimates removing each top-5 country range from -0.28 to -0.35, tightly bracketing the baseline estimate of -0.31.
Using Contemporaneous Shares Instead of Baseline Shares
A researcher constructs a Bartik instrument for local exposure to technology shocks using 1990 Census industry employment shares as weights, applied to 2000-2015 national technology adoption growth rates.
The baseline 1990 shares are predetermined relative to the 2000-2015 outcome period. Under the shares-based interpretation, the researcher can credibly argue that 1990 industry composition reflects historical comparative advantage and is plausibly uncorrelated with 2000-2015 local demand shocks.
Exclusion Restriction Violated Through Correlated Sectoral Shocks
A researcher studies the effect of Chinese import competition on local manufacturing employment using the Autor et al. (2013) shift-share instrument. They carefully address the concern that Chinese import growth in an industry might be correlated with US domestic technology shocks in the same industry by instrumenting with Chinese exports to other high-income countries.
Using Chinese exports to 8 other high-income countries as an alternative measure of Chinese supply shocks isolates the supply-driven component of Chinese trade growth. The instrument captures China's comparative advantage development rather than US demand conditions. The IV estimate is -0.73 jobs per 1,000 workers per $1,000 import exposure.
H. Practice
You construct a shift-share instrument for local labor market exposure to automation using national-level robot adoption rates as shocks and baseline industry employment shares as weights. The Rotemberg weight decomposition reveals that the automobile manufacturing industry accounts for 45% of the identifying variation. What should you do?
In a shift-share (Bartik) instrument, what is the key distinction between the 'shares' and 'shocks' identification strategies?
Shift-Share Instrument: Import Competition and Manufacturing Employment
Autor et al. (2013) study how Chinese import growth affected US manufacturing employment. Their shift-share instrument assigns each US commuting zone exposure to Chinese imports based on the zone's pre-existing industry mix (shares) and the national growth of Chinese imports in each industry (shocks). A zone with many textile workers gets high 'China shock' exposure because textile imports surged nationally.
Read the analysis below carefully and identify the errors.
A researcher studies the effect of automation exposure on local unemployment. They construct a shift-share instrument: `Z_l = sum_k s_lk * g_k`, where `s_lk` is the share of industry k in location l's employment (measured in 2010) and `g_k` is the national growth in robot installations in industry k from 2010 to 2020. They run 2SLS with state fixed effects and find that a one-standard-deviation increase in automation exposure raises local unemployment by 1.8 percentage points (p = 0.01, F-stat = 18). They write: "Our shift-share instrument is valid because national robot adoption rates are plausibly exogenous to any individual local labor market."
Read the analysis below carefully and identify the errors.
A researcher studies the effect of trade liberalization on firm productivity in Brazilian municipalities. They construct a Bartik instrument using 1991 industry shares and tariff reductions from 1991 to 2010. The Rotemberg weight analysis shows that the top 3 industries (footwear, textiles, and food processing) account for 72% of the identifying variation. Balance tests show that 1991 footwear shares are strongly correlated with pre-period education levels (p = 0.003) and urbanization (p = 0.01). The researcher reports: "We control for education and urbanization in the second stage, which addresses the balance test failure."
Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.
Paper Summary
The authors study the causal effect of immigration on native wages in 722 US commuting zones from 1990 to 2010. They construct a shift-share instrument using 1980 immigrant shares by origin country as weights and decadal national immigrant inflows by origin country as shocks. The first-stage F-statistic is 32. They estimate that a 1 percentage point increase in the immigrant share reduces native wages by 0.8% (SE = 0.3%). They invoke the shocks-based framework of Borusyak et al. (2022) and report balance tests showing that origin-country-level shocks are uncorrelated with 1970 origin-country characteristics.
Key Table
| Variable | OLS | IV (shift-share) |
|---|---|---|
| Immigrant share | -0.15 (0.08) | -0.80 (0.30) |
| F-stat | — | 32 |
Top Rotemberg weights: Mexico 0.41, Philippines 0.09, China 0.08, India 0.07, Vietnam 0.06
Authors' Identification Claim
Under the shocks-based framework, the exogeneity of origin-country immigration flows to local US labor demand conditions ensures that the shift-share instrument isolates supply-driven variation in local immigrant concentration.
I. Swap-In: When to Use Something Else
- Standard IV: When a single instrument with a clear exclusion restriction is available — the shift-share structure adds complexity that is only warranted when the instrument is inherently composed of shares and shocks.
- Difference-in-differences: When the shock creates a clear before/after comparison for exposed versus unexposed regions, and parallel trends is directly defensible.
- Synthetic control: When few regions are heavily exposed and constructing a data-driven counterfactual from donor regions is feasible.
- OLS with controls: When the exposure variable is exogenous conditional on observables, so that controlling for observed confounders is enough and no IV structure is needed.
Paper Library
Foundational (5)
Adao, R., Kolesar, M., & Morales, E. (2019). Shift-Share Designs: Theory and Inference.
Adao, Kolesar, and Morales show that standard errors in shift-share regressions are too small when computed with conventional clustering because residuals are correlated across regions that share similar industry compositions. They propose an inference procedure that accounts for this dependence.
Bartik, T. J. (1991). Who Benefits from State and Local Economic Development Policies?
Bartik introduces the shift-share instrument—constructing predicted local employment growth from national industry growth rates interacted with initial local industry composition. This 'Bartik instrument' has become one of the most widely used instruments in labor and urban economics.
Borusyak, K., Hull, P., & Jaravel, X. (2022). Quasi-Experimental Shift-Share Research Designs.
Borusyak, Hull, and Jaravel provide an alternative framework where identification comes from the exogeneity of the shocks rather than the shares. They show that with many independent shocks, the instrument is valid even if shares are endogenous, greatly expanding the range of credible applications.
Goldsmith-Pinkham, P., Sorkin, I., & Swift, H. (2020). Bartik Instruments: What, When, Why, and How.
Goldsmith-Pinkham, Sorkin, and Swift provide a rigorous econometric framework for shift-share instruments, showing that the Bartik instrument can be decomposed into a weighted sum of individual share-based instruments. They clarify that, within this framework, identification rests on exogeneity of the initial shares rather than of the shocks.
Jaeger, D. A., Ruist, J., & Stuhler, J. (2018). Shift-Share Instruments and the Impact of Immigration.
Jaeger, Ruist, and Stuhler highlight a threat to shift-share instruments in immigration research: serial correlation in immigrant inflows can bias estimates if past immigration affects current outcomes through channels other than current immigration. This paper raises important concerns about the exclusion restriction.
Application (3)
Autor, D. H., Dorn, D., & Hanson, G. H. (2013). The China Syndrome: Local Labor Market Effects of Import Competition in the United States.
Autor, Dorn, and Hanson use a shift-share instrument to study how Chinese import competition affected U.S. local labor markets, instrumenting U.S. import exposure with Chinese exports to other high-income countries. This paper is one of the most influential and widely discussed shift-share applications.
Blanchard, O. J., & Katz, L. F. (1992). Regional Evolutions.
Blanchard and Katz study regional labor market adjustment in the United States, analyzing how local employment shocks affect wages, unemployment, and migration. They construct a predicted-employment instrument using national industry growth interacted with local industry shares—the approach the subsequent literature calls the Bartik or shift-share instrument.
Card, D. (2001). Immigrant Inflows, Native Outflows, and the Local Labor Market Impacts of Higher Immigration.
Card uses a shift-share instrument based on historical settlement patterns of immigrant groups to predict current immigration flows to U.S. cities. This 'enclave instrument' is adopted in hundreds of subsequent immigration studies and is a classic example of the shift-share approach.