
Synthetic Control

Constructs a weighted combination of control units that best approximates the treated unit's pre-treatment trajectory.

Quick Reference

When to Use
When you have one or very few treated units at an aggregate level (state, country, industry), a rich pre-treatment period with many time points, and a pool of donor units unaffected by the treatment.
Key Assumption
The synthetic control accurately reproduces the treated unit's pre-treatment outcome trajectory, and the donor pool is uncontaminated by spillovers. Formally requires a linear factor model structure, which is weaker than parallel trends.
Common Mistake
Using synthetic control when there are many treated units (use DiD instead), or proceeding when the pre-treatment fit is poor — a poor fit honestly signals the method does not work for your setting.
Estimated Time
3 hours

One-Line Implementation

Stata: synth outcome predictors, trunit(treated_id) trperiod(treat_year) figure
R: augsynth(outcome ~ treatment, unit = unit_id, time = year, data = df)
Python: Synth(df, 'outcome', 'unit_id', 'year', treated_unit, treatment_period) # SyntheticControlMethods


Motivating Example

In the early 2000s, researchers wanted to know: what was the economic cost of terrorism in the Basque Country? ETA's campaign of political violence had been raging since the late 1960s. Could you estimate the causal effect of terrorism on GDP?

The challenge is immediate and severe. There is only one Basque Country. You cannot randomly assign terrorism to some regions and not others. Standard DiD requires a comparable control group, but no single Spanish region is a convincing stand-in for the Basque Country — the region has a unique industrial structure, culture, and geography.

Abadie and Gardeazabal (2003) proposed an innovative solution: what if you could construct a synthetic Basque Country — a weighted combination of other Spanish regions whose economic trajectory before terrorism closely matched the real Basque Country? If this synthetic version tracked the Basque Country closely before terrorism began, then the gap between the real and synthetic Basque Country after terrorism is a credible estimate of the causal effect.


This paper introduced the synthetic control method — a widely adopted methodological development in modern causal inference.

A. Overview

The synthetic control method constructs a counterfactual for a single treated unit by finding a weighted average of untreated ("donor") units that best resembles the treated unit in the pre-treatment period. The key insight is that rather than choosing one comparison unit, you can build a better comparison by combining several.

The method was formalized and extended by Abadie et al. (2010) in their study of California's Proposition 99, an aggressive tobacco control program.


They showed that California's per-capita cigarette sales dropped dramatically after Proposition 99 relative to a synthetic California constructed from a weighted combination of other states.

When to Use Synthetic Control

The method shines when:

  • You have one or a few treated units at an aggregate level (country, state, industry)
  • You have a rich pre-treatment period with many time points
  • You have a pool of donor units that were not affected by the treatment
  • No single donor unit is a convincing control on its own

It is less appropriate when:

  • You have many treated units (use DiD or staggered DiD instead)
  • The pre-treatment period is very short
  • The donor pool is small or contaminated by spillovers

Common Confusions

"Is synthetic control just matching?" It is related to matching, but there are important differences. Matching typically operates at the individual level, matching on pre-treatment covariates. Synthetic control operates at the aggregate level, matching on the pre-treatment trajectory of the outcome variable itself. Matching on outcomes rather than just covariates is a stronger form of matching.

"What if the synthetic control does not fit well in the pre-treatment period?" Then the method is not appropriate for your setting. A poor pre-treatment fit means the convex combination of donors cannot reproduce the treated unit's trajectory, which undermines the entire logic. If the fit is poor, this failure is an honest signal that the method does not work for your setting — not a problem to fix by adjusting weights.

"Can I include the treated unit's own pre-treatment outcome as a predictor?" Yes, and doing so is standard practice. Abadie et al. (2010) recommend including lagged outcomes as predictors alongside other covariates. Including the full set of pre-treatment outcome lags effectively means the synthetic control is chosen to match the treated unit's entire pre-treatment trajectory.

B. Identification

Setup

Let $Y_{1t}$ be the outcome for the treated unit at time $t$, and let $Y_{jt}$ for $j = 2, \ldots, J+1$ be outcomes for the $J$ donor units. Treatment occurs at time $T_0 + 1$.

The synthetic control is a vector of weights $\mathbf{W} = (w_2, \ldots, w_{J+1})'$ such that:

  • $w_j \geq 0$ for all $j$ (non-negativity)
  • $\sum_j w_j = 1$ (weights sum to one)

The synthetic control outcome is:

$$\hat{Y}_{1t}^{SC} = \sum_{j=2}^{J+1} w_j^* Y_{jt}$$

The estimated treatment effect at time $t > T_0$ is:

$$\hat{\tau}_{1t} = Y_{1t} - \hat{Y}_{1t}^{SC}$$
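In code, the estimator is just a weighted average followed by a subtraction. A minimal NumPy sketch with made-up numbers (the donor outcomes, treated series, and weight vector below are illustrative, not from any application):

```python
import numpy as np

# Toy data (made up): outcomes for J = 3 donor units over T = 6 periods
# (rows = donors, columns = time), plus the treated unit's series.
Y_donors = np.array([
    [1.0, 1.2, 1.4, 1.6, 1.8, 2.0],
    [2.0, 2.1, 2.2, 2.3, 2.4, 2.5],
    [0.5, 0.9, 1.3, 1.7, 2.1, 2.5],
])
Y_treated = np.array([1.2, 1.4, 1.6, 1.8, 2.6, 3.0])  # jumps after T0 = 4

# A valid weight vector is a convex combination: non-negative, sums to one.
w = np.array([0.6, 0.2, 0.2])
assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)

# Synthetic control outcome at each t: the weighted average of the donors.
Y_synth = w @ Y_donors

# Estimated effect Y_1t - Y_hat_1t^SC in the post-treatment periods.
T0 = 4
tau_hat = Y_treated[T0:] - Y_synth[T0:]
```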

Choosing the Weights

The weights $\mathbf{W}^*$ are chosen to minimize the discrepancy between the treated unit and the synthetic control in the pre-treatment period:

$$\mathbf{W}^* = \arg\min_{\mathbf{W}} \|\mathbf{X}_1 - \mathbf{X}_0 \mathbf{W}\|_V$$

where $\mathbf{X}_1$ is a vector of pre-treatment characteristics (including lagged outcomes) for the treated unit, $\mathbf{X}_0$ is the corresponding matrix for donor units, and $V$ is a positive semidefinite matrix that weights the importance of different predictors.
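This minimization is a quadratic program over the simplex, which a general-purpose solver handles directly. A sketch using `scipy.optimize.minimize` on simulated inputs (the predictor matrices, dimensions, and the identity choice of $V$ are all illustrative; the Synth package additionally chooses $V$ via a nested optimization):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs: X1 = treated unit's predictor vector (5 predictors),
# X0 = one column per donor (8 donors), V = predictor-importance matrix.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(5, 8))
X1 = X0 @ np.full(8, 1 / 8) + 0.01 * rng.normal(size=5)
V = np.eye(5)  # simplest choice: weight all predictors equally

def loss(w):
    d = X1 - X0 @ w
    return d @ V @ d  # squared V-weighted norm ||X1 - X0 W||_V

J = X0.shape[1]
res = minimize(
    loss,
    x0=np.full(J, 1 / J),                                        # start equal
    bounds=[(0, 1)] * J,                                         # w_j >= 0
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],  # sum to 1
    method="SLSQP",
)
w_star = res.x
```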

Identifying Assumption

The key assumption is that the synthetic control constructed from the pre-treatment period remains a valid counterfactual in the post-treatment period. Abadie et al. (2010) show that if the treated unit's outcome is generated by a linear factor model:

$$Y_{it} = \delta_t + \boldsymbol{\theta}_t' \mathbf{Z}_i + \boldsymbol{\lambda}_t' \boldsymbol{\mu}_i + \varepsilon_{it}$$

then a synthetic control that closely matches the treated unit's pre-treatment outcomes will also match the unobserved factor loadings $\boldsymbol{\mu}_i$, provided the number of pre-treatment periods is large enough.

This factor model is weaker than parallel trends in that it allows for heterogeneous, unit-specific trends driven by the factor loadings $\boldsymbol{\mu}_i$. But it is stronger in another way: it requires a good pre-treatment fit, which may not always be achievable.

C. Visual Intuition

The classic synthetic control presentation has two panels:

Panel A: Trajectory plot. The treated unit's outcome over time (solid line) alongside the synthetic control (dashed line). Before treatment, they should overlap closely. After treatment, any gap is the estimated effect.

Panel B: Gap plot. The difference $Y_{1t} - \hat{Y}_{1t}^{SC}$ over time. Before treatment, this should hover around zero. After treatment, persistent deviations indicate a treatment effect.

Interactive Simulation

Building a Synthetic Control

Adjust the weights on three donor units to construct a synthetic control that matches the treated unit's pre-treatment trajectory. The better the pre-treatment fit, the more credible the post-treatment gap as a causal estimate.


Why Synthetic Control?

1 treated unit, 5 controls, 20 periods, treatment at t = 12. SC builds an optimal weighted counterfactual. Pre-treatment RMSPE: 0.231.

[Figure: outcome trajectories over time with treatment at t = 12; series shown are Treated, Simple Avg, Best Single, and Synthetic Ctrl]

Estimation Results

| Estimator | β̂ | SE | 95% CI | Bias |
|---|---|---|---|---|
| Simple Average | 3.950 | 0.227 | [3.50, 4.40] | +1.250 |
| Best Single Control | 2.843 | 0.339 | [2.18, 3.51] | +0.143 |
| Synthetic Control | 2.804 | 0.224 | [2.36, 3.24] | +0.104 |
| True β | 2.700 | | | |

Why the difference?

Pre-treatment fit (RMSPE): simple average = 1.072, best single donor = 0.349, synthetic control = 0.231. The synthetic control achieves substantially better pre-treatment fit than the simple average by optimally weighting donor units, and this better pre-period fit lends credibility to the post-treatment counterfactual. It also outperforms the best single donor, demonstrating the value of combining multiple units.

SC weights: Unit 5: 44%, Unit 3: 23%, Unit 2: 17%, Unit 1: 15%. Sparse weights are typical and make the counterfactual interpretable.

Estimated treatment effect: SC = 2.804 (bias = 0.104) versus simple average = 3.950 (bias = 1.250).
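The RMSPE comparison is easy to reproduce on toy data. A sketch (all numbers made up, not the simulation's) in which the treated unit is an exact convex combination of two donors, so a weighted counterfactual fits the pre-period far better than the simple average:

```python
import numpy as np

def rmspe(y_treated, y_counterfactual, T0):
    """Root mean squared prediction error over pre-treatment periods 0..T0-1."""
    gap = y_treated[:T0] - y_counterfactual[:T0]
    return float(np.sqrt(np.mean(gap ** 2)))

# Made-up data: the treated unit equals 0.25 * donor2 + 0.75 * donor3 in the
# pre-period (T0 = 4), then jumps by 0.6 at t = 4.
Y_donors = np.array([
    [1.0, 1.1, 1.2, 1.3, 1.4],
    [3.0, 3.2, 3.4, 3.6, 3.8],
    [1.8, 2.0, 2.2, 2.4, 2.6],
])
Y_treated = np.array([2.1, 2.3, 2.5, 2.7, 3.5])

simple_avg = Y_donors.mean(axis=0)                  # equal weights on donors
synthetic = np.array([0.0, 0.25, 0.75]) @ Y_donors  # weighted counterfactual

fit_avg = rmspe(Y_treated, simple_avg, T0=4)   # poor pre-treatment fit
fit_sc = rmspe(Y_treated, synthetic, T0=4)     # essentially perfect fit
```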

D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: Under a linear factor model, a synthetic control that perfectly matches the treated unit's pre-treatment outcomes also matches its unobserved factor loadings, which means it remains a valid counterfactual in the post-treatment period.

Suppose outcomes follow a factor model:

$$Y_{it}(0) = \delta_t + \boldsymbol{\theta}_t' \mathbf{Z}_i + \boldsymbol{\lambda}_t' \boldsymbol{\mu}_i + \varepsilon_{it}$$

where $\mathbf{Z}_i$ are observed covariates, $\boldsymbol{\mu}_i$ are unobserved factor loadings, and $\boldsymbol{\lambda}_t$ are common time-varying factors.

If we find weights $\mathbf{W}^*$ such that:

$$\sum_j w_j^* Y_{jt} = Y_{1t} \quad \text{for } t = 1, \ldots, T_0$$

and also $\sum_j w_j^* \mathbf{Z}_j = \mathbf{Z}_1$, then:

$$Y_{1t}(0) - \sum_j w_j^* Y_{jt}(0) = \boldsymbol{\lambda}_t' \left(\boldsymbol{\mu}_1 - \sum_j w_j^* \boldsymbol{\mu}_j\right) + \left(\varepsilon_{1t} - \sum_j w_j^* \varepsilon_{jt}\right)$$

Abadie et al. (2010) show that as $T_0 \to \infty$ and the pre-treatment fit is exact, we must have $\boldsymbol{\mu}_1 \approx \sum_j w_j^* \boldsymbol{\mu}_j$, which implies:

$$Y_{1t}(0) - \sum_j w_j^* Y_{jt}(0) \approx 0 \quad \text{for } t > T_0$$

Therefore:

$$\hat{\tau}_{1t} = Y_{1t} - \sum_j w_j^* Y_{jt} \approx Y_{1t}(1) - Y_{1t}(0)$$

This gap is the causal effect of the treatment.
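The cancellation in the derivation can be checked numerically. A small simulation (all parameters made up): when the treated unit's loadings are an exact convex combination of donor loadings, the common time effects and factor terms cancel, and the post-period gap in untreated outcomes is pure noise:

```python
import numpy as np

rng = np.random.default_rng(1)
T, T0, J = 40, 30, 6

# Factor structure: two common factors lambda_t, unit loadings mu_i,
# common time effects delta_t, and small idiosyncratic noise.
lam = rng.normal(size=(T, 2))
mu_donors = rng.normal(size=(J, 2))
w_star = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
mu_treated = w_star @ mu_donors  # loadings lie inside the convex hull

delta = rng.normal(size=T)
eps = 0.01 * rng.normal(size=(J + 1, T))

Y_donors = delta + mu_donors @ lam.T + eps[1:]     # shape (J, T)
Y_treated0 = delta + mu_treated @ lam.T + eps[0]   # untreated potential outcome

# The weights reproduce mu_treated exactly, so delta_t and the factor terms
# cancel and only the noise terms remain in the post-period gap.
gap_post = (Y_treated0 - w_star @ Y_donors)[T0:]
```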

E. Implementation

# Using the Synth package (original implementation)
library(Synth)

# Step 1: Prepare the data for synthetic control
# foo = dataset, predictors = covariates to match on
# dependent = outcome variable name
# treatment.identifier = ID of the treated unit
# controls.identifier = IDs of donor (control) units
# time.predictors.prior = pre-treatment years for predictor matching
# time.optimize.ssr = pre-treatment years for outcome matching
dataprep_out <- dataprep(
foo = df,
predictors = c("pred1", "pred2"),
predictors.op = "mean",
dependent = "outcome",
unit.variable = "unit_id",
time.variable = "year",
treatment.identifier = treated_id,
controls.identifier = control_ids,
time.predictors.prior = 1980:1993,
time.optimize.ssr = 1980:1993,
time.plot = 1980:2005
)

# Step 2: Run the synthetic control optimization
synth_out <- synth(dataprep_out)

# Step 3: Plot treated vs. synthetic control trajectories
path.plot(synth_out, dataprep_out, Main = "Treated vs. Synthetic Control")

# Step 4: Plot the gap (treatment effect over time)
gaps.plot(synth_out, dataprep_out, Main = "Treatment Effect (Gap)")

# Alternative: Using augsynth (recommended for modern features)
# Provides bias correction, ridge regularization, and automatic inference
library(augsynth)
syn <- augsynth(outcome ~ treatment, unit = unit_id, time = year, data = df)
summary(syn)
plot(syn)

F. Diagnostics

  1. Pre-treatment fit. An essential diagnostic. If the synthetic control does not closely track the treated unit before treatment, the method fails. Report the root mean squared prediction error (RMSPE) for the pre-treatment period.

  2. Donor weights. Report which units get positive weight and how much. If the synthetic control concentrates weight on a single donor, you are effectively doing a single-unit comparison. If weights are spread across many units, the synthetic control is a genuine average.

  3. Predictor balance. Report a table comparing the treated unit, the synthetic control, and the unweighted average of all donors on all predictor variables. The synthetic control should be much closer to the treated unit than the simple average.

  4. In-space placebos. Run the synthetic control analysis for every donor unit (pretending each one was treated). If the treated unit's gap is unusually large relative to the placebo gaps, this evidence supports a genuine treatment effect. This procedure provides a permutation-style p-value.

  5. In-time placebos. Re-run the analysis using a fake treatment date in the pre-treatment period. If you see a "gap" at the fake treatment date, something is wrong with the specification.

  6. Leave-one-out robustness. Re-estimate removing each donor unit one at a time. If the results change dramatically when one donor is removed, the synthetic control is fragile.
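Diagnostic 4 reduces to a short permutation routine over post/pre RMSPE ratios. A sketch on simulated gap series (everything here is illustrative: `placebo_p_value` is a hypothetical helper, and the gaps are drawn at random rather than estimated):

```python
import numpy as np

def rmspe(gap):
    """Root mean squared prediction error of a gap series."""
    return float(np.sqrt(np.mean(np.asarray(gap) ** 2)))

def placebo_p_value(gaps_by_unit, treated, T0):
    """In-space placebo test: the share of units (treated included) whose
    post/pre RMSPE ratio is at least as large as the treated unit's."""
    ratios = {u: rmspe(g[T0:]) / rmspe(g[:T0]) for u, g in gaps_by_unit.items()}
    rank = sum(r >= ratios[treated] for r in ratios.values())
    return rank / len(ratios)  # permutation-style p-value

# Toy gap series: only the treated unit diverges after T0 = 10.
rng = np.random.default_rng(2)
gaps = {f"donor_{j}": rng.normal(0, 0.5, size=20) for j in range(19)}
gaps["treated"] = np.concatenate([rng.normal(0, 0.5, size=10),
                                  rng.normal(4.0, 0.5, size=10)])
p = placebo_p_value(gaps, "treated", T0=10)
```

Ranking on the post/pre RMSPE ratio, rather than the raw post-treatment gap, penalizes placebo units whose large gaps merely reflect a poor pre-treatment fit.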

Interpreting Your Results

Good pre-treatment fit, large post-treatment gap: This pattern is the ideal result. The synthetic control tracks the treated unit well before treatment, and the post-treatment divergence is your estimated effect. Bolster this with placebo tests.

Poor pre-treatment fit: Do not proceed. A synthetic control that cannot match the pre-treatment trajectory is not a credible counterfactual. Consider a different method or a different research question.

Effect appears immediately and persists: Consistent with a permanent treatment effect.

Effect grows over time: The treatment may take time to fully materialize.

Effect appears and then fades: The treatment effect may be temporary.

G. What Can Go Wrong

Assumption Failure Demo

Treated Unit Outside the Convex Hull of the Donor Pool

Researcher studies the effect of German reunification on West German GDP per capita. The donor pool includes 16 OECD countries. West Germany's pre-reunification GDP, trade openness, and industry composition all fall within the range of the donor pool. The synthetic control achieves a pre-treatment RMSPE of 0.8% of GDP.

The synthetic control closely tracks West German GDP from 1960 to 1989 (pre-reunification), then diverges after 1990, showing a persistent GDP loss of approximately 1,600 USD per capita — a credible estimate of the reunification cost. Because West Germany lies inside the donor pool's convex hull, the failure this demo is named for does not arise; had the treated unit been extreme on key predictors, no convex combination of donors could have reproduced its trajectory, and the pre-treatment fit would have degraded.

Assumption Failure Demo

Contaminated Donor Pool Due to Spillover Effects

Researcher studies the effect of a large refugee influx on labor market outcomes in Jordan. The donor pool excludes neighboring countries (Lebanon, Turkey, Iraq) that also received refugees, as well as Gulf states that experienced labor supply shocks from the same conflict.

Using a donor pool of 25 non-affected developing countries, the synthetic Jordan tracks pre-2011 employment rates closely (RMSPE = 0.4 pp). Post-2012, the gap shows a 2.1 pp decline in native employment, with in-space placebos confirming the effect is unusually large (p = 0.04). Had the contaminated neighbors been kept as donors, the spillovers would have pushed the synthetic Jordan's post-2011 employment down as well, likely biasing the estimated gap toward zero.

Assumption Failure Demo

Overfitting with Too Few Pre-Treatment Periods

Researcher studies the effect of California's cap-and-trade program (adopted 2013) on manufacturing output using state-level data from 1990 to 2020, providing 23 pre-treatment years. Multiple lagged outcomes serve as predictors.

The synthetic California matches the treated unit's manufacturing trajectory closely over 23 years. The donor weights are spread across 5 states (Texas 0.28, Ohio 0.22, Illinois 0.19, Pennsylvania 0.18, Georgia 0.13). The long pre-treatment period provides confidence that the match reflects genuine structural similarity, not coincidence. Had the pre-treatment window been only a few years, an equally close match could have arisen by chance, which is exactly the overfitting this demo is named for.

H. Practice

Concept Check

You construct a synthetic California using data from 38 other states and find that the pre-treatment RMSPE is 2.1 (in cigarette packs per capita). An in-space placebo test shows that 3 of the 38 placebo states have pre-treatment RMSPEs larger than 2.1. What does this tell you?

Guided Exercise

Synthetic Control: The Effect of California's Tobacco Tax on Smoking

Abadie et al. (2010) study the effect of California's 1988 Proposition 99 (a large cigarette tax) on per-capita cigarette consumption. California is the treated unit. The donor pool is 38 other US states that did not adopt similar tobacco taxes during the study period (1970-2000).

What is the donor pool, and why must you choose it carefully?

How does synthetic control select the weights for each donor state?

If the synthetic California closely tracks real California from 1970 to 1988, what does the gap after 1988 represent?

How do researchers conduct inference (p-values) in synthetic control when there is only one treated unit?

Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of Puerto Rico's Act 22 tax incentive (2012) on high-income migration using synthetic control. They construct a synthetic Puerto Rico from 50 US states. The pre-treatment period is 2005-2011 (7 years). The synthetic control assigns weights of 0.52 to Hawaii and 0.48 to New Mexico. The researcher reports: "The synthetic Puerto Rico achieves an excellent pre-treatment fit (RMSPE = 0.3%). The post-treatment gap shows a 15% increase in high-income tax filers. In-space placebos yield a p-value of 0.02, confirming the result is statistically significant."

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

A researcher studies the effect of Scotland's 2018 minimum unit pricing (MUP) policy on alcohol-related hospital admissions using a synthetic Scotland constructed from 15 European countries. The pre-treatment fit is good (RMSPE = 1.2 admissions per 100,000). The researcher conducts in-space placebos and finds that Scotland's post-treatment gap is the largest. They report: "The effect of MUP is a 9.3% reduction in alcohol-related admissions. To ensure robustness, we conduct an in-time placebo test using 2014 as a fake treatment date and find no placebo effect." They do not discuss that Wales and Ireland implemented similar alcohol policies in 2018 and 2019.

Select all errors you can find:

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors study the effect of Uruguay's 2013 marijuana legalization on crime rates using synthetic control. They construct a synthetic Uruguay from 17 Latin American countries using pre-legalization data from 2000-2012 on homicide rates, GDP per capita, urbanization, and youth unemployment. The synthetic Uruguay places weight on Chile (0.35), Costa Rica (0.30), Argentina (0.20), and Panama (0.15). The pre-treatment RMSPE is 0.8 homicides per 100,000. Post-legalization, they estimate a 12% reduction in homicide rates by 2018.

Key Table

| Predictor | Uruguay | Synthetic | Donor Avg |
|---|---|---|---|
| GDP/capita (PPP) | 17,200 | 16,800 | 11,400 |
| Urbanization | 95% | 89% | 76% |
| Youth unemp. | 19% | 18% | 16% |
| Homicide (2012) | 7.9 | 7.6 | 22.1 |
| Pre-RMSPE | -- | 0.8 | -- |
Placebo p-value: 2/18 = 0.11

Authors' Identification Claim

The close pre-treatment fit and the fact that Uruguay's post-treatment gap exceeds most placebos supports a causal interpretation that marijuana legalization reduced homicide rates.

I. Swap-In: When to Use Something Else

  • Difference-in-differences: When many units are treated and parallel trends is defensible — DiD is simpler and provides standard inference.
  • Synthetic DiD: When you want to combine the unit-reweighting logic of synthetic control with the time-differencing logic of DiD, especially with multiple treated units.
  • Event studies: When the full time profile of treatment effects is of primary interest and many treated units are available.
  • Matching: When many treated units exist and covariate-based matching on pre-treatment characteristics is feasible.

J. Reviewer Checklist

Critical Reading Checklist


Paper Library

Foundational (7)

Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program.

Journal of the American Statistical Association. DOI: 10.1198/jasa.2009.ap08746

This paper formalized and popularized the synthetic control method, which constructs a weighted combination of control units to approximate the counterfactual for a single treated unit. The application to California's Proposition 99 tobacco control program became the canonical example of the method.

Abadie, A., & Gardeazabal, J. (2003). The Economic Costs of Conflict: A Case Study of the Basque Country.

American Economic Review. DOI: 10.1257/000282803321455188

This earlier paper introduced the synthetic control idea in the context of estimating the economic costs of terrorism in the Basque Country. It constructed a synthetic Basque Country from other Spanish regions and showed that terrorism reduced GDP per capita by about 10 percentage points.

Abadie, A. (2021). Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects.

Journal of Economic Literature. DOI: 10.1257/jel.20191450

Abadie provided a comprehensive methodological overview of synthetic control, covering data requirements, inference via placebo tests, extensions to multiple treated units, and common pitfalls. This paper is the authoritative practitioner's guide to the method.

Doudchenko, N., & Imbens, G. W. (2016). Balancing, Regression, Difference-in-Differences and Synthetic Control Methods: A Synthesis.

NBER Working Paper No. 22791. DOI: 10.3386/w22791

Doudchenko and Imbens placed synthetic control within a broader framework that includes DiD and regression as special cases, proposing extensions that relax the non-negativity and adding-up constraints on weights. This paper helps researchers understand the connections between synthetic control and other methods.

Firpo, S., & Possebom, V. (2018). Synthetic Control Method: Inference, Sensitivity Analysis and Confidence Sets.

Journal of Causal Inference. DOI: 10.1515/jci-2016-0026

Firpo and Possebom developed formal inference procedures for the synthetic control method, including sensitivity analysis tools and confidence sets. Their framework provides a more rigorous basis for statistical inference in synthetic control applications beyond the standard permutation-based placebo tests.

Chernozhukov, V., Wuthrich, K., & Zhu, Y. (2021). An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls.

Journal of the American Statistical Association. DOI: 10.1080/01621459.2021.1920957

Chernozhukov, Wuthrich, and Zhu developed a conformal inference method for synthetic control that provides exact, finite-sample valid p-values and confidence intervals without requiring a large number of control units. This approach offers a modern, robust alternative to placebo-based inference for counterfactual and synthetic control estimators.

Ben-Michael, E., Feller, A., & Rothstein, J. (2021). The Augmented Synthetic Control Method.

Journal of the American Statistical Association. DOI: 10.1080/01621459.2021.1929245

Ben-Michael, Feller, and Rothstein proposed augmenting the synthetic control estimator with an outcome model to reduce bias when the synthetic control does not achieve perfect pre-treatment fit. The resulting doubly robust estimator is consistent if either the outcome model or the weighting is correct, providing a practical improvement for applied synthetic control studies.

Application (4)

Abadie, A., Diamond, A., & Hainmueller, J. (2015). Comparative Politics and the Synthetic Control Method.

American Journal of Political Science. DOI: 10.1111/ajps.12116

This paper applied the synthetic control method to estimate the economic impact of German reunification, constructing a synthetic West Germany from OECD countries. It demonstrated the method's applicability to major political events and provided inference procedures based on permutation tests.

Cunningham, S., & Shah, M. (2018). Decriminalizing Indoor Prostitution: Implications for Sexual Violence and Public Health.

Review of Economic Studies. DOI: 10.1093/restud/rdx065

Cunningham and Shah used the synthetic control method to study how Rhode Island's accidental decriminalization of indoor prostitution affected sex crimes and STI rates. This study is a well-known application that illustrates how synthetic control can exploit a unique policy change affecting a single unit.

Gobillon, L., & Magnac, T. (2016). Regional Policy Evaluation: Interactive Fixed Effects and Synthetic Controls.

Review of Economics and Statistics. DOI: 10.1162/REST_a_00537

Gobillon and Magnac connected synthetic control to interactive fixed-effects models, showing that synthetic control can be interpreted as an estimator that allows for time-varying factor loadings. This paper bridges the synthetic control and factor model literatures.

Fremeth, A. R., Richter, B. K., & Schotter, A. (2016). Eliciting Cooperation: A Principal Agent Experiment.

Management Science. DOI: 10.1287/mnsc.2015.2278

While primarily an experimental study, Fremeth and colleagues' work illustrates the growing adoption of quasi-experimental causal inference methods in management research. Synthetic control is increasingly used in management to study the effect of regulations, leadership changes, and other shocks on individual firms. [UNVERIFIED: This specific paper may not use synthetic control; it is included as an example of causal methods in management journals.]

Tags

design-based · aggregate-unit · case-study