DiD vs. Synthetic Control: When to Use Which

A practical comparison of Difference-in-Differences and Synthetic Control: assumptions, strengths, and when each fits — including the Synthetic DiD hybrid.

Two Workhorses for Policy Evaluation

When a policy or event affects some units but not others, and you observe outcomes before and after the intervention, two natural estimators emerge: Difference-in-Differences (DiD) and Synthetic Control (SC) (Abadie et al., 2010; Arkhangelsky et al., 2021). Both methods compare treated and untreated units over time, and both require a credible counterfactual for what would have happened to the treated units absent the intervention. The two approaches construct that counterfactual differently, so the choice between them depends on the structure of your data and the plausibility of the identifying assumptions.

The Core Distinction: How Each Method Builds a Counterfactual

Difference-in-Differences

DiD assumes that treated and control units would have followed parallel trends in the absence of treatment. The counterfactual for the treated group is the control group's post-treatment outcome shifted up (or down) by the pre-treatment difference between the groups. DiD does not require the levels to match, only the trends.

The parallel trends assumption states:

$$E[Y_{it}(0) - Y_{i,t-1}(0) \mid D_i = 1] = E[Y_{it}(0) - Y_{i,t-1}(0) \mid D_i = 0] \quad \text{for all } t$$

The expected change in the untreated potential outcome is the same for both groups in every period. The treated group would have followed the same trend as the control group, had treatment not occurred.
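As a minimal sketch of the arithmetic behind this counterfactual, the two-group, two-period case can be computed directly from group means. The function name `did_2x2` and the numbers are illustrative assumptions, not part of the original guide, and the sketch ignores standard errors entirely.

```python
from statistics import mean


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period DiD from raw outcome lists.

    Returns (effect, counterfactual), where the counterfactual is the
    treated group's pre-period mean plus the control group's change.
    """
    control_change = mean(control_post) - mean(control_pre)
    counterfactual = mean(treated_pre) + control_change
    effect = mean(treated_post) - counterfactual
    return effect, counterfactual


# Hypothetical outcomes, for illustration only.
effect, counterfactual = did_2x2(
    treated_pre=[10.0, 12.0],
    treated_post=[18.0, 20.0],
    control_pre=[8.0, 10.0],
    control_post=[11.0, 13.0],
)
print(effect, counterfactual)
```

The counterfactual here equals the control group's post-period mean shifted by the pre-period gap between the groups, which is the same quantity written a different way.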

Synthetic Control

SC constructs a weighted combination of untreated units that closely matches the treated unit's pre-treatment outcomes. The counterfactual is the synthetic unit's post-treatment trajectory. Unlike DiD, SC explicitly optimizes pre-treatment fit, selecting weights so that the synthetic control mirrors the treated unit's outcome path before the intervention (Abadie et al., 2010).

The synthetic control weights $w_j$ solve:

$$\min_{w} \sum_{t < T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2 \quad \text{s.t.} \quad w_j \geq 0, \quad \sum_{j=2}^{J+1} w_j = 1$$

In practice, the full Abadie et al. (2010) procedure nests this inner optimization within an outer optimization over a diagonal matrix $V$ that weights predictors (including pre-treatment outcomes and other covariates) to minimize pre-treatment fit error, but the core idea is matching the pre-treatment outcome trajectory.
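The inner problem above, matching pre-treatment outcomes only, can be sketched in a few lines. This is an illustration under stated assumptions: it uses projected gradient descent onto the simplex (a technique chosen here for self-containedness, not the nested $V$-optimization of Abadie et al.), the function names are hypothetical, and it assumes at least one donor outcome is nonzero.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold from the last valid k
    return [max(x - theta, 0.0) for x in v]


def sc_weights(treated_pre, donors_pre, n_iter=5000):
    """Weights minimizing squared pre-treatment fit error on the simplex.

    treated_pre[t] is the treated unit's outcome in pre-period t;
    donors_pre[j][t] is donor j's outcome in pre-period t.
    """
    n_donors = len(donors_pre)
    n_periods = len(treated_pre)
    # Step size 1/L, with L an upper bound on the gradient's Lipschitz constant.
    lipschitz = 2.0 * sum(y * y for donor in donors_pre for y in donor)
    w = [1.0 / n_donors] * n_donors
    for _ in range(n_iter):
        resid = [
            sum(w[j] * donors_pre[j][t] for j in range(n_donors)) - treated_pre[t]
            for t in range(n_periods)
        ]
        grad = [
            2.0 * sum(resid[t] * donors_pre[j][t] for t in range(n_periods))
            for j in range(n_donors)
        ]
        w = project_to_simplex([w[j] - grad[j] / lipschitz for j in range(n_donors)])
    return w


# Hypothetical data: the treated path is the average of the first two donors.
donors = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [1.0, 5.0, 1.0, 5.0]]
treated = [2.5, 2.5, 2.5, 2.5]
weights = sc_weights(treated, donors)
print([round(x, 3) for x in weights])
```

Because the treated path is built as an equal mix of the first two donors, the weights should end up close to one half on each of them and near zero on the third. On real data, a dedicated solver or an established SC package is likely a better choice than this hand-rolled loop.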

When to Use DiD

DiD is the right choice when your setting has these features:

Many treated and control units. DiD works well when you have a sizable number of treated units (states, firms, individuals) and a comparable set of control units. The averaging across units helps reduce noise and provides standard tools for inference.

Plausible parallel trends. You can make a convincing case, ideally supported by an event study plot showing flat pre-treatment coefficients, that treated and control units would have trended similarly absent the treatment. In other words, industry shocks, regional trends, and macro conditions should affect both groups similarly.

Standard inference is sufficient. DiD has well-developed standard error procedures: clustering at the unit level, wild bootstrap for few clusters, and randomization inference. You can construct confidence intervals and perform hypothesis tests using familiar tools.

Staggered adoption. When different units adopt treatment at different times, staggered DiD estimators (Callaway-Sant'Anna, Sun-Abraham, or imputation methods) handle the heterogeneity in treatment timing. SC is harder to apply to staggered settings.
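The event study check mentioned under "Plausible parallel trends" can be sketched with raw group means. This is a simplified illustration: it assumes a balanced panel with a single treatment date and no covariates, the function name `event_study_gaps` and the data are hypothetical, and it produces point estimates only, with no standard errors.

```python
def event_study_gaps(treated_means, control_means, reference_period):
    """Treated-minus-control gap in each period, relative to a reference period.

    Flat values near zero before treatment are consistent with parallel
    trends; they support the assumption but do not prove it.
    """
    ref_gap = treated_means[reference_period] - control_means[reference_period]
    return [
        (t_mean - c_mean) - ref_gap
        for t_mean, c_mean in zip(treated_means, control_means)
    ]


# Hypothetical group means for periods 0..4; treatment starts in period 3,
# so period 2 (the last pre-treatment period) is the reference.
treated_means = [10, 11, 12, 16, 17]
control_means = [8, 9, 10, 11, 12]
print(event_study_gaps(treated_means, control_means, reference_period=2))
```

In practice these gaps are usually estimated in a regression with unit and time fixed effects so that clustered standard errors come along with them.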

DiD Weaknesses

Parallel trends is untestable in the post-treatment period. Pre-treatment evidence supports but cannot prove the assumption.
If treated and control groups diverge in pre-treatment trends, DiD will produce biased estimates. Adding unit-specific linear trends can help but introduces its own problems: the trends can absorb part of a treatment effect that builds gradually over time.
DiD estimates can obscure heterogeneous treatment effects when treatment is staggered, unless you use modern estimators.

When to Use Synthetic Control

SC is the right choice when your setting has these features:

Very few treated units. SC was designed for "comparative case studies" — situations where one country adopts a policy, one state experiences a shock, or one firm undergoes a merger. With a single treated unit, you typically cannot estimate DiD with standard inference (Abadie et al., 2010).

Long pre-treatment period. SC requires enough pre-treatment time periods to construct a credible synthetic match. With only two or three pre-treatment periods, the optimization has little information, and overfitting is a real concern. A rich pre-treatment history allows the synthetic control to demonstrate that the weighted combination tracks the treated unit closely.

Poor parallel trends. When no single control unit or simple average of control units follows the same pre-treatment trend as the treated unit, SC can construct a weighted combination that does. SC is more flexible than DiD in matching pre-treatment dynamics.

Transparency of weights. SC makes the counterfactual construction explicit: you can see exactly which control units contribute and how much. Readers can assess whether the synthetic control makes substantive sense.

SC Weaknesses

Inference is non-standard. The most common approach is permutation-based (placebo tests), which has limited power when the donor pool is small: with J donor units, the smallest attainable p-value is 1/(J+1).
SC can fail to achieve good pre-treatment fit if the treated unit is an outlier that no convex combination of control units can match.
SC does not naturally extend to settings with many treated units (though recent methods address this limitation).
The convexity constraint (weights sum to one, non-negative) can be restrictive.
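The permutation logic behind placebo inference can be sketched as follows. The idea is to apply SC to each donor as if it were treated, compare post-treatment to pre-treatment fit through an RMSPE ratio, and rank the treated unit's ratio among the placebo ratios. The function names are hypothetical, and the sketch assumes the synthetic paths have already been computed and that pre-treatment fit is not exactly perfect (which would divide by zero).

```python
from math import sqrt


def rmspe(actual, synthetic):
    """Root mean squared prediction error between two outcome paths."""
    return sqrt(sum((a - s) ** 2 for a, s in zip(actual, synthetic)) / len(actual))


def rmspe_ratio(actual, synthetic, n_pre):
    """Post-treatment RMSPE divided by pre-treatment RMSPE."""
    pre = rmspe(actual[:n_pre], synthetic[:n_pre])
    post = rmspe(actual[n_pre:], synthetic[n_pre:])
    return post / pre


def placebo_p_value(treated_ratio, donor_ratios):
    """Share of units (treated included) with a ratio at least as large."""
    n_as_extreme = sum(1 for r in donor_ratios if r >= treated_ratio)
    return (1 + n_as_extreme) / (1 + len(donor_ratios))


# Hypothetical ratios: the treated unit against 19 placebo runs on donors.
donor_ratios = [0.8, 1.1, 1.3, 0.9, 1.6, 2.0, 1.2, 0.7, 1.0, 1.4,
                1.5, 0.6, 1.9, 1.1, 0.95, 1.25, 1.7, 1.05, 0.85]
print(placebo_p_value(6.2, donor_ratios))
```

With 19 donors the p-value cannot fall below 1/20, however large the treated unit's ratio is, which is why the size of the donor pool bounds the power of this test.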

Concept Check

Germany reunified in 1990. You want to estimate the effect of reunification on West Germany's GDP per capita, using other OECD countries as potential controls. Only one 'treated' unit exists (West Germany). Which method is more appropriate?

- DiD with West Germany vs. average of all other OECD countries.
- Synthetic Control using pre-reunification GDP and covariates to weight OECD countries.
- OLS regression of GDP per capita on a reunification dummy with country fixed effects.
- Either method works equally well here.

Head-to-Head Comparison

Assumptions

Feature	DiD	Synthetic Control
Key assumption	Parallel trends	Pre-treatment fit implies valid counterfactual
Testable?	Partially (event study, pre-trend tests)	Partially (pre-treatment root mean squared prediction error, RMSPE)
Functional form	Linear (additive)	Weighted average (convex combination)
Covariate adjustment	Controls added to regression	Covariates used in matching optimization

Data Requirements

Feature	DiD	Synthetic Control
Treated units	Many (10+)	Few (1-5)
Control units	Many	Moderate donor pool
Pre-treatment periods	At least 2, more is better	Many (20+ ideal)
Panel balance	Strongly balanced preferred	Balanced preferred

Inference

Feature	DiD	Synthetic Control
Standard errors	Clustered, wild bootstrap	Permutation/placebo tests
Confidence intervals	Standard	Conformal inference (Chernozhukov et al., 2021), permutation p-values
Power	Higher with many units	Lower, depends on donor pool size

The Hybrid: Synthetic Difference-in-Differences

Arkhangelsky et al. (2021) propose Synthetic DiD (SDiD), which combines the strengths of both methods. SDiD uses SC-style unit weights to reweight control units, adds time weights that reweight pre-treatment periods, and then estimates the treatment effect as a weighted double difference, as in DiD.

How SDiD Works

Unit weights (like SC): find weights for control units so the weighted control group matches the treated group's pre-treatment trajectory.
Time weights (new): find weights for pre-treatment periods so the weighted pre-treatment outcome matches the post-treatment control group level.
Estimate: compute the doubly-weighted difference-in-differences.
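The final step above can be sketched once the two sets of weights are in hand. This sketch takes the unit and time weights as given inputs; estimating them as in Arkhangelsky et al. (2021) involves additional details, such as regularization, that are not shown here. The function name and the data are hypothetical.

```python
def weighted_double_difference(treated, controls, unit_weights, time_weights):
    """Doubly-weighted difference-in-differences.

    treated, controls: lists of outcome paths, one per unit, with the
    pre-treatment periods first and the post-treatment periods after.
    unit_weights: one weight per control unit, summing to one.
    time_weights: one weight per pre-treatment period, summing to one.
    """
    n_pre = len(time_weights)

    def pre_post_gap(path):
        weighted_pre = sum(lam * y for lam, y in zip(time_weights, path[:n_pre]))
        post = path[n_pre:]
        return sum(post) / len(post) - weighted_pre

    treated_gap = sum(pre_post_gap(p) for p in treated) / len(treated)
    control_gap = sum(w * pre_post_gap(p) for w, p in zip(unit_weights, controls))
    return treated_gap - control_gap


# Hypothetical panel: one treated unit, two controls, two pre and two post periods.
treated = [[10.0, 12.0, 18.0, 20.0]]
controls = [[8.0, 10.0, 11.0, 13.0], [6.0, 8.0, 12.0, 14.0]]

# Uniform weights reproduce the plain DiD double difference.
print(weighted_double_difference(treated, controls, [0.5, 0.5], [0.5, 0.5]))
# Non-uniform unit weights put all the mass on the first control.
print(weighted_double_difference(treated, controls, [1.0, 0.0], [0.5, 0.5]))
```

Passing uniform weights is a useful sanity check, since the result should then match an ordinary DiD computed from group means.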

Why SDiD Can Be Better Than Either

SDiD inherits SC's ability to match pre-treatment dynamics, relaxing the rigid parallel trends assumption of canonical DiD.
SDiD inherits DiD's averaging across units, which enables standard inference even with multiple treated units.
SDiD offers a form of robustness: it is consistent if parallel trends hold for the reweighted control group, and the unit weights can relax the parallel trends assumption by improving pre-treatment fit (note: this property is not the same as Augmented Inverse Probability Weighting (AIPW)-style double robustness).
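With multiple treated units, SDiD supports bootstrap and jackknife standard errors, so inference does not have to rely on permutation tests as in classical SC. With a single treated unit, a placebo-based variance estimator is still required.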

When to consider SDiD

In settings with multiple treated units, a moderate-to-large donor pool, and a reasonable number of pre-treatment periods, SDiD is worth considering as an estimator. SDiD is closely related to both of its parents: with uniform unit and time weights it reduces to canonical DiD, while SC corresponds to using the unit weights alone, without time weights or unit fixed effects (Arkhangelsky et al., 2021). However, the choice depends on the specific setting: for a single treated unit with a long pre-treatment history, classical SC may be preferable; for large N with clean parallel trends, canonical DiD remains simpler and more transparent; and for staggered adoption, the original SDiD framework requires modification. The quality of the donor pool, the noise structure, and the available inference approach all matter for the decision. See the Synthetic Difference-in-Differences method page for implementation details.

Concept Check

You have 10 treated states and 40 control states, with 8 pre-treatment periods. The event study plot shows slightly divergent pre-trends for the treated group. What estimator should you consider first?

- Canonical two-way fixed effects DiD.
- Synthetic Control applied to each treated state separately.
- Synthetic DiD, which reweights control states to match the treated group's pre-treatment trajectory.
- Drop the states with divergent trends and re-run DiD.

A Decision Framework

Use the following questions to guide your choice:

How many treated units do you have?
- One: SC is the natural choice.
- A handful (2-5): SC or SDiD. DiD with few clusters is problematic for inference.
- Many (10+): DiD, SDiD, or staggered DiD estimators.
How many pre-treatment periods do you have?
- Few (2-4): DiD is more feasible. SC and SDiD need more pre-treatment data to construct credible weights.
- Many (10+): SC and SDiD can exploit the long pre-treatment series.
Do parallel trends hold?
- Yes (event study looks clean): DiD is straightforward and efficient.
- Questionable: SC or SDiD, which construct a counterfactual that matches pre-treatment dynamics rather than assuming parallel trends.
Is treatment staggered?
- Yes: Staggered DiD with modern estimators or staggered SC extensions.
- No (simultaneous adoption): All three methods apply.
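The four questions above can be encoded as a small helper that returns one note per question. This is only a sketch of the framework as written: the function name is hypothetical, and the handling of values that fall between the stated bands (6 to 9 treated units, 5 to 9 pre-treatment periods) is an assumption, since the framework does not cover them.

```python
def suggest_estimators(n_treated, n_pre_periods, parallel_trends_plausible, staggered):
    """Return one plain-text note per question in the decision framework."""
    if n_treated < 1:
        raise ValueError("need at least one treated unit")
    notes = []

    # Question 1: how many treated units?
    if n_treated == 1:
        notes.append("Treated units: one, so SC is the natural choice.")
    elif n_treated <= 5:
        notes.append("Treated units: a handful, so SC or SDiD.")
    elif n_treated >= 10:
        notes.append("Treated units: many, so DiD, SDiD, or staggered DiD estimators.")
    else:
        # Assumption: the framework gives no rule for 6 to 9 treated units.
        notes.append("Treated units: between bands, so compare SDiD and DiD.")

    # Question 2: how many pre-treatment periods?
    if n_pre_periods <= 4:
        notes.append("Pre-periods: few, so DiD is more feasible.")
    elif n_pre_periods >= 10:
        notes.append("Pre-periods: many, so SC and SDiD can exploit them.")
    else:
        # Assumption: the framework gives no rule for 5 to 9 pre-periods.
        notes.append("Pre-periods: moderate, so check the fit before relying on SC.")

    # Question 3: do parallel trends hold?
    if parallel_trends_plausible:
        notes.append("Parallel trends: plausible, so DiD is straightforward.")
    else:
        notes.append("Parallel trends: questionable, so SC or SDiD.")

    # Question 4: is treatment staggered?
    if staggered:
        notes.append("Timing: staggered, so use staggered DiD or SC extensions.")
    else:
        notes.append("Timing: simultaneous, so all three methods apply.")
    return notes


for note in suggest_estimators(1, 30, False, False):
    print(note)
```

The notes are meant as prompts for judgment rather than a verdict, since the answers to different questions can point toward different estimators.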

Practical Recommendations

Show the pre-treatment fit. Whether you use DiD (event study plot), SC (pre-treatment match plot), or SDiD, readers need to see the pre-treatment dynamics to judge the credibility of the counterfactual.

Report both methods when feasible. If your data support both DiD and SC, reporting both strengthens your paper. Agreement between the two methods is reassuring; disagreement is informative and should be explained.

Use SDiD as a robustness check. Even if DiD is your primary specification, SDiD is a valuable robustness check because SDiD relaxes the parallel trends assumption.

Be transparent about limitations. DiD requires parallel trends. SC requires that the synthetic unit is a good counterfactual. SDiD requires both sets of weights to be well-behaved. No method is assumption-free.

Concept Check

You run both DiD and SC on the same dataset and get very different treatment effect estimates. What should you do?

- Report whichever estimate is statistically significant.
- Average the two estimates.
- Investigate why the estimates differ (check pre-trends, SC fit, weighting, and whether the estimands are the same) and report both with a discussion of the discrepancy.
- Discard both estimates and use a different identification strategy.
