DiD vs. Synthetic Control: When to Use Which

A practical comparison of Difference-in-Differences and Synthetic Control methods, covering their assumptions, strengths, and when each is appropriate. Includes the Synthetic DiD hybrid.

Two Workhorses for Policy Evaluation

When a policy or event affects some units but not others, and you observe outcomes before and after the intervention, two natural estimators emerge: Difference-in-Differences (DiD) and Synthetic Control (SC) (Abadie et al., 2010). Both compare treated and untreated units over time, and both need a credible counterfactual for what would have happened to the treated units absent the intervention. They construct that counterfactual differently, so the choice between them depends on the structure of your data and the plausibility of each method's identifying assumptions. A hybrid, Synthetic DiD (Arkhangelsky et al., 2021), combines elements of both.

The Core Distinction: How Each Method Builds a Counterfactual

Difference-in-Differences

DiD assumes that treated and control units would have followed parallel trends in the absence of treatment. The counterfactual for the treated group is the control group's post-treatment outcome shifted up (or down) by the pre-treatment difference between the groups. DiD does not require the levels to match — only the trends.

The parallel trends assumption states:

$$E[Y_{it}(0) \mid D_i = 1] - E[Y_{it}(0) \mid D_i = 0] = \delta \quad \text{for all } t$$

where $\delta$ is a constant difference. The treated group would have maintained the same gap relative to the control group, had treatment not occurred.
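
As a rough numerical illustration of this constant-gap logic, the sketch below computes a two-group, two-period DiD from group means. The outcome values and the helper name `did_estimate` are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical outcomes, invented for illustration only.
treated_pre = [10.0, 11.0, 12.0]
treated_post = [15.0, 16.0, 17.0]
control_pre = [8.0, 9.0, 10.0]
control_post = [10.0, 11.0, 12.0]


def did_estimate(t_pre, t_post, c_pre, c_post):
    """Two-group, two-period DiD computed on group means."""
    delta = mean(t_pre) - mean(c_pre)       # constant pre-treatment gap
    counterfactual = mean(c_post) + delta   # control post-mean shifted by the gap
    return mean(t_post) - counterfactual


effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

Here the pre-treatment gap is 2, so the counterfactual treated mean is 11 + 2 = 13 and the estimate is 16 - 13 = 3.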

Synthetic Control

SC constructs a weighted combination of untreated units that closely matches the treated unit's pre-treatment outcomes and covariates. The counterfactual is the synthetic unit's post-treatment trajectory. Unlike DiD, SC explicitly optimizes pre-treatment fit, selecting weights so that the synthetic control mirrors the treated unit's outcome path before the intervention (Abadie et al., 2010).

The synthetic control weights $w_j$ solve:

$$\min_{w} \sum_{t < T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2 \quad \text{s.t.} \quad w_j \geq 0, \quad \sum_{j=2}^{J+1} w_j = 1$$
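
To make the optimization concrete, here is a minimal sketch that approximates the weights with a generic Frank-Wolfe iteration on the simplex. This solver is swapped in for illustration and is not the procedure of Abadie et al. (2010): it matches pre-treatment outcomes only (no covariates or predictor weights), and the data and the names `sc_weights` and `pre_fit_sse` are hypothetical.

```python
def pre_fit_sse(y_treated, donors, w):
    """Sum of squared pre-treatment gaps for a given weight vector."""
    return sum(
        (y - sum(wj * d[t] for wj, d in zip(w, donors))) ** 2
        for t, y in enumerate(y_treated)
    )


def sc_weights(y_treated, donors, n_iter=2000):
    """Frank-Wolfe sketch for min_w sum_t (y_t - sum_j w_j Y_jt)^2 on the simplex."""
    n_donors, n_periods = len(donors), len(y_treated)
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(n_iter):
        fitted = [sum(w[j] * donors[j][t] for j in range(n_donors))
                  for t in range(n_periods)]
        resid = [y_treated[t] - fitted[t] for t in range(n_periods)]
        # Gradient of the squared error with respect to each weight.
        grad = [-2.0 * sum(resid[t] * donors[j][t] for t in range(n_periods))
                for j in range(n_donors)]
        best = min(range(n_donors), key=grad.__getitem__)
        # Move toward the vertex that puts all weight on `best`;
        # the step is an exact line search, clipped to [0, 1].
        q = [donors[best][t] - fitted[t] for t in range(n_periods)]
        qq = sum(v * v for v in q)
        if qq == 0.0:
            break
        step = max(0.0, min(1.0, sum(r * v for r, v in zip(resid, q)) / qq))
        if step == 0.0:
            break
        w = [(1.0 - step) * wj for wj in w]
        w[best] += step
    return w


# Hypothetical pre-treatment series for three donor units.
donors = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 3.0, 5.0, 5.0],
    [10.0, 9.0, 8.0, 7.0],
]
# Treated series built as an equal mix of the first two donors,
# so a close convex match exists by construction.
y_treated = [2.0, 2.5, 4.0, 4.5]

weights = sc_weights(y_treated, donors)
print([round(wj, 3) for wj in weights])
print(pre_fit_sse(y_treated, donors, weights))
```

Each update is a convex combination of the current weights and a vertex of the simplex, so the weights stay non-negative and sum to one throughout. In applied work a dedicated quadratic-programming routine would be the more usual choice.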

When to Use DiD

DiD is the right choice when your setting has these features:

Many treated and control units. DiD works well when you have a sizable number of treated units (states, firms, individuals) and a comparable set of control units. The averaging across units helps reduce noise and provides standard tools for inference.

Plausible parallel trends. You can make a convincing case — ideally supported by an event study plot showing flat pre-treatment coefficients — that treated and control units would have trended similarly absent the treatment. Industry shocks, regional trends, and macro conditions affect both groups similarly.
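
A full event study requires a regression with leads and lags of treatment. As a lighter descriptive sketch, one can check whether the treated-minus-control gap drifts across pre-treatment periods. The group means and the function name `pretrend_gap_slope` below are hypothetical, and a slope near zero is supportive evidence only, not a formal test.

```python
from statistics import mean


def pretrend_gap_slope(treated_means, control_means):
    """Least-squares slope of the treated-minus-control gap over pre-periods."""
    gaps = [t - c for t, c in zip(treated_means, control_means)]
    periods = list(range(len(gaps)))
    p_bar, g_bar = mean(periods), mean(gaps)
    num = sum((p - p_bar) * (g - g_bar) for p, g in zip(periods, gaps))
    den = sum((p - p_bar) ** 2 for p in periods)
    return gaps, num / den


# Hypothetical pre-treatment group means, one value per period.
control = [3.0, 4.0, 5.0, 6.0]
parallel_treated = [5.0, 6.0, 7.0, 8.0]    # gap stays at 2
diverging_treated = [5.0, 6.5, 8.0, 9.5]   # gap widens by 0.5 per period

_, slope_parallel = pretrend_gap_slope(parallel_treated, control)
_, slope_diverging = pretrend_gap_slope(diverging_treated, control)
print(slope_parallel, slope_diverging)
```

A constant gap gives a slope of zero, which is what parallel pre-trends look like in this summary; a widening gap shows up as a positive slope.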

Standard inference is sufficient. DiD has well-developed inference procedures: cluster-robust standard errors (clustered at the level at which treatment is assigned, often the unit), the wild cluster bootstrap for settings with few clusters, and randomization inference. You can construct confidence intervals and perform hypothesis tests using familiar tools.

Staggered adoption. When different units adopt treatment at different times, staggered DiD estimators (Callaway-Sant'Anna, Sun-Abraham, or imputation methods) handle the heterogeneity in treatment timing. SC is harder to apply to staggered settings.

DiD Weaknesses

  • Parallel trends is untestable in the post-treatment period. Pre-treatment evidence supports but cannot prove the assumption.
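  • If treated and control groups diverge in pre-treatment trends, DiD will produce biased estimates. Adding unit-specific linear trends can help, but it relies on extrapolating those trends into the post-treatment period and can absorb part of a treatment effect that grows over time.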
  • DiD estimates can obscure heterogeneous treatment effects when treatment is staggered, unless you use modern estimators.

When to Use Synthetic Control

SC is the right choice when your setting has these features:

Very few treated units. SC was designed for "comparative case studies": situations where one country adopts a policy, one state experiences a shock, or one firm undergoes a merger. With a single treated unit you can still compute a DiD point estimate, but cluster-robust standard errors are unreliable because there is only one treated cluster (Abadie et al., 2010).

Long pre-treatment period. SC requires enough pre-treatment time periods to construct a credible synthetic match. With only two or three pre-treatment periods, the optimization has little information, and overfitting is a real concern. A rich pre-treatment history allows the synthetic control to demonstrate that the weighted combination tracks the treated unit closely.

Poor parallel trends. When no single control unit or simple average of control units follows the same pre-treatment trend as the treated unit, SC can construct a weighted combination that does. SC is more flexible than DiD in matching pre-treatment dynamics.

Transparency of weights. SC makes the counterfactual construction explicit: you can see exactly which control units contribute and how much. Readers can assess whether the synthetic control makes substantive sense.

SC Weaknesses

  • Inference is non-standard. The most common approach is permutation-based (placebo tests). With J donor units the smallest attainable p-value is 1/(J+1), so power is limited when the donor pool is small.
  • SC can fail to achieve good pre-treatment fit if the treated unit is an outlier that no convex combination of control units can match.
  • SC does not naturally extend to settings with many treated units (though recent methods address this limitation).
  • The convexity constraint (weights sum to one, non-negative) can be restrictive.
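
The permutation logic behind placebo tests can be sketched in a few lines. A common test statistic is the ratio of post-treatment to pre-treatment RMSPE, computed for the treated unit and for each donor treated as a placebo. The gap series, the placebo ratios, and the function names below are hypothetical.

```python
from math import sqrt


def rmspe(gaps):
    """Root mean squared prediction error of actual-minus-synthetic gaps."""
    return sqrt(sum(g * g for g in gaps) / len(gaps))


def placebo_p_value(treated_ratio, placebo_ratios):
    """Share of units (treated included) with a ratio at least as large."""
    all_ratios = [treated_ratio] + list(placebo_ratios)
    as_extreme = sum(r >= treated_ratio for r in all_ratios)
    return as_extreme / len(all_ratios)


# Hypothetical gaps for the treated unit: tight pre-fit, large post gap.
treated_pre_gaps = [0.1, -0.1, 0.1, -0.1]
treated_post_gaps = [2.0, 2.0]
treated_ratio = rmspe(treated_post_gaps) / rmspe(treated_pre_gaps)

# Hypothetical post/pre RMSPE ratios from four placebo runs on donor units.
placebo_ratios = [1.2, 0.8, 3.0, 1.5]

p_value = placebo_p_value(treated_ratio, placebo_ratios)
print(treated_ratio, p_value)
```

With four donors the treated unit has the largest ratio, yet the p-value cannot fall below 1/5. This is the sense in which a small donor pool limits what placebo inference can show.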

Concept Check

Germany reunified in 1990. You want to estimate the effect of reunification on West Germany's GDP per capita, using other OECD countries as potential controls. Only one 'treated' unit exists (West Germany). Which method is more appropriate?

Head-to-Head Comparison

Assumptions

| Feature | DiD | Synthetic Control |
| --- | --- | --- |
| Key assumption | Parallel trends | Pre-treatment fit implies valid counterfactual |
| Testable? | Partially (event study, pre-trend tests) | Partially (pre-treatment RMSPE) |
| Functional form | Linear (additive fixed effects) | Weighted average (convex combination) |
| Covariate adjustment | Controls added to regression | Covariates used in matching optimization |

Data Requirements

| Feature | DiD | Synthetic Control |
| --- | --- | --- |
| Treated units | Many (10+) | Few (1-5) |
| Control units | Many | Moderate donor pool |
| Pre-treatment periods | At least 2, more is better | Many (20+ ideal) |
| Panel balance | Strongly balanced preferred | Balanced preferred |

Inference

| Feature | DiD | Synthetic Control |
| --- | --- | --- |
| Standard errors | Clustered, wild bootstrap | Permutation/placebo tests |
| Confidence intervals | Standard | Conformal inference, permutation p-values |
| Power | Higher with many units | Lower, depends on donor pool size |

The Hybrid: Synthetic Difference-in-Differences

Arkhangelsky et al. (2021) propose Synthetic DiD (SDiD), which combines the strengths of both methods. SDiD uses SC-style unit weights to reweight control units and DiD-style time weights to reweight pre-treatment periods, then estimates the treatment effect as a weighted double-difference.

How SDiD Works

  1. Unit weights (like SC): find weights for control units so the weighted control group matches the treated group's pre-treatment trajectory.
  2. Time weights (new): find weights for pre-treatment periods so that the control units' weighted pre-treatment average matches their post-treatment average, up to a constant.
  3. Estimate: compute the doubly-weighted difference-in-differences.
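
The third step can be sketched as a short function that takes both sets of weights as given. This is a simplified illustration, not the estimator of Arkhangelsky et al. (2021), which estimates the weights from the data; the data, the weights, and the name `weighted_did` are hypothetical.

```python
from statistics import mean


def weighted_did(treated_pre, treated_post, control_pre, control_post,
                 unit_w, time_w):
    """Doubly weighted DiD with unit and time weights taken as given.

    treated_pre / treated_post: treated-group mean outcome per period.
    control_pre[i] / control_post[i]: outcomes of control unit i per period.
    unit_w: one weight per control unit; time_w: one weight per pre-period.
    """
    def pre_level(series):
        # Time-weighted pre-treatment level of one outcome series.
        return sum(lam * y for lam, y in zip(time_w, series))

    treated_change = mean(treated_post) - pre_level(treated_pre)
    control_change = sum(
        w * (mean(post) - pre_level(pre))
        for w, pre, post in zip(unit_w, control_pre, control_post)
    )
    return treated_change - control_change


# Hypothetical panel: two control units, three pre-periods, two post-periods.
treated_pre, treated_post = [3.0, 4.0, 5.0], [9.0, 10.0]
control_pre = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
control_post = [[4.0, 5.0], [2.0, 2.0]]

# Hypothetical weights that put no mass on the earliest pre-period.
tau_weighted = weighted_did(treated_pre, treated_post, control_pre,
                            control_post, [0.5, 0.5], [0.0, 0.5, 0.5])
# Uniform weights for comparison.
tau_uniform = weighted_did(treated_pre, treated_post, control_pre,
                           control_post, [0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])
print(tau_weighted, tau_uniform)
```

With uniform unit and time weights the function reduces to the ordinary DiD on group means, which makes the role of the two weight vectors easy to see.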

Why SDiD Can Be Better Than Either

  • SDiD inherits SC's ability to match pre-treatment dynamics, relaxing the rigid parallel trends assumption of canonical DiD.
  • SDiD inherits DiD's averaging across units, which supports standard inference when there are multiple treated units.
  • SDiD is doubly robust in a specific sense: if either the unit weights or the time weights alone suffice for identification, SDiD is consistent.
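  • With multiple treated units, SDiD supports bootstrap and jackknife standard errors rather than relying on permutation inference as classical SC does. With a single treated unit, a placebo-based variance estimator is still needed.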

Concept Check

You have 10 treated states and 40 control states, with 8 pre-treatment periods. The event study plot shows slightly divergent pre-trends for the treated group. What estimator should you consider first?

A Decision Framework

Use the following questions to guide your choice:

  1. How many treated units do you have?

    • One: SC is the natural choice.
    • A handful (2-5): SC or SDiD. DiD with few clusters is problematic for inference.
    • Many (10+): DiD, SDiD, or staggered DiD estimators.
  2. How many pre-treatment periods do you have?

    • Few (2-4): DiD is more feasible. SC and SDiD need more pre-treatment data to construct credible weights.
    • Many (10+): SC and SDiD can exploit the long pre-treatment series.
  3. Do parallel trends hold?

    • Yes (event study looks clean): DiD is straightforward and efficient.
    • Questionable: SC or SDiD, which construct a counterfactual that matches pre-treatment dynamics rather than assuming parallel trends.
  4. Is treatment staggered?

    • Yes: Staggered DiD with modern estimators or staggered SC extensions.
    • No (simultaneous adoption): All three methods apply.
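
The four questions can also be written as a small lookup, sketched below. The function name `suggest_estimators` and the returned labels are hypothetical. Values that fall between the stated thresholds (for example 6-9 treated units or 5-9 pre-treatment periods) return an empty list, since the framework leaves them as a judgment call.

```python
def suggest_estimators(n_treated, n_pre_periods, parallel_trends_ok, staggered):
    """Map the four questions above to candidate estimators, one list each."""
    if n_treated < 1 or n_pre_periods < 1:
        raise ValueError("need at least one treated unit and one pre-period")

    out = {}

    # 1. Number of treated units.
    if n_treated == 1:
        out["treated_units"] = ["SC"]
    elif n_treated <= 5:
        out["treated_units"] = ["SC", "SDiD"]
    elif n_treated >= 10:
        out["treated_units"] = ["DiD", "SDiD", "staggered DiD"]
    else:
        out["treated_units"] = []  # 6-9: not covered by the thresholds

    # 2. Number of pre-treatment periods.
    if 2 <= n_pre_periods <= 4:
        out["pre_periods"] = ["DiD"]
    elif n_pre_periods >= 10:
        out["pre_periods"] = ["SC", "SDiD"]
    else:
        out["pre_periods"] = []  # 1 or 5-9: not covered by the thresholds

    # 3. Plausibility of parallel trends.
    out["parallel_trends"] = ["DiD"] if parallel_trends_ok else ["SC", "SDiD"]

    # 4. Treatment timing.
    if staggered:
        out["timing"] = ["staggered DiD", "staggered SC"]
    else:
        out["timing"] = ["DiD", "SC", "SDiD"]

    return out


# Hypothetical setting: one treated unit, 30 pre-periods, doubtful trends.
single_case = suggest_estimators(1, 30, parallel_trends_ok=False, staggered=False)
print(single_case)
```

The answers are kept separate per question on purpose: the questions can point in different directions, and weighing them against each other is a substantive judgment that a lookup cannot make.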

Practical Recommendations

Showing the pre-treatment fit is essential for transparency. Whether you use DiD (event study plot), SC (pre-treatment match plot), or SDiD, readers need to see the pre-treatment dynamics to judge the credibility of the counterfactual.

Report both methods when feasible. If your data support both DiD and SC, reporting both strengthens your paper. Agreement between the two methods is reassuring; disagreement is informative and should be explained.

Use SDiD as a robustness check. Even if DiD is your primary specification, SDiD is a valuable robustness check because it relaxes the parallel trends assumption.

Be transparent about limitations. DiD requires parallel trends. SC requires that the synthetic unit is a good counterfactual. SDiD requires both sets of weights to be well-behaved. No method is assumption-free.

Concept Check

You run both DiD and SC on the same dataset and get very different treatment effect estimates. What should you do?