MethodAtlas

Chapter 6 of 8

A Taxonomy of Identification Strategies

The full toolkit — design-based, model-based, experimental, and quasi-experimental approaches.

The Mystery: Here are all the tools we have to solve problems like the training mystery.

Every Tool in the Toolbox

We have spent four pages building up to this moment. You understand why correlation is not causation (F1). You know that selection bias is the enemy (F3). You can speak the language of estimands and identification (F4). You can draw a DAG and figure out which backdoor paths need blocking (F5).

Now the question is: how do you actually block those paths?

The answer depends on your setting — what data you have, what variation exists, and what assumptions you are willing to make. Different settings call for different tools. This page is your map of the entire toolkit — what Angrist and Pischke (2009) describe as the core toolkit of applied econometrics, plus the methods that have emerged since. Think of it as the table of contents for every method you will learn on this site, organized by the logic of how each method solves the identification problem.

The Big Divide: Design-Based vs. Model-Based

The most fundamental distinction in modern empirical research is between design-based and model-based approaches.

Design-Based Approaches

Design-based methods rely on features of how the treatment was assigned or how variation arose in the real world. The credibility of the estimate comes from the research design — the institutional setting, the policy, the natural experiment — rather than from the statistical model.

The philosophy: "I found a setting where treatment was assigned in a way that is as if random (or at least, where I can isolate exogenous variation). My statistical model can be simple because the design does the heavy lifting."

Examples: randomized experiments, difference-in-differences, regression discontinuity, instrumental variables.

Model-Based Approaches

Model-based methods rely on the statistical model itself — its functional form, its distributional assumptions, its conditioning set — to achieve identification. The credibility comes from getting the model "right."

The philosophy: "I will control for enough variables, in the right way, to eliminate confounding. My model does the heavy lifting."

Examples: OLS with controls, matching, inverse probability weighting, doubly robust estimation.

Three Tiers of Evidence

Another useful taxonomy organizes methods by how close they come to a true experiment:

Tier 1: Experimental (Randomized)

What it is: The researcher (or someone) randomly assigns treatment. Randomization ensures that, in expectation, treated and control groups are identical on all confounders — observed and unobserved.

For the training mystery: Randomly assign some job seekers to receive training and others to a control group. Compare earnings. This approach is exactly what the National Supported Work (NSW) Demonstration did in the 1970s — a landmark study we will revisit in F8. You can learn the mechanics of designing and analyzing such studies on the experimental design page.

Limitation: Often infeasible or unethical. You cannot randomly assign firm bankruptcy, gender, or economic recessions.
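Because randomization balances all confounders in expectation, the analysis really can be simple: a difference in means identifies the ATE. A minimal simulation sketch (all numbers here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated job seekers: 'ability' affects earnings, but under
# randomization it is balanced across arms by construction.
ability = rng.normal(size=n)
treated = rng.integers(0, 2, size=n)   # coin-flip assignment
true_effect = 1_500.0                  # assumed effect, in dollars
earnings = (20_000 + 3_000 * ability
            + true_effect * treated
            + rng.normal(0, 2_000, n))

# With random assignment, the simple difference in means identifies the ATE.
ate_hat = earnings[treated == 1].mean() - earnings[treated == 0].mean()
print(round(ate_hat))  # close to 1500
```

Note that `ability` never enters the estimator: the design, not the model, does the heavy lifting.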

Tier 2: Quasi-Experimental (Natural Experiments)

What it is: The researcher finds a setting where some naturally occurring event, policy change, or institutional feature created variation in treatment that is as if random — or at least, provides exogenous variation that can be isolated. Dunning (2012) provides a comprehensive framework for evaluating the credibility of such natural experiments.

For the training mystery: Perhaps the program was rolled out in some cities before others (difference-in-differences). Perhaps eligibility depended on a test score cutoff (regression discontinuity). Perhaps draft lottery numbers affected who enrolled (instrumental variables).

Limitation: Requires finding the right setting. The identifying variation must be genuinely exogenous, which is often debatable.

Tier 3: Observational (Selection on Observables)

What it is: No experiment, no natural experiment. The researcher uses observational data and relies on controlling for the right set of confounders to eliminate selection bias.

For the training mystery: Use survey data on trainees and non-trainees, control for education, age, prior earnings, motivation proxies, and hope that you have captured all the relevant confounders.

Limitation: Relies on the assumption that there are no unobserved confounders — an assumption that can never be verified from the data.
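To see both the promise and the fragility of selection on observables, here is a sketch in which the confounder (motivation) happens to be observed, so controlling for it works. The numbers are invented for illustration; if motivation were unobserved, the naive gap would be all we could compute.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Motivated workers both enroll more often and earn more: classic confounding.
motivation = rng.normal(size=n)
enroll = (motivation + rng.normal(size=n) > 0).astype(float)
true_effect = 1_000.0  # assumed
earnings = (25_000 + 4_000 * motivation
            + true_effect * enroll
            + rng.normal(0, 2_000, n))

# Naive comparison is biased upward by selection into training.
naive = earnings[enroll == 1].mean() - earnings[enroll == 0].mean()

# Controlling for the confounder (OLS via least squares) recovers the effect,
# but ONLY because motivation is observed here. Unobserved confounding
# cannot be fixed this way.
X = np.column_stack([np.ones(n), enroll, motivation])
beta = np.linalg.lstsq(X, earnings, rcond=None)[0]
print(round(naive), round(beta[1]))  # naive is far above 1000; beta[1] is close to it
```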

The Complete Method Map

Below is every method covered on this site, organized by category. Click any method to jump to its full page.

Animated Explanation: Method Map

A walkthrough of all 20 identification strategies organized by category: Experimental, Quasi-Experimental (Design-Based), Observational (Model-Based), and Machine Learning + Causal Inference. The first panel covers experimental methods: randomized experiments are the gold standard — random assignment eliminates all confounding, and when you can randomize, the analysis is straightforward.

Method Summary Table

Here is a compact reference for all the methods you will encounter. Do not worry about memorizing this table — you will learn each method in depth on its own page. Use this table to orient yourself and to find the right method for your setting.

| Method | Category | What It Exploits | Key Assumption | When to Use |
| --- | --- | --- | --- | --- |
| Randomized Experiment | Experimental | Random assignment | Compliance, no spillovers | You can randomize treatment |
| OLS | Model-based | Conditional independence | No omitted variable bias | Baseline; building block for other methods |
| Fixed Effects | Design-based | Within-unit variation | Time-invariant confounders only | Panel data with repeated observations |
| Difference-in-Differences | Design-based | Treatment timing variation | Parallel trends | Policy change, staggered rollout |
| Event Studies | Design-based | Pre/post treatment dynamics | Parallel trends (visible in pre-period) | Complement to DiD; visualizing dynamic effects |
| Staggered DiD | Design-based | Differential treatment timing | Parallel trends (heterogeneity-robust) | Multiple units treated at different times |
| Synthetic Control | Design-based | Weighted donor pool | Good pre-treatment fit | Few treated units, aggregate data |
| Synthetic DiD | Design-based | Combines DiD + synthetic control | Relaxed parallel trends | When parallel trends alone is too strong |
| Regression Discontinuity | Design-based | Score cutoff | Continuity at the cutoff | Treatment assigned by a threshold |
| Instrumental Variables | Design-based | Exogenous instrument | Exclusion restriction, relevance | Instrument available |
| Shift-Share / Bartik | Design-based | Interaction of shares and shocks | Exogenous shares or shocks | Regional/industry exposure to national shocks |
| Matching | Model-based | Observable similarity | Selection on observables | Rich covariate data, overlap |
| Inverse Probability Weighting | Model-based | Propensity scores | Correct propensity model | Reweighting to balance covariates |
| Doubly Robust Estimation | Model-based | Outcome model + propensity | One of two models correct | Insurance against misspecification |
| Logit / Probit | Model-based | Latent variable threshold | Correct functional form | Binary outcomes |
| Poisson / Negative Binomial | Model-based | Count data structure | Mean specification | Count outcomes |
| Double/Debiased ML | ML-causal | High-dimensional controls | Approximate sparsity | Many potential controls, flexible functions |
| Causal Forests | ML-causal | Heterogeneous treatment effects | Unconfoundedness | Discovering who benefits most |
| Causal Mediation | Mechanism | Mediator variation | Sequential ignorability | Understanding how treatment works |

Choosing a Method: First Principles

When you face a research question, here is how to think about method choice:

Step 1: What is your estimand? (F4) Do you want the ATE, ATT, or LATE? This narrows your options.

Step 2: What does your DAG look like? (F5) What are the confounders? Are any of them unobservable?

Step 3: What variation exists? This is the crucial question:

  • Was treatment randomized? Use an experiment.
  • Is there a cutoff? Consider RDD.
  • Is there a before-and-after, with a comparison group? Consider DiD.
  • Is there an instrument? Consider IV.
  • Do you only have observational data with rich covariates? Consider matching, IPW, or doubly robust methods.
  • Do you need to handle many covariates flexibly? Consider DML or causal forests.
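The checklist above can be sketched as a toy decision helper. This is purely illustrative (the function name and flags are invented here), and it only encodes Step 3 — real method choice also weighs the estimand, the DAG, and assumption plausibility:

```python
def suggest_method(randomized=False, cutoff=False, pre_post_with_comparison=False,
                   instrument=False, rich_covariates=False, many_covariates=False):
    """Toy helper mirroring the Step 3 checklist (illustrative only)."""
    if randomized:
        return "randomized experiment"
    if cutoff:
        return "regression discontinuity"
    if pre_post_with_comparison:
        return "difference-in-differences"
    if instrument:
        return "instrumental variables"
    if many_covariates:
        return "double/debiased ML or causal forests"
    if rich_covariates:
        return "matching, IPW, or doubly robust estimation"
    return "no credible design yet -- revisit the DAG and the data"

print(suggest_method(pre_post_with_comparison=True))  # difference-in-differences
```

The ordering of the branches matters: design-based variation, when available, trumps model-based adjustment.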

Step 4: Are the assumptions plausible? Every method requires assumptions. The best method is the one whose assumptions are most credible in your setting — not the one that sounds most sophisticated.

Concept Check

A researcher wants to estimate the effect of a new job training program on earnings. The program was available to all unemployed workers in City A but not in City B. Both cities are similar. She has earnings data for workers in both cities before and after the program launched. Which identification strategy is most natural for this setting?

How the Training Mystery Gets Solved (Preview)

Our running question — "Did the job training program work?" — has been studied with nearly every method on this site:

  • Randomized experiment: The NSW Demonstration randomly assigned participants.
  • Matching and reweighting: Dehejia and Wahba (1999) showed that propensity score methods could recover the experimental benchmark from observational data.
  • Difference-in-differences: Training programs rolled out at different times across regions have been studied using DiD.
  • Instrumental variables: Draft lotteries and eligibility rules have been used as instruments for training participation.

Each approach has strengths and weaknesses. The credibility revolution (F8) is partly the story of researchers learning which approach works best in which setting — and developing tools to assess credibility rather than taking any single method on faith.
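To make the DiD entry in that list concrete: a rollout in one city but not another, with data before and after, is the canonical two-by-two difference-in-differences. A minimal sketch on simulated data (all numbers assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000  # workers per city-period cell

# City A gets the program, City B does not. City A earns more at baseline
# (a level difference DiD tolerates), and both cities share a common time
# trend -- the parallel-trends assumption, baked in here by construction.
true_effect = 800.0  # assumed
base = {"A": 22_000.0, "B": 20_000.0}
trend = 1_200.0

def cell_mean(city, post):
    treated = (city == "A") and post
    y = base[city] + trend * post + true_effect * treated + rng.normal(0, 1_500, n)
    return y.mean()

# Two-by-two DiD: (A_post - A_pre) - (B_post - B_pre).
# Differencing removes both the city level difference and the common trend.
did = ((cell_mean("A", 1) - cell_mean("A", 0))
       - (cell_mean("B", 1) - cell_mean("B", 0)))
print(round(did))  # close to 800
```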

Cross-Cutting Practices

Beyond choosing a method, credible empirical research requires a set of practices that apply to every method:

  • Sensitivity analysis: How much would your results change if your assumptions were slightly wrong? (See the Sensitivity Analysis page.)
  • Power analysis: Did you have enough data to detect the effect you were looking for?
  • Multiple testing corrections: If you tested many outcomes or subgroups, did you account for the increased chance of false positives?
  • Pre-registration: Did you commit to your analysis plan before seeing the results?
  • Specification curves: How robust are your results across reasonable alternative specifications?

These practices are not optional extras. In modern empirical research, reviewers expect them. We cover each in the Practices section of this site.
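As one example of these practices, power analysis can be approximated by simulation even when no formula applies. A sketch for a two-arm difference-in-means test (effect sizes and sample sizes below are assumptions for illustration, not recommendations):

```python
import numpy as np

rng = np.random.default_rng(3)

def power_by_simulation(n_per_arm, effect, sd, n_sims=2_000, z_crit=1.96):
    """Approximate power of a two-arm difference-in-means test by simulating
    the study many times and counting how often the null is rejected."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0, sd, n_per_arm)
        treated = rng.normal(effect, sd, n_per_arm)
        se = np.sqrt(control.var(ddof=1) / n_per_arm
                     + treated.var(ddof=1) / n_per_arm)
        if abs(treated.mean() - control.mean()) / se > z_crit:
            rejections += 1
    return rejections / n_sims

# A pilot-sized study is badly underpowered for a modest (0.2 sd) effect;
# a tenfold larger sample changes the picture.
print(power_by_simulation(n_per_arm=50, effect=0.2, sd=1.0))
print(power_by_simulation(n_per_arm=500, effect=0.2, sd=1.0))
```

Running an underpowered study and reporting a null is itself a form of false evidence, which is why reviewers ask for this check.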

Key Takeaways

  • The fundamental divide is design-based (credibility from how the variation arose) versus model-based (credibility from the statistical model).
  • Methods fall into three tiers by how close they come to a true experiment: experimental, quasi-experimental, and observational.
  • Choose a method by asking, in order: what is the estimand, what does the DAG say, what variation exists, and are the assumptions plausible?
  • The best method is the one whose assumptions are most credible in your setting, not the one that sounds most sophisticated.

What Comes Next

Before we dive into specific methods, there is one more foundational skill: working with data. The next page covers loading, cleaning, reshaping, and constructing variables — the practical skills you need before you can implement any method.

Next Step: Working with Data — Master the practical skills of loading, cleaning, reshaping, and constructing variables before implementing any method.