Chapter 6 of 8
A Taxonomy of Identification Strategies
The full toolkit — design-based, model-based, experimental, and quasi-experimental approaches.
Every Tool in the Toolbox
We have spent four pages building up to this moment. You understand why correlation is not causation (F1). You know that selection bias is the enemy (F3). You can speak the language of estimands and identification (F4). You can draw a DAG and figure out which backdoor paths need blocking (F5).
Now the question is: how do you actually block those paths?
The answer depends on your setting — what data you have, what variation exists, and what assumptions you are willing to make. Different settings call for different tools. This page is your map of the entire toolkit — what Angrist and Pischke (2009) describe as the core toolkit of applied econometrics, plus the methods that have emerged since. Think of it as the table of contents for every method you will learn on this site, organized by the logic of how each method solves the identification problem.
The Big Divide: Design-Based vs. Model-Based
The most fundamental distinction in modern empirical research is between design-based and model-based approaches.
Design-Based Approaches
Design-based methods rely on features of how the treatment was assigned or how variation arose in the real world. The credibility of the estimate comes from the research design — the institutional setting, the policy, the natural experiment — rather than from the statistical model.
The philosophy: "I found a setting where treatment was assigned in a way that is as if random (or at least, where I can isolate exogenous variation). My statistical model can be simple because the design does the heavy lifting."
Examples: randomized experiments, difference-in-differences, regression discontinuity, instrumental variables.
Model-Based Approaches
Model-based methods rely on the statistical model itself — its functional form, its distributional assumptions, its conditioning set — to achieve identification. The credibility comes from getting the model "right."
The philosophy: "I will control for enough variables, in the right way, to eliminate confounding. My model does the heavy lifting."
Examples: OLS with controls, matching, inverse probability weighting, doubly robust estimation.
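To make the model-based logic concrete, here is a minimal simulated sketch (NumPy, made-up numbers) of "the model does the heavy lifting": a single confounder drives both treatment and the outcome, the naive comparison is badly biased, and conditioning on the confounder in OLS recovers the true effect. The variable names and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated confounder: "ability" raises both training take-up and earnings.
ability = rng.normal(size=n)
treated = (ability + rng.normal(size=n) > 0).astype(float)
earnings = 2.0 * treated + 3.0 * ability + rng.normal(size=n)  # true effect = 2

# Naive OLS (no controls): biased upward, because ability is omitted.
X_naive = np.column_stack([np.ones(n), treated])
naive_effect = np.linalg.lstsq(X_naive, earnings, rcond=None)[0][1]

# OLS conditioning on the confounder: recovers the true effect.
X_ctrl = np.column_stack([np.ones(n), treated, ability])
ctrl_effect = np.linalg.lstsq(X_ctrl, earnings, rcond=None)[0][1]

print(naive_effect, ctrl_effect)
```

The entire exercise hinges on having measured `ability` — which is exactly the assumption model-based approaches must defend.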
Three Tiers of Evidence
Another useful taxonomy organizes methods by how close they come to a true experiment:
Tier 1: Experimental (Randomized)
What it is: The researcher (or someone) randomly assigns treatment. Randomization ensures that, in expectation, treated and control groups are identical on all confounders — observed and unobserved.
For the training mystery: Randomly assign some job seekers to receive training and others to a control group. Compare earnings. This approach is exactly what the National Supported Work (NSW) Demonstration did in the 1970s — a landmark study we will revisit in F8. You can learn the mechanics of designing and analyzing such studies on the experimental design page.
Limitation: Often infeasible or unethical. You cannot randomly assign firm bankruptcy, gender, or economic recessions.
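A quick simulated sketch (NumPy, invented numbers) shows why randomization is Tier 1: ability still affects earnings, but a coin flip assigns treatment, so ability is balanced across arms in expectation and a simple difference in means is an unbiased estimate of the ATE.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Ability affects earnings, but treatment is assigned by coin flip,
# so ability is balanced across arms in expectation.
ability = rng.normal(size=n)
treated = rng.integers(0, 2, size=n).astype(float)
earnings = 2.0 * treated + 3.0 * ability + rng.normal(size=n)  # true ATE = 2

# With randomization, a difference in means identifies the ATE.
ate_hat = earnings[treated == 1].mean() - earnings[treated == 0].mean()
print(ate_hat)
```

No controls, no model: the design does all the work.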
Tier 2: Quasi-Experimental (Natural Experiments)
What it is: The researcher finds a setting where some naturally occurring event, policy change, or institutional feature created variation in treatment that is as if random — or at least, provides exogenous variation that can be isolated. Dunning (2012) provides a comprehensive framework for evaluating the credibility of such natural experiments.
For the training mystery: Perhaps the program was rolled out in some cities before others (difference-in-differences). Perhaps eligibility depended on a test score cutoff (regression discontinuity). Perhaps draft lottery numbers affected who enrolled (instrumental variables).
Limitation: Requires finding the right setting. The identifying variation must be genuinely exogenous, which is often debatable.
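As one quasi-experimental example, here is a minimal 2x2 difference-in-differences sketch (NumPy, made-up numbers): a program launches in City A but not City B, both cities share a common time trend, and differencing twice nets out both the level difference between cities and the shared trend.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000  # workers per city

# City A gets the program after period 0; City B never does.
# A has a level difference (+0.5), both share a time trend (+1.0),
# and the true program effect is +2.0.
base_a = 0.5 + rng.normal(size=n)
base_b = rng.normal(size=n)
pre_a, pre_b = base_a, base_b
post_a = base_a + 1.0 + 2.0 + rng.normal(size=n)  # trend + treatment effect
post_b = base_b + 1.0 + rng.normal(size=n)        # trend only

# DiD: (change in A) minus (change in B) removes the level difference
# and the shared trend, leaving the treatment effect.
did = (post_a.mean() - pre_a.mean()) - (post_b.mean() - pre_b.mean())
print(did)
```

The simulation builds in parallel trends by construction; in real data, that assumption is exactly what must be defended.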
Tier 3: Observational (Selection on Observables)
What it is: No experiment, no natural experiment. The researcher uses observational data and relies on controlling for the right set of confounders to eliminate selection bias.
For the training mystery: Use survey data on trainees and non-trainees, control for education, age, prior earnings, motivation proxies, and hope that you have captured all the relevant confounders.
Limitation: Relies on the assumption that there are no unobserved confounders — an assumption that can never be verified from the data.
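This limitation can be seen in a small simulation (NumPy, invented numbers): treatment depends on observed education and unobserved motivation. Controlling for everything we observe still leaves an upward bias, and only an "oracle" regression that sees motivation recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

education = rng.normal(size=n)   # observed
motivation = rng.normal(size=n)  # unobserved in our data
treated = (education + motivation + rng.normal(size=n) > 0).astype(float)
earnings = (2.0 * treated + 2.0 * education
            + 2.0 * motivation + rng.normal(size=n))  # true effect = 2

# Control for everything we observe (education): motivation stays in the
# error term, so the estimate is still biased upward.
X_obs = np.column_stack([np.ones(n), treated, education])
effect_hat = np.linalg.lstsq(X_obs, earnings, rcond=None)[0][1]

# Oracle regression (sees motivation): recovers the true effect.
X_all = np.column_stack([np.ones(n), treated, education, motivation])
oracle_hat = np.linalg.lstsq(X_all, earnings, rcond=None)[0][1]

print(effect_hat, oracle_hat)
```

Nothing in the observed data distinguishes the biased regression from a valid one — which is why the no-unobserved-confounders assumption cannot be verified.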
The Complete Method Map
Below is every method covered on this site, organized by category. Click any method to jump to its full page.
Randomized experiments are the gold standard — random assignment eliminates confounding in expectation. When you can randomize, the analysis is straightforward.
Method Summary Table
Here is a compact reference for all the methods you will encounter. Do not worry about memorizing this table — you will learn each method in depth on its own page. Use this table to orient yourself and to find the right method for your setting.
| Method | Category | What It Exploits | Key Assumption | When to Use |
|---|---|---|---|---|
| Randomized Experiment | Experimental | Random assignment | Compliance, no spillovers | You can randomize treatment |
| OLS | Model-based | Conditional independence | No omitted variable bias | Baseline; building block for other methods |
| Fixed Effects | Design-based | Within-unit variation | Time-invariant confounders only | Panel data with repeated observations |
| Difference-in-Differences | Design-based | Treatment timing variation | Parallel trends | Policy change, staggered rollout |
| Event Studies | Design-based | Pre/post treatment dynamics | Parallel trends (visible in pre-period) | Complement to DiD; visualizing dynamic effects |
| Staggered DiD | Design-based | Differential treatment timing | Parallel trends (heterogeneity-robust) | Multiple units treated at different times |
| Synthetic Control | Design-based | Weighted donor pool | Good pre-treatment fit | Few treated units, aggregate data |
| Synthetic DiD | Design-based | Combines DiD + synthetic control | Relaxed parallel trends | When parallel trends alone is too strong |
| Regression Discontinuity | Design-based | Score cutoff | Continuity at the cutoff | Treatment assigned by a threshold |
| Instrumental Variables | Design-based | Exogenous instrument | Exclusion restriction, relevance | Instrument available |
| Shift-Share / Bartik | Design-based | Interaction of shares and shocks | Exogenous shares or shocks | Regional/industry exposure to national shocks |
| Matching | Model-based | Observable similarity | Selection on observables | Rich covariate data, overlap |
| Inverse Probability Weighting | Model-based | Propensity scores | Correct propensity model | Reweighting to balance covariates |
| Doubly Robust Estimation | Model-based | Outcome model + propensity | One of two models correct | Insurance against misspecification |
| Logit / Probit | Model-based | Latent variable threshold | Correct functional form | Binary outcomes |
| Poisson / Negative Binomial | Model-based | Count data structure | Mean specification | Count outcomes |
| Double/Debiased ML | ML-causal | High-dimensional controls | Approximate sparsity | Many potential controls, flexible functions |
| Causal Forests | ML-causal | Heterogeneous treatment effects | Unconfoundedness | Discovering who benefits most |
| Causal Mediation | Mechanism | Mediator variation | Sequential ignorability | Understanding how treatment works |
Choosing a Method: First Principles
When you face a research question, here is how to think about method choice:
Step 1: What is your estimand? (F4) Do you want the ATE, ATT, or LATE? This narrows your options.
Step 2: What does your DAG look like? (F5) What are the confounders? Are any of them unobservable?
Step 3: What variation exists? This is the crucial question:
- Was treatment randomized? Use an experiment.
- Is there a cutoff? Consider RDD.
- Is there a before-and-after, with a comparison group? Consider DiD.
- Is there an instrument? Consider IV.
- Do you only have observational data with rich covariates? Consider matching, IPW, or doubly robust methods.
- Do you need to handle many covariates flexibly? Consider DML or causal forests.
Step 4: Are the assumptions plausible? Every method requires assumptions. The best method is the one whose assumptions are most credible in your setting — not the one that sounds most sophisticated.
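The checklist in Step 3 can be sketched, very loosely, as a helper function. This is a toy — real method choice is a judgment call about assumptions (Step 4), not a lookup — and all the argument names are invented for illustration.

```python
def suggest_method(randomized=False, cutoff=False,
                   pre_post_with_comparison=False, instrument=False,
                   rich_covariates=False, many_covariates=False):
    """Toy version of the Step 3 checklist: return a first method to consider.

    The order mirrors the tiers: experimental first, then quasi-experimental
    designs, then selection-on-observables approaches.
    """
    if randomized:
        return "randomized experiment"
    if cutoff:
        return "regression discontinuity"
    if pre_post_with_comparison:
        return "difference-in-differences"
    if instrument:
        return "instrumental variables"
    if many_covariates:
        return "double/debiased ML or causal forests"
    if rich_covariates:
        return "matching, IPW, or doubly robust estimation"
    return "no clean strategy yet; revisit the DAG"

print(suggest_method(pre_post_with_comparison=True))
```

Even as a toy, the ordering encodes a real principle: exploit the strongest variation available before falling back on modeling assumptions.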
A researcher wants to estimate the effect of a new job training program on earnings. The program was available to all unemployed workers in City A but not in City B. Both cities are similar. She has earnings data for workers in both cities before and after the program launched. Which identification strategy is most natural for this setting?
How the Training Mystery Gets Solved (Preview)
Our running question — "Did the job training program work?" — has been studied with nearly every method on this site:
- Randomized experiment: The NSW Demonstration randomly assigned participants.
- Matching and reweighting: Dehejia and Wahba (1999) showed that propensity score methods could recover the experimental benchmark from observational data.
- Difference-in-differences: Training programs rolled out at different times across regions have been studied using DiD.
- Instrumental variables: Draft lotteries and eligibility rules have been used as instruments for training participation.
Each approach has strengths and weaknesses. The credibility revolution (F8) is partly the story of researchers learning which approach works best in which setting — and developing tools to assess credibility rather than taking any single method on faith.
Cross-Cutting Practices
Beyond choosing a method, credible empirical research requires a set of practices that apply to every method:
- Sensitivity analysis: How much would your results change if your assumptions were slightly wrong? (See the Sensitivity Analysis page.)
- Power analysis: Did you have enough data to detect the effect you were looking for?
- Multiple testing corrections: If you tested many outcomes or subgroups, did you account for the increased chance of false positives?
- Pre-registration: Did you commit to your analysis plan before seeing the results?
- Specification curves: How robust are your results across reasonable alternative specifications?
These practices are not optional extras. In modern empirical research, reviewers expect them. We cover each in the Practices section of this site.
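To make the multiple-testing point concrete, here is a minimal pure-Python sketch of two standard family-wise corrections, Bonferroni and Holm's step-down procedure. The p-values are hypothetical, invented for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i iff p_i <= alpha / m: controls the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm's step-down procedure: same guarantee, uniformly more powerful."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

# Hypothetical p-values from testing five outcomes of the same program.
pvals = [0.001, 0.012, 0.021, 0.04, 0.30]
print(bonferroni(pvals))
print(holm(pvals))
```

Without any correction, four of the five "outcomes" would look significant at the 5% level; the corrections discipline that count.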
Key Takeaways
What Comes Next
Before we dive into specific methods, there is one more foundational skill: working with data. The next page covers loading, cleaning, reshaping, and constructing variables — the practical skills you need before you can implement any method.