Chapter 2 of 8
The Anatomy of a Research Design
The pipeline from question to credible answer — every step matters.
On the previous page, you learned that naive comparisons between treated and untreated groups can be deeply misleading. Selection bias means that the people who receive a treatment are often systematically different from those who do not — and those differences contaminate any simple comparison.
So how do researchers actually produce credible causal evidence? Not through any single clever trick, but through a pipeline — a sequence of interconnected decisions, each of which needs to be defensible. If any stage of the pipeline is weak, the entire study's conclusions are compromised.
This page introduces that pipeline and walks through each stage using one of the most famous empirical studies in economics.
The Research Pipeline
Every credible empirical study — whether it uses an experiment, a natural experiment, or observational data — follows roughly the same architecture. Here are the stages:
- Research Question — What causal relationship are you trying to understand?
- Theory & Prior Work — What does existing knowledge predict, and why might it be wrong or incomplete?
- Data — What data exist, and are they adequate for your question?
- Identification Strategy — What is your argument for why your comparison is causal, not merely correlational?
- Estimation — What statistical procedure will you use to implement your identification strategy?
- Diagnostics & Robustness — What could go wrong, and how will you check? (See also sensitivity analysis.)
- Interpretation — What do your results mean, and what do they not mean?
- Write-Up — How do you communicate your findings honestly and precisely?
Think of these stages as a checklist. When you read a paper, you should be able to identify what the authors did at each stage. When you design your own study, you should have a defensible answer for each stage.
Let us see how this pipeline works in practice.
A Walkthrough: Card & Krueger (1994)
To make this concrete, we will trace the pipeline through one of the most influential empirical papers in economics: Card and Krueger's study of minimum wages and employment.
This paper (Card & Krueger, 1994) asks a question that closely resembles our training mystery: does a policy intervention actually have the effect people claim? And like our mystery, the naive answer turns out to be more complicated than it first appears.
Stage 1: Research Question
What causal relationship are you trying to understand?
Card and Krueger asked: Does raising the minimum wage reduce employment?
This question is one of the oldest debates in economics. Standard textbook theory (supply and demand in the labor market) predicts that raising the minimum wage above the market-clearing level should reduce employment — firms will hire fewer workers when labor becomes more expensive. But the empirical evidence had been surprisingly mixed.
Notice how specific this question is. Not "what determines employment?" or "are minimum wages good policy?" but a precise causal claim: minimum wage up → employment down (or not). Good research questions are sharp.
Stage 2: Theory & Prior Work
What does existing knowledge predict?
The competitive labor market model predicts that a binding minimum wage increase reduces employment. But alternative models — monopsony (where employers have market power over workers), efficiency wage models, search-and-matching models — predict that moderate minimum wage increases might not reduce employment, and could even increase it.
Card and Krueger were not the first to study this question. Previous research had used time-series data (national-level minimum wage changes over time) and generally found small negative employment effects. But those studies were vulnerable to confounding: many things change over time besides the minimum wage.
Stage 3: Data
What data exist, and are they adequate?
Here is where the study gets creative. On April 1, 1992, New Jersey raised its minimum wage from $4.25 to $5.05 per hour. The neighboring state of Pennsylvania did not change its minimum wage.
Card and Krueger conducted telephone surveys of 410 fast-food restaurants in New Jersey and eastern Pennsylvania — both before the wage increase (in February-March 1992) and after (in November-December 1992). They collected data on employment levels (measured as full-time equivalent employees), starting wages, prices, and store characteristics.
The choice of fast-food restaurants was deliberate: these restaurants are businesses where many workers earn close to the minimum wage, so the policy change would directly affect them. The geographic focus on the NJ-PA border meant the two groups of restaurants were in similar labor markets and served similar populations.
Stage 4: Identification Strategy
Why should we believe this comparison is causal?
This identification question is the heart of any empirical paper, and it is where Card and Krueger's design is most instructive.
Theirs was a difference-in-differences (DiD) design. The logic is:
- Compare employment before and after the minimum wage increase in New Jersey (this difference captures the effect of the wage increase plus any other changes over time).
- Compare employment before and after the same period in Pennsylvania (this difference captures only the other changes over time, since Pennsylvania's minimum wage did not change).
- Take the difference of these two differences. If the time trends would have been the same in both states absent the policy change, this "difference-in-differences" isolates the causal effect of the minimum wage increase.
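The arithmetic of the three steps above can be sketched in a few lines. The numbers below are purely illustrative placeholders, not Card and Krueger's actual estimates:

```python
# Difference-in-differences from four group means.
# Values are illustrative, not Card & Krueger's published figures.
nj_before, nj_after = 20.0, 21.0   # mean FTE employment, New Jersey stores
pa_before, pa_after = 23.0, 21.5   # mean FTE employment, Pennsylvania stores

change_nj = nj_after - nj_before   # policy effect + any common time trend
change_pa = pa_after - pa_before   # common time trend only (no wage change)
did = change_nj - change_pa        # difference-in-differences estimate

print(did)  # → 2.5
```

Note that the Pennsylvania difference is what makes this more than a simple before/after comparison: it nets out whatever else was happening to fast-food employment over the same months.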
The key assumption — called the parallel trends assumption — is that New Jersey and Pennsylvania restaurants would have followed the same employment trajectory if New Jersey had not raised its minimum wage. This assumption is not testable directly, but Card and Krueger argued it was reasonable given the geographic proximity and similarity of the two groups. You can explore the causal structure of this argument visually using DAGs.
Stage 5: Estimation
What statistical procedure implements the identification strategy?
Card and Krueger estimated the effect using a simple regression:

$$\Delta E_i = \alpha + \beta \, NJ_i + \gamma' X_i + \varepsilon_i$$

where $\Delta E_i$ is the change in employment at restaurant $i$ between the two survey waves, $NJ_i$ is an indicator for being in New Jersey, and $X_i$ are control variables (chain type, whether company-owned or franchised, etc.). The coefficient $\beta$ is the difference-in-differences estimate: the additional change in employment in NJ relative to PA.

Their headline finding surprised many economists: $\beta$ was positive (though not always statistically significant), suggesting that the minimum wage increase did not reduce employment and may even have slightly increased it.
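A minimal version of this regression can be run on simulated data. Everything here is synthetic — the "true" effect, the noise level, and the sample size are assumptions for the sketch, not values from the survey:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic restaurant-level data (illustrative, not the actual survey sample).
n = 400
nj = rng.integers(0, 2, n).astype(float)   # 1 if the restaurant is in New Jersey
true_beta = 2.5                            # assumed DiD effect for the simulation
delta_emp = 1.0 + true_beta * nj + rng.normal(0.0, 3.0, n)  # change in FTE employment

# OLS of the employment change on a constant and the NJ indicator:
# the coefficient on the indicator is the difference-in-differences estimate.
X = np.column_stack([np.ones(n), nj])
alpha_hat, beta_hat = np.linalg.lstsq(X, delta_emp, rcond=None)[0]

print(round(beta_hat, 2))  # close to 2.5 in this simulated sample
```

Regressing the *change* in employment on the NJ dummy is algebraically the same as the four-means calculation, but the regression form makes it easy to add controls and compute standard errors.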
Stage 6: Diagnostics & Robustness
What could go wrong?
Card and Krueger conducted several checks:
- They verified that wages actually increased in New Jersey relative to Pennsylvania (confirming the "first stage" — the policy actually changed the variable of interest).
- They tested whether results were sensitive to the specific employment measure used.
- They checked whether restaurants that were already paying above $5.05 (and thus were unaffected by the increase) showed different trends.
- They examined whether the results varied by chain or ownership type.
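One of these checks — comparing stores that were and were not bound by the new minimum — works as a placebo test, and can be sketched as a subgroup comparison. The helper function and the group means below are hypothetical, chosen only to show the logic:

```python
def did(group):
    """Difference-in-differences from (before, after) mean employment pairs."""
    (nj_b, nj_a), (pa_b, pa_a) = group["NJ"], group["PA"]
    return (nj_a - nj_b) - (pa_a - pa_b)

# Illustrative means for stores affected by the increase (paying under $5.05
# beforehand) and a placebo group already paying above it; numbers are made up.
affected   = {"NJ": (19.5, 21.0), "PA": (22.0, 21.0)}
unaffected = {"NJ": (24.0, 24.2), "PA": (25.0, 25.1)}

print(did(affected))    # sizable effect where the policy actually binds
print(did(unaffected))  # placebo: should be near zero if the design is sound
```

A large "effect" in the unaffected group would be a red flag: those stores' wages did not change, so any DiD there must come from a violation of parallel trends, not from the policy.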
Later, Neumark and Wascher (2000) challenged the findings using payroll data rather than survey data, reigniting the debate.
This back-and-forth is healthy science. No single study settles a question. But Card and Krueger's design was influential precisely because it used a clear identification strategy that could be evaluated, critiqued, and replicated.
Stage 7: Interpretation
What do the results mean — and what do they not mean?
Card and Krueger interpreted their results as evidence against the simple competitive model's prediction that minimum wage increases necessarily reduce employment. They did not claim that minimum wages never reduce employment, or that arbitrarily large increases would be harmless.
This restraint is crucial: good researchers state what their results show and explicitly note the boundaries of their conclusions. The study provided evidence about a specific minimum wage increase (from $4.25 to $5.05) in a specific industry (fast food) in a specific region (NJ/PA) at a specific time (1992). Whether those results generalize to other settings is a separate question — one of external validity.
Stage 8: Write-Up
Card and Krueger's paper is a model of clear empirical writing. The research design is explained in accessible language. Tables are well-organized. Limitations are discussed honestly. Alternative interpretations are considered.
When you write up your own research, study papers like this one for their structure and tone — not just their content.
Why Every Stage Matters
Here is an exercise that will sharpen your thinking: for each stage of the pipeline, ask yourself what would happen if it were weak.
- Weak question? The study might answer something no one cares about.
- Weak theory? The findings might be hard to interpret or could have been predicted without any data.
- Weak data? Measurement error, missing variables, or unrepresentative samples undermine everything that follows.
- Weak identification? The central estimates might reflect correlation, not causation — the problem we started with.
- Weak estimation? Even with a good design, the wrong statistical procedure can introduce bias or inflate uncertainty.
- Weak diagnostics? Hidden problems go undetected, and readers cannot assess credibility. (Tools like sensitivity analysis help here.)
- Weak interpretation? Results are over-claimed or under-claimed, misleading readers or policymakers.
- Weak write-up? Good work goes unappreciated because it is poorly communicated.
A researcher finds a negative and statistically significant effect of a new drug on patient recovery times. However, they did not verify that the treatment group actually received the drug at higher rates than the control group. Which stage of the research pipeline did they skip?
What Comes Next
Remember our job training program? Think about what a credible research design would look like for that question. You would need:
- A clear counterfactual: what would have happened to trainees without the program?
- A strategy for dealing with the fact that trainees chose to participate (selection bias).
- Data on outcomes both before and after the program, ideally for both participants and non-participants.
- Checks to verify your strategy is working as intended — ideally pre-registered before you look at the results.
The next page dives deep into the biggest obstacle to credible evidence: selection bias and confounding. You have already seen that selection bias inflated the naive estimate of the training effect from $2,000 to $7,500. Now you will learn exactly why this inflation happens, how to measure it, and why it is the single most important concept in empirical research.
Next Step: Selection Bias and Confounding — The single biggest threat to your research, and the reason every method on this site exists.