Replication · 120 minutes

Replication Lab: Quarter of Birth and Returns to Schooling

Replicate Angrist and Krueger on returns to schooling with quarter-of-birth IV: first stage, reduced form, 2SLS, instrument strength, OLS vs IV comparison.

Method: Instrumental Variables / 2SLS
Languages: Python, R, Stata
Dataset: Simulated Census data with quarter-of-birth instrument matching Angrist & Krueger (1991)

Overview

In this replication lab, you will reproduce the key results from one of the most influential papers in the instrumental variables literature:

Angrist, Joshua D., and Alan B. Krueger. 1991. "Does Compulsory School Attendance Affect Schooling and Earnings?" Quarterly Journal of Economics 106(4): 979–1014.

Angrist and Krueger use quarter of birth as an instrument for years of schooling. The logic: compulsory schooling laws require students to remain in school until they turn 16 (or 17 or 18, depending on the state). Because school entry is based on calendar cutoff dates, children born early in the year start school at an older age than classmates born later in the year, so they reach the legal dropout age having completed less schooling. Quarter of birth is thus correlated with schooling but (arguably) unrelated to earnings except through its effect on schooling.

Why this paper matters: It introduced one of the most creative identification strategies in economics and sparked decades of debate about instrument validity, weak instruments, and the interpretation of IV estimates.

What you will do:

Simulate data matching the published first-stage and 2SLS estimates
Estimate the first stage (quarter of birth on schooling)
Estimate the reduced form (quarter of birth on earnings)
Compute 2SLS by hand and via standard packages
Test instrument strength (first-stage F-statistic)
Compare OLS and IV estimates of returns to education

Step 1: Simulate the Census Data

# First-time setup: install.packages(c("estimatr", "modelsummary", "ivreg", "car"))
library(estimatr)
library(modelsummary)
library(ivreg)
library(car)

set.seed(1991)
n <- 50000

# Quarter of birth, year of birth, and unobserved ability
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)

# Schooling depends on ability and, weakly, on quarter of birth;
# it is rounded to one decimal place and clipped to [0, 20]
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob==1) - 0.05*(qob==2) +
                        0.05*(qob==4) + rnorm(n, 0, 2.5), 1), 0), 20)

# True return to schooling is 0.08; ability is the omitted confounder
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob-1930) + rnorm(n, 0, 0.5)

df <- data.frame(log_wage, educ, qob, yob, ability,
                 q1 = as.integer(qob == 1),
                 q2 = as.integer(qob == 2),
                 q3 = as.integer(qob == 3))

cat("Education by quarter of birth:\n")
print(tapply(df$educ, df$qob, mean))

Requires: estimatr, modelsummary, ivreg, car

Expected output:

Sample summary:

Statistic	Value
N	50,000
Mean education	~12.0 years
Mean log wage	~5.5

Education by quarter of birth:

Quarter of Birth	Mean Education
Q1	~11.90
Q2	~11.95
Q3	~12.00
Q4	~12.05

Q1-born individuals have slightly less education (about 0.15 years less than Q4 in expectation), consistent with the compulsory schooling mechanism: students born earlier in the year reach the legal dropout age sooner and can leave school with less completed schooling.
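The lab lists Python among its languages, so here is a minimal standard-library sketch of the same data-generating process. It is an illustration rather than the lab's official Python solution: the names `simulate` and `mean_educ_by_quarter` are hypothetical, and Python's random number generator differs from R's, so the draws will not match the R output digit for digit.

```python
import random
from collections import defaultdict

# Quarter-of-birth shifts in expected schooling, copied from the R code above
QOB_SHIFT = {1: -0.10, 2: -0.05, 3: 0.0, 4: 0.05}

def simulate(n=50_000, seed=1991):
    """Return a list of (qob, educ, log_wage) tuples from the lab's DGP."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        qob = rng.randint(1, 4)          # quarter of birth, uniform on 1..4
        yob = rng.randint(1930, 1939)    # year of birth
        ability = rng.gauss(0, 1)        # unobserved confounder
        educ = 12 + 2 * ability + QOB_SHIFT[qob] + rng.gauss(0, 2.5)
        educ = min(max(round(educ, 1), 0), 20)  # same rounding and clipping as the R code
        log_wage = (4.5 + 0.08 * educ + 0.15 * ability
                    + 0.02 * (yob - 1930) + rng.gauss(0, 0.5))
        rows.append((qob, educ, log_wage))
    return rows

def mean_educ_by_quarter(rows):
    """Average years of schooling within each quarter of birth."""
    sums, counts = defaultdict(float), defaultdict(int)
    for qob, educ, _ in rows:
        sums[qob] += educ
        counts[qob] += 1
    return {q: sums[q] / counts[q] for q in sorted(sums)}

for quarter, mean_educ in mean_educ_by_quarter(simulate()).items():
    print(f"Q{quarter}: {mean_educ:.3f}")
```

The gaps between quarters are small relative to the spread of schooling, so with 50,000 observations the ordering of adjacent quarters can flip from one seed to the next.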

Step 2: OLS Estimate (Biased Benchmark)

# OLS with year-of-birth fixed effects
m_ols <- lm_robust(log_wage ~ educ + factor(yob), data = df, se_type = "HC1")
cat("OLS return to education:", coef(m_ols)["educ"], "\n")
cat("Published OLS: ~0.070-0.071\n")

Expected output:

Variable	Coefficient	Robust SE	95% CI
educ (OLS)	~0.109	~0.0007	[0.108, 0.111]
Published OLS	~0.070	—	—

The OLS return to education is biased upward because unobserved ability is positively correlated with both education and wages. OLS captures the true causal effect (0.08) plus the omitted-variable bias from ability (about 0.03 under the simulation parameters), producing an estimate around 0.109. In the published paper, OLS gives approximately 0.070, which is lower than our simulation because real-world measurement error in schooling attenuates the estimate toward zero, partially offsetting the upward ability bias.
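The size of the ability bias can be worked out from the Step 1 parameters with the omitted-variable-bias formula, bias = γ · Cov(educ, ability) / Var(educ), where γ = 0.15 is the wage effect of ability. The sketch below is a back-of-the-envelope calculation, not part of the lab's code: it ignores the rounding and clipping of educ, and the variable names are illustrative. Year of birth drops out because it is drawn independently of schooling and ability.

```python
# Parameters read off the Step 1 data-generating process
beta_true = 0.08        # causal return to a year of schooling
gamma = 0.15            # effect of ability on log wages
ability_loading = 2.0   # coefficient on ability in the schooling equation
sd_noise = 2.5          # standard deviation of the schooling noise term
qob_shifts = [-0.10, -0.05, 0.0, 0.05]  # Q1..Q4 shifts, each with probability 1/4

# Variance of the quarter-of-birth component of schooling
mean_shift = sum(qob_shifts) / len(qob_shifts)
var_qob = sum((s - mean_shift) ** 2 for s in qob_shifts) / len(qob_shifts)

# ability ~ N(0, 1), so Cov(educ, ability) equals the loading on ability
cov_educ_ability = ability_loading
var_educ = ability_loading ** 2 + sd_noise ** 2 + var_qob

ovb = gamma * cov_educ_ability / var_educ
ols_plim = beta_true + ovb
print(f"omitted-variable bias: {ovb:.4f}")
print(f"implied OLS limit:     {ols_plim:.4f}")
```

The bias is positive because both the loading of schooling on ability and the wage effect of ability are positive; shrinking either one moves OLS back toward 0.08.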

Step 3: First Stage — Quarter of Birth Predicts Schooling

# First stage
m_first <- lm_robust(educ ~ q1 + q2 + q3 + factor(yob),
                     data = df, se_type = "HC1")
cat("=== First Stage ===\n")
cat("Q1:", coef(m_first)["q1"], "\n")
cat("Q2:", coef(m_first)["q2"], "\n")
cat("Q3:", coef(m_first)["q3"], "\n")

# F-test for joint significance of instruments
f_test <- linearHypothesis(lm(educ ~ q1 + q2 + q3 + factor(yob), data = df),
                           c("q1 = 0", "q2 = 0", "q3 = 0"))
cat("First-stage F:", f_test$F[2], "\n")

Expected output:

Instrument	Coefficient	SE	t-statistic
Q1 (q1)	~-0.15	~0.04	~-3.8
Q2 (q2)	~-0.10	~0.04	~-2.5
Q3 (q3)	~-0.05	~0.04	~-1.2

Diagnostic	Value	Threshold
First-stage F (joint)	~3–6 (simulated)	9.08 (Stock-Yogo, 3 instruments)
Instrument strength	Weak (F < 10) in this simulation	F > 10 (Staiger-Stock screening); F > 104.7 (LMMP 2022) for valid t-test inference in just-identified case
Famously weak F in the literature	~1.6	From Bound, Jaeger & Baker (1995), over-identified specification; not the 3-QOB specification

The first-stage F-statistic in this simulated sample is low (often below 10), making this a textbook example of the weak instruments problem. The famously weak "F ≈ 1.6" figure from Bound et al. (1995) refers to Angrist and Krueger (1991)'s over-identified specifications with 180 instruments (quarter-of-birth × year-of-birth and quarter-of-birth × state-of-birth interactions). In the full Census sample, the 3-QOB-only specification used in this lab is much stronger: Angrist & Krueger Table V reports an F around 30.
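A rough calculation shows why the simulated F is small and how it scales with sample size. With k instruments, the expected first-stage F is approximately 1 + λ/k, where λ = n · Var(schooling explained by the instruments) / Var(first-stage residual) is the concentration parameter. The sketch below plugs in the Step 1 parameters. It is an approximation that ignores the rounding and clipping of educ, and the function name `expected_first_stage_f` is hypothetical.

```python
def expected_first_stage_f(n, shifts=(-0.10, -0.05, 0.0, 0.05),
                           ability_loading=2.0, sd_noise=2.5):
    """Approximate E[F] for the quarter-of-birth dummies in a sample of size n."""
    k = len(shifts) - 1  # three dummies, one omitted quarter
    mean_shift = sum(shifts) / len(shifts)
    # Variance in schooling explained by quarter of birth (equal-sized quarters)
    var_explained = sum((s - mean_shift) ** 2 for s in shifts) / len(shifts)
    # First-stage residual variance: ability (variance 1) and noise are both unexplained
    var_resid = ability_loading ** 2 + sd_noise ** 2
    concentration = n * var_explained / var_resid
    return 1 + concentration / k

for n in (50_000, 100_000, 330_000):
    print(n, round(expected_first_stage_f(n), 1))
```

The realized F in any single draw scatters around this expectation. Because the concentration parameter grows linearly in n, the same per-observation signal that looks weak at 50,000 observations clears the F > 10 rule of thumb in a Census-sized sample.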

Concept Check

The famously weak first-stage F of ~1.6 in Angrist and Krueger (1991), taken from Bound, Jaeger & Baker's (1995) reanalysis of the over-identified 180-instrument specification, exemplifies the weak instruments problem. What does a weak first stage imply for the 2SLS estimator?

The 2SLS estimate will be biased toward zero.
The 2SLS estimate may be biased toward the OLS estimate, have very large standard errors, and be unreliable for inference even in large samples.
The 2SLS estimate will be consistent but inefficient.
The instrument is invalid (violates the exclusion restriction).

Step 4: 2SLS Estimation

# 2SLS using ivreg
m_2sls <- ivreg(log_wage ~ educ + factor(yob) | q1 + q2 + q3 + factor(yob),
                data = df)
summary(m_2sls, diagnostics = TRUE)

cat("\n=== Comparison ===\n")
cat("OLS:", coef(m_ols)["educ"], "\n")
cat("2SLS:", coef(m_2sls)["educ"], "\n")
cat("Published: OLS ~0.070, IV ~0.080-0.100\n")

Requires: ivreg

Expected output:

Wald / IV estimate (using Q1 only):

Component	Estimate
Reduced form (Q1 on log_wage)	~-0.012
First stage (Q1 on educ)	~-0.15
Wald (RF / FS)	~0.080

2SLS with 3 quarter dummies:

Variable	Coefficient	Robust SE
educ (2SLS)	~0.080–0.100	~0.04

Comparison table:

Estimator	Return to Education	Published Value
OLS	~0.109	~0.070
2SLS (3 instruments)	~0.080–0.100	~0.089
True causal effect	0.080	—
IV > OLS?	Varies	Yes

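In the published paper, the IV estimate exceeds OLS, which is consistent with the LATE interpretation: compliers may have higher marginal returns to education. Our simulation has a constant return of 0.08 and no measurement error (which attenuates the published OLS downward), so OLS is pushed above 0.08 by ability bias while 2SLS is centered near 0.08 but imprecise; 2SLS may therefore land above or below OLS depending on the random seed. The qualitative lesson remains the same: OLS and IV need not agree, because OLS absorbs ability bias while IV uses only the variation in schooling induced by the instrument.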
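The Wald calculation in the first table above can be reproduced by hand: with a single binary instrument, the IV estimate is the reduced-form difference in mean log wages divided by the first-stage difference in mean schooling. Below is a minimal standard-library Python sketch on freshly simulated data (Python being one of the lab's listed languages). It compares Q1 births with Q4 births and omits year-of-birth controls, so it is a simplified illustration rather than the lab's 2SLS specification. The names `simulate`, `group_mean`, and `wald_estimate` are hypothetical, and the numbers will differ from the R run because the random draws differ.

```python
import random

def simulate(n=50_000, seed=1991):
    """Return (qob, educ, log_wage) tuples from the Step 1 data-generating process."""
    rng = random.Random(seed)
    shift = {1: -0.10, 2: -0.05, 3: 0.0, 4: 0.05}
    rows = []
    for _ in range(n):
        qob = rng.randint(1, 4)
        yob = rng.randint(1930, 1939)
        ability = rng.gauss(0, 1)
        educ = 12 + 2 * ability + shift[qob] + rng.gauss(0, 2.5)
        educ = min(max(round(educ, 1), 0), 20)
        log_wage = (4.5 + 0.08 * educ + 0.15 * ability
                    + 0.02 * (yob - 1930) + rng.gauss(0, 0.5))
        rows.append((qob, educ, log_wage))
    return rows

def group_mean(values):
    return sum(values) / len(values)

def wald_estimate(rows, treated_q=1, control_q=4):
    """Reduced form, first stage, and their ratio for one quarter versus another."""
    educ_t = [e for q, e, _ in rows if q == treated_q]
    educ_c = [e for q, e, _ in rows if q == control_q]
    wage_t = [w for q, _, w in rows if q == treated_q]
    wage_c = [w for q, _, w in rows if q == control_q]
    first_stage = group_mean(educ_t) - group_mean(educ_c)    # effect of Q1 on schooling
    reduced_form = group_mean(wage_t) - group_mean(wage_c)   # effect of Q1 on log wages
    return reduced_form, first_stage, reduced_form / first_stage

rf, fs, wald = wald_estimate(simulate())
print(f"reduced form: {rf:.4f}  first stage: {fs:.4f}  Wald: {wald:.4f}")
```

Because the first-stage difference sits in the denominator and is only a few standard errors from zero at this sample size, small changes in it move the ratio a lot. That sensitivity is the weak-instrument problem in miniature.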

Concept Check

In IV estimation, what does the exclusion restriction require for the quarter-of-birth instrument?

Quarter of birth must be correlated with years of schooling.
Quarter of birth must affect earnings only through its effect on years of schooling; it cannot have a direct effect on wages through any other channel.
Quarter of birth must be randomly assigned by nature.
The instrument must be uncorrelated with the error term of the first-stage equation.

Step 5: Compare with Published Results

cat("=== Comparison with Angrist & Krueger (1991) ===\n")
cat("Published OLS: ~0.070\n")
cat("Published 2SLS: ~0.089\n")
cat("Our OLS:", round(coef(m_ols)["educ"], 4), "\n")
cat("Our 2SLS:", round(coef(m_2sls)["educ"], 4), "\n")

Expected output:

Statistic	Published (Angrist & Krueger, 1991)	Our Replication
OLS return to education	~0.070	~0.109
2SLS return to education	~0.089	~0.080–0.100
First-stage F (3 QOB dummies, AK91 Table V)	~30 (well above Staiger-Stock 1997 screening F > 10; below LMMP 2022 F > 104.7 just-identified threshold)	~3–6 (simulated subsample)
First-stage F (180-instrument over-id, BJB-1995)	~1.6 (famously weak)	—
IV > OLS?	Yes	Varies
N	~330,000	50,000

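Our simulated results capture two qualitative patterns from the original paper: (1) Q1 births complete slightly less schooling than later births, and (2) quarter of birth explains only a tiny share of the variation in schooling. At n = 50,000 this translates into a low first-stage F, whereas the full Census sample is large enough for the three-dummy specification to reach an F around 30. In the original paper, the IV estimate exceeds OLS, consistent with measurement error attenuating OLS downward. In our simulation, OLS is biased upward by ability and there is no measurement error, so IV may be above or below OLS depending on the random seed.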

Expected output

If your code runs correctly, expect to see:

OLS return to education: Around 0.11 (biased upward from the true 0.08; published OLS is approximately 0.070, which is lower due to measurement error attenuation)
2SLS return to education: Around 0.07–0.11, with a standard error near 0.04 (published: approximately 0.089); it may fall above or below OLS in a given draw
First-stage F-statistic: Low, often below the Staiger-Stock 1997 screening threshold of 10 with a single QOB instrument (and far below the LMMP 2022 F > 104.7 just-identified threshold) — illustrating the weak instruments problem
Education by quarter of birth: Q1-born individuals have slightly less education (about 0.15 years less than Q4)
Reduced form: Small negative effect of Q1 birth on wages (around -0.005 to -0.015)
IV versus OLS: In the published paper IV exceeds OLS, consistent with the LATE interpretation (compulsory schooling laws affect marginal students who may have higher returns); in the simulation the ordering varies with the draw
Sample size: 50,000 simulated men (original: approximately 330,000 men born 1930–1939)

Summary

Our replication confirms the central findings and limitations of Angrist and Krueger (1991):

Quarter of birth predicts years of schooling, but the relationship is weak. The first-stage F-statistic is low, making this a textbook example of the weak instruments problem.
In the published paper, the IV estimate of returns to education exceeds OLS. This pattern is consistent with the LATE interpretation: compulsory schooling laws affect marginal students who may have higher returns to additional education. In our simulation, the relative magnitudes may differ because the simulated DGP does not include measurement error attenuation.
Weak instruments are a serious concern. Bound et al. (1995) showed that with many weak instruments, 2SLS can be severely biased. See the IV / 2SLS method page for the full set of weak-instrument thresholds (Staiger-Stock, Stock-Yogo, Lee et al.) and for weak-instrument-robust alternatives such as the Anderson-Rubin test.
The exclusion restriction is debatable. If quarter of birth affects outcomes through channels other than schooling (e.g., birth season effects on health or family resources), the instrument is invalid.

Extension Exercises

LIML estimator. Limited Information Maximum Likelihood (LIML) is less biased than 2SLS with weak instruments. Estimate LIML and compare to 2SLS.
Just-identified IV. Use only Q1 as a single instrument (just-identified case). Compare to the over-identified 2SLS with three instruments.
Anderson-Rubin test. Implement the weak-instrument-robust Anderson-Rubin confidence interval and compare to the standard 2SLS confidence interval.
Hausman test. Formally test whether OLS and IV estimates are statistically different using the Hausman (1978) test.
Many instruments. Angrist and Krueger also used QOB x YOB interactions (30 instruments). Estimate 2SLS with many instruments and observe how the estimate moves toward OLS. This convergence illustrates the "many weak instruments" bias.
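As a possible starting point for the Anderson-Rubin exercise, the sketch below handles the simplest case: Q1 as a single binary instrument with no controls. For each candidate return beta0 it tests whether log_wage - beta0 * educ still differs between Q1 and non-Q1 births, and it keeps the candidates that are not rejected. This is a simplified illustration in standard-library Python, not the lab's reference solution. It assumes homoskedastic errors, uses the normal critical value 1.96, omits year-of-birth fixed effects, and all function names are hypothetical.

```python
import math
import random

def simulate(n=50_000, seed=1991):
    """Return (z, educ, log_wage) tuples, with z = 1 for Q1 births and 0 otherwise."""
    rng = random.Random(seed)
    shift = {1: -0.10, 2: -0.05, 3: 0.0, 4: 0.05}
    rows = []
    for _ in range(n):
        qob = rng.randint(1, 4)
        yob = rng.randint(1930, 1939)
        ability = rng.gauss(0, 1)
        educ = 12 + 2 * ability + shift[qob] + rng.gauss(0, 2.5)
        educ = min(max(round(educ, 1), 0), 20)
        log_wage = (4.5 + 0.08 * educ + 0.15 * ability
                    + 0.02 * (yob - 1930) + rng.gauss(0, 0.5))
        rows.append((1 if qob == 1 else 0, educ, log_wage))
    return rows

def group_stats(rows, z_value):
    """Count, means, and centered second moments of (educ, log_wage) for one z value."""
    xs = [x for z, x, _ in rows if z == z_value]
    ys = [y for z, _, y in rows if z == z_value]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return n, mx, my, sxx, syy, sxy

def ar_t_stat(beta0, g1, g0):
    """t-statistic for the z = 1 versus z = 0 gap in (log_wage - beta0 * educ)."""
    n1, mx1, my1, sxx1, syy1, sxy1 = g1
    n0, mx0, my0, sxx0, syy0, sxy0 = g0
    diff = (my1 - beta0 * mx1) - (my0 - beta0 * mx0)
    # Within-group sum of squares of y - beta0 * x, expanded from the stored moments
    ss = (syy1 + syy0) - 2 * beta0 * (sxy1 + sxy0) + beta0 ** 2 * (sxx1 + sxx0)
    pooled_var = ss / (n1 + n0 - 2)
    return diff / math.sqrt(pooled_var * (1 / n1 + 1 / n0))

def ar_accepted(rows, grid, crit=1.96):
    """Candidate returns in grid that the Anderson-Rubin test does not reject."""
    g1, g0 = group_stats(rows, 1), group_stats(rows, 0)
    return [b for b in grid if abs(ar_t_stat(b, g1, g0)) <= crit]

grid = [i / 1000 for i in range(-500, 701)]  # candidate returns from -0.5 to 0.7
accepted = ar_accepted(simulate(), grid)
if accepted:
    print(f"AR 95% set on this grid spans [{min(accepted):.3f}, {max(accepted):.3f}]")
else:
    print("No grid value accepted")
```

If the reported span touches either end of the grid, the confidence set is probably unbounded, which can happen when the first stage is weak. Reporting only the minimum and maximum also hides any gaps, so inspect the full `accepted` list before treating the set as an interval.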
