MethodAtlas
Replication · 120 minutes

Replication Lab: Quarter of Birth and Returns to Schooling

Replicate Angrist and Krueger's instrumental variables analysis of returns to schooling using quarter of birth as an instrument. Estimate first stage, reduced form, and 2SLS, test instrument strength, and compare OLS and IV estimates.

Overview

In this replication lab, you will reproduce the key results from one of the most influential papers in the instrumental variables literature:

Angrist, Joshua D., and Alan B. Krueger. 1991. "Does Compulsory School Attendance Affect Schooling and Earnings?" Quarterly Journal of Economics 106(4): 979–1014.

Angrist and Krueger use quarter of birth as an instrument for years of schooling. The logic: compulsory schooling laws require students to remain in school until they turn 16 (or 17 or 18). Because school entry depends on calendar cutoff dates, children born early in the year start school at an older age than classmates born later in the year, so they reach the legal dropout age having completed less schooling. Quarter of birth is thus correlated with schooling but (arguably) unrelated to earnings except through its effect on schooling.
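The argument can be sketched in equations. The notation below is an assumption of this write-up rather than the paper's own: S_i is years of schooling, ln W_i is the log wage, Z_i is a quarter-of-birth indicator, and ρ is the return to schooling.

```latex
\begin{aligned}
\text{First stage:}  \quad & S_i = \pi_0 + \pi_1 Z_i + v_i \\
\text{Structural:}   \quad & \ln W_i = \alpha + \rho\, S_i + u_i \\
\text{Reduced form:} \quad & \ln W_i = \gamma_0 + \gamma_1 Z_i + \varepsilon_i,
                       \qquad \gamma_1 = \rho\,\pi_1 \\
\text{IV estimand:}  \quad & \rho = \frac{\gamma_1}{\pi_1}
                       = \frac{\operatorname{Cov}(\ln W_i, Z_i)}{\operatorname{Cov}(S_i, Z_i)}
\end{aligned}
```

The ratio is well defined only if the instrument is relevant (π_1 ≠ 0), and it equals ρ only if the exclusion restriction holds (Cov(Z_i, u_i) = 0). A small π_1 makes the ratio unstable, which is the weak-instrument issue examined later in the lab.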

Why this paper matters: It introduced one of the most creative identification strategies in economics and sparked decades of debate about instrument validity, weak instruments, and the interpretation of IV estimates.

What you will do:

  • Simulate data matching the published first-stage and 2SLS estimates
  • Estimate the first stage (quarter of birth on schooling)
  • Estimate the reduced form (quarter of birth on earnings)
  • Compute 2SLS by hand and via standard packages
  • Test instrument strength (first-stage F-statistic)
  • Compare OLS and IV estimates of returns to education

Step 1: Simulate the Census Data

library(estimatr)
library(modelsummary)
library(ivreg)
library(car)

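set.seed(1991)
n <- 50000

# Quarter and year of birth are drawn independently of ability
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)  # unobserved confounder

# Schooling depends on ability and (weakly) on quarter of birth;
# it is rounded to one decimal and clipped to the range [0, 20]
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob==1) - 0.05*(qob==2) +
                        0.05*(qob==4) + rnorm(n, 0, 2.5), 1), 0), 20)

# Log wage: the true return to schooling is 0.08; ability also raises wages
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob-1930) + rnorm(n, 0, 0.5)

# Q4 is the omitted quarter, so q1-q3 are measured relative to Q4
df <- data.frame(log_wage, educ, qob, yob, ability,
                 q1 = as.integer(qob == 1),
                 q2 = as.integer(qob == 2),
                 q3 = as.integer(qob == 3))

cat("N:", nrow(df), "\n")
cat("Mean education:", mean(df$educ), "\n")
cat("Mean log wage:", mean(df$log_wage), "\n")
cat("Education by quarter of birth:\n")
print(tapply(df$educ, df$qob, mean))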

Expected output:

Sample summary:

Statistic        Value
N                50,000
Mean education   ~12.0 years
Mean log wage    ~5.5

Education by quarter of birth:

Quarter of Birth   Mean Education
Q1                 ~11.90
Q2                 ~11.95
Q3                 ~12.00
Q4                 ~12.05

Q1-born individuals have slightly less education (about 0.10–0.15 years less than Q4), consistent with the compulsory schooling mechanism: students born earlier in the year reach the legal dropout age sooner and can leave school with less completed schooling.


Step 2: OLS Estimate (Biased Benchmark)

# OLS with year-of-birth fixed effects
m_ols <- lm_robust(log_wage ~ educ + factor(yob), data = df, se_type = "HC1")
cat("OLS return to education:", coef(m_ols)["educ"], "\n")
cat("Published OLS: ~0.070-0.071\n")

Expected output:

Variable        Coefficient   Robust SE   95% CI
educ (OLS)      ~0.109        ~0.001      [~0.108, ~0.111]
Published OLS   ~0.070

The OLS return to education is biased upward because unobserved ability is positively correlated with both education and wages. OLS recovers the true causal effect (0.08) plus an omitted variable bias equal to the wage effect of ability (0.15) times the slope of ability on education. In this DGP that slope is roughly Cov(educ, ability) / Var(educ) ≈ 2 / 10.25 ≈ 0.195, so the bias is about 0.029 and the OLS estimate is around 0.109. In the published paper, OLS gives approximately 0.070, lower than our simulation because real-world measurement error in schooling attenuates the estimate toward zero, partially offsetting the upward ability bias.
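The omitted variable bias can be checked directly, because ability is observed in a simulation. The sketch below uses base R only and repeats the Step 1 data-generating code so that it runs on its own; the object names (`short`, `long`, `aux`) are illustrative and not part of the lab's code.

```r
# Re-create the simulated data from Step 1 (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob - 1930) + rnorm(n, 0, 0.5)

# "Short" regression omits ability; "long" regression includes it
short <- lm(log_wage ~ educ + factor(yob))
long  <- lm(log_wage ~ educ + ability + factor(yob))

# Auxiliary regression: slope of the omitted variable on educ, given controls
aux <- lm(ability ~ educ + factor(yob))

b_short   <- unname(coef(short)["educ"])
b_long    <- unname(coef(long)["educ"])
g_ability <- unname(coef(long)["ability"])
delta     <- unname(coef(aux)["educ"])

# OVB identity: short = long + (effect of ability) * (slope of ability on educ)
b_implied <- b_long + g_ability * delta

cat("Short (OLS) coefficient:   ", round(b_short, 4), "\n")
cat("Long coefficient on educ:  ", round(b_long, 4), "\n")
cat("Implied by OVB formula:    ", round(b_implied, 4), "\n")
```

The identity holds exactly in-sample, so the first and third printed numbers should agree up to floating-point error. With real Census data the long regression is infeasible, which is why an instrument is needed.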


Step 3: First Stage — Quarter of Birth Predicts Schooling

# First stage
m_first <- lm_robust(educ ~ q1 + q2 + q3 + factor(yob),
                    data = df, se_type = "HC1")
cat("=== First Stage ===\n")
cat("Q1:", coef(m_first)["q1"], "\n")
cat("Q2:", coef(m_first)["q2"], "\n")
cat("Q3:", coef(m_first)["q3"], "\n")

# F-test for joint significance of instruments
f_test <- linearHypothesis(lm(educ ~ q1 + q2 + q3 + factor(yob), data = df),
                         c("q1 = 0", "q2 = 0", "q3 = 0"))
cat("First-stage F:", f_test$F[2], "\n")

Expected output:

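Instrument   Coefficient   SE      t-statistic
Q1 (q1)      ~-0.15        ~0.04   ~-3.7
Q2 (q2)      ~-0.10        ~0.04   ~-2.5
Q3 (q3)      ~-0.05        ~0.04   ~-1.2

Diagnostic              Value                                Threshold
First-stage F (joint)   ~3–6                                 9.08 (Stock-Yogo, 3 instruments)
Instrument strength     Weak (F < 10)                        F > 10 needed
Published F-stat        ~1.6 (many interacted instruments)

The coefficients are measured relative to the omitted fourth quarter, which the DGP places 0.05 years above Q3, so Q1 sits about 0.10 + 0.05 = 0.15 years below Q4. The first-stage F-statistic is low, often below 10, which makes this simulation a textbook illustration of the weak instruments problem. In the literature on Angrist and Krueger (1991), the F-statistic of approximately 1.6 comes from the re-analysis by Bound et al. (1995) of a specification with many interacted instruments; it is far below the Stock-Yogo threshold. The three quarter dummies alone are stronger in the full Census sample than in this simulation.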
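The joint F-statistic for the excluded instruments can also be computed by hand from restricted and unrestricted residual sums of squares. The following is a minimal base-R sketch under classical (homoskedastic) assumptions; it regenerates the Step 1 schooling data so it is self-contained, and the names `restricted`, `unrestricted`, and `partial_r2` are illustrative.

```r
# Re-create the Step 1 schooling data (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
q1 <- as.integer(qob == 1)
q2 <- as.integer(qob == 2)
q3 <- as.integer(qob == 3)

# Unrestricted first stage includes the instruments; restricted drops them
unrestricted <- lm(educ ~ q1 + q2 + q3 + factor(yob))
restricted   <- lm(educ ~ factor(yob))

rss_u  <- sum(resid(unrestricted)^2)
rss_r  <- sum(resid(restricted)^2)
k_excl <- 3  # number of excluded instruments

# F = [(RSS_r - RSS_u) / k] / [RSS_u / residual df of the unrestricted model]
f_hand  <- ((rss_r - rss_u) / k_excl) / (rss_u / df.residual(unrestricted))
f_anova <- anova(restricted, unrestricted)$F[2]

# Partial R-squared: share of residual schooling variance explained by QOB
partial_r2 <- (rss_r - rss_u) / rss_r

cat("First-stage F (by hand):", round(f_hand, 3), "\n")
cat("First-stage F (anova):  ", round(f_anova, 3), "\n")
cat("Partial R-squared:      ", signif(partial_r2, 3), "\n")
```

The two F-statistics agree by construction. A tiny partial R-squared alongside a modest F shows that a large sample does not by itself make an instrument strong.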

Concept Check

The first-stage F-statistic associated with Angrist and Krueger (1991) is famously low (~1.6 in the specifications with many interacted instruments examined by Bound et al., 1995). What does a weak first stage imply for the 2SLS estimator?


Step 4: 2SLS Estimation

# 2SLS using ivreg
m_2sls <- ivreg(log_wage ~ educ + factor(yob) | q1 + q2 + q3 + factor(yob),
              data = df)
summary(m_2sls, diagnostics = TRUE)

cat("\n=== Comparison ===\n")
cat("OLS:", coef(m_ols)["educ"], "\n")
cat("2SLS:", coef(m_2sls)["educ"], "\n")
cat("Published: OLS ~0.070, IV ~0.080-0.100\n")

Expected output:

Wald / IV estimate (using Q1 only):

Component                       Estimate
Reduced form (Q1 on log_wage)   ~-0.008
First stage (Q1 on educ)        ~-0.10
Wald (RF / FS)                  ~0.080
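The Wald estimate with a single binary instrument is the ratio of two differences in means. Here is a base-R sketch, assuming the comparison is Q1 against all other quarters and omitting year-of-birth controls for simplicity; it regenerates the Step 1 data so it runs on its own.

```r
# Re-create the simulated data from Step 1 (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob - 1930) + rnorm(n, 0, 0.5)
q1 <- as.integer(qob == 1)

# Reduced form: difference in mean log wage, Q1 vs. all other quarters
reduced_form <- mean(log_wage[q1 == 1]) - mean(log_wage[q1 == 0])

# First stage: difference in mean schooling, Q1 vs. all other quarters
first_stage <- mean(educ[q1 == 1]) - mean(educ[q1 == 0])

# Wald estimator: reduced form divided by first stage
wald <- reduced_form / first_stage

# Equivalent covariance-ratio form of the just-identified IV estimator
iv_cov <- cov(log_wage, q1) / cov(educ, q1)

cat("Reduced form:", round(reduced_form, 4), "\n")
cat("First stage: ", round(first_stage, 4), "\n")
cat("Wald:        ", round(wald, 4), "\n")
cat("Cov ratio:   ", round(iv_cov, 4), "\n")
```

The Wald ratio and the covariance ratio coincide exactly for a binary instrument. Because the denominator is small, the ratio is noisy, so a single draw can land some distance from the true effect of 0.08.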

2SLS with 3 quarter dummies:

Variable      Coefficient    Robust SE
educ (2SLS)   ~0.080–0.100   ~0.04

Comparison table:

Estimator              Return to Education   Published Value
OLS                    ~0.109                ~0.070
2SLS (3 instruments)   ~0.080–0.100          ~0.089
True causal effect     0.080
IV > OLS?              Varies by seed        Yes

In the published paper, the IV estimate exceeds OLS, which is consistent with the LATE interpretation: compliers may have higher marginal returns to education. In our simulation, because we omit measurement error (which attenuates the published OLS downward), OLS may be above or below IV depending on the random seed. The qualitative lesson remains the same: the two estimators identify different parameters.
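To see what a 2SLS routine does internally, the estimate can be reproduced by hand. The sketch below uses base R only, regenerates the Step 1 data, and assumes homoskedastic errors for the standard error; the matrix names `X`, `Z`, and `X_hat` are illustrative.

```r
# Re-create the simulated data from Step 1 (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob - 1930) + rnorm(n, 0, 0.5)
q1 <- as.integer(qob == 1)
q2 <- as.integer(qob == 2)
q3 <- as.integer(qob == 3)

y <- log_wage
X <- model.matrix(~ educ + factor(yob))          # regressors (educ endogenous)
Z <- model.matrix(~ q1 + q2 + q3 + factor(yob))  # instruments plus controls

# Stage 1: project every column of X onto the column space of Z
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))

# Stage 2: beta = (X_hat'X)^{-1} X_hat'y
beta <- solve(crossprod(X_hat, X), crossprod(X_hat, y))
j <- which(colnames(X) == "educ")
b_2sls <- beta[j]

# Standard error: residuals must use the actual X, not X_hat
u <- y - X %*% beta
sigma2 <- sum(u^2) / (n - ncol(X))
se_2sls <- sqrt(diag(sigma2 * solve(crossprod(X_hat))))[j]

# Same point estimate from two lm() calls; its reported SE is NOT valid
educ_hat <- fitted(lm(educ ~ q1 + q2 + q3 + factor(yob)))
stage2 <- lm(log_wage ~ educ_hat + factor(yob))
b_two_step <- unname(coef(stage2)["educ_hat"])
se_naive <- summary(stage2)$coefficients["educ_hat", "Std. Error"]

cat("2SLS (matrix):", round(b_2sls, 4), " SE:", round(se_2sls, 4), "\n")
cat("2SLS (two lm):", round(b_two_step, 4), " naive SE:", round(se_naive, 4), "\n")
```

The two point estimates agree, but the second-stage `lm()` standard error is computed from the wrong residuals. That is the main reason to prefer a dedicated routine such as `ivreg()` for inference.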

Concept Check

In IV estimation, what does the exclusion restriction require for the quarter-of-birth instrument?


Step 5: Compare with Published Results

cat("=== Comparison with Angrist & Krueger (1991) ===\n")
cat("Published OLS: ~0.070\n")
cat("Published 2SLS: ~0.089\n")
cat("Our OLS:", round(coef(m_ols)["educ"], 4), "\n")
cat("Our 2SLS:", round(coef(m_2sls)["educ"], 4), "\n")

Expected output:

Statistic                  Published (AK 1991)                  Our Replication
OLS return to education    ~0.070                               ~0.109
2SLS return to education   ~0.089                               ~0.080–0.100
First-stage F              ~1.6 (many interacted instruments)   ~3–6 (3 QOB dummies)
IV > OLS?                  Yes                                  Varies by seed
N                          ~330,000                             50,000

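Our simulated results capture some, but not all, of the qualitative patterns of the original paper: (1) quarter of birth shifts schooling in the expected direction, (2) the first-stage F is low, and (3) the 2SLS estimate is far less precise than OLS. The published ordering IV > OLS is not guaranteed here, because the simulated OLS estimate is biased upward by ability and is not attenuated by measurement error. Quantitative differences arise because we use a smaller simulated sample with a simplified DGP.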


Summary

Our replication confirms the central findings and limitations of Angrist and Krueger (1991):

  1. Quarter of birth predicts years of schooling, but the relationship is weak. The first-stage F-statistic is low, making this a textbook example of the weak instruments problem.

  2. In the published paper, the IV estimate of returns to education exceeds OLS. This pattern is consistent with the LATE interpretation: compulsory schooling laws affect marginal students who may have higher returns to additional education. In our simulation, the relative magnitudes may differ because the simulated DGP does not include measurement error attenuation.

  3. Weak instruments are a serious concern. Bound et al. (1995) showed that with many weak instruments, 2SLS can be severely biased toward OLS. A common rule of thumb is a first-stage F > 10 (Stock & Yogo, 2005); when that fails, use weak-instrument-robust inference such as the Anderson-Rubin test.

  4. The exclusion restriction is debatable. If quarter of birth affects outcomes through channels other than schooling (e.g., birth season effects on health or family resources), the instrument is invalid.
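The Anderson-Rubin approach mentioned in point 3 can be sketched in a few lines. For a hypothesized return `rho0`, it tests whether the instruments predict `log_wage - rho0 * educ`; the confidence set collects the values that are not rejected. This is a minimal base-R sketch under homoskedasticity, with a hypothetical helper `ar_pvalue()` and an arbitrary grid, and it regenerates the Step 1 data so it runs on its own.

```r
# Re-create the simulated data from Step 1 (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob - 1930) + rnorm(n, 0, 0.5)
q1 <- as.integer(qob == 1)
q2 <- as.integer(qob == 2)
q3 <- as.integer(qob == 3)

# AR test of H0: rho = rho0. Under H0 (and a valid exclusion restriction),
# the instruments should not predict log_wage - rho0 * educ.
ar_pvalue <- function(rho0) {
  y0 <- log_wage - rho0 * educ
  restricted   <- lm(y0 ~ factor(yob))
  unrestricted <- lm(y0 ~ q1 + q2 + q3 + factor(yob))
  anova(restricted, unrestricted)[2, "Pr(>F)"]
}

# Invert the test over a grid of candidate returns (grid choice is arbitrary)
grid  <- seq(-0.10, 0.30, by = 0.01)
pvals <- sapply(grid, ar_pvalue)
ar_set <- grid[pvals > 0.05]

if (length(ar_set) > 0) {
  cat("95% AR confidence set on the grid spans:", range(ar_set), "\n")
} else {
  cat("No grid value is accepted at the 5% level.\n")
}
```

Unlike the usual 2SLS interval, this set keeps correct coverage when the first stage is weak. It can be wide, can run to the edge of the grid, or can even be empty when the overidentifying restrictions are rejected.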


Extension Exercises

  1. LIML estimator. Limited Information Maximum Likelihood (LIML) is less biased than 2SLS with weak instruments. Estimate LIML and compare to 2SLS.

  2. Just-identified IV. Use only Q1 as a single instrument (just-identified case). Compare to the over-identified 2SLS with three instruments.

  3. Anderson-Rubin test. Implement the weak-instrument-robust Anderson-Rubin confidence interval and compare to the standard 2SLS confidence interval.

  4. Hausman test. Formally test whether OLS and IV estimates are statistically different using the Hausman (1978) test.

  5. Many instruments. Angrist and Krueger also used QOB x YOB interactions (30 instruments). Estimate 2SLS with many instruments and observe how the estimate moves toward OLS. This convergence illustrates the "many weak instruments" bias.
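As a possible starting point for exercise 5, the sketch below builds the 30 quarter-by-year instruments and computes the 2SLS point estimate through fitted values. It is base R only, regenerates the Step 1 data, and the names `fs_many` and `fs_ctrl` are illustrative. It is one way to set up the comparison, not the paper's exact specification.

```r
# Re-create the simulated data from Step 1 (same seed, same draw order)
set.seed(1991)
n <- 50000
qob <- sample(1:4, n, replace = TRUE)
yob <- sample(1930:1939, n, replace = TRUE)
ability <- rnorm(n)
educ <- pmin(pmax(round(12 + 2*ability - 0.10*(qob == 1) - 0.05*(qob == 2) +
                        0.05*(qob == 4) + rnorm(n, 0, 2.5), 1), 0), 20)
log_wage <- 4.5 + 0.08*educ + 0.15*ability + 0.02*(yob - 1930) + rnorm(n, 0, 0.5)
q1 <- as.integer(qob == 1)
q2 <- as.integer(qob == 2)
q3 <- as.integer(qob == 3)
yob_f <- factor(yob)

# First stage with a separate QOB effect in every birth year:
# 3 quarter dummies x 10 years = 30 excluded instruments
fs_many <- lm(educ ~ yob_f + yob_f:q1 + yob_f:q2 + yob_f:q3)
fs_ctrl <- lm(educ ~ yob_f)
n_excl  <- length(coef(fs_many)) - length(coef(fs_ctrl))
f_many  <- anova(fs_ctrl, fs_many)$F[2]

# 2SLS point estimate via fitted values (second-stage SEs are not valid)
educ_hat <- fitted(fs_many)
b_many <- unname(coef(lm(log_wage ~ educ_hat + yob_f))["educ_hat"])
b_ols  <- unname(coef(lm(log_wage ~ educ + yob_f))["educ"])

cat("Excluded instruments:", n_excl, "\n")
cat("First-stage F:       ", round(f_many, 3), "\n")
cat("2SLS (30 instr.):    ", round(b_many, 4), "\n")
cat("OLS:                 ", round(b_ols, 4), "\n")
```

In this DGP the quarter-of-birth effect does not vary by year, so the 27 extra instruments add noise but no signal. Comparing `b_many` with the three-instrument estimate and with `b_ols` over several seeds shows the drift toward OLS.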