MethodAtlas
Lab tutorial · 90 minutes

Lab: Interrupted Time Series from Scratch

Implement an interrupted time series analysis step by step. Simulate monthly data with an intervention, fit segmented regression, interpret level and slope changes, test for autocorrelation, and apply Newey-West standard errors.

Languages: Python, R, Stata
Dataset: Hospital admissions (simulated monthly data)

Overview

Interrupted Time Series (ITS) is a quasi-experimental design that evaluates the effect of an intervention by comparing the trend in an outcome before and after the intervention. It is one of the strongest non-experimental designs when the intervention occurs at a clearly defined point in time and affects an entire population simultaneously.

What you will learn:

  • How to set up and estimate a segmented regression for ITS
  • How to interpret the level change and slope change parameters
  • How to construct the counterfactual trajectory
  • How to test for and address autocorrelation (Durbin-Watson test)
  • How to compute Newey-West standard errors that are robust to autocorrelation
  • How to compute the cumulative effect at specific post-intervention time points

Prerequisites: OLS regression, basic understanding of time series concepts (trends, autocorrelation).


Step 1: Simulate Monthly Time Series Data

We simulate monthly hospital admissions over 8 years (96 months). A policy intervention occurs at month 49, reducing admissions immediately (level change) and altering the trend (slope change).

library(lmtest)
library(sandwich)

set.seed(42)
T_total <- 96       # 96 months (8 years)
T_interv <- 49      # Intervention at month 49

# Time variable
time <- 1:T_total

# Intervention indicator: 1 after intervention
D <- as.integer(time >= T_interv)

# Time since intervention (0 before, 1, 2, ... after)
time_after <- pmax(time - T_interv + 1, 0) * D

# True parameters
true_intercept   <- 500    # Baseline admissions at time 0
true_trend       <- -0.5   # Pre-intervention trend (slight decline)
true_level_change <- -30   # Immediate drop at intervention
true_slope_change <- -1.5  # Additional decline per month after intervention

# Generate data with autocorrelated errors (AR(1) with phi = 0.4)
phi <- 0.4
epsilon <- numeric(T_total)
epsilon[1] <- rnorm(1, sd = 10)
for (t in 2:T_total) {
  epsilon[t] <- phi * epsilon[t - 1] + rnorm(1, sd = 10)
}

# Outcome: admissions
admissions <- true_intercept + true_trend * time +
            true_level_change * D +
            true_slope_change * time_after + epsilon

df <- data.frame(time, admissions, D, time_after)

cat("Pre-intervention mean admissions:", round(mean(admissions[D == 0]), 1), "\n")
cat("Post-intervention mean admissions:", round(mean(admissions[D == 1]), 1), "\n")
cat("Intervention at month:", T_interv, "\n")

Expected output:

Period                             Mean Admissions
Pre-intervention (months 1–48)     ~475–485
Post-intervention (months 49–96)   ~400–430

The raw comparison of means shows a decline, but ITS allows us to separate the intervention effect from the pre-existing trend.


Step 2: Fit the Segmented Regression

The ITS segmented regression model is:

Y_t = beta_0 + beta_1 * time + beta_2 * D_t + beta_3 * time_after_t + epsilon_t

where beta_2 is the level change and beta_3 is the slope change at the intervention.

# Fit segmented regression (standard OLS)
its_model <- lm(admissions ~ time + D + time_after, data = df)
summary(its_model)

cat("\n=== ITS Estimates vs. True Values ===\n")
cat(sprintf("%-20s %10s %10s\n", "Parameter", "Estimate", "True"))
cat(sprintf("%-20s %10.3f %10.3f\n", "Intercept", coef(its_model)[1], true_intercept))
cat(sprintf("%-20s %10.3f %10.3f\n", "Pre-trend", coef(its_model)["time"], true_trend))
cat(sprintf("%-20s %10.3f %10.3f\n", "Level change", coef(its_model)["D"], true_level_change))
cat(sprintf("%-20s %10.3f %10.3f\n", "Slope change", coef(its_model)["time_after"], true_slope_change))

Expected output:

Parameter                 Estimate   True Value   Interpretation
Intercept (beta_0)        ~500       500          Baseline admissions at time 0
Pre-trend (beta_1)        ~-0.5      -0.5         Monthly decline before intervention
Level change (beta_2)     ~-30       -30          Immediate drop at intervention
Slope change (beta_3)     ~-1.5      -1.5         Additional monthly decline after intervention
Post-intervention slope   ~-2.0      -2.0         beta_1 + beta_3: total monthly decline after intervention

The level change (beta_2) captures the immediate effect of the intervention. The slope change (beta_3) captures how the intervention altered the trajectory. The post-intervention slope (beta_1 + beta_3) describes the total monthly change after the intervention.
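Since the lab lists Python among its languages, the same simulation and segmented regression can be reproduced as a cross-check using only the standard library, by solving the OLS normal equations directly. This is a minimal sketch (in practice `statsmodels` OLS would be the idiomatic choice); the variable names and seed are illustrative, not from the R code:

```python
import random

random.seed(42)
T_TOTAL, T_INTERV = 96, 49

# True parameters, matching the R simulation
B0, B1, B2, B3 = 500.0, -0.5, -30.0, -1.5
PHI, SIGMA = 0.4, 10.0

# AR(1) errors: e[t] = phi * e[t-1] + N(0, sigma)
eps = [random.gauss(0, SIGMA)]
for _ in range(1, T_TOTAL):
    eps.append(PHI * eps[-1] + random.gauss(0, SIGMA))

# Design matrix columns: intercept, time, D, time_after
rows, y = [], []
for t in range(1, T_TOTAL + 1):
    d = 1 if t >= T_INTERV else 0
    ta = (t - T_INTERV + 1) * d          # months since intervention (1, 2, ...)
    rows.append([1.0, t, d, ta])
    y.append(B0 + B1 * t + B2 * d + B3 * ta + eps[t - 1])

# Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination
k = 4
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
aug = [xtx[i] + [xty[i]] for i in range(k)]
for c in range(k):
    p = max(range(c, k), key=lambda r: abs(aug[r][c]))  # partial pivoting
    aug[c], aug[p] = aug[p], aug[c]
    for r in range(k):
        if r != c:
            f = aug[r][c] / aug[c][c]
            aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
beta = [aug[i][k] / aug[i][i] for i in range(k)]

print("intercept    %8.2f  (true  500)" % beta[0])
print("pre-trend    %8.2f  (true -0.5)" % beta[1])
print("level change %8.2f  (true  -30)" % beta[2])
print("slope change %8.2f  (true -1.5)" % beta[3])
```

The estimates should land within sampling error of the true values, just as in the R output above.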


Step 3: Construct the Counterfactual

The counterfactual is what would have happened without the intervention — the extrapolation of the pre-intervention trend.

# Counterfactual: predicted values if intervention had NOT occurred
# Set D = 0 and time_after = 0 for all time points
df$counterfactual <- coef(its_model)["(Intercept)"] +
                   coef(its_model)["time"] * df$time

# Predicted values (actual model)
df$predicted <- fitted(its_model)

# Plot
plot(df$time, df$admissions, pch = 16, cex = 0.5, col = "gray40",
   xlab = "Month", ylab = "Admissions",
   main = "ITS: Observed, Fitted, and Counterfactual")
lines(df$time, df$predicted, col = "blue", lwd = 2)
lines(df$time[df$D == 1], df$counterfactual[df$D == 1],
    col = "red", lwd = 2, lty = 2)
abline(v = T_interv - 0.5, lty = 3)
legend("topright", legend = c("Observed", "Fitted", "Counterfactual"),
     col = c("gray40", "blue", "red"), pch = c(16, NA, NA),
     lty = c(NA, 1, 2), lwd = c(NA, 2, 2))

# Effect at specific post-intervention time points
for (m in c(1, 12, 24, 48)) {
  effect <- coef(its_model)["D"] + coef(its_model)["time_after"] * m
  cat(sprintf("Effect at %d months post-intervention: %.1f admissions\n", m, effect))
}

Expected output:

Months Post-Intervention   Effect (admissions reduction)
1                          ~-31.5
12                         ~-48.0
24                         ~-66.0
48                         ~-102.0

The effect grows over time because the slope change adds to the level change. The counterfactual line shows where admissions would have been without the intervention.
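The cumulative-effect calculation in the R loop above reduces to a one-line formula. A Python illustration, plugging in the point estimates from the table (assumed here for illustration):

```python
level_change = -30.0   # beta_2, point estimate
slope_change = -1.5    # beta_3, point estimate

def effect(m):
    """Total intervention effect m months post-intervention."""
    return level_change + slope_change * m

for m in (1, 12, 24, 48):
    print(f"Effect at {m:2d} months: {effect(m):.1f}")
# Effect at  1 months: -31.5
# Effect at 12 months: -48.0
# Effect at 24 months: -66.0
# Effect at 48 months: -102.0
```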

Concept Check

In the ITS model Y_t = beta_0 + beta_1*time + beta_2*D + beta_3*time_after + e_t, what is the total effect of the intervention at 24 months after it was implemented?


Step 4: Test for Autocorrelation

Time series data are often autocorrelated — adjacent observations are not independent. Autocorrelation does not bias OLS coefficients, but it makes standard errors incorrect (usually too small).

# Durbin-Watson test for autocorrelation
dw_test <- dwtest(its_model)
cat("Durbin-Watson Test:\n")
cat("  DW statistic:", dw_test$statistic, "\n")
cat("  p-value:", dw_test$p.value, "\n")
cat("  (DW ~ 2: no autocorrelation; DW < 2: positive; DW > 2: negative)\n")

# Breusch-Godfrey test (more general: tests for higher-order AC)
bg_test <- bgtest(its_model, order = 4)
cat("\nBreusch-Godfrey Test (up to lag 4):\n")
cat("  LM statistic:", bg_test$statistic, "\n")
cat("  p-value:", bg_test$p.value, "\n")

# Examine residual autocorrelation
resid <- residuals(its_model)
acf_vals <- acf(resid, plot = FALSE, lag.max = 12)
cat("\nResidual ACF at lag 1:", round(acf_vals$acf[2], 3), "\n")
cat("Residual ACF at lag 2:", round(acf_vals$acf[3], 3), "\n")
cat("\nIf ACF at lag 1 is large, autocorrelation is present.\n")

Expected output:

Test                      Statistic    p-value   Interpretation
Durbin-Watson             ~1.2–1.5     < 0.05    Positive autocorrelation detected
Breusch-Godfrey (lag 4)   ~10–25       < 0.05    Autocorrelation present
ACF at lag 1              ~0.30–0.45   —         Substantial first-order autocorrelation

The DW statistic is below 2, confirming positive autocorrelation in the residuals. This finding is expected — we simulated AR(1) errors with phi = 0.4. The OLS coefficient estimates are still unbiased, but the standard errors are too small.
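The Durbin-Watson statistic itself is simple: the sum of squared first differences of the residuals over the sum of squared residuals, which is approximately 2*(1 - rho) for AR(1) residuals. A stdlib-only Python sketch on a simulated AR(1) series (a toy series, not the lab data):

```python
import random

def durbin_watson(resid):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Sanity check: AR(1) with rho = 0.4 should give DW near 2*(1 - 0.4) = 1.2
random.seed(1)
e = [random.gauss(0, 1)]
for _ in range(4999):
    e.append(0.4 * e[-1] + random.gauss(0, 1))

print(round(durbin_watson(e), 2))   # close to 1.2 for a long series
```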

Concept Check

Your ITS model shows significant autocorrelation in the residuals (DW = 1.3, BG p < 0.01). How does this affect your results?


Step 5: Apply Newey-West Standard Errors

Newey-West standard errors are robust to both heteroskedasticity and autocorrelation (HAC). They correct the standard errors without changing the point estimates.

# Newey-West standard errors
# Rule of thumb for bandwidth: floor(T^(1/3)) or floor(0.75 * T^(1/3))
bw <- floor(T_total^(1/3))
cat("Newey-West bandwidth (lags):", bw, "\n\n")

nw_vcov <- NeweyWest(its_model, lag = bw, prewhite = FALSE)
nw_test <- coeftest(its_model, vcov = nw_vcov)

cat("=== Comparison: OLS SEs vs. Newey-West SEs ===\n")
cat(sprintf("%-15s %10s %10s %10s\n", "Parameter", "Estimate", "OLS SE", "NW SE"))
params <- c("time", "D", "time_after")
for (p in params) {
  cat(sprintf("%-15s %10.3f %10.3f %10.3f\n",
      p, coef(its_model)[p],
      summary(its_model)$coef[p, "Std. Error"],
      nw_test[p, "Std. Error"]))
}

cat("\nNewey-West SEs are larger than OLS SEs,\n")
cat("reflecting the autocorrelation in the data.\n")

Expected output:

Parameter      Estimate   OLS SE   Newey-West SE   Ratio
Pre-trend      ~-0.50     ~0.08    ~0.12           ~1.5x
Level change   ~-30.0     ~3.5     ~5.5            ~1.6x
Slope change   ~-1.50     ~0.12    ~0.18           ~1.5x

The Newey-West standard errors are 40–60% larger than OLS standard errors, reflecting the positive autocorrelation that OLS ignores. The level change and slope change are still statistically significant with the corrected standard errors, but confidence intervals are wider.
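The full Newey-West estimator sandwiches a Bartlett-kernel-weighted covariance between (X'X)^-1 terms; the kernel-weighting idea is easiest to see for the standard error of a sample mean. A stdlib-only Python sketch (a toy AR(1) series, not the lab data; the function name is illustrative):

```python
import random

def nw_se_of_mean(x, lags):
    """Newey-West (Bartlett-kernel) standard error of the sample mean."""
    n = len(x)
    mean = sum(x) / n
    e = [xi - mean for xi in x]
    lrv = sum(ei * ei for ei in e) / n              # gamma_0
    for lag in range(1, lags + 1):
        w = 1 - lag / (lags + 1)                    # Bartlett weight, shrinks to 0
        gamma = sum(e[t] * e[t - lag] for t in range(lag, n)) / n
        lrv += 2 * w * gamma
    return (lrv / n) ** 0.5

random.seed(7)
x = [random.gauss(0, 1)]
for _ in range(1999):
    x.append(0.4 * x[-1] + random.gauss(0, 1))

n = len(x)
mean = sum(x) / n
naive = (sum((xi - mean) ** 2 for xi in x) / n / n) ** 0.5   # ignores autocorrelation
robust = nw_se_of_mean(x, lags=int(n ** (1 / 3)))            # rule-of-thumb bandwidth
print(f"naive SE {naive:.4f}  vs  Newey-West SE {robust:.4f}")
```

With positive autocorrelation the robust SE comes out larger than the naive one, mirroring the pattern in the regression table above.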


Step 6: Alternative — Prais-Winsten GLS Estimation

An alternative to Newey-West is to model the autocorrelation structure directly using Prais-Winsten GLS estimation.

# Prais-Winsten estimation (GLS for AR(1) errors)
library(prais)

pw_model <- prais_winsten(admissions ~ time + D + time_after,
                          data = df, index = "time")  # index: time variable
summary(pw_model)

cat("\n=== All Three Approaches ===\n")
cat(sprintf("%-20s %10s %10s %10s\n", "Method", "Level Chg", "SE", "Slope Chg"))
cat(sprintf("%-20s %10.2f %10.2f %10.3f\n", "OLS",
  coef(its_model)["D"], summary(its_model)$coef["D", "Std. Error"],
  coef(its_model)["time_after"]))
cat(sprintf("%-20s %10.2f %10.2f %10.3f\n", "Newey-West",
  coef(its_model)["D"], nw_test["D", "Std. Error"],
  coef(its_model)["time_after"]))
cat(sprintf("%-20s %10.2f %10.2f %10.3f\n", "Prais-Winsten",
  coef(pw_model)["D"], summary(pw_model)$coef["D", "Std. Error"],
  coef(pw_model)["time_after"]))

cat("\nEstimated AR(1) coefficient:", tail(pw_model$rho, 1), "\n")  # final iteration
cat("True AR(1) coefficient:", phi, "\n")

Expected output:

Method          Level Change   SE     Slope Change
OLS             ~-30.0         ~3.5   ~-1.50
Newey-West      ~-30.0         ~5.5   ~-1.50
Prais-Winsten   ~-30.0         ~5.0   ~-1.50

All three methods give similar point estimates. The Prais-Winsten and Newey-West standard errors are larger than OLS, properly accounting for autocorrelation. Prais-Winsten may be slightly more efficient because it models the error structure parametrically.
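Under the hood, Prais-Winsten quasi-differences the data: each observation becomes y_t - rho*y_{t-1}, and the first observation is rescaled by sqrt(1 - rho^2) rather than dropped (the rescaling is what distinguishes it from Cochrane-Orcutt). A Python sketch of the transform with rho taken as known (in practice rho is estimated iteratively, and the regressors are transformed the same way before running OLS):

```python
import math

def pw_transform(y, rho):
    """Prais-Winsten quasi-differencing for AR(1) errors with known rho."""
    out = [math.sqrt(1 - rho ** 2) * y[0]]   # first obs rescaled, not dropped
    out.extend(y[t] - rho * y[t - 1] for t in range(1, len(y)))
    return out

print(pw_transform([1.0, 2.0, 3.0], 0.5))
```

OLS on the transformed outcome and regressors is equivalent to GLS with AR(1) errors, which is why the Prais-Winsten standard errors in the table above are valid under autocorrelation.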


Step 7: Robustness Checks

# Robustness 1: Placebo test — move intervention to month 25
df$D_placebo <- as.integer(df$time >= 25)
df$time_after_placebo <- pmax(df$time - 25 + 1, 0) * df$D_placebo

# Only use pre-intervention data
placebo_model <- lm(admissions ~ time + D_placebo + time_after_placebo,
                  data = df[df$time < T_interv, ])
cat("=== Placebo Test (intervention at month 25) ===\n")
cat("Level change:", coef(placebo_model)["D_placebo"],
  "(should be ~0)\n")
cat("Slope change:", coef(placebo_model)["time_after_placebo"],
  "(should be ~0)\n")
cat("p-value (level):", summary(placebo_model)$coef["D_placebo", "Pr(>|t|)"],
  "\n")

# Robustness 2: Add quadratic time trend
its_quad <- lm(admissions ~ time + I(time^2) + D + time_after, data = df)
cat("\n=== Quadratic Time Trend ===\n")
cat("Level change (quadratic):", coef(its_quad)["D"], "\n")
cat("Level change (linear):", coef(its_model)["D"], "\n")

# Robustness 3: Exclude observations near the intervention
df_donut <- df[abs(df$time - T_interv) > 3, ]
its_donut <- lm(admissions ~ time + D + time_after, data = df_donut)
cat("\n=== Donut Specification (exclude 3 months around intervention) ===\n")
cat("Level change (donut):", coef(its_donut)["D"], "\n")

Expected output:

Robustness Check           Level Change   Significant?
Main model                 ~-30.0         Yes
Placebo (month 25)         ~0.0           No
Quadratic trend            ~-30.0         Yes
Donut (exclude 3 months)   ~-30.0         Yes

The placebo test should find no significant effect, confirming that the effect is specific to the true intervention date. The quadratic trend and donut specifications should give similar estimates, suggesting the result is robust to functional form and observations near the discontinuity.


Step 8: Exercises

Guided Exercise

Computing the Effect at a Specific Post-Intervention Time

You estimate an ITS model and find: pre-trend = -0.5 admissions/month, level change = -30 admissions, slope change = -1.5 admissions/month. You want to compute the intervention effect at 12 months post-intervention.

What is the total effect at 12 months post-intervention? (effect = level_change + slope_change * 12)

What is the post-intervention slope (total monthly change after intervention)? (answer = pre_trend + slope_change)

At how many months post-intervention will the total effect reach -60 admissions? (solve: -30 + (-1.5)*m = -60)
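A quick way to check your answers to the three questions, using the values from the prompt:

```python
pre_trend, level_change, slope_change = -0.5, -30.0, -1.5

effect_12 = level_change + slope_change * 12          # total effect at 12 months
post_slope = pre_trend + slope_change                 # post-intervention slope
months_to_60 = (-60 - level_change) / slope_change    # solve -30 - 1.5*m = -60

print(effect_12, post_slope, months_to_60)   # -48.0 -2.0 20.0
```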

Challenge Exercises

  1. Seasonality. Add monthly seasonal dummies to the model. In real hospital data, admissions often follow seasonal patterns. How do the ITS estimates change?

  2. Control series. If you have a control group that was not exposed to the intervention, add it as a comparison series (controlled ITS). This strengthens the design by controlling for contemporaneous shocks.

  3. Poisson ITS. When the outcome is a count variable, consider using Poisson or negative binomial regression instead of OLS. Refit the model using a GLM with a log link.


Key Takeaways

  • Interrupted time series is a powerful quasi-experimental design for evaluating interventions that affect an entire population at a known time point
  • The segmented regression model separates the intervention effect into a level change (immediate shift) and a slope change (altered trajectory)
  • The total effect at t months post-intervention is beta_2 + beta_3 * t, which grows over time if the slope change is non-zero
  • The counterfactual is the extrapolation of the pre-intervention trend — what would have happened without the intervention
  • Autocorrelation is common in time series data; it leaves OLS coefficients unbiased but makes conventional standard errors incorrect (typically too small)
  • Always test for autocorrelation (Durbin-Watson, Breusch-Godfrey) and use robust standard errors (Newey-West) or model the error structure (Prais-Winsten)
  • Robustness checks should include placebo tests, alternative functional forms, and donut specifications
  • ITS is strongest when the intervention date is clearly defined and there are no contemporaneous confounders — a control series strengthens the design