MethodAtlas
Design-Based · Established

Interrupted Time Series (ITS)

Estimates causal effects of interventions by modeling level and slope changes in a single unit's time series at the intervention point.

Quick Reference

When to Use
When you have a single treated unit (or group) with a long pre-intervention time series and no suitable control group, and the intervention date is known and sharp.
Key Assumption
The pre-intervention trend would have continued unchanged in the absence of the intervention. No concurrent events affect the outcome at the intervention time.
Common Mistake
Ignoring autocorrelation in the time series, which inflates t-statistics and produces false positives. Use Newey-West SEs or model the autocorrelation structure explicitly.
Estimated Time
2.5 hours

One-Line Implementation

Stata: itsa outcome, single trperiod(intervention_date) lag(1) posttrend
R: lm(y ~ time + intervention + time_since_intervention, data = df) |> coeftest(vcov = NeweyWest)
Python: smf.ols('y ~ time + intervention + time_since', data=df).fit(cov_type='HAC', cov_kwds={'maxlags': 4})


Motivating Example

A public health researcher wants to know whether a comprehensive smoking ban introduced in January 2010 reduced hospital admissions for acute coronary events. She collects monthly hospital admission counts from January 2004 through December 2015 -- six years before and six years after the ban.

Here is the problem: she cannot randomly assign the smoking ban to some months and not others. The ban was a single policy change applied to an entire jurisdiction at a specific point in time. There is no control group -- every hospital in the region was affected simultaneously.

She cannot simply compare the average admission rate before and after the ban, either. Hospital admissions were already declining over time due to secular trends in cardiovascular health, improvements in emergency medicine, and other public health initiatives. A naive before-after comparison would attribute the entire pre-existing downward trend to the ban, vastly overstating its effect.

The interrupted time series design solves this problem (Wagner et al., 2002). It uses the pre-intervention trend as a counterfactual -- projecting what would have happened without the ban -- and then tests whether the post-intervention data deviates from that projection. The deviation, if any, is attributed to the intervention.

Specifically, the researcher fits a model that allows both the level and the slope of the time series to change at the moment of the intervention. A sudden drop in admissions at the ban date indicates an immediate level change; a steeper post-ban decline indicates a gradual slope change. Both are policy-relevant: the level change captures the immediate effect, and the slope change captures whether the effect grows or diminishes over time.


A. Overview

What the ITS Design Does

The interrupted time series design estimates the causal effect of an intervention that occurs at a known point in time by modeling the outcome as a function of time, with a structural break at the intervention date. The standard segmented regression model is:

Y_t = \beta_0 + \beta_1 T_t + \beta_2 D_t + \beta_3 P_t + \varepsilon_t

where:

  • Y_t is the outcome at time t (e.g., monthly hospital admissions)
  • T_t is the time elapsed since the start of the series (1, 2, 3, ...)
  • D_t is a dummy variable equal to 1 after the intervention and 0 before
  • P_t is the time elapsed since the intervention (0 before; 1, 2, 3, ... after)

The four parameters have clear interpretations:

  • \beta_0: baseline level at T = 0
  • \beta_1: pre-intervention slope (the secular trend)
  • \beta_2: immediate level change at the intervention -- the jump (or drop) in the outcome the moment the policy takes effect
  • \beta_3: change in slope after the intervention -- the difference between the post-intervention trend and the pre-intervention trend
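These design variables are mechanical to construct. A minimal sketch in plain Python (generic names are this sketch's assumption; t_star denotes the last pre-intervention period, so D_t = 1 for t > t_star):

```python
def its_design(n_periods, t_star):
    """Build the three segmented-regression design variables."""
    T = list(range(1, n_periods + 1))             # T_t: time index 1..N
    D = [1 if t > t_star else 0 for t in T]       # D_t: post-intervention dummy
    P = [(t - t_star) * d for t, d in zip(T, D)]  # P_t: time since intervention
    return T, D, P

T, D, P = its_design(6, t_star=3)
# T = [1, 2, 3, 4, 5, 6]; D = [0, 0, 0, 1, 1, 1]; P = [0, 0, 0, 1, 2, 3]
```

Feeding these three columns (plus an intercept) into any OLS routine reproduces the model above.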

Level Change vs. Slope Change

Different interventions produce different patterns:

  • Level change only (\beta_2 \neq 0, \beta_3 = 0): the intervention causes an immediate, permanent shift. Example: a new billing code instantly changes how diagnoses are recorded.
  • Slope change only (\beta_2 = 0, \beta_3 \neq 0): the intervention has no immediate effect but gradually changes the trajectory. Example: a new medical guideline slowly changes physician behavior.
  • Both (\beta_2 \neq 0, \beta_3 \neq 0): the intervention causes an immediate shift and a change in trajectory. Example: a smoking ban immediately reduces exposure and also accelerates a downward trend as compliance increases.

How It Differs from Simple Before-After Comparison

A simple before-after comparison estimates E[Y_t | \text{post}] - E[Y_t | \text{pre}]. This comparison conflates the intervention effect with any pre-existing trend. ITS explicitly models the pre-trend and asks whether the post-intervention data deviates from what the pre-trend would have predicted. This feature is why ITS is sometimes called "the strongest quasi-experimental design when randomization is not possible."

When to Use ITS

  • Your intervention occurs at a single known time point applied to a population
  • You have a sufficient number of time points before and after the intervention (at least 8-12 per segment, ideally more) (Kontopantelis et al., 2015)
  • The pre-intervention trend is reasonably stable and estimable
  • No control group is available (though adding one strengthens the design -- see Section D)

When NOT to Use ITS

  • The intervention was phased in gradually with no clear start date
  • The outcome is measured at only a few time points before or after (use DiD instead)
  • Multiple major changes happened at the same time as the intervention
  • The pre-intervention trend is highly volatile or nonlinear, making extrapolation unreliable


B. Identification

For the ITS design to provide valid causal estimates, three key assumptions must hold (Lopez Bernal et al., 2017).

Assumption 1: Stable Pre-Intervention Trend

Plain language: The pre-intervention trend must be well-characterized and would have continued unchanged in the absence of the intervention. The counterfactual is the projection of the pre-trend into the post-period.

Formally: E[Y_t^{(0)} | t > T^*] = \beta_0 + \beta_1 t for all t > T^*, where Y_t^{(0)} is the potential outcome without the intervention and T^* is the intervention time.

This assumption is violated if the pre-intervention trend was nonlinear (e.g., admissions were already accelerating downward before the ban), if it was driven by a transient shock, or if there was a "regression to the mean" effect from a temporary spike just before the intervention.

Assumption 2: No Concurrent Events (History Threat)

Plain language: Nothing else that could affect the outcome happened at the same time as the intervention. If a new cardiac treatment was introduced in the same month as the smoking ban, the estimated effect of the ban is confounded.

Concurrent events are the most common threat to ITS validity. The researcher must carefully document the policy landscape and argue that no other plausible cause of the observed change coincided with the intervention.

Assumption 3: No Anticipation Effects

Plain language: Individuals, firms, or institutions did not change their behavior before the intervention in anticipation of it. If hospitals reduced admissions or smokers quit in the months leading up to the ban (because the ban was announced in advance), the pre-trend is contaminated and the level change at T^* is attenuated.

If anticipation is plausible, the researcher can:

  1. Move the intervention date earlier to the announcement date
  2. Exclude a "transition window" around the intervention
  3. Test for a structural break before the official date

When to Use

  1. A policy or event occurs at a single known date. Smoking bans, speed limit changes, new regulations, product launches, organizational restructurings -- any clearly dated intervention that applies to an entire population.

  2. You have a long time series before and after the intervention. At least 8 observations per segment, ideally 24+ for seasonal data (Kontopantelis et al., 2015).

  3. No suitable control group exists. ITS does not require a control group (though one helps). This flexibility makes it ideal for nationwide policies where everyone is treated.

  4. You want to separate immediate from gradual effects. The level and slope change parameters distinguish immediate shifts from long-term trend changes.

Do NOT Use ITS When:

  1. The intervention timing is ambiguous. If the policy was phased in over months or years, the sharp break assumed by segmented regression is inappropriate.

  2. You have very few time points. With 3-4 observations per segment, you cannot reliably estimate the pre-trend or the slope change. Consider DiD with panel data instead.

  3. The pre-trend is chaotic or nonlinear. If the outcome fluctuates wildly before the intervention, the linear pre-trend extrapolation is unreliable and the counterfactual is poorly identified.

  4. Multiple interventions overlap. If several policies changed simultaneously, ITS cannot disentangle their individual effects without strong additional assumptions.

Connection to Other Methods

The ITS design relates to several other causal inference methods:

  • Difference-in-Differences (DiD): DiD uses a control group to net out common time trends; ITS uses the pre-trend of the treated group as the counterfactual. When you add a control group to ITS, you get a controlled ITS (CITS), which is DiD with more flexible time trends. DiD is preferred when you have a good control group but few time points; ITS is preferred when you have many time points but no control group.

  • Regression Discontinuity (RDD): Both ITS and RDD exploit a discontinuity, but the running variable differs. In RDD, the running variable is a score that determines treatment (e.g., test scores above a cutoff). In ITS, the running variable is time. ITS can be thought of as "RDD in time" (Lopez Bernal et al., 2017).

  • Synthetic Control: When no single control group is available, synthetic control constructs a weighted combination of untreated units that matches the treated unit's pre-trend. ITS uses the treated unit's own pre-trend as the counterfactual. Synthetic control is preferred when you have a panel of potential control units; ITS is preferred when you have a single treated unit with a long time series.

  • Event Studies: Event study designs estimate dynamic treatment effects at multiple leads and lags around the intervention. ITS can be viewed as a parametric event study that constrains the pre- and post-effects to follow linear trends. Event studies are more flexible but require more data.


C. Visual Intuition

Compare three approaches to estimating the intervention effect. The naive pre-post difference ignores the pre-existing trend, the OLS trend model misses the slope change, and the segmented regression correctly captures both the level shift and the change in trajectory.


Why Segmented Regression? Three Estimators on the Same Data

DGP: Yₜ = 50 + 0.3·t − 5.0·Dₜ − 0.5·(t − t₀)·Dₜ + 2.0·εₜ, with intervention at t₀ = 24 and N = 48 periods.

[Figure: simulated monthly outcome over 48 periods with the intervention marked at t = 24, overlaying the naive pre-post fit (-3.31), the OLS-with-trend fit (-5.67), the segmented fit (-5.98), and the counterfactual projection.]

Estimation Results

Estimator | β̂ | SE | 95% CI | Bias
Naive pre-post | -3.315 | 0.924 | [-5.13, -1.50] | +1.685
OLS with trend | -5.674 | 1.825 | [-9.25, -2.10] | -0.674
Segmented regression | -5.982 | 1.322 | [-8.57, -3.39] | -0.982
True β | -5.000 | | |
Simulation parameters: N = 48 monthly observations; immediate shift in outcome at the intervention = -5.0; change in trend after the intervention = -0.5; standard deviation of the idiosyncratic error = 2.0.
Why the difference?

The naive pre-post estimator yields a level change of -3.31 (bias = +1.69). It ignores the pre-existing upward trend of 0.3 per period, so it attributes trend-driven changes to the intervention. OLS with a linear trend controls for the secular trend but assumes no slope change, yielding a level change of -5.67 (bias = -0.67). Because the true DGP includes a slope change, forcing a common slope across pre and post periods introduces bias. The segmented regression models both a level shift and a slope change, yielding a level change of -5.98 (bias = -0.98) and a slope change of -0.617. This is the correct specification for this DGP.
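The contrast between the naive and segmented estimators can be reproduced in a minimal plain-Python sketch. Noise is omitted here so the biases are exact rather than sampled (the numbers therefore differ from the noisy simulation above), and the OLS solver is a bare normal-equations implementation written for self-containment, not a library call:

```python
def ols(X, y):
    """OLS via normal equations (X'X)b = X'y, solved by Gaussian elimination."""
    k, n = len(X[0]), len(y)
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    v = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p], v[c], v[p] = A[p], A[c], v[p], v[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            A[r] = [A[r][j] - f * A[c][j] for j in range(k)]
            v[r] -= f * v[c]
    b = [0.0] * k
    for c in range(k - 1, -1, -1):
        b[c] = (v[c] - sum(A[c][j] * b[j] for j in range(c + 1, k))) / A[c][c]
    return b

t0, n = 24, 48
T = list(range(1, n + 1))
D = [1 if t > t0 else 0 for t in T]
P = [(t - t0) * d for t, d in zip(T, D)]
y = [50 + 0.3 * t - 5.0 * d - 0.5 * p for t, d, p in zip(T, D, P)]  # noise-free DGP

pre  = [yi for yi, d in zip(y, D) if d == 0]
post = [yi for yi, d in zip(y, D) if d == 1]
naive = sum(post) / len(post) - sum(pre) / len(pre)  # ignores the pre-trend

b0, b1, b2, b3 = ols([[1, t, d, p] for t, d, p in zip(T, D, P)], y)
# naive = -4.05 (biased by the trend); segmented recovers b2 = -5.0, b3 = -0.5
```

With zero noise the segmented regression recovers the true level and slope changes exactly, while the naive pre-post difference remains biased by the secular trend.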


D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: The segmented regression model estimates both level and slope changes at the intervention point, with appropriate standard errors for autocorrelated time-series data.

Setup. Suppose we observe Y_t for t = 1, \ldots, N, with an intervention occurring at time T^*.

Step 1: Define the design variables.

  • T_t = t (time index)
  • D_t = \mathbb{1}(t > T^*) (post-intervention indicator)
  • P_t = (t - T^*) \cdot D_t (time since intervention, zero in the pre-period)

Step 2: Fit the model.

Y_t = \beta_0 + \beta_1 T_t + \beta_2 D_t + \beta_3 P_t + \varepsilon_t

Under the null hypothesis of no intervention effect, \beta_2 = 0 and \beta_3 = 0.

Step 3: Counterfactual construction. The predicted value at post-intervention time t without the intervention is:

\hat{Y}_t^{(0)} = \hat{\beta}_0 + \hat{\beta}_1 t

The predicted value with the intervention is:

\hat{Y}_t^{(1)} = \hat{\beta}_0 + \hat{\beta}_1 t + \hat{\beta}_2 + \hat{\beta}_3 (t - T^*)

The estimated effect at time t is the difference:

\hat{\tau}_t = \hat{\beta}_2 + \hat{\beta}_3 (t - T^*)

This effect grows (or shrinks) linearly over time if \hat{\beta}_3 \neq 0.
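A quick numeric sketch of the effect-over-time formula tau_hat = beta2 + beta3·(t − T*), using hypothetical coefficient values:

```python
def effect_at(k, beta2, beta3):
    """Estimated effect k periods after the intervention: beta2 + beta3 * k."""
    return beta2 + beta3 * k

# hypothetical estimates: level change -5.0, slope change -0.5 per period
effects = [effect_at(k, -5.0, -0.5) for k in (0, 6, 12)]
# the effect deepens linearly: [-5.0, -8.0, -11.0]
```

The level change is the effect at k = 0; the slope change compounds it linearly thereafter.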

Step 4: Autocorrelation-robust inference. Because Y_t is a time series, the errors \varepsilon_t are typically autocorrelated. OLS standard errors assume independence and will be too small, leading to false positives. Use Newey-West standard errors, GLS with an AR(1) error structure, or ARIMA-based approaches.
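To see the mechanics of the Newey-West correction, here is a minimal sketch of the long-run variance with Bartlett kernel weights, applied to an already-demeaned residual series (plain Python for illustration only; in practice use a packaged implementation such as sandwich::NeweyWest or statsmodels' HAC option):

```python
def nw_long_run_variance(e, L):
    """Newey-West estimate: gamma_0 + 2 * sum_l w_l * gamma_l,
    with Bartlett weights w_l = 1 - l/(L+1); e is assumed demeaned."""
    n = len(e)
    def gamma(l):  # lag-l autocovariance
        return sum(e[t] * e[t - l] for t in range(l, n)) / n
    return gamma(0) + 2 * sum((1 - l / (L + 1)) * gamma(l) for l in range(1, L + 1))

# positively autocorrelated residuals inflate the long-run variance above
# gamma_0 = 1, so iid-based standard errors would be too small
v_pos = nw_long_run_variance([1, 1, -1, -1], L=1)   # 1.25 > 1
v_neg = nw_long_run_variance([1, -1, 1, -1], L=1)   # 0.25 < 1
```

Positive serial correlation raises the long-run variance above the naive variance, which is exactly why ignoring it produces overconfident t-statistics.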


E. Implementation

Segmented Regression with Autocorrelation-Robust SEs

library(lmtest)
library(sandwich)

# ---- Step 1: Construct ITS variables ----
# Assume df has columns: month (Date, monthly) and admissions
df$time       <- 1:nrow(df)
df$post       <- as.integer(df$month >= as.Date("2010-01-01"))
df$time_since <- ifelse(df$post == 1,
                      df$time - min(df$time[df$post == 1]) + 1, 0)

# ---- Step 2: Fit segmented regression (OLS) ----
its_ols <- lm(admissions ~ time + post + time_since, data = df)
summary(its_ols)

# ---- Step 3: Newey-West HAC standard errors ----
# Bandwidth = floor(0.75 * N^(1/3)) is a common rule of thumb
bw <- floor(0.75 * nrow(df)^(1/3))
coeftest(its_ols, vcov = NeweyWest(its_ols, lag = bw,
                                  prewhite = FALSE))

# ---- Step 4: Check for autocorrelation ----
dwtest(its_ols)                  # Durbin-Watson test
bgtest(its_ols, order = 12)     # Breusch-Godfrey (up to lag 12)
acf(resid(its_ols), main = "ACF of Residuals")

# ---- Step 5: GLS with AR(1) errors (alternative) ----
library(nlme)
its_gls <- gls(admissions ~ time + post + time_since,
             data = df,
             correlation = corARMA(p = 1, q = 0))
summary(its_gls)

# ---- Step 6: Plot the ITS ----
plot(df$time, df$admissions, pch = 19, cex = 0.6,
   xlab = "Month", ylab = "Hospital Admissions",
   main = "Interrupted Time Series: Smoking Ban")
abline(v = min(df$time[df$post == 1]) - 0.5,
     lty = 2, col = "red", lwd = 2)

# Pre-intervention fitted line
pre <- df[df$post == 0, ]
lines(pre$time, predict(its_ols, pre), col = "blue", lwd = 2)

# Post-intervention fitted line
pst <- df[df$post == 1, ]
lines(pst$time, predict(its_ols, pst), col = "darkgreen", lwd = 2)

# Counterfactual projection
cf <- data.frame(time = pst$time, post = 0, time_since = 0)
lines(pst$time, predict(its_ols, cf),
    col = "blue", lwd = 2, lty = 3)
legend("topright",
     c("Pre-trend", "Post-trend", "Counterfactual"),
     col = c("blue", "darkgreen", "blue"),
     lty = c(1, 1, 3), lwd = 2)

F. Diagnostics

F.1 Durbin-Watson Test

The Durbin-Watson test checks for first-order autocorrelation in OLS residuals. The test statistic ranges from 0 to 4: values near 2 indicate no autocorrelation, values significantly below 2 indicate positive autocorrelation (common in time series), and values above 2 indicate negative autocorrelation.

  • In R: dwtest(model) from the lmtest package
  • In Stata: estat dwatson after regress
  • In Python: durbin_watson(results.resid) from statsmodels

If the Durbin-Watson statistic is far from 2, OLS standard errors are invalid and you must use Newey-West SEs, GLS, or an ARIMA-based approach.
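The statistic itself is simple to compute; a minimal sketch in plain Python (for illustration only -- use the packaged tests listed above in practice):

```python
def durbin_watson(resid):
    """DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2; about 2 under no AR(1),
    below 2 for positive autocorrelation, above 2 for negative."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)

dw_pos = durbin_watson([1, 1, 1, -1, -1, -1])  # slow drift -> well below 2
dw_neg = durbin_watson([1, -1, 1, -1, 1, -1])  # sign-flipping -> well above 2
```

Successive residuals that move together make the numerator small (DW below 2); residuals that flip sign make it large (DW above 2).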

F.2 Ljung-Box Test

The Ljung-Box test checks for autocorrelation at multiple lags simultaneously. Unlike Durbin-Watson (which only tests lag 1), Ljung-Box tests whether the first k autocorrelations are jointly zero. Multi-lag testing is especially important for seasonal data where autocorrelation may be present at lag 12 (monthly) or lag 4 (quarterly) even if lag 1 autocorrelation is mild.

  • In R: Box.test(resid(model), lag = 12, type = "Ljung-Box")
  • In Stata: estat bgodfrey, lags(1/12) (Breusch-Godfrey, similar purpose)
  • In Python: acorr_ljungbox(results.resid, lags=12)
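The Q statistic is also easy to sketch (plain Python for illustration; Q is compared against a chi-squared distribution with roughly k degrees of freedom, and the packaged tests above should be used in practice):

```python
def ljung_box_q(resid, lags):
    """Q = n(n+2) * sum_{k=1..lags} rho_k^2 / (n - k)."""
    n = len(resid)
    mean = sum(resid) / n
    e = [x - mean for x in resid]
    denom = sum(x * x for x in e)
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = sum(e[t] * e[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

q = ljung_box_q([1, -1, 1, -1], lags=2)  # strong lag-1 and lag-2 structure
```

A large Q relative to the chi-squared critical value rejects the null of jointly zero autocorrelations.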

F.3 Residual Autocorrelation Plots

Plot the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the model residuals. Under correct specification:

  • ACF should show no significant spikes beyond lag 0
  • PACF should show no significant spikes beyond lag 0

Significant spikes at lag 1 suggest AR(1) errors. Spikes at seasonal lags (12 for monthly data) suggest seasonality not captured by the model. Either add seasonal dummies or use seasonal ARIMA.

F.4 Seasonal Decomposition

For monthly or quarterly data, decompose the outcome into trend, seasonal, and residual components before fitting the ITS model. If strong seasonality is present, add seasonal dummies (month indicators) to the segmented regression:

Y_t = \beta_0 + \beta_1 T_t + \beta_2 D_t + \beta_3 P_t + \sum_{s=2}^{12} \gamma_s M_{st} + \varepsilon_t

where M_{st} are month dummies. Omitting seasonal controls when seasonality is present will bias the level-change estimate if the intervention happens to coincide with a seasonal peak or trough (Lopez Bernal et al., 2017).
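Constructing the month dummies is mechanical; a minimal sketch (plain Python, with January as the omitted reference month -- the choice of reference is this sketch's assumption):

```python
def month_dummies(months):
    """One row per observation: indicators M_2..M_12 (January omitted)."""
    return [[1 if m == s else 0 for s in range(2, 13)] for m in months]

rows = month_dummies([1, 2, 12])
# January -> all zeros (reference); February -> first dummy; December -> last
```

These 11 columns are appended to the segmented-regression design matrix alongside T_t, D_t, and P_t.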

F.5 Visual Inspection of the Fit

Always plot three elements:

  1. Raw data points -- the observed time series
  2. Fitted segments -- the pre- and post-intervention regression lines
  3. Counterfactual projection -- the dotted extension of the pre-trend into the post-period

The gap between the post-trend line and the counterfactual is the estimated intervention effect. If the data points do not track the fitted lines reasonably well, the linear specification may be wrong.

library(lmtest)

its_fit <- lm(admissions ~ time + post + time_since, data = df)

# F.1 Durbin-Watson
dwtest(its_fit)

# F.2 Breusch-Godfrey for higher-order autocorrelation
bgtest(its_fit, order = 12)

# F.3 ACF / PACF plots
par(mfrow = c(1, 2))
acf(resid(its_fit), lag.max = 24, main = "ACF of Residuals")
pacf(resid(its_fit), lag.max = 24, main = "PACF of Residuals")

# F.4 Seasonal decomposition (pre-intervention only)
pre_ts <- ts(df$admissions[df$post == 0], frequency = 12)
decomp <- decompose(pre_ts)
plot(decomp)

# F.5 Add seasonal dummies if needed
df$month_of_year <- factor(format(df$month, "%m"))
its_seasonal <- lm(admissions ~ time + post + time_since +
                   month_of_year, data = df)
bgtest(its_seasonal, order = 12)
Requires: lmtest

Reading the Coefficients

The four coefficients from the segmented regression have direct policy interpretations:

Parameter | Estimate | Interpretation
\hat{\beta}_0 | 320 | Estimated admissions at time 0 (intercept)
\hat{\beta}_1 | -0.35 | Pre-ban trend: admissions falling by 0.35/month
\hat{\beta}_2 | -11.2 | Immediate effect: admissions dropped by 11.2 at the ban date
\hat{\beta}_3 | -0.40 | Trend change: post-ban decline is 0.40/month steeper

The total effect at time k after the intervention is:

\hat{\tau}_k = \hat{\beta}_2 + \hat{\beta}_3 \cdot k = -11.2 + (-0.40) \cdot k

At 12 months post-ban: \hat{\tau}_{12} = -11.2 - 4.8 = -16.0 admissions per month relative to the counterfactual.

What to Report

A well-reported ITS analysis should include:

  1. Number of time points before and after the intervention
  2. Level change (\hat{\beta}_2) with confidence interval and p-value
  3. Slope change (\hat{\beta}_3) with confidence interval and p-value
  4. Pre-intervention trend (\hat{\beta}_1) to show what the counterfactual trajectory looks like
  5. Standard error type (Newey-West, GLS, etc.) and bandwidth if applicable
  6. Autocorrelation diagnostics (Durbin-Watson, ACF)
  7. A plot showing the raw data, fitted segments, and counterfactual projection

G. What Can Go Wrong

Threat 1: Concurrent Event Confounds the Intervention

Scenario shown (assumption holds): no other major cardiovascular policy changed at the same time as the smoking ban. Level change: -11.2 admissions/month (SE = 3.1, p < 0.001). Slope change: -0.4/month (SE = 0.15, p = 0.008). The ban is associated with an immediate drop and an accelerating decline in admissions. Had a concurrent event coincided with the ban, this estimate would absorb its effect.

Threat 2: Autocorrelation Ignored -- False Precision

Scenario shown (remedy applied): segmented regression with Newey-West HAC standard errors to account for serial correlation in monthly admissions data. Level change: -8.3 admissions/month, Newey-West SE = 4.2, 95% CI [-16.5, -0.1], p = 0.047. The effect is marginally significant with appropriately wide confidence intervals; naive OLS standard errors would overstate precision.

Threat 3: Short Pre-Period -- Unreliable Counterfactual

Scenario shown (assumption holds): ITS with 72 monthly observations before the intervention (6 years), providing a stable and well-estimated pre-trend. Pre-trend slope: -0.35 admissions/month (SE = 0.05). The pre-trend is precisely estimated with narrow confidence intervals, yielding a credible counterfactual projection. With only a handful of pre-period points, this projection would be far more uncertain.

Threat 4: Anticipation Effects Contaminate the Pre-Trend

Scenario shown (assumption holds): an unannounced policy change -- hospitals and patients had no advance knowledge of the smoking ban, so behavior did not change before the implementation date. Level change: -10.5 admissions/month. The discontinuity at the intervention date is sharp and clearly visible in the data. Pre-announcement would blur this discontinuity and attenuate the estimated level change.


H. Practice

H.1 Concept Checks

Concept Check

A researcher fits a segmented regression to monthly data and finds beta_2 = -15 (p < 0.001) and beta_3 = +1.2 (p = 0.03). What does this pattern tell you about the intervention effect?

Concept Check

You run a Durbin-Watson test on the residuals of your ITS segmented regression and get DW = 0.95. What does this imply, and what should you do?

Concept Check

A public health study uses ITS to evaluate a vaccination campaign launched in March 2015. The analysis uses monthly disease incidence from 2010-2019. The researcher finds a significant level drop at March 2015. A critic points out that a new diagnostic test was introduced in the same region in February 2015, which reduced reported disease cases by changing the diagnostic threshold. How should the researcher respond?

H.2 Guided Exercise

Guided Exercise

Interpreting ITS Output from a Smoking Ban Study

You evaluate the effect of a citywide smoking ban (effective January 2012) on monthly emergency room visits for respiratory complaints. You fit a segmented regression to 48 pre-intervention months (2008-2011) and 48 post-intervention months (2012-2015). Your output:

Variable | Coefficient | Newey-West SE | 95% CI | p-value
Intercept | 245.0 | 8.2 | [228.9, 261.1] | < 0.001
Time (pre-trend) | -0.50 | 0.12 | [-0.74, -0.26] | < 0.001
Post (level) | -18.3 | 5.6 | [-29.3, -7.3] | 0.001
Time_since (slope) | -0.65 | 0.22 | [-1.08, -0.22] | 0.004

Durbin-Watson = 1.82. Ljung-Box (12 lags) p = 0.31. N = 96 monthly observations (48 pre, 48 post).

What was the pre-ban trend in ER visits? Is it statistically significant?

What was the immediate effect of the ban? How confident are you in this estimate?

Did the ban change the trend in ER visits? What is the post-ban slope?

Is autocorrelation a concern? How do you know?

What is the total estimated effect of the ban at 12 months post-intervention?

H.3 Error Detective

Error Detective

Read the analysis below carefully and identify the errors.

A health policy researcher studies the effect of a hospital staffing mandate (enacted on July 1, 2018) on patient mortality rates. She collects quarterly mortality data from Q1 2016 to Q4 2020 (20 quarters: 10 pre, 10 post). She fits: lm(mortality ~ time + post + time_since, data = df). She reports: "The staffing mandate reduced mortality by 2.3 deaths per 1,000 admissions (SE = 0.8, p = 0.004). There was no significant slope change (p = 0.45)." She does not report autocorrelation diagnostics or seasonal controls for the quarterly data, and she does not discuss that a major electronic health records (EHR) system was implemented at the same hospital network in Q3 2018.

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

An education researcher evaluates a new math curriculum introduced at the start of the 2017-2018 school year. She collects annual standardized test scores for a single school district from 2010 to 2022 (8 pre-intervention years, 5 post-intervention years — 13 total observations). She fits a segmented regression and reports: "The new curriculum improved test scores by 4.2 points (SE = 1.5, p = 0.012). The slope change was +0.8 points per year (SE = 0.4, p = 0.06). We used Newey-West standard errors with bandwidth 3." She presents a plot showing the fitted pre-trend, post-trend, and counterfactual. The plot shows that the pre-trend line is estimated from 8 annual points and appears to fit the data well.

Select all errors you can find:

H.4 You Are the Referee

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

The authors evaluate the effect of a statewide opioid prescribing limit (enacted July 2019, capping initial prescriptions at 7 days) on monthly opioid-related emergency department visits. They collect monthly ED visit counts from January 2017 to December 2020 (30 pre-intervention months, 18 post-intervention months) for a single state. They fit a segmented regression with OLS and report a significant level drop of 42 visits per month (p = 0.003) and a non-significant slope change. They present a plot of the fitted model but do not report autocorrelation diagnostics.

Key Table

Variable | Coefficient | SE | p-value
Intercept | 312.0 | 14.5 | < 0.001
Time (pre-trend) | -1.80 | 0.42 | < 0.001
Post (level change) | -42.0 | 13.8 | 0.003
Time_since (slope) | +0.35 | 0.68 | 0.610

N = 48 months (30 pre-intervention, 18 post-intervention).

Authors' Identification Claim

The prescribing limit created a clear intervention point. The authors argue that the stable pre-trend validates the counterfactual projection, and that no other major opioid policy changes occurred in the state during the study period.


I. Swap-In: When to Use Something Else

  • Difference-in-Differences (DiD) with a control group: when you have a suitable comparison group not affected by the intervention but with a similar pre-trend. DiD-with-control is more credible than single-group ITS because it nets out time-varying confounders that affect both groups. Preferred when control groups are available, even if you have only a few time periods.

  • Synthetic Control: when you have one treated unit and a panel of untreated units that can be combined to match the treated unit's pre-trend. Synthetic control is the method of choice when a single jurisdiction adopts a policy and you have data on many non-adopting jurisdictions. More flexible than ITS because it does not assume a linear counterfactual.

  • ARIMA-based ITS: when the time series has strong autocorrelation, seasonality, or nonlinear trends that the simple segmented regression cannot capture. ARIMA models (e.g., intervention analysis via transfer functions) fit the autocorrelation structure directly rather than relying on post-hoc corrections. More complex to specify but handles autocorrelation better.


J. Reviewer Checklist

Critical Reading Checklist

  • Are there enough time points per segment (at least 8-12, more for seasonal data)?
  • Is the intervention date sharp, and are anticipation effects plausible?
  • Are autocorrelation diagnostics reported (Durbin-Watson, Ljung-Box, residual ACF)?
  • Do the standard errors account for serial correlation (Newey-West, GLS, or ARIMA)?
  • Is seasonality addressed for monthly or quarterly data?
  • Are concurrent events documented and ruled out?
  • Does a plot show the raw data, fitted segments, and counterfactual projection?

Paper Library

Foundational (2)

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference.

Houghton Mifflin

The definitive textbook on quasi-experimental designs, including a comprehensive treatment of interrupted time series. Discusses threats to validity (history, instrumentation, selection-maturation interaction) specific to ITS designs and provides guidance on when ITS is most credible.

Wagner, A. K., Soumerai, S. B., Zhang, F., & Ross-Degnan, D. (2002). Segmented Regression Analysis of Interrupted Time Series Studies in Medication Use Research.

Journal of Clinical Pharmacy and Therapeutics. DOI: 10.1046/j.1365-2710.2002.00430.x

Foundational paper formalizing segmented regression for ITS in health services research. Clearly specifies the model with level-change and slope-change parameters, discusses autocorrelation correction, and provides practical recommendations for minimum series length and model diagnostics.

Survey (4)

Lopez Bernal, J., Cummins, S., & Gasparrini, A. (2017). Interrupted Time Series Regression for the Evaluation of Public Health Interventions: A Tutorial.

International Journal of Epidemiology. DOI: 10.1093/ije/dyw098

Accessible tutorial on ITS regression for public health researchers. Covers the segmented regression model, autocorrelation diagnostics, Newey-West standard errors, and practical guidance on minimum number of time points. An excellent starting point for applied researchers.

Kontopantelis, E., Doran, T., Springate, D. A., Buchan, I., & Reeves, D. (2015). Regression Based Quasi-Experimental Approach When Randomisation Is Not an Option: Interrupted Time Series Analysis.

Practical guide to ITS analysis published in the BMJ. Covers model specification, autocorrelation testing, sensitivity analyses, and the addition of control series. Provides clear visual examples of level and slope changes and discusses common pitfalls.

Linden, A. (2015). Conducting Interrupted Time-Series Analysis for Single- and Multiple-Group Comparisons.

Introduces the itsa command in Stata for single- and multiple-group ITS analysis. Covers Newey-West standard errors for autocorrelation, Prais-Winsten estimation, and the extension to controlled ITS with a comparison group. A key reference for Stata users.

Lopez Bernal, J., Cummins, S., & Gasparrini, A. (2018). The Use of Controls in Interrupted Time Series Studies of Public Health Interventions.

International Journal of Epidemiology. DOI: 10.1093/ije/dyy135

Tutorial on extending ITS analysis with control groups to strengthen causal inference. Discusses controlled ITS (CITS) designs that combine the ITS framework with a comparison series, addressing the key threat of concurrent events confounding the intervention effect.

Tags

design-based · time-series · policy-evaluation