MethodAtlas
Method · intermediate · 12 min read
Panel · Established

Fixed Effects (Two-Way FE)

Removes time-invariant unobserved confounders by exploiting within-unit variation over time.

When to Use: When you have panel data and worry about unobserved time-invariant confounders that differ across units.
Assumption: Strict exogeneity, E[error | X_all_periods, unit_effect] = 0. All confounders are either observed or time-invariant; FE cannot remove time-varying unobservables.
Mistake: Claiming FE eliminates all confounders; it only removes time-invariant ones. Time-varying confounders remain. Also, not checking whether the key variable has sufficient within-unit variation.

One-Line Implementation

R: feols(y ~ x1 | unit_id + year, data = df, vcov = ~unit_id)
Stata: reghdfe y x1, absorb(unit_id year) vce(cluster unit_id)
Python: PanelOLS(df['y'], df[['x1']], entity_effects=True, time_effects=True).fit(cov_type='clustered', cluster_entity=True)


Motivating Example: The Effect of Unionization on Wages

You want to know whether joining a union raises a worker's wages. You have panel data tracking thousands of workers over ten years, observing their union status and wages each year. A simple cross-sectional OLS regression of wages on union membership is almost certainly biased: workers who join unions differ systematically from workers who do not — in ability, motivation, industry, occupation, and dozens of other ways you cannot fully measure.

But here is the key insight: with panel data, you can compare the same worker before and after they join a union. Whatever makes worker $i$ different from worker $j$ (ability, motivation, family background), those characteristics are constant over time for the same worker. The fixed effects model strips all of that time-invariant confounding out.

This within-unit comparison is the power of fixed effects: it exploits within-unit variation over time, effectively asking "what happens to the same worker's wages when their union status changes?" rather than "do workers in unions earn more than workers not in unions?"


A. Overview

The Panel Data Model

Consider a simple panel model:

$$Y_{it} = \beta X_{it} + \alpha_i + \lambda_t + \varepsilon_{it}$$

where:

  • $Y_{it}$ is the outcome (wages) for unit $i$ at time $t$
  • $X_{it}$ is the treatment/covariate of interest (union status)
  • $\alpha_i$ is a unit fixed effect: everything about unit $i$ that does not change over time (ability, gender, race, permanent personality traits)
  • $\lambda_t$ is a time fixed effect: everything that changes over time but affects all units the same way (macroeconomic conditions, inflation, policy changes)
  • $\varepsilon_{it}$ is the idiosyncratic error

What Fixed Effects Do

Including $\alpha_i$ (unit fixed effects) is equivalent to including a separate dummy variable for each unit. This inclusion absorbs all time-invariant differences between units, both observed and unobserved.

Including $\lambda_t$ (time fixed effects) absorbs all temporal shocks that are common across units.

Together, two-way fixed effects (TWFE) control for unit-level permanent differences and common time trends. The identifying variation that remains is within-unit, over-time variation that deviates from common trends.
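
To make the two-way logic concrete, here is a minimal Python sketch on a toy balanced panel. All numbers (unit effects, time effects, regressor values) are hypothetical, and noise is omitted so the arithmetic stays transparent. It subtracts unit means and time means and adds back the grand mean, which is one way to absorb both sets of effects in a balanced panel; it is an illustration only, not how fixest or reghdfe work internally.

```python
# Toy balanced panel: 3 units x 4 periods (all values hypothetical, no noise).
beta = 2.0
alpha = [10.0, -5.0, 3.0]       # unit effects alpha_i
lam = [0.0, 1.0, 4.0, 2.0]      # time effects lambda_t
x = [[1.0, 2.0, 0.0, 3.0],
     [4.0, 1.0, 2.0, 2.0],
     [0.0, 5.0, 1.0, 3.0]]
N, T = 3, 4
y = [[beta * x[i][t] + alpha[i] + lam[t] for t in range(T)] for i in range(N)]


def two_way_demean(z):
    """Subtract unit means and time means, then add back the grand mean."""
    unit_mean = [sum(z[i]) / T for i in range(N)]
    time_mean = [sum(z[i][t] for i in range(N)) / N for t in range(T)]
    grand = sum(unit_mean) / N
    return [[z[i][t] - unit_mean[i] - time_mean[t] + grand for t in range(T)]
            for i in range(N)]


xd = two_way_demean(x)
yd = two_way_demean(y)

# OLS slope on the transformed data (no intercept needed after demeaning).
num = sum(xd[i][t] * yd[i][t] for i in range(N) for t in range(T))
den = sum(xd[i][t] ** 2 for i in range(N) for t in range(T))
beta_hat = num / den
print(beta_hat)
```

Because both $\alpha_i$ and $\lambda_t$ cancel under this transformation, the slope recovers the assumed value of 2.0 up to floating-point error, even though the unit and time effects are large.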

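What Fixed Effects Do NOT Do

Fixed effects do not remove time-varying unobserved confounders, and they cannot estimate the effects of time-invariant variables, which are perfectly collinear with the unit dummies.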


B. Identification

The Math Behind "Demeaning"

The within transformation works by subtracting the unit-specific mean from each variable. Define:

$$\bar{Y}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} Y_{it}, \quad \bar{X}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} X_{it}$$

Subtracting the unit mean from the original equation (setting aside the time effect $\lambda_t$, which time dummies absorb in the same way):

$$(Y_{it} - \bar{Y}_i) = \beta(X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

The fixed effect $\alpha_i$ has vanished because its unit mean is $\alpha_i$ itself, so $\alpha_i - \alpha_i = 0$. What remains is the within-unit variation, which is why this approach is called the "within estimator."
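
The demeaning step can be sketched in a few lines of Python. The data below are hypothetical: three units with different numbers of periods ($T_i$ differs), unit effects that rise with $X$, and no noise. The helper names (`slope`, `demean`) are illustrative, not from any library. The sketch contrasts pooled OLS with the within estimator.

```python
# Hypothetical unbalanced panel: unit -> list of x values (T_i differs by unit).
beta = 0.5
alpha = {"a": 1.0, "b": 4.0, "c": 9.0}   # unobserved unit effects, correlated with x
x = {"a": [0.0, 1.0], "b": [2.0, 3.0, 4.0], "c": [5.0, 6.0, 7.0, 8.0]}
y = {i: [beta * v + alpha[i] for v in xs] for i, xs in x.items()}


def slope(xs, ys):
    """OLS slope of ys on xs, with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))


def demean(vals):
    """Subtract the unit-specific mean from each observation."""
    m = sum(vals) / len(vals)
    return [v - m for v in vals]


# Pooled OLS ignores the unit effects and picks up their correlation with x.
pooled = slope([v for i in x for v in x[i]], [v for i in x for v in y[i]])

# Within estimator: demean x and y unit by unit, then run OLS without intercept.
wx = [v for i in x for v in demean(x[i])]
wy = [v for i in x for v in demean(y[i])]
within = sum(a * b for a, b in zip(wx, wy)) / sum(a * a for a in wx)

print(pooled, within)
```

In this constructed example the pooled slope is well above the assumed 0.5 because high-$\alpha_i$ units also have high $X$, while the within slope recovers 0.5.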

The Key Assumption: Strict Exogeneity

For the FE estimator to be consistent, you need strict exogeneity:

$$E[\varepsilon_{it} \mid X_{i1}, X_{i2}, \ldots, X_{iT}, \alpha_i] = 0 \quad \forall \; t$$

This condition means the error at time $t$ is uncorrelated with the regressors in all time periods: past, present, and future. This restriction rules out feedback effects (where past outcomes affect future treatment). If the model includes a lagged dependent variable, strict exogeneity fails, and FE is biased in short panels (the Nickell bias; Nickell, 1981). The bias on the AR(1) coefficient is approximately $-(1+\rho)/(T-1)$, where $\rho$ is the true autoregressive parameter. When $\rho=0$, this simplifies to $-1/(T-1)$.
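
As a quick worked illustration of the approximation above, the following Python sketch evaluates $-(1+\rho)/(T-1)$ for a few panel lengths. The function name `nickell_bias` and the chosen values of $\rho$ and $T$ are illustrative assumptions; the formula is the approximation quoted in the text, not an exact finite-sample result.

```python
def nickell_bias(rho, T):
    """Approximate bias of the FE estimate of the AR(1) coefficient."""
    if T < 2:
        raise ValueError("need at least two periods")
    return -(1.0 + rho) / (T - 1)


# Hypothetical true autoregressive parameter rho = 0.5, several panel lengths.
for T in (5, 10, 30):
    print(T, round(nickell_bias(0.5, T), 3))
```

With $\rho = 0.5$, the approximate bias is $-0.375$ at $T = 5$ and shrinks to roughly $-0.05$ at $T = 30$, which is why the problem is mainly a short-panel concern.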

The Hausman Test: FE vs. RE

The Hausman test compares FE and random effects (RE) estimates. Under the null hypothesis that RE is consistent (i.e., the unit effect $\alpha_i$ is uncorrelated with the regressors), both FE and RE are consistent but RE is more efficient. Under the alternative, only FE is consistent.

$$H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'[\text{Var}(\hat{\beta}_{FE}) - \text{Var}(\hat{\beta}_{RE})]^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE}) \sim \chi^2_k$$

A significant test statistic rejects RE in favor of FE. In practice, this test requires both estimators to use the same variance estimator. When using cluster-robust or heteroscedasticity-robust standard errors, the standard Hausman test is invalid; use a robust version instead (Wooldridge, 2010).
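
For a single coefficient ($k = 1$), the Hausman statistic reduces to a ratio of scalars, which the Python sketch below computes. The function name `hausman_scalar` and all input numbers are hypothetical and chosen only for illustration; the sketch implements the classical (non-robust) form, so the caveat about robust standard errors applies to it as well.

```python
import math


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Classical Hausman statistic and p-value for one coefficient (k = 1)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("Var(FE) - Var(RE) must be positive in the classical test")
    h = (b_fe - b_re) ** 2 / diff_var
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical estimates and standard errors (squared to get variances).
h, p = hausman_scalar(b_fe=0.08, b_re=0.12, var_fe=0.02 ** 2, var_re=0.015 ** 2)
print(round(h, 2), round(p, 4))
```

With these made-up inputs the statistic is about 9.14 and the p-value is below 0.01, so RE would be rejected in favor of FE. With several coefficients, the same logic applies using the matrix form above and $k$ degrees of freedom.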


C. Visual Intuition

Imagine a scatterplot of wages (Y) vs. years of experience (X) for multiple workers. Without fixed effects, you would draw one regression line through all the data. But each worker has a different starting wage (some are high-ability, some low-ability), creating parallel clouds at different heights.

Fixed effects is like drawing a separate regression line for each worker (or equivalently, centering each worker's data on their own mean). You are no longer comparing high-ability workers to low-ability workers. You are looking at how each worker's own wages change when their experience (or union status) changes.

The slope you estimate is the average of these within-person slopes — purged of all between-person differences.


D. Mathematical Derivation

Don't worry about the notation yet — here's what this means in words: Subtracting unit-specific means from all variables removes the unobserved fixed effect, leaving only within-unit variation for identification.

Start with the model:

$$Y_{it} = X_{it}'\beta + \alpha_i + \varepsilon_{it}$$

Take the unit-specific time average:

$$\bar{Y}_i = \bar{X}_i'\beta + \alpha_i + \bar{\varepsilon}_i$$

Subtract:

$$\ddot{Y}_{it} = \ddot{X}_{it}'\beta + \ddot{\varepsilon}_{it}$$

where $\ddot{Y}_{it} = Y_{it} - \bar{Y}_i$ (the demeaned variable).

The OLS estimator on the demeaned data is:

$$\hat{\beta}_{FE} = \left(\sum_i \sum_t \ddot{X}_{it}\ddot{X}_{it}'\right)^{-1} \left(\sum_i \sum_t \ddot{X}_{it}\ddot{Y}_{it}\right)$$

Standard errors: Because demeaning introduces serial correlation in the transformed error $\ddot{\varepsilon}_{it}$, it is recommended to cluster standard errors at the unit level. This clustering also handles arbitrary within-unit heteroskedasticity and autocorrelation.

Degrees of freedom: FE estimation uses $n - k - N$ degrees of freedom, where $n$ is the total number of observations, $k$ is the number of regressors, and $N$ is the number of units. With many units and short panels, this correction matters.

Equivalence: The FE estimator is numerically identical to OLS with unit dummy variables (the Least Squares Dummy Variable estimator, LSDV). Software uses the within transformation for computational efficiency.
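
The LSDV equivalence can be checked numerically. The Python sketch below uses a small hypothetical panel (3 units, 3 periods, one regressor) and a hand-written Gaussian elimination helper, `solve`, so that it needs no external libraries. It is an illustration of the equivalence, not a recommended way to estimate FE models.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


# Hypothetical toy panel: (unit, x, y).
data = [(0, 1.0, 2.1), (0, 2.0, 2.9), (0, 4.0, 5.2),
        (1, 0.0, 4.0), (1, 3.0, 7.3), (1, 5.0, 8.8),
        (2, 2.0, 0.5), (2, 3.0, 1.8), (2, 6.0, 4.4)]
N = 3

# LSDV: regress y on x plus one dummy per unit (no separate intercept).
rows = [[x] + [1.0 if u == j else 0.0 for j in range(N)] for u, x, _ in data]
ys = [y for _, _, y in data]
k = N + 1
XtX = [[sum(r[a] * r[c] for r in rows) for c in range(k)] for a in range(k)]
Xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(k)]
beta_lsdv = solve(XtX, Xty)[0]

# Within estimator: demean x and y by unit, then OLS without intercept.
means = {}
for u in range(N):
    obs = [(x, y) for uu, x, y in data if uu == u]
    means[u] = (sum(x for x, _ in obs) / len(obs), sum(y for _, y in obs) / len(obs))
num = sum((x - means[u][0]) * (y - means[u][1]) for u, x, y in data)
den = sum((x - means[u][0]) ** 2 for u, x, y in data)
beta_within = num / den

print(beta_lsdv, beta_within)
```

The two slopes agree up to floating-point error. The dummy-variable route has to solve a system whose size grows with the number of units, which is why the within transformation is the practical choice for large panels.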


E. Implementation

# Requires: fixest, plm
# fixest: fast fixed-effects estimation with multi-way clustering (Berge)
library(fixest)

# --- Step 1: One-way fixed effects (unit FE only) ---
# feols() absorbs fixed effects via | worker_id (demeaning)
# Unit FE controls for all time-invariant worker characteristics (ability, etc.)
# vcov = ~worker_id: cluster SEs at the unit level to account for serial correlation
fe1 <- feols(wages ~ union + tenure + hours_worked | worker_id,
           data = df, vcov = ~worker_id)
summary(fe1)
# Coefficient on union: within-worker effect of joining/leaving a union

# --- Step 2: Two-way fixed effects (unit + time FE) ---
# Adding year FE controls for common time shocks (recessions, inflation, etc.)
# This prevents attributing common wage trends to changes in union status
fe2 <- feols(wages ~ union + tenure + hours_worked | worker_id + year,
           data = df, vcov = ~worker_id)
summary(fe2)

# --- Step 3: Hausman test (FE vs. RE) ---
# plm: panel linear models for R
library(plm)
pdata <- pdata.frame(df, index = c("worker_id", "year"))
fe_plm <- plm(wages ~ union + tenure + hours_worked, data = pdata, model = "within")
re_plm <- plm(wages ~ union + tenure + hours_worked, data = pdata, model = "random")
# H0: RE is consistent (unit effects uncorrelated with regressors)
# Rejection => use FE (unobserved heterogeneity is correlated with regressors)
phtest(fe_plm, re_plm)

F. Diagnostics

Is There Enough Within Variation?

If your key variable barely changes within units over time, FE will have very low power. Check the within variation:

  • In Stata: xtsum variable reports between and within standard deviations.
  • If the within SD is tiny relative to the between SD, FE is identifying off very little variation, and your estimates will be imprecise.
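
Outside Stata, the same check can be approximated by hand. The Python sketch below computes a between SD (across unit means) and a within SD (deviations from unit means, re-centered on the grand mean) for a hypothetical union-status panel. The data and variable names are illustrative, and the degrees-of-freedom conventions may differ slightly from what `xtsum` reports.

```python
import statistics

# Hypothetical panel: worker -> union status (0/1) in each year.
panel = {
    "w1": [0, 0, 1, 1],
    "w2": [1, 1, 1, 1],
    "w3": [0, 0, 0, 0],
    "w4": [0, 1, 1, 1],
}

all_vals = [v for vals in panel.values() for v in vals]
grand_mean = statistics.mean(all_vals)
unit_means = {u: statistics.mean(vals) for u, vals in panel.items()}

# Between SD: variation in unit means across workers.
between_sd = statistics.stdev(list(unit_means.values()))

# Within SD: variation around each worker's own mean.
within_dev = [v - unit_means[u] + grand_mean
              for u, vals in panel.items() for v in vals]
within_sd = statistics.stdev(within_dev)

# Share of workers whose status ever changes (the only ones identifying beta).
share_changers = sum(1 for vals in panel.values() if len(set(vals)) > 1) / len(panel)

print(round(between_sd, 3), round(within_sd, 3), share_changers)
```

Here two of the four workers never change status, so they contribute nothing to the FE estimate of the union coefficient; only the switchers identify it.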

F-Test for Joint Significance of Fixed Effects

Test whether the unit fixed effects are jointly significant. If not, pooled OLS may be preferred. In Stata: test after areg or check the F-stat from reghdfe.

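Cluster Your Standard Errors

Cluster standard errors at the unit level at minimum. Because errors within a unit are serially correlated, unclustered standard errors can be dramatically understated.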

Robustness to Functional Form

Report results with and without time fixed effects. If the coefficient changes dramatically when you add time FE, common time trends were confounding your estimate. Also consider adding unit-specific linear trends if you worry about differential trends.


Interpreting Your Results

  • The FE coefficient measures the within-unit effect: "when union status changes for a given worker, their wages change by $\hat{\beta}$ on average."
  • This coefficient is not the same as the cross-sectional comparison: "workers in unions earn $\hat{\beta}$ more than workers not in unions."
  • If the FE estimate is much smaller than the OLS estimate, it suggests that positive selection (higher-ability workers select into unions) was inflating the OLS coefficient.
  • Time fixed effects control for common shocks. If wages grew nationally during your sample period, time FE prevent you from attributing that growth to changes in union status.

G. What Can Go Wrong

| Problem | What It Does | How to Fix It |
|---|---|---|
| Time-varying confounders | FE does not remove them; estimates are biased | Add controls, use DiD design, or conduct sensitivity analysis |
| Nickell bias | Including lagged dependent variables with FE produces biased estimates in short panels | Use Arellano-Bond or long panels |
| Low within variation | Estimates are very imprecise, wide confidence intervals | Check xtsum; consider RE if the Hausman test supports it |
| Not clustering SEs | Dramatically understated standard errors | Cluster at the unit level (at minimum) |
| Cannot estimate time-invariant effects | Perfectly collinear with unit dummies | Use RE, Mundlak approach, or Hausman-Taylor |
| TWFE with staggered treatment | Biased if treatment effects are heterogeneous | Use modern estimators (Callaway & Sant'Anna, 2021; Sun-Abraham) |

Time-Varying Confounders Bias FE Estimates

Researcher includes controls for time-varying confounders (job changes, hours worked) alongside worker FE

FE estimate of union wage premium: 0.08 (SE = 0.02). Adding controls for concurrent job changes reduces the estimate to 0.06, suggesting some time-varying confounding. Both estimates are reported for transparency.


Nickell Bias from Lagged Dependent Variables

Panel has T = 30 periods and does not include a lagged dependent variable, or uses Arellano-Bond GMM for dynamic specifications

FE estimate of union premium: 0.08 (SE = 0.02). No lagged dependent variable is included. Specification with Arellano-Bond GMM for the dynamic model gives a similar estimate of 0.07.


Insufficient Within Variation

Key variable (union status) changes for 30% of workers during the panel, providing substantial within-variation for identification

FE estimate: 0.08 (SE = 0.02). The within standard deviation of union status is 0.35 (compared to between SD of 0.46). The estimate is precisely identified with adequate variation.

Concept Check

You estimate the effect of union membership on wages using (1) pooled OLS and (2) worker fixed effects. The OLS coefficient is 0.18 and the FE coefficient is 0.08. What is the most likely explanation for the difference?


H. Practice

Concept Check

A researcher includes worker fixed effects and claims her estimate is 'free of omitted variable bias.' She finds that workers who get promoted earn 15% more. A colleague points out that promotions coincide with relocations to higher-cost-of-living cities. Is the colleague's concern valid?

Concept Check

You want to estimate the effect of a worker's gender on wages using panel data. Can you include gender in a worker fixed effects model?

Concept Check

A study uses firm fixed effects and year fixed effects (two-way FE) to estimate the impact of adopting a new technology on firm productivity. The researcher has 500 firms over 20 years, but only 15 firms adopt the technology during the sample period. What concern should you raise?

Concept Check

A researcher adds lagged wages as a control variable in a worker fixed effects model with T = 5 periods. The coefficient on union membership drops from 0.08 to 0.02. What is the most likely explanation?

Referee Exercise

Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.

Paper Summary

A study examines whether CEO education (MBA vs. no MBA) affects firm performance. Using a panel of 2,000 firms over 15 years, the authors regress firm ROA on a CEO MBA dummy, controlling for firm size, industry, and year. They include firm fixed effects to control for unobserved firm characteristics.

Key Table

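| Variable | Coefficient | SE | p-value |
|---|---|---|---|
| CEO has MBA | 0.024 | 0.009 | 0.008 |
| Firm size (log) | 0.031 | 0.012 | 0.010 |
| Firm FE | Yes | | |
| Year FE | Yes | | |
| Clustered SEs | Firm level | | |
| N | 18,500 | | |
| R-squared (within) | 0.12 | | |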

Authors' Identification Claim

Firm fixed effects control for all time-invariant firm characteristics. Therefore, the coefficient on CEO MBA captures the causal effect of hiring an MBA-educated CEO.

Guided Exercise

Applying Fixed Effects: Teacher Effectiveness and Student Test Scores

A district administrator wants to know whether students taught by teachers with a master's degree score higher on standardized tests. She has panel data on 800 teachers observed across 5 years, and plans to estimate a model with teacher fixed effects and year fixed effects.

What does the teacher fixed effect absorb?

Why can she not estimate the effect of gender on test scores in this model?

What type of variation identifies the effect of earning a master's degree mid-career?

If teachers strategically pursue master's degrees when they expect to switch to higher-performing schools, is the estimate still valid?

Error Detective

Read the analysis below carefully and identify the errors.

A labor economist studies whether right-to-work (RTW) laws reduce wages. Using state-level panel data (50 states, 20 years), they estimate:

reghdfe avg_wage rtw_law unemployment_rate, absorb(state year) vce(robust)

They find: coefficient on RTW law = -0.034 (SE = 0.012, p = 0.005). They write: "After controlling for state and year fixed effects, right-to-work laws reduce average wages by 3.4%. Standard errors are robust to heteroskedasticity."

Select all errors you can find:

Error Detective

Read the analysis below carefully and identify the errors.

A management researcher studies whether firms that adopt enterprise resource planning (ERP) systems experience higher productivity. Using a panel of 3,000 manufacturing firms over 8 years, they run:

reghdfe log_productivity erp_adopted firm_size, absorb(firm_id) vce(cluster firm_id)

They report: "The coefficient on ERP adoption is 0.15 (SE = 0.04, p < 0.001). With firm fixed effects, we compare each firm to itself before and after ERP adoption, eliminating all confounders. We also include lagged productivity as a control to account for mean reversion." In a robustness check, they add lagged log_productivity to the model and find the coefficient drops to 0.04.

Select all errors you can find:


I. Swap-In: When to Use Something Else

  • Random Effects: When you believe unobserved unit effects are uncorrelated with regressors (Hausman test does not reject). More efficient than FE and can estimate time-invariant effects.
  • Correlated Random Effects (Mundlak): Add group means of time-varying regressors to the RE model. Gives FE-equivalent coefficients for time-varying variables while also estimating time-invariant effects.
  • First differencing: Instead of demeaning, take first differences: $\Delta Y_{it} = \beta \Delta X_{it} + \Delta \varepsilon_{it}$. This is equivalent to FE with two periods. With more periods, FE is more efficient when the errors are homoscedastic and serially uncorrelated, while first differencing is preferred when the errors follow a random walk. Testing for serial correlation in the first-differenced errors can help choose between the two (Wooldridge, 2010).
  • Arellano-Bond generalized method of moments (GMM): For dynamic panels (lagged dependent variable) where FE is biased. Uses past levels as instruments for first-differenced equations.
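
The two-period equivalence between first differencing and FE mentioned in the list above can be verified with a short Python sketch. The panel values are hypothetical, and the comparison is for the model without time effects, matching the first-difference equation as written (no intercept).

```python
# Hypothetical two-period panel: each entry is ((x1, y1), (x2, y2)) for one unit.
panel = [((0.0, 1.0), (1.0, 1.9)),
         ((1.0, 5.0), (1.0, 5.4)),
         ((0.0, 3.0), (2.0, 4.1)),
         ((1.0, 2.0), (0.0, 1.5))]

# First-difference estimator: OLS of delta-y on delta-x, no intercept.
dx = [second[0] - first[0] for first, second in panel]
dy = [second[1] - first[1] for first, second in panel]
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Within (FE) estimator: demean each unit's two observations, then OLS.
num = 0.0
den = 0.0
for first, second in panel:
    mean_x = (first[0] + second[0]) / 2
    mean_y = (first[1] + second[1]) / 2
    for x_val, y_val in (first, second):
        num += (x_val - mean_x) * (y_val - mean_y)
        den += (x_val - mean_x) ** 2
beta_fe = num / den

print(beta_fd, beta_fe)
```

With $T = 2$, each demeaned value is exactly half the first difference (with opposite signs in the two periods), so the two estimators coincide. Note that the unit whose $X$ does not change contributes nothing to either estimate.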



Paper Library

Foundational (7)

Chamberlain, G. (1980). Analysis of Covariance with Qualitative Data.

Review of Economic Studies. DOI: 10.2307/2297110

Chamberlain extends the fixed effects approach to nonlinear models like logit, showing how to condition out the fixed effects in discrete choice settings. This work is fundamental for researchers who need fixed effects in models where the dependent variable is binary or categorical.

Correia, S. (2017). Linear Models with High-Dimensional Fixed Effects: An Efficient and Feasible Estimator.

Working Paper

Correia develops an efficient iterative demeaning estimator for linear models with multiple high-dimensional fixed effects that scales to very large datasets. The estimator handles arbitrary numbers of fixed-effect dimensions and supports cluster-robust standard errors. Its implementation as the reghdfe Stata command has become the standard tool for applied researchers working with high-dimensional fixed effects in panel data.

de Chaisemartin, C., & D'Haultfoeuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.

American Economic Review. DOI: 10.1257/aer.20181169

De Chaisemartin and D'Haultfoeuille show that the TWFE estimator can assign negative weights to some treatment effects, potentially producing estimates with the wrong sign. They propose an alternative estimator and a decomposition that reveals which group-time effects receive negative weights.

Hausman, J. A. (1978). Specification Tests in Econometrics.

Econometrica. DOI: 10.2307/1913827

Hausman develops a general framework for specification testing based on comparing two estimators: one consistent under a broad set of assumptions and one efficient under a narrower null hypothesis. The test's most well-known application compares fixed effects (consistent if unit effects are correlated with regressors) against random effects (efficient under the null of no correlation), but the framework applies broadly to IV, simultaneous equations, and time-series cross-section models. The test statistic has a chi-squared distribution under the null and remains one of the most widely used diagnostic tools in applied econometrics.

Imai, K., & Kim, I. S. (2019). When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

American Journal of Political Science. DOI: 10.1111/ajps.12417

Imai and Kim provide a modern causal-inference framework for understanding when unit fixed effects regression yields unbiased estimates with longitudinal data. They clarify the often-implicit assumptions about treatment history and carryover effects, offering a more rigorous foundation for applied fixed effects analysis.

Mundlak, Y. (1978). On the Pooling of Time Series and Cross Section Data.

Econometrica. DOI: 10.2307/1913646

Mundlak shows that the fixed effects estimator can be understood as an OLS regression that includes the group means of all time-varying regressors. This 'correlated random effects' interpretation bridges the fixed effects and random effects models and clarifies exactly what assumption is being relaxed.

Nickell, S. (1981). Biases in Dynamic Models with Fixed Effects.

Econometrica. DOI: 10.2307/1911408

Nickell shows that including a lagged dependent variable in a fixed effects regression creates a bias that does not vanish as the number of cross-sectional units grows. This 'Nickell bias' is a critical concern for researchers using fixed effects in dynamic panel models with short time series.

Application (5)

Abowd, J. M., Kramarz, F., & Margolis, D. N. (1999). High Wage Workers and High Wage Firms.

Abowd, Kramarz, and Margolis use worker and firm fixed effects jointly to decompose wage variation into worker ability and firm pay premia in this landmark paper. The 'AKM' model has become the standard framework for studying labor market sorting, wage inequality, and the role of firms in wage-setting.

Bertrand, M., & Schoar, A. (2003). Managing with Style: The Effect of Managers on Firm Policies.

Quarterly Journal of Economics. DOI: 10.1162/003355303322552775

Bertrand and Schoar use manager fixed effects (tracking CEOs who moved between firms) to show that individual managerial 'style' explains a significant portion of the variation in corporate investment, financial, and organizational practices. This paper is a key reference linking fixed effects methods to management questions.

Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates.

American Economic Review. DOI: 10.1257/aer.104.9.2593

Chetty, Friedman, and Rockoff use teacher fixed effects (value-added models) and quasi-experimental validation to measure individual teachers' causal impacts on student outcomes. They demonstrate that teacher fixed effects capture real causal effects, not just selection, and their work has influenced education policy worldwide.

Freeman, R. B., & Medoff, J. L. (1984). What Do Unions Do?

Basic Books

Freeman and Medoff examine the effects of unions on wages, productivity, inequality, and workplace governance, drawing on a wide range of data sources and econometric methods including longitudinal analysis. The book argues that unions have both a monopoly face (raising wages above competitive levels) and a collective voice face (improving workplace communication and reducing turnover). It remains influential as a comprehensive empirical assessment of union effects and a common pedagogical motivation for fixed effects methods in labor economics.

Henderson, A. D., Miller, D., & Hambrick, D. C. (2006). How Quickly Do CEOs Become Obsolete? Industry Dynamism, CEO Tenure, and Company Performance.

Strategic Management Journal. DOI: 10.1002/smj.524

Henderson, Miller, and Hambrick study how CEO tenure affects performance in dynamic versus stable industries in this longitudinal strategy paper. In the stable food industry, performance improved steadily with tenure, declining only after 10-15 years; in the dynamic computer industry, performance declined steadily from the start. The paper demonstrates that the relationship between CEO tenure and performance is contingent on industry dynamism.

Survey (3)

Angrist, J. D., & Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion.

Princeton University Press. DOI: 10.1515/9781400829828

Angrist and Pischke write one of the most influential modern textbooks on applied econometrics, organizing the field around a design-based approach to causal inference. The book provides essential treatments of instrumental variables, difference-in-differences, and regression discontinuity, each grounded in the potential outcomes framework. It remains the standard reference for graduate students learning to evaluate and implement identification strategies.

Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and Applications.

Cambridge University Press. DOI: 10.1017/CBO9780511811241

Cameron and Trivedi cover panel data methods comprehensively in Chapter 21, including fixed effects, random effects, and dynamic panel models. A standard graduate-level reference for microeconometric methods.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data.

MIT Press

Wooldridge's graduate textbook is the standard reference for cross-section and panel data econometrics. Chapters 10-11 provide a thorough treatment of fixed effects, random effects, and related panel data methods, while later chapters cover general estimation methodology (MLE, GMM, M-estimation) with panel data applications throughout. The book covers both linear and nonlinear models with careful attention to assumptions.

Tags

panel, continuous-outcome, time-invariant-confounders