Fixed Effects (Two-Way FE)
Removes time-invariant unobserved confounders by exploiting within-unit variation over time.
One-Line Implementation
R (fixest): feols(y ~ x1 | unit_id + year, data = df, vcov = ~unit_id)
Stata: reghdfe y x1, absorb(unit_id year) vce(cluster unit_id)
Python (linearmodels): PanelOLS(df['y'], df[['x1']], entity_effects=True, time_effects=True).fit(cov_type='clustered', cluster_entity=True)
Motivating Example: The Effect of Unionization on Wages
You want to know whether joining a union raises a worker's wages. You have panel data tracking thousands of workers over ten years, observing their union status and wages each year. A simple cross-sectional OLS regression of wages on union membership is almost certainly biased: workers who join unions differ systematically from workers who do not — in ability, motivation, industry, occupation, and dozens of other ways you cannot fully measure.
But here is the key insight: with panel data, you can compare the same worker before and after they join a union. Whatever makes one worker different from another (ability, motivation, family background), those characteristics are treated as constant over time for the same worker. The fixed effects model strips all of that time-invariant confounding out.
This within-unit comparison is the power of fixed effects: it exploits within-unit variation over time, effectively asking "what happens to the same worker's wages when their union status changes?" rather than "do workers in unions earn more than workers not in unions?"
A. Overview
The Panel Data Model
Consider a simple panel model:
$$Y_{it} = \beta X_{it} + \alpha_i + \lambda_t + \varepsilon_{it}$$
where:
- $Y_{it}$ is the outcome (wages) for unit $i$ at time $t$
- $X_{it}$ is the treatment/covariate of interest (union status)
- $\alpha_i$ is a unit fixed effect: everything about unit $i$ that does not change over time (ability, gender, race, permanent personality traits)
- $\lambda_t$ is a time fixed effect: everything that changes over time but affects all units the same way (macroeconomic conditions, inflation, policy changes)
- $\varepsilon_{it}$ is the idiosyncratic error
What Fixed Effects Do
Including $\alpha_i$ (unit fixed effects) is equivalent to including a separate dummy variable for each unit. These dummies absorb all time-invariant differences between units, both observed and unobserved.
Including $\lambda_t$ (time fixed effects) absorbs all temporal shocks that are common across units.
Together, two-way fixed effects (TWFE) control for unit-level permanent differences and common time trends. The identifying variation that remains is within-unit, over-time variation that deviates from common trends.
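To make "deviations from common trends" concrete, here is a minimal NumPy sketch of the two-way within transformation on a balanced panel. It is an illustration rather than part of the original scripts: the simulated data, the effect size `beta = 0.08`, and the helper name `two_way_demean` are all assumptions. The outcome is generated without idiosyncratic noise so that the transformed regression recovers the coefficient up to floating-point error.

```python
import numpy as np

def two_way_demean(a):
    """Two-way within transformation for a balanced (N x T) array:
    subtract unit means and period means, then add back the grand mean."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

rng = np.random.default_rng(0)
n_units, n_periods, beta = 50, 6, 0.08     # hypothetical panel and effect size

alpha = rng.normal(size=(n_units, 1))      # unit effects
lam = rng.normal(size=(1, n_periods))      # time effects
# The regressor is correlated with both sets of effects
x = rng.normal(size=(n_units, n_periods)) + 0.5 * alpha + 0.5 * lam
y = beta * x + alpha + lam                 # noise-free for clarity

x_dd, y_dd = two_way_demean(x), two_way_demean(y)
beta_twfe = (x_dd * y_dd).sum() / (x_dd ** 2).sum()
print(round(beta_twfe, 6))
```

The closed-form double demeaning shown here only applies to balanced panels; with unbalanced data, software typically iterates between the two sets of means instead.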
What Fixed Effects Do NOT Do
Common Confusions
B. Identification
The Math Behind "Demeaning"
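The within transformation works by subtracting the unit-specific mean from each variable. Define:
$$\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}, \qquad \bar{X}_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}, \qquad \bar{\varepsilon}_i = \frac{1}{T}\sum_{t=1}^{T} \varepsilon_{it}, \qquad \bar{\lambda} = \frac{1}{T}\sum_{t=1}^{T} \lambda_t$$
Subtracting the unit mean from the original equation $Y_{it} = \beta X_{it} + \alpha_i + \lambda_t + \varepsilon_{it}$:
$$Y_{it} - \bar{Y}_i = \beta (X_{it} - \bar{X}_i) + (\lambda_t - \bar{\lambda}) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$
The unit fixed effect has vanished, because the time average of $\alpha_i$ is $\alpha_i$ itself, so $\alpha_i - \alpha_i = 0$. The remaining time term is picked up by the time fixed effects. What remains is the within-unit variation, which is why this approach is called the "within estimator."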
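The following simulation is a minimal sketch of why the demeaning step matters. It compares pooled OLS with the within estimator when an unobserved, time-invariant trait drives both union membership and wages. Everything in it is hypothetical: the sample sizes, the true effect of 0.08, the ability loading of 0.3, and the helper `ols_slope` are illustrative assumptions, not the original analysis.

```python
import numpy as np

rng = np.random.default_rng(42)
n_workers, n_years, beta = 2000, 5, 0.08   # hypothetical sizes and true effect

ability = rng.normal(size=(n_workers, 1))  # alpha_i: unobserved, constant over time
# Higher-ability workers are more likely to be union members in any given year
union = (ability + rng.normal(size=(n_workers, n_years)) > 0).astype(float)
wages = beta * union + 0.3 * ability + rng.normal(scale=0.1, size=(n_workers, n_years))

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / (xc ** 2).sum()

# Pooled OLS ignores alpha_i, so the union coefficient absorbs the ability gap
beta_pooled = ols_slope(union.ravel(), wages.ravel())

# Within estimator: subtract each worker's own time average first
union_w = union - union.mean(axis=1, keepdims=True)
wages_w = wages - wages.mean(axis=1, keepdims=True)
beta_within = (union_w * wages_w).sum() / (union_w ** 2).sum()

print(f"pooled: {beta_pooled:.3f}, within: {beta_within:.3f}, true: {beta}")
```

In this setup the pooled estimate is pushed well above the true effect, while the within estimate stays close to it, because only workers whose status changes contribute to identification.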
The Key Assumption: Strict Exogeneity
For the FE estimator to be consistent, you need strict exogeneity:
$$E[\varepsilon_{it} \mid X_{i1}, \dots, X_{iT}, \alpha_i] = 0 \quad \text{for all } t$$
This condition means the error at time $t$ is uncorrelated with the regressors at all time periods: past, present, and future. This restriction rules out feedback effects (where past outcomes affect future treatment). If there are lagged dependent variables, strict exogeneity fails, and FE is biased in short panels (small $T$) (Nickell, 1981). The general formula for the bias on the AR(1) coefficient is approximately $-(1+\rho)/(T-1)$, where $\rho$ is the true autoregressive parameter. When $\rho = 0$, this simplifies to $-1/(T-1)$.
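A small simulation can illustrate the Nickell (1981) bias. The sketch below estimates an AR(1) coefficient by fixed effects in a short panel and prints it next to the commonly quoted large-N approximation $-(1+\rho)/(T-1)$. The panel dimensions, the value `rho = 0.5`, and the burn-in length are illustrative assumptions. The approximation is rough when $T$ is this small, so the simulated bias should be sizable and negative but need not match it closely.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, rho = 2000, 5, 0.5    # hypothetical short panel, true AR(1) parameter
burn_in = 50                               # discard early draws so y is close to stationary

alpha = rng.normal(size=n_units)           # unit fixed effects
y = np.zeros((n_units, burn_in + n_periods + 1))
for t in range(1, y.shape[1]):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_units)
y = y[:, burn_in:]                         # keep T + 1 observations per unit

y_lag, y_cur = y[:, :-1], y[:, 1:]         # T usable (lag, current) pairs per unit

# Within transformation applied to both the outcome and its lag
y_lag_w = y_lag - y_lag.mean(axis=1, keepdims=True)
y_cur_w = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_fe = (y_lag_w * y_cur_w).sum() / (y_lag_w ** 2).sum()

approx_bias = -(1 + rho) / (n_periods - 1)  # large-N approximation to the bias
print(f"true rho: {rho}, FE estimate: {rho_fe:.3f}, "
      f"simulated bias: {rho_fe - rho:.3f}, approximation: {approx_bias:.3f}")
```

Raising `n_units` does not shrink the bias here; only a longer panel (larger `n_periods`) does, which is the sense in which the problem is specific to short panels.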
The Hausman Test: FE vs. RE
The Hausman test compares FE and random effects (RE) estimates. Under the null hypothesis that RE is consistent (i.e., the unit effect is uncorrelated with the regressors), both FE and RE are consistent but RE is more efficient. Under the alternative, only FE is consistent.
A significant test statistic rejects RE in favor of FE. In practice, this test requires both estimators to use the same variance estimator. When using cluster-robust or heteroscedasticity-robust standard errors, the standard Hausman test is invalid; use a robust version instead (Wooldridge, 2010).
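For reference, the classical form of the test statistic can be written as follows. Here $K$ is a symbol introduced for this illustration (the number of time-varying coefficients being compared), and the variance estimates are the conventional non-robust ones.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\overset{a}{\sim}\; \chi^2_K \quad \text{under } H_0
```

The variance of the difference reduces to the difference of the variances only when RE is fully efficient under the null, which is why this form breaks down once robust or clustered variance estimators are used.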
C. Visual Intuition
Imagine a scatterplot of wages (Y) vs. years of experience (X) for multiple workers. Without fixed effects, you would draw one regression line through all the data. But each worker has a different starting wage (some are high-ability, some low-ability), creating parallel clouds at different heights.
Fixed effects is like drawing a separate regression line for each worker (or equivalently, centering each worker's data on their own mean). You are no longer comparing high-ability workers to low-ability workers. You are looking at how each worker's own wages change when their experience (or union status) changes.
The slope you estimate is the average of these within-person slopes — purged of all between-person differences.
D. Mathematical Derivation
Don't worry about the notation yet — here's what this means in words: Subtracting unit-specific means from all variables removes the unobserved fixed effect, leaving only within-unit variation for identification.
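Start with the model (suppressing the time effects $\lambda_t$ for clarity; they can be handled by adding year dummies to the regressors):
$$Y_{it} = \beta X_{it} + \alpha_i + \varepsilon_{it}$$
Take the unit-specific time average:
$$\bar{Y}_i = \beta \bar{X}_i + \alpha_i + \bar{\varepsilon}_i$$
Subtract:
$$\ddot{Y}_{it} = \beta \ddot{X}_{it} + \ddot{\varepsilon}_{it}$$
where $\ddot{X}_{it} = X_{it} - \bar{X}_i$ (the demeaned variable), and $\ddot{Y}_{it}$ and $\ddot{\varepsilon}_{it}$ are defined analogously.
The OLS estimator on the demeaned data is:
$$\hat{\beta}_{FE} = \left( \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{X}_{it}^2 \right)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{X}_{it} \ddot{Y}_{it}$$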
Standard errors: Because demeaning introduces serial correlation in the transformed error $\ddot{\varepsilon}_{it}$, it is recommended to cluster standard errors at the unit level. This clustering also handles arbitrary within-unit heteroskedasticity and autocorrelation.
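As a rough sketch of what unit-level clustering computes, the function below implements the cluster-robust sandwich formula for a single demeaned regressor. The function name `cluster_se` and the toy numbers are hypothetical, and no finite-sample correction is applied, so results from packaged routines (which usually add one) will differ somewhat.

```python
import numpy as np

def cluster_se(x_w, resid, unit_ids):
    """Cluster-robust (sandwich) standard error for a single demeaned regressor.

    x_w      : within-demeaned regressor, shape (n_obs,)
    resid    : residuals from the within regression, shape (n_obs,)
    unit_ids : cluster label for each observation, shape (n_obs,)
    No finite-sample correction is applied.
    """
    bread = (x_w ** 2).sum()
    # Sum the scores x * u within each cluster before squaring,
    # which allows arbitrary error correlation inside a unit
    _, idx = np.unique(unit_ids, return_inverse=True)
    scores = np.bincount(idx, weights=x_w * resid)
    meat = (scores ** 2).sum()
    return np.sqrt(meat) / bread

# Tiny hand-checkable example: two units, two periods each (already demeaned)
unit_ids = np.array([0, 0, 1, 1])
x_w = np.array([-1.0, 1.0, -1.0, 1.0])
y_w = np.array([-1.0, 1.0, -3.0, 3.0])

beta_hat = (x_w * y_w).sum() / (x_w ** 2).sum()   # 8 / 4 = 2
resid = y_w - beta_hat * x_w                      # [1, -1, -1, 1]
se = cluster_se(x_w, resid, unit_ids)             # sqrt(8) / 4
print(beta_hat, round(se, 4))
```

The key design choice is that scores are summed within a unit before being squared. A heteroskedasticity-robust estimator would square each observation's score separately and therefore ignore within-unit correlation.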
Degrees of freedom: FE estimation uses up $N$ degrees of freedom, where $N$ is the number of units. With many units and short panels, this correction matters.
Equivalence: The FE estimator is numerically identical to OLS with unit dummy variables (the Least Squares Dummy Variable estimator, LSDV). Software uses the within transformation for computational efficiency.
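The equivalence between the within estimator and LSDV can be checked numerically. The sketch below uses a small simulated panel (all sizes and parameter values are illustrative assumptions), estimates the slope once with explicit unit dummies and once after demeaning, and compares the two.

```python
import numpy as np

rng = np.random.default_rng(7)
n_units, n_periods = 30, 4                        # hypothetical small panel
unit = np.repeat(np.arange(n_units), n_periods)   # unit id for each row

alpha = rng.normal(size=n_units)[unit]
x = rng.normal(size=unit.size) + alpha            # regressor correlated with alpha_i
y = 0.08 * x + alpha + rng.normal(scale=0.5, size=unit.size)

# (1) LSDV: OLS of y on x plus one dummy per unit (no separate intercept)
dummies = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
design = np.column_stack([x, dummies])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
beta_lsdv = coef[0]

# (2) Within estimator: demean x and y by unit, then run OLS without dummies
def demean_by_unit(v):
    unit_means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - unit_means[unit]

x_w, y_w = demean_by_unit(x), demean_by_unit(y)
beta_within = (x_w * y_w).sum() / (x_w ** 2).sum()

print(np.isclose(beta_lsdv, beta_within))
```

The dummy-variable design matrix has one column per unit, so its size grows with the number of units, whereas demeaning only needs the unit means. That difference is the practical reason for preferring the within transformation.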
E. Implementation
# Requires: fixest, plm
# fixest: fast fixed-effects estimation with multi-way clustering (Berge)
library(fixest)
# --- Step 1: One-way fixed effects (unit FE only) ---
# feols() absorbs fixed effects via | worker_id (demeaning)
# Unit FE controls for all time-invariant worker characteristics (ability, etc.)
# vcov = ~worker_id: cluster SEs at the unit level to account for serial correlation
fe1 <- feols(wages ~ union + tenure + hours_worked | worker_id,
data = df, vcov = ~worker_id)
summary(fe1)
# Coefficient on union: within-worker effect of joining/leaving a union
# --- Step 2: Two-way fixed effects (unit + time FE) ---
# Adding year FE controls for common time shocks (recessions, inflation, etc.)
# This prevents attributing common wage trends to changes in union status
fe2 <- feols(wages ~ union + tenure + hours_worked | worker_id + year,
data = df, vcov = ~worker_id)
summary(fe2)
# --- Step 3: Hausman test (FE vs. RE) ---
# plm: panel linear models for R
library(plm)
pdata <- pdata.frame(df, index = c("worker_id", "year"))
fe_plm <- plm(wages ~ union + tenure + hours_worked, data = pdata, model = "within")
re_plm <- plm(wages ~ union + tenure + hours_worked, data = pdata, model = "random")
# H0: RE is consistent (unit effects uncorrelated with regressors)
# Rejection => use FE (unobserved heterogeneity is correlated with regressors)
phtest(fe_plm, re_plm)
F. Diagnostics
Is There Enough Within Variation?
If your key variable barely changes within units over time, FE will have very low power. Check the within variation:
- In Stata, the command xtsum variable reports between and within standard deviations.
- If the within SD is tiny relative to the between SD, FE is identifying off very little variation, and your estimates will be imprecise.
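Outside Stata, the same check can be approximated by hand. The sketch below computes overall, between, and within standard deviations with NumPy. The function name `between_within_sd` and the toy union histories are hypothetical, and the degrees-of-freedom conventions are a simplification, so the numbers may differ slightly from what xtsum prints.

```python
import numpy as np

def between_within_sd(values, unit_ids):
    """Return (overall, between, within) standard deviations of a panel variable.

    between: SD of the unit means across units
    within : SD of deviations from each unit's own mean, across all observations
    """
    values = np.asarray(values, dtype=float)
    _, idx = np.unique(unit_ids, return_inverse=True)
    unit_means = np.bincount(idx, weights=values) / np.bincount(idx)
    overall = values.std(ddof=1)
    between = unit_means.std(ddof=1)
    within = (values - unit_means[idx]).std(ddof=1)
    return overall, between, within

# Hypothetical union-status histories for three workers over four years
unit_ids = np.repeat([1, 2, 3], 4)
union = np.array([0, 0, 1, 1,    # worker 1 joins in year 3
                  1, 1, 1, 1,    # worker 2 is always a member
                  0, 0, 0, 0])   # worker 3 never joins

overall, between, within = between_within_sd(union, unit_ids)
print(f"overall={overall:.3f}, between={between:.3f}, within={within:.3f}")
```

Only worker 1 contributes within variation in this toy example. Workers 2 and 3 drop out of identification entirely, which is the situation the diagnostic is meant to flag.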
F-Test for Joint Significance of Fixed Effects
Test whether the unit fixed effects are jointly significant. If not, pooled OLS may be preferred. In Stata: test after areg or check the F-stat from reghdfe.
Cluster Your Standard Errors
Robustness to Functional Form
Report results with and without time fixed effects. If the coefficient changes dramatically when you add time FE, common time trends were confounding your estimate. Also consider adding unit-specific linear trends if you worry about differential trends.
Interpreting Your Results
- The FE coefficient measures the within-unit effect: "when union status changes for a given worker, their wages change by $\hat{\beta}$ on average."
- This coefficient is not the same as the cross-sectional comparison: "workers in unions earn more than workers not in unions."
- If the FE estimate is much smaller than the OLS estimate, it suggests that positive selection (higher-ability workers select into unions) was inflating the OLS coefficient.
- Time fixed effects control for common shocks. If wages grew nationally during your sample period, time FE prevent you from attributing that growth to changes in union status.
G. What Can Go Wrong
| Problem | What It Does | How to Fix It |
|---|---|---|
| Time-varying confounders | FE does not remove them; estimates are biased | Add controls, use DiD design, or conduct sensitivity analysis |
| Nickell bias | Including lagged dependent variables with FE produces biased estimates in short panels | Use Arellano-Bond or long panels |
| Low within variation | Estimates are very imprecise, wide confidence intervals | Check xtsum; consider RE if the Hausman test supports it |
| Not clustering SEs | Dramatically understated standard errors | Cluster at the unit level (at minimum) |
| Cannot estimate time-invariant effects | Perfectly collinear with unit dummies | Use RE, Mundlak approach, or Hausman-Taylor |
| TWFE with staggered treatment | Biased if treatment effects are heterogeneous | Use heterogeneity-robust estimators (e.g., Callaway & Sant'Anna, 2021, or Sun-Abraham) |
Time-Varying Confounders Bias FE Estimates
Researcher includes controls for time-varying confounders (job changes, hours worked) alongside worker FE
FE estimate of union wage premium: 0.08 (SE = 0.02). Adding controls for concurrent job changes reduces the estimate to 0.06, suggesting some time-varying confounding. Both estimates are reported for transparency.
Nickell Bias from Lagged Dependent Variables
Panel has T = 30 periods and does not include a lagged dependent variable, or uses Arellano-Bond GMM for dynamic specifications
FE estimate of union premium: 0.08 (SE = 0.02). No lagged dependent variable is included. Specification with Arellano-Bond GMM for the dynamic model gives a similar estimate of 0.07.
Insufficient Within Variation
Key variable (union status) changes for 30% of workers during the panel, providing substantial within-variation for identification
FE estimate: 0.08 (SE = 0.02). The within standard deviation of union status is 0.35 (compared to between SD of 0.46). The estimate is precisely identified with adequate variation.
You estimate the effect of union membership on wages using (1) pooled OLS and (2) worker fixed effects. The OLS coefficient is 0.18 and the FE coefficient is 0.08. What is the most likely explanation for the difference?
H. Practice
A researcher includes worker fixed effects and claims her estimate is 'free of omitted variable bias.' She finds that workers who get promoted earn 15% more. A colleague points out that promotions coincide with relocations to higher-cost-of-living cities. Is the colleague's concern valid?
You want to estimate the effect of a worker's gender on wages using panel data. Can you include gender in a worker fixed effects model?
A study uses firm fixed effects and year fixed effects (two-way FE) to estimate the impact of adopting a new technology on firm productivity. The researcher has 500 firms over 20 years, but only 15 firms adopt the technology during the sample period. What concern should you raise?
A researcher adds lagged wages as a control variable in a worker fixed effects model with T = 5 periods. The coefficient on union membership drops from 0.08 to 0.02. What is the most likely explanation?
Read the paper summary below and write a brief referee critique (2-3 sentences) of the identification strategy.
Paper Summary
A study examines whether CEO education (MBA vs. no MBA) affects firm performance. Using a panel of 2,000 firms over 15 years, the authors regress firm ROA on a CEO MBA dummy, controlling for firm size, industry, and year. They include firm fixed effects to control for unobserved firm characteristics.
Key Table
| Variable | Coefficient | SE | p-value |
|---|---|---|---|
| CEO has MBA | 0.024 | 0.009 | 0.008 |
| Firm size (log) | 0.031 | 0.012 | 0.010 |
| Firm FE | Yes | ||
| Year FE | Yes | ||
| Clustered SEs | Firm level | ||
| N | 18,500 | ||
| R-squared (within) | 0.12 |
Authors' Identification Claim
Firm fixed effects control for all time-invariant firm characteristics. Therefore, the coefficient on CEO MBA captures the causal effect of hiring an MBA-educated CEO.
Applying Fixed Effects: Teacher Effectiveness and Student Test Scores
A district administrator wants to know whether students taught by teachers with a master's degree score higher on standardized tests. She has panel data on 800 teachers observed across 5 years, and plans to estimate a model with teacher fixed effects and year fixed effects.
Read the analysis below carefully and identify the errors.
A labor economist studies whether right-to-work (RTW) laws reduce wages. Using state-level panel data (50 states, 20 years), they estimate:
reghdfe avg_wage rtw_law unemployment_rate, absorb(state year) vce(robust)
They find: coefficient on RTW law = -0.034 (SE = 0.012, p = 0.005). They write: "After controlling for state and year fixed effects, right-to-work laws reduce average wages by 3.4%. Standard errors are robust to heteroskedasticity."
Select all errors you can find:
Read the analysis below carefully and identify the errors.
A management researcher studies whether firms that adopt enterprise resource planning (ERP) systems experience higher productivity. Using a panel of 3,000 manufacturing firms over 8 years, they run:
reghdfe log_productivity erp_adopted firm_size, absorb(firm_id) vce(cluster firm_id)
They report: "The coefficient on ERP adoption is 0.15 (SE = 0.04, p < 0.001). With firm fixed effects, we compare each firm to itself before and after ERP adoption, eliminating all confounders. We also include lagged productivity as a control to account for mean reversion." In a robustness check, they add lagged log_productivity to the model and find the coefficient drops to 0.04.
Select all errors you can find:
I. Swap-In: When to Use Something Else
- Random Effects: When you believe unobserved unit effects are uncorrelated with regressors (Hausman test does not reject). More efficient than FE and can estimate time-invariant effects.
- Correlated Random Effects (Mundlak): Add group means of time-varying regressors to the RE model. Gives FE-equivalent coefficients for time-varying variables while also estimating time-invariant effects.
- First differencing: Instead of demeaning, take first differences: $Y_{it} - Y_{i,t-1} = \beta (X_{it} - X_{i,t-1}) + (\lambda_t - \lambda_{t-1}) + (\varepsilon_{it} - \varepsilon_{i,t-1})$, which also removes $\alpha_i$. This is equivalent to FE with two periods. With more periods, FE is more efficient under homoscedastic, serially uncorrelated errors, whereas first differencing is preferred if the errors follow a random walk. Testing for serial correlation in the first-differenced errors can help choose between the two (Wooldridge, 2010).
- Arellano-Bond generalized method of moments (GMM): For dynamic panels (lagged dependent variable) where FE is biased. Uses past levels as instruments for first-differenced equations.
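Two of the claims in the list above can be verified numerically on a two-period panel: first differencing reproduces the FE estimate, and so does the coefficient on the regressor in a Mundlak-style regression that adds the unit mean. The sketch below uses simulated data with hypothetical parameter values, omits time effects, and uses pooled OLS for the Mundlak step as a simplification of the RE version.

```python
import numpy as np

rng = np.random.default_rng(3)
n_units = 200                                     # hypothetical two-period panel
alpha = rng.normal(size=(n_units, 1))
x = rng.normal(size=(n_units, 2)) + alpha         # regressor correlated with alpha_i
y = 0.08 * x + alpha + rng.normal(scale=0.5, size=(n_units, 2))

# Fixed effects (within) estimator
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_fe = (x_w * y_w).sum() / (x_w ** 2).sum()

# First differences (no intercept, matching the one-way FE model)
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
beta_fd = (dx * dy).sum() / (dx ** 2).sum()

# Mundlak-style regression: pooled OLS of y on x and the unit mean of x
x_bar = np.repeat(x.mean(axis=1), 2)              # aligned with row-major ravel()
design = np.column_stack([np.ones(2 * n_units), x.ravel(), x_bar])
coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
beta_mundlak = coef[1]

print(np.allclose([beta_fd, beta_mundlak], beta_fe))
```

With more than two periods the first-difference and FE estimates generally diverge, while the Mundlak coefficient on the time-varying regressor continues to match FE.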
J. Reviewer Checklist
Critical Reading Checklist
Paper Library
Foundational (7)
Chamberlain, G. (1980). Analysis of Covariance with Qualitative Data.
Chamberlain extends the fixed effects approach to nonlinear models like logit, showing how to condition out the fixed effects in discrete choice settings. This work is fundamental for researchers who need fixed effects in models where the dependent variable is binary or categorical.
Correia, S. (2017). Linear Models with High-Dimensional Fixed Effects: An Efficient and Feasible Estimator.
Correia develops an efficient iterative demeaning estimator for linear models with multiple high-dimensional fixed effects that scales to very large datasets. The estimator handles arbitrary numbers of fixed-effect dimensions and supports cluster-robust standard errors. Its implementation as the reghdfe Stata command has become the standard tool for applied researchers working with high-dimensional fixed effects in panel data.
de Chaisemartin, C., & D'Haultfoeuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.
De Chaisemartin and D'Haultfoeuille show that the TWFE estimator can assign negative weights to some treatment effects, potentially producing estimates with the wrong sign. They propose an alternative estimator and a decomposition that reveals which group-time effects receive negative weights.
Hausman, J. A. (1978). Specification Tests in Econometrics.
Hausman develops a general framework for specification testing based on comparing two estimators: one consistent under a broad set of assumptions and one efficient under a narrower null hypothesis. The test's most well-known application compares fixed effects (consistent if unit effects are correlated with regressors) against random effects (efficient under the null of no correlation), but the framework applies broadly to IV, simultaneous equations, and time-series cross-section models. The test statistic has a chi-squared distribution under the null and remains one of the most widely used diagnostic tools in applied econometrics.
Imai, K., & Kim, I. S. (2019). When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data?
Imai and Kim provide a modern causal-inference framework for understanding when unit fixed effects regression yields unbiased estimates with longitudinal data. They clarify the often-implicit assumptions about treatment history and carryover effects, offering a more rigorous foundation for applied fixed effects analysis.
Mundlak, Y. (1978). On the Pooling of Time Series and Cross Section Data.
Mundlak shows that the fixed effects estimator can be understood as an OLS regression that includes the group means of all time-varying regressors. This 'correlated random effects' interpretation bridges the fixed effects and random effects models and clarifies exactly what assumption is being relaxed.
Nickell, S. (1981). Biases in Dynamic Models with Fixed Effects.
Nickell shows that including a lagged dependent variable in a fixed effects regression creates a bias that does not vanish as the number of cross-sectional units grows. This 'Nickell bias' is a critical concern for researchers using fixed effects in dynamic panel models with short time series.
Application (5)
Abowd, J. M., Kramarz, F., & Margolis, D. N. (1999). High Wage Workers and High Wage Firms.
Abowd, Kramarz, and Margolis use worker and firm fixed effects jointly to decompose wage variation into worker ability and firm pay premia in this landmark paper. The 'AKM' model has become the standard framework for studying labor market sorting, wage inequality, and the role of firms in wage-setting.
Bertrand, M., & Schoar, A. (2003). Managing with Style: The Effect of Managers on Firm Policies.
Bertrand and Schoar use manager fixed effects (tracking CEOs who moved between firms) to show that individual managerial 'style' explains a significant portion of the variation in corporate investment, financial, and organizational practices. This paper is a key reference linking fixed effects methods to management questions.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates.
Chetty, Friedman, and Rockoff use teacher fixed effects (value-added models) and quasi-experimental validation to measure individual teachers' causal impacts on student outcomes. They demonstrate that teacher fixed effects capture real causal effects, not just selection, and their work has influenced education policy worldwide.
Freeman, R. B., & Medoff, J. L. (1984). What Do Unions Do?.
Freeman and Medoff examine the effects of unions on wages, productivity, inequality, and workplace governance, drawing on a wide range of data sources and econometric methods including longitudinal analysis. The book argues that unions have both a monopoly face (raising wages above competitive levels) and a collective voice face (improving workplace communication and reducing turnover). It remains influential as a comprehensive empirical assessment of union effects and a common pedagogical motivation for fixed effects methods in labor economics.
Henderson, A. D., Miller, D., & Hambrick, D. C. (2006). How Quickly Do CEOs Become Obsolete? Industry Dynamism, CEO Tenure, and Company Performance.
Henderson, Miller, and Hambrick study how CEO tenure affects performance in dynamic versus stable industries in this longitudinal strategy paper. In the stable food industry, performance improved steadily with tenure, declining only after 10-15 years; in the dynamic computer industry, performance declined steadily from the start. The paper demonstrates that the relationship between CEO tenure and performance is contingent on industry dynamism.
Survey (3)
Angrist, J. D., & Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion.
Angrist and Pischke write one of the most influential modern textbooks on applied econometrics, organizing the field around a design-based approach to causal inference. The book provides essential treatments of instrumental variables, difference-in-differences, and regression discontinuity, each grounded in the potential outcomes framework. It remains the standard reference for graduate students learning to evaluate and implement identification strategies.
Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and Applications.
Cameron and Trivedi cover panel data methods comprehensively in Chapter 21, including fixed effects, random effects, and dynamic panel models. A standard graduate-level reference for microeconometric methods.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data.
Wooldridge's graduate textbook is the standard reference for cross-section and panel data econometrics. Chapters 10-11 provide a thorough treatment of fixed effects, random effects, and related panel data methods, while later chapters cover general estimation methodology (MLE, GMM, M-estimation) with panel data applications throughout. The book covers both linear and nonlinear models with careful attention to assumptions.