Bibliography
All papers referenced across Method Atlas, formatted in APA 7th edition.
380 references
- Abadie, A., & Gardeazabal, J. (2003). The Economic Costs of Conflict: A Case Study of the Basque Country. American Economic Review, 93(1), 113–132.
doi.org/10.1257/000282803321455188
Foundational · on synthetic control · tags: terrorism, Basque-Country, economic-costs
Abadie and Gardeazabal introduce the synthetic control idea in the context of estimating the economic costs of terrorism in the Basque Country. They construct a synthetic Basque Country from other Spanish regions and show that terrorism reduced GDP per capita by about 10 percentage points.
- Abadie, A., & Imbens, G. W. (2006). Large Sample Properties of Matching Estimators for Average Treatment Effects. Econometrica, 74(1), 235–267.
doi.org/10.1111/j.1468-0262.2006.00655.x
Foundational · on matching methods · tags: nearest-neighbor, large-sample-theory, variance-estimation
Abadie and Imbens derive the large-sample properties of nearest-neighbor matching estimators, showing that such estimators are not root-N consistent in general and do not attain the semiparametric efficiency bound. Their main practical contribution is a consistent analytical variance estimator that does not require nonparametric estimation of unknown functions. Bootstrap invalidity for matching is established separately in Abadie and Imbens (2008), and the bias-corrected matching estimator is developed in Abadie and Imbens (2011).
- Abadie, A., & Imbens, G. W. (2008). On the Failure of the Bootstrap for Matching Estimators. Econometrica, 76(6), 1537–1557.
Foundational · on matching methods · tags: bootstrap, matching-inference, variance-estimation
Abadie and Imbens show that the standard bootstrap is inconsistent for nearest-neighbor matching estimators with a fixed number of matches, even though these estimators are asymptotically normal. Researchers should use the analytical variance estimator from Abadie and Imbens (2006) instead of bootstrapping.
- Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program. Journal of the American Statistical Association, 105(490), 493–505.
doi.org/10.1198/jasa.2009.ap08746
Foundational · on synthetic control, synthetic difference in differences · tags: synthetic-control, tobacco-policy, California
Abadie, Diamond, and Hainmueller formalize and popularize the synthetic control method, which constructs a weighted combination of control units to approximate the counterfactual for a single treated unit. The application to California's Proposition 99 tobacco control program becomes the canonical example of the method.
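The core construction, nonnegative weights summing to one that make a weighted average of control units track the treated unit's pre-treatment path, can be sketched as follows. The data, the sparse "true" weights, and the choice of scipy's SLSQP solver are all illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)

# Made-up pre-treatment panel: 10 periods for 8 control units.
T0, J = 10, 8
X0 = rng.normal(size=(T0, J)) + np.linspace(0, 1, T0)[:, None]
true_w = np.array([0.5, 0.3, 0.2] + [0.0] * (J - 3))   # hypothetical weights
X1 = X0 @ true_w + rng.normal(scale=0.05, size=T0)     # treated unit's path

# Synthetic control weights: best pre-treatment fit subject to
# w >= 0 and sum(w) == 1.
def loss(w):
    return np.sum((X1 - X0 @ w) ** 2)

res = minimize(
    loss,
    x0=np.full(J, 1 / J),
    bounds=[(0, 1)] * J,
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],
    method="SLSQP",
)
w_hat = res.x
synthetic = X0 @ w_hat    # counterfactual path for the treated unit
```

Post-treatment gaps between the treated unit's actual outcomes and this synthetic path are then read as treatment effects.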
- Abadie, A., & Imbens, G. W. (2011). Bias-Corrected Matching Estimators for Average Treatment Effects. Journal of Business & Economic Statistics, 29(1), 1–11.
doi.org/10.1198/jbes.2009.07333
Foundational · on matching methods · tags: bias-correction, nearest-neighbor, regression-adjustment
Abadie and Imbens develop bias-corrected matching estimators that adjust for the finite-sample bias inherent in nearest-neighbor matching when matching is not exact. Their bias correction uses a regression adjustment within matched pairs and has become a standard recommendation for applied researchers using matching methods.
- Abadie, A., Diamond, A., & Hainmueller, J. (2015). Comparative Politics and the Synthetic Control Method. American Journal of Political Science, 59(2), 495–510.
Application · on synthetic control · tags: German-reunification, comparative-politics, permutation-test
Abadie, Diamond, and Hainmueller apply the synthetic control method to estimate the economic impact of German reunification, constructing a synthetic West Germany from OECD countries. They demonstrate the method's applicability to major political events and discuss its use in comparative politics as a bridge between quantitative and qualitative approaches, illustrating synthetic control's value for case studies in which only one unit is treated.
- Abadie, A., Athey, S., Imbens, G. W., & Wooldridge, J. M. (2020). Sampling-Based versus Design-Based Uncertainty in Regression Analysis. Econometrica, 88(1), 265–296.
Foundational · on OLS regression · tags: clustering, standard-errors, inference, research-design
Abadie et al. distinguish between sampling-based uncertainty (from drawing a sample from a population) and design-based uncertainty (from treatment assignment) in regression analysis. They show that conventional standard errors can be conservative when the sample includes a substantial fraction of the population, providing a rigorous framework for understanding what regression standard errors actually measure. This paper clarifies the conceptual foundations for inference in empirical work and complements their separate 2023 QJE paper on clustering.
- Abadie, A. (2021). Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects. Journal of Economic Literature, 59(2), 391–425.
Survey · on synthetic control · tags: survey, placebo-tests, methodology
Abadie provides a comprehensive methodological overview of synthetic control, covering data requirements, inference via placebo tests, extensions to multiple treated units, and common pitfalls. This paper is the authoritative practitioner's guide to the method.
- Abadie, A., Athey, S., Imbens, G. W., & Wooldridge, J. M. (2023). When Should You Adjust Standard Errors for Clustering? Quarterly Journal of Economics, 138(1), 1–35.
Foundational · on difference in differences · tags: clustering, standard-errors, inference, design-based
Abadie et al. provide guidance on when clustering standard errors is necessary. They show that clustering can be motivated by sampling-based uncertainty (e.g., two-stage sampling of clusters then units) or design-based uncertainty (e.g., treatment assigned at the cluster level), and that whether to cluster, and at what level, is a substantive question tied to the sampling and assignment process — not a purely mechanical rule.
- Abowd, J. M., Kramarz, F., & Margolis, D. N. (1999). High Wage Workers and High Wage Firms. Econometrica, 67(2), 251–333.
doi.org/10.1111/1468-0262.00020
Application · on fixed effects · tags: worker-fixed-effects, firm-fixed-effects, wage-decomposition
Abowd, Kramarz, and Margolis use worker and firm fixed effects jointly to decompose wage variation into worker ability and firm pay premia in this landmark paper. The 'AKM' model has become the standard framework for studying labor market sorting, wage inequality, and the role of firms in wage-setting.
- Acemoglu, D., Johnson, S., & Robinson, J. A. (2001). The Colonial Origins of Comparative Development: An Empirical Investigation. American Economic Review, 91(5), 1369–1401.
Application · on instrumental variables · tags: institutions, economic-development, colonial-history
Acemoglu, Johnson, and Robinson use historical settler mortality as an instrument for institutional quality to estimate the causal effect of institutions on economic development in this celebrated paper. It is one of the most influential IV applications in economics and demonstrates the creativity required to find a plausible instrument.
- Acharya, A., Blackwell, M., & Sen, M. (2016). Explaining Causal Findings Without Bias: Detecting and Assessing Direct Effects. American Political Science Review, 110(3), 512–529.
doi.org/10.1017/S0003055416000216
Foundational · on causal mediation analysis · tags: controlled-direct-effects, sequential-g-estimation, observational-studies, collider-bias
Acharya, Blackwell, and Sen develop a sequential g-estimation approach for estimating controlled direct effects in observational studies, addressing the problem that conditioning on a post-treatment mediator can introduce collider bias. Their method is particularly useful in political science and social science settings where intermediate confounders make standard mediation analysis unreliable.
- Acquisti, A., & Fong, C. M. (2020). An Experiment in Hiring Discrimination via Online Social Networks. Management Science, 66(3), 1005–1024.
doi.org/10.1287/mnsc.2018.3269
tags: audit-study, discrimination, social-media, hiring, field-experiment
Acquisti and Fong conduct a correspondence experiment using social media profiles to study hiring discrimination based on religion and sexual orientation. They find no significant national-level discrimination against Muslim or gay candidates, but significant anti-Muslim discrimination emerges in Republican-leaning areas. The paper illustrates how online information creates new channels for employment discrimination that vary with local attitudes.
- Adao, R., Kolesar, M., & Morales, E. (2019). Shift-Share Designs: Theory and Inference. Quarterly Journal of Economics, 134(4), 1949–2010.
Foundational · on shift share instruments · tags: inference, standard-errors, spatial-correlation
Adao, Kolesar, and Morales show that standard errors in shift-share regressions are too small when computed with conventional clustering because residuals are correlated across regions that share similar industry compositions. They propose an inference procedure that accounts for this dependence.
- Aguinis, H., Beaty, J. C., Boik, R. J., & Pierce, C. A. (2005). Effect Size and Power in Assessing Moderating Effects of Categorical Variables Using Multiple Regression: A 30-Year Review. Journal of Applied Psychology, 90(1), 94–107.
doi.org/10.1037/0021-9010.90.1.94
Application · on power analysis · tags: moderation, interaction-effects, applied-psychology
Aguinis, Beaty, Boik, and Pierce review 30 years of moderator analysis in applied psychology and management, finding that most studies are severely underpowered to detect interaction effects. They provide guidelines for computing power for moderated regression.
- Aguinis, H., Gottfredson, R. K., & Culpepper, S. A. (2013). Best-Practice Recommendations for Estimating Cross-Level Interaction Effects Using Multilevel Modeling. Journal of Management, 39(6), 1490–1528.
doi.org/10.1177/0149206313478188
tags: cross-level-interactions, multilevel, best-practices
Aguinis, Gottfredson, and Culpepper provide detailed guidance for management researchers on estimating cross-level interaction effects in multilevel models. They address common problems including insufficient statistical power, centering decisions, and effect size reporting that frequently lead to unreliable results in organizational research. The paper offers concrete recommendations for sample size, model specification, and interpretation that improve the credibility of multilevel interaction analyses.
- Aguinis, H., Edwards, J. R., & Bradley, K. J. (2017). Improving Our Understanding of Moderation and Mediation in Strategic Management Research. Organizational Research Methods, 20(4), 665–685.
doi.org/10.1177/1094428115627498
tags: management-methodology, moderation, best-practices
Aguinis, Edwards, and Bradley review how mediation and moderation analyses are conducted in strategic management research and identify common errors. They provide recommendations for improving practice, including using causal mediation frameworks and proper inference procedures.
- Aguinis, H., Ramani, R. S., & Alabduljader, N. (2018). What You See Is What You Get? Enhancing Methodological Transparency in Management Research. Academy of Management Annals, 12(1), 83–110.
doi.org/10.5465/annals.2016.0011
tags: management-methodology, transparency, open-science
Aguinis, Ramani, and Alabduljader review methodological transparency in management research and advocate for pre-registration, open data, and open materials. They document the extent of undisclosed analytical flexibility in management studies and propose concrete steps for improvement.
- Ahuja, G. (2000). Collaboration Networks, Structural Holes, and Innovation: A Longitudinal Study. Administrative Science Quarterly, 45(3), 425–455.
tags: networks, structural-holes, innovation, patents, negative-binomial (+1 more)
Ahuja uses a random effects Poisson model (following Hausman, Hall, and Griliches 1984) to model patent counts as a function of collaboration network structure in this landmark network study. He finds that direct ties and indirect ties both increase innovation, while structural holes (gaps between partners) decrease it — challenging Burt's structural holes theory in the context of innovation. The paper demonstrates the use of count models with panel data in management research, with fixed effects Poisson estimated as a robustness check.
- Ai, C., & Norton, E. C. (2003). Interaction Terms in Logit and Probit Models. Economics Letters, 80(1), 123–129.
doi.org/10.1016/S0165-1765(03)00032-6
Foundational · on logit probit · tags: interaction-effects, marginal-effects, nonlinear-models
Ai and Norton show that the interpretation of interaction terms in nonlinear models like logit and probit is much more complicated than in linear models. The marginal effect of an interaction is not simply the coefficient on the interaction term, a mistake that is widespread in applied research.
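A quick numeric check makes the point concrete; the coefficients and evaluation point below are arbitrary, chosen only to show that the true interaction effect and the "interaction coefficient times the density" quantity can even disagree in sign.

```python
import numpy as np

def L(u):
    """Logistic cdf."""
    return 1 / (1 + np.exp(-u))

# Arbitrary logit with an interaction term (illustrative values).
b1, b2, b12 = 1.0, 1.0, 0.05

def prob(x1, x2):
    return L(b1 * x1 + b2 * x2 + b12 * x1 * x2)

x1, x2 = 2.0, 2.0
p = prob(x1, x2)
dL = p * (1 - p)            # Lambda'(u)
d2L = dL * (1 - 2 * p)      # Lambda''(u)

# True interaction effect: the cross-partial of P(y=1) in x1 and x2.
interaction = b12 * dL + (b1 + b12 * x2) * (b2 + b12 * x1) * d2L
# Naive reading: interaction coefficient times the logistic density.
naive = b12 * dL

print(interaction, naive)   # opposite signs at this evaluation point
```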
- Akerlof, G. A. (1982). Labor Contracts as Partial Gift Exchange. Quarterly Journal of Economics, 97(4), 543–569.
Foundational · on lab experiment replication
Akerlof proposes the gift exchange model of labor markets, in which firms pay above-market wages and workers reciprocate with above-minimum effort. This framework provides a behavioral foundation for efficiency wages and has been tested extensively in laboratory and field experiments.
- Albouy, D. Y. (2012). The Colonial Origins of Comparative Development: An Empirical Investigation: Comment. American Economic Review, 102(6), 3059–3076.
doi.org/10.1257/aer.102.6.3059
Application · on instrumental variables · tags: instrument-validity, replication, colonial-origins, sensitivity
Albouy critically re-examines the settler mortality instrument used in Acemoglu et al. (2001), showing that the original results are sensitive to data coding decisions and the sample of countries included. This comment is a cautionary tale about instrument validity and the fragility of influential IV estimates.
- Allison, P. D. (1999). Comparing Logit and Probit Coefficients Across Groups. Sociological Methods & Research, 28(2), 186–208.
doi.org/10.1177/0049124199028002003
Foundational · on logit probit · tags: logit, probit, group-comparisons, coefficient-scaling
Allison shows that naive comparisons of logit or probit coefficients across groups are misleading because differences in residual variation across groups rescale the coefficients. He proposes a method to adjust for this confound, which is essential for interpreting interaction effects and group comparisons in nonlinear models.
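The rescaling confound is easy to simulate: give two groups the same latent coefficient but different residual scales, and the fitted logit slopes differ. The data-generating numbers and the minimal Newton-Raphson fitter below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def logit_slope(x, y, iters=25):
    """Minimal Newton-Raphson logit fit (intercept + slope); returns the slope."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ b))
        W = p * (1 - p)
        b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return b[1]

# Same latent model y* = 1.0 * x + e in both groups, but the logistic
# error has scale 1 in group A and scale 2 in group B.
n = 200_000
x = rng.normal(size=n)
e = rng.logistic(size=n)
y_a = (x + 1.0 * e > 0).astype(float)
y_b = (x + 2.0 * e > 0).astype(float)

# A logit identifies beta / scale, so the slopes differ (about 1.0 vs 0.5)
# even though the effect of x on the latent outcome is identical.
slope_a, slope_b = logit_slope(x, y_a), logit_slope(x, y_b)
print(slope_a, slope_b)
```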
- Allison, P. D. (2009). Fixed Effects Regression Models. SAGE Publications.
Survey · on random effects · tags: fixed-vs-random, panel-data, textbook, practical-guidance
Allison's concise and accessible monograph compares fixed effects and random effects models for panel data, providing practical guidance on model selection, estimation, and interpretation. It is particularly useful for social scientists seeking an intuitive understanding of when each approach is appropriate.
- Altonji, J. G., Elder, T. E., & Taber, C. R. (2005). Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools. Journal of Political Economy, 113(1), 151–184.
Foundational · on sensitivity analysis · tags: selection-on-observables, Catholic-schools, bounding
Altonji, Elder, and Taber develop the idea that if selection on observables is informative about selection on unobservables, one can bound the bias from omitted variables. Their approach becomes the basis for the widely used Oster (2019) sensitivity framework.
- Amemiya, T. (1981). Qualitative Response Models: A Survey. Journal of Economic Literature, 19(4), 1483–1536.
Foundational · on logit probit · tags: survey, qualitative-response, maximum-likelihood
Amemiya provides a comprehensive survey of qualitative response models including logit, probit, and tobit. This survey organizes the theoretical properties, estimation methods, and specification tests for binary and multinomial choice models and becomes a standard reference for applied researchers.
- Anderson, M. L. (2008). Multiple Inference and Gender Differences in the Effects of Early Intervention: A Reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association, 103(484), 1481–1495.
doi.org/10.1198/016214508000000841
Foundational · on multiple testing · tags: index-tests, Westfall-Young, program-evaluation
Anderson proposes using summary index tests and familywise error rate corrections to address multiple inference in program evaluation. Reanalyzing the Abecedarian, Perry Preschool, and Early Training Projects, he finds that girls garner substantial short- and long-term benefits from early interventions, but there are no significant long-term benefits for boys after correcting for multiple testing.
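A minimal sketch of the summary-index idea: z-score each outcome against the control group and average. Anderson's index additionally weights components by the inverse of their covariance matrix; the equal-weight version and all numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up evaluation: three related outcomes, small effect on each.
n = 500
treat = rng.integers(0, 2, size=n)
outcomes = rng.normal(size=(n, 3)) + 0.2 * treat[:, None]

# Standardize each outcome against the control-group mean and sd,
# then average into a single index.
ctrl = outcomes[treat == 0]
z = (outcomes - ctrl.mean(axis=0)) / ctrl.std(axis=0)
index = z.mean(axis=1)

# One hypothesis test on the index replaces three separate tests.
effect = index[treat == 1].mean() - index[treat == 0].mean()
print(effect)
```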
- Andrews, I., Stock, J. H., & Sun, L. (2019). Weak Instruments in Instrumental Variables Regression: Theory and Practice. Annual Review of Economics, 11, 727–753.
doi.org/10.1146/annurev-economics-080218-025643
Survey · on instrumental variables · tags: weak-instruments, survey, robust-inference
Andrews, Stock, and Sun provide an up-to-date review of the weak instruments problem, covering modern diagnostic tests, robust inference procedures, and practical recommendations. It is an excellent starting point for understanding the current best practices in IV estimation.
- Angrist, J. D. (1990). Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records. American Economic Review, 80(3), 313–336.
Foundational · on instrumental variables · tags: instrumental-variables, natural-experiment, draft-lottery, LATE
Angrist uses the Vietnam-era draft lottery as a natural experiment in this landmark application of instrumental variables. He shows that randomly assigned lottery numbers provide an instrument for military service, allowing causal estimation of the earnings effect of military service.
- Angrist, J. D., & Krueger, A. B. (1991). Does Compulsory School Attendance Affect Schooling and Earnings? Quarterly Journal of Economics, 106(4), 979–1014.
Foundational · on instrumental variables · tags: returns-to-education, quarter-of-birth, compulsory-schooling
Angrist and Krueger use quarter of birth as an instrument for years of schooling, exploiting the fact that compulsory schooling laws interact with birth timing. This paper is one of the most-taught examples of instrumental variables in economics and also sparks important debates about weak instruments.
- Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association, 91(434), 444–455.
doi.org/10.1080/01621459.1996.10476902
Foundational · on experimental design, instrumental variables · tags: LATE, compliers, instrumental-variables, potential-outcomes
Angrist, Imbens, and Rubin formalize the LATE framework — originally introduced in Imbens and Angrist (1994) — within the Rubin Causal Model, providing a detailed treatment of the assumptions required for causal interpretation of IV estimates. This paper introduces the complier taxonomy (always-takers, never-takers, compliers, defiers) that is now standard in the IV literature. The practical implication is that IV estimates should be interpreted as local to the complier subpopulation, not as average effects for the entire population.
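The complier taxonomy and the resulting local interpretation can be illustrated with a simulated population; the type shares and effect sizes are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Invented shares: 20% always-takers, 30% never-takers, 50% compliers.
kind = rng.choice(["always", "never", "complier"], size=n, p=[0.2, 0.3, 0.5])
z = rng.integers(0, 2, size=n)                        # randomized instrument
d = np.where(kind == "always", 1,
             np.where(kind == "never", 0, z))         # no defiers (monotonicity)

# Treatment effects: 1.0 for compliers, 5.0 for always-takers.
effect = np.where(kind == "complier", 1.0,
                  np.where(kind == "always", 5.0, 0.0))
y = effect * d + rng.normal(size=n)

# Wald / IV estimate: reduced form divided by the first stage.
late = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
print(late)   # close to 1.0, the complier effect, not a population average
```

The always-takers' larger effect never shows up in the estimate, because their treatment status does not respond to the instrument.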
- Angrist, J. D., & Lavy, V. (1999). Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement. Quarterly Journal of Economics, 114(2), 533–575.
doi.org/10.1162/003355399556061
Application · on regression discontinuity (fuzzy) · tags: class-size, education, Maimonides-rule
Angrist and Lavy exploit a rule that caps class sizes at 40 students, creating discontinuities in class size as enrollment crosses multiples of 40. The imperfect compliance with the rule makes this a fuzzy RDD. This paper is one of the most widely taught examples of the fuzzy RDD approach.
- Angrist, J. D., & Krueger, A. B. (2001). Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Journal of Economic Perspectives, 15(4), 69–85.
Survey · on instrumental variables · tags: history-of-IV, natural-experiments, supply-and-demand, identification
Angrist and Krueger trace the evolution of IV from its origins in supply-and-demand estimation to modern natural experiments in this historical survey. They provide valuable context for understanding how IV methodology developed and why it becomes central to applied economics.
- Angrist, J. D., Chernozhukov, V., & Fernandez-Val, I. (2006). Quantile Regression under Misspecification, with an Application to the U.S. Wage Structure. Econometrica, 74(2), 539–563.
doi.org/10.1111/j.1468-0262.2006.00671.x
Application · on quantile treatment effects · tags: application, wage-structure, returns-to-education
Angrist, Chernozhukov, and Fernandez-Val study quantile regression under misspecification, showing that QR coefficients minimize a weighted mean-squared specification-error loss and deriving an omitted-variable-bias formula for quantile regression. Applying this framework to U.S. Census wage data, they document continued residual inequality growth in the 1990s, primarily in the upper half of the distribution.
- Angrist, J., Bettinger, E., & Kremer, M. (2006). Long-Term Educational Consequences of Secondary School Vouchers: Evidence from Administrative Records in Colombia. American Economic Review, 96(3), 847–862.
Application · on Lee bounds · tags: school-vouchers, attrition, Colombia
Angrist, Bettinger, and Kremer use administrative records to study the long-term effects of Colombia's PACES school voucher lottery, finding that vouchers increase secondary school completion rates by 15–20% and raise college admissions test scores by 0.2 standard deviations. They correct for differential test-taking rates between lottery winners and losers using bounding methods. The paper demonstrates how administrative data and lottery-based instruments enable credible long-term policy evaluation.
- Angrist, J. D., & Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press.
Survey · on difference in differences, staggered difference in differences, doubly robust estimation (+11 more) · tags: textbook, causal-inference, design-based, credibility-revolution
Angrist and Pischke write one of the most influential modern textbooks on applied econometrics, organizing the field around a design-based approach to causal inference. The book provides essential treatments of instrumental variables, difference-in-differences, and regression discontinuity, each grounded in the potential outcomes framework. It remains the standard reference for graduate students learning to evaluate and implement identification strategies.
- Angrist, J. D., & Pischke, J.-S. (2010). The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics. Journal of Economic Perspectives, 24(2), 3–30.
Survey · on OLS regression · tags: credibility-revolution, research-design, causal-inference, methodology
Angrist and Pischke provide the intellectual context for why applied economics moved from 'throw variables into OLS and see what sticks' to design-based causal inference. They help researchers understand where OLS fits in the larger methodological landscape and why credible identification strategies matter.
- Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic Difference-in-Differences. American Economic Review, 111(12), 4088–4118.
Foundational · on synthetic difference in differences · tags: synthetic-DID, unit-weights, time-weights
Arkhangelsky et al. introduce the synthetic difference-in-differences estimator, which combines the strengths of DID (parallel trends assumption) and synthetic control (re-weighting to improve pre-treatment fit). The method uses both unit weights and time weights to construct a more credible counterfactual, and provides valid inference without requiring a large donor pool.
- Arkhangelsky, D., & Imbens, G. W. (2022). Doubly Robust Identification for Causal Panel Data Models. Econometrics Journal, 25(3), 649–674.
Foundational · on synthetic difference in differences · tags: doubly-robust, causal-panel-data, SDID-extension
Arkhangelsky and Imbens develop doubly robust identification strategies for causal panel data models, combining outcome modeling with re-weighting to provide consistent estimates if either the outcome model or the weighting scheme is correctly specified. The framework is broader than synthetic DID specifically but directly relevant to it, strengthening the theoretical foundations for panel-data treatment effect estimation.
- Ashenfelter, O. (1978). Estimating the Effect of Training Programs on Earnings. Review of Economics and Statistics, 60(1), 47–57.
Foundational · on difference in differences · tags: training-programs, earnings, early-DID
Ashenfelter provides one of the earliest applications of the difference-in-differences logic, comparing the earnings of trainees before and after a job training program to a comparison group. The key insight is that differencing removes time-invariant unobserved differences between treatment and control groups. This paper also documents the 'Ashenfelter dip' — the pre-program earnings decline among trainees — which becomes a canonical example of why parallel trends cannot be taken for granted.
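In the 2x2 case the differencing logic is plain arithmetic; the numbers below are made up to show a fixed level difference and a common trend both canceling.

```python
# Treated group: +5 permanent level advantage, +1 common trend, +3 treatment effect.
y_treat_pre, y_treat_post = 15.0, 19.0
# Control group: +1 common trend only.
y_ctrl_pre, y_ctrl_post = 10.0, 11.0

did = (y_treat_post - y_treat_pre) - (y_ctrl_post - y_ctrl_pre)
print(did)   # 3.0: the level difference and the common trend both drop out
```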
- Athey, S., & Imbens, G. W. (2016). Recursive Partitioning for Heterogeneous Causal Effects. Proceedings of the National Academy of Sciences, 113(27), 7353–7360.
doi.org/10.1073/pnas.1510489113
Foundational · on causal forests · tags: causal-trees, honest-estimation, heterogeneous-effects
Athey and Imbens introduce causal trees, adapting the CART algorithm to estimate heterogeneous treatment effects with valid inference. They propose the honest estimation approach, where one subsample is used for tree construction and another for estimation, ensuring valid confidence intervals.
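A toy version of honest estimation, with one split chosen on half the sample and leaf effects estimated on the other half; the data-generating process and the candidate-split grid are illustrative, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up experiment: the effect is 2.0 when x > 0 and 0.0 otherwise.
n = 4_000
x = rng.normal(size=n)
w = rng.integers(0, 2, size=n)                 # randomized treatment
y = np.where(x > 0, 2.0, 0.0) * w + rng.normal(size=n)

def leaf_effect(mask, w, y):
    """Difference in treated/control means within a leaf."""
    return y[mask & (w == 1)].mean() - y[mask & (w == 0)].mean()

# Honesty: one half of the sample picks the split, the other half estimates.
half = n // 2
x_tr, w_tr, y_tr = x[:half], w[:half], y[:half]
x_es, w_es, y_es = x[half:], w[half:], y[half:]

# Choose the split on the training half that maximizes effect heterogeneity.
best_c, best_gap = None, -np.inf
for c in np.quantile(x_tr, np.linspace(0.1, 0.9, 17)):
    gap = abs(leaf_effect(x_tr <= c, w_tr, y_tr) - leaf_effect(x_tr > c, w_tr, y_tr))
    if gap > best_gap:
        best_c, best_gap = c, gap

# Valid leaf estimates come from the held-out half only.
tau_left = leaf_effect(x_es <= best_c, w_es, y_es)
tau_right = leaf_effect(x_es > best_c, w_es, y_es)
print(best_c, tau_left, tau_right)
```

Because the estimation half played no role in choosing the split, the leaf estimates are free of the adaptive-selection bias that motivates honesty.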
- Athey, S., & Imbens, G. W. (2017). The Econometrics of Randomized Experiments. Handbook of Economic Field Experiments, 1, 73–140.
doi.org/10.1016/bs.hefe.2016.10.003
Foundational · on experimental design, randomization inference · tags: field-experiments, randomization-inference, design
Athey and Imbens provide a modern, rigorous treatment of the econometrics behind randomized experiments. They cover design, analysis, and inference issues such as stratification, clustering, and multiple hypothesis testing. It is an excellent reference for researchers running field experiments.
- Athey, S., & Imbens, G. W. (2019). Machine Learning Methods That Economists Should Know About. Annual Review of Economics, 11, 685–725.
doi.org/10.1146/annurev-economics-080217-053433
Survey · on double debiased machine learning · tags: survey, machine-learning, economics
Athey and Imbens provide a broad survey of machine learning methods relevant to economists, covering supervised learning, unsupervised learning, matrix completion, and methods at the intersection of ML and causal inference including DML and causal forests. The paper explains when and why machine learning methods can improve both prediction and causal inference in economics. It serves as an accessible entry point for applied researchers seeking to understand the full landscape of ML tools available for economic applications.
- Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized Random Forests. Annals of Statistics, 47(2), 1148–1178.
tags: generalized-random-forests, estimating-equations, grf-package
Athey, Tibshirani, and Wager introduce the generalized random forest (GRF) framework, which extends causal forests to a broad class of estimating equations including quantile regression, IV, and local average treatment effects. The paper provides the theoretical foundation for, and is implemented in, the widely used grf R package.
- Autor, D. H. (2003). Outsourcing at Will: The Contribution of Unjust Dismissal Doctrine to the Growth of Employment Outsourcing. Journal of Labor Economics, 21(1), 1–42.
Application · on difference in differences · tags: employment-law, outsourcing, staggered-adoption
Autor uses a DID design that exploits the staggered adoption of wrongful-discharge protections across U.S. states. He finds that stronger employment protections led firms to outsource more jobs. This paper is a model for using staggered state-level policy changes in a DID framework.
- Autor, D. H., Dorn, D., & Hanson, G. H. (2013). The China Syndrome: Local Labor Market Effects of Import Competition in the United States. American Economic Review, 103(6), 2121–2168.
doi.org/10.1257/aer.103.6.2121
Application · on shift share instruments · tags: China-shock, trade, labor-markets
Autor, Dorn, and Hanson use a shift-share instrument to study how Chinese import competition affected U.S. local labor markets, instrumenting U.S. import exposure with Chinese exports to other high-income countries. This paper is one of the most influential and widely discussed shift-share applications.
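The instrument's mechanics, local industry shares interacted with industry-level shocks, fit in a few lines; the shares and shocks below are invented, whereas the paper uses Chinese export growth to other high-income countries as the shock.

```python
import numpy as np

# Rows are regions, columns are industries; shares sum to one within a region.
shares = np.array([[0.7, 0.2, 0.1],    # manufacturing-heavy region
                   [0.2, 0.5, 0.3],
                   [0.1, 0.3, 0.6]])
# Industry-level shocks (e.g., import growth), one per industry.
shocks = np.array([-0.10, 0.02, 0.05])

# Shift-share (Bartik) exposure: share-weighted sum of the shocks.
exposure = shares @ shocks
print(exposure)   # [-0.061  0.005  0.026]
```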
- Azoulay, P., Graff Zivin, J. S., & Wang, J. (2010). Superstar Extinction. Quarterly Journal of Economics, 125(2), 549–589.
doi.org/10.1162/qjec.2010.125.2.549
Application · on matching methods · tags: superstar-scientists, collaboration, innovation, science-of-science
Azoulay and coauthors exploit the premature and unexpected deaths of 112 academic superstars as a natural experiment, using coarsened exact matching to construct a control group of comparable collaborators. They find that the death of a superstar leads to a lasting 5–8% decline in the quality-adjusted publication rates of their collaborators, with spillovers circumscribed in idea space but less so in physical or social space. This study is an elegant application of a natural experiment combined with matching in the economics of science and innovation.
- Azoulay, P., Stuart, T., & Wang, Y. (2014). Matthew: Effect or Fable? Management Science, 60(1), 92–109.
doi.org/10.1287/mnsc.2013.1755
tags: matching, coarsened-exact-matching, Matthew-effect, cumulative-advantage, science-of-science
Azoulay, Stuart, and Wang investigate whether mid-career recognition (Howard Hughes Medical Institute appointment) creates a cumulative advantage or 'Matthew effect' in science. They use coarsened exact matching to construct a comparison group of equally productive scientists, addressing the selection problem inherent in studying prestigious awards. The study finds a small, short-lived citation boost to papers published before HHMI appointment, suggesting a status or halo effect on pre-existing work rather than a sustained productivity advantage.
- Bach, P., Chernozhukov, V., Kurz, M. S., & Spindler, M. (2022). DoubleML – An Object-Oriented Implementation of Double Machine Learning in Python. Journal of Machine Learning Research, 23(53), 1–6.
tags: software, Python, R, implementation
Bach and colleagues develop the DoubleML Python package, providing a user-friendly object-oriented implementation of the DML framework. The package supports partially linear, interactive, and instrumental variable models with a variety of machine learning methods for nuisance estimation. A companion R package is described separately.
- Baker, A. C., Larcker, D. F., & Wang, C. C. Y. (2022). How Much Should We Trust Staggered Difference-in-Differences Estimates? Journal of Financial Economics, 144(2), 370–395.
doi.org/10.1016/j.jfineco.2022.01.004
Application · on staggered difference in differences · tags: finance, replication, TWFE-bias
Baker, Larcker, and Wang demonstrate that the staggered DID problems identified in the econometrics literature are empirically relevant in finance research. They re-analyze prominent finance studies and show that results can change substantially when robust estimators are used.
- 2021
Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6th ed.). Springer.
doi.org/10.1007/978-3-030-53953-5
Survey · on random effects · textbook · panel-data · error-components · dynamic-panels · Annotation
Baltagi provides the standard graduate-level textbook on panel data econometrics, covering fixed effects, random effects, error component models, and extensions to unbalanced panels and dynamic models. The book offers comprehensive treatment of both the theoretical foundations of panel data estimators and their practical implementation across statistical software. It is the primary reference for researchers who need to understand the assumptions, properties, and trade-offs of different panel data methods.
- 2005
Bandiera, O., Barankay, I., & Rasul, I. (2005). Social Preferences and the Response to Incentives: Evidence from Personnel Data. Quarterly Journal of Economics, 120(3), 917–962.
Application · on experimental design · incentives · field-experiment · personnel-economics · Annotation
Bandiera, Barankay, and Rasul use a field experiment in a fruit-picking firm to study how switching from relative to piece-rate pay affects productivity. They demonstrate that social preferences among workers matter for incentive design, bridging experimental economics and management.
- 2015
Banerjee, A., Duflo, E., Goldberg, N., Karlan, D., Osei, R., Pariente, W., Shapiro, J., Thuysbaert, B., & Udry, C. (2015). A Multifaceted Program Causes Lasting Progress for the Very Poor: Evidence from Six Countries. Science, 348(6236), 1260799.
doi.org/10.1126/science.1260799
Application · on experimental design · development-economics · multi-country-RCT · poverty · Annotation
Banerjee, Duflo, and colleagues conduct a large-scale RCT across six countries, demonstrating that a multifaceted anti-poverty program produces sustained economic gains for the ultra-poor. The study is notable for its multi-site design, which provides rare multi-country evidence on how the same intervention performs across diverse contexts. It demonstrates both the power of randomized evaluation at scale and the importance of bundled interventions when individual components may be insufficient.
- 2005
Bang, H., & Robins, J. M. (2005). Doubly Robust Estimation in Missing Data and Causal Inference Models. Biometrics, 61(4), 962–973.
doi.org/10.1111/j.1541-0420.2005.00377.x
Foundational · on doubly robust estimation · double-robustness · simulation · tutorial · Annotation
Bang and Robins provide an accessible exposition of doubly robust estimators, demonstrating their properties through simulations and clarifying when the double robustness property provides meaningful protection. This paper helps make the method more accessible to applied researchers.
- 1986
Baron, R. M., & Kenny, D. A. (1986). The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.
doi.org/10.1037/0022-3514.51.6.1173
Foundational · on causal mediation analysis · mediation · moderator-mediator · social-psychology · Annotation
Baron and Kenny introduce the widely used four-step approach to testing mediation, comparing total, direct, and indirect effects using sequential regressions. While later work has identified limitations of this approach, it remains one of the most cited papers in all of social science.
- 2011
Barone-Adesi, F., Gasparrini, A., Vizzini, L., Merletti, F., & Richiardi, L. (2011). Effects of Italian Smoking Regulation on Rates of Hospital Admission for Acute Coronary Events: A Country-Wide Study. PLoS ONE, 6(3), e17419.
doi.org/10.1371/journal.pone.0017419
Application · on interrupted time series (ITS) · replication · Annotation
Barone-Adesi et al. use an interrupted time series design to estimate the effect of Italy's 2005 smoking ban on acute coronary event admissions, finding a significant reduction among those under 70 in the months following implementation.
- 1991
Bartik, T. J. (1991). Who Benefits from State and Local Economic Development Policies? W.E. Upjohn Institute for Employment Research.
doi.org/10.17848/9780585223940
Foundational · on shift share instruments · Bartik-instrument · local-labor-markets · employment · Annotation
Bartik introduces the shift-share instrument—constructing predicted local employment growth from national industry growth rates interacted with initial local industry composition. This 'Bartik instrument' has become one of the most widely used instruments in labor and urban economics.
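The construction Bartik describes is an inner product of initial local industry shares with national industry growth rates. A minimal pure-Python sketch (function and variable names are illustrative, not from the monograph):

```python
def bartik_instrument(local_shares, national_growth):
    """Predicted local employment growth: for each location, the inner
    product of its initial industry shares with national industry
    growth rates (the shift-share / Bartik instrument)."""
    return {loc: sum(shares[ind] * national_growth[ind] for ind in shares)
            for loc, shares in local_shares.items()}
```

The resulting predicted growth series is then used as an instrument for actual local employment growth in a second-stage regression.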
- 2008
Battistin, E., & Rettore, E. (2008). Ineligibles and Eligible Non-Participants as a Double Comparison Group in Regression-Discontinuity Designs. Journal of Econometrics, 142(2), 715–730.
doi.org/10.1016/j.jeconom.2007.05.006
Foundational · on regression discontinuity fuzzy · imperfect-compliance · double-comparison · bounds · fuzzy-RDD · Annotation
Battistin and Rettore propose using ineligible units and eligible non-participants as a double comparison group in regression-discontinuity designs. This specification-testing strategy allows researchers to assess the validity of RDD assumptions by checking whether the two comparison groups yield consistent estimates, strengthening the credibility of RDD-based inference.
- 2015
Bell, A., & Jones, K. (2015). Explaining Fixed Effects: Random Effects Modeling of Time-Series Cross-Sectional and Panel Data. Political Science Research and Methods, 3(1), 133–153.
Foundational · on random effects · within-between · panel-data · model-choice · Annotation
Bell and Jones argue that the 'within-between' random-effects model (closely related to the Mundlak approach) can outperform pure fixed effects in certain settings because it allows explicit decomposition of within- and between-unit effects while accounting for unobserved heterogeneity. This approach retains the unbiasedness of the within estimator for time-varying regressors while also estimating between-unit effects that fixed effects discard. The paper provides practical guidance for researchers who need to estimate both types of effects or who have time-invariant regressors that fixed effects cannot identify.
- 2014
Belloni, A., Chernozhukov, V., & Hansen, C. (2014). Inference on Treatment Effects after Selection among High-Dimensional Controls. Review of Economic Studies, 81(2), 608–650.
Foundational · on double debiased machine learning · LASSO · post-double-selection · high-dimensional · Annotation
Belloni, Chernozhukov, and Hansen introduce the post-double-selection LASSO method for inference on treatment effects with many potential controls. This paper is a key precursor to DML, demonstrating how regularized selection in both the treatment and outcome equations can yield valid inference.
- 2021
Ben-Michael, E., Feller, A., & Rothstein, J. (2021). The Augmented Synthetic Control Method. Journal of the American Statistical Association, 116(536), 1789–1803.
doi.org/10.1080/01621459.2021.1929245
Foundational · on synthetic control, synthetic difference in differences · augmented-SCM · bias-reduction · doubly-robust · Annotation
Ben-Michael, Feller, and Rothstein propose augmenting the synthetic control estimator with an outcome model to reduce bias when the synthetic control does not achieve perfect pre-treatment fit. The resulting doubly robust estimator is consistent if either the outcome model or the weighting is correct, providing a practical improvement for applied synthetic control studies.
- 2022
Ben-Michael, E., Feller, A., & Rothstein, J. (2022). Synthetic Controls with Staggered Adoption. Journal of the Royal Statistical Society: Series B, 84(2), 351–381.
Foundational · on synthetic difference in differences · staggered-adoption · collective-bargaining · education-policy · Annotation
Ben-Michael, Feller, and Rothstein extend synthetic control and synthetic DID methods to staggered adoption settings where multiple units adopt treatment at different times. They demonstrate the approach by estimating the effects of teacher collective bargaining laws on school spending across U.S. states, showing how synthetic DID-style reweighting improves counterfactual estimation when treatment rolls out over time.
- 1995
Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B, 57(1), 289–300.
doi.org/10.1111/j.2517-6161.1995.tb02031.x
Foundational · on multiple testing · FDR · step-up-procedure · false-discovery-rate · Annotation
Benjamini and Hochberg introduce the false discovery rate (FDR) as an alternative to family-wise error rate control. Their step-up procedure for controlling FDR is less conservative than Bonferroni while still providing meaningful protection against false positives, and has become the standard in many fields.
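The step-up procedure is short enough to sketch in pure Python (a hedged illustration; the function name and default q are ours, not from the paper):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up: return one True/False rejection flag
    per input p-value, controlling the FDR at level q."""
    m = len(pvals)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k with p_(k) <= (k/m) * q ...
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    # ... and reject every hypothesis with rank <= k.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject
```

For p-values (0.01, 0.02, 0.03, 0.50) at q = 0.05, the procedure rejects the first three hypotheses, whereas a Bonferroni cutoff of 0.05/4 would reject only the first.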
- 2007
Bennedsen, M., Nielsen, K. M., Pérez-González, F., & Wolfenzon, D. (2007). Inside the Family Firm: The Role of Families in Succession Decisions and Performance. Quarterly Journal of Economics, 122(2), 647–691.
doi.org/10.1162/qjec.122.2.647
Application · on instrumental variables · corporate-governance · CEO-succession · natural-experiment · family-firms · Annotation
Bennedsen et al. use the gender of the controlling family's firstborn child as an instrument for whether the successor CEO is a family member or a professional outsider. They find that family successions cause a large negative impact on firm performance, with operating profitability falling by at least four percentage points. The paper demonstrates how a creative natural experiment can address endogeneity in corporate governance research.
- 2003
Bertrand, M., & Schoar, A. (2003). Managing with Style: The Effect of Managers on Firm Policies. Quarterly Journal of Economics, 118(4), 1169–1208.
doi.org/10.1162/003355303322552775
Application · on fixed effects · manager-fixed-effects · CEO-style · corporate-policy · Annotation
Bertrand and Schoar use manager fixed effects (tracking CEOs who moved between firms) to show that individual managerial 'style' explains a significant portion of the variation in corporate investment, financial, and organizational practices. This paper is a key reference linking fixed effects methods to management questions.
- 2004
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How Much Should We Trust Differences-in-Differences Estimates? Quarterly Journal of Economics, 119(1), 249–275.
doi.org/10.1162/003355304772839588
Foundational · on difference in differences · serial-correlation · clustered-standard-errors · inference · Annotation
Bertrand, Duflo, and Mullainathan show that standard errors in DID studies are often far too small because they ignore serial correlation within units over time. They propose clustering standard errors at the group level as a simple fix, which is now widely recommended practice in DID applications.
- 2004
Bertrand, M., & Mullainathan, S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94(4), 991–1013.
doi.org/10.1257/0002828042002561
Application · on experimental design · audit-study · discrimination · labor-market · field-experiment · Annotation
Bertrand and Mullainathan send fictitious resumes with randomly assigned names to employers and find that 'white-sounding' names receive 50% more callbacks in this famous audit study. It is one of the most widely cited field experiments in social science and a powerful example of how randomization can identify discrimination.
- 2006
Bitler, M. P., Gelbach, J. B., & Hoynes, H. W. (2006). What Mean Impacts Miss: Distributional Effects of Welfare Reform Experiments. American Economic Review, 96(4), 988–1012.
Application · on quantile treatment effects · application · welfare-reform · distributional-effects · Annotation
Bitler, Gelbach, and Hoynes apply quantile treatment effects to experimental data from the Connecticut Jobs First welfare reform program. They show that the average treatment effect masks dramatic heterogeneity: the program had no impact at the bottom of the earnings distribution, increased earnings in the middle, and decreased earnings at the top. The paper demonstrates why distributional analysis is essential for evaluating social programs whose effects vary across the outcome distribution.
- 1992
Blanchard, O. J., & Katz, L. F. (1992). Regional Evolutions. Brookings Papers on Economic Activity, 1992(1), 1–76.
Application · on shift share instruments · regional-adjustment · migration · labor-markets · Annotation
Blanchard and Katz study regional labor market adjustment in the United States, analyzing how local employment shocks affect wages, unemployment, and migration. They construct a predicted-employment instrument using national industry growth interacted with local industry shares—the approach the subsequent literature calls the Bartik or shift-share instrument.
- 2021
Blomquist, S., Newey, W. K., Kumar, A., & Liang, C.-Y. (2021). On Bunching and Identification of the Taxable Income Elasticity. Journal of Political Economy, 129(8), 2320–2343.
Foundational · on bunching estimation · identification · taxable-income-elasticity · critique · nonparametric · Annotation
Blomquist, Newey, Kumar, and Liang provide a critical examination of the identification assumptions underlying bunching estimation. They show that the standard bunching estimator identifies the elasticity only under strong assumptions about the functional form of the counterfactual density and the distribution of preferences. Without these assumptions, the amount of bunching is consistent with a range of elasticities. The paper sparks an important methodological debate about what bunching can and cannot identify, and motivates subsequent work on tightening identification in bunching designs.
- 1995
Bloom, H. S. (1995). Minimum Detectable Effects: A Simple Way to Report the Statistical Power of Experimental Designs. Evaluation Review, 19(5), 547–556.
doi.org/10.1177/0193841X9501900504
Foundational · on power analysis · MDE · minimum-detectable-effect · program-evaluation · Annotation
Bloom introduces the minimum detectable effect (MDE) framework, which reports the smallest effect size a study can reliably detect given its design and sample size. This approach is now the standard way to discuss statistical power in program evaluation and experimental economics.
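The MDE logic can be sketched with standard normal quantiles. This is a simplified two-arm, known-variance illustration of our own (function name and defaults are assumptions, not from Bloom's paper):

```python
from statistics import NormalDist

def mde_two_arm(sigma, n_treat, n_control, alpha=0.05, power=0.80):
    """Minimum detectable effect for a two-arm comparison of means:
    (z_{1-alpha/2} + z_{power}) times the standard error of the
    difference, assuming a known outcome SD `sigma`."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    se = sigma * (1 / n_treat + 1 / n_control) ** 0.5
    return multiplier * se
```

With alpha = 0.05 and 80% power the multiplier is roughly 2.8, so with 100 units per arm and a unit-SD outcome the smallest reliably detectable effect is about 0.4 standard deviations.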
- 2007
Bloom, N., & Van Reenen, J. (2007). Measuring and Explaining Management Practices Across Firms and Countries. Quarterly Journal of Economics, 122(4), 1351–1408.
doi.org/10.1162/qjec.2007.122.4.1351
Application · on instrumental variables · management-practices · productivity · firm-performance · Annotation
Bloom and Van Reenen develop a survey-based measure of management practices and document that better management is strongly associated with higher productivity, profitability, and growth. They use IV strategies (including product market competition and primogeniture rules for family management succession) to investigate why management quality varies, finding that poor management is more prevalent when competition is weak and when family firms follow primogeniture. The paper is foundational for the measurement of management practices; the IV analysis is one component of a broader measurement and descriptive study.
- 2015
Bloom, N., Liang, J., Roberts, J., & Ying, Z. J. (2015). Does Working from Home Work? Evidence from a Chinese Experiment. Quarterly Journal of Economics, 130(1), 165–218.
Application · on experimental design · remote-work · productivity · field-experiment · management · Annotation
Bloom and colleagues conduct a large-scale randomized experiment at a Chinese travel agency, finding that working from home leads to a 13% performance increase. The study becomes a landmark reference in management and labor economics for its clean experimental design applied to a practical workplace question.
- 1936
Bonferroni, C. E. (1936). Teoria statistica delle classi e calcolo delle probabilità [Statistical theory of classes and calculus of probability]. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 3–62.
Foundational · on multiple testing · Bonferroni-correction · FWER · classical · Annotation
Bonferroni develops the classical correction for multiple comparisons, which controls the family-wise error rate by dividing the significance level by the number of tests. While conservative, the Bonferroni correction remains widely used due to its simplicity and broad applicability.
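The correction itself is a one-liner; a pure-Python sketch (the function name is ours, for illustration):

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Bonferroni correction: reject H_i iff p_i <= alpha / m,
    which controls the family-wise error rate at alpha."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]
```

With four tests at alpha = 0.05, each p-value is compared against 0.0125.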
- 2022
Borusyak, K., Hull, P., & Jaravel, X. (2022). Quasi-Experimental Shift-Share Research Designs. Review of Economic Studies, 89(1), 181–213.
doi.org/10.1093/restud/rdab030
Foundational · on shift share instruments · shock-exogeneity · many-shocks · identification · Annotation
Borusyak, Hull, and Jaravel provide an alternative framework where identification comes from the exogeneity of the shocks rather than the shares. They show that with many independent shocks, the instrument is valid even if shares are endogenous, greatly expanding the range of credible applications.
- 2024
Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253–3285.
doi.org/10.1093/restud/rdae007
Foundational · on staggered difference in differences, event studies · imputation-estimator · efficiency · event-study · Annotation
Borusyak, Jaravel, and Spiess propose an imputation estimator for staggered DID that first estimates unit and time fixed effects from untreated observations, then imputes the counterfactual outcomes. This approach is efficient, flexible, and avoids the negative weighting problem of TWFE.
- 1995
Bound, J., Jaeger, D. A., & Baker, R. M. (1995). Problems with Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable Is Weak. Journal of the American Statistical Association, 90(430), 443–450.
doi.org/10.1080/01621459.1995.10476536
Foundational · on instrumental variables · weak-instruments · IV-bias · finite-sample · first-stage-F · Annotation
Bound, Jaeger, and Baker demonstrate that instrumental variables estimates can be severely biased when instruments are weakly correlated with the endogenous regressor. They show that with weak instruments, the finite-sample bias of IV approaches that of OLS, and that the standard IV confidence intervals can have coverage far below their nominal levels. The paper motivates the widespread practice of reporting first-stage F-statistics as a diagnostic for instrument strength.
- 2021
Brand, J. E., Xu, J., Koch, B., & Geraldo, P. (2021). Uncovering Sociological Effect Heterogeneity Using Tree-Based Machine Learning. Sociological Methodology, 51(2), 189–223.
doi.org/10.1177/0081175021993503
Application · on causal forests · social-science · returns-to-education · variable-importance · Annotation
Brand and colleagues provide a practical guide to using causal trees and forests in social science research. They discuss honest estimation, variable importance for understanding which covariates drive heterogeneity, and apply the methods to study heterogeneous returns to college education.
- 2017
Brinch, C. N., Mogstad, M., & Wiswall, M. (2017). Beyond LATE with a Discrete Instrument. Journal of Political Economy, 125(4), 985–1039.
Foundational · on marginal treatment effects · MTE · discrete-instrument · semiparametric · quantity-quality · LATE · Annotation
Brinch, Mogstad, and Wiswall show how to estimate the MTE curve semiparametrically even with a discrete (binary or multivalued) instrument, which is a common case in practice. They demonstrate that the local IV approach can be implemented with discrete instruments by imposing additive separability between observed covariates and unobserved heterogeneity along with a parametric structure on the MTE. Applied to the quantity-quality tradeoff of children using twin births and sibling sex composition as instruments for family size, they find that MTE varies with unobserved resistance to having additional children, demonstrating how discrete instruments can recover policy-relevant heterogeneity beyond LATE.
- 1985
Brown, S. J., & Warner, J. B. (1985). Using Daily Stock Returns: The Case of Event Studies. Journal of Financial Economics, 14(1), 3–31.
doi.org/10.1016/0304-405X(85)90042-X
Foundational · on event studies · daily-returns · test-statistics · simulation · Annotation
Brown and Warner extend the event study framework from monthly to daily stock returns and examine the statistical properties of various test statistics. Their simulations show that simple methods perform well in most settings, providing practical reassurance for applied researchers.
- 2009
Bruhn, M., & McKenzie, D. (2009). In Pursuit of Balance: Randomization in Practice in Development Field Experiments. American Economic Journal: Applied Economics, 1(4), 200–232.
Foundational · on experimental design · stratification · balance · randomization-methods · field-experiments · Annotation
Bruhn and McKenzie compare different randomization methods—simple, stratified, and pairwise—in practice and show that stratified randomization substantially improves balance on baseline covariates and increases statistical power. They provide practical recommendations for choosing among randomization procedures in field experiments.
- 2018
Buchanan, A. L., Hudgens, M. G., Cole, S. R., Mollan, K. R., Sax, P. E., Daar, E. S., Adimora, A. A., Eron, J. J., & Mugavero, M. J. (2018). Generalizing Evidence from Randomized Trials Using Inverse Probability of Sampling Weights. Journal of the Royal Statistical Society: Series A, 181(4), 1193–1209.
Foundational · on external validity · Annotation
Buchanan and colleagues develop inverse probability of sampling weighted (IPSW) estimators for generalizing treatment effect estimates from randomized trials to well-defined target populations, and derive consistent sandwich-type variance estimators. The method models the probability of trial participation as a function of observed covariates, reweighting trial outcomes to represent the target population. Researchers seeking to transport trial results to a broader population can apply IPSW when a probability sample or census of the target population is available for comparison.
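The reweighting step can be sketched in pure Python. This is a stylized illustration of our own: it assumes each trial unit's sampling probability has already been estimated (e.g., by a logistic model fit to trial plus target-population data), and it omits the paper's variance estimation:

```python
def ipsw_ate(y, treated, p_sampled):
    """IPSW point estimate of the target-population ATE from trial data:
    weight each trial unit by 1 / P(sampled into trial | X), then take
    the weighted treated-vs-control difference in mean outcomes."""
    wy1 = wy0 = w1 = w0 = 0.0
    for yi, di, pi in zip(y, treated, p_sampled):
        w = 1.0 / pi
        if di:
            wy1 += w * yi
            w1 += w
        else:
            wy0 += w * yi
            w0 += w
    return wy1 / w1 - wy0 / w0
```

When sampling probabilities are constant, the estimator collapses to the usual trial difference in means, as it should.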
- 2022
Busenbark, J. R., Yoon, H., Gamache, D. L., & Withers, M. C. (2022). Omitted Variable Bias: Examining Management Research with the Impact Threshold of a Confounding Variable (ITCV). Journal of Management, 48(1), 17–48.
doi.org/10.1177/01492063211006458
management-methodology · ITCV · best-practices · Annotation
Busenbark and colleagues provide a practical guide to conducting sensitivity analysis in management research using the ITCV framework. They review its application in strategic management and organizational behavior, and demonstrate how to interpret and report results for management audiences.
- 2007
Bushway, S., Johnson, B. D., & Slocum, L. A. (2007). Is the Magic Still There? The Use of the Heckman Two-Step Correction for Selection Bias in Criminology. Journal of Quantitative Criminology, 23(2), 151–178.
doi.org/10.1007/s10940-007-9024-4
Survey · on heckman selection model · survey · criminology · misapplication · Annotation
Bushway, Johnson, and Slocum review Heckman model applications in criminology and find widespread misapplication. They emphasize that without a credible exclusion restriction, the Heckman correction provides no improvement over naive OLS and may even increase bias.
- 2021
Callaway, B., & Sant'Anna, P. H. C. (2021). Difference-in-Differences with Multiple Time Periods. Journal of Econometrics, 225(2), 200–230.
doi.org/10.1016/j.jeconom.2020.12.001
Foundational · on staggered difference in differences, event studies, synthetic difference in differences · group-time-ATT · heterogeneous-effects · aggregation · Annotation
Callaway and Sant'Anna propose group-time average treatment effects (ATT(g,t)) that avoid the problematic comparisons in TWFE. Their framework allows for heterogeneous treatment effects across groups and time and provides aggregation schemes for summary parameters.
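A stylized sketch of ATT(g, t) with a never-treated comparison group, using toy data structures of our own design (the actual estimator also supports covariates, not-yet-treated comparisons, and inference):

```python
def att_gt(outcomes, cohorts, g, t):
    """Group-time ATT in the Callaway-Sant'Anna sense: the change in
    outcomes from period g-1 (the last pre-treatment period) to period t
    for cohort g, minus the same change among never-treated units
    (encoded here as cohort None)."""
    def mean_delta(units):
        deltas = [outcomes[u][t] - outcomes[u][g - 1] for u in units]
        return sum(deltas) / len(deltas)
    cohort_g = [u for u, c in cohorts.items() if c == g]
    never = [u for u, c in cohorts.items() if c is None]
    return mean_delta(cohort_g) - mean_delta(never)
```

Because each ATT(g, t) only compares cohort g with clean controls, aggregating these building blocks avoids the negative-weight comparisons that contaminate TWFE.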
- 2014
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs. Econometrica, 82(6), 2295–2326.
Foundational · on regression discontinuity fuzzy, regression discontinuity sharp · bias-correction · bandwidth · rdrobust · inference · Annotation
Calonico, Cattaneo, and Titiunik develop bias-corrected confidence intervals for RDD that address the problem of conventional confidence intervals being invalid when using optimal bandwidth selectors. Their rdrobust software package has become the standard tool for implementing RDD in practice.
- 1986
Cameron, A. C., & Trivedi, P. K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1(1), 29–53.
doi.org/10.1002/jae.3950010104
Foundational · on poisson negative binomial · overdispersion · model-selection · count-data · Annotation
Cameron and Trivedi compare Poisson, negative binomial, and other count data models, providing tests for overdispersion and guidance on model selection. This paper helps establish the practical toolkit for applied researchers working with count outcomes.
- 1990
Cameron, A. C., & Trivedi, P. K. (1990). Regression-based Tests for Overdispersion in the Poisson Model. Journal of Econometrics, 46(3), 347–364.
doi.org/10.1016/0304-4076(90)90014-K
Foundational · on poisson negative binomial · overdispersion · Poisson · model-selection · count-data · Annotation
Cameron and Trivedi develop regression-based tests for overdispersion in count data models, enabling formal testing of whether the Poisson equidispersion assumption holds. Their tests compare the observed variance to the Poisson-implied mean, providing the foundation for model selection between Poisson and negative binomial specifications. Researchers working with count outcomes should use these tests before defaulting to either model.
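One common variant of the regression-based test (the NB2 form, with variance function mu + a*mu^2) can be sketched in pure Python. This is an illustrative sketch of our own that assumes fitted Poisson means `mu` are already available:

```python
def overdispersion_t(y, mu):
    """Regression-based overdispersion test (NB2 variant): regress
    z_i = ((y_i - mu_i)^2 - y_i) / mu_i on x_i = mu_i with no intercept.
    A significantly positive slope t-statistic signals Var(y) > E(y)."""
    n = len(y)
    z = [((yi - mi) ** 2 - yi) / mi for yi, mi in zip(y, mu)]
    x = list(mu)
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * zi for xi, zi in zip(x, z)) / sxx
    resid = [zi - slope * xi for zi, xi in zip(z, x)]
    s2 = sum(r * r for r in resid) / (n - 1)
    se = (s2 / sxx) ** 0.5
    return slope / se
```

Data whose variance exceeds the fitted mean produces a positive t-statistic, pointing toward the negative binomial; equidispersed or underdispersed data does not.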
- 2005
Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press.
doi.org/10.1017/CBO9780511811241
Survey · on fixed effects, logit probit · textbook · panel-data · microeconometrics · dynamic-panels · Annotation
Cameron and Trivedi cover panel data methods comprehensively in Chapter 21, including fixed effects, random effects, and dynamic panel models. A standard graduate-level reference for microeconometric methods.
- 2008
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-Based Improvements for Inference with Clustered Errors. Review of Economics and Statistics, 90(3), 414–427.
Foundational · on ols regression, randomization inference · cluster-bootstrap · few-clusters · inference · wild-bootstrap · Annotation
Cameron, Gelbach, and Miller address what happens when clustering is necessary but the number of clusters is small (fewer than 30-50). They propose the wild cluster bootstrap as a solution, which has become the standard approach when researchers have too few clusters for asymptotic cluster-robust standard errors to be reliable.
- 2011
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2011). Robust Inference with Multiway Clustering. Journal of Business & Economic Statistics, 29(2), 238–249.
doi.org/10.1198/jbes.2010.07136
Foundational · on clustering inference · two-way-clustering · multiway · Annotation
Cameron, Gelbach, and Miller extend cluster-robust variance estimation to settings with two-way (or multi-way) clustering. The variance estimator adds the two one-way cluster-robust variance matrices and subtracts the heteroscedasticity-robust matrix.
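The add-and-subtract formula can be illustrated for the simplest estimator, a sample mean. This is a pure-Python sketch under assumed residuals and cluster labels (names are ours; the paper treats general OLS, where the third term is the robust variance clustered on the intersection of the two groupings):

```python
def clustered_var_of_mean(resid, clusters):
    """One-way cluster-robust variance of the sample mean: sum of
    squared within-cluster residual sums, divided by n^2."""
    n = len(resid)
    sums = {}
    for u, g in zip(resid, clusters):
        sums[g] = sums.get(g, 0.0) + u
    return sum(s * s for s in sums.values()) / n ** 2

def twoway_var_of_mean(resid, g1, g2):
    """Two-way formula: V_G + V_H - V_{G intersect H}."""
    intersection = list(zip(g1, g2))  # joint cluster label
    return (clustered_var_of_mean(resid, g1)
            + clustered_var_of_mean(resid, g2)
            - clustered_var_of_mean(resid, intersection))
```

When every unit is its own cluster in both dimensions, the three terms coincide with the heteroscedasticity-robust variance, and the formula reduces to it.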
- 2013
Cameron, A. C., & Trivedi, P. K. (2013). Regression Analysis of Count Data. Cambridge University Press.
doi.org/10.1017/CBO9781139013567
Survey · on poisson negative binomial · textbook · count-data · zero-inflation · panel-data · Annotation
Cameron and Trivedi provide the standard reference on count data regression, covering Poisson, negative binomial, zero-inflated, hurdle, and panel count models. They provide both the theoretical foundations and practical implementation guidance that applied researchers need.
- 2015
Cameron, A. C., & Miller, D. L. (2015). A Practitioner's Guide to Cluster-Robust Inference. Journal of Human Resources, 50(2), 317–372.
Survey · on ols regression · cluster-robust · standard-errors · inference · practical-guide · Annotation
Cameron and Miller cover all aspects of cluster-robust inference in OLS regression in this highly practical survey, including when to cluster, at what level, and what to do when the number of clusters is small. It has become the essential reference for applied researchers deciding how to handle clustered data.
- 2020
Camuffo, A., Cordova, A., Gambardella, A., & Spina, C. (2020). A Scientific Approach to Entrepreneurial Decision Making: Evidence from a Randomized Control Trial. Management Science, 66(2), 564–586.
doi.org/10.1287/mnsc.2018.3249
RCT · entrepreneurship · decision-making · scientific-method · Annotation
Camuffo and colleagues conduct a randomized controlled trial with 116 Italian startups, randomly assigning half to receive training in a 'scientific' approach to entrepreneurial decision-making (formulating and testing hypotheses before committing resources). Treated startups perform better, are more likely to pivot, and are not more likely to drop out, providing experimental evidence that structured decision-making improves entrepreneurial outcomes.
- 2024
Camuffo, A., Gambardella, A., Messinese, D., Novelli, E., Paolucci, E., & Spina, C. (2024). A Scientific Approach to Entrepreneurial Decision-Making: Large-Scale Replication and Extension. Strategic Management Journal, 45(6), 1209–1237.
RCT · replication · entrepreneurship · scientific-method · external-validity · Annotation
Camuffo and colleagues conduct four randomized controlled trials with 759 firms across Italy, the UK, and India, replicating and extending their earlier finding that training entrepreneurs to adopt a 'scientific' approach to decision-making improves venture performance. The multi-site, multi-country design provides strong evidence on the external validity of the original RCT findings.
- 2002
Capron, L., & Pistre, N. (2002). When Do Acquirers Earn Abnormal Returns? Strategic Management Journal, 23(9), 781–794.
M&A · acquirer-returns · strategy · Annotation
Capron and Pistre use event study methodology to examine when acquiring firms earn positive abnormal returns from mergers and acquisitions. They find that acquirers earn positive returns only when they are the primary source of value creation, contributing to the M&A strategy literature.
- 1994
Card, D., & Krueger, A. B. (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. American Economic Review, 84(4), 772–793.
Foundational · on difference in differences · minimum-wage · employment · natural-experiment · Annotation
Card and Krueger compare fast-food employment in New Jersey (which raised its minimum wage) with neighboring Pennsylvania (which did not) in perhaps the most famous DID study in economics. They find no negative employment effect, challenging the standard textbook prediction. This paper popularizes DID as a research design.
- 2001
Card, D. (2001). Immigrant Inflows, Native Outflows, and the Local Labor Market Impacts of Higher Immigration. Journal of Labor Economics, 19(1), 22–64.
Application · on shift share instruments · immigration · enclave-instrument · labor-markets · Annotation
Card uses a shift-share instrument based on historical settlement patterns of immigrant groups to predict current immigration flows to U.S. cities. This 'enclave instrument' is adopted in hundreds of subsequent immigration studies and is a classic example of the shift-share approach.
- 2015
Card, D., Lee, D. S., Pei, Z., & Weber, A. (2015). Inference on Causal Effects in a Generalized Regression Kink Design. Econometrica, 83(6), 2453–2483.
Foundational · on regression kink design · RKD-foundations · kink-design · unemployment-insurance · derivative-ratio · Annotation
Card, Lee, Pei, and Weber formalize the regression kink design, establishing conditions under which a kink in the treatment assignment function identifies causal effects. They show that the estimand is the ratio of the change in the slope of the conditional expectation of the outcome to the change in the slope of the treatment function at the kink point. The paper develops inference procedures and applies the method to estimate the effect of unemployment insurance benefits on unemployment duration.
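The ratio-of-slope-changes estimand can be illustrated with simple linear fits on each side of the kink. This is a noiseless toy sketch of our own (real applications use local polynomial estimation with bandwidth selection, as in the paper's inference procedures):

```python
def slope(xs, ys):
    """OLS slope of y on x (with intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def rkd_estimate(x, y, b, kink=0.0):
    """Sharp RKD estimand: change in the slope of E[y | x] at the kink,
    divided by the change in the slope of the assignment rule b(x)."""
    left = [i for i, xi in enumerate(x) if xi < kink]
    right = [i for i, xi in enumerate(x) if xi >= kink]
    dy = (slope([x[i] for i in right], [y[i] for i in right])
          - slope([x[i] for i in left], [y[i] for i in left]))
    db = (slope([x[i] for i in right], [b[i] for i in right])
          - slope([x[i] for i in left], [b[i] for i in left]))
    return dy / db
```

If the benefit schedule's slope falls from 1 to 0.5 at the kink and the outcome responds with exactly twice the benefit, the estimator recovers the structural coefficient of 2.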
- 1820
Card, D., Kluve, J., & Weber, A. (2018). What Works? A Meta Analysis of Recent Active Labor Market Program Evaluations. Journal of the European Economic Association, 16(3), 894–931.
Surveyon power analysisAnnotation
Card, Kluve, and Weber conduct a meta-analysis of over 200 active labor market program evaluations across multiple countries, classifying estimates by program type, participant group, and post-program time horizon. They find that average impacts are near zero in the short run but become more positive two to three years after program completion, with human capital programs showing the largest medium-term gains and public employment subsidies proving less effective. Policy researchers designing labor market interventions should consider program type and evaluation time horizon when interpreting treatment effect estimates.
- 1120
Carneiro, P., Heckman, J. J., & Vytlacil, E. J. (2011). Estimating Marginal Returns to Education. American Economic Review, 101(6), 2754–2781.
doi.org/10.1257/aer.101.6.2754
Applicationon lab mte replicationAnnotation
Carneiro, Heckman, and Vytlacil use the marginal treatment effect framework to estimate heterogeneous returns to college. They find a declining MTE curve -- individuals most likely to attend college benefit the most -- demonstrating that conventional treatment effect parameters (ATE, ATT, LATE) can differ substantially from one another under essential heterogeneity.
- 1220
Casey, K., Glennerster, R., & Miguel, E. (2012). Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan. Quarterly Journal of Economics, 127(4), 1755–1812.
Applicationon multiple testing, pre registrationfield-experimentpre-analysis-plandevelopment-economicsWestfall-YoungAnnotation
Casey, Glennerster, and Miguel pre-register their analysis plan for a community-driven development program in Sierra Leone and apply multiple testing corrections (including the Westfall-Young step-down procedure and family-wise error rate adjustments) across outcome families. This paper is one of the most prominent examples of rigorous multiple testing adjustment in a field experiment, demonstrating that many individually significant effects lose significance after correction.
- 1320
Cattaneo, M. D., Drukker, D. M., & Holland, A. D. (2013). Estimation of Multivalued Treatment Effects Under Conditional Independence. Stata Journal, 13(3), 407–450.
doi.org/10.1177/1536867X1301300301
Foundationalon matching methodsmulti-valued-treatmentdose-responseinverse-probability-weightingStataAnnotation
Cattaneo, Drukker, and Holland extend matching and inverse probability weighting methods to settings with multi-valued (rather than binary) treatments, developing estimators for dose-response functions under conditional independence. Their accompanying Stata implementation made these methods readily accessible to applied researchers.
- 1520
Cattaneo, M. D., Frandsen, B. R., & Titiunik, R. (2015). Randomization Inference in the Regression Discontinuity Design: An Application to Party Advantages in the U.S. Senate. Journal of Causal Inference, 3(1), 1–24.
Foundationalon randomization inferenceRDDlocal-randomizationelectionsfinite-sampleAnnotation
Cattaneo, Frandsen, and Titiunik develop a randomization inference framework for regression discontinuity designs, exploiting the local randomization interpretation of close elections. They apply the method to estimate party advantages in U.S. Senate elections, demonstrating how Fisher-style permutation tests can provide finite-sample exact inference in RDD settings where asymptotic approximations may be unreliable.
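The local-randomization logic can be sketched as a plain Fisher permutation test within a window around the cutoff. The data below are made up, and the paper's actual procedure also involves window selection and covariate balance checks.

```python
import random

random.seed(0)

# Toy data: units inside a narrow window around the cutoff, where treatment
# status (score above the cutoff) is treated as as-good-as-random.
treated   = [2.0 + random.gauss(0, 0.5) for _ in range(20)]  # large true effect
untreated = [random.gauss(0, 0.5) for _ in range(20)]

def mean(v):
    return sum(v) / len(v)

obs_diff = mean(treated) - mean(untreated)

# Fisher permutation test: reshuffle treatment labels within the window and
# recompute the difference in means under each reshuffle.
pooled, n_t = treated + untreated, len(treated)
count = 0
for _ in range(999):
    random.shuffle(pooled)
    if abs(mean(pooled[:n_t]) - mean(pooled[n_t:])) >= abs(obs_diff):
        count += 1
p_value = (count + 1) / 1000   # finite-sample exact under the sharp null
```

The p-value is exact in finite samples under the sharp null of no effect for any unit, which is precisely the appeal in close-election RDD settings where asymptotic approximations are suspect.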
- 1920
Cattaneo, M. D., Titiunik, R., & Vazquez-Bare, G. (2019). Power Calculations for Regression-Discontinuity Designs. Stata Journal, 19(1), 210–245.
doi.org/10.1177/1536867X19830919
power-calculationssample-sizestudy-designsoftwareAnnotation
Cattaneo, Titiunik, and Vazquez-Bare provide methods and software for power calculations in RDD, essential for study design and determining adequate sample sizes near the cutoff. The associated rdsampsi command enables researchers to plan appropriately powered RDD studies before data collection.
- 2020
Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2020). A Practical Introduction to Regression Discontinuity Designs: Foundations. Cambridge University Press.
RDD-practical-guiderdrobusttextbookfuzzy-RDDAnnotation
Cattaneo, Idrobo, and Titiunik provide a practical and accessible guide to implementing regression discontinuity designs, covering both sharp and fuzzy cases with worked examples and code. Part of the Cambridge Elements series, it provides step-by-step guidance on bandwidth selection, estimation, and inference using the rdrobust toolkit.
- 2020
Cattaneo, M. D., Jansson, M., & Ma, X. (2020). Simple Local Polynomial Density Estimators. Journal of the American Statistical Association, 115(531), 1449–1455.
doi.org/10.1080/01621459.2019.1635480
Foundationalon regression discontinuity fuzzymanipulation-testingdensity-estimationrddensityAnnotation
Cattaneo, Jansson, and Ma propose a local polynomial density estimator for manipulation testing in regression discontinuity designs. Implemented in the rddensity package, it provides a modern alternative to the McCrary (2008) density test with better boundary properties.
- 2220
Cattaneo, M. D., & Titiunik, R. (2022). Regression Discontinuity Designs. Annual Review of Economics, 14, 821–851.
doi.org/10.1146/annurev-economics-051520-021409
Surveyon regression discontinuity sharpsurveystate-of-the-artfuzzy-RDDgeographic-RDDmulti-cutoffAnnotation
Cattaneo and Titiunik survey the state of the art in RDD methodology, including extensions to fuzzy designs, geographic RDD, and multi-cutoff designs. They provide guidance on current recommended practices and an excellent entry point to the modern RDD literature.
- 2420
Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2024). A Practical Introduction to Regression Discontinuity Designs: Extensions. Cambridge University Press.
Surveyon regression discontinuity sharptextbookextensionsmulti-scoregeographic-RDDkink-designAnnotation
Cattaneo, Idrobo, and Titiunik cover extensions of the regression discontinuity framework in this follow-up volume, including multi-score designs, geographic RDD, kink designs, and discrete running variables. They provide practical guidance and software implementations for these more advanced settings, making it an essential companion for applied researchers going beyond the standard sharp RDD.
- 1620
Certo, S. T., Busenbark, J. R., Woo, H., & Semadeni, M. (2016). Sample Selection Bias and Heckman Models in Strategic Management Research. Strategic Management Journal, 37(13), 2639–2657.
surveymanagementbest-practicesAnnotation
Certo, Busenbark, Woo, and Semadeni review the use of Heckman models in strategic management. They provide practical guidance on when selection correction is needed, how to choose exclusion restrictions, and how to interpret results, and they find that many SMJ papers misapply the technique.
- 8019
Chamberlain, G. (1980). Analysis of Covariance with Qualitative Data. Review of Economic Studies, 47(1), 225–238.
Foundationalon fixed effects, logit probitnonlinear-modelsconditional-logitdiscrete-choiceAnnotation
Chamberlain extends the fixed effects approach to nonlinear models like logit, showing how to condition out the fixed effects in discrete choice settings. This work is fundamental for researchers who need fixed effects in models where the dependent variable is binary or categorical.
- 1620
Chatterji, A. K., Findley, M., Jensen, N. M., Meier, S., & Nielson, D. (2016). Field Experiments in Strategy Research. Strategic Management Journal, 37(1), 116–132.
field-experimentsstrategymethodologyAnnotation
Chatterji, Findley, Jensen, Meier, and Nielson make the case for using field experiments in strategy research and provide practical guidance for doing so. They discuss internal validity, external validity, and ethical considerations specific to strategy scholars.
- 0820
Chava, S., & Roberts, M. R. (2008). How Does Financing Impact Investment? The Role of Debt Covenants. Journal of Finance, 63(5), 2085–2121.
doi.org/10.1111/j.1540-6261.2008.01391.x
Applicationon regression discontinuity sharpdebt-covenantscorporate-financeinvestmentAnnotation
Chava and Roberts use an RDD around debt covenant thresholds to study how covenant violations affect firm investment. This paper is an important early application of RDD in corporate finance, where accounting-based thresholds create natural discontinuities.
- 0520
Chernozhukov, V., & Hansen, C. (2005). An IV Model of Quantile Treatment Effects. Econometrica, 73(1), 245–261.
doi.org/10.1111/j.1468-0262.2005.00570.x
Foundationalon quantile treatment effectsfoundationalinstrumental-variablesquantile-regressionAnnotation
Chernozhukov and Hansen develop an instrumental variable framework for quantile regression to address endogeneity. They propose the inverse quantile regression (IQR) method, which exploits moment conditions implied by the structural quantile model, and provide conditions under which quantile treatment effects are identified with endogenous treatments, extending quantile regression to credible causal inference settings.
- 1820
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/Debiased Machine Learning for Treatment and Structural Parameters. Econometrics Journal, 21(1), C1–C68.
Foundationalon double debiased machine learningNeyman-orthogonalitycross-fittingpartially-linear-modelAnnotation
Chernozhukov et al. introduce double/debiased machine learning (DML), showing how to combine Neyman orthogonality with cross-fitting to obtain root-n consistent and asymptotically normal estimates of low-dimensional causal parameters while using high-dimensional machine learning for nuisance functions. This paper provides the theoretical foundation for valid inference when first-stage estimation uses flexible ML methods that would otherwise invalidate standard asymptotic arguments. The cross-fitting procedure it introduces is now standard practice for any application combining ML prediction with causal parameter estimation.
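The cross-fitting recipe can be sketched in a partially linear toy model. Here a crude binned-mean regression stands in for the machine-learning nuisance learners; everything below (DGP, learner, sample sizes) is an illustrative assumption, not the paper's implementation.

```python
import random

random.seed(1)
theta = 1.0            # target causal parameter
n = 4000

# Partially linear DGP:  y = theta*d + g(x) + e,   d = m(x) + v
x = [random.random() for _ in range(n)]
d = [xi + random.gauss(0, 0.5) for xi in x]               # m(x) = x
y = [theta * di + 2 * xi ** 2 + random.gauss(0, 0.5)      # g(x) = 2x^2
     for xi, di in zip(x, d)]

def binned_fit(xs, ts, bins=20):
    """Crude stand-in for an ML learner: predict t by the mean of its x-bin."""
    sums, counts = [0.0] * bins, [0] * bins
    for xi, ti in zip(xs, ts):
        b = min(int(xi * bins), bins - 1)
        sums[b] += ti
        counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return lambda xi: means[min(int(xi * bins), bins - 1)]

half = n // 2
folds = [(range(0, half), range(half, n)), (range(half, n), range(0, half))]
num = den = 0.0
for train, hold in folds:
    # Fit nuisances on the training fold only ...
    m_hat = binned_fit([x[i] for i in train], [d[i] for i in train])  # E[d|x]
    l_hat = binned_fit([x[i] for i in train], [y[i] for i in train])  # E[y|x]
    # ... then partial them out on the held-out fold (cross-fitting).
    for i in hold:
        rd, ry = d[i] - m_hat(x[i]), y[i] - l_hat(x[i])
        num += rd * ry
        den += rd * rd
theta_hat = num / den    # residual-on-residual slope, close to theta = 1.0
```

Swapping folds and pooling the residualized moments is what lets a biased, slowly converging first stage still deliver a root-n consistent estimate of the low-dimensional parameter.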
- 2120
Chernozhukov, V., Wuthrich, K., & Zhu, Y. (2021). An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls. Journal of the American Statistical Association, 116(536), 1849–1864.
doi.org/10.1080/01621459.2021.1920957
Foundationalon synthetic controlconformal-inferencecounterfactualfinite-sampleAnnotation
Chernozhukov, Wuthrich, and Zhu develop a conformal inference method for synthetic control that provides exact, finite-sample valid p-values and confidence intervals without requiring a large number of control units. This approach offers a modern, robust alternative to placebo-based inference for counterfactual and synthetic control estimators.
- 2220
Chernozhukov, V., Escanciano, J. C., Ichimura, H., Newey, W. K., & Robins, J. M. (2022). Locally Robust Semiparametric Estimation. Econometrica, 90(4), 1501–1535.
Foundationalon double debiased machine learningsemiparametriclocal-robustnessdebiasingAnnotation
Chernozhukov, Escanciano, Ichimura, Newey, and Robins develop locally robust semiparametric estimators that extend the DML framework, demonstrating how automatic debiasing with machine learning first-stage estimates can be applied broadly. Their approach yields root-n consistent estimates of causal and structural parameters even when nuisance functions are estimated with regularized machine learning methods.
- 1120
Chetty, R., Friedman, J. N., Olsen, T., & Pistaferri, L. (2011). Adjustment Costs, Firm Responses, and Micro vs. Macro Labor Supply Elasticities: Evidence from Danish Tax Records. Quarterly Journal of Economics, 126(2), 749–804.
Applicationon bunching estimationlabor-supplyadjustment-costsDenmarktax-kinksfrictionsAnnotation
Chetty, Friedman, Olsen, and Pistaferri use Danish administrative tax data to reconcile the gap between micro and macro labor supply elasticities using bunching methods. They show that adjustment frictions explain why micro estimates from bunching at tax kinks are small: many workers cannot freely adjust hours, so observed bunching understates the frictionless elasticity. They estimate that accounting for frictions raises the implied elasticity substantially. The paper is a landmark application of bunching to the micro-macro elasticity puzzle and introduces key methods for dealing with frictions in bunching designs.
- 1420
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates. American Economic Review, 104(9), 2593–2632.
doi.org/10.1257/aer.104.9.2593
Applicationon fixed effectsteacher-value-addededucationcausal-validationAnnotation
Chetty, Friedman, and Rockoff use teacher fixed effects (value-added models) and quasi-experimental validation to measure individual teachers' causal impacts on student outcomes. They demonstrate that teacher fixed effects capture real causal effects, not just selection, and their work has influenced education policy worldwide.
- 2120
Choudhury, P., Foroughi, C., & Larson, B. (2021). Work-from-anywhere: The Productivity Effects of Geographic Flexibility. Strategic Management Journal, 42(4), 655–683.
difference-in-differencesremote-worknatural-experimentproductivityAnnotation
Choudhury, Foroughi, and Larson use a difference-in-differences design to study the productivity effects of a work-from-anywhere policy at the U.S. Patent and Trademark Office. They find that geographic flexibility increases output by approximately 4.4% without reducing quality. The paper demonstrates the application of DiD to a natural experiment in organizational design and is a leading example of causal inference in the future-of-work literature.
- 1820
Christensen, G., & Miguel, E. (2018). Transparency, Reproducibility, and the Credibility of Economics Research. Journal of Economic Literature, 56(3), 920–980.
Surveyon pre registrationtransparencyreproducibilityAEA-registryeconomicsAnnotation
Christensen and Miguel survey the transparency and reproducibility landscape in economics, documenting the growing adoption of pre-registration through the AEA RCT Registry and other platforms. They present evidence on the prevalence of specification searching and publication bias, and make the case that pre-registration combined with pre-analysis plans substantially improves the credibility of empirical findings.
- 2020
Cinelli, C., & Hazlett, C. (2020). Making Sense of Sensitivity: Extending Omitted Variable Bias. Journal of the Royal Statistical Society: Series B, 82(1), 39–67.
Foundationalon sensitivity analysisomitted-variable-biaspartial-R-squaredbenchmarkingAnnotation
Cinelli and Hazlett develop a modern framework for sensitivity analysis based on partial R-squared measures, extending the omitted variable bias formula. Their approach allows researchers to benchmark the strength of hypothetical confounders against observed covariates, making sensitivity analysis more interpretable.
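Their robustness value for bringing the estimate all the way to zero (the q = 1 case) has a closed form in the treatment's t-statistic and the degrees of freedom; a minimal sketch:

```python
import math

def robustness_value(t_stat, dof):
    """Cinelli-Hazlett robustness value RV (q = 1): the partial R^2 a
    confounder would need with BOTH treatment and outcome to drive the
    point estimate to zero."""
    f2 = (t_stat / math.sqrt(dof)) ** 2    # partial Cohen's f^2 of D with Y
    return 0.5 * (math.sqrt(f2 ** 2 + 4 * f2) - f2)

rv = robustness_value(2.0, 100)   # t = 2 with 100 dof gives RV of about 0.18
```

Read RV = 0.18 as: any confounder explaining less than 18% of the residual variance of both treatment and outcome cannot overturn the result, which is the benchmark researchers compare against observed covariates.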
- 2420
Cinelli, C., Ferwerda, J., & Hazlett, C. (2024). Sensemakr: Sensitivity Analysis Tools for OLS in R and Stata. Observational Studies, 10(2), 93–127.
doi.org/10.1353/obs.2024.a946583
softwarepartial-R-squaredbenchmarkingR-packageAnnotation
Cinelli, Ferwerda, and Hazlett develop the sensemakr R and Stata package implementing their partial R-squared sensitivity analysis framework. They demonstrate the tool with applications to studies of violence and political attitudes, showing how researchers can benchmark potential confounders against observed covariates to assess the robustness of causal claims from observational data.
- 2520
Cinelli, C., & Hazlett, C. (2025). An Omitted Variable Bias Framework for Sensitivity Analysis of Instrumental Variables. Biometrika, 112(2), asaf004.
doi.org/10.1093/biomet/asaf004
Applicationon sensitivity analysisinstrumental-variablesexclusion-restrictionomitted-variable-biasIV-sensitivityAnnotation
Cinelli and Hazlett extend their OLS sensitivity framework to instrumental variables settings, showing how to assess the robustness of IV estimates to violations of the exclusion restriction. They derive bounds on IV bias as a function of the partial R-squared of a hypothetical confounder with both the instrument and the outcome, providing practical tools for benchmarking the plausibility of IV assumptions.
- 1520
Clark, T. S., & Linzer, D. A. (2015). Should I Use Fixed or Random Effects? Political Science Research and Methods, 3(2), 399–408.
Surveyon random effectsfixed-vs-randommodel-selectionpractical-guidanceAnnotation
Clark and Linzer provide practical guidance on choosing between fixed and random effects, arguing the decision depends on the research question, sample size, and the degree of correlation between unit effects and covariates. They demonstrate via simulation that random effects can outperform fixed effects when the number of units is small or when between-unit variation is of substantive interest. The paper challenges the common practice of defaulting to fixed effects solely because the Hausman test rejects.
- 2020
Clarke, D., Romano, J. P., & Wolf, M. (2020). The Romano-Wolf Multiple-Hypothesis Correction in Stata. Stata Journal, 20(4), 812–843.
doi.org/10.1177/1536867X20976314
Foundationalon multiple testingStatasoftwareRomano-WolfFWERAnnotation
Clarke, Romano, and Wolf develop a Stata implementation of the Romano-Wolf stepwise multiple testing correction, which controls the family-wise error rate while accounting for the dependence structure among test statistics via resampling. This correction is more powerful than Bonferroni or Holm procedures when test statistics are correlated, which is the typical case in applied research with related outcomes. The rwolf command provides applied researchers with an accessible tool for rigorous multiple hypothesis testing.
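The step-down logic can be sketched generically: given observed statistics and resampled null statistics, each adjusted p-value is computed from the maximum statistic over the hypotheses not yet stepped through. This is a simplification of Romano-Wolf, which uses studentized bootstrap statistics; the function and data below are an illustrative sketch, not the rwolf implementation.

```python
import random

def stepdown_pvalues(obs, null_draws):
    """Max-statistic step-down adjusted p-values (FWER control).

    obs:        observed |t|-like statistics, one per hypothesis
    null_draws: list of tuples of the same statistics under the null
    """
    m = len(obs)
    order = sorted(range(m), key=lambda j: -abs(obs[j]))  # largest stat first
    adj, prev = [0.0] * m, 0.0
    for k, j in enumerate(order):
        rem = order[k:]          # hypotheses not yet stepped through
        count = sum(1 for draw in null_draws
                    if max(abs(draw[r]) for r in rem) >= abs(obs[j]))
        prev = max(prev, (count + 1) / (len(null_draws) + 1))  # keep monotone
        adj[j] = prev
    return adj

random.seed(2)
draws = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(999)]
p = stepdown_pvalues([3.0, 0.5], draws)   # first hypothesis clearly rejected
```

Because the max is taken only over surviving hypotheses, the procedure is strictly more powerful than single-step corrections while still controlling the family-wise error rate, and resampling lets it exploit correlation among the tests.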
- 2420
Clarke, D., Pailanir, D., Athey, S., & Imbens, G. (2024). On Synthetic Difference-in-Differences and Related Estimation Methods in Stata. Stata Journal, 24(4), 557–598.
doi.org/10.1177/1536867X241297914
Foundationalon synthetic difference in differencesStatasoftwareimplementationAnnotation
Clarke and colleagues develop the sdid Stata package for implementing synthetic DID, providing detailed documentation and empirical examples. This paper makes the method accessible to applied researchers and demonstrates implementation with real policy evaluation data.
- 1620
Cleves, M., Gould, W., & Marchenko, Y. (2016). An Introduction to Survival Analysis Using Stata. Stata Press.
Surveyon cox proportional hazardsurveystatapractical-guideAnnotation
Cleves, Gould, and Marchenko provide a comprehensive practical guide to survival analysis in Stata. The book covers Kaplan-Meier estimation, Cox regression, parametric models, competing risks, and frailty models, with extensive Stata code examples and diagnostic procedures.
- 1520
Coffman, L. C., & Niederle, M. (2015). Pre-Analysis Plans Have Limited Upside, Especially Where Replications Are Feasible. Journal of Economic Perspectives, 29(3), 81–98.
Surveyon pre registrationskepticismreplicationflexibilityAnnotation
Coffman and Niederle offer a skeptical perspective on pre-analysis plans, arguing that their benefits are limited when replication is feasible and that rigid adherence to pre-specified analyses can prevent researchers from learning from the data. This paper provides important counterarguments in the pre-registration debate.
- 8819
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates.
Foundationalon power analysistextbookeffect-sizesample-sizeAnnotation
Cohen's foundational textbook introduces the concepts of effect size, statistical power, and sample size determination that became standard in the behavioral sciences. He provides power tables and conventions for small, medium, and large effect sizes that remain widely used across disciplines.
- 9919
Conley, T. G. (1999). GMM Estimation with Cross Sectional Dependence. Journal of Econometrics, 92(1), 1–45.
doi.org/10.1016/S0304-4076(98)00084-0
Foundationalon choosing standard errorsAnnotation
Conley develops GMM estimators and nonparametric, positive semi-definite covariance matrix estimators that account for cross-sectional dependence characterized by economic or geographic distance between observations. The approach extends HAC-style inference to spatial settings by allowing error correlations to decline smoothly with distance, and the covariance estimator remains consistent even when distances are imprecisely measured. Researchers with spatially distributed data should use Conley standard errors when observations within a defined neighborhood are likely correlated.
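The distance-kernel idea can be sketched for a bivariate regression: score cross-products between observations are down-weighted linearly in their geographic distance and dropped entirely beyond a cutoff. This is a toy version; Conley's estimator covers general GMM settings and other kernel choices.

```python
import math
import random

def conley_slope_se(x, y, coords, cutoff):
    """OLS slope with a spatial-HAC standard error: pairwise score products
    are weighted by a Bartlett-type kernel that declines to zero at `cutoff`."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xd = [xi - mx for xi in x]
    sxx = sum(v * v for v in xd)
    beta = sum(xd[i] * (y[i] - my) for i in range(n)) / sxx
    resid = [y[i] - my - beta * xd[i] for i in range(n)]
    score = [xd[i] * resid[i] for i in range(n)]
    meat = 0.0
    for i in range(n):
        for j in range(n):
            w = max(0.0, 1.0 - math.dist(coords[i], coords[j]) / cutoff)
            meat += w * score[i] * score[j]    # i == j gives the usual HC term
    return beta, math.sqrt(meat) / sxx

random.seed(3)
n = 400
coords = [(random.random(), random.random()) for _ in range(n)]
x = [random.random() for _ in range(n)]
y = [2.0 * xi + random.gauss(0, 0.5) for xi in x]
beta, se = conley_slope_se(x, y, coords, cutoff=0.1)
```

With a cutoff near zero only the diagonal terms survive and the formula collapses to heteroskedasticity-robust standard errors; widening the cutoff adds covariance from spatially close neighbors.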
- 1220
Conley, T. G., Hansen, C. B., & Rossi, P. E. (2012). Plausibly Exogenous. Review of Economics and Statistics, 94(1), 260–272.
Foundationalon instrumental variablesexclusion-restrictionsensitivity-analysisplausible-exogeneityAnnotation
Conley, Hansen, and Rossi develop methods for inference when the exclusion restriction is 'plausibly' rather than exactly satisfied, parameterizing the degree of violation and constructing valid confidence intervals. This approach provides a formal sensitivity analysis for IV estimates, answering the question: how large would the violation of the exclusion restriction need to be to overturn the result? Applied researchers can use these methods to transparently assess the robustness of IV findings to a common critique.
- 1620
Cornelissen, T., Dustmann, C., Raute, A., & Schonberg, U. (2016). From LATE to MTE: Alternative Methods for the Evaluation of Policy Interventions. Labour Economics, 41, 47–60.
doi.org/10.1016/j.labeco.2016.06.004
Applicationon marginal treatment effectsMTEchild-carepolicy-evaluationappliedGermany+1Annotation
Cornelissen, Dustmann, Raute, and Schonberg provide an accessible methodological guide to MTE estimation, covering the theoretical foundations and practical steps for moving from LATE to the full marginal treatment effect curve. The paper explains how to use local instrumental variables to trace out how treatment effects vary with individuals' unobserved propensity to participate. It serves as a tutorial for applied researchers seeking to implement MTE methods, with clear exposition of identification, estimation, and interpretation.
- 8819
Cornwell, C., & Rupert, P. (1988). Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variables Estimators. Journal of Applied Econometrics, 3(2), 149–155.
doi.org/10.1002/jae.3950030206
Applicationon lab re replicationAnnotation
Cornwell and Rupert compare the efficiency of alternative instrumental variables estimators for panel data models with correlated individual effects, including the Hausman-Taylor, Amemiya-MaCurdy, and Breusch-Mizon-Schmidt estimators. Using a Mincer wage equation on PSID data, they find that efficiency gains from the more complex estimators are limited to the coefficients of time-invariant endogenous variables.
- 1720
Correia, S. (2017). Linear Models with High-Dimensional Fixed Effects: An Efficient and Feasible Estimator. Working Paper.
Foundationalon fixed effectsreghdfehigh-dimensional-FEStatacomputationalAnnotation
Correia develops an efficient iterative demeaning estimator for linear models with multiple high-dimensional fixed effects that scales to very large datasets. The estimator handles arbitrary numbers of fixed-effect dimensions and supports cluster-robust standard errors. Its implementation as the reghdfe Stata command has become the standard tool for applied researchers working with high-dimensional fixed effects in panel data.
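The iterative demeaning idea can be sketched for two fixed-effect dimensions: alternately subtract firm means and year means until both are (numerically) zero, then regress the transformed outcome on the transformed regressor. This is a toy version of the alternating-projections approach that reghdfe generalizes; the panel, names, and DGP below are illustrative.

```python
import random

def demean_by(values, groups):
    """Subtract each group's mean from its members."""
    sums, counts = {}, {}
    for g, v in zip(groups, values):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]

def within(values, firms, years, iters=50):
    """Alternate demeaning over the two FE dimensions until convergence."""
    for _ in range(iters):
        values = demean_by(values, firms)
        values = demean_by(values, years)
    return values

random.seed(4)
firms = [i for i in range(30) for _ in range(10)]   # balanced 30 x 10 panel
years = [t for _ in range(30) for t in range(10)]
alpha = [random.gauss(0, 1) for _ in range(30)]     # firm effects
gamma = [random.gauss(0, 1) for _ in range(10)]     # year effects
x = [0.5 * alpha[f] + random.gauss(0, 0.5) for f in firms]  # x correlated with FE
y = [2.0 * xi + alpha[f] + gamma[t] + random.gauss(0, 0.2)
     for xi, f, t in zip(x, firms, years)]

xt, yt = within(x, firms, years), within(y, firms, years)
beta = sum(a * b for a, b in zip(xt, yt)) / sum(a * a for a in xt)  # near 2.0
```

By Frisch-Waugh-Lovell, regressing the doubly demeaned y on the doubly demeaned x reproduces the two-way fixed effects coefficient without ever materializing thousands of dummy variables, which is why the approach scales.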
- 2020
Correia, S., Guimaraes, P., & Zylkin, T. (2020). Fast Poisson Estimation with High-Dimensional Fixed Effects. Stata Journal, 20(1), 95–115.
doi.org/10.1177/1536867X20909691
Foundationalon poisson negative binomialppmlhdfehigh-dimensional-FEPPMLStataAnnotation
Correia, Guimaraes, and Zylkin introduce the ppmlhdfe Stata command for fast Poisson estimation with multiple levels of fixed effects, making PPML feasible for large datasets with high-dimensional fixed effects. This tool has become standard for applied researchers working with count data in panel settings.
- 7219
Cox, D. R. (1972). Regression Models and Life-Tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187–220.
doi.org/10.1111/j.2517-6161.1972.tb00899.x
Foundationalon cox proportional hazardfoundationalproportional-hazardspartial-likelihoodAnnotation
Cox introduces the proportional hazards model with an unspecified baseline hazard, estimated via a conditional likelihood argument (later formalized as partial likelihood in Cox, 1975). The semiparametric approach avoids distributional assumptions on the baseline hazard while allowing covariate effects to be estimated consistently. It is one of the most cited papers in statistics.
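The partial likelihood the annotation refers to takes the standard form, where the product runs over observed failures and R(t_i) is the set of subjects still at risk just before time t_i:

```latex
L(\beta) \;=\; \prod_{i:\,\delta_i = 1} \frac{\exp(x_i'\beta)}{\sum_{j \in R(t_i)} \exp(x_j'\beta)}
```

Because the baseline hazard cancels from every ratio, maximizing this expression estimates the covariate effects without ever specifying the baseline.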
- 1320
Crepon, B., Duflo, E., Gurgand, M., Rathelot, R., & Zamora, P. (2013). Do Labor Market Policies Have Displacement Effects? Evidence from a Clustered Randomized Experiment. Quarterly Journal of Economics, 128(2), 531–580.
Applicationon experimental designjob-placementdisplacement-effectscluster-RCTFranceAnnotation
Crepon and colleagues evaluate a job placement assistance program in France using a two-step clustered randomization design that varies treatment intensity across 235 labor markets. The paper's key contribution is identifying displacement effects: treated job seekers gain at the expense of untreated competitors, particularly in weak labor markets and among workers with similar skills. This innovative experimental design allows estimation of both direct and indirect (general equilibrium) effects of active labor market policies.
- 1220
Cunat, V., Gine, M., & Guadalupe, M. (2012). The Vote Is Cast: The Effect of Corporate Governance on Shareholder Value. Journal of Finance, 67(5), 1943–1977.
doi.org/10.1111/j.1540-6261.2012.01776.x
Applicationon regression discontinuity fuzzycorporate-governanceshareholder-valueclose-votesAnnotation
Cunat, Gine, and Guadalupe use a fuzzy RDD around the majority threshold in shareholder governance proposals to estimate the causal effect of governance provisions on firm value. This paper is a leading example of fuzzy RDD applied to corporate governance and finance.
- 1820
Cunningham, S., & Shah, M. (2018). Decriminalizing Indoor Prostitution: Implications for Sexual Violence and Public Health. Review of Economic Studies, 85(3), 1683–1715.
Applicationon synthetic controlpolicy-evaluationpublic-healthcrimeAnnotation
Cunningham and Shah use the synthetic control method to study how Rhode Island's accidental decriminalization of indoor prostitution affected sex crimes and STI rates. This study is a well-known application that illustrates how synthetic control can exploit a unique policy change affecting a single unit.
- 2120
Cunningham, S. (2021). Causal Inference: The Mixtape. Yale University Press.
doi.org/10.12987/9780300255881
SurveyReplicationon difference in differences, instrumental variables, regression discontinuity fuzzy +2textbookcausal-inferenceaccessiblecode-examplesAnnotation
Cunningham provides an accessible textbook with an excellent DiD chapter that walks through the intuition, the math, and the code (in Stata and R). Freely available online at mixtape.scunning.com, it is a valuable companion for students who want worked examples alongside formal treatment.
- 1920
Dahabreh, I. J., Robertson, S. E., Tchetgen Tchetgen, E. J., Stuart, E. A., & Hernan, M. A. (2019). Generalizing Causal Inferences from Individuals in Randomized Trials to All Trial-Eligible Individuals. Biometrics, 75(2), 685–694.
Foundationalon external validityAnnotation
Dahabreh, Robertson, Tchetgen Tchetgen, Stuart, and Hernan develop a formal framework for generalizing causal inferences from randomized trial participants to all trial-eligible individuals in a target population, using baseline covariate data from both randomized and non-randomized individuals. They establish identifiability conditions and propose inverse probability weighting, outcome modeling, and doubly robust estimators for the target population average treatment effect. Researchers conducting trials nested within observational cohorts can apply this framework to estimate treatment effects for the full eligible population rather than only for those who enrolled.
- 1720
Davis, J., & Heller, S. B. (2017). Using Causal Forests to Predict Treatment Heterogeneity: An Application to Summer Jobs. American Economic Review, 107(5), 546–550.
Applicationon causal forestspolicy-evaluationsummer-jobstargetingAnnotation
Davis and Heller apply causal forests to a randomized summer jobs program for disadvantaged youth in Chicago, exploring how useful predicted treatment effect heterogeneity is in practice. They find the method can identify heterogeneity for some outcomes that standard interaction methods miss, while highlighting limitations of the approach.
- 2020
de Chaisemartin, C., & D'Haultfoeuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. American Economic Review, 110(9), 2964–2996.
Foundationalon staggered difference in differences, fixed effectsnegative-weightsTWFEheterogeneous-effectsAnnotation
De Chaisemartin and D'Haultfoeuille show that the TWFE estimator can assign negative weights to some treatment effects, potentially producing estimates with the wrong sign. They propose an alternative estimator and a decomposition that reveals which group-time effects receive negative weights.
- 2320
de Chaisemartin, C., & D'Haultfoeuille, X. (2023). Two-Way Fixed Effects and Differences-in-Differences with Heterogeneous Treatment Effects: A Survey. Econometrics Journal, 26(3), C1–C30.
Surveyon event studiesTWFEheterogeneous-effectssurveyDIDAnnotation
De Chaisemartin and D'Haultfoeuille provide a comprehensive survey of the recent literature on problems with two-way fixed effects estimators under heterogeneous treatment effects. They cover the key diagnostic tests (including the Goodman-Bacon decomposition), alternative estimators that are robust to heterogeneity, and practical guidance for choosing among them. The survey is essential reading for applied researchers working with event-study and difference-in-differences designs who need to understand when standard TWFE is and is not appropriate.
- 9919
Dehejia, R. H., & Wahba, S. (1999). Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs. Journal of the American Statistical Association, 94(448), 1053–1062.
doi.org/10.1080/01621459.1999.10473858
Applicationon matching methodspropensity-scoreprogram-evaluationexperimental-benchmarkAnnotation
Dehejia and Wahba show that propensity score matching can replicate experimental estimates of a job training program using observational data, revisiting LaLonde's influential critique. The paper demonstrates the practical value of matching by showing that propensity score methods yield estimates much closer to the experimental benchmark than the nonexperimental estimators LaLonde had examined.
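The matching logic can be sketched with one-to-one nearest-neighbor matching, here on a single covariate standing in for the estimated propensity score. The data and numbers below are toy assumptions, not the paper's job-training sample.

```python
import random

def att_nearest_neighbor(treated, controls):
    """ATT by matching each treated unit (x, y) to the control with the
    closest x, with replacement, and averaging the outcome gaps."""
    gaps = [yt - min(controls, key=lambda c: abs(c[0] - xt))[1]
            for xt, yt in treated]
    return sum(gaps) / len(gaps)

random.seed(5)
# Selection into treatment: treated units concentrate at higher x, so the
# raw mean comparison would be biased; matching removes the imbalance.
controls = [(x, x + random.gauss(0, 0.1))
            for x in (random.random() for _ in range(500))]
treated = [(x, x + 1.5 + random.gauss(0, 0.1))           # true ATT = 1.5
           for x in (0.3 + 0.4 * random.random() for _ in range(100))]
att = att_nearest_neighbor(treated, controls)            # close to 1.5
```

The sketch relies on a dense control reservoir so match discrepancies are negligible; with coarse overlap, bias correction in the spirit of Abadie and Imbens (2011) becomes important.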
- 1020
Dell, M. (2010). The Persistent Effects of Peru's Mining Mita. Econometrica, 78(6), 1863–1903.
Applicationon regression discontinuity sharpgeographic-RDDcolonial-institutionspersistencespatial-discontinuityAnnotation
Dell uses a geographic RDD exploiting the historical boundary of the mita forced labor system in Peru to estimate the persistent effect of colonial institutions on economic outcomes centuries later. The study demonstrates how RDD can exploit spatial discontinuities, not just score-based cutoffs.
- 1920
Deshpande, M., & Li, Y. (2019). Who Is Screened Out? Application Costs and the Targeting of Disability Programs. American Economic Journal: Economic Policy, 11(4), 213–248.
Applicationon staggered difference in differencesdisability-policystaggered-rolloutfield-office-closuresAnnotation
Deshpande and Li use staggered closings of Social Security field offices across the United States to estimate the effects of application costs on disability program participation. The staggered timing of office closures provides quasi-experimental variation in application costs, and the paper demonstrates how treatment-timing variation can be leveraged for credible policy evaluation.
- 1520
Dong, Y. (2015). Regression Discontinuity Applications with Rounding Errors in the Running Variable. Journal of Applied Econometrics, 30(3), 422–446.
Foundationalon regression discontinuity sharprounding-errorsdiscrete-running-variablediagnosticsmeasurementAnnotation
Dong examines regression discontinuity designs when the running variable is subject to rounding or heaping, a common practical concern. She shows that standard RD estimators can be biased in such settings and derives correction formulas for the resulting discretization bias, extending the applicability of RDD to settings with imperfect measurement of the running variable.
- 1520
Dong, Y., & Lewbel, A. (2015). Identifying the Effect of Changing the Policy Threshold in Regression Discontinuity Models. Review of Economics and Statistics, 97(5), 1081–1092.
Foundationalon regression discontinuity fuzzypolicy-thresholdcounterfactualfuzzy-RDD-extensionsAnnotation
Dong and Lewbel show that the derivative of the RD treatment effect with respect to the running variable at the cutoff is identified. Under a local policy-invariance interpretation, this derivative can be used to evaluate counterfactual policies that shift the eligibility threshold, broadening the policy relevance of RDD beyond the effect at the existing cutoff.
- 1620
Doudchenko, N., & Imbens, G. W. (2016). Balancing, Regression, Difference-in-Differences and Synthetic Control Methods: A Synthesis. NBER Working Paper No. 22791.
Foundationalon synthetic controlunificationDID-connectionpenalized-regressionAnnotation
Doudchenko and Imbens place synthetic control within a broader framework that includes DID and regression as special cases, proposing extensions that relax the non-negativity and adding-up constraints on weights. This paper helps researchers understand the connections between synthetic control and other methods.
- 9419
Dranove, D., & Olsen, C. (1994). The Economic Side Effects of Dangerous Drug Announcements. Journal of Law and Economics, 37(2), 323–348.
Applicationon event studiespharmaceuticalFDAregulationstock-marketAnnotation
Dranove and Olsen use event studies to measure the stock market impact of FDA drug safety announcements on pharmaceutical firms. This application demonstrates how event studies can quantify the financial consequences of regulatory actions in health care and management contexts.
- 2520
Dube, A., Girardi, D., Jordà, Ò., & Taylor, A. M. (2025). A Local Projections Approach to Difference-in-Differences. Journal of Applied Econometrics, 40(7), 741–758.
Foundationalon staggered difference in differenceslocal-projectionsdynamic-effectsevent-studyAnnotation
Dube and colleagues propose a local projections (LP) approach to difference-in-differences estimation that combines LPs with a flexible 'clean control' condition to define appropriate treated and control units. The LP-DiD estimator subsumes many recent solutions to negative weighting problems, accommodates covariates and nonabsorbing treatments, and is simple to implement.
- 0120
Duflo, E. (2001). Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment. American Economic Review, 91(4), 795–813.
Applicationon difference in differenceseducationschool-constructionIndonesiatreatment-intensityAnnotation
Duflo uses DiD comparing cohorts exposed to a massive school construction program in Indonesia to older cohorts not exposed, across regions with different program intensity. It is a beautifully clean application showing how DiD can exploit variation in treatment intensity across space and cohorts.
- 0720
Duflo, E., Glennerster, R., & Kremer, M. (2007). Using Randomization in Development Economics Research: A Toolkit. Handbook of Development Economics, 4, 3895–3962.
doi.org/10.1016/S1573-4471(07)04061-2
Surveyon experimental design, power analysisdevelopment-economicstoolkitfield-experimentspractical-guideAnnotation
Duflo, Glennerster, and Kremer write a comprehensive practical guide to running randomized experiments in development economics. The chapter covers all stages from design to analysis, including power calculations, stratification, dealing with attrition, and estimating treatment effects with imperfect compliance. It has become required reading for anyone designing a field experiment.
- 1220
Dunning, T. (2012). Natural Experiments in the Social Sciences: A Design-Based Approach. Cambridge University Press.
doi.org/10.1017/CBO9781139084444
Foundationalon experimental designnatural-experimentsdesign-basedtextbooksocial-sciencesAnnotation
Dunning provides a systematic framework for identifying and analyzing natural experiments across the social sciences. The book covers as-if random assignment, instrumental variables, regression discontinuity, and difference-in-differences through a unified design-based lens, making it essential reading for researchers exploiting natural variation for causal inference.
- 6919
Fama, E. F., Fisher, L., Jensen, M. C., & Roll, R. (1969). The Adjustment of Stock Prices to New Information. International Economic Review, 10(1), 1–21.
Foundationalon event studiesstock-pricesabnormal-returnsmarket-efficiencyAnnotation
Fama, Fisher, Jensen, and Roll establish the modern event study methodology by studying how stock prices adjust to stock splits. They develop the framework of measuring abnormal returns around corporate events using a market model to construct the counterfactual return. This methodology has become the standard tool for studying how information events affect asset prices and is used in thousands of subsequent studies across finance and strategy.
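The market-model construction the annotation describes can be sketched in a few lines. All numbers here (a 250-day estimation window, an 11-day event window, a hypothetical +10% event-day jump) are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(8)
T_est, T_evt = 250, 11                         # estimation and event windows (assumed)
rm = rng.normal(0.0004, 0.01, T_est + T_evt)   # simulated market returns
ri = 0.0002 + 1.2 * rm + rng.normal(0, 0.005, T_est + T_evt)  # market model holds
ri[T_est + 5] += 0.10                          # hypothetical +10% jump on the event day

# Fit the market model on the estimation window only
X = np.column_stack([np.ones(T_est), rm[:T_est]])
alpha, beta = np.linalg.lstsq(X, ri[:T_est], rcond=None)[0]

# Abnormal return = actual minus counterfactual; cumulate over the event window
ar = ri[T_est:] - (alpha + beta * rm[T_est:])
car = ar.sum()
```

The counterfactual return alpha + beta * rm is the market-model benchmark Fama, Fisher, Jensen, and Roll formalize; the simulated jump shows up as a large abnormal return on the event day and dominates the cumulative abnormal return.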
- 2220
Fan, Q., Hsu, Y.-C., Lieli, R. P., & Zhang, Y. (2022). Estimation of Conditional Average Treatment Effects with High-Dimensional Data. Journal of Business & Economic Statistics, 40(1), 313–327.
doi.org/10.1080/07350015.2020.1811102
Foundationalon double debiased machine learningCATEhigh-dimensionaldoubly-robustAnnotation
Fan and colleagues propose nonparametric estimators for conditional average treatment effects in high-dimensional settings. Their approach uses machine learning to estimate nuisance functions in a first stage, then applies local linear regression for the CATE function of interest, with functional limit theory and multiplier-bootstrap uniform inference.
- 9919
Fine, J. P., & Gray, R. J. (1999). A Proportional Hazards Model for the Subdistribution of a Competing Risk. Journal of the American Statistical Association, 94(446), 496–509.
doi.org/10.1080/01621459.1999.10474144
Foundationalon cox proportional hazardfoundationalcompeting-riskssubdistributionAnnotation
Fine and Gray develop a regression model for the cumulative incidence function under competing risks. The Fine-Gray model extends the Cox framework to settings where multiple event types compete, allowing estimation of covariate effects on the subdistribution hazard.
- 1220
Finkelstein, A., Taubman, S., Wright, B., Bernstein, M., Gruber, J., Newhouse, J. P., Allen, H., Baicker, K., & The Oregon Health Study Group (2012). The Oregon Health Insurance Experiment: Evidence from the First Year. Quarterly Journal of Economics, 127(3), 1057–1106.
Applicationon experimental designhealth-insurancelotteryLATEfield-experimentAnnotation
Finkelstein and colleagues analyze the Oregon Health Insurance Experiment, in which uninsured low-income adults are selected by lottery for the chance to apply for Medicaid. Using this randomized controlled design with IV to handle noncompliance, they estimate the local average treatment effect of Medicaid coverage on health care utilization, financial strain, and self-reported health. The study demonstrates the practical difference between intent-to-treat and LATE estimates in a real-world experiment where not all lottery winners enrolled.
- 0920
Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional Quantile Regressions. Econometrica, 77(3), 953–973.
Foundationalon quantile treatment effectsfoundationalunconditional-quantileRIFAnnotation
Firpo, Fortin, and Lemieux introduce the recentered influence function (RIF) regression for estimating unconditional quantile effects. They show that standard quantile regression estimates conditional quantile effects that do not aggregate to unconditional effects. RIF regression transforms the outcome variable so that OLS on the transformed outcome recovers the effect of covariates on unconditional quantiles. This innovation enables policy-relevant distributional analysis.
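A minimal sketch of the RIF construction for the median on simulated data; the Gaussian kernel density estimate stands in for whatever density estimator a real application would use, and all parameter values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)   # location model: true effect 0.5

tau = 0.5
q = np.quantile(y, tau)

# Kernel density of y at the quantile (Gaussian kernel, Silverman bandwidth)
h = 1.06 * y.std() * n ** (-1 / 5)
f_q = np.mean(np.exp(-0.5 * ((q - y) / h) ** 2) / np.sqrt(2 * np.pi)) / h

# Recentered influence function for the tau-th quantile
rif = q + (tau - (y <= q)) / f_q

# OLS of the RIF on covariates estimates unconditional quantile effects
X = np.column_stack([np.ones(n), x])
uqe = np.linalg.lstsq(X, rif, rcond=None)[0][1]
```

By construction the RIF averages back to the quantile itself, so the OLS intercept and slope decompose the unconditional quantile across covariates.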
- 1820
Firpo, S., & Possebom, V. (2018). Synthetic Control Method: Inference, Sensitivity Analysis and Confidence Sets. Journal of Causal Inference, 6(2), 1–26.
Foundationalon synthetic controlinferencesensitivity-analysisconfidence-setsAnnotation
Firpo and Possebom develop formal inference procedures for the synthetic control method, including sensitivity analysis tools and confidence sets. Their framework provides a more rigorous basis for statistical inference in synthetic control applications beyond the standard permutation-based placebo tests.
- 3519
Fisher, R. A. (1935). The Design of Experiments. Oliver & Boyd.
Foundationalon experimental design, randomization inferencerandomizationfactorial-designfoundationsAnnotation
Fisher's classic book lays the foundations of experimental design, introducing concepts like randomization, blocking, and factorial designs. The 'lady tasting tea' example from this book remains one of the most famous illustrations of hypothesis testing and the logic of controlled experiments.
- 1520
Flammer, C. (2015). Does Corporate Social Responsibility Lead to Superior Financial Performance? A Regression Discontinuity Approach. Management Science, 61(11), 2549–2568.
doi.org/10.1287/mnsc.2014.2038
CSRshareholder-votingmanagement-scienceAnnotation
Flammer uses a regression discontinuity design around close-call shareholder votes on CSR proposals, comparing proposals that pass or fail by a small margin as a quasi-experiment. She finds that adopting CSR proposals leads to positive announcement returns and superior accounting performance, with effects operating through labor productivity and sales growth. Published in Management Science, it is a prominent example of RDD in top management journals.
- 0120
Fleming, L., & Sorenson, O. (2001). Technology as a Complex Adaptive System: Evidence from Patent Data. Research Policy, 30(7), 1019–1039.
doi.org/10.1016/S0048-7333(00)00135-9
Applicationon poisson negative binomialpatent-citationstechnology-complexityinnovationAnnotation
Fleming and Sorenson use negative binomial regression on patent citation counts to study how the complexity of technological combinations affects the usefulness of inventions. This paper is a prominent application of count models in the innovation and technology management literature.
- 2520
Frake, J., Gibbs, A., Goldfarb, B., Hiraiwa, T., Starr, E., & Yamaguchi, S. (2025). From Perfect to Practical: Partial Identification Methods for Causal Inference in Strategic Management Research. Strategic Management Journal, 46(8), 1894–1929.
partial-identificationsensitivity-analysisboundsmanagementAnnotation
Frake and colleagues introduce partial identification methods to strategic management, providing a practical framework for assessing the sensitivity of difference-in-differences and instrumental variables estimates to violations of identifying assumptions. The paper demonstrates how researchers can construct informative bounds on treatment effects when parallel trends or exclusion restriction assumptions are relaxed. It bridges the gap between the theoretical ideal of point identification and the practical reality that identifying assumptions are rarely perfectly satisfied.
- 0020
Frank, K. A. (2000). Impact of a Confounding Variable on a Regression Coefficient. Sociological Methods & Research, 29(2), 147–194.
doi.org/10.1177/0049124100029002001
Foundationalon sensitivity analysisITCVconfounding-variablethresholdAnnotation
Frank develops the impact threshold for a confounding variable (ITCV), which calculates how much bias an omitted variable would need to introduce to invalidate an inference. This approach is widely adopted in education and management research.
- 8419
Freeman, R. B., & Medoff, J. L. (1984). What Do Unions Do?. Basic Books.
Applicationon fixed effectsunion-wage-premiumfixed-effectslabor-economicsAnnotation
Freeman and Medoff examine the effects of unions on wages, productivity, inequality, and workplace governance, drawing on a wide range of data sources and econometric methods including longitudinal analysis. The book argues that unions have both a monopoly face (raising wages above competitive levels) and a collective voice face (improving workplace communication and reducing turnover). It remains influential as a comprehensive empirical assessment of union effects and a common pedagogical motivation for fixed effects methods in labor economics.
- 1620
Fremeth, A. R., Holburn, G. L. F., & Richter, B. K. (2016). Bridging Qualitative and Quantitative Methods in Organizational Research: Applications of Synthetic Control Methodology in the U.S. Automobile Industry. Organization Science, 27(2), 462–482.
doi.org/10.1287/orsc.2015.1034
managementstrategyfirm-level-synthetic-controlAnnotation
Fremeth, Holburn, and Richter introduce synthetic control methodology to strategic management research, demonstrating its application for studying the causal effect of organizational and regulatory events on individual firms. The paper shows how data-driven counterfactuals can replace ad-hoc comparison group selection in comparative case studies. It provides a template for strategy researchers seeking to apply synthetic control methods to firm-level outcome data with few treated units.
- 1920
Freyaldenhoven, S., Hansen, C., & Shapiro, J. M. (2019). Pre-Event Trends in the Panel Event-Study Design. American Economic Review, 109(9), 3307–3338.
Foundationalon event studiespre-trendspanel-datainstrumental-variablesAnnotation
Freyaldenhoven, Hansen, and Shapiro study panel event-study designs in which unobserved confounds can generate pre-event trends. They show how causal effects can still be identified by exploiting covariates related to the policy only through the confounds, yielding a 2SLS estimator that remains valid even when endogeneity induces pre-trends.
- 2220
Friebel, G., Heinz, M., & Zubanov, N. (2022). Middle Managers, Personnel Turnover, and Performance: A Long-Term Field Experiment in a Retail Chain. Management Science, 68(1), 211–229.
doi.org/10.1287/mnsc.2020.3905
field-experimentRCTmanagement-practicesturnoverretailAnnotation
Friebel, Heinz, and Zubanov conduct a long-term randomized field experiment in a large Eastern European retail chain, in which the CEO asked treated store managers to reduce employee quit rates. The intervention reduced the quit rate by a fifth to a quarter; the effect lasted about nine months before petering out and reappeared after a reminder. However, they find no treatment effect on sales, illustrating that reducing turnover does not automatically translate into improved store performance.
- 3319
Frisch, R., & Waugh, F. V. (1933). Partial Time Regressions as Compared with Individual Trends. Econometrica, 1(4), 387–401.
Foundationalon ols regressionFWL-theorempartialling-outmultiple-regressionfixed-effectsAnnotation
Frisch and Waugh establish that a coefficient in a multiple regression can be obtained by first residualizing both the outcome and the regressor against all other covariates. The Frisch-Waugh-Lovell (FWL) theorem provides the theoretical foundation for understanding what 'controlling for' means in multiple regression and is the basis for modern fixed-effects estimation.
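The residualization result is easy to verify numerically. The following sketch, on simulated data with invented coefficients, checks that the coefficient from the full multiple regression equals the coefficient from the residual-on-residual regression:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 2))                    # controls
x = z @ [0.5, -0.3] + rng.normal(size=n)       # regressor correlated with controls
y = 2.0 * x + z @ [1.0, 1.0] + rng.normal(size=n)

# Coefficient on x from the full multiple regression of y on [1, x, z]
X = np.column_stack([np.ones(n), x, z])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0][1]

# FWL: residualize y and x on [1, z], then regress residual on residual
Z = np.column_stack([np.ones(n), z])
ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
beta_fwl = (rx @ ry) / (rx @ rx)
```

The two coefficients agree to machine precision, which is exactly why within-transformed (demeaned) panel regressions recover fixed-effects coefficients.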
- 1120
Funk, M. J., Westreich, D., Wiesen, C., Sturmer, T., Brookhart, M. A., & Davidian, M. (2011). Doubly Robust Estimation of Causal Effects. American Journal of Epidemiology, 173(7), 761–767.
Applicationon doubly robust estimationepidemiologytutorialAIPWAnnotation
Funk and colleagues provide a practical tutorial on doubly robust estimation for epidemiologists, demonstrating through a worked example how the AIPW estimator protects against misspecification of either the outcome model or the propensity score model. This paper helped spread the method in the health sciences.
- 1420
Gelman, A., & Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641–651.
doi.org/10.1177/1745691614551642
Foundationalon power analysisType-S-errorType-M-errorexaggeration-ratioAnnotation
Gelman and Carlin extend traditional power analysis by introducing Type S (sign) errors (the probability a significant estimate has the wrong sign) and Type M (magnitude) errors (the expected exaggeration ratio of significant estimates). These concepts provide a richer understanding of what happens in underpowered studies.
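Both quantities are easy to compute by simulation. This sketch uses an invented underpowered design (true effect 0.1, standard error 0.1, power around 17%) to show that significant estimates exaggerate the true effect severalfold:

```python
import numpy as np

rng = np.random.default_rng(3)
true_effect, se, reps = 0.1, 0.1, 100_000
est = rng.normal(true_effect, se, reps)      # replicated study estimates
sig = np.abs(est / se) > 1.96                # "statistically significant" draws

type_m = np.mean(np.abs(est[sig])) / true_effect   # exaggeration ratio
type_s = np.mean(est[sig] < 0)                     # wrong-sign rate among significant
power = sig.mean()
```

In this design the exaggeration ratio is roughly 2.5: conditioning on significance filters for lucky overestimates, which is the core of Gelman and Carlin's warning about underpowered studies.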
- 1420
Gelman, A., & Loken, E. (2014). The Statistical Crisis in Science. American Scientist, 102(6), 460–465.
Foundationalon pre registrationreplication-crisisstatistical-crisispre-registrationAnnotation
Gelman and Loken argue that data-dependent analysis creates a 'garden of forking paths' that explains why many statistically significant comparisons do not hold up. They emphasize that researchers' analytical choices conditional on data characteristics inflate false positive rates even without deliberate p-hacking.
- 1920
Gelman, A., & Imbens, G. W. (2019). Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs. Journal of Business & Economic Statistics, 37(3), 447–456.
doi.org/10.1080/07350015.2017.1366909
Foundationalon regression discontinuity sharppolynomial-orderlocal-polynomialbest-practicesbandwidthAnnotation
Gelman and Imbens show that using high-order global polynomials in RDD leads to noisy estimates, sensitivity to the degree of polynomial, and poor coverage of confidence intervals. They recommend local linear or quadratic fits with appropriate bandwidth selection instead, fundamentally changing best practice for RDD estimation.
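The recommended local linear approach can be sketched as follows on simulated data; the bandwidth here is an arbitrary assumption, whereas a real application would select it with a data-driven procedure:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
r = rng.uniform(-1, 1, n)                 # running variable, cutoff at 0
d = (r >= 0).astype(float)
y = 1.0 + 0.5 * r + 2.0 * d + rng.normal(0, 0.5, n)   # true jump = 2

h = 0.25                                  # bandwidth (assumed, not optimal)

def side_fit(mask):
    # local linear fit within the bandwidth; the intercept is the fit at r = 0
    X = np.column_stack([np.ones(mask.sum()), r[mask]])
    return np.linalg.lstsq(X, y[mask], rcond=None)[0][0]

left = side_fit((r < 0) & (r > -h))
right = side_fit((r >= 0) & (r < h))
rd_effect = right - left
```

Separate local linear fits on each side, evaluated at the cutoff, give the discontinuity estimate without the oscillation and boundary problems of high-order global polynomials.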
- 2020
Gerard, F., Rokkanen, M., & Rothe, C. (2020). Bounds on Treatment Effects in Regression Discontinuity Designs with a Manipulated Running Variable. Quantitative Economics, 11(3), 839–870.
Foundationalon lee boundsRDDmanipulationrunning-variableAnnotation
Gerard, Rokkanen, and Rothe study regression-discontinuity settings in which the running variable is manipulated, so conventional point identification fails. They show that treatment effects are still partially identified and derive sharp bounds under a general model in which the extent of manipulation is learned from the data.
- 1220
Gerber, A. S., & Green, D. P. (2012). Field Experiments: Design, Analysis, and Interpretation. W. W. Norton.
Surveyon experimental designfield-experimentstextbookpolitical-scienceAnnotation
Gerber and Green write a comprehensive textbook on field experiments covering randomization, blocking, clustering, noncompliance, and attrition. The book provides rigorous treatment of experimental design principles with practical guidance drawn from political science and public policy applications. It is particularly valuable for its coverage of complications that arise in real-world experiments, including how to handle noncompliance through intent-to-treat analysis and instrumental variables.
- 1020
Glynn, A. N., & Quinn, K. M. (2010). An Introduction to the Augmented Inverse Propensity Weighted Estimator. Political Analysis, 18(1), 36–56.
Foundationalon doubly robust estimationpolitical-sciencetutorialAIPWAnnotation
Glynn and Quinn introduce the AIPW estimator to political scientists, providing intuition, simulation evidence, and practical guidance. This tutorial demonstrates the advantages of doubly robust methods over propensity score weighting or outcome regression alone in social science applications.
- 0620
Gneezy, U., & List, J. A. (2006). Putting Behavioral Economics to Work: Testing for Gift Exchange in Labor Markets Using Field Experiments. Econometrica, 74(5), 1365–1384.
doi.org/10.1111/j.1468-0262.2006.00707.x
Applicationon lab experiment replicationAnnotation
Gneezy and List conduct field experiments to test gift exchange in labor markets. Workers who received an unexpectedly higher wage initially increased effort, but the effect dissipated within hours, suggesting that strong forms of gift exchange may not persist outside the laboratory.
- 1620
Gobillon, L., & Magnac, T. (2016). Regional Policy Evaluation: Interactive Fixed Effects and Synthetic Controls. Review of Economics and Statistics, 98(3), 535–551.
Foundationalon synthetic controlinteractive-fixed-effectsfactor-modelsregional-policyAnnotation
Gobillon and Magnac connect synthetic control to interactive fixed-effects models, showing that synthetic control can be interpreted as an estimator that allows for time-varying factor loadings. This paper bridges the synthetic control and factor model literatures.
- 1620
Goldfarb, B., & King, A. A. (2016). Scientific Apophenia in Strategic Management Research: Significance Tests & Mistaken Inference. Strategic Management Journal, 37(1), 167–176.
apopheniastrategic-managementrobustnessAnnotation
Goldfarb and King use distributional matching and posterior predictive checks to estimate that 24-40% of significant coefficients in strategic management research would become insignificant if studies were repeated. They document the problem of apophenia (finding patterns in noise) and offer practical suggestions for reducing false and inflated findings at both the individual and field level.
- 2020
Goldsmith-Pinkham, P., Sorkin, I., & Swift, H. (2020). Bartik Instruments: What, When, Why, and How. American Economic Review, 110(8), 2586–2624.
Foundationalon shift share instrumentsshare-exogeneitydecompositionidentificationAnnotation
Goldsmith-Pinkham, Sorkin, and Swift provide a rigorous econometric framework for shift-share instruments, showing that the Bartik instrument can be decomposed into a weighted sum of individual share-based instruments. They clarify that identification requires exogeneity of the initial shares, not the shocks.
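A toy construction on simulated shares and shocks makes the decomposition visible: the instrument is, by definition, a share-weighted combination of the national shocks, so the shares carry the identifying variation. All parameter values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
L, K = 2000, 5                              # locations and industries (assumed)
shares = rng.dirichlet(np.ones(K), size=L)  # initial industry shares by location
shocks = rng.normal(size=K)                 # national industry growth shocks

# Bartik instrument: share-weighted sum of the national shocks
bartik = shares @ shocks

# Simulated endogenous regressor and outcome: x is confounded by u, but the
# instrument shifts x only through the shares-times-shocks channel
u = rng.normal(size=L)
x = bartik + 0.5 * u + rng.normal(0, 0.1, L)
y = 1.0 * x + u + rng.normal(0, 0.1, L)     # true effect = 1

b_ols = np.cov(x, y)[0, 1] / np.var(x)                       # biased by u
b_iv = np.cov(bartik, y)[0, 1] / np.cov(bartik, x)[0, 1]     # IV estimate
```

The IV ratio recovers the true effect while OLS is confounded; the Goldsmith-Pinkham, Sorkin, and Swift point is that this works only if the initial shares are exogenous.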
- 2120
Goodman-Bacon, A. (2021). Difference-in-Differences with Variation in Treatment Timing. Journal of Econometrics, 225(2), 254–277.
doi.org/10.1016/j.jeconom.2021.03.014
Foundationalon staggered difference in differencesTWFE-decompositiontreatment-timingnegative-weightsAnnotation
Goodman-Bacon decomposes the two-way fixed-effects DID estimator into a weighted average of all possible two-group, two-period DID comparisons, revealing that some comparisons use already-treated units as controls. The decomposition clarifies when already-treated units enter as controls and why this can make the estimator difficult to interpret under treatment-effect heterogeneity.
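A deterministic toy panel illustrates the decomposition's warning. The three-unit design and effect paths are invented for illustration: with a homogeneous effect, TWFE recovers it exactly; when effects grow with time since treatment, the comparisons that use already-treated units as controls pull the TWFE estimate below the average effect on the treated:

```python
import numpy as np

# Balanced toy panel: 3 units x 8 periods. Unit 1 treated from t=3 ("early"),
# unit 2 from t=6 ("late"), unit 0 never treated. No noise, no fixed effects.
T = 8
t = np.arange(T)
D = np.array([np.zeros(T), t >= 3, t >= 6], dtype=float)

def twfe(y, D):
    # exact two-way demeaning for a balanced panel, then slope of y on D
    def dm(a):
        return a - a.mean(1, keepdims=True) - a.mean(0, keepdims=True) + a.mean()
    Dd, yd = dm(D), dm(y)
    return (Dd * yd).sum() / (Dd * Dd).sum()

y_const = 1.0 * D                                   # homogeneous effect of 1
y_dyn = D * np.array([np.zeros(T), t - 2, t - 5])   # effect grows with exposure
att_true = y_dyn[D == 1].mean()                     # average effect on treated cells

twfe_const = twfe(y_const, D)   # recovers the homogeneous effect exactly
twfe_dyn = twfe(y_dyn, D)       # falls below att_true under dynamic effects
```

The gap between `twfe_dyn` and `att_true` comes entirely from the two-by-two comparisons in which the early-treated group, still accumulating treatment effects, serves as a control for the late-treated group.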
- 2520
Gornall, W., & Strebulaev, I. A. (2025). Gender, Race, and Entrepreneurship: A Randomized Field Experiment on Venture Capitalists and Angels. Management Science, 71(6), 5308–5327.
doi.org/10.1287/mnsc.2024.4990
audit-studycorrespondence-studydiscriminationventure-capitalentrepreneurship+2Annotation
Gornall and Strebulaev conduct a large-scale correspondence experiment, sending approximately 80,000 pitch emails from fictitious startups to 28,000 venture capitalists and angel investors. By randomly varying the entrepreneur's name to signal gender and race, they find that female entrepreneurs received 9% more interested replies and Asian-surname entrepreneurs received 6% more responses than White-surname entrepreneurs, indicating favorable rather than adverse bias. The paper provides large-scale experimental evidence on investor response patterns by entrepreneur demographics in entrepreneurial finance.
- 8419
Gourieroux, C., Monfort, A., & Trognon, A. (1984). Pseudo Maximum Likelihood Methods: Theory. Econometrica, 52(3), 681–700.
Foundationalon poisson negative binomialpseudo-MLEPoisson-regressionrobust-estimationPPMLAnnotation
Gourieroux, Monfort, and Trognon develop the general theory of pseudo maximum likelihood estimation for cases in which the likelihood family may be misspecified. They derive conditions for consistency and asymptotic normality and characterize efficiency bounds in this broader framework. The Poisson PML result — consistency for the conditional mean under misspecification — is a special case that underpins the later widespread use of Poisson regression with robust standard errors.
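The Poisson PML special case can be sketched from scratch on simulated overdispersed counts; the data-generating values are invented, and the Newton solver below is a bare-bones stand-in for any standard GLM routine:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.3 * x)
# Overdispersed (negative binomial) counts: the Poisson likelihood is wrong,
# but the conditional mean exp(x'b) is right, so Poisson PML stays consistent
y = rng.negative_binomial(2, 2 / (2 + mu))

# Poisson score equations solved by Newton-Raphson
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    m = np.exp(X @ beta)
    beta = beta + np.linalg.solve(X.T @ (X * m[:, None]), X.T @ (y - m))

# Sandwich variance A^{-1} B A^{-1} gives misspecification-robust standard errors
m = np.exp(X @ beta)
A = X.T @ (X * m[:, None])
B = X.T @ (X * ((y - m) ** 2)[:, None])
Ainv = np.linalg.inv(A)
naive_se = np.sqrt(np.diag(Ainv))                 # valid only if truly Poisson
robust_se = np.sqrt(np.diag(Ainv @ B @ Ainv))     # valid under misspecification
```

The slope estimate stays near its true value despite the wrong likelihood, while the robust standard errors exceed the naive Poisson ones, reflecting the overdispersion.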
- 9419
Grambsch, P. M., & Therneau, T. M. (1994). Proportional Hazards Tests and Diagnostics Based on Weighted Residuals. Biometrika, 81(3), 515–526.
doi.org/10.1093/biomet/81.3.515
Foundationalon cox proportional hazardfoundationaldiagnosticsschoenfeld-residualsAnnotation
Grambsch and Therneau introduce the scaled Schoenfeld residual test for the proportional hazards assumption. Plotting scaled Schoenfeld residuals against time reveals time-varying effects. The test is the standard diagnostic in applied survival analysis.
- 0820
Grant, A. M. (2008). The Significance of Task Significance: Job Performance Effects, Relational Mechanisms, and Boundary Conditions. Journal of Applied Psychology, 93(1), 108–124.
doi.org/10.1037/0021-9010.93.1.108
Applicationon experimental designtask-significancemotivationorganizational-behaviorfield-experimentAnnotation
Grant conducts field experiments showing that briefly exposing workers to the beneficiaries of their work significantly increased their motivation and performance. This paper is a well-known example of experimental design applied within organizational behavior research.
- 0320
Greve, H. R. (2003). A Behavioral Theory of R&D Expenditures and Innovations: Evidence from Shipbuilding. Academy of Management Journal, 46(6), 685–702.
behavioral-theoryaspiration-levelsinnovationR&Dnegative-binomial+1Annotation
Greve tests behavioral theory predictions about how performance relative to aspiration levels affects R&D investment and innovation output using count models in the Japanese shipbuilding industry. He finds that low performance triggers problemistic search (increasing R&D), high slack triggers slack search (also increasing R&D), and low performance increases risk tolerance for launching innovations. The paper demonstrates how to model count-based innovation outcomes with firm-level panel data in a management context.
- 7719
Griliches, Z. (1977). Estimating the Returns to Schooling: Some Econometric Problems. Econometrica, 45(1), 1–22.
Foundationalon ols regressionability-biasreturns-to-educationomitted-variablesAnnotation
Griliches systematically examines the biases in OLS estimates of returns to schooling, including ability bias and measurement error. This paper is a classic illustration of why researchers must think carefully about omitted variables when interpreting OLS coefficients causally.
- 9019
Griliches, Z. (1990). Patent Statistics as Economic Indicators: A Survey. Journal of Economic Literature, 28(4), 1661–1707.
Surveyon poisson negative binomialpatentsinnovationeconomic-indicatorsAnnotation
Griliches surveys the use of patent data as economic indicators, establishing patent counts as a key measure of innovative output. This survey motivates much of the subsequent applied work using Poisson and negative binomial models to study innovation.
- 9419
Gruber, J. (1994). The Incidence of Mandated Maternity Benefits. American Economic Review, 84(3), 622–641.
Applicationon difference in differencesmaternity-benefitslabor-economicspolicy-evaluationAnnotation
Gruber uses a DID design exploiting variation in state-level mandated maternity benefits to show that the costs of these benefits are shifted to workers in the form of lower wages. This study is a classic example of how DID can exploit policy variation across states and time.
- 9819
Hahn, J. (1998). On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects. Econometrica, 66(2), 315–331.
Foundationalon doubly robust estimationsemiparametric-efficiencypropensity-scoreefficiency-boundAnnotation
Hahn derives the semiparametric efficiency bound for estimating average treatment effects and shows that knowledge of the propensity score does not improve the bound—it is ancillary for ATE. The efficient estimators take the form of sample averages completed by nonparametric imputation. This paper is foundational for understanding efficient semiparametric estimation of treatment effects.
- 0120
Hahn, J., Todd, P., & Van der Klaauw, W. (2001). Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design. Econometrica, 69(1), 201–209.
doi.org/10.1111/1468-0262.00183
Foundationalon regression discontinuity fuzzyidentificationnonparametricWald-estimatorAnnotation
Hahn, Todd, and Van der Klaauw provide the formal econometric framework for both sharp and fuzzy regression discontinuity designs. For the fuzzy case, they show that the treatment effect can be identified as the ratio of the discontinuity in the outcome to the discontinuity in the treatment probability, analogous to a Wald estimator.
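A stylized version of the Wald-type ratio on simulated data; the conditional means are flat near the cutoff here, so simple window means suffice, whereas a real application would use local linear fits and a data-driven bandwidth. All parameter values are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000
r = rng.uniform(-1, 1, n)                    # running variable, cutoff at 0
z = r >= 0                                   # crossing the cutoff
d = rng.binomial(1, np.where(z, 0.7, 0.2))   # imperfect compliance with assignment
y = 1.0 + 1.5 * d + rng.normal(0, 0.5, n)    # true treatment effect = 1.5

h = 0.2
win = np.abs(r) < h
num = y[win & z].mean() - y[win & ~z].mean()   # discontinuity in the outcome
den = d[win & z].mean() - d[win & ~z].mean()   # discontinuity in treatment prob.
fuzzy_rd = num / den                           # Wald-type ratio
```

Dividing the outcome jump by the jump in the treatment probability scales up the intent-to-treat discontinuity into the effect of treatment itself, exactly as in a Wald IV estimator.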
- 1220
Hainmueller, J. (2012). Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies. Political Analysis, 20(1), 25–46.
Foundationalon matching methodsentropy-balancingreweightingcovariate-balanceobservational-studiesAnnotation
Hainmueller introduces entropy balancing, a reweighting scheme that directly targets covariate balance by finding weights that satisfy pre-specified balance constraints while remaining as close to uniform as possible. Entropy balancing has become a popular alternative to propensity score matching because it achieves exact balance on specified moments by construction.
- 0320
Hamilton, B. H., & Nickerson, J. A. (2003). Correcting for Endogeneity in Strategic Management Research. Strategic Organization, 1(1), 51–78.
doi.org/10.1177/1476127003001001218
endogeneitystrategyself-selectionAnnotation
Hamilton and Nickerson warn strategy researchers that naive OLS estimates of the strategy-performance relationship are often biased by endogeneity, because firms that adopt a strategy differ systematically from those that do not. They provide an accessible tutorial on endogeneity and point toward solutions including instrumental variables and Heckman selection models. The paper remains a key reference for understanding why strategic management research requires identification strategies beyond simple regression.
- 0420
Harrison, G. W., & List, J. A. (2004). Field Experiments. Journal of Economic Literature, 42(4), 1009–1055.
doi.org/10.1257/0022051043004577
Foundationalon experimental design, randomization inferencefield-experimentstaxonomyexternal-validityAnnotation
Harrison and List provide an influential taxonomy of field experiments, distinguishing artefactual, framed, and natural field experiments from conventional lab experiments. The paper helps establish field experiments as a mainstream methodology in economics.
- 1620
Haushofer, J., & Shapiro, J. (2016). The Short-Term Impact of Unconditional Cash Transfers to the Poor: Experimental Evidence from Kenya. Quarterly Journal of Economics, 131(4), 1973–2042.
Applicationon multiple testingcash-transfersRCTFDRdevelopment-economicsAnnotation
Haushofer and Shapiro evaluate GiveDirectly's unconditional cash transfer program in Kenya, testing effects across many outcome domains including consumption, assets, food security, health, and psychological well-being. They apply FWER corrections with bootstrapped p-values across outcome families, providing a model for how to handle multiple testing transparently in large-scale randomized evaluations. A 2017 erratum (QJE 132(4): 2057–2060) corrected the FWER-adjusted p-values in Tables I and II, which had used insufficient bootstrap iterations.
- 1978
Hausman, J. A. (1978). Specification Tests in Econometrics. Econometrica, 46(6), 1251–1271.
Foundational on fixed effects, random effects. Tags: Hausman-test, specification-test, fixed-vs-random. Annotation:
Hausman develops a general framework for specification testing based on comparing two estimators: one consistent under a broad set of assumptions and one efficient under a narrower null hypothesis. The test's most well-known application compares fixed effects (consistent if unit effects are correlated with regressors) against random effects (efficient under the null of no correlation), but the framework applies broadly to IV, simultaneous equations, and time-series cross-section models. The test statistic has a chi-squared distribution under the null and remains one of the most widely used diagnostic tools in applied econometrics.
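The comparison at the heart of the test can be sketched for a single coefficient; every number below is hypothetical, for illustration only (in practice the estimates and variances come from fixed-effects and random-effects fits of the same panel model):

```python
# Hedged sketch of the Hausman comparison for a single coefficient.
b_fe, v_fe = 0.52, 0.0040   # consistent even if effects correlate with regressors
b_re, v_re = 0.61, 0.0025   # efficient only under the null of no correlation

# H = (b_FE - b_RE)^2 / (Var_FE - Var_RE), chi-squared(1) under the null
H = (b_fe - b_re) ** 2 / (v_fe - v_re)
reject_random_effects = H > 3.841   # 5% critical value of chi-squared(1)
```

A large H says the efficient estimator is drifting away from the consistent one, evidence against the narrower null (here, against random effects).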
- 1981
Hausman, J. A., & Taylor, W. E. (1981). Panel Data and Unobservable Individual Effects. Econometrica, 49(6), 1377–1398.
Foundational on random effects. Tags: Hausman-Taylor, time-invariant-variables, panel-data, instrumental-variables. Annotation:
Hausman and Taylor develop an instrumental variables estimator for panel data that allows consistent estimation of coefficients on time-invariant variables even when individual effects are correlated with some regressors. The Hausman-Taylor estimator occupies a middle ground between fixed effects (which cannot estimate time-invariant coefficients) and random effects (which requires strict exogeneity).
- 1984
Hausman, J., Hall, B. H., & Griliches, Z. (1984). Econometric Models for Count Data with an Application to the Patents–R&D Relationship. Econometrica, 52(4), 909–938.
Foundational on poisson negative binomial. Tags: count-data, patents, R&D, panel-data. Annotation:
Hausman, Hall, and Griliches develop the econometric framework for Poisson and negative binomial regression models applied to count data, using the relationship between R&D spending and patent counts as the motivating application. The paper is a classic early econometric treatment of count-data models in panel settings.
- 1984
Hausman, J., & McFadden, D. (1984). Specification Tests for the Multinomial Logit Model. Econometrica, 52(5), 1219–1240.
Foundational on logit probit. Tags: IIA, specification-test, multinomial-logit. Annotation:
Hausman and McFadden develop a specification test for the independence of irrelevant alternatives (IIA) assumption in multinomial logit. The test allows researchers to assess whether the logit model's restrictive substitution patterns are appropriate for their data, which is critical for applied work with multiple choice categories.
- 2019
Haven, T. L., & Van Grootel, L. (2019). Preregistering Qualitative Research. Accountability in Research, 26(3), 229–244.
doi.org/10.1080/08989621.2019.1580147
Survey on pre registration. Tags: qualitative-research, pre-registration, extension. Annotation:
Haven and Van Grootel explore extending pre-registration to qualitative research, discussing what elements of qualitative studies can and should be pre-registered. This paper broadens the pre-registration conversation beyond quantitative experimental designs.
- 1979
Heckman, J. J. (1979). Sample Selection Bias as a Specification Error. Econometrica, 47(1), 153–161.
Foundational on heckman selection model, lee bounds. Tags: foundational, selection-bias, inverse-mills-ratio. Annotation:
Heckman introduces the two-step estimator for correcting sample selection bias using the inverse Mills ratio. The paper shows that selection bias can be treated as an omitted variable problem, where the omitted variable is the conditional expectation of the error term given selection. One of the most cited papers in econometrics.
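The step-2 correction term can be sketched as follows, assuming a probit first stage has already produced a selection index z for each selected observation (a minimal illustration, not the full two-step estimator):

```python
import math

def phi(z):
    # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def inverse_mills(z):
    # lambda(z) = phi(z) / Phi(z); step 2 adds this as a regressor in the
    # outcome OLS, and its coefficient absorbs E[error | selected]
    return phi(z) / Phi(z)
```

If the coefficient on the inverse Mills ratio is significant, selection bias was present and the correction matters.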
- 1997
Heckman, J. J., Ichimura, H., & Todd, P. E. (1997). Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme. Review of Economic Studies, 64(4), 605–654.
Foundational on matching methods. Tags: matching-estimator, common-support, program-evaluation. Annotation:
Heckman, Ichimura, and Todd develop the econometric theory behind matching estimators, including conditions for identification and the importance of common support. They apply these methods to evaluate job training programs and show when matching works well and when it does not.
- 2005
Heckman, J. J., & Vytlacil, E. (2005). Structural Equations, Treatment Effects, and Econometric Policy Evaluation. Econometrica, 73(3), 669–738.
doi.org/10.1111/j.1468-0262.2005.00594.x
Foundational on marginal treatment effects. Tags: MTE, treatment-effects, LATE, ATE, ATT. Annotation:
Heckman and Vytlacil use the marginal treatment effect (MTE) to connect the treatment-effects literature with structural econometric policy evaluation. A central result is that commonly used treatment-effect parameters (ATE, ATT, LATE, PRTE) can be expressed as weighted averages of the MTE curve, with each estimand using a different weight function. The framework shows how IV estimates with different instruments recover different weighted averages of the same underlying MTE, providing the theoretical foundation for understanding instrument-dependent variation in treatment-effect estimates.
- 2006
Henderson, A. D., Miller, D., & Hambrick, D. C. (2006). How Quickly Do CEOs Become Obsolete? Industry Dynamism, CEO Tenure, and Company Performance. Strategic Management Journal, 27(5), 447–460.
Tags: CEO-tenure, firm-performance, industry-dynamism. Annotation:
In this longitudinal strategy study, Henderson, Miller, and Hambrick examine how CEO tenure affects performance in dynamic versus stable industries. In the stable food industry, performance improved steadily with tenure, declining only after 10-15 years; in the dynamic computer industry, performance declined steadily from the start. The paper demonstrates that the relationship between CEO tenure and performance is contingent on industry dynamism.
- 2017
Heß, S. (2017). Randomization Inference with Stata: A Guide and Software. Stata Journal, 17(3), 630–651.
doi.org/10.1177/1536867X1701700306
Foundational on randomization inference. Tags: Stata, software, implementation. Annotation:
Heß develops the ritest Stata command for randomization inference, providing a practical tool for conducting permutation tests under arbitrary randomization procedures in experimental and quasi-experimental settings. The command accommodates stratified, clustered, and blocked randomization designs, and produces exact finite-sample p-values without distributional assumptions. The paper serves as both a software introduction and a practical guide to randomization inference for applied researchers.
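ritest itself is Stata software, but the core logic it implements can be sketched as a plain permutation test under complete randomization (the command additionally supports stratified, clustered, and blocked designs); data below are illustrative:

```python
import random

def permutation_pvalue(y, d, reps=2000, seed=0):
    # Two-sided randomization-inference p-value for a difference in means
    # under complete randomization: re-randomize the labels, recompute the
    # statistic, and count draws at least as extreme as the observed one.
    rng = random.Random(seed)

    def diff_in_means(assign):
        treated = [yi for yi, di in zip(y, assign) if di]
        control = [yi for yi, di in zip(y, assign) if not di]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = abs(diff_in_means(d))
    extreme = 0
    for _ in range(reps):
        perm = d[:]
        rng.shuffle(perm)   # one re-randomization of treatment labels
        extreme += abs(diff_in_means(perm)) >= observed
    return extreme / reps

# Outcomes separate cleanly by treatment, so the p-value should be small:
p = permutation_pvalue([5, 6, 7, 8, 1, 2, 3, 4], [1, 1, 1, 1, 0, 0, 0, 0])
```

Because the reference distribution is generated by the design itself, the resulting p-value is exact in finite samples and needs no distributional assumptions.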
- 2007
Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3), 199–236.
Foundational on matching methods. Tags: preprocessing, model-dependence, nonparametric, causal-inference. Annotation:
Ho, Imai, King, and Stuart argue that matching should be used as a preprocessing step before parametric modeling, reducing model dependence and improving robustness of causal estimates. This influential paper reframed matching not as a standalone estimator but as a way to make subsequent parametric analyses less sensitive to specification choices.
- 2001
Hoenig, J. M., & Heisey, D. M. (2001). The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis. The American Statistician, 55(1), 19–24.
doi.org/10.1198/000313001300339897
Foundational on power analysis. Tags: post-hoc-power, statistical-fallacy, power-calculations. Annotation:
Hoenig and Heisey demonstrate that post hoc (observed) power calculations are fundamentally flawed because they are a monotone function of the p-value and add no information beyond the test result itself. This paper is essential reading for understanding why power analysis must be conducted before data collection.
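The monotone relationship can be verified directly for a two-sided z-test; the CDF helper, the 1.96 critical value, and the z values below are standard choices rather than anything taken from the paper:

```python
import math

def Phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def two_sided_p(z):
    return 2 * (1 - Phi(abs(z)))

def observed_power(z, crit=1.96):
    # "Post hoc power": power computed as if the true effect equaled the
    # observed estimate; note it depends on the data only through |z|.
    return 1 - Phi(crit - abs(z)) + Phi(-crit - abs(z))

zs = [1.0, 1.5, 2.0, 2.5]
pvals = [two_sided_p(z) for z in zs]
powers = [observed_power(z) for z in zs]
# As |z| grows, p falls and "observed power" rises in lockstep:
# one is a monotone function of the other, so it adds no information.
```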
- 2007
Hoetker, G. (2007). The Use of Logit and Probit Models in Strategic Management Research: Critical Issues. Strategic Management Journal, 28(4), 331–343.
Tags: strategy-research, methodology, coefficient-comparison. Annotation:
Hoetker reviews how strategy researchers use logit and probit models and identifies common pitfalls, including misinterpretation of coefficients across groups and incorrect use of interaction terms. This paper provides concrete guidance for improving practice in management journals.
- 1997
Hofmann, D. A. (1997). An Overview of the Logic and Rationale of Hierarchical Linear Models. Journal of Management, 23(6), 723–744.
doi.org/10.1177/014920639702300602
Tags: HLM, management-methodology, multilevel. Annotation:
Hofmann introduces hierarchical linear models to the management research community, explaining when and why multilevel random-effects models are appropriate for organizational data with nested structures. This tutorial is highly influential in promoting multilevel methods in management journals.
- 1986
Holland, P. W. (1986). Statistics and Causal Inference. Journal of the American Statistical Association, 81(396), 945–960.
doi.org/10.1080/01621459.1986.10478354
Foundational on ols regression. Tags: causal-inference, potential-outcomes, Rubin-causal-model, fundamental-problem. Annotation:
Holland articulates the fundamental problem of causal inference—that we can never observe both potential outcomes for the same unit—and formalizes the Rubin Causal Model framework. His dictum 'no causation without manipulation' shapes how a generation of researchers thinks about the conditions under which statistical associations can be given causal interpretations.
- 2017
Hollenbeck, J. R., & Wright, P. M. (2017). Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data. Journal of Management, 43(1), 5–18.
doi.org/10.1177/0149206316679487
Tags: HARKing, post-hoc-analysis, management-methodology. Annotation:
Hollenbeck and Wright introduce the concept of 'Tharking' (Transparently Hypothesizing After Results Are Known), arguing that post hoc analysis of scientific data is valuable when conducted and reported transparently. They distinguish destructive HARKing from constructive post hoc exploration, making the case that management researchers should embrace exploratory analysis in discussion sections rather than disguising it as confirmatory.
- 2017
Hoogendoorn, S., Parker, S. C., & van Praag, M. (2017). Smart or Diverse Start-up Teams? Evidence from a Field Experiment. Organization Science, 28(6), 1010–1028.
doi.org/10.1287/orsc.2017.1158
Tags: field-experiment, team-diversity, entrepreneurship, performance. Annotation:
Hoogendoorn, Parker, and van Praag conduct a field experiment with 573 students randomly assigned to 49 startup teams that varied in cognitive ability dispersion. They find an inverted U-shaped relationship between ability dispersion and team performance: teams with moderate dispersion outperform both homogeneous and highly dispersed teams. Random assignment to teams makes ability composition exogenous, providing clean experimental identification of the effect of team cognitive diversity on venture performance.
- 2000
Horowitz, J. L., & Manski, C. F. (2000). Nonparametric Analysis of Randomized Experiments with Missing Covariate and Outcome Data. Journal of the American Statistical Association, 95(449), 77–84.
doi.org/10.1080/01621459.2000.10473902
Foundational on lee bounds. Tags: missing-data, nonparametric-bounds, randomized-experiments. Annotation:
Horowitz and Manski extend the bounding approach to experiments with missing data on both covariates and outcomes. They show how to construct valid bounds under different assumptions about the missing data mechanism, providing a principled alternative to complete-case analysis and imputation.
- 2024
Hurst, R., Lee, S., & Frake, J. (2024). The Effect of Flatter Hierarchy on Applicant Pool Gender Diversity: Evidence from Experiments. Strategic Management Journal, 45(8), 1446–1484.
Tags: reverse-audit-study, field-experiment, gender, hierarchy, recruitment. Annotation:
Hurst, Lee, and Frake conduct a reverse audit study in partnership with a U.S. healthcare startup, sending recruitment emails to approximately 8,400 job seekers with randomly varied descriptions of the firm's organizational hierarchy. Featuring a flatter hierarchy did not significantly affect applicant pool size but significantly decreased women's representation, because women perceived flatter structures as offering fewer career advancement opportunities and greater workload burdens.
- 1995
Huselid, M. A. (1995). The Impact of Human Resource Management Practices on Turnover, Productivity, and Corporate Financial Performance. Academy of Management Journal, 38(3), 635–672.
Tags: human-resource-management, firm-performance, strategic-HRM. Annotation:
Huselid uses OLS (and related cross-sectional methods) to estimate the relationship between HR practices and firm performance in this influential management study. It helps launch the field of strategic HRM and illustrates both the power and limitations of regression-based approaches in management research.
- 2012
Iacus, S. M., King, G., & Porro, G. (2012). Causal Inference without Balance Checking: Coarsened Exact Matching. Political Analysis, 20(1), 1–24.
Foundational on matching methods. Tags: coarsened-exact-matching, CEM, balance. Annotation:
Iacus, King, and Porro introduce Coarsened Exact Matching (CEM), which coarsens covariates into bins and then performs exact matching within those bins. CEM avoids many pitfalls of propensity score matching, such as the need to check balance iteratively, and gives the researcher direct control over the matching quality.
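The coarsen-then-exact-match idea can be sketched in a few lines; the data, covariates, and binning rules below are all hypothetical:

```python
from collections import defaultdict

def cem_strata(units, binners):
    # units: tuples of (treated_flag, covariate_1, covariate_2, ...).
    # Coarsen each covariate with its binning rule, then exact-match on the
    # resulting stratum signature.
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for treated, *covs in units:
        key = tuple(b(x) for b, x in zip(binners, covs))
        strata[key]["treated" if treated else "control"].append(covs)
    # prune strata lacking common support (no treated or no control unit)
    return {k: v for k, v in strata.items() if v["treated"] and v["control"]}

binners = [lambda age: age // 10,          # decade-of-age bins
           lambda yrs_edu: yrs_edu >= 12]  # high-school indicator
units = [(1, 25, 16), (0, 27, 14), (1, 41, 10), (0, 63, 8)]
matched = cem_strata(units, binners)   # only the twenties/high-school stratum survives
```

Balance within strata is guaranteed up to the coarsening, which is why no iterative balance checking is needed.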
- 2010
Imai, K., Keele, L., & Tingley, D. (2010). A General Approach to Causal Mediation Analysis. Psychological Methods, 15(4), 309–334.
Foundational on causal mediation analysis. Tags: potential-outcomes, sequential-ignorability, sensitivity-analysis. Annotation:
Imai, Keele, and Tingley develop a general framework for causal mediation analysis grounded in the potential outcomes framework. They clarify the assumptions needed for identifying causal mediation effects, particularly the sequential ignorability assumption, and provide sensitivity analyses for violations.
- 2019
Imai, K., & Kim, I. S. (2019). When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data? American Journal of Political Science, 63(2), 467–490.
Foundational on fixed effects. Tags: causal-inference, longitudinal-data, treatment-history, assumptions. Annotation:
Imai and Kim provide a modern causal-inference framework for understanding when unit fixed effects regression yields unbiased estimates with longitudinal data. They clarify the often-implicit assumptions about treatment history and carryover effects, offering a more rigorous foundation for applied fixed effects analysis.
- 1994
Imbens, G. W., & Angrist, J. D. (1994). Identification and Estimation of Local Average Treatment Effects. Econometrica, 62(2), 467–475.
Foundational on instrumental variables. Tags: LATE, compliers, monotonicity, identification. Annotation:
In this foundational paper on LATE, Imbens and Angrist show that, under the monotonicity assumption, IV identifies the average causal effect for compliers, the subpopulation whose treatment status is changed by the instrument. This reinterpretation fundamentally changes how researchers understand what IV estimates.
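For a binary instrument and binary treatment, the LATE logic reduces to the Wald estimator; the cell means below are purely illustrative:

```python
# Wald/LATE sketch with hypothetical cell means from a binary instrument Z.
ey_z1, ey_z0 = 12.0, 10.0   # mean outcome when Z = 1 vs Z = 0
ed_z1, ed_z0 = 0.7, 0.3     # treatment take-up rate when Z = 1 vs Z = 0

# The intention-to-treat effect scaled by the first stage gives the effect
# for compliers only, under monotonicity (no defiers):
late = (ey_z1 - ey_z0) / (ed_z1 - ed_z0)
```

Here the instrument moves take-up by 40 percentage points and the outcome by 2 units, so the complier effect is 5 units per unit of treatment.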
- 2004
Imbens, G. W. (2004). Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review. Review of Economics and Statistics, 86(1), 4–29.
doi.org/10.1162/003465304323023651
Survey on matching methods. Tags: average-treatment-effect, unconfoundedness, nonparametric, survey. Annotation:
Imbens provides a comprehensive review of nonparametric methods for estimating average treatment effects under the unconfoundedness assumption, covering matching, weighting, and subclassification estimators. This survey unifies the theoretical foundations of matching methods and clarifies the connections between different estimators used in program evaluation.
- 2004
Imbens, G. W., & Manski, C. F. (2004). Confidence Intervals for Partially Identified Parameters. Econometrica, 72(6), 1845–1857.
doi.org/10.1111/j.1468-0262.2004.00555.x
Foundational on lee bounds. Tags: partial-identification, confidence-intervals, bounds, inference. Annotation:
Imbens and Manski develop methods for constructing valid confidence intervals when parameters are only partially identified—that is, when the data and assumptions narrow the parameter to a set rather than a point. This paper provides the inferential foundation for reporting uncertainty around bounds estimates, including Lee bounds.
- 2008
Imbens, G. W., & Lemieux, T. (2008). Regression Discontinuity Designs: A Guide to Practice. Journal of Econometrics, 142(2), 615–635.
doi.org/10.1016/j.jeconom.2007.05.001
Foundational on regression discontinuity fuzzy, regression discontinuity sharp. Tags: practical-guide, bandwidth-selection, local-IV. Annotation:
Imbens and Lemieux provide a comprehensive practical guide to implementing RDD, covering bandwidth selection, functional form, and graphical analysis. Their treatment of fuzzy RDD as a local IV estimator clarifies the interpretation and implementation for applied researchers.
- 2012
Imbens, G., & Kalyanaraman, K. (2012). Optimal Bandwidth Choice for the Regression Discontinuity Estimator. Review of Economic Studies, 79(3), 933–959.
Foundational on regression discontinuity fuzzy. Tags: bandwidth-selection, local-linear, optimal-bandwidth. Annotation:
Imbens and Kalyanaraman derive the asymptotically optimal bandwidth for the local linear regression discontinuity estimator and propose a simple data-driven bandwidth selector. The IK bandwidth becomes the standard choice before the Calonico-Cattaneo-Titiunik (2014) refinement.
- 2015
Imbens, G. W. (2015). Matching Methods in Practice: Three Examples. Journal of Human Resources, 50(2), 373–419.
Application on matching methods. Tags: practical-guide, propensity-score, balance, sensitivity-analysis. Annotation:
Imbens demonstrates how to implement matching methods in practice through three detailed empirical examples, covering propensity score estimation, covariate balance assessment, overlap and trimming, and robustness to alternative estimators. This paper is an invaluable practical guide that bridges the gap between matching theory and applied research.
- 2015
Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press.
doi.org/10.1017/CBO9781139025751
Survey on matching methods, randomization inference. Tags: causal-inference, potential-outcomes, propensity-score, textbook. Annotation:
Imbens and Rubin provide a comprehensive textbook grounding causal inference in the potential outcomes framework, with detailed treatment of matching, propensity scores, and subclassification. They provide rigorous foundations for selection-on-observables methods.
- 2017
Ioannidis, J. P. A., Stanley, T. D., & Doucouliagos, H. (2017). The Power of Bias in Economics Research. Economic Journal, 127(605), F236–F265.
Survey on power analysis. Tags: underpowered-studies, publication-bias, meta-science. Annotation:
Ioannidis, Stanley, and Doucouliagos conduct a large-scale assessment of statistical power in economics research and find that the median power to detect typical effect sizes is only 18%. They document widespread underpowering and publication bias, highlighting the importance of ex ante power analysis.
- 1995
Islam, N. (1995). Growth Empirics: A Panel Data Approach. Quarterly Journal of Economics, 110(4), 1127–1170.
Application on random effects. Tags: growth-empirics, convergence, cross-country, panel-data. Annotation:
Islam applies panel data methods—including random effects and fixed effects—to the cross-country growth regression framework, showing that accounting for unobserved country heterogeneity substantially changes estimates of convergence rates. This paper demonstrates the importance of choosing between fixed and random effects in macroeconomic growth empirics.
- 2018
Jaeger, D. A., Ruist, J., & Stuhler, J. (2018). Shift-Share Instruments and the Impact of Immigration. NBER Working Paper No. 24285.
Foundational on shift share instruments. Tags: immigration, serial-correlation, exclusion-restriction. Annotation:
Jaeger, Ruist, and Stuhler highlight a threat to shift-share instruments in immigration research: serial correlation in immigrant inflows can bias estimates if past immigration affects current outcomes through channels other than current immigration. This paper raises important concerns about the exclusion restriction.
- 2024
Jia, N., Luo, X., Fang, Z., & Liao, C. (2024). When and How Artificial Intelligence Augments Employee Creativity. Academy of Management Journal, 67(1), 5–32.
Tags: field-experiment, RCT, artificial-intelligence, creativity, double-randomization. Annotation:
Jia, Luo, Fang, and Liao conduct a field experiment examining how AI assistance affects creative work through a sequential division of labor. They find that AI augmentation improves average output quality but reduces the novelty of top-performing work, with effects moderated by employee skill level. The paper provides causal evidence on the productivity implications of human-AI collaboration in knowledge work.
- 2007
Kang, J. D. Y., & Schafer, J. L. (2007). Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data. Statistical Science, 22(4), 523–539.
Survey on doubly robust estimation. Tags: model-misspecification, simulation, critical-assessment. Annotation:
Kang and Schafer show through simulations that doubly robust estimators can perform poorly when both models are moderately misspecified, even though they remain consistent when one model is correct. This influential paper tempers enthusiasm and motivates further methodological work on practical performance.
- 2016
Kang, S. K., DeCelles, K. A., Tilcsik, A., & Jun, S. (2016). Whitened Résumés: Race and Self-Presentation in the Labor Market. Administrative Science Quarterly, 61(3), 469–502.
doi.org/10.1177/0001839216639577
Tags: audit-study, discrimination, hiring, race, résumés. Annotation:
Kang and colleagues conduct a résumé audit study sending fictitious applications to real employers, finding that minority applicants who 'whitened' their résumés received significantly more callbacks. The study combines a correspondence experiment with qualitative interviews, providing a powerful example of how audit studies can identify discrimination in hiring.
- 1958
Kaplan, E. L., & Meier, P. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53(282), 457–481.
doi.org/10.1080/01621459.1958.10501452
Foundational on cox proportional hazard. Tags: foundational, nonparametric, survival-function. Annotation:
Kaplan and Meier introduce the product-limit estimator (Kaplan-Meier estimator) for the survival function from right-censored data. The KM curve is the standard nonparametric tool for visualizing survival and comparing groups before fitting regression models.
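The product-limit estimator is simple enough to compute directly; a minimal sketch with hypothetical censored data:

```python
def kaplan_meier(times, events):
    # Product-limit estimator: S(t) is the running product of (1 - d_i / n_i)
    # over distinct event times t_i, with d_i events at t_i and n_i subjects
    # still at risk just before t_i.
    # events: 1 = event observed at that time, 0 = right-censored.
    survival, curve = 1.0, []
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        n = sum(1 for ti in times if ti >= t)
        survival *= 1 - d / n
        curve.append((t, survival))
    return curve

# Hypothetical data: five subjects, two of them right-censored (event = 0).
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk sets up to their censoring time but never to the event counts, which is exactly how the estimator uses incomplete observations.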
- 2002
Katila, R., & Ahuja, G. (2002). Something Old, Something New: A Longitudinal Study of Search Behavior and New Product Introduction. Academy of Management Journal, 45(6), 1183–1194.
Tags: knowledge-search, new-products, innovation-management. Annotation:
Katila and Ahuja use negative binomial models to study how the depth and scope of a firm's knowledge search affect new product introductions. This paper is a widely cited application of count data models in the strategic management and innovation literature.
- 2022
Kaul, A., Klossner, S., Pfeifer, G., & Schieler, M. (2022). Standard Synthetic Control Methods: The Case of Using All Preintervention Outcomes Together With Covariates. Journal of Business & Economic Statistics, 40(3), 1362–1376.
doi.org/10.1080/07350015.2021.1930012
Application on synthetic control. Tags: synthetic-control, matching-pitfalls, pre-treatment-outcomes. Annotation:
Kaul et al. show that using all pre-treatment outcome lags as predictors in synthetic control (a form of matching for aggregate units) renders other covariates irrelevant, threatening unbiasedness. Their finding highlights pitfalls when matching on pre-treatment outcomes and is relevant for understanding matching assumptions more broadly.
- 2001
King, G., & Zeng, L. (2001). Logistic Regression in Rare Events Data. Political Analysis, 9(2), 137–163.
doi.org/10.1093/oxfordjournals.pan.a004868
Foundational on logit probit. Tags: rare-events, logistic-regression, binary-outcomes, methodology. Annotation:
King and Zeng develop a correction for logistic regression when the outcome event is rare. Standard logit underestimates the probability of rare events; their rare-events logit (relogit) applies a correction based on prior information about the event rate in the population. Essential reference for binary outcome studies with highly imbalanced classes.
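One of their fixes, a case-control style prior correction for the intercept, can be sketched as follows; every number here is hypothetical, and the paper's additional finite-sample bias correction is a separate step:

```python
import math

# Prior correction for the logit intercept when events are oversampled.
tau, ybar = 0.01, 0.30   # assumed population event rate vs sample event share
b0_sample = -0.85        # hypothetical intercept from a logit fit on the sample

# Shift the intercept back to the population event rate; in this
# case-control style correction the slope coefficients are unchanged.
b0_pop = b0_sample - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))
```

The corrected intercept is far more negative, reflecting how rare the event actually is in the population.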
- 2015
King, G., & Roberts, M. E. (2015). How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It. Political Analysis, 23(2), 159–179.
Survey on ols regression. Tags: robust-standard-errors, model-specification, methodology. Annotation:
King and Roberts argue that researchers often use robust standard errors as a band-aid rather than fixing the underlying model specification. They provide practical guidance on when robust SEs are appropriate and when the model itself needs to be reconsidered.
- 2019
King, G., & Nielsen, R. (2019). Why Propensity Scores Should Not Be Used for Matching. Political Analysis, 27(4), 435–454.
Survey on matching methods. Tags: propensity-score-critique, balance, model-dependence. Annotation:
King and Nielsen argue that propensity score matching can increase imbalance, model dependence, and bias relative to other matching methods. This provocative paper has influenced a shift toward alternatives like CEM and Mahalanobis distance matching in applied research.
- 2013
Kleven, H. J., & Waseem, M. (2013). Using Notches to Uncover Optimization Frictions and Structural Elasticities: Theory and Evidence from Pakistan. Quarterly Journal of Economics, 128(2), 669–723.
Foundational on bunching estimation. Tags: notch, optimization-frictions, structural-estimation, Pakistan. Annotation:
Kleven and Waseem extend bunching estimation from kinks to notches: discrete jumps in the tax schedule where the average tax rate changes discontinuously. They develop a structural framework that distinguishes between frictionless and frictional bunching, showing that optimization frictions attenuate observed bunching and cause the naive estimator to understate the true elasticity. Their model identifies both the structural elasticity and the friction distribution from the observed bunching pattern. Applied to Pakistan's income tax notches, they demonstrate that frictions are empirically important and that ignoring them substantially biases elasticity estimates downward.
- 2016
Kleven, H. J. (2016). Bunching. Annual Review of Economics, 8, 435–464.
doi.org/10.1146/annurev-economics-080315-015234
Survey on bunching estimation. Tags: survey, kink, notch, frictions, methodology. Annotation:
Kleven provides a comprehensive survey of the bunching methodology, covering both kink and notch designs, the role of optimization frictions, and extensions to multiple applications beyond taxation. The survey unifies the theoretical frameworks from Saez (2010) and Kleven and Waseem (2013), discusses practical implementation issues (polynomial order, bandwidth, bin width), and catalogs the growing literature applying bunching to estimate behavioral elasticities in public finance, labor economics, and regulation. Essential reading for anyone starting with bunching methods.
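A toy version of the excess-mass calculation behind bunching designs, assuming the counterfactual density is well approximated by a line fit to bins outside the bunching window (real applications use higher-order polynomials and data-driven window choices); the histogram below is fabricated for illustration:

```python
def excess_mass(bin_mids, counts, window):
    # Fit a linear counterfactual to bins outside [lo, hi], then sum the gap
    # between observed and predicted counts inside the window.
    lo, hi = window
    outside = [(b, c) for b, c in zip(bin_mids, counts) if not lo <= b <= hi]
    xs = [b for b, _ in outside]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(c for _, c in outside) / n
    slope = (sum((b - xbar) * (c - ybar) for b, c in outside)
             / sum((b - xbar) ** 2 for b in xs))
    intercept = ybar - slope * xbar
    return sum(c - (intercept + slope * b)
               for b, c in zip(bin_mids, counts) if lo <= b <= hi)

# Counts fall linearly in the bin midpoint, except for a spike at the
# kink bin (5), which holds the bunchers.
bins = list(range(10))
counts = [100, 95, 90, 85, 80, 115, 70, 65, 60, 55]
b_hat = excess_mass(bins, counts, window=(5, 5))
```

The excess mass b_hat, scaled by the counterfactual density at the kink and the size of the tax change, is what maps into an elasticity in the Saez (2010) framework.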
- 2016
Kline, P., & Walters, C. R. (2016). Evaluating Public Programs with Close Substitutes: The Case of Head Start. Quarterly Journal of Economics, 131(4), 1795–1848.
Application on lee bounds. Tags: Head-Start, program-evaluation, substitution. Annotation:
Kline and Walters develop a semi-parametric selection model to evaluate Head Start in the presence of close substitute preschool programs, estimating both average and marginal treatment effects. They find that Head Start's effects vary substantially with the quality of available alternatives, and that the program passes a cost-benefit test for the average participant. The paper demonstrates how accounting for alternative program availability changes the interpretation of experimental treatment effects.
- 2021
Knaus, M. C., Lechner, M., & Strittmatter, A. (2021). Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence. Econometrics Journal, 24(1), 134–161.
Application on double debiased machine learning. Tags: labor-market-policy, heterogeneous-effects, empirical-Monte-Carlo. Annotation:
Knaus, Lechner, and Strittmatter conduct an empirical Monte Carlo study benchmarking eleven causal machine learning estimators for heterogeneous treatment effects across 24 data-generating processes based on real labor market data. They find that no single estimator dominates across all settings, and that ensemble methods combining multiple learners perform well overall. The study provides practical guidance on when different CATE estimators (causal forests, DML-based methods, meta-learners) are most reliable.
- 1978
Koenker, R., & Bassett, G., Jr. (1978). Regression Quantiles. Econometrica, 46(1), 33–50.
Foundational on quantile treatment effects. Tags: foundational, quantile-regression, econometrics. Annotation:
Koenker and Bassett introduce quantile regression, proposing to estimate conditional quantile functions by minimizing an asymmetric absolute loss (check function), generalizing least absolute deviations to arbitrary quantiles. Establishes asymptotic theory and demonstrates robustness to outliers and heteroscedasticity relative to OLS.
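The check function and its connection to sample quantiles can be sketched directly; the data values are illustrative:

```python
def check_loss(u, tau):
    # rho_tau(u) = u * (tau - 1[u < 0]): the asymmetric absolute "check" loss
    return u * (tau - (1 if u < 0 else 0))

def quantile_by_check(xs, tau):
    # A minimizer of the summed check loss is a tau-th sample quantile;
    # searching over the observed values suffices for this illustration.
    return min(xs, key=lambda q: sum(check_loss(x - q, tau) for x in xs))

data = [1, 2, 3, 4, 100]              # one extreme outlier
med = quantile_by_check(data, 0.5)    # the median, unmoved by the outlier
q25 = quantile_by_check(data, 0.25)   # lower quartile
```

With tau = 0.5 the check loss is proportional to the absolute deviation, so the minimizer is the median; tilting tau reweights over- and under-predictions and slides the minimizer along the distribution, which is what makes the full conditional quantile function estimable.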
- 1999
Koenker, R., & Machado, J. A. F. (1999). Goodness of Fit and Related Inference Processes for Quantile Regression. Journal of the American Statistical Association, 94(448), 1296–1310.
doi.org/10.1080/01621459.1999.10473882
Foundational on quantile treatment effects. Annotation:
Koenker and Machado introduce a goodness-of-fit measure for quantile regression analogous to the R-squared of least squares, based on the ratio of minimized check functions across restricted and unrestricted models. They also develop related inference processes for testing composite hypotheses about covariate effects over an entire range of quantiles, with asymptotic behavior linked to Bessel processes. Practitioners estimating quantile regressions can use this pseudo-R-squared and joint significance tests to assess model fit across the conditional distribution.
- 2015
Kontopantelis, E., Doran, T., Springate, D. A., Buchan, I., & Reeves, D. (2015). Regression Based Quasi-Experimental Approach When Randomisation Is Not an Option: Interrupted Time Series Analysis. BMJ, 350, h2750.
Survey on interrupted time series. Tags: survey, practical-guide, health-policy. Annotation:
Kontopantelis and colleagues provide a practical guide to ITS analysis published in the BMJ. Covers model specification, autocorrelation testing, sensitivity analyses, and the addition of control series. Provides clear visual examples of level and slope changes and discusses common pitfalls.
- 2007
Kothari, S. P., & Warner, J. B. (2007). Econometrics of Event Studies. Handbook of Empirical Corporate Finance, 1, 3–36.
doi.org/10.1016/B978-0-444-53265-7.50015-9
Survey on event studies. Tags: long-horizon, cross-sectional, econometrics. Annotation:
Kothari and Warner provide an updated survey of event study methods, covering long-horizon event studies, cross-sectional regression approaches, and the econometric challenges that arise with overlapping events and event-induced variance changes. The survey documents how the basic FFJR framework is extended and refined over four decades. It is an essential reference for researchers designing event studies who need to understand the full menu of methodological choices and their trade-offs.
- 1999
Krueger, A. B. (1999). Experimental Estimates of Education Production Functions. Quarterly Journal of Economics, 114(2), 497–532.
doi.org/10.1162/003355399556052
Application on ols regression. Tags: education, class-size, randomized-experiment, Project-STAR. Annotation:
Krueger uses Tennessee's Project STAR randomized class-size experiment to estimate the effect of class size on student achievement via OLS. Because treatment is randomized, the OLS coefficient has a causal interpretation, demonstrating that the estimation method is not the issue: the research design is what determines whether estimates are causal.
- 1920
Künzel, S. R., Sekhon, J. S., Bickel, P. J., & Yu, B. (2019). Metalearners for Estimating Heterogeneous Treatment Effects Using Machine Learning. Proceedings of the National Academy of Sciences, 116(10), 4156–4165.
doi.org/10.1073/pnas.1804597116
Foundationalon causal forestsX-learnermeta-learnersCATEAnnotation
Künzel and colleagues propose the X-learner meta-algorithm for estimating CATEs and systematically compare it with T-learners and S-learners using random forests and BART as base learners. The paper provides practical guidance on when different meta-learning strategies perform well or poorly.
- 8219
Laird, N. M., & Ware, J. H. (1982). Random-Effects Models for Longitudinal Data. Biometrics, 38(4), 963–974.
Foundationalon random effectslongitudinal-datamixed-effectsbiostatisticsAnnotation
Laird and Ware develop the general framework for random-effects models in longitudinal data, integrating fixed population parameters with random individual-level effects. This paper is foundational for the mixed-effects modeling approach widely used in biostatistics and social sciences.
- 8619
LaLonde, R. J. (1986). Evaluating the Econometric Evaluations of Training Programs with Experimental Data. American Economic Review, 76(4), 604–620.
Foundationalon matching methodsexperimental-benchmarkprogram-evaluationjob-trainingnon-experimental-methodsAnnotation
LaLonde compares econometric estimates of a job training program's effect with experimental benchmarks from a randomized trial, finding that non-experimental methods often failed to replicate the experimental results. This paper establishes the standard test bed for evaluating matching and other observational causal methods.
- 9219
Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14.
Foundationalon poisson negative binomialzero-inflated-Poissonexcess-zerosmanufacturingcount-dataAnnotation
Lambert introduces the zero-inflated Poisson (ZIP) model, which accounts for excess zeros in count data by mixing a point mass at zero with a Poisson distribution. The ZIP model has become a standard tool for count outcomes where a subpopulation generates only zeros.
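The mixture Lambert describes is simple to write down. A minimal sketch of the ZIP probability mass function (not code from the paper; `zip_pmf`, `pi`, and `lam` are illustrative names):

```python
import math
import numpy as np

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson: with probability pi the
    count is a structural zero; otherwise it is drawn from Poisson(lam)."""
    k = np.atleast_1d(np.asarray(k))
    log_pois = k * math.log(lam) - lam - np.array([math.lgamma(i + 1) for i in k])
    base = (1 - pi) * np.exp(log_pois)
    return np.where(k == 0, pi + base, base)

# The mixture puts extra mass at zero relative to a plain Poisson:
pi, lam = 0.3, 2.0
p0_zip = float(zip_pmf(0, pi, lam)[0])    # pi + (1 - pi) * exp(-lam)
p0_pois = math.exp(-lam)
```

With pi = 0 the model collapses to an ordinary Poisson; in Lambert's formulation pi and lam are each modeled as functions of covariates (logit and log links) and estimated jointly by maximum likelihood.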
- 1520
Landais, C. (2015). Assessing the Welfare Effects of Unemployment Benefits Using the Regression Kink Design. American Economic Journal: Economic Policy, 7(4), 243–278.
Applicationon regression kink designunemployment-insuranceUSbenefit-schedulessocial-insuranceAnnotation
Landais uses the regression kink design to decompose the moral hazard and liquidity effects of unemployment insurance benefits using US data. The progressive UI benefit formula creates kinks that provide quasi-experimental variation in benefit levels. This paper demonstrates the power of RKD for evaluating social insurance programs where benefits change slope at known thresholds.
- 8319
Leamer, E. E. (1983). Let's Take the Con Out of Econometrics. American Economic Review, 73(1), 31–43.
Foundationalon specification curveextreme-boundsrobustnessspecification-sensitivityAnnotation
Leamer's classic paper argues that the sensitivity of empirical results to specification choices undermines the credibility of econometric evidence. He proposes extreme bounds analysis, an early form of systematic robustness testing that anticipates modern specification curve analysis by several decades.
- 0820
Lee, D. S. (2008). Randomized Experiments from Non-random Selection in U.S. House Elections. Journal of Econometrics, 142(2), 675–697.
doi.org/10.1016/j.jeconom.2007.05.004
Foundationalon regression discontinuity sharpelectionslocal-randomizationmanipulationAnnotation
Lee formalizes the conditions under which an RDD is 'as good as' a randomized experiment—namely, when agents cannot precisely manipulate the running variable around the cutoff. Applied to U.S. House elections, this paper establishes the modern theoretical foundation for sharp RDD.
- 0920
Lee, D. S. (2009). Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects. Review of Economic Studies, 76(3), 1071–1102.
doi.org/10.1111/j.1467-937X.2009.00536.x
Foundationalon lee boundssharp-boundssample-selectionmonotonicityAnnotation
Lee develops sharp nonparametric bounds on treatment effects in the presence of sample selection, requiring only a monotonicity assumption (that treatment affects selection in one direction). These bounds are widely used to address attrition and selective sample composition in randomized experiments.
- 1020
Lee, D. S., & Lemieux, T. (2010). Regression Discontinuity Designs in Economics. Journal of Economic Literature, 48(2), 281–355.
surveyvalidity-testseconometric-theoryAnnotation
Lee and Lemieux provide the standard survey of RDD methods in economics, covering both sharp and fuzzy designs, validity tests, and extensions. The paper remains the central reference for the econometric theory and practical implementation of RDD.
- 2220
Lee, S. (2022). The Myth of the Flat Start-Up: Reconsidering the Organizational Structure of Start-Ups. Strategic Management Journal, 43(1), 58–92.
Oster-methodcoefficient-stabilityomitted-variable-biasstart-upsorganizational-structure+1Annotation
Lee examines the effect of organizational hierarchy on the creative and commercial success of start-ups in the video game industry. She uses Oster's (2019) coefficient stability method to assess robustness to omitted variable bias, demonstrating how partial identification techniques complement standard empirical approaches in strategy research.
- 2220
Lee, D. S., McCrary, J., Moreira, M. J., & Porter, J. (2022). Valid t-Ratio Inference for IV. American Economic Review, 112(10), 3260–3290.
Foundationalon instrumental variablesweak-instrumentst-ratioF-statisticinferenceAnnotation
Lee, McCrary, Moreira, and Porter address the potentially severe large-sample distortions of t-ratio-based inference in the single-IV model. They introduce the tF critical value function, a standard error adjustment that varies smoothly with the first-stage F-statistic and restores valid inference under weak instruments. They find that for one-quarter of specifications in 61 AER papers, corrected standard errors are at least 49% larger than conventional 2SLS standard errors at the 5% significance level. The practical implication is that researchers using IV should apply the tF correction rather than rely on conventional standard errors.
- 1220
Lennox, C. S., Francis, J. R., & Wang, Z. (2012). Selection Models in Accounting Research. The Accounting Review, 87(2), 589–616.
Surveyon heckman selection modelsurveyaccountingbest-practicesAnnotation
Lennox, Francis, and Wang review the use (and misuse) of Heckman selection models in accounting research. Documents common pitfalls including weak exclusion restrictions, failure to test normality, and mechanical application without economic justification for the selection equation.
- 9719
Levitt, S. D. (1997). Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime. American Economic Review, 87(3), 270–290.
Applicationon instrumental variablescrimepoliceelectoral-cyclesreverse-causalityAnnotation
Levitt uses the timing of mayoral and gubernatorial elections as an instrument for police hiring to estimate the causal effect of police on crime. The paper illustrates the IV approach in a policy-relevant setting where the key concern is reverse causality (more crime leads to more police).
- 9319
Lin, D. Y., Wei, L. J., & Ying, Z. (1993). Checking the Cox Model with Cumulative Sums of Martingale-Based Residuals. Biometrika, 80(3), 557–572.
doi.org/10.1093/biomet/80.3.557
Foundationalon cox proportional hazardfoundationaldiagnosticsmodel-checkingAnnotation
Lin, Wei, and Ying develop graphical and numerical methods for checking the Cox model using cumulative sums of martingale-based residuals. Provides formal tests for the proportional hazards assumption, functional form of covariates, and overall model adequacy.
- 1320
Lin, W. (2013). Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman's Critique. Annals of Applied Statistics, 7(1), 295–318.
Foundationalon lab experiment tutorialAnnotation
Lin shows that OLS regression adjustment with a full set of treatment-covariate interactions yields an estimator that is asymptotically no less precise than the unadjusted difference in means in randomized experiments, even without assuming correct model specification. This result resolves Freedman's critique of regression adjustment by demonstrating that the interacted specification, combined with Huber-White standard errors, produces valid inference under Neyman's randomization model. Experimentalists should include treatment-by-covariate interactions and use robust standard errors when adjusting for baseline covariates.
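Lin's prescription is mechanical enough to sketch. A minimal numpy illustration on simulated data (variable names are ours; robust standard errors are omitted for brevity): the interacted-specification coefficient on treatment coincides exactly with the difference of arm-specific predictions at the overall covariate mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                    # baseline covariate
w = rng.integers(0, 2, size=n)            # randomized treatment
y = 1.0 + 2.0 * w + 0.5 * x + 0.3 * w * x + rng.normal(size=n)

# Lin's interacted specification: regress y on w, the centered
# covariate, and their interaction; the coefficient on w is the ATE.
xc = x - x.mean()
Z = np.column_stack([np.ones(n), w, xc, w * xc])
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
ate_hat = beta[1]

# Equivalent view: fit each arm separately, predict both arms at the
# overall covariate mean, and difference the predictions.
def arm_pred_at_mean(mask):
    Za = np.column_stack([np.ones(mask.sum()), x[mask]])
    ba = np.linalg.lstsq(Za, y[mask], rcond=None)[0]
    return ba[0] + ba[1] * x.mean()

ate_check = arm_pred_at_mean(w == 1) - arm_pred_at_mean(w == 0)
```

In practice the paper's advice is to pair this specification with Huber-White (HC) standard errors for valid randomization-based inference.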
- 1520
Linden, A. (2015). Conducting Interrupted Time-Series Analysis for Single- and Multiple-Group Comparisons. Stata Journal, 15(2), 480–500.
doi.org/10.1177/1536867X1501500208
Surveyon interrupted time seriessurveystatasoftwareAnnotation
Linden introduces the itsa command in Stata for single- and multiple-group ITS analysis. Covers Newey-West standard errors for autocorrelation, Prais-Winsten estimation, and the extension to controlled ITS with a comparison group. A key reference for Stata users.
- 1120
List, J. A., Sadoff, S., & Wagner, M. (2011). So You Want to Run an Experiment, Now What? Some Simple Rules of Thumb for Optimal Experimental Design. Experimental Economics, 14(4), 439–457.
doi.org/10.1007/s10683-011-9275-7
Surveyon experimental designpower-analysissample-sizedesign-guideAnnotation
List, Sadoff, and Wagner provide rules of thumb for sample size, treatment assignment, and other design decisions in field experiments in this practical guide. It is a useful starting point for researchers planning their first experiment.
- 1920
List, J. A., Shaikh, A. M., & Xu, Y. (2019). Multiple Hypothesis Testing in Experimental Economics. Experimental Economics, 22(4), 773–793.
doi.org/10.1007/s10683-018-09597-5
Surveyon multiple testingexperimental-economicsfield-experimentspractical-guideAnnotation
List, Shaikh, and Xu provide practical guidance on addressing multiple hypothesis testing in experimental economics. They compare various correction methods including Bonferroni, Holm, and FDR procedures, and demonstrate their application to field experiments with multiple outcome variables.
- 9719
Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. SAGE Publications.
Surveyon logit probittextbookcategorical-datalimited-dependent-variablesAnnotation
Long provides a comprehensive reference for applied researchers working with binary, ordinal, multinomial, and count outcome models. The textbook covers maximum likelihood estimation, marginal effects computation, and model diagnostics with clear exposition and software implementation guidance. It remains the standard practical guide for researchers who need to move beyond OLS to handle categorical and limited dependent variables.
- 0020
Long, J. S., & Ervin, L. H. (2000). Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model. The American Statistician, 54(3), 217–224.
doi.org/10.1080/00031305.2000.10474549
Foundationalon ols regressionrobust-standard-errorsheteroscedasticityHC3simulationAnnotation
Long and Ervin compare HC0, HC1, HC2, and HC3 heteroscedasticity-consistent standard error estimators in a simulation study. Their finding that HC3 performs best in finite samples has influenced applied practice, with many researchers now preferring HC3 over the default HC0.
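The estimators compared here are all instances of one sandwich formula with different residual weightings. A hand-rolled numpy sketch on simulated heteroscedastic data (our implementation, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# error variance grows with |x|: classic heteroscedasticity
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * (0.5 + np.abs(X[:, 1]))

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta
A = np.linalg.inv(X.T @ X)
h = np.einsum('ij,jk,ik->i', X, A, X)     # leverage values h_i

def hc_se(kind):
    """Sandwich standard errors. HC0 uses raw squared residuals;
    HC3 inflates them by (1 - h_i)^-2 to offset leverage effects
    in small samples."""
    omega = {
        'HC0': e**2,
        'HC1': e**2 * n / (n - X.shape[1]),
        'HC2': e**2 / (1 - h),
        'HC3': e**2 / (1 - h)**2,
    }[kind]
    meat = X.T @ (X * omega[:, None])
    return np.sqrt(np.diag(A @ meat @ A))
```

HC3's heavier weighting of high-leverage observations is what drives its better finite-sample behavior in the Long-Ervin simulations.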
- 1720
Lopez Bernal, J., Cummins, S., & Gasparrini, A. (2017). Interrupted Time Series Regression for the Evaluation of Public Health Interventions: A Tutorial. International Journal of Epidemiology, 46(1), 348–355.
Surveyon interrupted time seriessurveytutorialpublic-healthAnnotation
Lopez Bernal, Cummins, and Gasparrini provide an accessible tutorial on ITS regression for public health researchers. Covers the segmented regression model, autocorrelation diagnostics, Newey-West standard errors, and practical guidance on minimum number of time points. An excellent starting point for applied researchers.
- 1820
Lopez Bernal, J., Cummins, S., & Gasparrini, A. (2018). The Use of Controls in Interrupted Time Series Studies of Public Health Interventions. International Journal of Epidemiology, 47(6), 2082–2093.
Surveyon interrupted time seriessurveytutorialcontrolled-itsAnnotation
Lopez Bernal and colleagues provide a tutorial on extending ITS analysis with control groups to strengthen causal inference. Discusses controlled ITS (CITS) designs that combine the ITS framework with a comparison series, addressing the key threat of concurrent events confounding the intervention effect.
- 0420
Lunceford, J. K., & Davidian, M. (2004). Stratification and Weighting via the Propensity Score in Estimation of Causal Treatment Effects: A Comparative Study. Statistics in Medicine, 23(19), 2937–2960.
Applicationon doubly robust estimationpropensity-scorecomparisonsimulationAnnotation
Lunceford and Davidian compare propensity-score stratification, inverse probability weighting, and doubly robust estimators in a systematic simulation study. The paper provides a side-by-side assessment of these approaches for estimating causal treatment effects from observational data.
- 1920
Machado, J. A. F., & Santos Silva, J. M. C. (2019). Quantiles via Moments. Journal of Econometrics, 213(1), 145–173.
doi.org/10.1016/j.jeconom.2019.04.009
Foundationalon quantile treatment effectsfoundationalpanel-datafixed-effectsAnnotation
Machado and Santos Silva show that, under a conditional location-scale structure, regression quantiles can be estimated by estimating conditional means. This 'quantiles via moments' approach makes it possible to use tools developed for mean regression in distributional-effects settings, and it can be adapted to panel data with fixed effects by avoiding the incidental parameters problem.
- 9719
MacKinlay, A. C. (1997). Event Studies in Economics and Finance. Journal of Economic Literature, 35(1), 13–39.
Surveyon event studiessurveymethodologyabnormal-returnsstatistical-testingAnnotation
MacKinlay provides a comprehensive methodological survey of event studies, covering the statistical framework, estimation windows, abnormal return calculations, and testing procedures. This paper remains the standard reference for researchers designing and implementing event studies.
- 0720
MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation Analysis. Annual Review of Psychology, 58, 593–614.
doi.org/10.1146/annurev.psych.58.110405.085542
Surveyon causal mediation analysispsychologySobel-testbootstrappingAnnotation
MacKinnon, Fairchild, and Fritz provide an accessible review of mediation analysis methods for psychologists, covering the Baron-Kenny approach, the Sobel test, bootstrapping methods, and extensions to multiple mediators. This survey helped bridge the gap between traditional and modern approaches.
- 9019
Manski, C. F. (1990). Nonparametric Bounds on Treatment Effects. American Economic Review: Papers & Proceedings, 80(2), 319–323.
Foundationalon lee boundspartial-identificationworst-case-boundsnonparametricAnnotation
Manski introduces the partial identification approach to treatment effects, showing that even without strong assumptions, one can bound causal effects using the observed data. His worst-case bounds framework lays the theoretical foundation for Lee's sharper bounds under the monotonicity assumption.
- 9319
Manski, C. F. (1993). Identification of Endogenous Social Effects: The Reflection Problem. Review of Economic Studies, 60(3), 531–542.
Foundationalon instrumental variablesidentificationsocial-interactionspeer-effectsreflection-problemAnnotation
Manski formalizes the reflection problem in the analysis of social interactions: when individual outcomes depend on group averages, the group average is simultaneously determined by its members. This simultaneity makes it impossible to distinguish true social (endogenous) effects from correlated effects without additional structure or exclusion restrictions. The paper is essential reading for any researcher attempting to estimate peer effects or social spillovers.
- 0320
Manski, C. F. (2003). Partial Identification of Probability Distributions. Springer.
Foundationalon lee boundspartial-identificationtextbookboundsnonparametricAnnotation
Manski's monograph provides a comprehensive treatment of partial identification, showing how to derive informative bounds on parameters of interest when point identification is not possible. This book formalizes and extends his earlier work on bounding treatment effects and is the standard reference for the theoretical framework underlying Lee bounds.
- 1220
Masicampo, E. J., & Lalande, D. (2012). A Peculiar Prevalence of p Values Just Below .05. Quarterly Journal of Experimental Psychology, 65(11), 2271–2279.
doi.org/10.1080/17470218.2012.711335
Applicationon specification curvep-valuespublication-biasspecification-searchingAnnotation
Masicampo and Lalande document a suspicious clustering of p-values just below the .05 threshold in psychology journals, providing empirical evidence of publication bias and researcher degrees of freedom. They discuss potential sources of this pattern and its implications for the credibility of published findings in the social sciences.
- 2120
Masten, M. A., & Poirier, A. (2021). Salvaging Falsified Instrumental Variable Models. Econometrica, 89(3), 1449–1469.
Foundationalon sensitivity analysisinstrumental-variablesfalsificationpartial-identificationboundsAnnotation
Masten and Poirier study what researchers can do when an IV model is falsified. They introduce the falsification frontier and the falsification adaptive set, which quantify minimal relaxations of the baseline assumptions and report the parameter values consistent with minimally nonfalsified models, providing a structured sensitivity-analysis framework for IV.
- 0820
McCrary, J. (2008). Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test. Journal of Econometrics, 142(2), 698–714.
doi.org/10.1016/j.jeconom.2007.05.005
Foundationalon regression discontinuity fuzzy, regression discontinuity sharpmanipulation-testdensity-testvalidityAnnotation
McCrary develops the standard test for whether agents are manipulating the running variable to sort around the cutoff. If the density of the running variable shows a discontinuity at the cutoff, the RDD is compromised. This density test is now a routine validity check in all RDD papers.
- 7419
McFadden, D. (1974). Conditional Logit Analysis of Qualitative Choice Behavior. Frontiers in Econometrics, 105–142.
Foundationalon logit probitconditional-logitdiscrete-choicerandom-utilityAnnotation
McFadden develops the conditional logit model grounded in random utility theory, showing how discrete choices among alternatives can be modeled by assuming individuals maximize utility with an extreme-value distributed error. This work earns him the 2000 Nobel Prize and remains the foundation of discrete choice analysis.
- 1220
McKenzie, D. (2012). Beyond Baseline and Follow-Up: The Case for More T in Experiments. Journal of Development Economics, 99(2), 210–221.
doi.org/10.1016/j.jdeveco.2012.01.002
Foundationalon power analysisANCOVAmultiple-periodsdevelopment-economicsAnnotation
McKenzie shows that collecting multiple rounds of data can substantially increase statistical power in randomized experiments. He demonstrates that ANCOVA with baseline data and difference-in-differences with multiple time periods can sharply reduce the required sample size, which is particularly valuable in development economics, where outcomes are often noisy and weakly autocorrelated.
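The intuition can be made concrete with the standard one-baseline, one-follow-up variance comparisons behind this argument (textbook approximations under equal-sized arms and a common outcome variance, not the paper's general multi-round formulas; `rho` denotes the autocorrelation of the outcome between rounds):

```python
def relative_variance(rho, estimator):
    """Variance of the treatment-effect estimate relative to a
    post-only comparison of means, with one baseline and one
    follow-up round and outcome autocorrelation rho."""
    if estimator == 'post':
        return 1.0                    # ignore the baseline entirely
    if estimator == 'did':
        return 2.0 * (1.0 - rho)      # beats post-only only if rho > 0.5
    if estimator == 'ancova':
        return 1.0 - rho**2           # weakly best at every rho
    raise ValueError(estimator)
```

With the low autocorrelations typical of noisy development outcomes (say rho near 0.3), ANCOVA trims variance by about 9% while difference-in-differences inflates it by roughly 40%, which is why McKenzie recommends ANCOVA and spreading measurement over more follow-up rounds.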
- 9719
McWilliams, A., & Siegel, D. (1997). Event Studies in Management Research: Theoretical and Empirical Issues. Academy of Management Journal, 40(3), 626–657.
management-methodologytutorialstrategy-researchAnnotation
McWilliams and Siegel provide a critical assessment of event study methodology as applied in management research, identifying common theoretical and design pitfalls including confounding events, improper event window selection, and thin trading. The paper outlines procedures for appropriate use of event studies and serves as a widely cited methodological guide for strategy and management researchers conducting event studies.
- 0420
Miguel, E., Satyanath, S., & Sergenti, E. (2004). Economic Shocks and Civil Conflict: An Instrumental Variables Approach. Journal of Political Economy, 112(4), 725–753.
Applicationon instrumental variablescivil-conflictrainfall-instrumentweather-IVAfricaAnnotation
Miguel, Satyanath, and Sergenti instrument for economic growth using rainfall variation to estimate the causal effect of economic shocks on civil conflict in Sub-Saharan Africa. Their paper is a clean and widely cited example of using weather as an instrumental variable, illustrating both the power and the exclusion restriction challenges of weather-based instruments.
- 1420
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., Glennerster, R., Green, D. P., Humphreys, M., Imbens, G., Laitin, D., Madon, T., Nelson, L., Nosek, B. A., Petersen, M., Sedlmayr, R., Simmons, J. P., Simonsohn, U., & Van der Laan, M. (2014). Promoting Transparency in Social Science Research. Science, 343(6166), 30–31.
doi.org/10.1126/science.1245317
Foundationalon pre registrationtransparencyopen-sciencesocial-scienceAnnotation
Miguel and a coalition of leading social scientists call for greater transparency in research, including pre-registration of studies and analysis plans, open data, and replication. This short but influential piece in Science helps establish the norms and infrastructure for pre-registration in social science.
- 7419
Mincer, J. (1974). Schooling, Experience, and Earnings. National Bureau of Economic Research / Columbia University Press.
Applicationon ols regressionreturns-to-educationlabor-economicswage-equationAnnotation
Mincer develops the canonical human-capital earnings function relating log wages to years of schooling and labor-market experience. The Mincer equation is one of the most replicated empirical models in economics and remains the standard benchmark for wage-equation analysis, though it should not be read as having solved the causal identification problems surrounding returns to schooling.
- 1820
Mogstad, M., Santos, A., & Torgovitsky, A. (2018). Using Instrumental Variables for Inference about Policy Relevant Treatment Parameters. Econometrica, 86(5), 1589–1619.
Foundationalon marginal treatment effectsMTEpartial-identificationboundspolicy-evaluationivmteAnnotation
Mogstad, Santos, and Torgovitsky develop a framework for using instrumental variables to conduct inference on policy-relevant treatment effects under weaker assumptions than full MTE identification. They show that even when the MTE is only partially identified (due to limited support of the propensity score), informative bounds on ATE, ATT, and PRTE can be derived by combining the identified portion of the MTE with shape restrictions. Their approach uses linear programming to compute sharp bounds on the target parameter given the data and assumptions. The paper provides the R package ivmte for implementation and demonstrates that useful policy conclusions can be drawn even without point-identifying the entire MTE curve.
- 1320
Montiel Olea, J. L., & Pflueger, C. (2013). A Robust Test for Weak Instruments. Journal of Business & Economic Statistics, 31(3), 358–369.
doi.org/10.1080/00401706.2013.806694
Foundationalon instrumental variablesweak-instrumentsrobust-inferenceF-statisticAnnotation
Montiel Olea and Pflueger propose an effective F-statistic for testing weak instruments that is robust to heteroscedasticity, serial correlation, and clustering — unlike the conventional first-stage F. The effective F is now the standard diagnostic for instrument strength in applied IV research.
- 9019
Moulton, B. R. (1990). An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Units. Review of Economics and Statistics, 72(2), 334–338.
Foundationalon ols regressionclusteringaggregate-variablesstandard-errorsMoulton-problemAnnotation
Moulton demonstrates that when aggregate-level variables (such as state policies) are used to explain individual-level outcomes, OLS standard errors that ignore within-group correlation can be dramatically understated. This paper establishes the 'Moulton problem' and motivates the widespread adoption of clustered standard errors in applied microeconomics.
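The magnitude of the understatement is captured by the textbook design-effect approximation for a regressor that is constant within groups (a back-of-envelope sketch of the usual formula, not the paper's exact derivation):

```python
def moulton_factor(m_bar, icc):
    """Approximate factor by which OLS understates the variance when a
    regressor is constant within groups of average size m_bar and the
    errors have intraclass correlation icc."""
    return 1.0 + (m_bar - 1.0) * icc

# A state-level policy with roughly 100 observations per state and a
# modest error ICC of 0.05 understates the variance nearly six-fold:
var_inflation = moulton_factor(100, 0.05)   # 1 + 99 * 0.05 = 5.95
se_inflation = var_inflation ** 0.5         # naive SEs too small by ~2.4x
```

Clustered standard errors address exactly this: they allow arbitrary within-group error correlation instead of assuming it away.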
- 8719
Mroz, T. A. (1987). The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistical Assumptions. Econometrica, 55(4), 765–799.
Applicationon heckman selection modelapplicationlabor-supplysensitivityAnnotation
Mroz provides a classic application of the Heckman selection model to female labor supply. Shows that the two-step estimator's results are sensitive to the choice of exclusion restriction and the normality assumption. The Mroz dataset remains a standard teaching dataset for selection models.
- 1720
Mullainathan, S., & Spiess, J. (2017). Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2), 87–106.
Surveyon double debiased machine learningmachine-learningprediction-vs-causationeconomicsAnnotation
Mullainathan and Spiess provide an accessible introduction to supervised machine learning for economists, emphasizing how ML differs from classical parameter estimation and where prediction-oriented tools can be useful in empirical economics. The paper is a broad ML-for-economists survey, not a foundational paper on double/debiased machine learning specifically.
- 1720
Munafo, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017). A Manifesto for Reproducible Science. Nature Human Behaviour, 1, 0021.
doi.org/10.1038/s41562-016-0021
Foundationalon specification curvereproducibilityopen-sciencemanifestoAnnotation
Munafo, Nosek, and colleagues identify threats to reproducible science and propose a broad reform agenda spanning methods, reporting, reproducibility practices, evaluation, and incentives. The article is a general reproducibility manifesto that provides the broader scientific reform context motivating robustness-analysis approaches.
- 7819
Mundlak, Y. (1978). On the Pooling of Time Series and Cross Section Data. Econometrica, 46(1), 69–85.
Foundationalon fixed effects, random effectscorrelated-random-effectspanel-datapoolingAnnotation
Mundlak shows that the fixed effects estimator can be understood as an OLS regression that includes the group means of all time-varying regressors. This 'correlated random effects' interpretation bridges the fixed effects and random effects models and clarifies exactly what assumption is being relaxed.
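Mundlak's equivalence is easy to verify numerically. A minimal sketch with simulated panel data (variable names are ours): the within (fixed effects) slope and the coefficient on x in the 'group means' regression coincide exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
G, T = 40, 5
g = np.repeat(np.arange(G), T)            # group index for each obs
alpha = rng.normal(size=G)[g]             # group effects
x = alpha + rng.normal(size=G * T)        # regressor correlated with alpha
y = 1.5 * x + alpha + rng.normal(size=G * T)

def gmean(v):                             # group means, broadcast back
    return (np.bincount(g, v) / np.bincount(g))[g]

# Within (fixed effects) estimator: slope after demeaning by group.
xw, yw = x - gmean(x), y - gmean(y)
beta_fe = (xw @ yw) / (xw @ xw)

# Mundlak device: pooled OLS of y on x plus the group mean of x.
Z = np.column_stack([np.ones(G * T), x, gmean(x)])
beta_cre = np.linalg.lstsq(Z, y, rcond=None)[0][1]
```

The two slopes agree to machine precision; the coefficient on the group mean absorbs exactly the correlation between the regressor and the group effect that a random effects model would otherwise misattribute.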
- 1620
Muralidharan, K., Niehaus, P., & Sukhtankar, S. (2016). Building State Capacity: Evidence from Biometric Smartcards in India. American Economic Review, 106(10), 2895–2929.
Applicationon power analysiscluster-RCTMDEdevelopment-economicsstate-capacityAnnotation
Muralidharan, Niehaus, and Sukhtankar evaluate a large-scale randomized rollout of biometric smartcards for welfare payments in India, finding that the reform improved payment speed, predictability, and integrity. The paper includes detailed ex ante power calculations that demonstrate best practices for reporting minimum detectable effects in cluster-randomized designs.
- 0620
Murray, M. P. (2006). Avoiding Invalid Instruments and Coping with Weak Instruments. Journal of Economic Perspectives, 20(4), 111–132.
Surveyon instrumental variablesinstrument-validityweak-instrumentspractical-guideapplied-workAnnotation
Murray provides practical guidance on evaluating instrument validity and dealing with weak instruments in applied work. Written in an accessible style, it helps applied researchers think critically about their instrument choices and provides concrete strategies for addressing common IV pitfalls.
- 0020
Neumark, D., & Wascher, W. (2000). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Comment. American Economic Review, 90(5), 1362–1396.
Applicationon difference in differencesminimum-wagereplicationmeasurementdidAnnotation
Neumark and Wascher challenge Card and Krueger's (1994) minimum wage findings by re-analyzing the data using payroll records instead of survey responses, finding negative employment effects. The exchange illustrates the importance of data quality and measurement choices in difference-in-differences designs.
- 8719
Newey, W. K., & West, K. D. (1987). A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica, 55(3), 703–708.
Foundationalon ols regressionHACautocorrelationtime-seriesstandard-errorsAnnotation
Newey and West extend White's robust standard errors to also account for autocorrelation in time-series data in this short but hugely influential paper. The 'Newey-West standard errors' or 'HAC standard errors' are standard practice whenever researchers work with data that have a time dimension.
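The estimator itself fits in a few lines. A numpy sketch of the HAC sandwich with Bartlett weights (our implementation, not the authors'; `L` is the lag truncation):

```python
import numpy as np

def newey_west_cov(X, e, L):
    """HAC covariance for OLS coefficients: Bartlett-weighted sums of
    residual-weighted regressor autocovariances up to lag L."""
    Xe = X * e[:, None]
    S = Xe.T @ Xe                         # lag-0 (White) term
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)           # Bartlett weight keeps S psd
        G = Xe[l:].T @ Xe[:-l]
        S += w * (G + G.T)
    A = np.linalg.inv(X.T @ X)
    return A @ S @ A

# With L = 0 this reduces to White's heteroscedasticity-only estimator.
```

The declining Bartlett weights are the paper's key device: they guarantee the estimated covariance matrix is positive semi-definite, which unweighted truncated sums do not.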
- 9919
Newey, W. K. (1999). Two Step Series Estimation of Sample Selection Models. MIT Department of Economics Working Paper 99-04.
Foundationalon heckman selection modelAnnotation
Newey proposes a semiparametric two-step estimator for sample selection models that replaces the parametric inverse Mills ratio with a flexible series (power series or regression spline) approximation to the unknown selection correction function. This approach avoids the normality assumption underlying the standard Heckman correction while retaining the computational convenience of a two-step procedure. Researchers concerned about distributional misspecification in selection models can use series-based selection corrections as a robust alternative to parametric methods.
- 8119
Nickell, S. (1981). Biases in Dynamic Models with Fixed Effects. Econometrica, 49(6), 1417–1426.
Foundationalon fixed effectsdynamic-panelsNickell-biaslagged-dependent-variableAnnotation
Nickell shows that including a lagged dependent variable in a fixed effects regression creates a bias that does not vanish as the number of cross-sectional units grows. This 'Nickell bias' is a critical concern for researchers using fixed effects in dynamic panel models with short time series.
- 2120
Nie, X., & Wager, S. (2021). Quasi-Oracle Estimation of Heterogeneous Treatment Effects. Biometrika, 108(2), 299–319.
doi.org/10.1093/biomet/asaa076
Foundationalon causal forestsR-learnerCATEmeta-learnersAnnotation
Nie and Wager propose the R-learner, a two-step approach for estimating heterogeneous treatment effects that first residualizes outcomes and treatment on covariates, then estimates the CATE by regressing outcome residuals on treatment residuals. This approach can use any machine learning method including causal forests.
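The residual-on-residual step is easy to illustrate. A deliberately simplified sketch with linear nuisance fits and a constant treatment effect (the paper uses flexible ML with cross-fitting for the nuisances; data and names here are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 3))
p = 1.0 / (1.0 + np.exp(-X[:, 0]))        # confounded propensity
w = rng.binomial(1, p).astype(float)
tau = 1.0                                 # true (constant) effect
y = X @ np.array([1.0, -0.5, 0.2]) + tau * w + rng.normal(size=n)

def resid(target):
    """Residualize a vector on the covariates: a linear stand-in for
    the ML nuisance fits m(x) = E[y|x] and e(x) = E[w|x]."""
    Z = np.column_stack([np.ones(n), X])
    b = np.linalg.lstsq(Z, target, rcond=None)[0]
    return target - Z @ b

y_res, w_res = resid(y), resid(w)

# Final R-learner step: regress outcome residuals on treatment
# residuals; with a constant effect this recovers tau.
tau_hat = (w_res @ y_res) / (w_res @ w_res)
```

Replacing the final step with a regression of y_res / w_res on covariates, weighted by w_res squared, yields heterogeneous effect estimates; that weighted problem is where causal forests or other base learners plug in.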
- 1020
Nielsen, H. S., Sorensen, T., & Taber, C. (2010). Estimating the Effect of Student Aid on College Enrollment: Evidence from a Government Grant Policy Reform. American Economic Journal: Economic Policy, 2(2), 185–215.
Applicationon regression kink designstudent-aidcollege-enrollmentDenmarkearly-applicationAnnotation
Nielsen, Sorensen, and Taber apply a regression kink design to estimate the effect of student financial aid on college enrollment in Denmark. The Danish student aid formula creates a kink in the relationship between parental income and aid received. They exploit this kink to identify causal effects, providing one of the earliest applications of the RKD methodology.
- 2018
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The Preregistration Revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606.
doi.org/10.1073/pnas.1708274114
Foundational · on pre registration · pre-registration · open-science · credibility · Annotation
Nosek and colleagues make the case for widespread adoption of pre-registration, arguing that it distinguishes confirmatory from exploratory analyses, reduces publication bias, and increases the credibility of empirical research. This paper helps catalyze the pre-registration movement across the social sciences.
- 2015
Olken, B. A. (2015). Promises and Perils of Pre-Analysis Plans. Journal of Economic Perspectives, 29(3), 61–80.
Survey · on pre registration · pre-analysis-plans · development-economics · tradeoffs · Annotation
Olken provides a balanced assessment of pre-analysis plans in development economics, discussing both benefits (reduced specification searching, increased credibility) and costs (loss of flexibility, difficulty specifying analyses in advance). This paper is essential reading for understanding the practical tradeoffs of pre-registration.
- 2019
Oprescu, M., Syrgkanis, V., & Wu, Z. S. (2019). Orthogonal Random Forest for Causal Inference. Proceedings of the 36th International Conference on Machine Learning, 97, 4932–4941.
orthogonal-forests · EconML · DML · Annotation
Oprescu, Syrgkanis, and Wu propose orthogonal random forests, which combine Neyman-orthogonal moments with generalized random forests to reduce sensitivity to nuisance-estimation error. The paper provides theoretical results and shows how the method can be used for heterogeneous-effect estimation with discrete or continuous treatments.
- 2019
Orben, A., & Przybylski, A. K. (2019). The Association between Adolescent Well-Being and Digital Technology Use. Nature Human Behaviour, 3(2), 173–182.
doi.org/10.1038/s41562-018-0506-1
Application · on specification curve · digital-technology · well-being · large-scale-application · psychology · Annotation
Orben and Przybylski apply specification curve analysis to the hotly debated question of whether digital technology use harms adolescent well-being, running over 20,000 specifications across three large datasets. They find that technology use has a negligible negative association with well-being, far smaller than commonly assumed, demonstrating how specification curve analysis can bring clarity to contested empirical questions by mapping the full space of defensible analytical choices.
- 2019
Oster, E. (2019). Unobservable Selection and Coefficient Stability: Theory and Evidence. Journal of Business & Economic Statistics, 37(2), 187–204.
doi.org/10.1080/07350015.2016.1227711
Foundational · on sensitivity analysis · coefficient-stability · proportional-selection · bounding · Annotation
Oster extends the Altonji, Elder, and Taber approach to assess the robustness of regression estimates to omitted variable bias. She proposes a bounding method based on the proportional selection assumption and coefficient stability across specifications, now widely used in applied economics.
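The widely used approximation to Oster's bias-adjusted coefficient can be computed from just the uncontrolled and controlled regressions. The simulated data, delta = 1 (equal selection), and the R_max = 1.3 × R-tilde rule of thumb below are illustrative choices, and this is the common approximation rather than her exact estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
c = rng.normal(size=n)                    # observed control
x = 0.5 * c + rng.normal(size=n)          # treatment, correlated with the control
y = 1.0 * x + 0.8 * c + rng.normal(size=n)

def slope_r2(dep, cols):
    """Slope on the first regressor plus R^2 from an OLS fit with intercept."""
    X = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(X, dep, rcond=None)
    r2 = 1 - ((dep - X @ b) ** 2).sum() / ((dep - dep.mean()) ** 2).sum()
    return b[1], r2

b_dot, r_dot = slope_r2(y, [x])           # uncontrolled: beta-dot, R-dot
b_tilde, r_tilde = slope_r2(y, [x, c])    # controlled: beta-tilde, R-tilde

delta = 1.0                               # proportional selection parameter
r_max = min(1.3 * r_tilde, 1.0)           # rule-of-thumb bound on attainable R^2
beta_star = b_tilde - delta * (b_dot - b_tilde) * (r_max - r_tilde) / (r_tilde - r_dot)
```

The interval [b_tilde, beta_star] is then read as the identified set under the assumed degree of selection on unobservables.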
- 1986
Palepu, K. G. (1986). Predicting Takeover Targets: A Methodological and Empirical Analysis. Journal of Accounting and Economics, 8(1), 3–35.
doi.org/10.1016/0165-4101(86)90008-X
Application · on logit probit · takeover-prediction · corporate-governance · finance · Annotation
Palepu uses logit models to study takeover prediction and identifies methodological flaws in prior prediction studies, showing that targets are more difficult to predict than earlier work suggests. The paper highlights the importance of proper classification criteria and sampling methodology when applying binary choice models to rare-event corporate outcomes.
- 2001
Pearl, J. (2001). Direct and Indirect Effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, 411–420.
Foundational · on causal mediation analysis · structural-causal-models · natural-effects · do-calculus · Annotation
Pearl formalizes the concepts of natural direct and indirect effects using structural causal models and do-calculus. This paper establishes the nonparametric identification conditions for mediation effects and shows that traditional mediation analysis conflates causal and non-causal pathways.
- 2009
Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.
doi.org/10.1017/CBO9780511803161
Foundational · on matching methods · DAGs · do-calculus · structural-causal-models · foundations · Annotation
Pearl provides a comprehensive treatment of causal inference using directed acyclic graphs, the do-calculus, and structural causal models. The book formalizes the rules for reading conditional independence from graphs and establishes when causal effects are identifiable from observational data. It is the foundational reference for any researcher using DAGs to reason about confounding, mediation, and causal identification.
- 2014
Pearl, J. (2014). Interpretation and Identification of Causal Mediation. Psychological Methods, 19(4), 459–481.
Foundational · on causal mediation analysis · structural-causal-models · identification · natural-effects · graphical-criteria · Annotation
Pearl provides a structural causal model perspective on mediation, clarifying the interpretation and identification of natural direct and indirect effects. He shows how graphical criteria can determine when mediation effects are identifiable and contrasts the structural approach with the potential outcomes framework used by Imai, Keele, and Tingley.
- 2012
Peterson, M. F., Arregle, J.-L., & Martin, X. (2012). Multilevel Models in International Business Research. Journal of International Business Studies, 43(5), 451–457.
international-business · multilevel · cross-country · Annotation
Peterson, Arregle, and Martin review the use of multilevel random-effects models in international business research, where firms are nested within countries. They discuss best practices for modeling cross-level effects and the importance of accounting for the hierarchical structure of international data.
- 2024
Pongeluppe, L. S. (2024). The Allegory of the Favela: The Multifaceted Effects of Socioeconomic Mobility. Administrative Science Quarterly, 69(3), 619–654.
doi.org/10.1177/00018392241240469
RCT · field-experiment · socioeconomic-mobility · stigma · entrepreneurship · Annotation
Pongeluppe conducts a randomized controlled trial of a business training program offered to residents of Brazilian favelas, complementing the experiment with quantile regressions, field visits, and interviews. The results show that training improves economic outcomes such as income and entrepreneurship participation, but also intensifies participants' experiences of favela-related stigma, revealing that socioeconomic mobility can simultaneously generate material benefits and psychosocial costs.
- 2022
Porreca, Z. (2022). Synthetic Difference-in-Differences Estimation with Staggered Treatment Timing. Economics Letters, 220, 110874.
doi.org/10.1016/j.econlet.2022.110874
Foundational · on synthetic difference in differences · staggered-adoption · extension · policy-evaluation · Annotation
Porreca extends the synthetic DID estimator to staggered treatment adoption settings, where multiple units adopt treatment at different times. The method constructs a localized estimator in which treated units are compared to a never-treated control group weighted on both the time and unit dimensions.
- 1987
Powell, J. L. (1987). Semiparametric Estimation of Bivariate Latent Variable Models. SSRI Working Paper 8704, University of Wisconsin-Madison.
Foundational · on heckman selection model · Annotation
Powell develops semiparametric methods for estimating bivariate latent variable models—including censored sample selection models—without imposing distributional assumptions on the error terms. This approach relaxes the bivariate normality requirement of the Heckman two-step estimator, requiring only an exclusion restriction and mild regularity conditions. Researchers who doubt the normality assumption in selection models can apply these methods to obtain consistent estimates under weaker conditions.
- 2008
Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and Resampling Strategies for Assessing and Comparing Indirect Effects in Multiple Mediator Models. Behavior Research Methods, 40(3), 879–891.
Foundational · on causal mediation analysis · multiple-mediators · bootstrapping · software · Annotation
Preacher and Hayes develop methods and software for testing indirect effects through multiple mediators simultaneously, using bootstrapping to construct confidence intervals. Their approach and accompanying SPSS and SAS macros become extremely widely used in psychology and management research.
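A minimal percentile-bootstrap version of their approach, reduced to a single mediator for brevity (their methods and macros cover multiple mediators; data and coefficients below are simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)                  # mediator model: a = 0.5
y = 0.4 * m + 0.2 * x + rng.normal(size=n)        # outcome model: b = 0.4

def indirect(xi, mi, yi):
    a = np.polyfit(xi, mi, 1)[0]                  # slope of m on x
    X = np.column_stack([np.ones(len(xi)), mi, xi])
    b = np.linalg.lstsq(X, yi, rcond=None)[0][1]  # slope of y on m, controlling x
    return a * b                                  # indirect effect a*b

# Percentile bootstrap for the sampling distribution of a*b.
draws = []
for _ in range(1000):
    idx = rng.integers(0, n, n)
    draws.append(indirect(x[idx], m[idx], y[idx]))
lo, hi = np.quantile(draws, [0.025, 0.975])       # CI excludes zero here
```

The bootstrap sidesteps the normality assumption of the Sobel test, which is the main reason Preacher and Hayes recommend resampling for indirect effects.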
- 2000
Puhani, P. A. (2000). The Heckman Correction for Sample Selection and Its Critique. Journal of Economic Surveys, 14(1), 53–68.
doi.org/10.1111/1467-6419.00104
Survey · on heckman selection model · survey · comparison · two-step-vs-mle · Annotation
Puhani provides a short overview of Monte Carlo evidence on the Heckman two-step estimator, comparing it with full-information MLE and subsample OLS. He finds MLE preferable absent collinearity between the exclusion restriction and the other regressors, but subsample OLS the most robust choice when such collinearity is present.
- 2018
Pustejovsky, J. E., & Tipton, E. (2018). Small-Sample Methods for Cluster-Robust Variance Estimation and Hypothesis Testing in Fixed Effects Models. Journal of Business & Economic Statistics, 36(4), 672–683.
doi.org/10.1080/07350015.2016.1247004
Foundational · on clustering inference · cluster-robust · few-clusters · CR2 · Annotation
Pustejovsky and Tipton develop the CR2 bias-reduced cluster-robust variance estimator for fixed effects models with few clusters. The CR2 correction improves coverage relative to the standard CR1 estimator when the number of clusters is small.
- 2012
Rabe-Hesketh, S., & Skrondal, A. (2012). Multilevel and Longitudinal Modeling Using Stata (3rd ed.). Stata Press.
Survey · on random effects · multilevel-models · Stata · practical-guide · hierarchical · Annotation
Rabe-Hesketh and Skrondal provide a comprehensive practical guide to multilevel (hierarchical) models in Stata, which generalize the random effects framework to more complex nested data structures. It is an essential reference for applied researchers implementing multilevel models.
- 2023
Rambachan, A., & Roth, J. (2023). A More Credible Approach to Parallel Trends. Review of Economic Studies, 90(5), 2555–2591.
doi.org/10.1093/restud/rdad018
Foundational · on event studies · parallel-trends · sensitivity-analysis · honest-confidence-intervals · Annotation
Rambachan and Roth develop a sensitivity analysis framework for assessing the robustness of event-study and difference-in-differences estimates to violations of the parallel trends assumption. Their approach constructs honest confidence intervals under restrictions on how pre-trends can extrapolate into the post-treatment period, providing a disciplined alternative to informal pre-trend tests.
- 2024
Rathje, J., Katila, R., & Reineke, P. (2024). Making the Most of AI and Machine Learning in Organizations and Strategy Research: Supervised Machine Learning, Causal Inference, and Matching Models. Strategic Management Journal, 45(10), 1926–1953.
machine-learning · matching · propensity-score · causal-inference · methodology · Annotation
Rathje, Katila, and Reineke review how supervised machine learning can support causal-inference workflows in strategy research, with emphasis on two-stage matching models for sample-selection problems. Using technology invention data, they demonstrate ML-based approaches to covariate selection and matching while discussing the broader potential and limits of ML in organizational research.
- 2002
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). SAGE Publications.
Survey · on random effects · HLM · multilevel-modeling · nested-data · textbook · Annotation
Raudenbush and Bryk popularize hierarchical linear models (HLM), which are random-effects models for nested data structures such as students within schools, in this influential textbook. It becomes the standard reference for multilevel modeling in education, psychology, and organizational research.
- 1988
Rivers, D., & Vuong, Q. H. (1988). Limited Information Estimators and Exogeneity Tests for Simultaneous Probit Models. Journal of Econometrics, 39(3), 347–366.
doi.org/10.1016/0304-4076(88)90063-2
Foundational · on heckman selection model · Annotation
Rivers and Vuong propose a computationally simple two-step maximum likelihood procedure for estimating simultaneous probit models with endogenous regressors, and derive simple exogeneity tests based on this estimator. The exogeneity tests are asymptotically equivalent to classical tests based on limited information maximum likelihood but require only probit and OLS regressions to implement. Applied researchers working with binary outcome models and suspected endogeneity can use the Rivers-Vuong procedure as a tractable alternative to full information maximum likelihood.
- 1992
Robins, J. M., & Greenland, S. (1992). Identifiability and Exchangeability for Direct and Indirect Effects. Epidemiology, 3(2), 143–155.
doi.org/10.1097/00001648-199203000-00013
Foundational · on causal mediation analysis · direct-effects · indirect-effects · epidemiology · Annotation
Robins and Greenland provide early formal conditions for identifying direct and indirect causal effects in epidemiology. Their work on controlled direct effects and the assumptions required for mediation analysis lays important groundwork for the modern causal mediation literature.
- 1994
Robins, J. M., Rotnitzky, A., & Zhao, L. P. (1994). Estimation of Regression Coefficients When Some Regressors Are Not Always Observed. Journal of the American Statistical Association, 89(427), 846–866.
doi.org/10.1080/01621459.1994.10476818
Foundational · on doubly robust estimation · AIPW · missing-data · semiparametric · Annotation
Robins, Rotnitzky, and Zhao introduce the augmented inverse probability weighting (AIPW) estimator, which combines outcome modeling and propensity score weighting. The key insight is that the estimator is consistent if either the outcome model or the propensity score model is correctly specified, providing a double layer of protection against misspecification.
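The estimator and its double-robustness property can be sketched in a few lines on simulated data. Here the outcome model is deliberately misspecified (arm-specific constants) while the propensity score is correct, yet AIPW still recovers the truth; all names and the DGP are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))                 # propensity score, correctly specified
w = rng.binomial(1, e)
y = x + 2.0 * w + rng.normal(size=n)     # true ATE = 2

# Deliberately misspecified outcome models: arm-specific constants only.
mu1, mu0 = y[w == 1].mean(), y[w == 0].mean()
naive = mu1 - mu0                        # biased difference in means

# AIPW: outcome-model prediction plus inverse-probability-weighted residual.
psi1 = mu1 + w * (y - mu1) / e
psi0 = mu0 + (1 - w) * (y - mu0) / (1 - e)
ate_aipw = (psi1 - psi0).mean()          # close to 2 despite the bad outcome model
```

Swapping the roles (correct outcome model, wrong propensity) would also work, which is exactly the "double" protection the annotation describes.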
- 1988
Robinson, P. M. (1988). Root-N-Consistent Semiparametric Regression. Econometrica, 56(4), 931–954.
Foundational · on double debiased machine learning · partially-linear · semiparametric · root-n-consistency · Annotation
Robinson develops the partially linear regression estimator that achieves root-n consistency for the parametric component by partialling out nonparametric nuisance functions. This paper provides the semiparametric foundation that DML generalizes to the machine learning setting.
- 2017
Rohrer, J. M., Egloff, B., & Schmukle, S. C. (2017). Probing Birth-Order Effects on Narrow Traits Using Specification-Curve Analysis. Psychological Science, 28(12), 1821–1832.
doi.org/10.1177/0956797617723726
Application · on specification curve · birth-order · personality · applied-example · Annotation
Rohrer, Egloff, and Schmukle apply specification curve analysis to the long-debated question of whether birth order affects personality traits. By running all defensible specifications, they show that most previously reported birth-order effects disappear, demonstrating the method's power to resolve contested empirical questions.
- 2005
Romano, J. P., & Wolf, M. (2005). Stepwise Multiple Testing as Formalized Data Snooping. Econometrica, 73(4), 1237–1282.
doi.org/10.1111/j.1468-0262.2005.00615.x
Foundational · on multiple testing · stepwise-testing · resampling · FWER · Annotation
Romano and Wolf develop a stepwise multiple testing procedure that controls the family-wise error rate while being less conservative than Bonferroni by resampling from the joint distribution of test statistics. Their method accounts for the correlation structure among tests and is widely used in economics.
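The resampling idea can be sketched as a single-step max-|t| adjustment on simulated data. The full Romano-Wolf procedure then iterates stepwise, dropping rejected hypotheses and recomputing the critical value; this sketch stops after the first step and uses an illustrative DGP where only hypothesis 0 is truly false.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 200, 5
X = rng.normal(size=(n, k))
X[:, 0] += 0.4                                   # nonzero mean only in column 0

t_obs = X.mean(axis=0) / (X.std(axis=0, ddof=1) / np.sqrt(n))

# Bootstrap the null distribution of max |t| after recentering each column,
# which preserves the correlation structure across tests.
centered = X - X.mean(axis=0)
max_t = []
for _ in range(500):
    Xb = centered[rng.integers(0, n, n)]
    tb = Xb.mean(axis=0) / (Xb.std(axis=0, ddof=1) / np.sqrt(n))
    max_t.append(np.abs(tb).max())

crit = np.quantile(max_t, 0.95)                  # joint 5% critical value
reject = np.abs(t_obs) > crit
```

Because the critical value comes from the joint distribution, it is smaller than a Bonferroni cutoff when the tests are correlated, which is the source of the power gain.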
- 1983
Rosenbaum, P. R., & Rubin, D. B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70(1), 41–55.
doi.org/10.1093/biomet/70.1.41
Foundational · on matching methods · propensity-score · selection-on-observables · causal-inference · Annotation
Rosenbaum and Rubin introduce the propensity score as a dimension-reduction tool for matching, showing that conditioning on the scalar probability of treatment is sufficient to remove selection bias when the unconfoundedness assumption holds. This paper establishes the theoretical foundation for all propensity-score-based methods, including matching, stratification, and inverse probability weighting. The key practical insight is that matching on a single score avoids the curse of dimensionality that makes direct covariate matching infeasible with many confounders.
- 2002
Rosenbaum, P. R. (2002). Observational Studies (2nd ed.). Springer.
doi.org/10.1007/978-1-4757-3692-2
observational-studies · sensitivity-analysis · Rosenbaum-bounds · textbook · Annotation
Rosenbaum provides the standard textbook on observational study design, covering matching, sensitivity analysis, and design principles for drawing causal inferences from non-experimental data. His framework for sensitivity analysis (Rosenbaum bounds) is the standard tool for assessing how much unobserved confounding would be needed to overturn a matching-based finding.
- 2022
Roth, J. (2022). Pretest with Caution: Event-Study Estimates after Testing for Parallel Trends. American Economic Review: Insights, 4(3), 305–322.
Foundational · on difference in differences, event studies · pre-trends · pre-testing · honest-confidence-intervals · event-study · Annotation
Roth shows that the common practice of testing for parallel pre-trends and proceeding conditional on 'passing' can lead to distorted inference. He proposes honest confidence intervals that account for pre-testing, fundamentally changing how researchers should think about event study pre-trends in DiD designs.
- 2023
Roth, J., Sant'Anna, P. H. C., Bilinski, A., & Poe, J. (2023). What's Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature. Journal of Econometrics, 235(2), 2218–2244.
doi.org/10.1016/j.jeconom.2023.03.008
Survey · on difference in differences, staggered difference in differences, synthetic difference in differences · survey · staggered-DID · heterogeneous-effects · pre-trends · Annotation
Roth et al. synthesize the explosion of recent econometric work on DID in this comprehensive survey, covering staggered treatment timing, heterogeneous treatment effects, pre-trends testing, and new estimators. It is the essential starting point for understanding the modern DID literature.
- 1974
Rubin, D. B. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology, 66(5), 688–701.
Foundational · on experimental design · potential-outcomes · causal-inference · Rubin-causal-model · Annotation
Rubin formalizes the 'potential outcomes' framework that is now central to causal inference. The idea is simple but powerful: each unit has a potential outcome under treatment and under control, and the causal effect is the difference. This paper is the origin of what is now called the Rubin Causal Model.
- 2010
Saez, E. (2010). Do Taxpayers Bunch at Kink Points? American Economic Journal: Economic Policy, 2(3), 180–212.
Foundational · on bunching estimation · bunching · kink-point · elasticity · income-tax · EITC · Annotation
Saez introduces the modern bunching methodology by examining taxpayer responses to kink points in the US income tax schedule, where marginal tax rates change discretely. He shows how to estimate the compensated elasticity of reported income from the excess mass of taxpayers at kink points relative to a smooth counterfactual density fitted by polynomial. The paper establishes the standard empirical approach: bin the data, fit a polynomial excluding the bunching region, and compute the excess mass. He finds modest elasticities overall but sharp bunching among the self-employed near the first EITC kink.
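The bin/fit/compare recipe can be sketched on simulated data with an artificial spike planted at the kink (illustrative only; mapping the excess mass into an elasticity additionally requires the size of the tax-rate change at the kink):

```python
import numpy as np

rng = np.random.default_rng(6)
k = 10.0                                         # kink point
# Smooth income distribution plus 2000 "bunchers" planted exactly at the kink.
z = np.concatenate([rng.uniform(0, 20, 50000), np.full(2000, k)])

# Step 1: bin the data.
edges = np.arange(0, 20.5, 0.5)
counts, _ = np.histogram(z, bins=edges)
centers = (edges[:-1] + edges[1:]) / 2

# Step 2: fit a polynomial counterfactual, excluding the bunching window.
excl = np.abs(centers - k) <= 0.75
coef = np.polyfit(centers[~excl], counts[~excl], deg=5)
cf = np.polyval(coef, centers)                   # smooth counterfactual density

# Step 3: compute the excess mass at the kink.
excess = (counts[excl] - cf[excl]).sum()         # roughly the 2000 planted bunchers
b_norm = excess / cf[excl].mean()                # excess mass in counterfactual units
```

The choice of bin width, polynomial degree, and exclusion window are the key researcher degrees of freedom in real applications.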
- 2020
Sant'Anna, P. H. C., & Zhao, J. (2020). Doubly Robust Difference-in-Differences Estimators. Journal of Econometrics, 219(1), 101–122.
doi.org/10.1016/j.jeconom.2020.06.003
Foundational · on doubly robust estimation · DID · doubly-robust · ATT · Annotation
Sant'Anna and Zhao develop doubly robust DID estimators that combine outcome regression and inverse probability weighting. The estimator is consistent for the ATT if either the outcome evolution model or the propensity score model for treatment group membership is correctly specified.
- 1999
Scharfstein, D. O., Rotnitzky, A., & Robins, J. M. (1999). Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models. Journal of the American Statistical Association, 94(448), 1096–1120.
doi.org/10.1080/01621459.1999.10473862
Foundational · on doubly robust estimation · missing-data · dropout · semiparametric-efficiency · Annotation
Scharfstein, Rotnitzky, and Robins develop a semiparametric sensitivity analysis framework for nonignorable dropout in longitudinal studies. They propose treating the selection bias parameter as known, then varying it over a plausible range to assess how inferences change. This paper provides foundational methods for sensitivity analysis under nonignorable missing data.
- 2014
Semadeni, M., Withers, M. C., & Certo, S. T. (2014). The Perils of Endogeneity and Instrumental Variables in Strategy Research: Understanding through Simulations. Strategic Management Journal, 35(7), 1070–1079.
weak-instruments · strategy-research · simulation · methodology · Annotation
Semadeni, Withers, and Certo use Monte Carlo simulations to demonstrate the dangers of using weak or invalid instruments in strategy research. They provide practical guidance for management scholars on when and how to use IV, and when it may do more harm than good.
- 2021
Semenova, V., & Chernozhukov, V. (2021). Debiased Machine Learning of Conditional Average Treatment Effects and Other Causal Functions. Econometrics Journal, 24(2), 264–289.
Foundational · on double debiased machine learning · CATE · heterogeneous-effects · group-effects · Annotation
Semenova and Chernozhukov extend DML to estimate conditional average treatment effects (CATEs) and other causal functions, allowing researchers to characterize treatment effect heterogeneity. They provide inference methods for projections of the CATE onto interpretable subgroups.
- 2025
Semenova, V. (2025). Generalized Lee Bounds. Journal of Econometrics, 251, 106055.
doi.org/10.1016/j.jeconom.2025.106055
Foundational · on lee bounds · machine-learning · covariates · tighter-bounds · Annotation
Semenova generalizes Lee bounds to allow for covariates and machine learning estimation of nuisance functions, improving the tightness of bounds while maintaining their nonparametric validity. This paper connects the Lee bounds literature to the modern machine learning causal inference literature.
- 2002
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin.
Foundational · on interrupted time series · foundational · textbook · quasi-experimental · Annotation
Shadish, Cook, and Campbell write the standard textbook on quasi-experimental designs, including a comprehensive treatment of interrupted time series. The book discusses threats to validity (history, instrumentation, selection-maturation interaction) specific to ITS designs and provides guidance on when ITS is most credible.
- 1998
Shaver, J. M. (1998). Accounting for Endogeneity When Assessing Strategy Performance: Does Entry Mode Choice Affect FDI Survival? Management Science, 44(4), 571–585.
endogeneity · self-selection · entry-mode · FDI · Heckman-correction · Annotation
Shaver demonstrates how ignoring endogeneity — specifically, the self-selection of firms into entry modes — biases performance estimates in this foundational strategy paper. He shows that the choice between greenfield entries and acquisitions reflects private information about expected survival, and uses a Heckman-style selection correction to obtain unbiased estimates. One of the first papers to systematically demonstrate endogeneity problems in strategy research.
- 2017
Shipman, J. E., Swanquist, Q. T., & Whited, R. L. (2017). Propensity Score Matching in Accounting Research. The Accounting Review, 92(1), 213–244.
Survey · on matching methods · propensity-score · accounting · best-practices · methodology · Annotation
Shipman, Swanquist, and Whited review how propensity score matching is used (and sometimes misused) in accounting research. They provide practical guidelines on common pitfalls such as matching on post-treatment variables, inadequate balance checks, and ignoring the unconfoundedness assumption.
- 2001
Shumway, T. (2001). Forecasting Bankruptcy More Accurately: A Simple Hazard Model. Journal of Business, 74(1), 101–124.
Application · on cox proportional hazard · application · finance · bankruptcy · Annotation
Shumway shows that discrete-time hazard models outperform static logit models for bankruptcy prediction because they properly account for the time dimension and censoring. The paper demonstrates the importance of a survival analysis framing for event prediction in finance.
- 2006
Silva, J. M. C. S., & Tenreyro, S. (2006). The Log of Gravity. Review of Economics and Statistics, 88(4), 641–658.
Foundational · on poisson negative binomial · gravity-model · PPML · trade · heteroskedasticity · Annotation
Silva and Tenreyro demonstrate that OLS estimation of log-linearized gravity models produces inconsistent estimates in the presence of heteroskedasticity. They show that Poisson pseudo-maximum-likelihood (PPML) provides consistent estimates and naturally handles zero trade flows, transforming the trade literature.
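A bare-bones Poisson pseudo-maximum-likelihood fit via iteratively reweighted least squares, on simulated trade-style data with a multiplicative error and exact zeros that a log-linear regression would have to drop. This is a sketch of the PPML idea, not the authors' implementation; in practice one would use a GLM routine with robust standard errors.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(size=n)
mu = np.exp(1.0 + 0.5 * x)                     # true conditional mean, slope 0.5
eps = rng.lognormal(mean=-0.5, sigma=1.0, size=n)  # multiplicative error, E[eps] = 1
y = mu * eps
y[rng.random(n) < 0.1] = 0.0                   # exact zeros, kept by PPML

X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):                            # IRLS / Newton iterations
    m = np.exp(X @ beta)
    z_work = X @ beta + (y - m) / m            # working dependent variable
    W = m                                      # Poisson working weights
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z_work))
```

PPML only requires the conditional mean to be correctly specified, so the slope estimate stays consistent despite the non-Poisson, heteroskedastic error.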
- 2011
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366.
doi.org/10.1177/0956797611417632
Foundational · on pre registration · p-hacking · researcher-degrees-of-freedom · false-positives · Annotation
Simmons, Nelson, and Simonsohn demonstrate how researcher degrees of freedom in data collection and analysis can inflate false-positive rates dramatically. Their paper, which proposes disclosure requirements and pre-registration as solutions, is one of the catalysts for the replication crisis and pre-registration movement.
- 2020
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification Curve Analysis. Nature Human Behaviour, 4(11), 1208–1214.
doi.org/10.1038/s41562-020-0912-z
Foundational · on specification curve · specification-curve · robustness · analytical-flexibility · Annotation
Simonsohn, Simmons, and Nelson introduce specification curve analysis, which systematically runs all reasonable specifications of a model and displays the distribution of estimates. This approach replaces selective reporting of specifications with a comprehensive view of how results depend on analytical choices.
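The mechanics reduce to enumerating every combination of analytical choices and collecting the estimate from each. Below is a toy sketch with two control-set choices and two outcome codings (four control subsets × two codings = eight specifications); the paper adds descriptive plots and inferential machinery on top.

```python
import itertools
import numpy as np

rng = np.random.default_rng(9)
n = 1000
x = rng.normal(size=n)                         # predictor of interest
c1, c2 = rng.normal(size=n), rng.normal(size=n)
y = 0.3 * x + 0.5 * c1 + rng.normal(size=n)

controls = {"c1": c1, "c2": c2}
estimates = []
# Enumerate every defensible specification: all control subsets x outcome codings.
for r in range(len(controls) + 1):
    for subset in itertools.combinations(controls, r):
        for outcome in (y, np.sign(y)):        # raw vs dichotomized outcome
            X = np.column_stack([np.ones(n), x] + [controls[k] for k in subset])
            estimates.append(np.linalg.lstsq(X, outcome, rcond=None)[0][1])

curve = np.sort(estimates)                     # the "specification curve"
```

Plotting `curve` with markers for each underlying choice reproduces the familiar two-panel specification-curve figure.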
- 2003
Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford University Press.
doi.org/10.1093/acprof:oso/9780195152968.001.0001
Survey · on cox proportional hazard · survey · textbook · discrete-time · Annotation
Singer and Willett write an accessible textbook covering both growth curve models and discrete-time survival analysis. Chapters 9–15 provide a clear introduction to hazard modeling for social science researchers, with worked examples and practical guidance.
- 2011
Singh, J., & Agrawal, A. (2011). Recruiting for Ideas: How Firms Exploit the Prior Inventions of New Hires. Management Science, 57(1), 129–150.
doi.org/10.1287/mnsc.1100.1253
knowledge-transfer · inventor-mobility · patent-citations · Annotation
Singh and Agrawal use a difference-in-differences approach, comparing citation rates to recruits' patents before and after the move against matched control patents, to study how hiring inventors affects knowledge flows to the hiring firm. They find that hiring an inventor increases the hiring firm's citations to the recruit's prior patents, indicating knowledge transfer. The paper demonstrates how DiD with matched controls can identify causal effects in knowledge flow studies.
- 2005
Smith, J. A., & Todd, P. E. (2005). Does Matching Overcome LaLonde's Critique of Nonexperimental Estimators? Journal of Econometrics, 125(1–2), 305–353.
doi.org/10.1016/j.jeconom.2004.04.011
Foundational · on matching methods · LaLonde-critique · propensity-score · external-validity · Annotation
Smith and Todd reexamine the Dehejia and Wahba (1999) reanalysis of LaLonde (1986), showing that the matching results are sensitive to specific sample and specification choices. They demonstrate that matching methods cannot solve fundamental problems when treated and comparison groups come from very different populations.
- 1997
Staiger, D., & Stock, J. H. (1997). Instrumental Variables Regression with Weak Instruments. Econometrica, 65(3), 557–586.
Foundational · on instrumental variables · weak-instruments · 2SLS-bias · asymptotic-theory · Annotation
Staiger and Stock show formally that when instruments are weak, 2SLS estimates are biased toward OLS and standard inference breaks down. This paper establishes the theoretical foundations for the weak instruments problem that Stock and Yogo (2005) later provided practical tests for.
- 2019
Starr, E., Frake, J., & Agarwal, R. (2019). Mobility Constraint Externalities. Organization Science, 30(5), 961–980.
doi.org/10.1287/orsc.2018.1252
Oster-method · coefficient-stability · noncompete-agreements · labor-mobility · externalities · Annotation
Starr, Frake, and Agarwal study how noncompete agreements generate externalities for all workers in a labor market, not just those directly constrained. They use Oster's (2019) coefficient stability diagnostic to assess robustness of findings to omitted variable bias, demonstrating that enforceable noncompetes are associated with reduced job offers, mobility, and wages even for unconstrained workers.
- 2016
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing Transparency Through a Multiverse Analysis. Perspectives on Psychological Science, 11(5), 702–712.
doi.org/10.1177/1745691616658637
Foundational · on specification curve · multiverse-analysis · garden-of-forking-paths · transparency · Annotation
Steegen and colleagues introduce multiverse analysis, which examines how results vary across the full set of defensible data processing and analytical decisions. This approach is closely related to specification curve analysis and emphasizes transparency about the garden of forking paths in data analysis.
- 2002
Stock, J. H., Wright, J. H., & Yogo, M. (2002). A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments. Journal of Business & Economic Statistics, 20(4), 518–529.
doi.org/10.1198/073500102288618658
Survey · on instrumental variables · weak-instruments · GMM · weak-identification · survey · Annotation
Stock, Wright, and Yogo survey the weak instruments and weak identification literature in IV and GMM settings, covering finite-sample bias toward OLS, size distortions in Wald tests, and practical diagnostic tools. The paper provides a comprehensive review of the theoretical landscape; the formal critical value tables now standard in applied work appear in the separate Stock and Yogo (2005) chapter.
- 2005
Stock, J. H., & Yogo, M. (2005). Testing for Weak Instruments in Linear IV Regression. In D. W. K. Andrews & J. H. Stock (Eds.), Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg (pp. 80–108). Cambridge University Press.
doi.org/10.1017/CBO9780511614491.006
Foundational · on instrumental variables · weak-instruments · F-statistic · diagnostic-test · Annotation
Stock and Yogo develop formal critical value tables for testing whether instruments are 'weak'—that is, only weakly correlated with the endogenous variable. Their tables formalize the Staiger and Stock (1997) rule of thumb that the first-stage F-statistic should exceed 10, and provide what is probably the most widely used diagnostic in applied IV research.
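On simulated data, the diagnostic amounts to computing the first-stage F statistic and comparing it with the threshold (a sketch with illustrative numbers; with a single instrument, F is just the squared first-stage t statistic):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 500
u = rng.normal(size=n)                   # shared first-stage error

def first_stage_F(pi):
    """F statistic from regressing x on one instrument z of strength pi."""
    z = rng.normal(size=n)
    x = pi * z + u
    zc = z - z.mean()
    b = (zc @ x) / (zc @ zc)             # first-stage slope
    resid = x - x.mean() - b * zc
    se = np.sqrt((resid @ resid) / (n - 2) / (zc @ zc))
    return (b / se) ** 2                 # F = t^2 with a single instrument

F_strong = first_stage_F(0.6)            # comfortably above 10
F_weak = first_stage_F(0.02)             # flags a weak-instrument problem
```

The F > 10 screen is a rough rule of thumb; the Stock-Yogo tables give critical values calibrated to specific bias or size-distortion tolerances.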
- 1020
Stuart, E. A. (2010). Matching Methods for Causal Inference: A Review and a Look Forward. Statistical Science, 25(1), 1–21.
Survey on matching methods · matching-review, propensity-score, practical-guidance, survey
Stuart provides a comprehensive review of matching methods including propensity score matching, Mahalanobis distance matching, and coarsened exact matching, with practical guidance on implementation. She offers an accessible overview of when and how to use different matching approaches.
- 2011
Stuart, E. A., Cole, S. R., Bradshaw, C. P., & Leaf, P. J. (2011). The Use of Propensity Scores to Assess the Generalizability of Results from Randomized Trials. Journal of the Royal Statistical Society: Series A, 174(2), 369–386.
doi.org/10.1111/j.1467-985X.2010.00673.x
Foundational on external validity
Stuart, Cole, Bradshaw, and Leaf propose propensity-score-based metrics for quantifying the similarity between randomized trial participants and a target population, using a model that predicts trial participation given observed covariates. The resulting scores enable matching, subclassification, or weighting of trial outcomes to the population, providing a diagnostic framework for assessing external validity. Researchers planning to generalize trial findings should use these propensity score diagnostics to evaluate whether their trial sample adequately represents the intended target population.
- 2021
Sun, L., & Abraham, S. (2021). Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects. Journal of Econometrics, 225(2), 175–199.
doi.org/10.1016/j.jeconom.2020.09.006
Foundational on staggered difference in differences, event studies · event-study, interaction-weighted, dynamic-effects
Sun and Abraham show that conventional event-study regression coefficients are contaminated by treatment effect heterogeneity across cohorts and propose an interaction-weighted estimator that recovers clean dynamic treatment effects. This paper is the key reference for event-study plots in staggered settings.
- 2000
Therneau, T. M., & Grambsch, P. M. (2000). Modeling Survival Data: Extending the Cox Model. Springer.
doi.org/10.1007/978-1-4757-3294-8
Survey on cox proportional hazard · survey, textbook, cox-extensions
Therneau and Grambsch provide an authoritative reference on extensions of the Cox model including time-varying covariates, stratification, frailty models, and multistate models. The R survival package is maintained by Therneau and implements the methods described here.
- 1960
Thistlethwaite, D. L., & Campbell, D. T. (1960). Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment. Journal of Educational Psychology, 51(6), 309–317.
Foundational on regression discontinuity (sharp) · RDD-origins, cutoff-design, quasi-experiment
Thistlethwaite and Campbell introduce the regression discontinuity design, proposing to compare units just above and just below a cutoff score to estimate causal effects, reasoning that units near the cutoff are as-good-as randomly assigned. The idea lies dormant for decades before being rediscovered by economists.
- 2009
Train, K. E. (2009). Discrete Choice Methods with Simulation. Cambridge University Press.
doi.org/10.1017/CBO9780511805271
Survey on logit probit · textbook, discrete-choice, simulation-estimation
Train's textbook provides a comprehensive and accessible treatment of logit, probit, mixed logit, and other discrete choice models. It covers both theory and practical simulation-based estimation methods and is widely used in economics, marketing, and transportation research.
- 2002
Van der Klaauw, W. (2002). Estimating the Effect of Financial Aid Offers on College Enrollment: A Regression-Discontinuity Approach. International Economic Review, 43(4), 1249–1287.
doi.org/10.1111/1468-2354.t01-1-00055
Application on regression discontinuity (fuzzy) · financial-aid, education, fuzzy-RDD
Van der Klaauw applies a fuzzy RDD to study how financial aid offers affect college enrollment decisions, exploiting discontinuities in an aid assignment rule where eligibility changes at GPA thresholds but compliance is imperfect. This paper is one of the earliest and most influential applications of fuzzy RDD.
- 2015
VanderWeele, T. J. (2015). Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press.
Survey on causal mediation analysis · textbook, mediation, interaction, sensitivity
VanderWeele's comprehensive textbook unifies the causal mediation literature, covering potential outcomes and structural equation approaches, sensitivity analysis, time-varying treatments, and interaction effects. It is the standard reference for researchers conducting mediation analysis.
- 2016
VanderWeele, T. J. (2016). Mediation Analysis: A Practitioner's Guide. Annual Review of Public Health, 37, 17–32.
doi.org/10.1146/annurev-publhealth-032315-021402
Survey on causal mediation analysis · practitioners-guide, sensitivity-analysis, public-health, survey
VanderWeele provides an accessible practitioner-oriented guide to modern causal mediation analysis, covering the assumptions required for identification, sensitivity analysis for unmeasured confounding, and extensions to multiple mediators and interactions. This review is an excellent entry point for applied researchers seeking to move beyond the Baron-Kenny framework.
- 2017
VanderWeele, T. J., & Ding, P. (2017). Sensitivity Analysis in Observational Research: Introducing the E-Value. Annals of Internal Medicine, 167(4), 268–274.
Foundational on sensitivity analysis · E-value, unmeasured-confounding, epidemiology
VanderWeele and Ding introduce the E-value, a simple and intuitive measure of the minimum strength of association that an unmeasured confounder would need to have with both the treatment and outcome to fully explain away an observed treatment-outcome association. The E-value is widely adopted in epidemiology and increasingly discussed in social science.
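The E-value has a closed form that is easy to verify: for a risk ratio RR >= 1, E = RR + sqrt(RR * (RR - 1)), with protective effects (RR < 1) inverted first. A minimal sketch:

```python
import math

def e_value(rr: float) -> float:
    """E-value for an observed risk ratio: the minimum strength of
    association (on the risk-ratio scale) an unmeasured confounder would
    need with both treatment and outcome to explain the estimate away."""
    if rr < 1:                      # protective effects: invert first
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))

# An observed risk ratio of 2 requires a confounder associated with both
# treatment and outcome by a risk ratio of at least about 3.41.
print(round(e_value(2.0), 2))       # 3.41
```

A null estimate (RR = 1) gives an E-value of 1: any confounding at all could explain it.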
- 2006
Villalonga, B., & Amit, R. (2006). How Do Family Ownership, Control and Management Affect Firm Value? Journal of Financial Economics, 80(2), 385–417.
doi.org/10.1016/j.jfineco.2004.12.005
Application on ols regression · family-firms, corporate-governance, firm-value
Villalonga and Amit study how different forms of family involvement — ownership, control, and management — affect firm value using OLS regression with clustered standard errors on a panel of Fortune 500 firms. The paper disentangles the separate effects of family ownership, voting control through dual-class shares and pyramids, and family management on Tobin's q.
- 2018
Wager, S., & Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests. Journal of the American Statistical Association, 113(523), 1228–1242.
doi.org/10.1080/01621459.2017.1319839
Foundational on causal forests · causal-forests, random-forests, asymptotic-normality
Wager and Athey develop causal forests by extending random forests to estimate conditional average treatment effects. They prove pointwise consistency and asymptotic normality under regularity conditions, enabling valid confidence intervals for individualized treatment effect estimates.
- 2002
Wagner, A. K., Soumerai, S. B., Zhang, F., & Ross-Degnan, D. (2002). Segmented Regression Analysis of Interrupted Time Series Studies in Medication Use Research. Journal of Clinical Pharmacy and Therapeutics, 27(4), 299–309.
doi.org/10.1046/j.1365-2710.2002.00430.x
Foundational on interrupted time series · foundational, segmented-regression, health-services
Wagner and colleagues formalize segmented regression for ITS in health services research. The paper clearly specifies the model with level-change and slope-change parameters, discusses autocorrelation correction, and provides practical recommendations for minimum series length and model diagnostics.
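The segmented model can be written as y_t = b0 + b1*time + b2*post + b3*time_since + e_t, where b2 is the immediate level change and b3 the slope change after the interruption. A simulated sketch (all series parameters invented) recovers the segments by OLS:

```python
import numpy as np

rng = np.random.default_rng(0)
T, t0 = 48, 24                          # 48 monthly points, interruption at 24
time = np.arange(T)
post = (time >= t0).astype(float)                 # level-change indicator
time_since = np.where(time >= t0, time - t0, 0.0) # slope-change clock

# invented series: baseline slope 0.5, level drop -5, slope change -0.3
y = 10 + 0.5 * time - 5 * post - 0.3 * time_since + rng.normal(0, 1, T)

X = np.column_stack([np.ones(T), time, post, time_since])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print([round(b, 2) for b in beta])      # close to [10, 0.5, -5, -0.3]
```

In practice one would also model autocorrelation (e.g. Newey-West or GLS standard errors), as the paper recommends; plain OLS is shown here only to make the level-change and slope-change parameters concrete.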
- 2023
Webb, M. D. (2023). Reworking Wild Bootstrap-Based Inference for Clustered Errors. Canadian Journal of Economics, 56(3), 839–858.
Foundational on clustering inference · wild-bootstrap, few-clusters, Webb-weights
Webb introduces the six-point distribution as an alternative to Rademacher weights for the wild cluster bootstrap. The Webb weights improve finite-sample performance when the number of clusters is very small.
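The six-point distribution itself is simple to state: each cluster draws one of plus or minus sqrt(3/2), 1, sqrt(1/2), each with probability 1/6, giving mean 0 and variance 1. A sketch of the weight draw (the wild-bootstrap loop around it is omitted):

```python
import numpy as np

# Webb's six-point distribution has six support points instead of the
# Rademacher distribution's two, which avoids duplicate bootstrap samples
# when the number of clusters is very small.
webb_points = np.array([-np.sqrt(1.5), -1.0, -np.sqrt(0.5),
                        np.sqrt(0.5), 1.0, np.sqrt(1.5)])

def webb_weights(n_clusters: int, rng: np.random.Generator) -> np.ndarray:
    """One weight per cluster, drawn from Webb's six-point distribution."""
    return rng.choice(webb_points, size=n_clusters)

rng = np.random.default_rng(42)
w = webb_weights(100_000, rng)
print(round(w.mean(), 2), round(w.var(), 2))   # approximately 0.0 and 1.0
```

Inside a wild cluster bootstrap, each replication multiplies every cluster's residual vector by that cluster's single weight before rebuilding the outcome.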
- 1993
Westfall, P. H., & Young, S. S. (1993). Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley.
Foundational on multiple testing · resampling, permutation, step-down, textbook
Westfall and Young develop resampling-based methods for multiple testing that account for the dependence structure among test statistics. Their permutation-based step-down procedure is less conservative than Bonferroni and becomes a standard reference for multiple testing adjustments in applied research.
- 1980
White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48(4), 817–838.
Foundational on ols regression · robust-standard-errors, heteroskedasticity, inference
White introduces the now-standard 'robust standard errors' that researchers routinely use with OLS. Before White's correction, standard errors could be misleadingly small when the variance of the error term was not constant across observations. Nearly every empirical paper today uses some variant of this approach.
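The estimator has a compact sandwich form, V = (X'X)^-1 X' diag(e_i^2) X (X'X)^-1 (the HC0 variant). A sketch on invented heteroskedastic data, comparing robust against naive standard errors:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(size=n) * (1 + np.abs(x))  # variance grows with |x|

beta = np.linalg.solve(X.T @ X, X.T @ y)              # OLS coefficients
e = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (e ** 2)[:, None])                  # X' diag(e^2) X
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

sigma2 = e @ e / (n - X.shape[1])                     # homoskedastic formula
se_naive = np.sqrt(np.diag(sigma2 * XtX_inv))
print(np.round(se_naive, 3), np.round(se_robust, 3))
```

With this error structure the naive slope standard error is too small; the robust one corrects it without modeling the variance function.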
- 2019
Wolfolds, S. E., & Siegel, J. (2019). Misaccounting for Endogeneity: The Peril of Relying on the Heckman Two-Step Method without a Valid Instrument. Strategic Management Journal, 40(3), 432–462.
Heckman-correction, exclusion-restriction, selection-models, misapplication
Wolfolds and Siegel demonstrate that the Heckman selection correction is frequently misapplied in management research, particularly when the exclusion restriction is not credible. They show via simulation and replication that applying the Heckman correction without a valid instrument can introduce more bias than it removes. The paper provides a cautionary guide for researchers considering selection models and recommends transparent reporting of the exclusion restriction.
- 1999
Wooldridge, J. M. (1999). Distribution-Free Estimation of Some Nonlinear Panel Data Models. Journal of Econometrics, 90(1), 77–97.
doi.org/10.1016/S0304-4076(98)00033-5
Foundational on poisson negative binomial · quasi-MLE, panel-data, robustness
Wooldridge shows that Poisson quasi-maximum-likelihood estimation in panel data models is consistent for the conditional mean even if the data are not Poisson-distributed, as long as the mean is correctly specified. This result justifies the widespread use of Poisson regression for non-count continuous outcomes and provides the foundation for distribution-free estimation of nonlinear panel data models.
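The estimator solves the Poisson score equations X'(y - exp(Xb)) = 0, which depend only on the conditional mean. A sketch on an invented continuous outcome whose mean is exp(Xb) but whose distribution is lognormal rather than Poisson, solved by Newton-Raphson:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
mu = np.exp(0.5 + 0.7 * x)                      # true conditional mean

# continuous, non-Poisson outcome with E[y|x] = mu exactly
# (lognormal noise rescaled to have mean 1)
y = mu * rng.lognormal(sigma=0.5, size=n) / np.exp(0.125)

beta = np.zeros(2)
for _ in range(50):                             # Newton steps on the QML score
    m = np.exp(X @ beta)
    score = X.T @ (y - m)                       # X'(y - exp(Xb))
    hess = X.T @ (X * m[:, None])               # negative Hessian
    beta = beta + np.linalg.solve(hess, score)

print(np.round(beta, 2))                        # close to the true [0.5, 0.7]
```

Robust (sandwich) standard errors would accompany this in practice, since the Poisson variance assumption is deliberately false here.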
- 2010
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.
Survey on cox proportional hazard, experimental design, fixed effects, +7 more · textbook, panel-data, reference
Wooldridge's graduate textbook is the standard reference for cross-section and panel data econometrics. Chapters 10-11 provide a thorough treatment of fixed effects, random effects, and related panel data methods, while later chapters cover general estimation methodology (MLE, GMM, M-estimation) with panel data applications throughout. The book covers both linear and nonlinear models with careful attention to assumptions.
- 2019
Wooldridge, J. M. (2019). Correlated Random Effects Models with Unbalanced Panels. Journal of Econometrics, 211(1), 137–150.
doi.org/10.1016/j.jeconom.2018.12.010
Foundational on random effects · correlated-random-effects, unbalanced-panels, panel-data, CRE
Wooldridge extends the correlated random effects (CRE) framework to handle unbalanced panels, which are the norm in applied research. This paper shows how to combine the flexibility of fixed effects with the ability to estimate effects of time-invariant variables, making the CRE approach practical for real-world datasets.
- 2017
Young, C., & Holsteen, K. (2017). Model Uncertainty and Robustness: A Computational Framework for Multimodel Analysis. Sociological Methods & Research, 46(1), 3–40.
doi.org/10.1177/0049124115610347
Foundational on specification curve · model-uncertainty, multimodel-analysis, sociology
Young and Holsteen develop a computational framework for systematically exploring model uncertainty by running thousands of plausible specifications. Their approach is one of the earliest implementations of what would become known as specification curve or multiverse analysis, applied to sociological research.
- 2019
Young, A. (2019). Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results. Quarterly Journal of Economics, 134(2), 557–598.
Application on randomization inference · replication, experimental-economics, inference
Young applies randomization inference to a large sample of experimental papers published in top economics journals and finds that many results that appear significant under conventional inference are insignificant under randomization tests. This paper demonstrates the practical importance of randomization inference for credible empirical research.
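The underlying test is mechanical: permute the treatment labels under the sharp null of no effect and count how often the permuted statistic is at least as extreme as the observed one. A sketch with invented data:

```python
import numpy as np

def randomization_p_value(y, d, draws=2000, seed=0):
    """Two-sided permutation p-value for a difference in means under the
    sharp null of no effect for any unit."""
    rng = np.random.default_rng(seed)
    observed = y[d == 1].mean() - y[d == 0].mean()
    hits = 0
    for _ in range(draws):
        perm = rng.permutation(d)               # re-randomize the labels
        stat = y[perm == 1].mean() - y[perm == 0].mean()
        if abs(stat) >= abs(observed):
            hits += 1
    return (hits + 1) / (draws + 1)             # standard +1 correction

rng = np.random.default_rng(1)
d = np.repeat([0, 1], 100)                      # 100 control, 100 treated
y = rng.normal(size=200) + 0.8 * d              # invented true effect of 0.8
p = randomization_p_value(y, d)
print(p)                                        # small: effect is detectable
```

The permutation should mirror the actual randomization scheme (e.g. within strata for stratified designs), which is exactly the discipline Young argues conventional inference often lacks.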
- 2022
Young, A. (2022). Consistency Without Inference: Instrumental Variables in Practical Application. European Economic Review, 147, 104112.
doi.org/10.1016/j.euroecorev.2022.104112
Application on instrumental variables · weak-instruments, published-research, replication, inference-failures
Young reexamines published IV applications and argues that standard first-stage F-statistic diagnostics are largely uninformative about both test size and bias under non-iid errors and high leverage. The paper finds that IV estimates in practice rarely demonstrate that OLS is biased, raising broader questions about the reliability of IV as commonly implemented.
- 2009
Zelner, B. A. (2009). Using Simulation to Interpret Results from Logit, Probit, and Other Nonlinear Models. Strategic Management Journal, 30(12), 1335–1348.
simulation, interpretation, predicted-probabilities
Zelner advocates using simulation-based approaches to interpret and present results from nonlinear models in management research. By computing predicted probabilities and marginal effects via simulation, researchers can convey substantive significance more clearly than raw coefficients.
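Zelner's recipe can be sketched as: draw coefficient vectors from the estimated sampling distribution N(beta_hat, V_hat), push each draw through the predicted-probability formula at a covariate profile of interest, and report percentiles. The logit estimates and covariance matrix below are hypothetical stand-ins for fitted values:

```python
import numpy as np

beta_hat = np.array([-1.0, 0.8])            # hypothetical logit estimates
V_hat = np.array([[0.04, 0.00],             # hypothetical covariance matrix
                  [0.00, 0.09]])

rng = np.random.default_rng(0)
draws = rng.multivariate_normal(beta_hat, V_hat, size=10_000)

x_profile = np.array([1.0, 1.0])            # intercept, focal covariate at 1
p_sim = 1.0 / (1.0 + np.exp(-(draws @ x_profile)))  # logit link per draw

lo, med, hi = np.percentile(p_sim, [2.5, 50, 97.5])
print(f"Pr(y=1 | x=1): {med:.2f} [{lo:.2f}, {hi:.2f}]")
```

Differencing the simulated probabilities between two covariate profiles gives a marginal effect with an uncertainty interval, which is the quantity Zelner recommends reporting instead of raw coefficients.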
- 2010
Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and Truths about Mediation Analysis. Journal of Consumer Research, 37(2), 197–206.
Foundational on causal mediation analysis · mediation-classification, Baron-Kenny-critique, consumer-research
Zhao, Lynch, and Chen provide an important critique of the Baron and Kenny mediation framework from within the marketing literature. They argue that the 'step 1' requirement of a significant total effect is unnecessary and introduces a more sensible classification of mediation types (complementary, competitive, indirect-only, direct-only, no-effect). While still operating within the regression framework rather than the full causal framework, this paper is a significant step forward for applied researchers.
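The five-way typology can be written as a small decision rule keyed on the significance and signs of the indirect (a*b) and direct (c') effects; in practice the significance flags would come from a bootstrap test of the indirect effect. A minimal sketch:

```python
def classify_mediation(indirect: float, direct: float,
                       indirect_sig: bool, direct_sig: bool) -> str:
    """Zhao, Lynch, and Chen's typology: both paths significant and
    same-signed is complementary, opposite-signed is competitive;
    otherwise classify by whichever path (if any) is significant."""
    if indirect_sig and direct_sig:
        return "complementary" if indirect * direct > 0 else "competitive"
    if indirect_sig:
        return "indirect-only"
    if direct_sig:
        return "direct-only"
    return "no-effect"

print(classify_mediation(0.20, 0.35, True, True))    # complementary
print(classify_mediation(0.20, -0.35, True, True))   # competitive
print(classify_mediation(0.20, 0.05, True, False))   # indirect-only
```

Note that "indirect-only" corresponds to what Baron and Kenny called full mediation, without requiring their significant total effect in step 1.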
- 2019
Zhao, Q., Small, D. S., & Bhattacharya, B. B. (2019). Sensitivity Analysis for Inverse Probability Weighting Estimators via the Percentile Bootstrap. Journal of the Royal Statistical Society: Series B, 81(4), 735–761.
Foundational on doubly robust estimation · sensitivity-analysis, healthcare, bootstrap, AIPW
Zhao, Small, and Bhattacharya develop sensitivity analysis tools for inverse probability weighted and augmented IPW estimators via the percentile bootstrap. They apply the methods to evaluate the causal effect of fish consumption on blood mercury levels, demonstrating practical use of AIPW sensitivity analysis in an observational study context. The paper provides a computationally convenient approach for assessing how sensitive doubly robust estimates are to violations of the unconfoundedness assumption.