grad-panel-data

Panel Data Analysis

Overview

Panel data analysis exploits both cross-sectional and temporal variation to estimate causal effects while controlling for unobserved heterogeneity. Fixed effects eliminate time-invariant confounders through within-entity demeaning, while random effects assume unobserved heterogeneity is uncorrelated with regressors, yielding more efficient estimates when valid.
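
The within-entity demeaning described above can be illustrated with a small simulation. This is a minimal sketch on made-up data, assuming a single regressor and a balanced panel; the variable names and parameter values are illustrative, not taken from any particular study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods, true_beta = 200, 6, 2.0

# Time-invariant unobserved effect, one value per entity.
alpha = rng.normal(size=n_entities)
# The regressor is correlated with alpha, so pooled OLS is confounded.
x = alpha[:, None] + rng.normal(size=(n_entities, n_periods))
y = true_beta * x + 3.0 * alpha[:, None] + rng.normal(size=(n_entities, n_periods))

def slope(x_arr, y_arr):
    """OLS slope of y on x (with intercept) for a single regressor."""
    xc = x_arr - x_arr.mean()
    yc = y_arr - y_arr.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())

# Pooled OLS ignores the entity effect and absorbs it into the slope.
beta_pooled = slope(x.ravel(), y.ravel())

# Within estimator: subtract each entity's own time mean, which removes alpha.
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
beta_fe = slope(x_within.ravel(), y_within.ravel())

print(f"pooled: {beta_pooled:.2f}, within: {beta_fe:.2f}, true: {true_beta}")
```

In this setup the pooled slope drifts well above the true coefficient, while the within estimate stays close to it, because demeaning removes everything that is constant within an entity.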

When to Use

  • Data has repeated observations for the same entities (firms, individuals, countries) over time
  • Unobserved time-invariant factors likely confound the relationship of interest
  • Testing whether a policy or treatment effect varies across time periods
  • Dynamic models where the lagged dependent variable is a regressor (use GMM)

When NOT to Use

  • Pure cross-sectional data with no time dimension
  • Interest is in estimating the effect of time-invariant variables (FE eliminates these)
  • Panel is extremely short (T = 2) with many endogenous regressors
  • Attrition is non-random and creates survivorship bias
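
The point about time-invariant variables can be checked directly: once entity means are subtracted, such a variable has no variation left to estimate from. A tiny sketch, assuming a hypothetical region dummy on a balanced panel of 3 entities and 4 periods:

```python
import numpy as np

# Hypothetical time-invariant regressor: the same value in every period of an entity.
region = np.array([[1.0] * 4, [0.0] * 4, [1.0] * 4])

# The within transformation subtracts each entity's own mean.
region_within = region - region.mean(axis=1, keepdims=True)
has_within_variation = bool(np.any(region_within != 0))

print(has_within_variation)
```

The demeaned column is identically zero, which is why FE software drops such regressors or reports them as collinear with the entity effects.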

Assumptions

IRON LAW: Fixed effects control ONLY for TIME-INVARIANT unobservables;
time-varying confounders remain a threat. FE does not solve all
endogeneity problems.
Key assumptions:
  1. Strict exogeneity for FE/RE: the idiosyncratic error in each period is uncorrelated with the regressors in every period (past, current, and future)
  2. No serial correlation in idiosyncratic errors (or use cluster-robust SEs)
  3. RE additionally assumes individual effects are uncorrelated with regressors
  4. For dynamic GMM: instruments are valid and not too many (instrument proliferation)
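
To make the iron law concrete, the sketch below simulates a panel with both a time-invariant effect and an unobserved time-varying confounder. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_entities, n_periods, true_beta = 300, 8, 1.0

alpha = rng.normal(size=(n_entities, 1))          # time-invariant effect (FE removes it)
z = rng.normal(size=(n_entities, n_periods))      # time-varying confounder (FE does not)
x = alpha + z + rng.normal(size=(n_entities, n_periods))
y = true_beta * x + alpha + 2.0 * z + rng.normal(size=(n_entities, n_periods))

def within(a):
    """Subtract each entity's time mean."""
    return a - a.mean(axis=1, keepdims=True)

xw, yw = within(x), within(y)
beta_fe = float((xw * yw).sum() / (xw ** 2).sum())

print(f"within estimate: {beta_fe:.2f}, true: {true_beta}")
```

Here the within estimate lands far from the true coefficient even though alpha has been removed, because z still moves with x inside each entity.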

Methodology

Step 1 — Explore Panel Structure

Report N (entities), T (time periods), balance status. Check within vs between variation for key variables. Visualize entity-level trends.
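
One way to produce these diagnostics is sketched below with numpy on a small hypothetical long-format panel (one row per entity-period). The arrays are made up for illustration; in practice they would come from the study's data.

```python
import numpy as np

# Hypothetical long-format panel: entity 3 is missing its 2021 observation.
entity = np.array([1, 1, 1, 2, 2, 2, 3, 3])
period = np.array([2019, 2020, 2021, 2019, 2020, 2021, 2019, 2020])
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 22.0])

ids, inverse, counts = np.unique(entity, return_inverse=True, return_counts=True)
n_entities = ids.size
n_periods = np.unique(period).size
# Balanced means every entity is observed in every period.
balanced = bool(np.all(counts == n_periods))

# Entity means, then deviations split into within and between parts.
entity_means = np.bincount(inverse, weights=x) / counts
within_dev = x - entity_means[inverse]
between_dev = entity_means[inverse] - x.mean()

within_ss = float((within_dev ** 2).sum())
between_ss = float((between_dev ** 2).sum())
within_share = within_ss / (within_ss + between_ss)

print(n_entities, n_periods, balanced, round(within_share, 3))
```

A very small within share, as in this toy example, is a warning that FE will rely on little variation and may give imprecise estimates.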

Step 2 — Estimate FE and RE Models

Run fixed effects (within estimator) and random effects (GLS). Include time fixed effects if common shocks exist. Use cluster-robust standard errors at the entity level.
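
As a rough illustration of this step, the sketch below computes a two-way (entity plus time) within estimate and an entity-clustered standard error by hand. It assumes a balanced panel and a single regressor, and uses a simple G/(G-1) small-sample factor; statistical packages apply their own corrections, so their numbers may differ slightly.

```python
import numpy as np

rng = np.random.default_rng(2)
n_entities, n_periods, true_beta = 150, 10, 0.5

alpha = rng.normal(size=(n_entities, 1))      # entity effects
gamma = rng.normal(size=(1, n_periods))       # common time shocks
x = alpha + gamma + rng.normal(size=(n_entities, n_periods))
y = true_beta * x + alpha + 2.0 * gamma + rng.normal(size=(n_entities, n_periods))

def two_way_demean(a):
    """Remove entity and period means (valid as written only for a balanced panel)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

xt, yt = two_way_demean(x), two_way_demean(y)
sxx = (xt ** 2).sum()
beta_twfe = float((xt * yt).sum() / sxx)

# Cluster-robust variance: sum the score x * residual within each entity first,
# so arbitrary correlation across periods of the same entity is allowed.
resid = yt - beta_twfe * xt
entity_scores = (xt * resid).sum(axis=1)
g = n_entities
se_cluster = float(np.sqrt(g / (g - 1) * (entity_scores ** 2).sum()) / sxx)

print(f"beta: {beta_twfe:.3f}, clustered SE: {se_cluster:.3f}")
```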

Step 3 — Hausman Test for Model Selection

Test H₀: RE is consistent (individual effects uncorrelated with regressors). Rejection favors FE. See references/ for test statistic derivation.
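
For orientation, the classical Hausman statistic is the quadratic form (b_FE - b_RE)' [V_FE - V_RE]^(-1) (b_FE - b_RE), compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients compared (the time-varying regressors common to both models). The helper below is a minimal sketch; the function name and the input numbers are hypothetical.

```python
import numpy as np

def hausman(b_fe, b_re, v_fe, v_re):
    """Return (statistic, degrees of freedom) for the classical Hausman test."""
    diff = np.atleast_1d(b_fe) - np.atleast_1d(b_re)
    v_diff = np.atleast_2d(v_fe) - np.atleast_2d(v_re)
    stat = float(diff @ np.linalg.solve(v_diff, diff))
    return stat, diff.size

# Hypothetical estimates for two time-varying regressors.
b_fe = np.array([0.50, 1.20])
b_re = np.array([0.40, 1.00])
v_fe = np.array([[0.010, 0.000], [0.000, 0.020]])
v_re = np.array([[0.006, 0.000], [0.000, 0.010]])

stat, df = hausman(b_fe, b_re, v_fe, v_re)
print(f"H = {stat:.2f} on {df} df")
```

With these made-up inputs the statistic is 6.5 on 2 degrees of freedom, which exceeds the 5% chi-squared critical value of about 5.99, so H₀ would be rejected in favor of FE. In finite samples V_FE - V_RE is not guaranteed to be positive definite, in which case this form of the test is unreliable.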

Step 4 — Dynamic Extensions (if needed)

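If a lagged DV is included, use Arellano-Bond (difference GMM) or System GMM. Report the AR(1) and AR(2) tests on the differenced residuals (AR(1) is expected to reject; AR(2) should not) and the Hansen/Sargan test of instrument validity. Monitor the instrument count.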
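
A full Arellano-Bond or System GMM implementation is beyond a short example, so the sketch below uses the simpler Anderson-Hsiao instrumental-variable estimator instead: first-difference the model to remove the entity effect, then instrument the lagged difference with the level dated two periods back. It is not Arellano-Bond, only the just-identified idea that difference GMM builds on, and the simulated data and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_entities, n_periods, burn_in, true_rho = 5000, 6, 50, 0.5

# Simulate y_it = rho * y_i,t-1 + alpha_i + e_it, discarding a burn-in period.
alpha = rng.normal(size=n_entities)
y = np.zeros((n_entities, burn_in + n_periods))
for t in range(1, burn_in + n_periods):
    y[:, t] = true_rho * y[:, t - 1] + alpha + rng.normal(size=n_entities)
y = y[:, burn_in:]

dy = np.diff(y, axis=1)   # dy[:, k] = y[:, k + 1] - y[:, k]; differencing removes alpha
dep = dy[:, 1:]           # differenced outcome, periods 2 onward
lag = dy[:, :-1]          # lagged difference (endogenous after differencing)
inst = y[:, :-2]          # level two periods back, valid if errors are not serially correlated

# Just-identified IV estimator with one instrument.
rho_iv = float((inst * dep).sum() / (inst * lag).sum())

print(f"Anderson-Hsiao estimate: {rho_iv:.2f}, true: {true_rho}")
```

Difference GMM extends this by using all available deeper lags as instruments, which improves efficiency but is also what makes the instrument count grow quickly with T.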

Output Format

Panel Data Analysis: [Study Title]

Panel Structure

| Dimension | Value |
| --- | --- |
| Entities (N) | xxx |
| Time periods (T) | xxx |
| Balanced? | [Yes/No] |

Estimation Results

| Variable | FE (β) | RE (β) | GMM (β) |
| --- | --- | --- | --- |
| [var] | x.xx (x.xx) | x.xx (x.xx) | x.xx (x.xx) |

Model Selection

| Test | Statistic | p-value | Decision |
| --- | --- | --- | --- |
| Hausman | x.xx | x.xx | [FE/RE] |
| AR(2) | x.xx | x.xx | [pass/fail] |
| Hansen J | x.xx | x.xx | [pass/fail] |

Key Findings

  • [Interpretation]

Limitations

  • [Note any assumption violations]

Gotchas

  • FE discards all between-entity variation; if most variation is between, FE estimates are imprecise
  • Hausman test has low power in small samples — insignificance does not validate RE
  • Dynamic panel GMM with too many instruments causes overfitting and weakens the Hansen test
  • Nickell bias afflicts FE estimates with a lagged DV when T is small
  • Two-way FE (entity + time) is often necessary but rarely the default in software
  • Cluster-robust standard errors require a sufficient number of clusters (N ≥ 50 as guideline)
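
The Nickell bias mentioned above is easy to reproduce by simulation. The sketch below applies the ordinary within estimator to a dynamic model with a short time dimension; the data-generating process and parameter values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)
n_entities, n_periods, burn_in, true_rho = 2000, 5, 50, 0.5

# Simulate y_it = rho * y_i,t-1 + alpha_i + e_it and keep n_periods + 1 columns
# so that each entity has n_periods usable (y_t, y_t-1) pairs.
alpha = rng.normal(size=n_entities)
y = np.zeros((n_entities, burn_in + n_periods + 1))
for t in range(1, y.shape[1]):
    y[:, t] = true_rho * y[:, t - 1] + alpha + rng.normal(size=n_entities)
y = y[:, burn_in:]

y_now, y_lag = y[:, 1:], y[:, :-1]

def within(a):
    """Subtract each entity's time mean."""
    return a - a.mean(axis=1, keepdims=True)

# Demeaning makes the lagged DV correlated with the demeaned error when T is small.
rho_fe = float((within(y_lag) * within(y_now)).sum() / (within(y_lag) ** 2).sum())

print(f"FE estimate of rho: {rho_fe:.2f}, true: {true_rho}")
```

Even with 2000 entities the estimate sits well below the true value, because the bias is of order 1/T and does not shrink as N grows.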

References

  • Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.
  • Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58(2), 277-297.
  • Baltagi, B. H. (2013). Econometric Analysis of Panel Data (5th ed.). Wiley.