grad-panel-data

Panel Data Analysis

Overview

Panel data analysis exploits both cross-sectional and temporal variation to estimate causal effects while controlling for unobserved heterogeneity. Fixed effects eliminate time-invariant confounders through within-entity demeaning, while random effects assume unobserved heterogeneity is uncorrelated with regressors, yielding more efficient estimates when valid.
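
To make the within-entity demeaning concrete, here is a minimal sketch on simulated data. The data-generating process, variable names, and parameter values are all assumptions chosen for illustration; the entity effect is deliberately correlated with the regressor so that pooled OLS is biased while the within estimator is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods, true_beta = 200, 6, 2.0  # illustrative values

# The entity effect alpha_i is correlated with x, so pooled OLS is biased.
alpha = rng.normal(size=n_entities)
x = alpha[:, None] + rng.normal(size=(n_entities, n_periods))
y = true_beta * x + 3.0 * alpha[:, None] + rng.normal(size=(n_entities, n_periods))


def slope(x_arr, y_arr):
    """OLS slope of y on a single regressor x (intercept handled by centering)."""
    xc = x_arr - x_arr.mean()
    yc = y_arr - y_arr.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


# Pooled OLS ignores alpha_i entirely.
beta_pooled = slope(x.ravel(), y.ravel())

# Within estimator: subtract each entity's own time mean from x and y.
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
beta_fe = slope(x_within.ravel(), y_within.ravel())

print(f"pooled OLS: {beta_pooled:.2f}  fixed effects: {beta_fe:.2f}  truth: {true_beta}")
```

Because demeaning removes anything constant within an entity, the same transformation would also wipe out any time-invariant regressor.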

When to Use

Data has repeated observations for the same entities (firms, individuals, countries) over time
Unobserved time-invariant factors likely confound the relationship of interest
Testing whether a policy or treatment effect varies across time periods
Dynamic models where the lagged dependent variable is a regressor (use GMM)

When NOT to Use

Pure cross-sectional data with no time dimension
Interest is in estimating the effect of time-invariant variables (FE eliminates these)
Panel is extremely short (T = 2) with many endogenous regressors
Attrition is non-random and creates survivorship bias

Assumptions

IRON LAW: Fixed effects control ONLY for TIME-INVARIANT unobservables;
time-varying confounders remain a threat. FE does not solve all
endogeneity problems.

Key assumptions:

Strict exogeneity for FE/RE: the idiosyncratic error in each period is uncorrelated with the regressors in every period (past, current, and future)
No serial correlation in idiosyncratic errors (or use cluster-robust SEs)
RE additionally assumes individual effects are uncorrelated with regressors
For dynamic GMM: instruments are valid and few enough to avoid instrument proliferation
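
The first and third assumptions can be written out for the standard linear panel model. The notation below is an assumption of this sketch (entities indexed by i, periods by t, individual effect α_i, idiosyncratic error ε_it) and may differ from the notation in the cited references.

```latex
% Linear unobserved-effects model
y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it},
\qquad i = 1,\dots,N,\quad t = 1,\dots,T

% Strict exogeneity (needed by both FE and RE)
E[\varepsilon_{it} \mid x_{i1},\dots,x_{iT},\alpha_i] = 0 \quad \text{for all } t

% Additional RE assumption: individual effects unrelated to the regressors
E[\alpha_i \mid x_{i1},\dots,x_{iT}] = E[\alpha_i]

% Within transformation used by FE: alpha_i drops out
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)
```

A lagged dependent variable violates strict exogeneity by construction, because y_{i,t-1} depends on ε_{i,t-1}.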

Methodology

Step 1 — Explore Panel Structure

Report N (entities), T (time periods), balance status. Check within vs between variation for key variables. Visualize entity-level trends.
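
A minimal sketch of these checks with pandas follows. The toy data frame and its column names (`firm`, `year`, `x`) are hypothetical; the panel is assumed to be in long format with one row per entity-period.

```python
import pandas as pd

# Hypothetical long-format panel: one row per (firm, year).
df = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "year": [2019, 2020, 2021, 2019, 2020, 2021, 2019, 2020],
    "x":    [1.0, 1.2, 1.1, 3.0, 3.3, 3.1, 5.0, 5.4],
})

n_entities = df["firm"].nunique()
n_periods = df["year"].nunique()

# Balanced means every entity is observed in every period.
obs_per_entity = df.groupby("firm")["year"].nunique()
is_balanced = bool((obs_per_entity == n_periods).all())

# Between variation: spread of the entity means.
# Within variation: spread of observations around their own entity mean.
entity_mean = df.groupby("firm")["x"].transform("mean")
between_sd = df.groupby("firm")["x"].mean().std()
within_sd = (df["x"] - entity_mean).std()

print(f"N={n_entities}, T={n_periods}, balanced={is_balanced}")
print(f"between SD={between_sd:.2f}, within SD={within_sd:.2f}")
```

When the within standard deviation is small relative to the between standard deviation, as in this toy example, fixed effects will have little variation left to work with.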

Step 2 — Estimate FE and RE Models

Run fixed effects (within estimator) and random effects (GLS). Include time fixed effects if common shocks exist. Use cluster-robust standard errors at the entity level.
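
The fixed effects part of this step can be sketched by hand with NumPy. The simulated data and all names below are assumptions for illustration; the sketch covers only a single regressor in a balanced panel, uses no small-sample correction for the clustered variance, and does not implement the random effects GLS estimator. In practice a dedicated panel library would normally handle both models.

```python
import numpy as np

rng = np.random.default_rng(1)
n_entities, n_periods, true_beta = 300, 8, 1.0  # illustrative values

alpha = rng.normal(size=(n_entities, 1))   # entity effects
shock = rng.normal(size=(1, n_periods))    # common time shocks
x = alpha + shock + rng.normal(size=(n_entities, n_periods))
y = (true_beta * x + 2.0 * alpha + 2.0 * shock
     + rng.normal(size=(n_entities, n_periods)))


def two_way_demean(a):
    """Remove entity means and time means (valid as written for balanced panels only)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


x_t, y_t = two_way_demean(x), two_way_demean(y)
beta_hat = float((x_t * y_t).sum() / (x_t ** 2).sum())
resid = y_t - beta_hat * x_t

# Cluster-robust sandwich variance, clustering by entity:
# sum the scores x*e within each entity before squaring.
score_by_entity = (x_t * resid).sum(axis=1)
var_cluster = float((score_by_entity ** 2).sum() / (x_t ** 2).sum() ** 2)
se_cluster = var_cluster ** 0.5

print(f"two-way FE beta: {beta_hat:.3f}  cluster-robust SE: {se_cluster:.3f}")
```

Summing scores within an entity before squaring is what allows arbitrary serial correlation in that entity's errors.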

Step 3 — Hausman Test for Model Selection

Test H₀: RE is consistent (individual effects uncorrelated with regressors). Rejection favors FE. See references/ for the test statistic derivation.
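
For orientation, the usual textbook form of the statistic is given below. The notation is an assumption of this sketch, and the derivation in references/ may present it differently; k denotes the number of time-varying coefficients being compared.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\overset{a}{\sim}\; \chi^2_k \quad \text{under } H_0
```

This form relies on RE being efficient under H₀, so the variance difference is computed from conventional rather than cluster-robust covariance estimates. With clustered errors, a robust regression-based variant is generally preferred.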

Step 4 — Dynamic Extensions (if needed)

If lagged DV is included, use Arellano-Bond or System GMM. Report AR(1), AR(2) tests and Hansen/Sargan test for instrument validity. Monitor instrument count.
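
The sketch below illustrates why a lagged dependent variable calls for an instrumental approach. It is not Arellano-Bond or System GMM: it uses the simpler Anderson-Hsiao estimator (first-difference the model, then instrument Δy_{t-1} with the level y_{t-2}), which rests on the same idea, and it omits the AR and Hansen/Sargan diagnostics. The simulated AR(1) panel and all parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_entities, n_periods, burn_in, rho = 2000, 6, 50, 0.5  # illustrative values

# Simulate y_it = rho * y_{i,t-1} + alpha_i + eps_it, discarding a burn-in.
alpha = rng.normal(size=n_entities)
y = np.zeros((n_entities, burn_in + n_periods))
for t in range(1, burn_in + n_periods):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_entities)
y = y[:, burn_in:]  # keep the last n_periods columns

# Naive FE: within-demean y_t and y_{t-1}, then regress one on the other.
y_cur, y_lag = y[:, 1:], y[:, :-1]
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
rho_fe = float((yl * yc).sum() / (yl ** 2).sum())

# Anderson-Hsiao: first differences remove alpha_i; y_{t-2} instruments dy_{t-1}.
dy = np.diff(y, axis=1)                  # dy[:, j] = y[:, j+1] - y[:, j]
dy_cur, dy_lag = dy[:, 1:], dy[:, :-1]   # dy_t and dy_{t-1}
z = y[:, :-2]                            # y_{t-2}, aligned with dy_cur
rho_iv = float((z * dy_cur).sum() / (z * dy_lag).sum())

print(f"true rho: {rho}  naive FE: {rho_fe:.2f}  Anderson-Hsiao IV: {rho_iv:.2f}")
```

With this short T, the naive FE estimate falls well below the true value, while the IV estimate stays close to it. GMM estimators extend the idea by using more lags as instruments, which is why the instrument count needs monitoring.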

Output Format

Panel Data Analysis: [Study Title]

Panel Structure

Dimension	Value
Entities (N)	xxx
Time periods (T)	xxx
Balanced?	[Yes/No]

Estimation Results

Variable	FE (β)	RE (β)	GMM (β)
[var]	x.xx (x.xx)	x.xx (x.xx)	x.xx (x.xx)

Model Selection

Test	Statistic	p-value	Decision
Hausman	x.xx	x.xx	[FE/RE]
AR(2)	x.xx	x.xx	[pass/fail]
Hansen J	x.xx	x.xx	[pass/fail]

Key Findings

[Interpretation]

Limitations

[Note any assumption violations]

Gotchas

FE discards all between-entity variation; if most variation is between, FE estimates are imprecise
Hausman test has low power in small samples; failing to reject H₀ does not validate RE
Dynamic panel GMM with too many instruments causes overfitting and weakens the Hansen test
Nickell bias afflicts FE estimates with a lagged DV when T is small
Two-way FE (entity + time) is often necessary but rarely the default in software
Cluster-robust standard errors require a sufficient number of clusters (N ≥ 50 as guideline)

References

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.
Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58(2), 277-297.
Baltagi, B. H. (2013). Econometric Analysis of Panel Data (5th ed.). Wiley.
