grad-panel-data
Panel Data Analysis
Overview
Panel data analysis exploits both cross-sectional and temporal variation to estimate causal effects while controlling for unobserved heterogeneity. Fixed effects eliminate time-invariant confounders through within-entity demeaning, while random effects assume unobserved heterogeneity is uncorrelated with regressors, yielding more efficient estimates when valid.
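To make the within-entity demeaning concrete, the derivation below sketches why the entity effect drops out. The notation (outcome $y_{it}$, regressors $x_{it}$, entity effect $\alpha_i$, idiosyncratic error $\varepsilon_{it}$, and entity means marked with a bar) is assumed here for illustration and does not come from the original text.

```latex
\begin{align*}
y_{it} &= x_{it}'\beta + \alpha_i + \varepsilon_{it} \\
\bar{y}_i &= \bar{x}_i'\beta + \alpha_i + \bar{\varepsilon}_i \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)
\end{align*}
```

Because $\alpha_i$ is constant over time, it equals its own entity mean and cancels in the third line. Any time-invariant regressor cancels the same way, which is why FE cannot estimate its effect.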
When to Use
- Data has repeated observations for the same entities (firms, individuals, countries) over time
- Unobserved time-invariant factors likely confound the relationship of interest
- Testing whether a policy or treatment effect varies across time periods
- Dynamic models where the lagged dependent variable is a regressor (use GMM)
When NOT to Use
- Pure cross-sectional data with no time dimension
- Interest is in estimating the effect of time-invariant variables (FE eliminates these)
- Panel is extremely short (T = 2) with many endogenous regressors
- Attrition is non-random and creates survivorship bias
Assumptions
IRON LAW: Fixed effects control ONLY for TIME-INVARIANT unobservables.
Time-varying confounders remain a threat, so FE does not solve all
endogeneity problems.
Key assumptions:
- Strict exogeneity for FE/RE: past, current, and future errors are uncorrelated with regressors
- No serial correlation in idiosyncratic errors (or use cluster-robust SEs)
- RE additionally assumes individual effects are uncorrelated with regressors
- For dynamic GMM: instruments are valid and not too many (instrument proliferation)
Methodology
Step 1 — Explore Panel Structure
Report N (entities), T (time periods), balance status. Check within vs between variation for key variables. Visualize entity-level trends.
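The structure report above can be sketched with the standard library alone. The function name `panel_summary`, the row layout (dicts keyed by `"entity"` and `"time"`), and the toy firm data are all hypothetical choices made for this illustration, not part of the original workflow.

```python
from collections import defaultdict
from statistics import mean, stdev

def panel_summary(rows, var):
    """Report N, T, balance, and within/between spread of `var`.

    rows: iterable of dicts with keys "entity", "time", and `var`.
    """
    by_entity = defaultdict(dict)
    for row in rows:
        by_entity[row["entity"]][row["time"]] = row[var]

    periods = {t for obs in by_entity.values() for t in obs}
    entity_means = {e: mean(list(obs.values())) for e, obs in by_entity.items()}
    # Within variation: deviations of each observation from its entity mean.
    within = [v - entity_means[e] for e, obs in by_entity.items() for v in obs.values()]
    return {
        "N": len(by_entity),
        "T": len(periods),
        "balanced": all(len(obs) == len(periods) for obs in by_entity.values()),
        "within_sd": stdev(within),
        # Between variation: spread of the entity means themselves.
        "between_sd": stdev(list(entity_means.values())),
    }

# Hypothetical toy panel: firm C is missing 2002, so the panel is unbalanced.
rows = [
    {"entity": e, "time": t, "x": x}
    for e, series in {"A": [1, 2, 3], "B": [10, 11, 12], "C": [20, 21]}.items()
    for t, x in zip([2000, 2001, 2002], series)
]
summary = panel_summary(rows, "x")
print(summary)
```

In this toy panel the between spread is much larger than the within spread, which is the situation in which FE estimates tend to be imprecise.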
Step 2 — Estimate FE and RE Models
Run fixed effects (within estimator) and random effects (GLS). Include time fixed effects if common shocks exist. Use cluster-robust standard errors at the entity level.
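As a rough illustration of the within estimator with entity-clustered standard errors, the NumPy sketch below demeans by entity and builds a cluster "sandwich" covariance. The function `within_estimator`, the simulated data, and the true coefficient of 2.0 are assumptions made for this example. It applies no small-sample correction and omits time effects, so a dedicated panel package may report somewhat different standard errors.

```python
import numpy as np

def within_estimator(y, X, entity):
    """Entity fixed-effects (within) estimator with entity-clustered SEs.

    y: (n,) outcomes; X: (n,) or (n, k) regressors without a constant;
    entity: (n,) entity identifiers.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    ids, idx = np.unique(np.asarray(entity), return_inverse=True)
    idx = idx.ravel()
    counts = np.bincount(idx)

    # Demean y and each column of X within entity; this removes the entity effect.
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    X_dm = np.empty_like(X)
    for j in range(X.shape[1]):
        X_dm[:, j] = X[:, j] - (np.bincount(idx, weights=X[:, j]) / counts)[idx]

    bread = np.linalg.inv(X_dm.T @ X_dm)
    beta = bread @ X_dm.T @ y_dm
    resid = y_dm - X_dm @ beta

    # Cluster-robust sandwich: sum the scores within each entity first.
    cluster_scores = np.zeros((len(ids), X.shape[1]))
    np.add.at(cluster_scores, idx, X_dm * resid[:, None])
    cov = bread @ (cluster_scores.T @ cluster_scores) @ bread
    return beta, np.sqrt(np.diag(cov))

# Simulated panel in which the regressor is correlated with the entity effect.
rng = np.random.default_rng(0)
n_entities, n_periods = 200, 5
entity = np.repeat(np.arange(n_entities), n_periods)
alpha = rng.normal(size=n_entities)[entity]   # unobserved entity effect
x = alpha + rng.normal(size=entity.size)
y = 2.0 * x + alpha + rng.normal(size=entity.size)

beta_fe, se_fe = within_estimator(y, x, entity)
beta_pooled = (x @ y) / (x @ x)               # pooled OLS ignores alpha
print(beta_fe[0], se_fe[0], beta_pooled)
```

In this simulation pooled OLS is biased upward because it attributes the entity effect to `x`, while the within estimate should land near the assumed true value.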
Step 3 — Hausman Test for Model Selection
Test H₀: RE is consistent (individual effects are uncorrelated with regressors). Rejecting H₀ favors FE. See references/ for the test statistic derivation.
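For orientation, the classical Hausman statistic is the quadratic form H = d' (V_FE - V_RE)^(-1) d with d = b_FE - b_RE, compared against a chi-squared distribution with as many degrees of freedom as coefficients tested. The sketch below assumes conventional (non-robust) covariance matrices, since the classical form relies on RE being efficient under H₀. The function name `hausman` and the input numbers are hypothetical.

```python
import math
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Return the Hausman statistic and its degrees of freedom.

    b_fe, b_re: coefficient estimates on the same time-varying regressors.
    V_fe, V_re: their conventional covariance matrices (scalars if k = 1).
    """
    d = np.atleast_1d(np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float))
    V_diff = np.atleast_2d(np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float))
    # In finite samples V_diff may fail to be positive definite; the statistic
    # is then unreliable and should not be interpreted.
    stat = float(d @ np.linalg.solve(V_diff, d))
    return stat, d.size

# Hypothetical single-coefficient example (standard errors 0.05 and 0.04).
stat, df = hausman(0.50, 0.62, 0.05 ** 2, 0.04 ** 2)

# With one degree of freedom the chi-squared tail has a closed form:
# P(chi2_1 > h) = erfc(sqrt(h / 2)). Other df need a chi-squared routine.
p_value = math.erfc(math.sqrt(stat / 2))
print(stat, df, p_value)
```

Here the coefficient gap is large relative to the variance difference, so the statistic is about 16 and H₀ would be rejected in favor of FE.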
Step 4 — Dynamic Extensions (if needed)
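If a lagged DV is included, use Arellano-Bond (difference GMM) or System GMM. Report the AR(1) and AR(2) tests on the differenced residuals (AR(1) is expected to reject, AR(2) should not) together with the Hansen/Sargan test of instrument validity. Monitor the instrument count.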
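The simulation below is a simplified illustration of why a lagged DV calls for an instrumental approach. It does not implement Arellano-Bond or System GMM. Instead it swaps in the simpler Anderson-Hsiao IV estimator, which first-differences the model and instruments the lagged difference with the level dated two periods back, the same idea that difference GMM extends with more lags. All parameter values (5000 entities, 6 periods, true persistence 0.5) are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n_entities, n_periods, burn_in, rho = 5000, 6, 50, 0.5

# Simulate y_it = rho * y_i,t-1 + alpha_i + eps_it and discard a burn-in.
alpha = rng.normal(size=n_entities)
y = np.zeros((n_entities, burn_in + n_periods))
for t in range(1, y.shape[1]):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_entities)
y = y[:, burn_in:]  # keep the last n_periods columns

# Within (FE) estimate: demean y_t and y_{t-1} by entity, then run OLS.
cur, lag = y[:, 1:], y[:, :-1]
cur_dm = cur - cur.mean(axis=1, keepdims=True)
lag_dm = lag - lag.mean(axis=1, keepdims=True)
rho_fe = (lag_dm * cur_dm).sum() / (lag_dm ** 2).sum()

# Anderson-Hsiao: first-difference, then instrument dy_{t-1} with y_{t-2}.
dy = np.diff(y, axis=1)               # dy[:, j] = y[:, j + 1] - y[:, j]
dy_cur, dy_lag = dy[:, 1:], dy[:, :-1]
z = y[:, :-2]                         # level two periods before dy_cur
rho_ah = (z * dy_cur).sum() / (z * dy_lag).sum()

print(f"true rho={rho}, FE={rho_fe:.3f}, Anderson-Hsiao={rho_ah:.3f}")
```

With so few periods the FE estimate should fall well below the true persistence, while the IV estimate should be close to it, though noisier.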
Output Format
Panel Data Analysis: [Study Title]
Panel Structure
| Dimension | Value |
|---|---|
| Entities (N) | xxx |
| Time periods (T) | xxx |
| Balanced? | [Yes/No] |
Estimation Results
| Variable | FE (β) | RE (β) | GMM (β) |
|---|---|---|---|
| [var] | x.xx (x.xx) | x.xx (x.xx) | x.xx (x.xx) |
Model Selection
| Test | Statistic | p-value | Decision |
|---|---|---|---|
| Hausman | x.xx | x.xx | [FE/RE] |
| AR(2) | x.xx | x.xx | [pass/fail] |
| Hansen J | x.xx | x.xx | [pass/fail] |
Key Findings
- [Interpretation]
Limitations
- [Note any assumption violations]
Gotchas
- FE discards all between-entity variation; if most variation is between, FE estimates are imprecise
- Hausman test has low power in small samples — insignificance does not validate RE
- Dynamic panel GMM with too many instruments causes overfitting and weakens the Hansen test
- Nickell bias afflicts FE estimates with a lagged DV when T is small
- Two-way FE (entity + time) is often necessary but rarely the default in software
- Cluster-robust standard errors require a sufficient number of clusters (N ≥ 50 as a rough guideline)
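To see how quickly instruments proliferate in dynamic panel GMM, the sketch below counts the lagged-level instruments available for the lagged DV in a difference GMM setup that uses every available lag. The function name and the `max_lag` cap are hypothetical, and real software may count additional instruments (for exogenous regressors or the level equation in System GMM).

```python
def difference_gmm_instrument_count(n_periods, max_lag=None):
    """Count lagged-level instruments for the lagged DV in difference GMM.

    The differenced equation at period t (t = 3..T) can use y_1..y_{t-2};
    max_lag caps how many of those lags are used in each period.
    """
    total = 0
    for t in range(3, n_periods + 1):
        available = t - 2
        total += available if max_lag is None else min(available, max_lag)
    return total

for T in (5, 10, 20):
    print(T, difference_gmm_instrument_count(T), difference_gmm_instrument_count(T, max_lag=2))
```

The uncapped count grows quadratically in T, while capping the lag depth keeps it roughly linear. A commonly cited rule of thumb is to keep the instrument count below the number of entities.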
References
- Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.
- Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58(2), 277-297.
- Baltagi, B. H. (2013). Econometric Analysis of Panel Data (5th ed.). Wiley.