python-panel-data

Python Panel Data

Purpose

This skill helps economists run panel data models in Python using pandas, statsmodels, and linearmodels, with correct fixed effects, clustering, and diagnostics.

When to Use

  • Estimating fixed effects or random effects models
  • Running difference-in-differences on panel data
  • Creating regression tables and plots in Python

Instructions

Follow these steps to complete the task:

Step 1: Understand the Context

Before generating any code, ask the user:
  • What are the unit of observation and the panel identifiers?
  • Which outcomes and regressors are required?
  • What fixed effects or time effects are needed?
  • How should standard errors be clustered?

Step 2: Generate the Output

Based on the context, generate Python code that:
  1. Loads and cleans the data with pandas
  2. Sets a MultiIndex for panel structure
  3. Fits the model using linearmodels.PanelOLS or RandomEffects
  4. Outputs results in a readable table and optional LaTeX

Step 3: Verify and Explain

After generating output:
  • Interpret key coefficients
  • Note assumptions (strict exogeneity, parallel trends, etc.)
  • Suggest robustness checks (alternative clustering, placebo tests)
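
When interpreting a difference-in-differences coefficient, it can help to show the arithmetic behind it. In a balanced design with two groups and two periods, the coefficient on the treatment-by-post interaction equals the change in the treated group's mean outcome minus the change in the control group's mean outcome. The sketch below uses made-up numbers; the data values are hypothetical, and the column names follow the example in this document.

```python
import pandas as pd

# Hypothetical balanced panel: firms 1-2 are treated, firms 3-4 are controls.
df = pd.DataFrame({
    "firm_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "post":    [0, 1, 0, 1, 0, 1, 0, 1],
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "outcome": [10.0, 15.0, 12.0, 17.0, 8.0, 10.0, 9.0, 11.0],
})

# Mean outcome in each treated-by-post cell.
cell_means = df.groupby(["treated", "post"])["outcome"].mean()

# Change over time within each group.
change_treated = cell_means.loc[(1, 1)] - cell_means.loc[(1, 0)]
change_control = cell_means.loc[(0, 1)] - cell_means.loc[(0, 0)]

# Difference-in-differences: treated change net of the control change.
did_estimate = change_treated - change_control
print(did_estimate)
```

The control group's change stands in for what the treated group would have experienced without treatment, which is the parallel trends assumption noted above.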

Example Prompts

  • "Run a two-way fixed effects model with firm and year effects"
  • "Estimate a DiD using state and year fixed effects"
  • "Export panel regression results to LaTeX"

Example Output

python
# ============================================
# Panel Data Analysis in Python
# ============================================

import pandas as pd
from linearmodels.panel import PanelOLS

# Load data
df = pd.read_csv("panel_data.csv")

# Set panel index
df = df.set_index(["firm_id", "year"])

# Create treatment indicator
df["treat_post"] = df["treated"] * df["post"]

# Two-way fixed effects model
model = PanelOLS.from_formula(
    "outcome ~ 1 + treat_post + EntityEffects + TimeEffects",
    data=df
)
results = model.fit(cov_type="clustered", cluster_entity=True)
print(results.summary)

Requirements

Software

  • Python 3.10+

Packages

  • pandas
  • linearmodels
  • statsmodels
Install with:
bash
pip install pandas linearmodels statsmodels

Best Practices

  1. Always verify the panel identifiers and check whether the panel is balanced or unbalanced
  2. Cluster standard errors at the appropriate level
  3. Check for missing data before estimation
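
These checks can be run with pandas alone before any model is fitted. The following is a minimal sketch on a hypothetical data frame (the values and the column names firm_id, year, and outcome are illustrative assumptions): it counts duplicate identifier pairs, tests whether every firm is observed in every year, and counts missing values per column.

```python
import pandas as pd

# Hypothetical panel: firm 2 is not observed in 2021, and one outcome is missing.
df = pd.DataFrame({
    "firm_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "year":    [2019, 2020, 2021, 2019, 2020, 2019, 2020, 2021],
    "outcome": [1.0, 1.2, None, 0.8, 0.9, 1.5, 1.4, 1.6],
})

# 1. The identifier pair should uniquely identify each row.
n_duplicates = int(df.duplicated(subset=["firm_id", "year"]).sum())

# 2. The panel is balanced when every firm appears in every period.
periods_per_firm = df.groupby("firm_id")["year"].nunique()
is_balanced = bool((periods_per_firm == df["year"].nunique()).all())

# 3. Missing values per column, checked before estimation.
missing_counts = df.isna().sum()

print(f"duplicate (firm_id, year) rows: {n_duplicates}")
print(f"balanced panel: {is_balanced}")
print(missing_counts)
```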

Common Pitfalls

  • Failing to set a proper panel index
  • Using pooled OLS when fixed effects are required
  • Interpreting coefficients without accounting for the fixed effects (with entity effects, estimates reflect within-entity variation only)
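
The second pitfall can be illustrated with a small simulation. This is a sketch on synthetic data, with all names and parameter values chosen for illustration: the regressor is correlated with an unobserved firm effect, so the pooled OLS slope is biased, while the within transformation (subtracting each firm's mean, which is what entity fixed effects do) recovers a slope close to the true value of 1.0.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_firms, n_years = 200, 5
n_obs = n_firms * n_years

firm_id = np.repeat(np.arange(n_firms), n_years)
alpha = np.repeat(rng.normal(size=n_firms), n_years)  # unobserved firm effect
x = alpha + rng.normal(size=n_obs)                    # regressor correlated with it
y = 1.0 * x + 2.0 * alpha + rng.normal(scale=0.5, size=n_obs)
df = pd.DataFrame({"firm_id": firm_id, "x": x, "y": y})


def ols_slope(x_values, y_values):
    """Slope from a simple regression of y on x with an intercept."""
    x_c = x_values - x_values.mean()
    y_c = y_values - y_values.mean()
    return float((x_c * y_c).sum() / (x_c ** 2).sum())


# Pooled OLS ignores the firm effect, which is left in the error term.
pooled_slope = ols_slope(df["x"], df["y"])

# Within transformation: demean x and y by firm, then run the same regression.
x_within = df["x"] - df.groupby("firm_id")["x"].transform("mean")
y_within = df["y"] - df.groupby("firm_id")["y"].transform("mean")
within_slope = ols_slope(x_within, y_within)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"within slope: {within_slope:.2f}")
```

The within slope uses only variation inside each firm over time, which is also why coefficients from a fixed effects model should be read as within-firm effects.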

Changelog

v1.0.0

  • Initial release