r-econometrics

R Econometrics

Purpose

This skill helps economists run rigorous econometric analyses in R, including Instrumental Variables (IV), Difference-in-Differences (DiD), and Regression Discontinuity Design (RDD). It generates publication-ready code with proper diagnostics and robust standard errors.

When to Use

  • Running causal inference analyses
  • Estimating treatment effects with panel data
  • Creating publication-ready regression tables
  • Implementing modern econometric methods (two-way fixed effects, event studies)

Instructions

Step 1: Understand the Research Design

Before generating code, ask the user:
  1. What is your identification strategy? (IV, DiD, RDD, or simple regression)
  2. What is the unit of observation? (individual, firm, country-year, etc.)
  3. What fixed effects do you need? (entity, time, two-way)
  4. How should standard errors be clustered?
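
One way to make the four answers above concrete is to record them in a small list and derive the model and cluster formulas from it. The sketch below is illustrative only: `design` and `build_spec()` are hypothetical names that are not part of any package, and the variable names are placeholders.

```r
# Hypothetical record of the user's answers (all names are placeholders)
design <- list(
  strategy      = "DiD",
  outcome       = "outcome",
  treatment     = "treat_post",
  fixed_effects = c("state", "year"),  # two-way fixed effects
  cluster       = "state"              # level of treatment assignment
)

# Turn the answers into a model formula and a cluster formula
build_spec <- function(d) {
  rhs <- d$treatment
  if (length(d$fixed_effects) > 0) {
    # fixest-style formulas put fixed effects after a vertical bar
    rhs <- paste(rhs, "|", paste(d$fixed_effects, collapse = " + "))
  }
  list(
    formula = as.formula(paste(d$outcome, "~", rhs), env = globalenv()),
    cluster = as.formula(paste("~", paste(d$cluster, collapse = " + ")),
                         env = globalenv())
  )
}

spec <- build_spec(design)
print(spec$formula)
print(spec$cluster)
```

The two formulas could then be passed to an estimation call such as `feols(spec$formula, data = df, cluster = spec$cluster)`, assuming a data frame `df` with matching column names.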

Step 2: Generate Analysis Code

Based on the research design, generate R code that:
  1. Uses the `fixest` package - Modern, fast, and feature-rich for panel data
  2. Includes proper diagnostics:
    • For IV: First-stage F-statistics, weak instrument tests
    • For DiD: Parallel trends visualization, event study plots
    • For RDD: Bandwidth selection, density tests
  3. Uses robust/clustered standard errors appropriate for the data structure
  4. Creates publication-ready output using `modelsummary` or `etable`
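
The IV diagnostics listed above can be sketched with `fixest` as follows. This is a minimal illustration on simulated data, not a definitive workflow: the data-generating process, the variable names (`y`, `x`, `z`, `w`), and the sample size are all assumptions made for the example.

```r
library(fixest)

# Simulated data (assumption): z is a valid instrument for the endogenous x
set.seed(42)
n <- 2000
z <- rnorm(n)                            # instrument
w <- rnorm(n)                            # exogenous control
u <- rnorm(n)                            # unobserved confounder
x <- 0.8 * z + u + rnorm(n)              # endogenous regressor
y <- 2 * x + 0.5 * w + 3 * u + rnorm(n)  # true effect of x on y is 2
df_iv <- data.frame(y, x, z, w)

# feols IV syntax: outcome ~ exogenous controls | endogenous ~ instruments
iv_model <- feols(y ~ w | x ~ z, data = df_iv, vcov = "hetero")

# First-stage F-statistic, used to screen for weak instruments
fitstat(iv_model, "ivf")

# First-stage regression, then the second-stage estimates
summary(iv_model, stage = 1)
summary(iv_model)
```

In the second stage, `fixest` reports the instrumented regressor under the name `fit_x`, so that is the coefficient to compare against the simulated effect of 2.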

Step 3: Structure the Output

Always include the following sections, in this order:
  1. Setup and packages
  2. Data loading and preparation
  3. Descriptive statistics
  4. Main specification
  5. Robustness checks
  6. Visualization
  7. Export results
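
The outline does not prescribe how to produce the descriptive statistics in section 3. One possibility, using only base R, is a summary of the outcome by treatment group. The panel below is simulated, and the column names (`state`, `year`, `treated`, `outcome`) are assumptions for illustration.

```r
# Simulated panel (assumption): 10 states observed over 2015-2020
set.seed(1)
df <- expand.grid(state = sprintf("S%02d", 1:10), year = 2015:2020)
df$treated <- df$state %in% sprintf("S%02d", 1:5)
df$outcome <- 10 + 2 * df$treated + rnorm(nrow(df))

# Mean, standard deviation, and count of the outcome by treatment group
desc <- aggregate(
  outcome ~ treated,
  data = df,
  FUN = function(v) c(mean = mean(v), sd = sd(v), n = length(v))
)
desc <- do.call(data.frame, desc)  # flatten the matrix column into columns
print(desc)
```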

Step 4: Add Documentation

Include comments explaining:
  • Why each specification choice was made
  • Interpretation of key coefficients
  • Limitations and assumptions

Example Prompts

  • "Run a DiD analysis with state and year fixed effects, clustering at the state level"
  • "Estimate the effect of X on Y using Z as an instrument"
  • "Create an event study plot showing treatment effects by year"
  • "Run a sharp RDD with optimal bandwidth selection"
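
As a rough idea of what the sharp RDD prompt might produce, the sketch below uses the `rdrobust` package, which is an assumption here: it is not among the packages this skill lists. The data are simulated with a known jump of 2 at the cutoff, and all variable names are placeholders.

```r
library(rdrobust)

# Simulated sharp design (assumption): treatment switches on at running >= 0
set.seed(123)
n <- 2000
running <- runif(n, -1, 1)           # running variable, cutoff at 0
treated <- as.numeric(running >= 0)  # sharp assignment rule
outcome <- 1 + 0.5 * running + 2 * treated + rnorm(n, sd = 0.5)

# Local linear RD estimate; the bandwidth is chosen by rdrobust's
# default MSE-optimal selector
rd <- rdrobust(y = outcome, x = running, c = 0)
summary(rd)

rd$bws   # selected bandwidths
rd$coef  # conventional, bias-corrected, and robust point estimates
```

A density (manipulation) test of the running variable would normally accompany this; the separate `rddensity` package provides one and is not shown here.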

Example Output

```r
# ============================================
# Difference-in-Differences Analysis
# ============================================

# Setup
library(tidyverse)
library(fixest)
library(modelsummary)

# Load data
df <- read_csv("data.csv")

# Prepare treatment variable
# (treatment_year and treatment_states are assumed to be defined beforehand)
df <- df %>%
  mutate(
    post = year >= treatment_year,
    treated = state %in% treatment_states,
    treat_post = treated * post
  )

# ----------------------------------------
# Main DiD Specification
# ----------------------------------------

# Two-way fixed effects
did_model <- feols(
  outcome ~ treat_post | state + year,
  data = df,
  cluster = ~state
)

# View results
summary(did_model)

# ----------------------------------------
# Event Study
# ----------------------------------------

# Create relative time variable
df <- df %>%
  mutate(rel_time = year - treatment_year)

# Event study regression (the year before treatment is the reference period)
event_study <- feols(
  outcome ~ i(rel_time, treated, ref = -1) | state + year,
  data = df,
  cluster = ~state
)

# Plot coefficients
iplot(event_study,
      main = "Event Study: Effect on Outcome",
      xlab = "Years Relative to Treatment")

# ----------------------------------------
# Robustness: Alternative Specifications
# ----------------------------------------

# Different clustering
did_robust <- feols(
  outcome ~ treat_post | state + year,
  data = df,
  cluster = ~state + year  # Two-way clustering
)

# ----------------------------------------
# Export Results
# ----------------------------------------

# (the results/ directory must already exist)
modelsummary(
  list("Main" = did_model, "Two-way Cluster" = did_robust),
  stars = c('*' = 0.1, '**' = 0.05, '***' = 0.01),
  output = "results/did_table.tex"
)
```

Requirements

Software

  • R 4.0+

Packages

  • `fixest` - Fast fixed effects estimation
  • `modelsummary` - Publication-ready tables
  • `tidyverse` - Data manipulation
  • `ggplot2` - Visualization (installed as part of `tidyverse`)

Install with:

```r
install.packages(c("fixest", "modelsummary", "tidyverse"))
```

Best Practices

  1. Always cluster standard errors at the level of treatment assignment
  2. Run pre-trend tests for DiD designs
  3. Report first-stage F-statistics for IV (a common rule of thumb treats F below 10 as a sign of weak instruments)
  4. Use `feols` over `lm` for panel data (faster and more features)
  5. Document all specification choices in your code comments
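
A pre-trend test of the kind recommended in item 2 can be sketched as a joint Wald test that all pre-treatment event-study coefficients are zero. The example below is a minimal illustration on a simulated panel: the data-generating process, the treatment year of 2015, and the variable names are assumptions.

```r
library(fixest)

# Simulated panel (assumption): 40 states, 2010-2019, half treated from 2015
set.seed(7)
df <- expand.grid(state = 1:40, year = 2010:2019)
df$treated  <- as.numeric(df$state <= 20)
df$rel_time <- df$year - 2015
state_fe    <- rnorm(40)
df$outcome  <- state_fe[df$state] + 0.1 * (df$year - 2010) +
  1.5 * df$treated * (df$year >= 2015) + rnorm(nrow(df))

# Event study with the year before treatment as the reference period
event_study <- feols(
  outcome ~ i(rel_time, treated, ref = -1) | state + year,
  data = df,
  cluster = ~state
)

# Joint test that the lead coefficients (rel_time < 0) are all zero;
# keep is a regular expression matched against the coefficient names
pretrend <- wald(event_study, keep = "rel_time::-")
pretrend[["p"]]
```

A large p-value means the test fails to reject flat pre-trends; it does not by itself establish that the parallel trends assumption holds.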

Common Pitfalls

  • ❌ Not clustering standard errors at the right level
  • ❌ Ignoring weak instruments in IV estimation
  • ❌ Using TWFE with staggered treatment timing (use the `did` package or `sunab()` from `fixest` instead)
  • ❌ Not reporting robustness checks
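
For the staggered-timing pitfall, a `sunab()` specification might look like the sketch below. It uses `base_stagg`, the example dataset shipped with `fixest`, so the column names (`y`, `x1`, `id`, `year`, `year_treated`) come from that dataset rather than from this skill.

```r
library(fixest)

# Example staggered-adoption panel bundled with fixest
data(base_stagg, package = "fixest")

# Sun-Abraham interaction-weighted estimator:
# sunab(cohort, period) replaces the plain relative-time dummies
sa_model <- feols(
  y ~ x1 + sunab(year_treated, year) | id + year,
  data = base_stagg,
  cluster = ~id
)

# Event-study coefficients by period relative to treatment
summary(sa_model)

# Aggregate the post-treatment effects into a single ATT
att <- summary(sa_model, agg = "att")
att
```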

Changelog

v1.0.0

  • Initial release with IV, DiD, RDD support