Stata Skill

You have access to comprehensive Stata reference files. Do not load all files. Read only the 1-3 files relevant to the user's current task using the routing table below.

Critical Gotchas

These are Stata-specific pitfalls that lead to silent bugs. Internalize these before writing any code.

Missing Values Sort to +Infinity

Stata's `.` (and `.a` through `.z`) are greater than all numbers.

```stata
* WRONG: includes observations where income is missing!
gen high_income = (income > 50000)

* RIGHT
gen high_income = (income > 50000) if !missing(income)

* WRONG: missing ages appear in this list
list if age > 60

* RIGHT
list if age > 60 & !missing(age)
```
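A minimal sketch of how the missing-value ordering plays out in practice. It assumes the `auto` dataset that ships with Stata, where `rep78` is missing for a few cars; the dataset and variable are my choice for illustration, not part of the original guide.

```stata
sysuse auto, clear

* Every missing code is larger than any number, and the codes are ordered
assert 1e300 < .
assert . < .a
assert .a < .z

* Naive filter: cars with a missing rep78 are counted as "greater than 4"
count if rep78 > 4
local naive = r(N)

* Guarded filter: only cars with an observed rep78 above 4
count if rep78 > 4 & !missing(rep78)
local guarded = r(N)

* The gap between the two counts is exactly the number of missing values
count if missing(rep78)
assert `naive' - `guarded' == r(N)
```

Copying `r(N)` into a local right after each `count` matters here, because the next `count` overwrites it.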

`=` vs `==`

`=` is assignment; `==` is comparison. Mixing them up is a syntax error or silent bug.

```stata
* WRONG: syntax error
gen employed = 1 if status = 1

* RIGHT
gen employed = 1 if status == 1
```

Local Macro Syntax

Locals use `` `name' `` (backtick + single quote). Globals use `$name` or `${name}`. Forgetting the closing quote is the #1 macro bug.

```stata
local controls "age education income"
regress wage `controls'        // correct
regress wage `controls         // WRONG: missing closing quote
regress wage 'controls'        // WRONG: wrong quote characters
```
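The sketch below shows local and global expansion side by side, plus two behaviours that tend to surprise people: nested expansion and silently empty locals. It assumes the bundled `auto` dataset; the macro names are hypothetical.

```stata
sysuse auto, clear

local controls "mpg weight"        // local: visible only in this do-file or program
global outcome "price"             // global: visible everywhere until dropped

* Locals expand inside left-quote/apostrophe pairs, globals after a dollar sign
regress ${outcome} `controls'

* Macros nest: the inner macro is expanded first
local i = 2
local var2 "weight"
display "Second control: `var`i''"

* An undefined local expands to nothing, so a typo fails quietly
display "Controls: [`contrls']"    // prints "Controls: []"

macro drop outcome                 // clean up the global
```

Because a misspelled local expands to an empty string rather than raising an error, it is worth displaying macro contents while debugging.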

`by` Requires Prior Sort (Use `bysort`)

```stata
* WRONG: error if data not sorted by id
by id: gen first = (_n == 1)

* RIGHT: bysort sorts automatically
bysort id: gen first = (_n == 1)

* Also RIGHT: explicit sort
sort id
by id: gen first = (_n == 1)
```
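When the order inside each group matters, `bysort` accepts extra sort keys in parentheses. A small sketch, assuming the bundled `auto` dataset (the new variable names are illustrative):

```stata
sysuse auto, clear

* Sort by foreign, then by price within foreign, but group on foreign only
bysort foreign (price): gen byte cheapest = (_n == 1)

* _N is the group size inside a by-group
bysort foreign: gen group_size = _N

list make foreign price group_size if cheapest
```

Variables in parentheses control ordering only; they do not define the groups. If two cars tie on price, which one comes first is not guaranteed.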

Factor Variable Notation (`i.` and `c.`)

Use `i.` for categorical, `c.` for continuous. Omitting `i.` treats categories as continuous.

```stata
* WRONG: treats race as continuous (e.g., race=3 has 3x effect of race=1)
regress wage race education

* RIGHT: creates dummies automatically
regress wage i.race education

* Interactions
regress wage i.race##c.education    // full interaction
regress wage i.race#c.education     // interaction only (no main effects)
```
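A runnable sketch of the same notation, assuming the bundled `auto` dataset. The choice of base level and the `margins` call are illustrative additions, not something the original guide prescribes.

```stata
sysuse auto, clear

* i. expands rep78 into indicators; the lowest level is the base by default
regress price i.rep78 c.mpg

* ib#. picks a different base level (here rep78 == 3)
regress price ib3.rep78 c.mpg

* ## adds both main effects and the interaction
regress price i.foreign##c.mpg

* Average marginal effect of mpg, separately for domestic and foreign cars
margins foreign, dydx(mpg)
```

`margins` leaves the `e()` results of the last regression in place unless you add the `post` option.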

`generate` vs `replace`

`generate` creates new variables; `replace` modifies existing ones. Using `generate` on an existing variable name is an error.

```stata
gen x = 1
gen x = 2          // ERROR: x already defined
replace x = 2      // correct
```
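For scripts that may be re-run against data where the variable already exists, one possible guard is to test for the variable first. This is a sketch under that assumption, using the bundled `auto` dataset and a hypothetical variable `price_k`:

```stata
sysuse auto, clear

* confirm variable leaves _rc at 0 when the variable exists
capture confirm variable price_k
if _rc == 0 {
    replace price_k = price / 1000
}
else {
    generate double price_k = price / 1000
}
```

Running the block a second time in the same session takes the `replace` branch instead of stopping with "already defined". Storing the result as `double` avoids float rounding when comparing it back to `price / 1000`.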

String Comparison Is Case-Sensitive

```stata
* May miss "Male", "MALE", etc.
keep if gender == "male"

* Safer
keep if lower(gender) == "male"
```
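Case is not the only thing that breaks string matches; stray blanks do too. A small sketch with made-up values, normalizing both before comparing:

```stata
* Hypothetical raw values with inconsistent case and a leading blank
clear
input str10 gender
"male"
" Male"
"MALE"
"female"
end

* Normalize case and strip surrounding blanks before comparing
gen gender_clean = lower(strtrim(gender))
gen byte is_male = (gender_clean == "male")

tabulate gender_clean
```

Keeping the cleaned value in its own variable leaves the raw column available for auditing.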

`merge` Always Check `_merge`

Never skip `tab _merge`: it costs nothing, and it is the only breakdown of matched and unmatched observations you get when `assert` fails.

```stata
merge 1:1 id using other.dta
tab _merge                      // ALWAYS tab before assert
assert _merge == 3              // stops with r(9) but does not say which categories failed
drop _merge
```
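`merge` can also enforce the expected match pattern itself through its `assert()` and `keep()` options. The sketch below builds two tiny hypothetical datasets so it runs on its own; the variable names and values are invented for illustration.

```stata
* Build a small lookup file (hypothetical data)
clear
input id score
1 90
2 85
3 70
end
tempfile scores
save `scores'

* Master data with the same ids
clear
input id age
1 34
2 51
3 28
end

* assert(match) stops with an error unless every observation matched;
* nogenerate skips creating _merge
merge 1:1 id using `scores', assert(match) nogenerate
```

If unmatched master rows are acceptable, `keep(match master)` combined with `assert(match master)` expresses that expectation directly.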

`preserve` / `restore` + `tempfile` for Collapse-Merge-Back

The standard pattern for computing group stats and merging them onto the original data:

```stata
tempfile stats
preserve
collapse (mean) avg_x=x, by(group)
save `stats'
restore
merge m:1 group using `stats'
tab _merge
assert _merge == 3
drop _merge
```

For simple group means, `bysort group: egen avg_x = mean(x)` avoids the round-trip entirely.
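To see that the two routes give the same group means, here is a sketch that runs both on the bundled `auto` dataset and compares them. The variable names are illustrative.

```stata
sysuse auto, clear

* Route 1: collapse, save to a tempfile, merge back
tempfile stats
preserve
collapse (mean) avg_collapse = price, by(foreign)
save `stats'
restore
merge m:1 foreign using `stats', assert(match) nogenerate

* Route 2: egen in one line
bysort foreign: egen avg_egen = mean(price)

* Both routes should agree up to storage precision
assert reldif(avg_collapse, avg_egen) < 1e-6
```

The comparison uses `reldif()` rather than `==` because the two variables may be stored with different precision.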

Weights Are Not Interchangeable

- `fweight`: frequency weights (replication)
- `aweight`: analytic/regression weights (inverse variance)
- `pweight`: probability/sampling weights (survey data, implies robust SE)
- `iweight`: importance weights (rarely used)
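The weight type is set inside square brackets after the variable list. The sketch below only illustrates the syntax on the bundled `auto` dataset; treating `rep78` or `weight` as weights has no substantive meaning and is purely an assumption for the example.

```stata
sysuse auto, clear

* Frequency weights must be integers: here rep78 stands in for a replication count
summarize price [fweight = rep78]

* Analytic weights
regress price mpg [aweight = weight]

* Sampling weights: same point estimates as aweights, but robust standard errors
regress price mpg [pweight = weight]
display e(vcetype)
```

Observations whose weight is missing are dropped from the command, which is another place where missing values change the sample quietly.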

`capture` Swallows Errors

```stata
* capture hides the error, so check _rc immediately afterwards
capture some_command
if _rc != 0 {
    di as error "Failed with code: " _rc
    exit _rc
}
```
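A runnable sketch of the same idea, assuming the bundled `auto` dataset and a deliberately nonexistent variable name (`no_such_var` is hypothetical):

```stata
sysuse auto, clear

* capture suppresses the error and its message; the return code lands in _rc
capture confirm variable no_such_var
display "Return code: " _rc

* capture noisily still shows the error message but does not stop the do-file
capture noisily summarize no_such_var
if _rc != 0 {
    display as error "summarize failed with r(" _rc ")"
}
```

`capture noisily` is often the better default while debugging, since the original error text stays visible in the log.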

Line Continuation Uses `///`

```stata
regress y x1 x2 x3 ///
    x4 x5 x6, ///
    vce(robust)
```

Stored Results: `r()` vs `e()` vs `s()`

- `r()`: r-class commands (summarize, tabulate, etc.)
- `e()`: e-class commands (estimation: regress, logit, etc.)
- `s()`: s-class commands (parsing)

A new estimation command overwrites previous `e()` results. Store them first:

```stata
regress y x1 x2
estimates store model1
```
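A short sketch of reading and preserving stored results, assuming the bundled `auto` dataset. The model names `m1` and `m2` are arbitrary.

```stata
sysuse auto, clear

* r() holds results of the last r-class command; copy what you need right away
summarize price
local mean_price = r(mean)
return list

* e() holds results of the last estimation command
regress price mpg
estimates store m1
regress price mpg weight
estimates store m2

* Bring an earlier model back into e()
estimates restore m1
display "Model df after restore: " e(df_m)

* Compare stored models side by side
estimates table m1 m2, se
```

The same overwrite rule applies to `r()`: the next r-class command replaces it, which is why `r(mean)` is copied into a local immediately.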

Running Stata from the Command Line

Claude can execute Stata code by running `.do` files in batch mode from the terminal. This is how to run Stata non-interactively.

Finding the Stata Binary

Stata on macOS is a `.app` bundle. The actual binary is inside it. Common locations:

```bash
# Stata 18 / StataNow (most common)
/Applications/Stata/StataMP.app/Contents/MacOS/stata-mp
/Applications/StataNow/StataMP.app/Contents/MacOS/stata-mp

# Other editions (SE, BE)
/Applications/Stata/StataSE.app/Contents/MacOS/stata-se
/Applications/Stata/StataBE.app/Contents/MacOS/stata-be
```

If Stata isn't on `$PATH`, find it with: `mdfind -name "stata-mp" | grep MacOS`

Batch Mode (`-b`)

```bash
# Run a .do file in batch mode; output goes to <filename>.log
/Applications/Stata/StataMP.app/Contents/MacOS/stata-mp -b do analysis.do

# If stata-mp is on PATH (e.g., via symlink or alias):
stata-mp -b do analysis.do
```

- `-b` = batch mode (non-interactive, no GUI)
- Output (everything Stata would display) is written to `analysis.log` in the working directory
- Exit code is expected to be 0 on success and non-zero on error, but batch mode is widely reported to return 0 even when the do-file aborts, so do not rely on it alone
- The log file contains all output, including error messages; always check it after execution

Running Inline Stata Code

To run a quick Stata snippet without keeping a permanent `.do` file:

```bash
# Write a temp .do file and run it
cat > /tmp/stata_run.do << 'EOF'
sysuse auto, clear
summarize price mpg
EOF
stata-mp -b do /tmp/stata_run.do
cat stata_run.log   # the log lands in the current directory, not next to the .do file
```

Checking Results

```bash
# Check if it succeeded (the exit status alone can miss failures, so also search the log)
stata-mp -b do tests/run_tests.do && echo "SUCCESS" || echo "FAILED"

# Search the log for pass/fail markers and Stata return codes such as r(111)
grep -E "PASS|FAIL|error|r\([0-9]+\)" run_tests.log
```

Tips

- `clear all` at the top of batch scripts: batch mode starts with a fresh Stata session, but `clear all` ensures no stale state from prior runs in the same session.
- `set more off`: prevents Stata from pausing for `--more--` prompts (fatal in batch mode).
- Log files overwrite silently: `analysis.do` always writes to `analysis.log` in the current directory. If you run multiple `.do` files, check the right log.
- Working directory: Stata's working directory is wherever you run the command from, not where the `.do` file lives. Use `cd` in the `.do` file or absolute paths if needed.
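Putting the tips together, a batch-friendly do-file might start like the sketch below. The log name, the commented-out project path, and the `PASS` marker are all hypothetical choices for illustration; the analysis itself uses the bundled `auto` dataset.

```stata
clear all
set more off

* Hypothetical project path: uncomment and adjust, or rely on the shell's directory
* cd "/path/to/project"

* Keep a named log in addition to the automatic batch log
capture log close run
log using "analysis_run.log", replace text name(run)

sysuse auto, clear
regress price mpg weight

* Print a marker that is easy to grep for after the run
display "PASS: analysis finished"

log close run
```

Because the marker is printed only if every earlier command succeeded, its absence from the log is a simple signal that the run stopped early.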

Routing Table

Read only the files relevant to the user's task. Paths are relative to this SKILL.md file.

Data Operations

File	Topics & Key Commands
`references/basics-getting-started.md`	`use` , `save` , `describe` , `browse` , `sysuse` , basic workflow
`references/data-import-export.md`	`import delimited` , `import excel` , ODBC, `export` , web data
`references/data-management.md`	`generate` , `replace` , `merge` , `append` , `reshape` , `collapse` , `recode` , `egen` , `encode` / `decode`
`references/variables-operators.md`	Variable types, `byte` / `int` / `long` / `float` / `double` , operators, missing values ( `.<.a` ), `if` / `in` qualifiers
`references/string-functions.md`	`substr()` , `regexm()` , `strtrim()` , `split` , `ustrlen()` , regex, Unicode
`references/date-time-functions.md`	`date()` , `clock()` , `%td` / `%tc` formats, `mdy()` , `dofm()` , business calendars
`references/mathematical-functions.md`	`round()` , `log()` , `exp()` , `abs()` , `mod()` , `cond()` , distributions, random numbers

Statistics & Econometrics

File	Topics & Key Commands
`references/descriptive-statistics.md`	`summarize` , `tabulate` , `correlate` , `tabstat` , `codebook` , weighted stats
`references/linear-regression.md`	`regress` , `vce(robust)` , `vce(cluster)` , `test` , `lincom` , `margins` , `predict` , `ivregress`
`references/panel-data.md`	`xtset` , `xtreg fe` / `re` , Hausman test, `xtabond` , dynamic panels
`references/time-series.md`	`tsset` , ARIMA, VAR, `dfuller` , `pperron` , `irf` , forecasting
`references/limited-dependent-variables.md`	`logit` , `probit` , `tobit` , `poisson` , `nbreg` , `mlogit` , `ologit` , `margins` for nonlinear
`references/bootstrap-simulation.md`	`bootstrap` , `simulate` , `permute` , Monte Carlo
`references/survey-data-analysis.md`	`svyset` , `svy:` , `subpop()` , complex survey design, replicate weights
`references/missing-data-handling.md`	`mi impute` , `mi estimate` , FIML, `misstable` , diagnostics
`references/maximum-likelihood.md`	`ml model` , custom likelihood functions, `ml init` , gradient-based optimization
`references/gmm-estimation.md`	`gmm` , moment conditions, `estat overid` , J-test

Causal Inference

File	Topics & Key Commands
`references/treatment-effects.md`	`teffects ra/ipw/ipwra/aipw` , `stteffects` , ATE/ATT/ATET
`references/difference-in-differences.md`	DiD, parallel trends, event studies, staggered adoption
`references/regression-discontinuity.md`	Sharp/fuzzy RD, bandwidth selection, `rdplot`
`references/matching-methods.md`	PSM, nearest neighbor, kernel matching, `teffects nnmatch`
`references/sample-selection.md`	`heckman` , `heckprobit` , treatment models, exclusion restrictions

Advanced Methods

File	Topics & Key Commands
`references/survival-analysis.md`	`stset` , `stcox` , `streg` , Kaplan-Meier, parametric models
`references/sem-factor-analysis.md`	`sem` , `gsem` , CFA, path analysis, `alpha` , reliability
`references/nonparametric-methods.md`	`kdensity` , rank tests, `qreg` , `npregress`
`references/spatial-analysis.md`	`spmatrix` , `spregress` , spatial weights, Moran's I
`references/machine-learning.md`	`lasso` , `elasticnet` , `cvlasso` , cross-validation

Graphics

File	Topics & Key Commands
`references/graphics.md`	`twoway` , `scatter` , `line` , `bar` , `histogram` , `graph combine` , `graph export` , schemes

Programming

File	Topics & Key Commands
`references/programming-basics.md`	`local` , `global` , `foreach` , `forvalues` , `program define` , `syntax` , `return`
`references/advanced-programming.md`	`syntax` , `mata` , classes, `_prefix` , dialog boxes, `tempfile` / `tempvar`
`references/mata-introduction.md`	Mata basics, when to use Mata vs ado, data types
`references/mata-programming.md`	Mata functions, flow control, structures, pointers
`references/mata-matrix-operations.md`	Matrix creation, decompositions, solvers, `st_matrix()`
`references/mata-data-access.md`	`st_data()` , `st_view()` , `st_store()` , performance tips

Output & Workflow

File	Topics & Key Commands
`references/tables-reporting.md`	`putexcel` , `putdocx` , `putpdf` , LaTeX integration, `collect`
`references/workflow-best-practices.md`	Project structure, master do-files, version control, debugging, common mistakes
`references/external-tools-integration.md`	Python via `python:` , R via `rsource` , shell commands, Git
`references/filing-issues.md`	User wants to report a Stata skill documentation gap or error to the repository

Community Packages

File	What It Does
`packages/reghdfe.md`	High-dimensional fixed effects OLS (absorbs multiple FE sets efficiently)
`packages/estout.md`	`esttab` / `estout` : publication-quality regression tables
`packages/outreg2.md`	Alternative regression table exporter (Word, Excel, TeX)
`packages/asdoc.md`	One-command Word document creation for any Stata output
`packages/tabout.md`	Cross-tabulations and summary tables to file
`packages/coefplot.md`	Coefficient plots from stored estimates
`packages/graph-schemes.md`	`grstyle` , `schemepack` , `plotplain` — better graph themes
`packages/did.md`	Modern DiD: `csdid` , `did_multiplegt` , `did_imputation` (Callaway-Sant'Anna, de Chaisemartin-D'Haultfoeuille, Borusyak-Jaravel-Spiess)
`packages/event-study.md`	`eventstudyinteract` , `eventdd` — event study estimators
`packages/rdrobust.md`	Robust RD estimation with optimal bandwidth ( `rdrobust` , `rdplot` , `rdbwselect` )
`packages/psmatch2.md`	Propensity score matching (nearest neighbor, kernel, radius)
`packages/synth.md`	Synthetic control method ( `synth` , `synth_runner` )
`packages/ivreg2.md`	Enhanced IV/2SLS: `ivreg2` , `xtivreg2` with additional diagnostics
`packages/xtabond2.md`	Dynamic panel GMM (Arellano-Bond/Blundell-Bond)
`packages/binsreg.md`	Binned scatter plots with CI ( `binsreg` , `binstest` )
`packages/nprobust.md`	Nonparametric kernel estimation and inference
`packages/diagnostics.md`	`bacondecomp` , `xttest3` , collinearity, heteroskedasticity tests
`packages/winsor.md`	Winsorizing and trimming: `winsor2` , `winsor`
`packages/data-manipulation.md`	`gtools` (fast collapse/egen), `rangestat` , `egenmore`
`packages/package-management.md`	`ssc install` , `net install` , `ado update` , finding packages

Common Patterns

Regression Table Workflow

```stata
* Estimate models
eststo clear
eststo: regress y x1 x2, vce(robust)
eststo: regress y x1 x2 x3, vce(robust)
eststo: regress y x1 x2 x3 x4, vce(cluster id)

* Export table
esttab using "results.tex", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) ///
    label booktabs ///
    title("Main Results") ///
    mtitles("(1)" "(2)" "(3)")
```

Panel Data Setup

```stata
xtset panelid timevar          // declare panel structure
xtdescribe                      // check balance
xtsum outcome                   // within/between variation

* Fixed effects
xtreg y x1 x2, fe vce(cluster panelid)
* Or with reghdfe (preferred for multiple FE)
reghdfe y x1 x2, absorb(panelid timevar) vce(cluster panelid)
```

Difference-in-Differences

```stata
* Classic 2x2 DiD
gen post = (year >= treatment_year) if !missing(year, treatment_year)
gen treat_post = treated * post
regress y treated post treat_post, vce(cluster id)

* Event study (uniform timing; must interact with treatment group)
* Factor variables cannot be negative, so shift event time to start at 0
* (assumes at least two pre-periods, i.e. min(rel_time) <= -2)
summarize rel_time, meanonly
local shift = -r(min)
gen rel_shift = rel_time + `shift'
local base = `shift' - 1                      // shifted value of rel_time == -1
local lastpre = `shift' - 2                   // last pre-period before the base
reghdfe y ib`base'.rel_shift#1.treated, absorb(id year) vce(cluster id)
testparm i(0/`lastpre').rel_shift#1.treated   // joint pre-trend test

* Modern staggered DiD (Callaway & Sant'Anna)
csdid y x1 x2, ivar(id) time(year) gvar(first_treat) agg(event)
csdid_plot
```

Graph Export

```stata
* Publication-quality scatter with fit line
twoway (scatter y x, mcolor(navy%50) msize(small)) ///
       (lfit y x, lcolor(cranberry) lwidth(medthick)), ///
    title("Title Here") ///
    xtitle("X Label") ytitle("Y Label") ///
    legend(off) scheme(s2color)
graph export "figure1.pdf", replace as(pdf)
graph export "figure1.png", replace as(png) width(2400)
```

Data Cleaning Pipeline

```stata
* Load and inspect
import delimited "raw_data.csv", clear varnames(1)
describe
codebook, compact

* Clean
rename *, lower                 // lowercase all varnames
destring income, replace force  // convert string to numeric; force turns non-numeric text into missing
replace income = . if income < 0

* Label
label variable income "Annual household income (USD)"
label define yesno 0 "No" 1 "Yes"
label values employed yesno

* Save
compress
save "clean_data.dta", replace
```

Multiple Imputation

```stata
mi set mlong
mi register imputed income education
mi impute chained (regress) income (ologit) education = age i.gender, add(20) rseed(12345)
mi estimate: regress wage income education age i.gender
```

Help Improve This Skill

If you produce Stata code with a significant error — wrong syntax, incorrect command usage, or a gotcha you failed to catch — and the issue seems to stem from a gap in these reference files rather than a one-off mistake, consider suggesting to the user that they file an issue on the skill repository. This helps future users.

When to raise this: Only after you've already corrected the error and the user has working code. Frame it as optional: "I made an error with [X] that I think comes from a gap in the Stata skill documentation. If you'd like, I can help you file an issue or a PR so it gets fixed for everyone."

When NOT to raise this: If the user is on Claude Haiku, the error is more likely a model capability issue than a documentation gap. In that case, suggest they try Sonnet or Opus for complex Stata work instead of filing an issue.

If the user agrees, read `references/filing-issues.md` for instructions on writing a good issue report.
