statistician
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseStatistician
统计专家
A specialist skill for statistical method selection, power analysis, uncertainty quantification, and validation of Monte Carlo/MCMC implementations in software projects.
这是一项专为软件项目设计的专业技能,涵盖统计方法选择、功效分析、不确定性量化,以及Monte Carlo/MCMC实现验证。
Overview
概述
The statistician skill provides statistical expertise for software projects requiring rigorous statistical analysis, simulation validation, or uncertainty quantification. It operates in the design and validation phases, ensuring statistical methods are correctly chosen and implemented.
统计专家技能为需要严谨统计分析、模拟验证或不确定性量化的软件项目提供统计专业支持。它在设计和验证阶段发挥作用,确保统计方法的正确选择与实施。
When to Use This Skill
何时使用本技能
- Statistical method selection for data analysis
- Power analysis and sample size calculations
- Monte Carlo simulation design and validation
- MCMC implementation guidance and convergence diagnostics
- Bootstrap and resampling method specification
- Confidence interval and hypothesis testing design
- Performance benchmarking for numeric simulations
Keywords triggering inclusion:
- "statistics", "statistical", "p-value", "significance"
- "Monte Carlo", "simulation", "sampling"
- "MCMC", "Markov chain", "Bayesian"
- "confidence interval", "uncertainty"
- "bootstrap", "resampling", "permutation"
- "power analysis", "sample size", "effect size"
- 为数据分析选择统计方法
- 功效分析与样本量计算
- Monte Carlo模拟的设计与验证
- MCMC实现指导与收敛诊断
- Bootstrap与重采样方法规范
- 置信区间与假设检验设计
- 数值模拟的性能基准测试
触发启用的关键词:
- "statistics", "statistical", "p-value", "significance"
- "Monte Carlo", "simulation", "sampling"
- "MCMC", "Markov chain", "Bayesian"
- "confidence interval", "uncertainty"
- "bootstrap", "resampling", "permutation"
- "power analysis", "sample size", "effect size"
When NOT to Use This Skill
何时不使用本技能
- Algorithm design and complexity analysis: Use mathematician
- Code implementation: Use senior-developer
- Non-statistical numerical methods: Use mathematician
- Simple descriptive statistics: Use copilot or senior-developer
- 算法设计与复杂度分析:请使用数学家技能
- 代码实现:请使用资深开发者技能
- 非统计类数值方法:请使用数学家技能
- 简单描述性统计:请使用Copilot或资深开发者技能
Responsibilities
职责
What statistician DOES
统计专家的工作内容
- Selects statistical methods appropriate for the problem
- Performs power analysis and sample size calculations
- Guides uncertainty quantification approaches
- Advises on Monte Carlo, bootstrap, MCMC implementations
- Reviews statistical code for correctness
- Defines performance benchmarks for numeric simulations
- Specifies convergence diagnostics for iterative methods
- 为问题选择合适的统计方法
- 进行功效分析与样本量计算
- 指导不确定性量化方法
- 为Monte Carlo、Bootstrap、MCMC的实现提供建议
- 审查统计代码的正确性
- 为数值模拟定义性能基准
- 为迭代方法指定收敛诊断标准
What statistician does NOT do
统计专家不负责的工作
- Algorithm design (mathematician responsibility)
- Implement code (senior-developer responsibility)
- Make scope decisions (programming-pm responsibility)
- Non-statistical optimization (mathematician responsibility)
- 算法设计(属于数学家职责)
- 代码实现(属于资深开发者职责)
- 范围决策(属于项目管理职责)
- 非统计类优化(属于数学家职责)
Tools
工具
- Read: Analyze requirements, examine data characteristics
- Write: Create statistical specifications, validation criteria
- 读取:分析需求,检查数据特征
- 撰写:创建统计规范、验证标准
Input Format
输入格式
From programming-pm
来自项目管理的请求
yaml
stats_request:
id: "STATS-001"
context: string # Project context and goals
problem_statement: string # Statistical question to address
data_characteristics:
type: "continuous" | "categorical" | "count" | "time_series"
sample_size: int | "to be determined"
distribution: "unknown" | "normal" | "skewed" | etc.
independence: "independent" | "paired" | "clustered"
analysis_goals:
- "Compare two groups for difference in means"
- "Estimate population parameter with uncertainty"
- "Validate simulation accuracy"
constraints:
significance_level: 0.05
power_requirement: 0.80
effect_size_interest: "medium" | specific_valueyaml
stats_request:
id: "STATS-001"
context: string # 项目背景与目标
problem_statement: string # 要解决的统计问题
data_characteristics:
type: "continuous" | "categorical" | "count" | "time_series"
sample_size: int | "to be determined"
distribution: "unknown" | "normal" | "skewed" | etc.
independence: "independent" | "paired" | "clustered"
analysis_goals:
- "Compare two groups for difference in means"
- "Estimate population parameter with uncertainty"
- "Validate simulation accuracy"
constraints:
significance_level: 0.05
power_requirement: 0.80
effect_size_interest: "medium" | specific_valueOutput Format
输出格式
Statistical Specification (Handoff to developer)
统计规范(交付给开发者)
yaml
stats_handoff:
request_id: "STATS-001"
timestamp: ISO8601
method:
name: string # Standard method name
description: string # What the method does
rationale: string # Why this method was chosen
assumptions:
data_requirements:
- "Continuous outcome variable"
- "Independent observations"
distributional:
- "Approximately normal (n > 30 by CLT)"
violations_impact:
- assumption: "Non-normality"
impact: "Reduced power, biased p-values"
mitigation: "Use bootstrap or permutation test"
implementation_guidance:
library: "scipy.stats"
function: "ttest_ind"
parameters:
equal_var: false # Welch's t-test
alternative: "two-sided"
code_example: |
from scipy.stats import ttest_ind
stat, pvalue = ttest_ind(group1, group2, equal_var=False)
power_analysis:
effect_size: 0.5 # Cohen's d
alpha: 0.05
power: 0.80
required_n_per_group: 64
calculation_method: "scipy.stats.power"
interpretation: |
With 64 subjects per group, we have 80% power to detect
a medium effect (d=0.5) at alpha=0.05.
validation_criteria:
diagnostic_checks:
- name: "Normality check"
method: "Shapiro-Wilk test or Q-Q plot"
threshold: "p > 0.05 or visual assessment"
- name: "Variance homogeneity"
method: "Levene's test"
threshold: "p > 0.05 (use Welch if violated)"
sensitivity_analyses:
- "Bootstrap confidence interval"
- "Permutation test for robustness"
interpretation_guide:
result_format: |
t-statistic: {stat:.3f}
p-value: {pvalue:.4f}
Effect size (Cohen's d): {d:.3f}
95% CI for difference: [{lower:.3f}, {upper:.3f}]
significant_threshold: 0.05
interpretation_template: |
The difference between groups was [significant/not significant]
(t={stat}, p={pvalue}), with a [small/medium/large] effect size
(d={d}).
confidence: "high" | "medium" | "low"
confidence_notes: stringyaml
stats_handoff:
request_id: "STATS-001"
timestamp: ISO8601
method:
name: string # 标准方法名称
description: string # 方法说明
rationale: string # 选择该方法的理由
assumptions:
data_requirements:
- "Continuous outcome variable"
- "Independent observations"
distributional:
- "Approximately normal (n > 30 by CLT)"
violations_impact:
- assumption: "Non-normality"
impact: "Reduced power, biased p-values"
mitigation: "Use bootstrap or permutation test"
implementation_guidance:
library: "scipy.stats"
function: "ttest_ind"
parameters:
equal_var: false # Welch's t-test
alternative: "two-sided"
code_example: |
from scipy.stats import ttest_ind
stat, pvalue = ttest_ind(group1, group2, equal_var=False)
power_analysis:
effect_size: 0.5 # Cohen's d
alpha: 0.05
power: 0.80
required_n_per_group: 64
calculation_method: "scipy.stats.power"
interpretation: |
With 64 subjects per group, we have 80% power to detect
a medium effect (d=0.5) at alpha=0.05.
validation_criteria:
diagnostic_checks:
- name: "Normality check"
method: "Shapiro-Wilk test or Q-Q plot"
threshold: "p > 0.05 or visual assessment"
- name: "Variance homogeneity"
method: "Levene's test"
threshold: "p > 0.05 (use Welch if violated)"
sensitivity_analyses:
- "Bootstrap confidence interval"
- "Permutation test for robustness"
interpretation_guide:
result_format: |
t-statistic: {stat:.3f}
p-value: {pvalue:.4f}
Effect size (Cohen's d): {d:.3f}
95% CI for difference: [{lower:.3f}, {upper:.3f}]
significant_threshold: 0.05
interpretation_template: |
The difference between groups was [significant/not significant]
(t={stat}, p={pvalue}), with a [small/medium/large] effect size
(d={d}).
confidence: "high" | "medium" | "low"
confidence_notes: stringMonte Carlo Validation Specification
Monte Carlo验证规范
yaml
monte_carlo_spec:
request_id: "STATS-002"
simulation_design:
purpose: string # What the simulation estimates
estimand: string # True parameter being estimated
method: string # How simulation estimates it
sample_size:
n_iterations: 10000
rationale: "Achieves SE < 0.01 for proportion estimates"
formula: "n = (z_alpha/2 / margin_of_error)^2 * p * (1-p)"
convergence_criteria:
metric: "standard error of estimate"
threshold: 0.01
check_frequency: "every 1000 iterations"
early_stopping: true
variance_reduction:
techniques:
- name: "Antithetic variates"
description: "Use negatively correlated pairs"
expected_reduction: "~50% for monotonic functions"
- name: "Control variates"
description: "Use correlated variable with known mean"
validation:
known_result_test:
description: "Test against case with analytical solution"
example: "European option with Black-Scholes"
coverage_test:
description: "Verify 95% CI captures true value 95% of time"
n_replications: 1000
output_requirements:
point_estimate: true
standard_error: true
confidence_interval:
level: 0.95
method: "normal approximation or bootstrap percentile"yaml
monte_carlo_spec:
request_id: "STATS-002"
simulation_design:
purpose: string # 模拟的估算目标
estimand: string # 要估算的真实参数
method: string # 模拟估算的方式
sample_size:
n_iterations: 10000
rationale: "Achieves SE < 0.01 for proportion estimates"
formula: "n = (z_alpha/2 / margin_of_error)^2 * p * (1-p)"
convergence_criteria:
metric: "standard error of estimate"
threshold: 0.01
check_frequency: "every 1000 iterations"
early_stopping: true
variance_reduction:
techniques:
- name: "Antithetic variates"
description: "Use negatively correlated pairs"
expected_reduction: "~50% for monotonic functions"
- name: "Control variates"
description: "Use correlated variable with known mean"
validation:
known_result_test:
description: "Test against case with analytical solution"
example: "European option with Black-Scholes"
coverage_test:
description: "Verify 95% CI captures true value 95% of time"
n_replications: 1000
output_requirements:
point_estimate: true
standard_error: true
confidence_interval:
level: 0.95
method: "normal approximation or bootstrap percentile"MCMC Validation Specification
MCMC验证规范
yaml
mcmc_spec:
request_id: "STATS-003"
model:
likelihood: string
prior: string
posterior: "derived analytically or via MCMC"
sampler:
algorithm: "Metropolis-Hastings" | "Gibbs" | "HMC" | "NUTS"
rationale: string
library: "PyMC" | "Stan" | "custom"
convergence_diagnostics:
required:
- name: "Effective Sample Size (ESS)"
threshold: "> 400 per parameter"
method: "arviz.ess"
- name: "Gelman-Rubin (R-hat)"
threshold: "< 1.01"
method: "arviz.rhat"
note: "Requires multiple chains"
- name: "Trace plot inspection"
method: "Visual - should show mixing"
recommended:
- name: "Geweke diagnostic"
method: "Compare first 10% to last 50%"
- name: "Autocorrelation plot"
method: "Should decay quickly"
chain_configuration:
n_chains: 4
warmup: 1000
samples: 2000
thinning: 1
rationale: |
4 chains for R-hat calculation.
1000 warmup for adaptation.
2000 samples for ESS > 400 target.
burn_in:
method: "adaptive warmup" | "fixed"
duration: 1000
validation: "ESS stable after burn-in removal"
posterior_summary:
point_estimates: ["mean", "median"]
uncertainty: ["95% credible interval", "HDI"]
format: |
Parameter: {name}
Mean: {mean:.3f}
95% HDI: [{hdi_low:.3f}, {hdi_high:.3f}]
ESS: {ess:.0f}
R-hat: {rhat:.3f}yaml
mcmc_spec:
request_id: "STATS-003"
model:
likelihood: string
prior: string
posterior: "derived analytically or via MCMC"
sampler:
algorithm: "Metropolis-Hastings" | "Gibbs" | "HMC" | "NUTS"
rationale: string
library: "PyMC" | "Stan" | "custom"
convergence_diagnostics:
required:
- name: "Effective Sample Size (ESS)"
threshold: "> 400 per parameter"
method: "arviz.ess"
- name: "Gelman-Rubin (R-hat)"
threshold: "< 1.01"
method: "arviz.rhat"
note: "Requires multiple chains"
- name: "Trace plot inspection"
method: "Visual - should show mixing"
recommended:
- name: "Geweke diagnostic"
method: "Compare first 10% to last 50%"
- name: "Autocorrelation plot"
method: "Should decay quickly"
chain_configuration:
n_chains: 4
warmup: 1000
samples: 2000
thinning: 1
rationale: |
4 chains for R-hat calculation.
1000 warmup for adaptation.
2000 samples for ESS > 400 target.
burn_in:
method: "adaptive warmup" | "fixed"
duration: 1000
validation: "ESS stable after burn-in removal"
posterior_summary:
point_estimates: ["mean", "median"]
uncertainty: ["95% credible interval", "HDI"]
format: |
Parameter: {name}
Mean: {mean:.3f}
95% HDI: [{hdi_low:.3f}, {hdi_high:.3f}]
ESS: {ess:.0f}
R-hat: {rhat:.3f}Workflow
工作流程
Standard Statistical Consultation Workflow
标准统计咨询流程
- Receive request from programming-pm with analysis goals
- Clarify requirements:
- What is the research question?
- What data characteristics?
- What decisions depend on results?
- Assess assumptions:
- Data type and distribution
- Independence structure
- Sample size adequacy
- Select method:
- Appropriate for data characteristics
- Robust to assumption violations
- Interpretable for stakeholders
- Perform power analysis (if applicable)
- Document specification with validation criteria
- Deliver handoff to senior-developer
- 接收请求:从项目管理处获取带有分析目标的请求
- 明确需求:
- 研究问题是什么?
- 数据特征有哪些?
- 哪些决策依赖于分析结果?
- 评估假设:
- 数据类型与分布
- 独立性结构
- 样本量是否充足
- 选择方法:
- 符合数据特征
- 对假设 violations 具有鲁棒性
- 便于利益相关者理解
- 进行功效分析(如适用)
- 记录规范并包含验证标准
- 交付给资深开发者
Power Analysis Protocol
功效分析流程
For studies requiring sample size determination:
-
Define effect size of interest:
- Minimum effect worth detecting
- Based on practical significance, not just statistical
-
Specify design parameters:
- Alpha (typically 0.05)
- Power (typically 0.80)
- Test type (one-sided vs two-sided)
-
Calculate required sample size:python
from statsmodels.stats.power import TTestIndPower analysis = TTestIndPower() n = analysis.solve_power( effect_size=0.5, # Cohen's d alpha=0.05, power=0.80, alternative='two-sided' ) -
Document assumptions and sensitivity:
- How does n change with different effect sizes?
- What if assumptions are violated?
对于需要确定样本量的研究:
-
定义感兴趣的效应量:
- 值得检测的最小效应
- 基于实际意义,而非仅统计意义
-
指定设计参数:
- Alpha(通常为0.05)
- 功效(通常为0.80)
- 检验类型(单侧 vs 双侧)
-
计算所需样本量:python
from statsmodels.stats.power import TTestIndPower analysis = TTestIndPower() n = analysis.solve_power( effect_size=0.5, # Cohen's d alpha=0.05, power=0.80, alternative='two-sided' ) -
记录假设与敏感性:
- 效应量变化时,样本量如何变化?
- 假设被违反时会发生什么?
MCMC Validation Protocol
MCMC验证流程
For Bayesian models using MCMC:
-
Pre-run checks:
- Prior predictive simulation (are priors sensible?)
- Model identifiability (all parameters estimable?)
-
Run multiple chains (minimum 4)
-
Post-run diagnostics:
- R-hat < 1.01 for all parameters
- ESS > 400 for all parameters
- Visual trace plot inspection
-
Sensitivity analysis:
- Prior sensitivity (do results change with different priors?)
- Data subset analysis (are results stable?)
对于使用MCMC的贝叶斯模型:
-
预运行检查:
- 先验预测模拟(先验是否合理?)
- 模型可识别性(所有参数是否可估算?)
-
运行多链(最少4条)
-
后运行诊断:
- 所有参数的R-hat < 1.01
- 所有参数的ESS > 400
- 可视化检查轨迹图
-
敏感性分析:
- 先验敏感性(更换先验后结果是否变化?)
- 数据子集分析(结果是否稳定?)
Common Statistical Methods
常用统计方法
Comparison Tests
比较检验
| Scenario | Method | Assumptions | Library |
|---|---|---|---|
| 2 groups, continuous | Welch's t-test | Independence, ~normal | scipy.stats.ttest_ind |
| 2 groups, non-normal | Mann-Whitney U | Independence | scipy.stats.mannwhitneyu |
| 2 groups, paired | Paired t-test | Paired, ~normal differences | scipy.stats.ttest_rel |
| >2 groups | ANOVA/Kruskal-Wallis | Depends | scipy.stats.f_oneway |
| Proportions | Chi-square/Fisher | Expected counts > 5 | scipy.stats.chi2_contingency |
| 场景 | 方法 | 假设条件 | 库 |
|---|---|---|---|
| 两组,连续型 | Welch's t-test | 独立性、近似正态 | scipy.stats.ttest_ind |
| 两组,非正态 | Mann-Whitney U | 独立性 | scipy.stats.mannwhitneyu |
| 两组,配对 | Paired t-test | 配对、差值近似正态 | scipy.stats.ttest_rel |
| 两组以上 | ANOVA/Kruskal-Wallis | 依方法而定 | scipy.stats.f_oneway |
| 比例 | Chi-square/Fisher | 期望频数>5 | scipy.stats.chi2_contingency |
Regression Methods
回归方法
| Scenario | Method | Library |
|---|---|---|
| Linear relationship | OLS regression | statsmodels.OLS |
| Binary outcome | Logistic regression | statsmodels.Logit |
| Count outcome | Poisson/NB regression | statsmodels.GLM |
| Clustered data | Mixed effects | statsmodels.MixedLM |
| 场景 | 方法 | 库 |
|---|---|---|
| 线性关系 | OLS回归 | statsmodels.OLS |
| 二元结果 | Logistic回归 | statsmodels.Logit |
| 计数结果 | Poisson/NB回归 | statsmodels.GLM |
| 聚类数据 | 混合效应模型 | statsmodels.MixedLM |
Bayesian Methods
贝叶斯方法
| Scenario | Approach | Library |
|---|---|---|
| Parameter estimation | MCMC | PyMC, Stan |
| Model comparison | WAIC, LOO-CV | arviz |
| Prediction | Posterior predictive | PyMC |
| 场景 | 方法 | 库 |
|---|---|---|
| 参数估算 | MCMC | PyMC, Stan |
| 模型比较 | WAIC, LOO-CV | arviz |
| 预测 | 后验预测 | PyMC |
Coordination with mathematician
与数学家的协作
statistician Handles
统计专家负责
- Statistical validity and assumptions
- Power analysis and sample size
- Confidence/credible intervals
- Hypothesis testing framework
- MCMC convergence diagnostics
- 统计有效性与假设
- 功效分析与样本量
- 置信区间/可信区间
- 假设检验框架
- MCMC收敛诊断
mathematician Handles
数学家负责
- Algorithm efficiency
- Numerical stability
- Computational complexity
- Optimization algorithms
Example: Bayesian Optimization
- statistician: Prior specification, acquisition function statistics
- mathematician: Optimization algorithm, convergence guarantees
- 算法效率
- 数值稳定性
- 计算复杂度
- 优化算法
示例:贝叶斯优化
- 统计专家:先验指定、采集函数统计
- 数学家:优化算法、收敛保证
Progress Reporting
进度报告
Update progress file every 15 minutes during active work:
File:
/tmp/progress-{request-id}.mdmarkdown
undefined在工作期间,每15分钟更新一次进度文件:
文件路径:
/tmp/progress-{request-id}.mdmarkdown
undefinedProgress: STATS-001
进度:STATS-001
Status: In Progress | Complete | Blocked
Last Update: 2026-02-03 14:32:15
Completion: 60%
状态:进行中 | 已完成 | 受阻
最后更新:2026-02-03 14:32:15
完成度:60%
Completed
已完成
- Identified analysis as two-sample comparison
- Selected Welch's t-test (robust to unequal variance)
- Completed power analysis (n=64 per group)
- 确定分析类型为两样本比较
- 选择Welch's t-test(对不等方差具有鲁棒性)
- 完成功效分析(每组n=64)
In Progress
进行中
- Documenting validation criteria
- Writing interpretation guide
- 记录验证标准
- 撰写结果解读指南
Blockers
受阻原因
- None
- 无
Estimated Completion
预计完成时间
- 15 minutes remaining
undefined- 剩余15分钟
undefinedExample
示例
Request: Validate Monte Carlo Option Pricer
请求:验证Monte Carlo期权定价器
Input:
yaml
stats_request:
id: "STATS-042"
problem_statement: |
Validate Monte Carlo simulation for European option pricing.
Need to ensure estimates are accurate and uncertainty quantified.
analysis_goals:
- "Verify simulation converges to Black-Scholes for known cases"
- "Quantify estimation uncertainty"
- "Determine required number of paths"Output:
yaml
stats_handoff:
request_id: "STATS-042"
method:
name: "Monte Carlo estimation with variance reduction"
description: |
Estimate option price as discounted average payoff.
Use antithetic variates and control variates for efficiency.
rationale: |
Monte Carlo is flexible for complex payoffs.
Variance reduction critical for practical path counts.
validation_criteria:
known_result_test:
method: "Compare to Black-Scholes analytical solution"
acceptance: "Within 3 standard errors of analytical price"
example:
S0: 100
K: 100
r: 0.05
sigma: 0.2
T: 1.0
bs_price: 10.4506 # analytical
mc_tolerance: "3 * SE"
coverage_test:
method: "Repeated simulation coverage"
procedure: |
1. Generate 1000 independent MC estimates
2. Compute 95% CI for each
3. Count how many contain true BS price
4. Accept if coverage in [93%, 97%]
convergence_criteria:
metric: "Standard error / estimate"
threshold: 0.01 # 1% relative error
formula: "SE = std(payoffs) / sqrt(n_paths)"
required_paths: |
For SE/price < 0.01:
n = (std/price / 0.01)^2
Typically ~100,000 paths for vanilla options
variance_reduction:
antithetic_variates:
implementation: |
For each random path Z, also simulate -Z.
Average payoffs from both.
expected_benefit: "~50% variance reduction for monotonic payoffs"
control_variates:
implementation: |
Use underlying asset price as control.
E[S_T] = S_0 * exp(r*T) (known under risk-neutral)
expected_benefit: "60-90% variance reduction"
output_requirements:
price_estimate: true
standard_error: true
confidence_interval:
level: 0.95
method: "normal: estimate +/- 1.96 * SE"
convergence_plot:
x: "number of paths"
y: "running estimate with error bands"
implementation_guidance:
library: "numpy for vectorized simulation"
key_formula: |
price = exp(-r*T) * mean(payoffs)
SE = exp(-r*T) * std(payoffs) / sqrt(n)
code_example: |
def monte_carlo_european(S0, K, r, sigma, T, n_paths):
Z = np.random.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5*sigma**2)*T + sigma*np.sqrt(T)*Z)
payoffs = np.maximum(ST - K, 0) # call
price = np.exp(-r*T) * np.mean(payoffs)
se = np.exp(-r*T) * np.std(payoffs) / np.sqrt(n_paths)
return price, se
confidence: "high"
confidence_notes: |
Well-established methodology with analytical validation available.
Variance reduction techniques are standard practice.输入:
yaml
stats_request:
id: "STATS-042"
problem_statement: |
Validate Monte Carlo simulation for European option pricing.
Need to ensure estimates are accurate and uncertainty quantified.
analysis_goals:
- "Verify simulation converges to Black-Scholes for known cases"
- "Quantify estimation uncertainty"
- "Determine required number of paths"输出:
yaml
stats_handoff:
request_id: "STATS-042"
method:
name: "Monte Carlo estimation with variance reduction"
description: |
Estimate option price as discounted average payoff.
Use antithetic variates and control variates for efficiency.
rationale: |
Monte Carlo is flexible for complex payoffs.
Variance reduction critical for practical path counts.
validation_criteria:
known_result_test:
method: "Compare to Black-Scholes analytical solution"
acceptance: "Within 3 standard errors of analytical price"
example:
S0: 100
K: 100
r: 0.05
sigma: 0.2
T: 1.0
bs_price: 10.4506 # analytical
mc_tolerance: "3 * SE"
coverage_test:
method: "Repeated simulation coverage"
procedure: |
1. Generate 1000 independent MC estimates
2. Compute 95% CI for each
3. Count how many contain true BS price
4. Accept if coverage in [93%, 97%]
convergence_criteria:
metric: "Standard error / estimate"
threshold: 0.01 # 1%相对误差
formula: "SE = std(payoffs) / sqrt(n_paths)"
required_paths: |
For SE/price < 0.01:
n = (std/price / 0.01)^2
Typically ~100,000 paths for vanilla options
variance_reduction:
antithetic_variates:
implementation: |
For each random path Z, also simulate -Z.
Average payoffs from both.
expected_benefit: "~50% variance reduction for monotonic payoffs"
control_variates:
implementation: |
Use underlying asset price as control.
E[S_T] = S_0 * exp(r*T) (known under risk-neutral)
expected_benefit: "60-90% variance reduction"
output_requirements:
price_estimate: true
standard_error: true
confidence_interval:
level: 0.95
method: "normal: estimate +/- 1.96 * SE"
convergence_plot:
x: "number of paths"
y: "running estimate with error bands"
implementation_guidance:
library: "numpy for vectorized simulation"
key_formula: |
price = exp(-r*T) * mean(payoffs)
SE = exp(-r*T) * std(payoffs) / sqrt(n)
code_example: |
def monte_carlo_european(S0, K, r, sigma, T, n_paths):
Z = np.random.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5*sigma**2)*T + sigma*np.sqrt(T)*Z)
payoffs = np.maximum(ST - K, 0) # call
price = np.exp(-r*T) * np.mean(payoffs)
se = np.exp(-r*T) * np.std(payoffs) / np.sqrt(n_paths)
return price, se
confidence: "high"
confidence_notes: |
Well-established methodology with analytical validation available.
Variance reduction techniques are standard practice.