Use when selecting statistical methods, performing power analysis, guiding uncertainty quantification, or validating MCMC/Monte Carlo implementations.
```shell
npx skill4agent add dangeles/claude statistician
```

```yaml
stats_request:
  id: "STATS-001"
  context: string              # Project context and goals
  problem_statement: string    # Statistical question to address
  data_characteristics:
    type: "continuous" | "categorical" | "count" | "time_series"
    sample_size: int | "to be determined"
    distribution: "unknown" | "normal" | "skewed" | etc.
    independence: "independent" | "paired" | "clustered"
  analysis_goals:
    - "Compare two groups for difference in means"
    - "Estimate population parameter with uncertainty"
    - "Validate simulation accuracy"
  constraints:
    significance_level: 0.05
    power_requirement: 0.80
    effect_size_interest: "medium" | specific_value
```

```yaml
stats_handoff:
  request_id: "STATS-001"
  timestamp: ISO8601
  method:
    name: string         # Standard method name
    description: string  # What the method does
    rationale: string    # Why this method was chosen
  assumptions:
    data_requirements:
      - "Continuous outcome variable"
      - "Independent observations"
    distributional:
      - "Approximately normal (n > 30 by CLT)"
    violations_impact:
      - assumption: "Non-normality"
        impact: "Reduced power, biased p-values"
        mitigation: "Use bootstrap or permutation test"
  implementation_guidance:
    library: "scipy.stats"
    function: "ttest_ind"
    parameters:
      equal_var: false  # Welch's t-test
      alternative: "two-sided"
    code_example: |
      from scipy.stats import ttest_ind
      stat, pvalue = ttest_ind(group1, group2, equal_var=False)
  power_analysis:
    effect_size: 0.5  # Cohen's d
    alpha: 0.05
    power: 0.80
    required_n_per_group: 64
    calculation_method: "statsmodels.stats.power.TTestIndPower"
    interpretation: |
      With 64 subjects per group, we have 80% power to detect
      a medium effect (d=0.5) at alpha=0.05.
  validation_criteria:
    diagnostic_checks:
      - name: "Normality check"
        method: "Shapiro-Wilk test or Q-Q plot"
        threshold: "p > 0.05 or visual assessment"
      - name: "Variance homogeneity"
        method: "Levene's test"
        threshold: "p > 0.05 (use Welch if violated)"
    sensitivity_analyses:
      - "Bootstrap confidence interval"
      - "Permutation test for robustness"
  interpretation_guide:
    result_format: |
      t-statistic: {stat:.3f}
      p-value: {pvalue:.4f}
      Effect size (Cohen's d): {d:.3f}
      95% CI for difference: [{lower:.3f}, {upper:.3f}]
    significance_threshold: 0.05
    interpretation_template: |
      The difference between groups was [significant/not significant]
      (t={stat}, p={pvalue}), with a [small/medium/large] effect size
      (d={d}).
  confidence: "high" | "medium" | "low"
  confidence_notes: string
```

```yaml
monte_carlo_spec:
  request_id: "STATS-002"
  simulation_design:
    purpose: string   # What the simulation estimates
    estimand: string  # True parameter being estimated
    method: string    # How the simulation estimates it
  sample_size:
    n_iterations: 10000
    rationale: "Achieves SE < 0.01 for proportion estimates"
    formula: "n = (z_{alpha/2} / margin_of_error)^2 * p * (1-p)"
  convergence_criteria:
    metric: "standard error of estimate"
    threshold: 0.01
    check_frequency: "every 1000 iterations"
    early_stopping: true
  variance_reduction:
    techniques:
      - name: "Antithetic variates"
        description: "Use negatively correlated pairs"
        expected_reduction: "~50% for monotonic functions"
      - name: "Control variates"
        description: "Use a correlated variable with known mean"
  validation:
    known_result_test:
      description: "Test against a case with an analytical solution"
      example: "European option vs. Black-Scholes"
    coverage_test:
      description: "Verify the 95% CI captures the true value 95% of the time"
      n_replications: 1000
  output_requirements:
    point_estimate: true
    standard_error: true
    confidence_interval:
      level: 0.95
      method: "normal approximation or bootstrap percentile"
```

```yaml
mcmc_spec:
  request_id: "STATS-003"
  model:
    likelihood: string
    prior: string
    posterior: "derived analytically or via MCMC"
  sampler:
    algorithm: "Metropolis-Hastings" | "Gibbs" | "HMC" | "NUTS"
    rationale: string
    library: "PyMC" | "Stan" | "custom"
  convergence_diagnostics:
    required:
      - name: "Effective Sample Size (ESS)"
        threshold: "> 400 per parameter"
        method: "arviz.ess"
      - name: "Gelman-Rubin (R-hat)"
        threshold: "< 1.01"
        method: "arviz.rhat"
        note: "Requires multiple chains"
      - name: "Trace plot inspection"
        method: "Visual - chains should show good mixing"
    recommended:
      - name: "Geweke diagnostic"
        method: "Compare first 10% of the chain to the last 50%"
      - name: "Autocorrelation plot"
        method: "Autocorrelation should decay quickly"
  chain_configuration:
    n_chains: 4
    warmup: 1000
    samples: 2000
    thinning: 1
    rationale: |
      4 chains for R-hat calculation.
      1000 warmup iterations for adaptation.
      2000 samples per chain to meet the ESS > 400 target.
  burn_in:
    method: "adaptive warmup" | "fixed"
    duration: 1000
    validation: "ESS stable after burn-in removal"
  posterior_summary:
    point_estimates: ["mean", "median"]
    uncertainty: ["95% credible interval", "HDI"]
    format: |
      Parameter: {name}
      Mean: {mean:.3f}
      95% HDI: [{hdi_low:.3f}, {hdi_high:.3f}]
      ESS: {ess:.0f}
      R-hat: {rhat:.3f}
```

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n = analysis.solve_power(
    effect_size=0.5,  # Cohen's d
    alpha=0.05,
    power=0.80,
    alternative='two-sided',
)  # ~63.8, round up to 64 per group
```

| Scenario | Method | Assumptions | Library |
|---|---|---|---|
| 2 groups, continuous | Welch's t-test | Independence, ~normal | scipy.stats.ttest_ind |
| 2 groups, non-normal | Mann-Whitney U | Independence | scipy.stats.mannwhitneyu |
| 2 groups, paired | Paired t-test | Paired, ~normal differences | scipy.stats.ttest_rel |
| >2 groups | ANOVA / Kruskal-Wallis | Depends on test | scipy.stats.f_oneway |
| Proportions | Chi-square / Fisher's exact | Expected counts > 5 | scipy.stats.chi2_contingency |
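The two-sample workflow above (diagnostic checks, Welch's t-test, effect size) can be sketched end to end. The data here are synthetic and all names (`group1`, `group2`) are illustrative, not part of the skill schemas:

```python
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind

rng = np.random.default_rng(42)
group1 = rng.normal(loc=10.0, scale=2.0, size=64)
group2 = rng.normal(loc=11.0, scale=3.0, size=64)

# Diagnostic checks before the main test
_, p_norm1 = shapiro(group1)       # normality, group 1
_, p_norm2 = shapiro(group2)       # normality, group 2
_, p_var = levene(group1, group2)  # variance homogeneity

# Welch's t-test (equal_var=False) is robust to unequal variances
stat, pvalue = ttest_ind(group1, group2, equal_var=False)

# Cohen's d with a pooled standard deviation
pooled_sd = np.sqrt((np.var(group1, ddof=1) + np.var(group2, ddof=1)) / 2)
d = (np.mean(group1) - np.mean(group2)) / pooled_sd
```

The diagnostic p-values feed the `validation_criteria` block of the handoff; if Levene's test rejects, the Welch test is already the right choice.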
| Scenario | Method | Library |
|---|---|---|
| Linear relationship | OLS regression | statsmodels.api.OLS |
| Binary outcome | Logistic regression | statsmodels.api.Logit |
| Count outcome | Poisson/NB regression | statsmodels.api.GLM |
| Clustered data | Mixed effects | statsmodels.api.MixedLM |
| Scenario | Approach | Library |
|---|---|---|
| Parameter estimation | MCMC | PyMC, Stan |
| Model comparison | WAIC, LOO-CV | arviz |
| Prediction | Posterior predictive | PyMC |
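The Bayesian rows assume PyMC or Stan, but the convergence diagnostics in `mcmc_spec` can be illustrated without those dependencies. This is a toy random-walk Metropolis sampler for a normal mean (known sigma, flat prior) plus a hand-rolled split R-hat; it is a sketch of the diagnostic workflow, not a substitute for `arviz.rhat`:

```python
import numpy as np

def metropolis_normal_mean(data, n_samples=2000, warmup=1000, seed=0):
    """Posterior draws for a normal mean (sigma=1, flat prior)
    via random-walk Metropolis."""
    rng = np.random.default_rng(seed)

    def log_post(m):
        return -0.5 * np.sum((data - m) ** 2)

    mu, lp = 0.0, log_post(0.0)
    chain = []
    for i in range(warmup + n_samples):
        prop = mu + rng.normal(scale=0.5)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            mu, lp = prop, lp_prop
        if i >= warmup:                           # discard burn-in
            chain.append(mu)
    return np.array(chain)

def split_rhat(chains):
    """Split R-hat over a (n_chains, n_samples) array."""
    half = chains.shape[1] // 2
    splits = np.vstack([chains[:, :half], chains[:, half:2 * half]])
    n = splits.shape[1]
    W = splits.var(axis=1, ddof=1).mean()      # within-chain variance
    B = n * splits.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

data = np.random.default_rng(1).normal(3.0, 1.0, size=50)
chains = np.vstack([metropolis_normal_mean(data, seed=s) for s in range(4)])
rhat = split_rhat(chains)  # should approach 1 for well-mixed chains
```

The 4-chain configuration mirrors `chain_configuration` above: multiple chains exist precisely so that R-hat can compare between-chain and within-chain variance.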
`/tmp/progress-{request-id}.md`

```markdown
# Progress: STATS-001

**Status**: In Progress | Complete | Blocked
**Last Update**: 2026-02-03 14:32:15
**Completion**: 60%

## Completed
- Identified analysis as two-sample comparison
- Selected Welch's t-test (robust to unequal variances)
- Completed power analysis (n=64 per group)

## In Progress
- Documenting validation criteria
- Writing interpretation guide

## Blockers
- None

## Estimated Completion
- 15 minutes remaining
```

```yaml
stats_request:
  id: "STATS-042"
  problem_statement: |
    Validate a Monte Carlo simulation for European option pricing.
    Need to ensure estimates are accurate and uncertainty is quantified.
  analysis_goals:
    - "Verify the simulation converges to Black-Scholes for known cases"
    - "Quantify estimation uncertainty"
    - "Determine the required number of paths"
```

```yaml
stats_handoff:
  request_id: "STATS-042"
  method:
    name: "Monte Carlo estimation with variance reduction"
    description: |
      Estimate the option price as the discounted average payoff.
      Use antithetic variates and control variates for efficiency.
    rationale: |
      Monte Carlo is flexible for complex payoffs.
      Variance reduction is critical for practical path counts.
  validation_criteria:
    known_result_test:
      method: "Compare to the Black-Scholes analytical solution"
      acceptance: "Within 3 standard errors of the analytical price"
      example:
        S0: 100
        K: 100
        r: 0.05
        sigma: 0.2
        T: 1.0
        bs_price: 10.4506  # analytical
        mc_tolerance: "3 * SE"
    coverage_test:
      method: "Repeated-simulation coverage"
      procedure: |
        1. Generate 1000 independent MC estimates
        2. Compute a 95% CI for each
        3. Count how many contain the true BS price
        4. Accept if coverage is in [93%, 97%]
  convergence_criteria:
    metric: "Standard error / estimate"
    threshold: 0.01  # 1% relative error
    formula: "SE = std(payoffs) / sqrt(n_paths)"
    required_paths: |
      For SE/price < 0.01:
        n = ((std/price) / 0.01)^2
      Typically ~100,000 paths for vanilla options
  variance_reduction:
    antithetic_variates:
      implementation: |
        For each random draw Z, also simulate the path driven by -Z.
        Average the payoffs from both.
      expected_benefit: "~50% variance reduction for monotonic payoffs"
    control_variates:
      implementation: |
        Use the underlying asset price as the control.
        E[S_T] = S_0 * exp(r*T) is known under the risk-neutral measure.
      expected_benefit: "60-90% variance reduction"
  output_requirements:
    price_estimate: true
    standard_error: true
    confidence_interval:
      level: 0.95
      method: "normal: estimate +/- 1.96 * SE"
    convergence_plot:
      x: "number of paths"
      y: "running estimate with error bands"
  implementation_guidance:
    library: "numpy for vectorized simulation"
    key_formulas: |
      price = exp(-r*T) * mean(payoffs)
      SE = exp(-r*T) * std(payoffs) / sqrt(n)
    code_example: |
      import numpy as np

      def monte_carlo_european(S0, K, r, sigma, T, n_paths):
          Z = np.random.standard_normal(n_paths)
          ST = S0 * np.exp((r - 0.5*sigma**2)*T + sigma*np.sqrt(T)*Z)
          payoffs = np.maximum(ST - K, 0)  # call
          price = np.exp(-r*T) * np.mean(payoffs)
          se = np.exp(-r*T) * np.std(payoffs) / np.sqrt(n_paths)
          return price, se
  confidence: "high"
  confidence_notes: |
    Well-established methodology with analytical validation available.
    Variance reduction techniques are standard practice.
```
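Putting the STATS-042 handoff together, here is one way the known-result test might look: an antithetic-variates estimator checked against the Black-Scholes benchmark within 3 standard errors. Function names and the fixed seed are illustrative:

```python
import numpy as np
from math import log, sqrt, exp
from statistics import NormalDist

def black_scholes_call(S0, K, r, sigma, T):
    """Analytical benchmark for the known-result test."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    return S0 * N(d1) - K * exp(-r * T) * N(d2)

def mc_call_antithetic(S0, K, r, sigma, T, n_pairs, seed=0):
    """Antithetic variates: each draw Z is paired with -Z and the
    two payoffs are averaged before taking the overall mean."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal(n_pairs)
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * np.sqrt(T)
    pay_plus = np.maximum(S0 * np.exp(drift + vol * Z) - K, 0)
    pay_minus = np.maximum(S0 * np.exp(drift - vol * Z) - K, 0)
    paired = 0.5 * (pay_plus + pay_minus)  # negatively correlated pair
    price = np.exp(-r * T) * paired.mean()
    se = np.exp(-r * T) * paired.std(ddof=1) / np.sqrt(n_pairs)
    return price, se

bs = black_scholes_call(100, 100, 0.05, 0.2, 1.0)  # ~10.4506
mc, se = mc_call_antithetic(100, 100, 0.05, 0.2, 1.0, 100_000)
assert abs(mc - bs) < 3 * se  # known-result acceptance criterion
```

The same harness, rerun with independent seeds, yields the repeated-simulation coverage test described in `validation_criteria`.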