statistical-analysis
Statistical Analysis
Overview
Statistical analysis is a systematic process for testing hypotheses and quantifying relationships. Conduct hypothesis tests (t-test, ANOVA, chi-square), regression, correlation, and Bayesian analyses with assumption checks and APA reporting. Apply this skill for academic research.
When to Use This Skill
This skill should be used when:
- Conducting statistical hypothesis tests (t-tests, ANOVA, chi-square)
- Performing regression or correlation analyses
- Running Bayesian statistical analyses
- Checking statistical assumptions and diagnostics
- Calculating effect sizes and conducting power analyses
- Reporting statistical results in APA format
- Analyzing experimental or observational data for research
Core Capabilities
1. Test Selection and Planning
- Choose appropriate statistical tests based on research questions and data characteristics
- Conduct a priori power analyses to determine required sample sizes
- Plan analysis strategies including multiple comparison corrections
2. Assumption Checking
- Automatically verify all relevant assumptions before running tests
- Provide diagnostic visualizations (Q-Q plots, residual plots, box plots)
- Recommend remedial actions when assumptions are violated
3. Statistical Testing
- Hypothesis testing: t-tests, ANOVA, chi-square, non-parametric alternatives
- Regression: linear, multiple, logistic, with diagnostics
- Correlations: Pearson, Spearman, with confidence intervals
- Bayesian alternatives: Bayesian t-tests, ANOVA, regression with Bayes Factors
4. Effect Sizes and Interpretation
- Calculate and interpret appropriate effect sizes for all analyses
- Provide confidence intervals for effect estimates
- Distinguish statistical from practical significance
5. Professional Reporting
- Generate APA-style statistical reports
- Create publication-ready figures and tables
- Provide complete interpretation with all required statistics
Workflow Decision Tree
Use this decision tree to determine your analysis path:
START
│
├─ Need to SELECT a statistical test?
│   ├─ YES → See "Test Selection Guide"
│   └─ NO → Continue
│
├─ Ready to check ASSUMPTIONS?
│   ├─ YES → See "Assumption Checking"
│   └─ NO → Continue
│
├─ Ready to run ANALYSIS?
│   ├─ YES → See "Running Statistical Tests"
│   └─ NO → Continue
│
└─ Need to REPORT results?
    └─ YES → See "Reporting Results"
Test Selection Guide
Quick Reference: Choosing the Right Test
Use `references/test_selection_guide.md` for comprehensive guidance. Quick reference:
Comparing Two Groups:
- Independent, continuous, normal → Independent t-test
- Independent, continuous, non-normal → Mann-Whitney U test
- Paired, continuous, normal → Paired t-test
- Paired, continuous, non-normal → Wilcoxon signed-rank test
- Binary outcome → Chi-square or Fisher's exact test
Comparing 3+ Groups:
- Independent, continuous, normal → One-way ANOVA
- Independent, continuous, non-normal → Kruskal-Wallis test
- Paired, continuous, normal → Repeated measures ANOVA
- Paired, continuous, non-normal → Friedman test
Relationships:
- Two continuous variables → Pearson (normal) or Spearman correlation (non-normal)
- Continuous outcome with predictor(s) → Linear regression
- Binary outcome with predictor(s) → Logistic regression
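The group-comparison rules above can be sketched as a small lookup function. This is only an illustration of the quick reference for continuous outcomes; `suggest_test` and its parameters are hypothetical names and are not part of the skill's scripts.

```python
def suggest_test(design: str, n_groups: int, normal: bool) -> str:
    """Return the quick-reference test for a continuous outcome.

    design:   "independent" or "paired"
    n_groups: number of groups or conditions being compared (>= 2)
    normal:   whether the normality assumption looks tenable
    """
    if design not in ("independent", "paired"):
        raise ValueError("design must be 'independent' or 'paired'")
    if n_groups < 2:
        raise ValueError("n_groups must be at least 2")

    # Keys: (design, exactly two groups?, normality tenable?)
    table = {
        ("independent", True, True): "Independent t-test",
        ("independent", True, False): "Mann-Whitney U test",
        ("paired", True, True): "Paired t-test",
        ("paired", True, False): "Wilcoxon signed-rank test",
        ("independent", False, True): "One-way ANOVA",
        ("independent", False, False): "Kruskal-Wallis test",
        ("paired", False, True): "Repeated measures ANOVA",
        ("paired", False, False): "Friedman test",
    }
    return table[(design, n_groups == 2, normal)]


print(suggest_test("independent", 2, normal=False))
print(suggest_test("paired", 3, normal=True))
```

A lookup like this only encodes the rules of thumb; the design of the study and the assumption checks still decide which test is defensible.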
Bayesian Alternatives:
All of the tests above have Bayesian versions that provide:
- Direct probability statements about hypotheses
- Bayes Factors quantifying evidence
- The ability to quantify support for the null hypothesis
See `references/bayesian_statistics.md` for details.
Assumption Checking
Systematic Assumption Verification
ALWAYS check assumptions before interpreting test results.
Use the provided `scripts/assumption_checks.py` module for automated checking:
```python
from scripts.assumption_checks import comprehensive_assumption_check

# Comprehensive check with visualizations
results = comprehensive_assumption_check(
    data=df,
    value_col='score',
    group_col='group',  # Optional: for group comparisons
    alpha=0.05
)
```
This performs:
1. **Outlier detection** (IQR and z-score methods)
2. **Normality testing** (Shapiro-Wilk test + Q-Q plots)
3. **Homogeneity of variance** (Levene's test + box plots)
4. **Interpretation and recommendations**
Individual Assumption Checks
For targeted checks, use individual functions:
```python
from scripts.assumption_checks import (
    check_normality,
    check_normality_per_group,
    check_homogeneity_of_variance,
    check_linearity,
    detect_outliers
)

# Example: Check normality with visualization
result = check_normality(
    data=df['score'],
    name='Test Score',
    alpha=0.05,
    plot=True
)
print(result['interpretation'])
print(result['recommendation'])
```
What to Do When Assumptions Are Violated
Normality violated:
- Mild violation + n > 30 per group → Proceed with parametric test (robust)
- Moderate violation → Use non-parametric alternative
- Severe violation → Transform data or use non-parametric test
Homogeneity of variance violated:
- For t-test → Use Welch's t-test
- For ANOVA → Use Welch's ANOVA or Brown-Forsythe ANOVA
- For regression → Use robust standard errors or weighted least squares
Linearity violated (regression):
- Add polynomial terms
- Transform variables
- Use non-linear models or GAM
See `references/assumptions_and_diagnostics.md` for comprehensive guidance.
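As a rough illustration of two of the remedies listed above, the sketch below runs Welch's t-test (for unequal variances) and the Mann-Whitney U test (a non-parametric alternative) with `scipy.stats`. The data are simulated purely for demonstration, and the group sizes and parameters are assumptions, not values from this skill.

```python
import numpy as np
from scipy import stats

# Simulated groups with unequal sizes and unequal spread (illustrative only)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50, scale=5, size=40)
group_b = rng.normal(loc=53, scale=15, size=25)

# Levene's test (median-centered) as a check on homogeneity of variance
levene_stat, levene_p = stats.levene(group_a, group_b, center="median")

# Welch's t-test: does not assume equal variances
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Mann-Whitney U: rank-based alternative when normality is doubtful
mwu = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"Levene: W = {levene_stat:.2f}, p = {levene_p:.3f}")
print(f"Welch:  t = {welch.statistic:.2f}, p = {welch.pvalue:.3f}")
print(f"Mann-Whitney: U = {mwu.statistic:.1f}, p = {mwu.pvalue:.3f}")
```

Some analysts prefer to use Welch's test by default rather than choosing between tests based on Levene's result; either way, report which test was used and why.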
Running Statistical Tests
Python Libraries
Primary libraries for statistical analysis:
- scipy.stats: Core statistical tests
- statsmodels: Advanced regression and diagnostics
- pingouin: User-friendly statistical testing with effect sizes
- pymc: Bayesian statistical modeling
- arviz: Bayesian visualization and diagnostics
Example Analyses
T-Test with Complete Reporting
```python
import pingouin as pg

# Run independent t-test
result = pg.ttest(group_a, group_b, correction='auto')

# Extract results
t_stat = result['T'].values[0]
dof = result['dof'].values[0]  # named dof so it does not shadow a DataFrame called df
p_value = result['p-val'].values[0]
cohens_d = result['cohen-d'].values[0]
# Note: CI95% is the confidence interval of the mean difference, not of Cohen's d
ci_lower, ci_upper = result['CI95%'].values[0]

# Report
print(f"t({dof:.0f}) = {t_stat:.2f}, p = {p_value:.3f}")
print(f"Cohen's d = {cohens_d:.2f}; mean difference 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")
```
ANOVA with Post-Hoc Tests
```python
import pingouin as pg

# One-way ANOVA
aov = pg.anova(dv='score', between='group', data=df, detailed=True)
print(aov)

# If significant, conduct post-hoc tests
if aov['p-unc'].values[0] < 0.05:
    posthoc = pg.pairwise_tukey(dv='score', between='group', data=df)
    print(posthoc)

# Effect size
eta_squared = aov['np2'].values[0]  # Partial eta-squared
print(f"Partial η² = {eta_squared:.3f}")
```
Linear Regression with Diagnostics
```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Fit model (X_predictors is a DataFrame of predictors, y is the outcome)
X = sm.add_constant(X_predictors)  # Add intercept
model = sm.OLS(y, X).fit()

# Summary
print(model.summary())

# Check multicollinearity (VIF)
vif_data = pd.DataFrame()
vif_data["Variable"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
print(vif_data)

# Check assumptions
residuals = model.resid
fitted = model.fittedvalues

# Residual plots
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Residuals vs fitted
axes[0, 0].scatter(fitted, residuals, alpha=0.6)
axes[0, 0].axhline(y=0, color='r', linestyle='--')
axes[0, 0].set_xlabel('Fitted values')
axes[0, 0].set_ylabel('Residuals')
axes[0, 0].set_title('Residuals vs Fitted')

# Q-Q plot
stats.probplot(residuals, dist="norm", plot=axes[0, 1])
axes[0, 1].set_title('Normal Q-Q')

# Scale-Location
axes[1, 0].scatter(fitted, np.sqrt(np.abs(residuals / residuals.std())), alpha=0.6)
axes[1, 0].set_xlabel('Fitted values')
axes[1, 0].set_ylabel('√|Standardized residuals|')
axes[1, 0].set_title('Scale-Location')

# Residuals histogram
axes[1, 1].hist(residuals, bins=20, edgecolor='black', alpha=0.7)
axes[1, 1].set_xlabel('Residuals')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].set_title('Histogram of Residuals')

plt.tight_layout()
plt.show()
```
Bayesian T-Test
```python
import pymc as pm
import arviz as az
import numpy as np

with pm.Model() as model:
    # Priors
    mu1 = pm.Normal('mu_group1', mu=0, sigma=10)
    mu2 = pm.Normal('mu_group2', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=10)

    # Likelihood
    y1 = pm.Normal('y1', mu=mu1, sigma=sigma, observed=group_a)
    y2 = pm.Normal('y2', mu=mu2, sigma=sigma, observed=group_b)

    # Derived quantity
    diff = pm.Deterministic('difference', mu1 - mu2)

    # Sample
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

# Summarize
print(az.summary(trace, var_names=['difference']))

# Probability that group1 > group2
prob_greater = np.mean(trace.posterior['difference'].values > 0)
print(f"P(μ₁ > μ₂ | data) = {prob_greater:.3f}")

# Plot posterior
az.plot_posterior(trace, var_names=['difference'], ref_val=0)
```
---
Effect Sizes
Always Calculate Effect Sizes
Effect sizes quantify the magnitude of an effect, while p-values only indicate how compatible the data are with the null hypothesis.
See `references/effect_sizes_and_power.md` for comprehensive guidance.
Quick Reference: Common Effect Sizes
| Test | Effect Size | Small | Medium | Large |
|---|---|---|---|---|
| T-test | Cohen's d | 0.20 | 0.50 | 0.80 |
| ANOVA | η²_p | 0.01 | 0.06 | 0.14 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | R² | 0.02 | 0.13 | 0.26 |
| Chi-square | Cramér's V (df* = 2) | 0.07 | 0.21 | 0.35 |
Important: Benchmarks are guidelines. Context matters!
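Cramér's V is the only effect size in the table that the examples in this document do not compute. A minimal sketch, assuming a made-up 2×2 contingency table, derives it from the chi-square statistic as V = √(χ² / (n · min(r − 1, c − 1))). Note that the benchmarks for V depend on min(r − 1, c − 1), so the thresholds are not the same for every table size.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = condition, columns = outcome (counts)
observed = np.array([[30, 20],
                     [15, 35]])

# Pearson chi-square without Yates' continuity correction
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

# Cramér's V = sqrt(chi2 / (n * min(rows - 1, cols - 1)))
n = observed.sum()
k = min(observed.shape) - 1
cramers_v = np.sqrt(chi2 / (n * k))

print(f"χ²({dof}, N = {n}) = {chi2:.2f}, p = {p_value:.3f}, V = {cramers_v:.2f}")
```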
Calculating Effect Sizes
Most effect sizes are automatically calculated by pingouin:
```python
import pingouin as pg

# T-test returns Cohen's d
result = pg.ttest(x, y)
d = result['cohen-d'].values[0]

# ANOVA returns partial eta-squared
aov = pg.anova(dv='score', between='group', data=df)
eta_p2 = aov['np2'].values[0]

# Correlation: r is already an effect size
corr = pg.corr(x, y)
r = corr['r'].values[0]
```
Confidence Intervals for Effect Sizes
Always report CIs to show precision:
```python
from pingouin import compute_effsize_from_t, compute_esci

# For t-test: convert the t statistic to Cohen's d (returns a single float)
d = compute_effsize_from_t(
    t_statistic,
    nx=len(group1),
    ny=len(group2),
    eftype='cohen'
)

# Then compute the confidence interval of d separately
ci = compute_esci(
    stat=d,
    nx=len(group1),
    ny=len(group2),
    eftype='cohen',
    confidence=0.95
)
print(f"d = {d:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```
---
Power Analysis
A Priori Power Analysis (Study Planning)
Determine required sample size before data collection:
```python
import math
from statsmodels.stats.power import (
    tt_ind_solve_power,
    FTestAnovaPower
)

# T-test: What n is needed to detect d = 0.5?
n_required = tt_ind_solve_power(
    effect_size=0.5,
    alpha=0.05,
    power=0.80,
    ratio=1.0,
    alternative='two-sided'
)
print(f"Required n per group: {math.ceil(n_required)}")

# ANOVA: What n is needed to detect f = 0.25?
# FTestAnovaPower takes k_groups and solves for the TOTAL sample size (nobs)
anova_power = FTestAnovaPower()
n_total = anova_power.solve_power(
    effect_size=0.25,
    k_groups=3,
    alpha=0.05,
    power=0.80
)
print(f"Required n per group: {math.ceil(n_total / 3)}")
```
Sensitivity Analysis (Post-Study)
Determine what effect size you could detect:
```python
from statsmodels.stats.power import tt_ind_solve_power

# With n=50 per group, what effect could we detect?
detectable_d = tt_ind_solve_power(
    effect_size=None,  # Solve for this
    nobs1=50,
    alpha=0.05,
    power=0.80,
    ratio=1.0,
    alternative='two-sided'
)
print(f"Study could detect d ≥ {detectable_d:.2f}")
```
**Note**: Post-hoc power analysis (calculating power after study) is generally not recommended. Use sensitivity analysis instead.
See `references/effect_sizes_and_power.md` for detailed guidance.
---
Reporting Results
APA Style Statistical Reporting
Follow the guidelines in `references/reporting_standards.md`.
Essential Reporting Elements
- Descriptive statistics: M, SD, n for all groups/variables
- Test statistics: Test name, statistic, df, exact p-value
- Effect sizes: With confidence intervals
- Assumption checks: Which tests were done, results, actions taken
- All planned analyses: Including non-significant findings
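The test-statistic and effect-size elements above can be assembled into an APA-style string with a small helper. The function names `format_p` and `format_t_test` are hypothetical and only sketch one way to apply the conventions (no leading zero for p, `p < .001` for very small values).

```python
def format_p(p: float) -> str:
    """Format a p-value APA-style: exact to three decimals, no leading zero."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_t_test(t: float, dof: float, p: float, d: float,
                  ci_low: float, ci_high: float) -> str:
    """Combine t, df, p, Cohen's d, and the CI of d into one report string."""
    return (f"t({dof:g}) = {t:.2f}, {format_p(p)}, d = {d:.2f}, "
            f"95% CI [{ci_low:.2f}, {ci_high:.2f}]")


print(format_t_test(3.82, 98, 0.0002, 0.77, 0.36, 1.18))
print(format_p(0.27))
```

Descriptive statistics and assumption checks still need to be written out alongside this string.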
Example Report Templates
Independent T-Test
Group A (n = 48, M = 75.2, SD = 8.5) scored significantly higher than
Group B (n = 52, M = 68.3, SD = 9.2), t(98) = 3.82, p < .001, d = 0.77,
95% CI [0.36, 1.18], two-tailed. Assumptions of normality (Shapiro-Wilk:
Group A W = 0.97, p = .18; Group B W = 0.96, p = .12) and homogeneity
of variance (Levene's F(1, 98) = 1.23, p = .27) were satisfied.
One-Way ANOVA
A one-way ANOVA revealed a significant main effect of treatment condition
on test scores, F(2, 147) = 8.45, p < .001, η²_p = .10. Post hoc
comparisons using Tukey's HSD indicated that Condition A (M = 78.2,
SD = 7.3) scored significantly higher than Condition B (M = 71.5,
SD = 8.1, p = .002, d = 0.87) and Condition C (M = 70.1, SD = 7.9,
p < .001, d = 1.07). Conditions B and C did not differ significantly
(p = .52, d = 0.18).
Multiple Regression
Multiple linear regression was conducted to predict exam scores from
study hours, prior GPA, and attendance. The overall model was significant,
F(3, 146) = 45.2, p < .001, R² = .48, adjusted R² = .47. Study hours
(B = 1.80, SE = 0.31, β = .35, t = 5.78, p < .001, 95% CI [1.18, 2.42])
and prior GPA (B = 8.52, SE = 1.95, β = .28, t = 4.37, p < .001,
95% CI [4.66, 12.38]) were significant predictors, while attendance was
not (B = 0.15, SE = 0.12, β = .08, t = 1.25, p = .21, 95% CI [-0.09, 0.39]).
Multicollinearity was not a concern (all VIF < 1.5).
Bayesian Analysis
A Bayesian independent samples t-test was conducted using weakly
informative priors (Normal(0, 1) for mean difference). The posterior
distribution indicated that Group A scored higher than Group B
(M_diff = 6.8, 95% credible interval [3.2, 10.4]). The Bayes Factor
BF₁₀ = 45.3 provided very strong evidence for a difference between
groups, with a 99.8% posterior probability that Group A's mean exceeded
Group B's mean. Convergence diagnostics were satisfactory (all R̂ < 1.01,
ESS > 1000).
Bayesian Statistics
When to Use Bayesian Methods
Consider Bayesian approaches when:
- You have prior information to incorporate
- You want direct probability statements about hypotheses
- The sample size is small or you plan sequential data collection
- You need to quantify evidence for the null hypothesis
- The model is complex (hierarchical, missing data)
See `references/bayesian_statistics.md` for comprehensive guidance on:
- Bayes' theorem and interpretation
- Prior specification (informative, weakly informative, non-informative)
- Bayesian hypothesis testing with Bayes Factors
- Credible intervals vs. confidence intervals
- Bayesian t-tests, ANOVA, regression, and hierarchical models
- Model convergence checking and posterior predictive checks
Key Advantages
- Intuitive interpretation: "Given the data, there is a 95% probability the parameter is in this interval"
- Evidence for null: Can quantify support for no effect
- Flexible: Inference does not depend on a fixed sampling plan, so data can be analyzed as it arrives; selective reporting still has to be avoided
- Uncertainty quantification: Full posterior distribution
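To make the interval interpretation above concrete, here is a minimal sketch using a conjugate Beta-Binomial model, where the posterior is available in closed form and no sampling is needed. The data (14 successes in 20 trials) and the uniform Beta(1, 1) prior are assumptions chosen for illustration.

```python
from scipy import stats

# Hypothetical data and a uniform prior on the success probability
successes, trials = 14, 20
prior_a, prior_b = 1, 1

# Conjugate update: Beta(a + successes, b + failures)
posterior = stats.beta(prior_a + successes, prior_b + trials - successes)

# Equal-tailed 95% credible interval and a direct probability statement
ci_low, ci_high = posterior.interval(0.95)
prob_above_half = posterior.sf(0.5)  # P(theta > 0.5 | data)

print(f"Posterior mean = {posterior.mean():.3f}")
print(f"95% credible interval [{ci_low:.3f}, {ci_high:.3f}]")
print(f"P(θ > 0.5 | data) = {prob_above_half:.3f}")
```

Given the model and the prior, the interval can be read directly as containing the parameter with 95% probability, which a frequentist confidence interval does not license.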
Resources
This skill includes comprehensive reference materials:
References Directory
- test_selection_guide.md: Decision tree for choosing appropriate statistical tests
- assumptions_and_diagnostics.md: Detailed guidance on checking and handling assumption violations
- effect_sizes_and_power.md: Calculating, interpreting, and reporting effect sizes; conducting power analyses
- bayesian_statistics.md: Complete guide to Bayesian analysis methods
- reporting_standards.md: APA-style reporting guidelines with examples
Scripts Directory
- assumption_checks.py: Automated assumption checking with visualizations
  - `comprehensive_assumption_check()`: Complete workflow
  - `check_normality()`: Normality testing with Q-Q plots
  - `check_homogeneity_of_variance()`: Levene's test with box plots
  - `check_linearity()`: Regression linearity checks
  - `detect_outliers()`: IQR and z-score outlier detection
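The script itself is not reproduced in this document. As a rough idea of what IQR and z-score outlier detection involve, the following sketch implements both rules with NumPy. It is not the implementation in `assumption_checks.py`; the function names, thresholds, and data are assumptions.

```python
import numpy as np


def iqr_outliers(values, k=1.5):
    """Flag points beyond k * IQR from the first or third quartile."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def zscore_outliers(values, threshold=3.0):
    """Flag points whose absolute z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std(ddof=1)
    return np.abs(z) > threshold


scores = [52, 55, 49, 51, 53, 50, 54, 48, 52, 95]
print("IQR rule flags:    ", np.array(scores)[iqr_outliers(scores)])
print("z-score rule flags:", np.array(scores)[zscore_outliers(scores)])
```

With only ten observations the z-score rule cannot flag the value 95, because the extreme point inflates the standard deviation, while the IQR rule does flag it. This is one reason to run both methods and inspect the data visually.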
Best Practices
- Pre-register analyses when possible to distinguish confirmatory from exploratory
- Always check assumptions before interpreting results
- Report effect sizes with confidence intervals
- Report all planned analyses including non-significant results
- Distinguish statistical from practical significance
- Visualize data before and after analysis
- Check diagnostics for regression/ANOVA (residual plots, VIF, etc.)
- Conduct sensitivity analyses to assess robustness
- Share data and code for reproducibility
- Be transparent about violations, transformations, and decisions
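The distinction between statistical and practical significance listed above can be illustrated numerically. The sketch below converts a standardized mean difference into a two-sample t statistic for equal group sizes, t = d·√(n/2), and shows that a negligible effect becomes "significant" with a huge sample while a medium effect does not with a small one. The helper name and the scenarios are assumptions for illustration.

```python
import math
from scipy import stats


def two_sample_p_from_d(d: float, n_per_group: int):
    """Two-sided p-value for an independent t-test with equal group sizes."""
    t_stat = d * math.sqrt(n_per_group / 2)
    dof = 2 * n_per_group - 2
    p = 2 * stats.t.sf(abs(t_stat), dof)
    return t_stat, p


# Negligible effect, very large sample
t_small, p_small = two_sample_p_from_d(d=0.02, n_per_group=200_000)
# Medium effect, very small sample
t_large, p_large = two_sample_p_from_d(d=0.50, n_per_group=10)

print(f"d = 0.02, n = 200000 per group: t = {t_small:.2f}, p = {p_small:.2g}")
print(f"d = 0.50, n = 10 per group:     t = {t_large:.2f}, p = {p_large:.3f}")
```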
Common Pitfalls to Avoid
- P-hacking: Don't test multiple ways until something is significant
- HARKing: Don't present exploratory findings as confirmatory
- Ignoring assumptions: Check them and report violations
- Confusing significance with importance: p < .05 ≠ meaningful effect
- Not reporting effect sizes: Essential for interpretation
- Cherry-picking results: Report all planned analyses
- Misinterpreting p-values: They are NOT the probability that a hypothesis is true
- Multiple comparisons: Correct for family-wise error when appropriate
- Ignoring missing data: Understand mechanism (MCAR, MAR, MNAR)
- Overinterpreting non-significant results: Absence of evidence ≠ evidence of absence
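For the multiple-comparisons pitfall above, one common family-wise correction is the Holm step-down procedure. The following is a plain-Python sketch with made-up p-values; in practice `statsmodels.stats.multitest.multipletests(..., method='holm')` provides the same adjustment.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the rank-th smallest p-value by (m - rank), cap at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted p-values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print([round(p, 3) for p in adjusted])
```

A comparison is declared significant when its adjusted p-value falls below the chosen alpha.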
Getting Started Checklist
When beginning a statistical analysis:
- Define research question and hypotheses
- Determine appropriate statistical test (use test_selection_guide.md)
- Conduct power analysis to determine sample size
- Load and inspect data
- Check for missing data and outliers
- Verify assumptions using assumption_checks.py
- Run primary analysis
- Calculate effect sizes with confidence intervals
- Conduct post-hoc tests if needed (with corrections)
- Create visualizations
- Write results following reporting_standards.md
- Conduct sensitivity analyses
- Share data and code
Support and Further Reading
For questions about:
- Test selection: See references/test_selection_guide.md
- Assumptions: See references/assumptions_and_diagnostics.md
- Effect sizes: See references/effect_sizes_and_power.md
- Bayesian methods: See references/bayesian_statistics.md
- Reporting: See references/reporting_standards.md
Key textbooks:
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences
- Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics
- Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models
- Kruschke, J. K. (2014). Doing Bayesian Data Analysis
Online resources:
- APA Style Guide: https://apastyle.apa.org/
- Statistical Consulting: Cross Validated (stats.stackexchange.com)