Statistical Analysis

Overview

Statistical analysis is a systematic process for testing hypotheses and quantifying relationships. Conduct hypothesis tests (t-test, ANOVA, chi-square), regression, correlation, and Bayesian analyses with assumption checks and APA reporting. Apply this skill for academic research.

When to Use This Skill

This skill should be used when:

Conducting statistical hypothesis tests (t-tests, ANOVA, chi-square)
Performing regression or correlation analyses
Running Bayesian statistical analyses
Checking statistical assumptions and diagnostics
Calculating effect sizes and conducting power analyses
Reporting statistical results in APA format
Analyzing experimental or observational data for research

Core Capabilities

1. Test Selection and Planning

Choose appropriate statistical tests based on research questions and data characteristics
Conduct a priori power analyses to determine required sample sizes
Plan analysis strategies including multiple comparison corrections

2. Assumption Checking

Automatically verify all relevant assumptions before running tests
Provide diagnostic visualizations (Q-Q plots, residual plots, box plots)
Recommend remedial actions when assumptions are violated

3. Statistical Testing

Hypothesis testing: t-tests, ANOVA, chi-square, non-parametric alternatives
Regression: linear, multiple, logistic, with diagnostics
Correlations: Pearson, Spearman, with confidence intervals
Bayesian alternatives: Bayesian t-tests, ANOVA, regression with Bayes Factors

4. Effect Sizes and Interpretation

Calculate and interpret appropriate effect sizes for all analyses
Provide confidence intervals for effect estimates
Distinguish statistical from practical significance

5. Professional Reporting

Generate APA-style statistical reports
Create publication-ready figures and tables
Provide complete interpretation with all required statistics

Workflow Decision Tree

Use this decision tree to determine your analysis path:

START
│
├─ Need to SELECT a statistical test?
│  ├─ YES → See "Test Selection Guide"
│  └─ NO → Continue
│
├─ Ready to check ASSUMPTIONS?
│  ├─ YES → See "Assumption Checking"
│  └─ NO → Continue
│
├─ Ready to run ANALYSIS?
│  ├─ YES → See "Running Statistical Tests"
│  └─ NO → Continue
│
└─ Need to REPORT results?
   └─ YES → See "Reporting Results"

Test Selection Guide

Quick Reference: Choosing the Right Test

Use `references/test_selection_guide.md` for comprehensive guidance. Quick reference:

Comparing Two Groups:

Independent, continuous, normal → Independent t-test
Independent, continuous, non-normal → Mann-Whitney U test
Paired, continuous, normal → Paired t-test
Paired, continuous, non-normal → Wilcoxon signed-rank test
Binary outcome → Chi-square or Fisher's exact test

Comparing 3+ Groups:

Independent, continuous, normal → One-way ANOVA
Independent, continuous, non-normal → Kruskal-Wallis test
Paired, continuous, normal → Repeated measures ANOVA
Paired, continuous, non-normal → Friedman test

Relationships:

Two continuous variables → Pearson (normal) or Spearman correlation (non-normal)
Continuous outcome with predictor(s) → Linear regression
Binary outcome with predictor(s) → Logistic regression
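
The quick-reference rules for continuous outcomes can be sketched as a small lookup. This is only an illustration: `suggest_test` and its three arguments are hypothetical names that are not part of this skill, and the sketch ignores binary outcomes and the relationship analyses listed above.

```python
def suggest_test(n_groups: int, paired: bool, normal: bool) -> str:
    """Return the test named in the quick reference for a continuous outcome."""
    if n_groups < 2:
        raise ValueError("n_groups must be at least 2")

    # Keys are (paired, normal)
    two_groups = {
        (False, True): "Independent t-test",
        (False, False): "Mann-Whitney U test",
        (True, True): "Paired t-test",
        (True, False): "Wilcoxon signed-rank test",
    }
    three_or_more = {
        (False, True): "One-way ANOVA",
        (False, False): "Kruskal-Wallis test",
        (True, True): "Repeated measures ANOVA",
        (True, False): "Friedman test",
    }
    table = two_groups if n_groups == 2 else three_or_more
    return table[(paired, normal)]


print(suggest_test(n_groups=2, paired=False, normal=False))
print(suggest_test(n_groups=3, paired=True, normal=True))
```

A lookup like this only encodes the defaults; whether normality "holds" is itself a judgment that depends on the assumption checks described later.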

Bayesian Alternatives: The tests above have Bayesian counterparts that provide:

Direct probability statements about hypotheses
Bayes Factors quantifying evidence
Ability to quantify support for the null hypothesis

See `references/bayesian_statistics.md`.

Assumption Checking

Systematic Assumption Verification

ALWAYS check assumptions before interpreting test results.

Use the provided `scripts/assumption_checks.py` module for automated checking:

```python
from scripts.assumption_checks import comprehensive_assumption_check

# Comprehensive check with visualizations
results = comprehensive_assumption_check(
    data=df,
    value_col='score',
    group_col='group',  # Optional: for group comparisons
    alpha=0.05
)
```


This performs:
1. **Outlier detection** (IQR and z-score methods)
2. **Normality testing** (Shapiro-Wilk test + Q-Q plots)
3. **Homogeneity of variance** (Levene's test + box plots)
4. **Interpretation and recommendations**
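
As a rough illustration of what these checks involve, the sketch below runs comparable checks directly with `scipy.stats` on simulated data. It is not the implementation in `scripts/assumption_checks.py`; the helper `iqr_outliers`, the simulated groups, and the 1.5 × IQR rule are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

# Simulated scores for two hypothetical groups (illustration only)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=75, scale=8, size=40)
group_b = rng.normal(loc=70, scale=8, size=40)


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return values[mask]


for name, values in (("A", group_a), ("B", group_b)):
    # 1. Outlier detection with the IQR rule
    n_out = len(iqr_outliers(values))
    # 2. Normality with the Shapiro-Wilk test
    w_stat, w_p = stats.shapiro(values)
    print(f"Group {name}: {n_out} IQR outlier(s), W = {w_stat:.3f}, p = {w_p:.3f}")

# 3. Homogeneity of variance with Levene's test (median-centered)
levene_stat, levene_p = stats.levene(group_a, group_b, center="median")
print(f"Levene: F = {levene_stat:.3f}, p = {levene_p:.3f}")
```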

Individual Assumption Checks

For targeted checks, use individual functions:

```python
from scripts.assumption_checks import (
    check_normality,
    check_normality_per_group,
    check_homogeneity_of_variance,
    check_linearity,
    detect_outliers
)

# Example: Check normality with visualization
result = check_normality(
    data=df['score'],
    name='Test Score',
    alpha=0.05,
    plot=True
)
print(result['interpretation'])
print(result['recommendation'])
```

What to Do When Assumptions Are Violated

Normality violated:

Mild violation + n > 30 per group → Proceed with parametric test (robust)
Moderate violation → Use non-parametric alternative
Severe violation → Transform data or use non-parametric test

Homogeneity of variance violated:

For t-test → Use Welch's t-test
For ANOVA → Use Welch's ANOVA or Brown-Forsythe ANOVA
For regression → Use robust standard errors or weighted least squares

Linearity violated (regression):

Add polynomial terms
Transform variables
Use non-linear models or GAM
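
For the two-group case, the remedies above map onto standard `scipy.stats` calls. The following is a minimal sketch on simulated data with deliberately unequal variances; the group sizes, means, and spreads are made up for illustration.

```python
import numpy as np
from scipy import stats

# Simulated groups with unequal variances and sizes (illustration only)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=75, scale=5, size=30)
group_b = rng.normal(loc=70, scale=15, size=45)

# Student's t-test assumes equal variances
student = stats.ttest_ind(group_a, group_b, equal_var=True)

# Welch's t-test drops the equal-variance assumption
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Mann-Whitney U is the rank-based alternative when normality is doubtful
mwu = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"Student: t = {student.statistic:.2f}, p = {student.pvalue:.3f}")
print(f"Welch:   t = {welch.statistic:.2f}, p = {welch.pvalue:.3f}")
print(f"Mann-Whitney: U = {mwu.statistic:.1f}, p = {mwu.pvalue:.3f}")
```

With unequal group sizes and variances the Student and Welch statistics diverge, which is why Welch's version is the safer choice when Levene's test flags a problem.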

See `references/assumptions_and_diagnostics.md` for comprehensive guidance.

Running Statistical Tests

Python Libraries

Primary libraries for statistical analysis:

scipy.stats: Core statistical tests
statsmodels: Advanced regression and diagnostics
pingouin: User-friendly statistical testing with effect sizes
pymc: Bayesian statistical modeling
arviz: Bayesian visualization and diagnostics

Example Analyses

T-Test with Complete Reporting

```python
import pingouin as pg
import numpy as np

# Run independent t-test
result = pg.ttest(group_a, group_b, correction='auto')

# Extract results (named dof to avoid shadowing a DataFrame called df)
t_stat = result['T'].values[0]
dof = result['dof'].values[0]
p_value = result['p-val'].values[0]
cohens_d = result['cohen-d'].values[0]
# Note: pingouin's 'CI95%' column is the CI of the mean difference, not of Cohen's d
ci_lower = result['CI95%'].values[0][0]
ci_upper = result['CI95%'].values[0][1]

# Report
print(f"t({dof:.0f}) = {t_stat:.2f}, p = {p_value:.3f}, Cohen's d = {cohens_d:.2f}")
print(f"Mean difference 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")
```

ANOVA with Post-Hoc Tests

```python
import pingouin as pg

# One-way ANOVA
aov = pg.anova(dv='score', between='group', data=df, detailed=True)
print(aov)

# If significant, conduct post-hoc tests
if aov['p-unc'].values[0] < 0.05:
    posthoc = pg.pairwise_tukey(dv='score', between='group', data=df)
    print(posthoc)

# Effect size
eta_squared = aov['np2'].values[0]  # Partial eta-squared
print(f"Partial η² = {eta_squared:.3f}")
```

Linear Regression with Diagnostics

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Fit model
X = sm.add_constant(X_predictors)  # Add intercept
model = sm.OLS(y, X).fit()

# Summary
print(model.summary())

# Check multicollinearity (VIF)
vif_data = pd.DataFrame()
vif_data["Variable"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
print(vif_data)

# Check assumptions
residuals = model.resid
fitted = model.fittedvalues

# Residual plots
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Residuals vs fitted
axes[0, 0].scatter(fitted, residuals, alpha=0.6)
axes[0, 0].axhline(y=0, color='r', linestyle='--')
axes[0, 0].set_xlabel('Fitted values')
axes[0, 0].set_ylabel('Residuals')
axes[0, 0].set_title('Residuals vs Fitted')

# Q-Q plot
stats.probplot(residuals, dist="norm", plot=axes[0, 1])
axes[0, 1].set_title('Normal Q-Q')

# Scale-Location
axes[1, 0].scatter(fitted, np.sqrt(np.abs(residuals / residuals.std())), alpha=0.6)
axes[1, 0].set_xlabel('Fitted values')
axes[1, 0].set_ylabel('√|Standardized residuals|')
axes[1, 0].set_title('Scale-Location')

# Residuals histogram
axes[1, 1].hist(residuals, bins=20, edgecolor='black', alpha=0.7)
axes[1, 1].set_xlabel('Residuals')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].set_title('Histogram of Residuals')

plt.tight_layout()
plt.show()
```

Bayesian T-Test

```python
import pymc as pm
import arviz as az
import numpy as np

with pm.Model() as model:
    # Priors
    mu1 = pm.Normal('mu_group1', mu=0, sigma=10)
    mu2 = pm.Normal('mu_group2', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=10)

    # Likelihood
    y1 = pm.Normal('y1', mu=mu1, sigma=sigma, observed=group_a)
    y2 = pm.Normal('y2', mu=mu2, sigma=sigma, observed=group_b)

    # Derived quantity
    diff = pm.Deterministic('difference', mu1 - mu2)

    # Sample
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

# Summarize
print(az.summary(trace, var_names=['difference']))

# Probability that group1 > group2
prob_greater = np.mean(trace.posterior['difference'].values > 0)
print(f"P(μ₁ > μ₂ | data) = {prob_greater:.3f}")

# Plot posterior
az.plot_posterior(trace, var_names=['difference'], ref_val=0)
```

---

Effect Sizes

Always Calculate Effect Sizes

Effect sizes quantify the magnitude of an effect, whereas p-values only indicate how compatible the data are with the absence of an effect.

See `references/effect_sizes_and_power.md` for comprehensive guidance.

Quick Reference: Common Effect Sizes

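| Test | Effect Size | Small | Medium | Large |
| --- | --- | --- | --- | --- |
| T-test | Cohen's d | 0.20 | 0.50 | 0.80 |
| ANOVA | η²_p | 0.01 | 0.06 | 0.14 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | R² | 0.02 | 0.13 | 0.26 |
| Chi-square | Cramér's V | 0.07 | 0.21 | 0.35 |

Cramér's V benchmarks depend on table dimensions; the values shown apply when min(rows, columns) − 1 = 2.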

Important: Benchmarks are guidelines. Context matters!
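
To make the Cohen's d row concrete, here is a minimal sketch that computes d from summary statistics using the pooled standard deviation and labels it against the benchmarks in the table. The function names and the input numbers are hypothetical, chosen only for illustration.

```python
import math


def cohens_d_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def label_effect(d):
    """Label |d| using the small/medium/large benchmarks from the table."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


# Hypothetical summary statistics for two groups of 30
d = cohens_d_from_summary(m1=80.0, sd1=10.0, n1=30, m2=75.0, sd2=10.0, n2=30)
print(f"d = {d:.2f} ({label_effect(d)})")
```

The label is a convenience only; as noted above, whether an effect of a given size matters depends on the research context.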

Calculating Effect Sizes

Most effect sizes are automatically calculated by pingouin:

```python
import pingouin as pg

# T-test returns Cohen's d
result = pg.ttest(x, y)
d = result['cohen-d'].values[0]

# ANOVA returns partial eta-squared
aov = pg.anova(dv='score', between='group', data=df)
eta_p2 = aov['np2'].values[0]

# Correlation: r is already an effect size
corr = pg.corr(x, y)
r = corr['r'].values[0]

# For t-test: convert a t statistic to Cohen's d, then compute its 95% CI.
# compute_effsize_from_t returns only d, so the interval comes from compute_esci.
d = pg.compute_effsize_from_t(
    t_statistic,
    nx=len(group1),
    ny=len(group2),
    eftype='cohen'
)
ci = pg.compute_esci(stat=d, nx=len(group1), ny=len(group2), eftype='cohen', confidence=0.95)
print(f"d = {d:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

---

Power Analysis

A Priori Power Analysis (Study Planning)

Determine required sample size before data collection:

```python
import math

from statsmodels.stats.power import (
    tt_ind_solve_power,
    FTestAnovaPower
)

# T-test: What n is needed to detect d = 0.5?
n_required = tt_ind_solve_power(
    effect_size=0.5,
    alpha=0.05,
    power=0.80,
    ratio=1.0,
    alternative='two-sided'
)
print(f"Required n per group: {math.ceil(n_required)}")

# ANOVA: What n is needed to detect f = 0.25?
# solve_power takes k_groups (not ngroups) and returns the TOTAL sample size
anova_power = FTestAnovaPower()
n_total = anova_power.solve_power(
    effect_size=0.25,
    k_groups=3,
    alpha=0.05,
    power=0.80
)
print(f"Required n per group: {math.ceil(n_total / 3)}")
```

Sensitivity Analysis (Post-Study)

Determine what effect size you could detect:

```python
from statsmodels.stats.power import tt_ind_solve_power

# With n=50 per group, what effect could we detect?
detectable_d = tt_ind_solve_power(
    effect_size=None,  # Solve for this
    nobs1=50,
    alpha=0.05,
    power=0.80,
    ratio=1.0,
    alternative='two-sided'
)
print(f"Study could detect d ≥ {detectable_d:.2f}")
```


**Note**: Post-hoc power analysis (calculating power after study) is generally not recommended. Use sensitivity analysis instead.

See `references/effect_sizes_and_power.md` for detailed guidance.
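
ANOVA power calculations expect Cohen's f, while ANOVA output usually reports η². As a hedged sketch, the conversion f = √(η² / (1 − η²)) can be applied before solving for sample size. The helper `eta_squared_to_f` is a hypothetical name, and the η² = 0.06 input is simply the "medium" benchmark used for illustration. Note that `FTestAnovaPower.solve_power` solves for the total number of observations across all groups.

```python
import math

from statsmodels.stats.power import FTestAnovaPower


def eta_squared_to_f(eta_sq):
    """Convert eta-squared to Cohen's f."""
    if not 0 <= eta_sq < 1:
        raise ValueError("eta_sq must be in [0, 1)")
    return math.sqrt(eta_sq / (1 - eta_sq))


k_groups = 3
f = eta_squared_to_f(0.06)  # "medium" benchmark, for illustration

# Total N (all groups combined) for 80% power at alpha = .05
total_n = FTestAnovaPower().solve_power(
    effect_size=f, alpha=0.05, power=0.80, k_groups=k_groups
)
print(f"f = {f:.3f}")
print(f"Total N: {math.ceil(total_n)}, per group: {math.ceil(total_n / k_groups)}")
```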

---


Reporting Results

APA Style Statistical Reporting

Follow guidelines in `references/reporting_standards.md`.

Essential Reporting Elements

Descriptive statistics: M, SD, n for all groups/variables
Test statistics: Test name, statistic, df, exact p-value
Effect sizes: With confidence intervals
Assumption checks: Which tests were done, results, actions taken
All planned analyses: Including non-significant findings
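
The test-statistic and effect-size elements above can be assembled with a small formatting helper. This is a sketch only: `format_p` and `format_t_test` are hypothetical helpers, not part of this skill, and the three-decimal p-value convention is an assumption.

```python
def format_p(p: float) -> str:
    """Format a p-value APA-style: no leading zero, 'p < .001' for tiny values."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_t_test(t, dof, p, d, ci_low, ci_high):
    """Assemble the statistics portion of an independent t-test report."""
    return (
        f"t({dof:.0f}) = {t:.2f}, {format_p(p)}, "
        f"d = {d:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]"
    )


print(format_t_test(t=3.82, dof=98, p=0.0002, d=0.77, ci_low=0.36, ci_high=1.18))
```

Descriptive statistics and assumption checks still need to be written around this string, as the templates in this section show.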

Example Report Templates

Independent T-Test

Group A (n = 48, M = 75.2, SD = 8.5) scored significantly higher than
Group B (n = 52, M = 68.3, SD = 9.2), t(98) = 3.82, p < .001, d = 0.77,
95% CI [0.36, 1.18], two-tailed. Assumptions of normality (Shapiro-Wilk:
Group A W = 0.97, p = .18; Group B W = 0.96, p = .12) and homogeneity
of variance (Levene's F(1, 98) = 1.23, p = .27) were satisfied.

One-Way ANOVA

A one-way ANOVA revealed a significant main effect of treatment condition
on test scores, F(2, 147) = 8.45, p < .001, η²_p = .10. Post hoc
comparisons using Tukey's HSD indicated that Condition A (M = 78.2,
SD = 7.3) scored significantly higher than Condition B (M = 71.5,
SD = 8.1, p = .002, d = 0.87) and Condition C (M = 70.1, SD = 7.9,
p < .001, d = 1.07). Conditions B and C did not differ significantly
(p = .52, d = 0.18).

Multiple Regression

Multiple linear regression was conducted to predict exam scores from
study hours, prior GPA, and attendance. The overall model was significant,
F(3, 146) = 45.2, p < .001, R² = .48, adjusted R² = .47. Study hours
(B = 1.80, SE = 0.31, β = .35, t = 5.78, p < .001, 95% CI [1.18, 2.42])
and prior GPA (B = 8.52, SE = 1.95, β = .28, t = 4.37, p < .001,
95% CI [4.66, 12.38]) were significant predictors, while attendance was
not (B = 0.15, SE = 0.12, β = .08, t = 1.25, p = .21, 95% CI [-0.09, 0.39]).
Multicollinearity was not a concern (all VIF < 1.5).

Bayesian Analysis

A Bayesian independent samples t-test was conducted using weakly
informative priors (Normal(0, 1) for mean difference). The posterior
distribution indicated that Group A scored higher than Group B
(M_diff = 6.8, 95% credible interval [3.2, 10.4]). The Bayes Factor
BF₁₀ = 45.3 provided very strong evidence for a difference between
groups, with a 99.8% posterior probability that Group A's mean exceeded
Group B's mean. Convergence diagnostics were satisfactory (all R̂ < 1.01,
ESS > 1000).

Bayesian Statistics

When to Use Bayesian Methods

Consider Bayesian approaches when:

You have prior information to incorporate
You want direct probability statements about hypotheses
Sample size is small or you plan sequential data collection
You need to quantify evidence for the null hypothesis
The model is complex (hierarchical, missing data)

See `references/bayesian_statistics.md` for comprehensive guidance on:

Bayes' theorem and interpretation
Prior specification (informative, weakly informative, non-informative)
Bayesian hypothesis testing with Bayes Factors
Credible intervals vs. confidence intervals
Bayesian t-tests, ANOVA, regression, and hierarchical models
Model convergence checking and posterior predictive checks

Key Advantages

Intuitive interpretation: "Given the data, there is a 95% probability the parameter is in this interval"
Evidence for null: Can quantify support for no effect
Flexible: Posterior inference does not depend on a fixed stopping rule, so data can be analyzed as it arrives; selective reporting can still bias conclusions
Uncertainty quantification: Full posterior distribution
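
The "intuitive interpretation" point can be illustrated with the simplest conjugate case, a Beta-Binomial model, where the posterior is available in closed form. The sketch below assumes a flat Beta(1, 1) prior and made-up data (18 successes in 25 trials); it is not the PyMC workflow shown earlier, just a small worked example.

```python
from scipy import stats

# Hypothetical data: 18 successes in 25 trials
successes, trials = 18, 25

# Flat Beta(1, 1) prior + binomial likelihood -> Beta posterior
posterior = stats.beta(1 + successes, 1 + trials - successes)

# 95% equal-tailed credible interval for the success probability
ci_low, ci_high = posterior.ppf([0.025, 0.975])

# Direct probability statement: P(rate > 0.5 | data)
prob_above_half = posterior.sf(0.5)

print(f"95% credible interval: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"P(rate > 0.5 | data) = {prob_above_half:.3f}")
```

Under this model and prior, the interval can be read as "there is a 95% probability the rate lies in this range", which is the statement a confidence interval does not license.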

Resources

This skill includes comprehensive reference materials:

References Directory

test_selection_guide.md: Decision tree for choosing appropriate statistical tests
assumptions_and_diagnostics.md: Detailed guidance on checking and handling assumption violations
effect_sizes_and_power.md: Calculating, interpreting, and reporting effect sizes; conducting power analyses
bayesian_statistics.md: Complete guide to Bayesian analysis methods
reporting_standards.md: APA-style reporting guidelines with examples

Scripts Directory

assumption_checks.py: Automated assumption checking with visualizations
- `comprehensive_assumption_check()`: Complete workflow
- `check_normality()`: Normality testing with Q-Q plots
- `check_homogeneity_of_variance()`: Levene's test with box plots
- `check_linearity()`: Regression linearity checks
- `detect_outliers()`: IQR and z-score outlier detection

Best Practices

Pre-register analyses when possible to distinguish confirmatory from exploratory analyses
Always check assumptions before interpreting results
Report effect sizes with confidence intervals
Report all planned analyses including non-significant results
Distinguish statistical from practical significance
Visualize data before and after analysis
Check diagnostics for regression/ANOVA (residual plots, VIF, etc.)
Conduct sensitivity analyses to assess robustness
Share data and code for reproducibility
Be transparent about violations, transformations, and decisions

Common Pitfalls to Avoid

P-hacking: Don't test multiple ways until something is significant
HARKing: Don't present exploratory findings as confirmatory
Ignoring assumptions: Check them and report violations
Confusing significance with importance: p < .05 ≠ meaningful effect
Not reporting effect sizes: Essential for interpretation
Cherry-picking results: Report all planned analyses
Misinterpreting p-values: They are NOT the probability that a hypothesis is true
Multiple comparisons: Correct for family-wise error when appropriate
Ignoring missing data: Understand mechanism (MCAR, MAR, MNAR)
Overinterpreting non-significant results: Absence of evidence ≠ evidence of absence
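
For the multiple-comparisons pitfall, one common family-wise correction is Holm's step-down procedure, available through `statsmodels`. The sketch below uses made-up p-values; Holm is one reasonable choice here, not the only appropriate method.

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical unadjusted p-values from five planned comparisons
p_values = [0.001, 0.012, 0.030, 0.045, 0.200]

# Holm's step-down correction controls the family-wise error rate
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")

for p_raw, p_adj, is_sig in zip(p_values, p_adjusted, reject):
    print(f"raw p = {p_raw:.3f}, Holm-adjusted p = {p_adj:.3f}, reject H0: {is_sig}")
```

Every raw p-value here is below .05 except the last, yet only the two smallest survive the correction, which is the point of adjusting.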

Getting Started Checklist

Support and Further Reading

For questions about:

Test selection: See references/test_selection_guide.md
Assumptions: See references/assumptions_and_diagnostics.md
Effect sizes: See references/effect_sizes_and_power.md
Bayesian methods: See references/bayesian_statistics.md
Reporting: See references/reporting_standards.md

Key textbooks:

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences
Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics
Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models
Kruschke, J. K. (2014). Doing Bayesian Data Analysis

Online resources:

APA Style Guide: https://apastyle.apa.org/
Statistical Consulting: Cross Validated (stats.stackexchange.com)
