Loading...
Loading...
Perform statistical modeling and regression analysis on biomedical datasets. Supports linear regression, logistic regression (binary/ordinal/multinomial), mixed-effects models, Cox proportional hazards survival analysis, Kaplan-Meier estimation, and comprehensive model diagnostics. Extracts odds ratios, hazard ratios, confidence intervals, p-values, and effect sizes. Designed to solve BixBench statistical reasoning questions involving clinical/experimental data. Use when asked to fit regression models, compute odds ratios, perform survival analysis, run statistical tests, or interpret model coefficients from provided data.
npx skill4agent add mims-harvard/tooluniverse tooluniverse-statistical-modelingimport statsmodels.formula.api as smf
import numpy as np
# Fit logistic regression
model = smf.logit('disease ~ exposure + age + sex', data=df).fit(disp=0)
# Extract odds ratios
odds_ratios = np.exp(model.params)
conf_int = np.exp(model.conf_int())
print(f"Odds Ratio for exposure: {odds_ratios['exposure']:.4f}")
print(f"95% CI: ({conf_int.loc['exposure', 0]:.4f}, {conf_int.loc['exposure', 1]:.4f})")
print(f"P-value: {model.pvalues['exposure']:.6f}")from lifelines import CoxPHFitter
# Fit Cox model
cph = CoxPHFitter()
cph.fit(df[['time', 'event', 'treatment', 'age', 'stage']],
duration_col='time', event_col='event')
# Get hazard ratio
hr = cph.hazard_ratios_['treatment']
print(f"Hazard Ratio: {hr:.4f}")
print(f"Concordance: {cph.concordance_index_:.4f}")QUICK_START.mdSTART: What type of outcome variable?
│
├─ CONTINUOUS (height, blood pressure, score)
│ ├─ Independent observations → Linear Regression (OLS)
│ ├─ Repeated measures → Mixed-Effects Model (LMM)
│ └─ Count data → Poisson/Negative Binomial
│
├─ BINARY (yes/no, disease/healthy)
│ ├─ Independent observations → Logistic Regression
│ ├─ Repeated measures → Logistic Mixed-Effects (GLMM/GEE)
│ └─ Rare events → Firth logistic regression
│
├─ ORDINAL (mild/moderate/severe, stages I/II/III/IV)
│ └─ Ordinal Logistic Regression (Proportional Odds)
│
├─ MULTINOMIAL (>2 unordered categories)
│ └─ Multinomial Logistic Regression
│
└─ TIME-TO-EVENT (survival time + censoring)
├─ Regression → Cox Proportional Hazards
└─ Survival curves → Kaplan-Meierprint(df.columns.tolist())import pandas as pd
import numpy as np
# Load data
df = pd.read_csv('data.csv')
# Check structure
print(f"Observations: {len(df)}")
print(f"Variables: {len(df.columns)}")
print(f"Missing: {df.isnull().sum().sum()}")
# Detect variable types
for col in df.columns:
n_unique = df[col].nunique()
if n_unique == 2:
print(f"{col}: binary")
elif n_unique <= 10 and df[col].dtype == 'object':
print(f"{col}: categorical ({n_unique} levels)")
elif df[col].dtype in ['float64', 'int64']:
print(f"{col}: continuous (mean={df[col].mean():.2f})")import statsmodels.formula.api as smf
# R-style formula (recommended)
model = smf.ols('outcome ~ predictor1 + predictor2 + age', data=df).fit()
# Results
print(f"R-squared: {model.rsquared:.4f}")
print(f"AIC: {model.aic:.2f}")
print(model.summary())# Fit model
model = smf.logit('disease ~ exposure + age + sex', data=df).fit(disp=0)
# Odds ratios
ors = np.exp(model.params)
ci = np.exp(model.conf_int())
for var in ['exposure', 'age', 'sex_M']:
print(f"{var}: OR={ors[var]:.4f}, CI=({ci.loc[var, 0]:.4f}, {ci.loc[var, 1]:.4f})")from statsmodels.miscmodels.ordinal_model import OrderedModel
# Prepare ordered outcome
severity_order = ['Mild', 'Moderate', 'Severe']
df['severity'] = pd.Categorical(df['severity'], categories=severity_order, ordered=True)
y = df['severity'].cat.codes
# Fit model
X = pd.get_dummies(df[['exposure', 'age', 'sex']], drop_first=True, dtype=float)
model = OrderedModel(y, X, distr='logit').fit(method='bfgs', disp=0)
# Odds ratios
ors = np.exp(model.params[:len(X.columns)])
print(f"Exposure OR: {ors[0]:.4f}")from lifelines import CoxPHFitter
# Fit model
cph = CoxPHFitter()
cph.fit(df[['time', 'event', 'treatment', 'age']],
duration_col='time', event_col='event')
# Hazard ratios
print(f"HR (treatment): {cph.hazard_ratios_['treatment']:.4f}")
print(f"Concordance: {cph.concordance_index_:.4f}")references/from scipy import stats
# Single ANOVA (one outcome, multiple groups)
group1 = df[df['celltype'] == 'CD4']['expression']
group2 = df[df['celltype'] == 'CD8']['expression']
group3 = df[df['celltype'] == 'CD14']['expression']
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F-statistic: {f_stat:.4f}, p-value: {p_value:.6f}")Question: "What is the F-statistic comparing [feature] expression across groups?"
DECISION TREE:
│
├─ Does question specify "the F-statistic" (singular)?
│ │
│ ├─ YES, singular → Likely asking for SPECIFIC FEATURE(S) F-statistic
│ │ │
│ │ ├─ Are there thousands of features (genes, miRNAs)?
│ │ │ YES → Per-feature approach (Method B below)
│ │ │
│ │ └─ Is there one feature of interest?
│ │ YES → Single feature ANOVA (Method A below)
│ │
│ └─ NO, asks about "all features" or "genes" (plural)?
│ YES → Aggregate approach or per-feature summary
│
└─ When unsure: Calculate PER-FEATURE and report summary statistics# Flatten all features across all samples per group
groups_agg = []
for celltype in ['CD4', 'CD8', 'CD14']:
samples = df[df['celltype'] == celltype]
# Flatten: all features × all samples in this group
all_values = expression_matrix.loc[:, samples.index].values.flatten()
groups_agg.append(all_values)
f_stat_agg, p_value = stats.f_oneway(*groups_agg)
print(f"Aggregate F-statistic: {f_stat_agg:.4f}")
# Result: Very large F-statistic (e.g., 153.8)# Calculate F-statistic FOR EACH FEATURE separately
per_feature_f_stats = []
for feature in expression_matrix.index: # For each gene/miRNA/metabolite
groups = []
for celltype in ['CD4', 'CD8', 'CD14']:
samples = df[df['celltype'] == celltype]
# Get expression of THIS feature in THIS cell type
values = expression_matrix.loc[feature, samples.index].values
groups.append(values)
f_stat, _ = stats.f_oneway(*groups)
if not np.isnan(f_stat):
per_feature_f_stats.append((feature, f_stat))
# Summary statistics
f_values = [f for _, f in per_feature_f_stats]
print(f"Per-feature F-statistics:")
print(f" Median: {np.median(f_values):.4f}")
print(f" Mean: {np.mean(f_values):.4f}")
print(f" Range: [{np.min(f_values):.4f}, {np.max(f_values):.4f}]")
# Find features in specific range (e.g., 0.76-0.78)
target_features = [(name, f) for name, f in per_feature_f_stats
if 0.76 <= f <= 0.78]
if target_features:
print(f"Features with F ∈ [0.76, 0.78]: {len(target_features)}")
for name, f_val in target_features:
print(f" {name}: F = {f_val:.6f}")| Aspect | Method A (Aggregate) | Method B (Per-feature) |
|---|---|---|
| Interpretation | Overall expression difference | Feature-specific differences |
| Result | 1 F-statistic | N F-statistics (N = # features) |
| Typical value | Very large (e.g., 153.8) | Small to large (e.g., 0.1 to 100+) |
| Use case | Global effect size | Gene/biomarker discovery |
| Common in | Rarely used | Genomics, proteomics, metabolomics ⭐ |
from scipy import stats as scipy_stats
from statsmodels.stats.diagnostic import het_breuschpagan
# Residual normality
residuals = model.resid
sw_stat, sw_p = scipy_stats.shapiro(residuals)
print(f"Shapiro-Wilk: p={sw_p:.4f} (normal if p>0.05)")
# Heteroscedasticity
bp_stat, bp_p, _, _ = het_breuschpagan(residuals, model.model.exog)
print(f"Breusch-Pagan: p={bp_p:.4f} (homoscedastic if p>0.05)")
# VIF (multicollinearity)
from statsmodels.stats.outliers_influence import variance_inflation_factor
X = model.model.exog
for i in range(1, X.shape[1]): # Skip intercept
vif = variance_inflation_factor(X, i)
print(f"{model.model.exog_names[i]}: VIF={vif:.2f}")# Test PH assumption for Cox model
results = cph.check_assumptions(df, p_value_threshold=0.05, show_plots=False)
if len(results) == 0:
print("✅ Proportional hazards assumption met")
else:
print(f"⚠️ PH violated for: {results}")references/troubleshooting.mddef interpret_odds_ratio(or_val, ci_lower, ci_upper, p_value):
"""Interpret odds ratio with clinical meaning."""
if or_val > 1:
pct_increase = (or_val - 1) * 100
direction = f"{pct_increase:.1f}% increase in odds"
else:
pct_decrease = (1 - or_val) * 100
direction = f"{pct_decrease:.1f}% decrease in odds"
sig = "significant" if p_value < 0.05 else "not significant"
ci_contains_null = ci_lower <= 1 <= ci_upper
return f"{direction} (OR={or_val:.4f}, 95% CI [{ci_lower:.4f}, {ci_upper:.4f}], p={p_value:.6f}, {sig})"# Unadjusted model
model_crude = smf.logit('outcome ~ exposure', data=df).fit(disp=0)
or_crude = np.exp(model_crude.params['exposure'])
# Adjusted model
model_adj = smf.logit('outcome ~ exposure + age + sex', data=df).fit(disp=0)
or_adj = np.exp(model_adj.params['exposure'])
# Percentage reduction
pct_reduction = (or_crude - or_adj) / or_crude * 100
print(f"Percentage reduction: {pct_reduction:.1f}%")# Fit model with interaction
model = smf.logit('outcome ~ A * B + age', data=df).fit(disp=0)
# Interaction OR
interaction_coef = model.params['A:B']
interaction_or = np.exp(interaction_coef)
print(f"Interaction OR: {interaction_or:.4f}")references/bixbench_patterns.md| Use Case | Library | Reason |
|---|---|---|
| Inference (p-values, CIs, ORs) | statsmodels | Full statistical output |
| Prediction (accuracy, AUC) | scikit-learn | Better prediction tools |
| Mixed-effects models | statsmodels | Only option |
| Regularization (LASSO, Ridge) | scikit-learn | Better optimization |
| Survival analysis | lifelines | Specialized library |
statsmodels>=0.14.0
scikit-learn>=1.3.0
lifelines>=0.27.0
pandas>=2.0.0
numpy>=1.24.0
scipy>=1.10.0tooluniverse-statistical-modeling/
├── SKILL.md # This file
├── QUICK_START.md # 8 quick examples
├── EXAMPLES.md # Legacy examples (kept for reference)
├── TOOLS_REFERENCE.md # ToolUniverse tool catalog
├── test_skill.py # Comprehensive test suite
├── references/
│ ├── logistic_regression.md # Detailed logistic examples
│ ├── ordinal_logistic.md # Ordinal logit guide
│ ├── cox_regression.md # Survival analysis guide
│ ├── linear_models.md # OLS and mixed-effects
│ ├── bixbench_patterns.md # 15+ question patterns
│ └── troubleshooting.md # Diagnostic issues
└── scripts/
├── format_statistical_output.py # Format results for reporting
└── model_diagnostics.py # Automated diagnostics| Use Case | Tools |
|---|---|
| Clinical trial data | |
| Drug safety outcomes | |
| Gene-disease associations | |
| Biomarker data | |
TOOLS_REFERENCE.mdreferences/logistic_regression.mdreferences/ordinal_logistic.mdreferences/cox_regression.mdreferences/linear_models.mdreferences/bixbench_patterns.mdreferences/troubleshooting.md