# PyMC Modeling


Modern Bayesian modeling with PyMC v5+. Key defaults: nutpie sampler (2-5x faster), non-centered parameterization for hierarchical models, HSGP over exact GPs, coords/dims for readable InferenceData, and a save-early workflow to prevent data loss from late crashes.

**Modeling strategy**: Build models iteratively — start simple, check prior predictions, fit and diagnose, check posterior predictions, expand one piece at a time. See references/workflow.md for the full workflow.

**Notebook preference**: Use marimo for interactive modeling unless the project already uses Jupyter.

## Model Specification

### Basic Structure

```python
import pymc as pm
import arviz as az

with pm.Model(coords=coords) as model:
    # Data containers (for out-of-sample prediction)
    x = pm.Data("x", x_obs, dims="obs")

    # Priors
    beta = pm.Normal("beta", mu=0, sigma=1, dims="features")
    sigma = pm.HalfNormal("sigma", sigma=1)

    # Likelihood
    mu = pm.math.dot(x, beta)
    y = pm.Normal("y", mu=mu, sigma=sigma, observed=y_obs, dims="obs")

    # Inference
    idata = pm.sample(nuts_sampler="nutpie", random_seed=42)
```

### Coords and Dims

Use coords/dims for interpretable InferenceData when the model has meaningful structure:

```python
coords = {
    "obs": np.arange(n_obs),
    "features": ["intercept", "age", "income"],
    "group": group_labels,
}
```

Skip them for simple models where the overhead exceeds the benefit.
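The payoff shows up when slicing results: with named dims, posterior draws can be selected by label instead of positional index. A minimal sketch using xarray directly (InferenceData groups are xarray Datasets; the Dataset below is a stand-in for a real `idata.posterior`):

```python
import numpy as np
import xarray as xr

# Stand-in for idata.posterior: a "beta" variable with a labeled "features" dim
posterior = xr.Dataset(
    {"beta": (("chain", "draw", "features"),
              np.random.default_rng(0).normal(size=(4, 1000, 3)))},
    coords={"features": ["intercept", "age", "income"]},
)

# Select the age coefficient by label, with no positional bookkeeping
age_beta = posterior["beta"].sel(features="age")
print(age_beta.shape)  # (4, 1000)
```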

### Parameterization

Prefer non-centered parameterization for hierarchical models with weak data:

```python
# Non-centered (better for divergences)
offset = pm.Normal("offset", 0, 1, dims="group")
alpha = mu_alpha + sigma_alpha * offset

# Centered (better with strong data)
alpha = pm.Normal("alpha", mu_alpha, sigma_alpha, dims="group")
```
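The two parameterizations define the same distribution; only the geometry the sampler sees differs. A quick numpy check of that equivalence (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
mu_alpha, sigma_alpha = 2.0, 0.5
n = 200_000

# Centered: draw alpha directly from Normal(mu_alpha, sigma_alpha)
centered = rng.normal(mu_alpha, sigma_alpha, size=n)

# Non-centered: draw a standard-normal offset, then shift and scale
offset = rng.normal(0.0, 1.0, size=n)
non_centered = mu_alpha + sigma_alpha * offset

# Both recover the same mean and spread
print(round(centered.mean(), 2), round(non_centered.mean(), 2))
print(round(centered.std(), 2), round(non_centered.std(), 2))
```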

## Inference

### Default Sampling (nutpie preferred)

```python
with model:
    idata = pm.sample(
        draws=1000, tune=1000, chains=4,
        nuts_sampler="nutpie",
        random_seed=42,
    )
idata.to_netcdf("results.nc")  # Save immediately after sampling
```

**Important**: nutpie does not store log_likelihood automatically (it silently ignores `idata_kwargs={"log_likelihood": True}`). If you need LOO-CV or model comparison, compute it after sampling:

```python
pm.compute_log_likelihood(idata, model=model)
```

### When to Use PyMC's Default NUTS Instead

nutpie cannot handle discrete parameters or certain transforms (e.g., the `ordered` transform with `OrderedLogistic`/`OrderedProbit`). For these models, omit `nuts_sampler="nutpie"`:

```python
idata = pm.sample(draws=1000, tune=1000, chains=4, random_seed=42)
```

Never change the model specification to work around sampler limitations. If nutpie is not installed, install it (`pip install nutpie`) or fall back to `nuts_sampler="numpyro"`.

### Alternative MCMC Backends

See references/inference.md for:
- NumPyro/JAX: GPU acceleration, vectorized chains

### Approximate Inference

For fast (but inexact) posterior approximations:
- ADVI/DADVI: Variational inference with Gaussian approximation
- Pathfinder: Quasi-Newton optimization for initialization or screening

## Diagnostics and ArviZ Workflow

Minimum workflow checklist — every model script should include:

1. Prior predictive check (`pm.sample_prior_predictive`)
2. Save results immediately after sampling (`idata.to_netcdf(...)`)
3. Divergence count + r_hat + ESS check
4. Posterior predictive check (`pm.sample_posterior_predictive`)

Follow this systematic workflow after every sampling run:

### Phase 1: Immediate Checks (Required)

```python
# 1. Check for divergences (must be 0 or near 0)
n_div = idata.sample_stats["diverging"].sum().item()
print(f"Divergences: {n_div}")

# 2. Summary with convergence diagnostics
summary = az.summary(idata, var_names=["~offset"])  # exclude auxiliary
print(summary[["mean", "sd", "hdi_3%", "hdi_97%", "ess_bulk", "ess_tail", "r_hat"]])

# 3. Visual convergence check
az.plot_trace(idata, compact=True)
az.plot_rank(idata, var_names=["beta", "sigma"])
```

**Pass criteria** (all must pass before proceeding):
- Zero divergences (or < 0.1% and randomly scattered)
- `r_hat < 1.01` for all parameters
- `ess_bulk > 400` and `ess_tail > 400`
- Trace plots show good mixing (overlapping densities, fuzzy caterpillar)
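These criteria are mechanical enough to automate in a model script. A sketch of a gate function (the helper name `passes_phase1` is ours, not a PyMC/ArviZ API; it expects an `az.summary`-style DataFrame):

```python
import pandas as pd

def passes_phase1(n_div: int, n_total_draws: int, summary: pd.DataFrame) -> bool:
    """True only if all Phase 1 pass criteria hold."""
    div_ok = n_div == 0 or n_div / n_total_draws < 0.001
    rhat_ok = bool((summary["r_hat"] < 1.01).all())
    ess_ok = bool((summary["ess_bulk"] > 400).all()
                  and (summary["ess_tail"] > 400).all())
    return div_ok and rhat_ok and ess_ok

# Toy summaries: one well-mixed model, one that failed to converge
good = pd.DataFrame({"r_hat": [1.002], "ess_bulk": [1500.0], "ess_tail": [1200.0]})
bad = pd.DataFrame({"r_hat": [1.05], "ess_bulk": [80.0], "ess_tail": [60.0]})

print(passes_phase1(0, 4000, good))  # True
print(passes_phase1(0, 4000, bad))   # False
```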

### Phase 2: Deep Convergence (If Phase 1 Is Marginal)

```python
# ESS evolution (should grow linearly)
az.plot_ess(idata, kind="evolution")

# Energy diagnostic (HMC health)
az.plot_energy(idata)

# Autocorrelation (should decay rapidly)
az.plot_autocorr(idata, var_names=["beta"])
```

### Phase 3: Model Criticism (Required)

```python
# Generate posterior predictive samples
with model:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)

# Does the model capture the data?
az.plot_ppc(idata, kind="cumulative")

# Calibration check
az.plot_loo_pit(idata, y="y")
```

**Critical rule**: Never interpret parameters until Phases 1-3 pass.

### Phase 4: Parameter Interpretation

```python
# Posterior summaries
az.plot_posterior(idata, var_names=["beta"], ref_val=0)

# Forest plots for hierarchical parameters
az.plot_forest(idata, var_names=["alpha"], combined=True)

# Parameter correlations (identify non-identifiability)
az.plot_pair(idata, var_names=["alpha", "beta", "sigma"])
```

See [references/arviz.md](references/arviz.md) for comprehensive ArviZ usage.
See [references/diagnostics.md](references/diagnostics.md) for troubleshooting.

## Prior and Posterior Predictive Checks

### Prior Predictive (Before Fitting)

Always check prior implications before fitting:

```python
with model:
    prior_pred = pm.sample_prior_predictive(draws=500)

az.plot_ppc(prior_pred, group="prior", kind="cumulative")
prior_y = prior_pred.prior_predictive["y"].values.flatten()
print(f"Prior predictive range: [{prior_y.min():.1f}, {prior_y.max():.1f}]")
```

**Rule**: Run prior predictive checks before `pm.sample()` on any new model. If the range is implausible (negative counts, probabilities > 1), adjust priors before proceeding.

### Posterior Predictive (After Fitting)

```python
with model:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)

az.plot_ppc(idata, kind="cumulative")
az.plot_loo_pit(idata, y="y")
```

Observed data (dark line) should fall within the posterior predictive distribution. See references/arviz.md for detailed interpretation.
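The idea behind `plot_loo_pit` fits in a few lines: for a well-calibrated model, pushing each observation through its predictive CDF yields values that are Uniform(0, 1). A self-contained numpy illustration (this is the intuition only, not ArviZ's actual LOO-weighted computation):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
y = rng.normal(0.0, 1.0, size=5000)  # data from a standard normal

# Probability integral transform under the (correct) predictive CDF
normal_cdf = np.vectorize(lambda x: 0.5 * (1 + erf(x / sqrt(2))))
pit = normal_cdf(y)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so calibration looks flat
print(abs(pit.mean() - 0.5) < 0.02, abs(pit.var() - 1 / 12) < 0.02)  # True True
```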

## Model Debugging

Before sampling, validate the model with `model.debug()` and `model.point_logps()`. Use `print(model)` for structure and `pm.model_to_graphviz(model)` for a DAG visualization.

## Common Issues

| Symptom | Likely Cause | Fix |
|---|---|---|
| `ValueError: Shape mismatch` | Parameter vs observation dimensions | Use index vectors: `alpha[group_idx]` |
| `Initial evaluation failed` | Data outside distribution support | Check bounds; use `init="adapt_diag"` |
| `Mass matrix contains zeros` | Unscaled predictors or flat priors | Standardize features; use weakly informative priors |
| High divergence count | Funnel geometry | Non-centered parameterization |
| `NaN` in log-probability | Invalid parameter combinations | Check parameter constraints, add bounds |
| `-inf` log-probability | Observations outside likelihood support | Verify data matches distribution domain |
| Slow discrete sampling | NUTS incompatible with discrete variables | Marginalize discrete variables |

See references/troubleshooting.md for a comprehensive problem-solution guide.
For debugging divergences, use `az.plot_pair(idata, divergences=True)` to locate clusters. See references/diagnostics.md § Divergence Troubleshooting.
For profiling slow models, see references/troubleshooting.md § Performance Issues.

## Model Comparison

### LOO-CV (Preferred)

```python
# Compute LOO with pointwise diagnostics
loo = az.loo(idata, pointwise=True)
print(f"ELPD: {loo.elpd_loo:.1f} ± {loo.se:.1f}")

# Check Pareto k values (must be < 0.7 for reliable LOO)
print(f"Bad k (>0.7): {(loo.pareto_k > 0.7).sum().item()}")
az.plot_khat(loo)  # plot_khat takes the ELPD result, not the InferenceData
```

### Comparing Models

```python
# If using nutpie, compute log-likelihood first (nutpie doesn't store it automatically)
pm.compute_log_likelihood(idata_a, model=model_a)
pm.compute_log_likelihood(idata_b, model=model_b)

comparison = az.compare({"model_a": idata_a, "model_b": idata_b}, ic="loo")
print(comparison[["rank", "elpd_loo", "elpd_diff", "weight"]])
az.plot_compare(comparison)
```

**Decision rule**: If two models have similar stacking weights, they are effectively equivalent.

See [references/arviz.md](references/arviz.md) for the detailed model comparison workflow.
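A common quantitative version of this decision rule (a rule of thumb, not an official ArviZ API) treats two models as distinguishable only when `elpd_diff` exceeds roughly twice its standard error `dse`, both of which `az.compare` reports:

```python
def models_distinguishable(elpd_diff: float, dse: float) -> bool:
    """Rule of thumb: the ELPD difference should exceed ~2 standard errors."""
    return abs(elpd_diff) > 2 * dse

print(models_distinguishable(-1.3, 2.4))   # False: effectively equivalent
print(models_distinguishable(-15.0, 3.1))  # True: real predictive difference
```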

## Iterative Model Building

Build complexity incrementally: fit the simplest plausible model first, diagnose it, check posterior predictions, then add ONE piece of complexity at a time. Compare each expansion via LOO. If stacking weights are similar, the models are effectively equivalent. See references/workflow.md for the full iterative workflow.

## Saving and Loading Results

### InferenceData Persistence

Save sampling results for later analysis or sharing:

```python
# Save to NetCDF (recommended format)
idata.to_netcdf("results/model_v1.nc")

# Load
idata = az.from_netcdf("results/model_v1.nc")
```

For compressed storage of large InferenceData objects, see [references/workflow.md](references/workflow.md).

**Critical**: Save IMMEDIATELY after sampling — late crashes destroy valid results:

```python
with model:
    idata = pm.sample(nuts_sampler="nutpie")
idata.to_netcdf("results.nc")  # Save before any post-processing!

with model:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)
idata.to_netcdf("results.nc")  # Update with posterior predictive
```

## Prior Selection

See references/priors.md for:
- Weakly informative defaults by distribution type
- Prior predictive checking workflow
- Domain-specific recommendations

## Common Patterns

### Hierarchical/Multilevel

```python
with pm.Model(coords={"group": groups, "obs": obs_idx}) as hierarchical:
    # Hyperpriors
    mu_alpha = pm.Normal("mu_alpha", 0, 1)
    sigma_alpha = pm.HalfNormal("sigma_alpha", 1)

    # Group-level (non-centered)
    alpha_offset = pm.Normal("alpha_offset", 0, 1, dims="group")
    alpha = pm.Deterministic("alpha", mu_alpha + sigma_alpha * alpha_offset, dims="group")

    # Likelihood (observation noise was undefined in the earlier draft)
    sigma = pm.HalfNormal("sigma", 1)
    y = pm.Normal("y", alpha[group_idx], sigma, observed=y_obs, dims="obs")
```
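The `alpha[group_idx]` indexing is the key mechanic here: fancy indexing expands one parameter per group into one value per observation. The same thing in plain numpy:

```python
import numpy as np

alpha = np.array([1.0, 2.0, 3.0])               # one intercept per group
group_idx = np.array([0, 0, 1, 2, 2, 2, 1, 0])  # group of each observation

# Fancy indexing broadcasts group-level parameters to observation level
mu = alpha[group_idx]
print(mu)  # [1. 1. 2. 3. 3. 3. 2. 1.]
```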

### GLMs

```python
# Logistic regression
with pm.Model() as logistic:
    alpha = pm.Normal("alpha", 0, 2.5)
    beta = pm.Normal("beta", 0, 2.5, dims="features")
    p = pm.math.sigmoid(alpha + pm.math.dot(X, beta))
    y = pm.Bernoulli("y", p=p, observed=y_obs)

# Poisson regression
with pm.Model() as poisson:
    beta = pm.Normal("beta", 0, 1, dims="features")
    y = pm.Poisson("y", mu=pm.math.exp(pm.math.dot(X, beta)), observed=y_obs)
```

### Gaussian Processes

Always prefer HSGP for GP problems with 1-3D inputs. It is O(nm) instead of O(n³), and even at n=200 an exact GP (`pm.gp.Marginal`) is prohibitively slow for MCMC:

```python
with pm.Model() as gp_model:
    # Hyperparameters
    ell = pm.InverseGamma("ell", alpha=5, beta=5)
    eta = pm.HalfNormal("eta", sigma=2)
    sigma = pm.HalfNormal("sigma", sigma=0.5)

    # Covariance function (Matern52 recommended)
    cov = eta**2 * pm.gp.cov.Matern52(1, ls=ell)

    # HSGP approximation
    gp = pm.gp.HSGP(m=[20], c=1.5, cov_func=cov)
    f = gp.prior("f", X=X[:, None])  # X must be 2D

    # Likelihood
    y = pm.Normal("y", mu=f, sigma=sigma, observed=y_obs)
```

For periodic patterns, use `pm.gp.HSGPPeriodic`. Only use `pm.gp.Marginal` or `pm.gp.Latent` for very small datasets (n < ~50) where exact inference is specifically needed.
See references/gp.md for HSGP parameter selection (m, c), HSGPPeriodic, covariance functions, and common patterns.

### Time Series

```python
with pm.Model(coords={"time": range(T)}) as ar_model:
    const = pm.Normal("const", 0, 1)
    rho = pm.Uniform("rho", -1, 1)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # With constant=True, the first element of rho is the constant term,
    # so rho needs p + 1 entries for an AR(p)
    y = pm.AR("y", rho=[const, rho], sigma=sigma, constant=True,
              observed=y_obs, dims="time")
```

See references/timeseries.md for AR/ARMA, random walks, structural time series, state space models, and forecasting patterns.
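For intuition about what the AR likelihood assumes, here is a plain numpy simulation of an AR(1) with a constant (parameter values are illustrative):

```python
import numpy as np

# y_t = c + rho * y_{t-1} + eps_t,  eps_t ~ Normal(0, sigma)
rng = np.random.default_rng(1)
c, rho, sigma, T = 0.5, 0.8, 1.0, 20_000

y = np.zeros(T)
for t in range(1, T):
    y[t] = c + rho * y[t - 1] + rng.normal(0.0, sigma)

# For |rho| < 1 the process is stationary with mean c / (1 - rho) = 2.5;
# discard early draws as burn-in before checking
print(y[2000:].mean())
```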

### BART (Bayesian Additive Regression Trees)

```python
import pymc_bart as pmb

with pm.Model() as bart_model:
    mu = pmb.BART("mu", X=X, Y=y, m=50)
    sigma = pm.HalfNormal("sigma", 1)
    y_obs = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
```

See references/bart.md for regression/classification, variable importance, and configuration.

### Mixture Models

```python
import numpy as np

coords = {"component": range(K)}

with pm.Model(coords=coords) as gmm:
    # Mixture weights
    w = pm.Dirichlet("w", a=np.ones(K), dims="component")

    # Component parameters (with ordering to avoid label switching)
    mu = pm.Normal("mu", mu=0, sigma=10, dims="component",
                   transform=pm.distributions.transforms.ordered,
                   initval=np.linspace(y_obs.min(), y_obs.max(), K))
    sigma = pm.HalfNormal("sigma", sigma=2, dims="component")

    # Mixture likelihood
    y = pm.NormalMixture("y", w=w, mu=mu, sigma=sigma, observed=y_obs)
```

**Important**: Mixture models often need `target_accept=0.9` or higher to avoid divergences from the multimodal geometry. Always provide `initval` on ordered means — without it, components can start out overlapping and the sampler struggles to separate them.
See references/mixtures.md for label switching solutions, marginalized mixtures, and mixture diagnostics.

### Sparse Regression / Horseshoe

Use the regularized (Finnish) horseshoe prior for high-dimensional regression with expected sparsity. Horseshoe priors create double-funnel geometry — use `target_accept=0.95` or higher.
See references/priors.md for full regularized horseshoe code, plus Laplace, R2D2, and spike-and-slab alternatives.
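To see why the horseshoe shrinks, here is a numpy sketch of its prior draws: a global half-Cauchy scale times a per-coefficient local half-Cauchy scale. Most draws land near zero while the heavy tails still let large signals through (this is the plain horseshoe for intuition only, not the regularized version in references/priors.md):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

tau = 0.1 * np.abs(rng.standard_cauchy(n))  # global shrinkage scale
lam = np.abs(rng.standard_cauchy(n))        # local (per-coefficient) scales
beta = rng.normal(0.0, tau * lam)           # horseshoe prior draws

# Mass concentrates near zero, yet extreme draws survive the shrinkage
print(np.median(np.abs(beta)), np.abs(beta).max())
```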

### Specialized Likelihoods

```python
# Zero-Inflated Poisson (excess zeros)
with pm.Model() as zip_model:
    psi = pm.Beta("psi", alpha=2, beta=2)  # psi = P(Poisson component); 1 - psi = P(structural zero)
    mu = pm.Exponential("mu", lam=1)
    y = pm.ZeroInflatedPoisson("y", psi=psi, mu=mu, observed=y_obs)

# Censored data (e.g., right-censored survival)
with pm.Model() as censored_model:
    mu = pm.Normal("mu", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=5)
    y = pm.Censored("y", dist=pm.Normal.dist(mu=mu, sigma=sigma),
                    lower=None, upper=censoring_time, observed=y_obs)

# Ordinal regression
with pm.Model() as ordinal:
    beta = pm.Normal("beta", mu=0, sigma=2, dims="features")
    cutpoints = pm.Normal("cutpoints", mu=0, sigma=2,
                          transform=pm.distributions.transforms.ordered,
                          shape=n_categories - 1)
    y = pm.OrderedLogistic("y", eta=pm.math.dot(X, beta),
                           cutpoints=cutpoints, observed=y_obs)
```

**Note**: Don't use the same name for a variable and a dimension. For example, if you have a dimension called `"cutpoints"`, don't also name a variable `"cutpoints"` — this causes shape errors.

See [references/specialized_likelihoods.md](references/specialized_likelihoods.md) for zero-inflated, hurdle, censored/truncated, ordinal, and robust regression models.

## Common Pitfalls

See references/troubleshooting.md for a comprehensive problem-solution guide covering:
- Shape and dimension errors, initialization failures, mass matrix issues
- Divergences and geometry problems (centered vs non-centered)
- PyMC API issues (variable naming, deprecated parameters)
- Performance issues (GPs, large Deterministics, recompilation)
- Identifiability, multicollinearity, prior-data conflict
- Discrete variable challenges, data containers, prediction

## Causal Inference Operations

PyMC supports do-calculus for causal queries:

```python
# pm.do — intervene (breaks incoming edges)
with pm.do(causal_model, {"x": 2}) as intervention_model:
    idata = pm.sample_prior_predictive()  # P(y, z | do(x=2))

# pm.observe — condition (preserves causal structure)
with pm.observe(causal_model, {"y": 1}) as conditioned_model:
    idata = pm.sample(nuts_sampler="nutpie")  # P(x, z | y=1)

# Combine: P(y | do(x=2), z=0)
with pm.do(causal_model, {"x": 2}) as m1:
    with pm.observe(m1, {"z": 0}) as m2:
        idata = pm.sample(nuts_sampler="nutpie")
```

See [references/causal.md](references/causal.md) for detailed causal inference patterns.

## pymc-extras

Key extensions via `import pymc_extras as pmx`:
- `pmx.marginalize(model, ["discrete_var"])` — marginalize discrete parameters for NUTS
- `pmx.R2D2M2CP(...)` — R2D2 prior for regression (see references/priors.md)
- `pmx.fit_laplace(model)` — Laplace approximation for fast inference

## Custom Distributions and Model Components

```python
# Soft constraints via Potential
import pytensor.tensor as pt
pm.Potential("sum_to_zero", -100 * pt.sqr(alpha.sum()))
```

See [references/custom_models.md](references/custom_models.md) for `pm.DensityDist`, `pm.Potential`, `pm.Simulator`, and `pm.CustomDist`.
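What a `Potential` like the one above contributes is simply an extra term in the model log-probability: here, a quadratic penalty pushing `alpha` toward summing to zero. In plain numpy (the helper name is ours, for illustration):

```python
import numpy as np

def soft_sum_to_zero_penalty(alpha, strength=100.0):
    """Log-probability term added by the Potential: -strength * (sum alpha)^2."""
    return -strength * np.square(alpha.sum())

balanced = np.array([2.0, -1.0, -1.0])  # sums to 0: no cost
unbalanced = np.array([1.0, 1.0, 1.0])  # sums to 3: heavy cost

print(soft_sum_to_zero_penalty(balanced))    # -0.0
print(soft_sum_to_zero_penalty(unbalanced))  # -900.0
```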