adaptive-rejection-sampler

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Adaptive Rejection Sampler

Overview

概述

This skill provides guidance for implementing Adaptive Rejection Sampling (ARS) algorithms. ARS is a method for generating samples from log-concave probability distributions by constructing piecewise linear upper and lower envelopes of the log-density function. The skill focuses on procedural approaches, performance optimization, and verification strategies rather than providing implementation code.

本技能提供Adaptive Rejection Sampling（ARS）算法的实现指南。ARS是一种通过构建对数密度函数的分段线性上下包络，从log-concave概率分布中生成样本的方法。本技能聚焦于流程方法、性能优化和验证策略，而非提供具体的实现代码。

When to Use This Skill

适用场景

Use this skill when:

Implementing adaptive rejection sampling from scratch
Working with log-concave distribution samplers
Building statistical sampling algorithms that require envelope construction
Debugging or optimizing existing ARS implementations
The task involves R, Python, or other statistical computing environments

在以下场景中使用本技能：

从零开始实现自适应拒绝采样
开发log-concave分布采样器
构建需要包络构建的统计采样算法
调试或优化现有的ARS实现
涉及R、Python或其他统计计算环境的任务

Implementation Approach

实现步骤

Phase 1: Algorithm Design Before Coding

阶段1：编码前的算法设计

Before writing any code:

Understand the mathematical foundations
- Review the ARS algorithm requirements for log-concave functions
- Understand how piecewise linear envelopes are constructed from tangent lines
- Identify the squeeze function optimization
Design with performance in mind
- Plan iteration limits and safeguards from the start
- Consider worst-case computational complexity of the sampling loop
- Design for timeout constraints that may be stricter than development testing
Plan the module structure
- Log-concavity verification module
- Envelope construction and update module
- Sampling loop with rejection logic
- Initialization point selection

在编写任何代码之前：

理解数学基础
- 回顾ARS算法对log-concave函数的要求
- 理解如何通过切线构建分段线性包络
- 明确压缩函数的优化方式
以性能为导向进行设计
- 从一开始就规划迭代限制和防护措施
- 考虑采样循环的最坏情况计算复杂度
- 针对可能比开发测试更严格的超时约束进行设计
规划模块结构
- log-concavity验证模块
- 包络构建与更新模块
- 带有拒绝逻辑的采样循环
- 初始点选择模块

Phase 2: Critical Implementation Considerations

阶段2：关键实现注意事项

Log-Concavity Checking

Log-Concavity检查

When implementing log-concavity verification:

Use appropriate tolerance values for numerical comparison
Consider that some valid distributions have constant second derivatives (e.g., exponential distribution)
Use
```
<= tolerance
```
instead of strict
```
<
```
comparisons when checking for non-positive second derivatives
Test with edge cases like exponential, normal, and gamma distributions

在实现log-concavity验证时：

使用合适的数值比较容差
注意部分有效分布的二阶导数为常数（例如指数分布）
检查非正二阶导数时使用
```
<= 容差
```
而非严格的
```
<
```
比较
使用指数分布、正态分布和伽马分布等边缘案例进行测试

Initialization Points

初始点设置

Proper initialization significantly affects sampling quality and performance:

Handle shifted distributions by adjusting initialization points relative to the mode
Consider the support bounds when selecting initial points
Use multiple initial points to ensure good envelope coverage
Test with distributions that have modes far from the origin

合理的初始化对采样质量和性能有显著影响：

通过相对于众数调整初始点来处理偏移分布
选择初始点时考虑分布的支持边界
使用多个初始点以确保良好的包络覆盖
对众数远离原点的分布进行测试

Iteration Limits and Safeguards

迭代限制与防护措施

Critical for preventing infinite loops and timeouts:

Implement maximum iteration limits in all sampling loops
Add progress indicators or logging for long-running computations
Include timeout protection mechanisms
Design early exit conditions for convergence

这对防止无限循环和超时至关重要：

在所有采样循环中实现最大迭代限制
为长时间运行的计算添加进度指示器或日志
包含超时保护机制
设计收敛的提前退出条件

Phase 3: Performance-First Development

阶段3：以性能为核心的开发

Timeout Considerations

超时考量

External test frameworks may use shorter timeouts than development testing
If development tests use 180-second timeouts, production may use 60 seconds
Profile performance early to catch issues before they become blocking
Test with varying timeout constraints to ensure robustness

外部测试框架的超时时间可能比开发测试更短
如果开发测试使用180秒超时，生产环境可能使用60秒
尽早进行性能分析，在问题成为阻塞之前发现它们
使用不同的超时约束进行测试，以确保鲁棒性

Computational Efficiency

计算效率

Analyze the computational complexity of the sampling loop
Optimize envelope updates to minimize recalculation
Consider caching frequently computed values
Profile with realistic sample sizes

分析采样循环的计算复杂度
优化包络更新以最小化重复计算
考虑缓存频繁计算的值
使用真实样本量进行性能分析

Verification Strategies

验证策略

Unit Testing Approach

单元测试方法

Test each module independently
- Log-concavity checker with known log-concave and non-log-concave functions
- Envelope construction with simple distributions
- Sampling loop with controlled random seeds
Test known distributions
- Standard normal distribution
- Exponential distribution (constant second derivative edge case)
- Truncated normal distributions
- Gamma distributions with various parameters
Test edge cases explicitly
- Non-function inputs
- Negative sample counts
- Invalid bounds (lower > upper)
- Very small and very large sample sizes

独立测试每个模块
- 使用已知的log-concave和非log-concave函数测试log-concavity检查器
- 使用简单分布测试包络构建
- 使用受控随机种子测试采样循环
测试已知分布
- 标准正态分布
- 指数分布（二阶导数为常数的边缘案例）
- 截断正态分布
- 不同参数的伽马分布
显式测试边缘案例
- 非函数输入
- 负样本数量
- 无效边界（下限 > 上限）
- 极小和极大样本量

Performance Testing

性能测试

Match testing timeouts to expected production constraints
Test worst-case behavior, not just average case
Profile with different random seeds to catch stochastic failures
Measure time per sample to identify performance degradation

使测试超时时间与预期的生产约束匹配
测试最坏情况行为，而非仅平均情况
使用不同的随机种子进行性能分析，以发现随机故障
测量每个样本的耗时，以识别性能下降

Integration Testing

集成测试

Test the complete pipeline from input to samples
Verify statistical properties of generated samples (mean, variance, distribution shape)
Use statistical tests (KS test, chi-square) to verify sample quality
Test with distributions that have known analytical moments

测试从输入到生成样本的完整流程
验证生成样本的统计特性（均值、方差、分布形状）
使用统计检验（KS test、chi-square）验证样本质量
对具有已知解析矩的分布进行测试

Common Pitfalls and Mistakes

常见陷阱与错误

Critical Mistakes to Avoid

需避免的关键错误

Missing iteration limits
- Always include maximum iteration safeguards in rejection sampling loops
- An unbounded rejection loop will cause timeouts on difficult distributions
Inadequate performance testing
- Development tests passing does not guarantee production success
- Test with the same timeout constraints as the target environment
Tolerance issues in log-concavity checking
- Strict inequality checks (
```
< 0
```
  ) may incorrectly reject valid distributions
- Use tolerant comparisons (
```
<= tolerance
```
  ) for numerical stability
Poor initialization for shifted distributions
- Hard-coded initialization points fail for distributions with non-zero modes
- Always compute initialization relative to the distribution's characteristics
Incomplete file verification
- After writing large code blocks, verify the complete content was written
- Truncated files can cause subtle bugs

缺少迭代限制
- 始终在拒绝采样循环中添加最大迭代防护
- 无界的拒绝循环会在处理复杂分布时导致超时
性能测试不充分
- 开发测试通过并不代表生产环境能成功运行
- 使用与目标环境相同的超时约束进行测试
Log-Concavity检查中的容差问题
- 严格的不等式检查（
```
< 0
```
  ）可能会错误地拒绝有效分布
- 为了数值稳定性，使用容错比较（
```
<= 容差
```
  ）
偏移分布的初始化不当
- 硬编码的初始点在处理非零众数的分布时会失效
- 始终根据分布的特征计算初始点
文件验证不完整
- 编写大段代码后，验证内容是否完整写入
- 截断的文件会导致难以察觉的bug

Debugging Approach

调试方法

When tests fail or timeout:

Systematic debugging over trial-and-error
- Understand root causes before implementing fixes
- Use logging to identify where time is being spent
Isolate the problematic component
- Test log-concavity checking separately
- Test envelope construction separately
- Test sampling loop with mock envelopes
Check for infinite loops
- Add iteration counters to all loops
- Log when iteration limits are approached
- Verify exit conditions are reachable

当测试失败或超时时：

系统化调试而非试错
- 在实施修复前先理解根本原因
- 使用日志确定时间消耗的位置
隔离有问题的组件
- 单独测试log-concavity检查
- 单独测试包络构建
- 使用模拟包络测试采样循环
检查是否存在无限循环
- 为所有循环添加迭代计数器
- 当接近迭代限制时记录日志
- 验证退出条件是否可达

adaptive-rejection-sampler

Original

Translation

Adaptive Rejection Sampler

Adaptive Rejection Sampler

Overview

概述

When to Use This Skill

适用场景

Implementation Approach

实现步骤

Phase 1: Algorithm Design Before Coding

阶段1：编码前的算法设计

Phase 2: Critical Implementation Considerations

阶段2：关键实现注意事项

Log-Concavity Checking

Log-Concavity检查

Initialization Points

初始点设置

Iteration Limits and Safeguards

迭代限制与防护措施

Phase 3: Performance-First Development

阶段3：以性能为核心的开发

Timeout Considerations

超时考量

Computational Efficiency

计算效率

Verification Strategies

验证策略

Unit Testing Approach

单元测试方法

Performance Testing

性能测试

Integration Testing

集成测试

Common Pitfalls and Mistakes

常见陷阱与错误

Critical Mistakes to Avoid

需避免的关键错误

Debugging Approach

调试方法

Quality Checklist

质量检查清单