algorithm-designer
Algorithm Designer
You are an expert in designing and documenting statistical algorithms.
Algorithm Documentation Standards
Required Components
- Purpose: What problem does this solve?
- Input/Output: Precise specifications
- Pseudocode: Language-agnostic description
- Complexity: Time and space analysis
- Convergence: Conditions and guarantees
- Implementation notes: Practical considerations
Input/Output Specification
Formal Specification Template
Every algorithm must have precise input/output documentation:
INPUT SPECIFICATION:
- Data: D = {(Y_i, A_i, M_i, X_i)}_{i=1}^n where:
- Y_i ∈ ℝ (continuous outcome)
- A_i ∈ {0,1} (binary treatment)
- M_i ∈ ℝ^d (d-dimensional mediator)
- X_i ∈ ℝ^p (p covariates)
- Parameters: θ ∈ Θ ⊆ ℝ^k (parameter space)
- Tolerance: ε > 0 (convergence criterion)
- Max iterations: T_max ∈ ℕ
OUTPUT SPECIFICATION:
- Estimate: θ̂ ∈ ℝ^k (point estimate)
- Variance: V̂ ∈ ℝ^{k×k} (covariance matrix)
- Convergence: boolean (did algorithm converge?)
- Iterations: t ∈ ℕ (iterations used)
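The input contract above can be enforced mechanically before any iteration starts. A minimal sketch in R (the column names and the specific checks are illustrative assumptions layered on the formal spec, not part of the original):

```r
# Validate a dataset against the INPUT SPECIFICATION before running an algorithm
validate_input <- function(data, tol = 1e-6, max_iter = 1000L) {
  stopifnot(
    is.data.frame(data),
    all(c("Y", "A", "M", "X") %in% names(data)),  # required columns
    is.numeric(data$Y),                           # Y_i continuous
    all(data$A %in% c(0, 1)),                     # A_i binary treatment
    tol > 0,                                      # convergence criterion ε > 0
    max_iter >= 1                                 # T_max a positive integer
  )
  invisible(TRUE)
}
```

Failing fast on malformed input keeps the later convergence diagnostics meaningful.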
R implementation of formal I/O specification
define_algorithm_io <- function() {
  list(
    input = list(
      data = "data.frame with columns Y, A, M, X",
      params = "list(tol = 1e-6, max_iter = 1000)",
      models = "list(outcome_formula, mediator_formula, propensity_formula)"
    ),
    output = list(
      estimate = "numeric vector of parameter estimates",
      se = "numeric vector of standard errors",
      vcov = "variance-covariance matrix",
      converged = "logical indicating convergence",
      iterations = "integer count of iterations"
    ),
    complexity = list(
      time = "O(n * p^2) per iteration",
      space = "O(n * p)",
      iterations = "O(log(1/epsilon)) for Newton-type"
    )
  )
}
Convergence Criteria
Standard Convergence Conditions
| Criterion | Formula | Use Case |
|---|---|---|
| Absolute | $\lvert\theta^{(t+1)} - \theta^{(t)}\rvert < \varepsilon$ | Parameter convergence |
| Relative | $\lvert\theta^{(t+1)} - \theta^{(t)}\rvert / \lvert\theta^{(t)}\rvert < \varepsilon$ | Scale-invariant |
| Gradient | $\lvert\nabla L(\theta^{(t)})\rvert < \varepsilon$ | Optimization |
| Function | $\lvert L(\theta^{(t+1)}) - L(\theta^{(t)})\rvert < \varepsilon$ | Objective convergence |
| Cauchy | $\max_i \lvert\theta_i^{(t+1)} - \theta_i^{(t)}\rvert < \varepsilon$ | Component-wise convergence |
Mathematical Formulation
Convergence tolerance: $\varepsilon = 10^{-6}$ (typical default)
Standard tolerances by application:
- Numerical optimization: $\varepsilon = 10^{-8}$
- Statistical estimation: $\varepsilon = 10^{-6}$
- Approximate methods: $\varepsilon = 10^{-4}$
Complexity Formulas
Linear complexity $O(n)$: Operations grow proportionally to input size
$$T(n) = c \cdot n + O(1)$$
Quadratic complexity $O(n^2)$: Nested iterations over input
$$T(n) = c \cdot n^2 + O(n)$$
Linearithmic complexity $O(n \log n)$: Divide-and-conquer with linear work per level
$$T(n) = c \cdot n \log_2 n + O(n)$$
Space-Time Tradeoff:
$$\text{Time} \times \text{Space} \geq \Omega(\text{Information Content})$$
Convergence rate analysis:
- Linear convergence: $|\theta^{(t)} - \theta^*| \leq C \cdot \rho^t$ where $0 < \rho < 1$
- Quadratic convergence: $|\theta^{(t+1)} - \theta^*| \leq C \cdot |\theta^{(t)} - \theta^*|^2$
- Superlinear: $\lim_{t \to \infty} \frac{|\theta^{(t+1)} - \theta^*|}{|\theta^{(t)} - \theta^*|} = 0$
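These rates can be observed numerically. As an illustrative sketch (not from the original), compute √2 two ways: a damped fixed-point iteration converges linearly, while a Newton step converges quadratically:

```r
# Compare linear vs quadratic convergence when solving x^2 = 2
target <- sqrt(2)

# Linear convergence: damped fixed-point iteration g(x) = x - 0.1 * (x^2 - 2)
x <- 2
err_linear <- numeric(5)
for (t in 1:5) {
  x <- x - 0.1 * (x^2 - 2)
  err_linear[t] <- abs(x - target)
}

# Quadratic convergence: Newton iteration x <- x - (x^2 - 2) / (2x)
y <- 2
err_newton <- numeric(5)
for (t in 1:5) {
  y <- y - (y^2 - 2) / (2 * y)
  err_newton[t] <- abs(y - target)
}

# The fixed-point error shrinks by a near-constant factor each step;
# Newton's error roughly squares each step (hits machine precision by t = 4).
```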
Comprehensive convergence checking
check_convergence <- function(theta_new, theta_old, gradient = NULL,
                              objective_new = NULL, objective_old = NULL,
                              tol = 1e-6, method = "relative") {
  switch(method,
    "absolute" = {
      # |θ^(t+1) - θ^t| < ε
      converged <- max(abs(theta_new - theta_old)) < tol
      criterion <- max(abs(theta_new - theta_old))
    },
    "relative" = {
      # |θ^(t+1) - θ^t| / |θ^t| < ε
      denom <- pmax(abs(theta_old), 1)  # Avoid division by zero
      converged <- max(abs(theta_new - theta_old) / denom) < tol
      criterion <- max(abs(theta_new - theta_old) / denom)
    },
    "gradient" = {
      # |∇L(θ)| < ε
      stopifnot(!is.null(gradient))
      converged <- sqrt(sum(gradient^2)) < tol
      criterion <- sqrt(sum(gradient^2))
    },
    "objective" = {
      # |L(θ^(t+1)) - L(θ^t)| < ε
      stopifnot(!is.null(objective_new), !is.null(objective_old))
      converged <- abs(objective_new - objective_old) < tol
      criterion <- abs(objective_new - objective_old)
    }
  )
  list(converged = converged, criterion = criterion, method = method)
}
Newton-Raphson with convergence monitoring
newton_raphson <- function(f, grad, hess, theta0, tol = 1e-6, max_iter = 100) {
  theta <- theta0
  history <- list()
  for (t in 1:max_iter) {
    g <- grad(theta)
    H <- hess(theta)
    # Newton step: θ^(t+1) = θ^t - H^(-1) * g
    # Time complexity: O(p^3) for matrix inversion
    delta <- solve(H, g)
    theta_new <- theta - delta
    # Check convergence
    conv <- check_convergence(theta_new, theta, gradient = g, tol = tol)
    history[[t]] <- list(theta = theta, gradient_norm = sqrt(sum(g^2)))
    if (conv$converged) {
      return(list(
        estimate = theta_new,
        iterations = t,
        converged = TRUE,
        history = history
      ))
    }
    theta <- theta_new
  }
  list(estimate = theta, iterations = max_iter, converged = FALSE, history = history)
}
Complexity and Convergence Relationship
| Algorithm | Convergence Rate | Iterations to $\varepsilon$ |
|---|---|---|
| Gradient Descent | $O(1/t)$ | $O(1/\varepsilon)$ |
| Accelerated GD | $O(1/t^2)$ | $O(1/\sqrt{\varepsilon})$ |
| Newton-Raphson | Quadratic | $O(\log\log(1/\varepsilon))$ |
| EM Algorithm | Linear | $O(\log(1/\varepsilon))$ |
| Coordinate Descent | Linear | $O(p \cdot \log(1/\varepsilon))$ |
Pseudocode Conventions
Standard Format
ALGORITHM: [Name]
INPUT: [List inputs with types]
OUTPUT: [List outputs with types]
1. [Initialize]
2. [Main loop or procedure]
2.1 [Sub-step]
2.2 [Sub-step]
3. [Return]

Example: AIPW Estimator
ALGORITHM: Augmented IPW for Mediation
INPUT:
- Data (Y, A, M, X) of size n
- Propensity model specification
- Outcome model specification
- Mediator model specification
OUTPUT:
- Point estimate ψ̂
- Standard error SE(ψ̂)
- 95% confidence interval
1. ESTIMATE NUISANCE FUNCTIONS
1.1 Fit propensity score: π̂(x) = P̂(A=1|X=x)
1.2 Fit mediator density: f̂(m|a,x)
1.3 Fit outcome regression: μ̂(a,m,x) = Ê[Y|A=a,M=m,X=x]
2. COMPUTE PSEUDO-OUTCOMES
For i = 1 to n:
2.1 Compute IPW weight: w_i = A_i/π̂(X_i) + (1-A_i)/(1-π̂(X_i))
2.2 Compute outcome prediction: μ̂_i = μ̂(A_i, M_i, X_i)
2.3 Compute augmentation term
2.4 φ_i = w_i(Y_i - μ̂_i) + [integration term]
3. ESTIMATE AND INFERENCE
3.1 ψ̂ = n⁻¹ Σᵢ φ_i
3.2 SE = √(n⁻¹ Σᵢ (φ_i - ψ̂)²)
3.3 CI = [ψ̂ - 1.96·SE, ψ̂ + 1.96·SE]
4. RETURN (ψ̂, SE, CI)
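A runnable sketch of steps 1–3, simplified to the average treatment effect so it stays short (the mediator models and the mediation integration term of step 2.4 are omitted; the model formulas and the helper name `aipw_ate` are illustrative assumptions, not the original algorithm in full):

```r
# Simplified AIPW point estimate and 95% CI for E[Y(1)] - E[Y(0)]
# (illustrative; omits the mediation-specific integration term)
aipw_ate <- function(data) {
  n <- nrow(data)
  # 1. Estimate nuisance functions
  ps  <- glm(A ~ X, data = data, family = binomial)  # propensity π̂(x)
  out <- lm(Y ~ A + X, data = data)                  # outcome μ̂(a,x)
  pi_hat <- predict(ps, type = "response")
  mu1 <- predict(out, newdata = transform(data, A = 1))
  mu0 <- predict(out, newdata = transform(data, A = 0))
  # 2. Pseudo-outcomes (efficient influence function values)
  phi <- (data$A / pi_hat) * (data$Y - mu1) -
         ((1 - data$A) / (1 - pi_hat)) * (data$Y - mu0) +
         (mu1 - mu0)
  # 3. Estimate and inference
  psi <- mean(phi)
  se  <- sd(phi) / sqrt(n)
  list(estimate = psi, se = se, ci = psi + c(-1.96, 1.96) * se)
}
```

Note the double-robustness structure: the augmentation term $(μ̂_1 - μ̂_0)$ corrects the IPW residual term, so the estimator is consistent if either nuisance model is correct.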
Complexity Analysis
Big-O Notation Guide
Formal Definition: $f(n) = O(g(n))$ if $\exists c, n_0$ such that $f(n) \leq c \cdot g(n)$ for all $n \geq n_0$
| Complexity | Name | Example | Operations at n=1000 |
|---|---|---|---|
| $O(1)$ | Constant | Array access | 1 |
| $O(\log n)$ | Logarithmic | Binary search | ~10 |
| $O(n)$ | Linear | Single loop | 1,000 |
| $O(n \log n)$ | Linearithmic | Merge sort, FFT | ~10,000 |
| $O(n^2)$ | Quadratic | Nested loops | 1,000,000 |
| $O(n^3)$ | Cubic | Matrix multiplication | 1,000,000,000 |
| $O(2^n)$ | Exponential | Subset enumeration | ~10^301 |
Key Formulas
Master Theorem for recurrences $T(n) = aT(n/b) + f(n)$:
- If $f(n) = O(n^{\log_b a - \epsilon})$ then $T(n) = \Theta(n^{\log_b a})$
- If $f(n) = \Theta(n^{\log_b a})$ then $T(n) = \Theta(n^{\log_b a} \log n)$
- If $f(n) = \Omega(n^{\log_b a + \epsilon})$ then $T(n) = \Theta(f(n))$
Sorting lower bound: Any comparison-based sort requires $\Omega(n \log n)$ comparisons
Matrix operations:
- Naive multiplication: $O(n^3)$
- Strassen: $O(n^{2.807})$
- Matrix inversion: $O(n^3)$ (same as multiplication)
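As a worked check of the second Master Theorem case, take merge sort: $T(n) = 2T(n/2) + \Theta(n)$, so $a = 2$, $b = 2$, and $n^{\log_b a} = n$. Since $f(n) = \Theta(n) = \Theta(n^{\log_b a})$, case 2 applies and $T(n) = \Theta(n \log n)$, consistent with the comparison-sort lower bound above.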
undefinedComplexity analysis helper
Complexity analysis helper
analyze_complexity <- function(f, n_values = c(100, 500, 1000, 5000)) {
  times <- sapply(n_values, function(n) {
    system.time(f(n))[["elapsed"]]
  })
  # Fit log-log regression to estimate complexity
  fit <- lm(log(times) ~ log(n_values))
  estimated_power <- coef(fit)[2]
  list(
    times = data.frame(n = n_values, time = times),
    estimated_complexity = paste0("O(n^", round(estimated_power, 2), ")"),
    power = estimated_power
  )
}
Statistical Algorithm Complexities
| Algorithm | Time | Space |
|---|---|---|
| OLS | O(np² + p³) | O(np) |
| Logistic (Newton) | O(np² + p³) per iter | O(np) |
| Bootstrap (B reps) | O(B × base) | O(n) |
| MCMC (T iters) | O(T × per_iter) | O(n + T) |
| Cross-validation (K) | O(K × base) | O(n) |
| Random forest | O(n log n × B × p) | O(n × B) |
Template for Analysis
TIME COMPLEXITY:
- Initialization: O(...)
- Per iteration: O(...)
- Total (T iterations): O(...)
- Convergence typically in T = O(...) iterations
SPACE COMPLEXITY:
- Data storage: O(n × p)
- Working memory: O(...)
- Output: O(...)

Convergence Analysis
Types of Convergence
- Finite termination: Exact solution in finite steps
- Linear: $|x_{k+1} - x^*| \leq c|x_k - x^*|$, $c < 1$
- Superlinear: $|x_{k+1} - x^*| / |x_k - x^*| \to 0$
- Quadratic: $|x_{k+1} - x^*| \leq c|x_k - x^*|^2$
Convergence Documentation Template
CONVERGENCE:
- Type: [Linear/Superlinear/Quadratic]
- Rate: [Expression]
- Conditions: [What must hold]
- Stopping criterion: [When to stop]
- Typical iterations: [Order of magnitude]

Optimization Algorithms
Gradient-Based Methods
ALGORITHM: Gradient Descent
INPUT: f (objective), ∇f (gradient), x₀ (initial), η (step size), ε (tolerance)
OUTPUT: x* (minimizer)
1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
2.1 xₖ₊₁ ← xₖ - η∇f(xₖ)
2.2 k ← k + 1
3. RETURN xₖ
COMPLEXITY: O(iterations × gradient_cost)
CONVERGENCE: Linear with rate (1 - η·μ) for μ-strongly convex f
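The pseudocode transcribes directly into R. A minimal sketch (fixed step size, no line search; the function name and defaults are assumptions):

```r
# Gradient descent with fixed step size, following the pseudocode above
gradient_descent <- function(grad, x0, eta = 0.1, eps = 1e-8, max_iter = 10000L) {
  x <- x0
  for (k in seq_len(max_iter)) {
    g <- grad(x)
    # Stopping criterion: ||∇f(x)|| <= ε
    if (sqrt(sum(g^2)) <= eps) {
      return(list(x = x, iterations = k - 1L, converged = TRUE))
    }
    x <- x - eta * g
  }
  list(x = x, iterations = max_iter, converged = FALSE)
}
```

For $f(x) = \lVert x - a \rVert^2$ the error contracts by a factor $(1 - 2\eta)$ per step, matching the linear rate stated above.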
Newton's Method
ALGORITHM: Newton-Raphson
INPUT: f, ∇f, ∇²f, x₀, ε
OUTPUT: x*
1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
2.1 Solve ∇²f(xₖ)·d = -∇f(xₖ) for direction d
2.2 xₖ₊₁ ← xₖ + d
2.3 k ← k + 1
3. RETURN xₖ
COMPLEXITY: O(iterations × p³) for p-dimensional
CONVERGENCE: Quadratic near solution

EM Algorithm Template
ALGORITHM: Expectation-Maximization
INPUT: Data Y, model parameters θ₀, tolerance ε
OUTPUT: MLE θ̂
1. θ ← θ₀
2. REPEAT:
2.1 E-STEP: Compute Q(θ'|θ) = E[log L(θ'|Y,Z) | Y, θ]
2.2 M-STEP: θ_new ← argmax_θ' Q(θ'|θ)
2.3 Δ ← |θ_new - θ|
2.4 θ ← θ_new
3. UNTIL Δ < ε
4. RETURN θ
CONVERGENCE: Monotonic increase in likelihood
             Linear rate near optimum
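As a concrete instance of the template (a sketch, not from the original): EM for a two-component univariate Gaussian mixture with known unit variances. The E-step computes responsibilities, the M-step takes weighted means, and the observed-data log-likelihood is monitored for the monotonic increase noted above.

```r
# EM for a two-component mixture of N(mu[1], 1) and N(mu[2], 1)
em_mixture <- function(y, mu = c(-1, 1), pi1 = 0.5, eps = 1e-8, max_iter = 500L) {
  loglik_old <- -Inf
  for (t in seq_len(max_iter)) {
    # E-step: posterior responsibility of component 1 for each observation
    d1 <- pi1 * dnorm(y, mu[1], 1)
    d2 <- (1 - pi1) * dnorm(y, mu[2], 1)
    r1 <- d1 / (d1 + d2)
    # M-step: weighted means and mixing weight maximize Q(θ'|θ)
    pi1 <- mean(r1)
    mu  <- c(sum(r1 * y) / sum(r1), sum((1 - r1) * y) / sum(1 - r1))
    # Observed-data log-likelihood (at the pre-update parameters); monotone in t
    loglik <- sum(log(d1 + d2))
    if (abs(loglik - loglik_old) < eps) break
    loglik_old <- loglik
  }
  list(mu = mu, pi1 = pi1, loglik = loglik, iterations = t)
}
```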
Bootstrap Algorithms
Nonparametric Bootstrap
ALGORITHM: Nonparametric Bootstrap
INPUT: Data X of size n, statistic T, B (number of replicates)
OUTPUT: SE estimate, CI
1. FOR b = 1 to B:
1.1 Draw X*_b by sampling n observations with replacement from X
1.2 Compute T*_b = T(X*_b)
2. SE_boot ← SD({T*_1, ..., T*_B})
3. CI_percentile ← [quantile(T*, 0.025), quantile(T*, 0.975)]
4. RETURN (SE_boot, CI_percentile)
COMPLEXITY: O(B × cost(T))
NOTES: B ≥ 1000 for SE, B ≥ 10000 for percentile CI
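The pseudocode translates almost line-for-line into R (a sketch; the function name `np_bootstrap` is an assumption):

```r
# Nonparametric bootstrap SE and percentile CI for an arbitrary statistic
np_bootstrap <- function(x, statistic, B = 2000L) {
  # Steps 1.1-1.2: resample with replacement and recompute the statistic B times
  t_star <- replicate(B, statistic(sample(x, length(x), replace = TRUE)))
  # Steps 2-3: bootstrap SE and percentile interval
  list(
    se = sd(t_star),
    ci = unname(quantile(t_star, c(0.025, 0.975)))
  )
}
```

For the sample mean, the bootstrap SE should track the analytic value $s/\sqrt{n}$, which makes a useful sanity check.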
Parametric Bootstrap
ALGORITHM: Parametric Bootstrap
INPUT: Data X, parametric model M, B replicates
OUTPUT: SE estimate
1. Fit θ̂ = MLE(X, M)
2. FOR b = 1 to B:
2.1 Generate X*_b ~ M(θ̂)
2.2 Compute θ̂*_b = MLE(X*_b, M)
3. SE_boot ← SD({θ̂*_1, ..., θ̂*_B})
4. RETURN SE_boot
Numerical Stability Notes
Common Issues
- Overflow/Underflow: Work on log scale
- Cancellation: Reformulate subtractions
- Ill-conditioning: Use regularization or pivoting
- Convergence: Add damping or line search
Stability Techniques
Log-sum-exp trick
log_sum_exp <- function(x) {
  max_x <- max(x)
  max_x + log(sum(exp(x - max_x)))
}
Numerically stable variance
stable_var <- function(x) {
  n <- length(x)
  m <- mean(x)
  sum((x - m)^2) / (n - 1)  # Two-pass: center first to avoid catastrophic cancellation
}
Implementation Checklist
Before Coding
- Pseudocode written and reviewed
- Complexity analyzed
- Convergence conditions identified
- Edge cases documented
- Numerical stability considered
During Implementation
- Match pseudocode structure
- Add convergence monitoring
- Handle edge cases
- Log intermediate values (debug mode)
- Add early stopping
After Implementation
- Unit tests for components
- Integration tests for full algorithm
- Benchmark against reference implementation
- Profile for bottlenecks
- Document deviations from pseudocode
Key References
- CLRS (Cormen, Leiserson, Rivest, and Stein, *Introduction to Algorithms*)
- Press et al., *Numerical Recipes*