numerical-methods

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Numerical Methods

数值方法

You are an expert in numerical stability and computational aspects of statistical methods.

您是数值稳定性和统计方法计算方面的专家。

Floating-Point Fundamentals

浮点数基础

IEEE 754 Double Precision

IEEE 754双精度

Precision: ~15-17 significant decimal digits
Range: ~10⁻³⁰⁸ to 10³⁰⁸
Machine epsilon: ε ≈ 2.2 × 10⁻¹⁶
Special values: Inf, -Inf, NaN

精度：约15-17位有效十进制数字
范围：约10⁻³⁰⁸ 至 10³⁰⁸
机器精度：ε ≈ 2.2 × 10⁻¹⁶
特殊值：Inf, -Inf, NaN

Key Constants in R

R语言中的关键常量

.Machine$double.eps      # ~2.22e-16 (machine epsilon)
.Machine$double.xmax     # ~1.80e+308 (max finite)
.Machine$double.xmin     # ~2.23e-308 (min positive normalized)
.Machine$double.neg.eps  # ~1.11e-16 (negative epsilon)

.Machine$double.eps      # ~2.22e-16 (机器精度)
.Machine$double.xmax     # ~1.80e+308 (最大有限值)
.Machine$double.xmin     # ~2.23e-308 (最小正规格化值)
.Machine$double.neg.eps  # ~1.11e-16 (负精度值)

Common Numerical Issues

常见数值问题

1. Catastrophic Cancellation

1. 灾难性抵消

When subtracting nearly equal numbers:

undefined

当减去几乎相等的数字时：

undefined

BAD: loses precision

糟糕：丢失精度

x <- 1e10 + 1 y <- 1e10 result <- x - y # Should be 1, may have errors

x <- 1e10 + 1 y <- 1e10 result <- x - y # 应为1，但可能存在误差

BETTER: reformulate to avoid subtraction

更佳：重新表述以避免减法

Example: Computing variance

示例：计算方差

var_bad <- mean(x^2) - mean(x)^2 # Can be negative! var_good <- sum((x - mean(x))^2) / (n-1) # Always non-negative

undefined

var_bad <- mean(x^2) - mean(x)^2 # 可能为负数！ var_good <- sum((x - mean(x))^2) / (n-1) # 始终非负

undefined

2. Overflow/Underflow

2. 上溢/下溢

undefined

undefined

BAD: overflow

糟糕：上溢

prod(1:200) # Inf

GOOD: work on log scale

更佳：在对数尺度上计算

sum(log(1:200)) # Then exp() if needed

sum(log(1:200)) # 如有需要再取exp()

BAD: underflow in probabilities

糟糕：概率计算中的下溢

prod(dnorm(x)) # 0 for large x

prod(dnorm(x)) # 当x很大时结果为0

GOOD: sum log probabilities

更佳：对对数概率求和

sum(dnorm(x, log = TRUE))

undefined

sum(dnorm(x, log = TRUE))

undefined

3. Log-Sum-Exp Trick

3. Log-Sum-Exp技巧

Essential for working with log probabilities:

log_sum_exp <- function(log_x) {
  max_log <- max(log_x)
  if (is.infinite(max_log)) return(max_log)
  max_log + log(sum(exp(log_x - max_log)))
}

处理对数概率的必备技巧：

log_sum_exp <- function(log_x) {
  max_log <- max(log_x)
  if (is.infinite(max_log)) return(max_log)
  max_log + log(sum(exp(log_x - max_log)))
}

Example: log(exp(-1000) + exp(-1001))

示例：log(exp(-1000) + exp(-1001))

log_sum_exp(c(-1000, -1001)) # Correct: ~-999.69 log(exp(-1000) + exp(-1001)) # Wrong: -Inf

undefined

log_sum_exp(c(-1000, -1001)) # 正确结果：~-999.69 log(exp(-1000) + exp(-1001)) # 错误结果：-Inf

undefined

4. Softmax Stability

4. Softmax稳定性

undefined

undefined

BAD

糟糕的实现

softmax_bad <- function(x) exp(x) / sum(exp(x))

GOOD

更佳的实现

softmax <- function(x) { x_max <- max(x) exp_x <- exp(x - x_max) exp_x / sum(exp_x) }

undefined

softmax <- function(x) { x_max <- max(x) exp_x <- exp(x - x_max) exp_x / sum(exp_x) }

undefined

Matrix Computations

矩阵计算

Conditioning

条件数

The condition number κ(A) measures sensitivity to perturbation:

κ(A) = ‖A‖ · ‖A⁻¹‖
Rule: Expect to lose log₁₀(κ) digits of accuracy
κ > 10¹⁵ means matrix is numerically singular

undefined

条件数κ(A)衡量矩阵对扰动的敏感度：

κ(A) = ‖A‖ · ‖A⁻¹‖
规则：预计会丢失log₁₀(κ)位精度
κ > 10¹⁵ 表示矩阵在数值上是奇异的

undefined

Check condition number

检查条件数

kappa(X, exact = TRUE)

For regression: check X'X conditioning

回归场景：检查X'X的条件数

kappa(crossprod(X))

undefined

kappa(crossprod(X))

undefined

Solving Linear Systems

线性方程组求解

Prefer: Decomposition methods over explicit inversion

undefined

优先选择：分解方法而非显式求逆

undefined

BAD: explicit inverse

糟糕：显式求逆

beta <- solve(t(X) %% X) %% t(X) %*% y

GOOD: QR decomposition

更佳：QR分解

beta <- qr.coef(qr(X), y)

BETTER for positive definite: Cholesky

对正定矩阵更优：Cholesky分解

R <- chol(crossprod(X)) beta <- backsolve(R, forwardsolve(t(R), crossprod(X, y)))

For ill-conditioned: SVD/pseudoinverse

针对病态矩阵：SVD/伪逆

beta <- MASS::ginv(X) %*% y

undefined

beta <- MASS::ginv(X) %*% y

undefined

Symmetric Positive Definite Matrices

对称正定矩阵

Always use specialized methods:

undefined

始终使用专门的方法：

undefined

Cholesky for SPD

对称正定矩阵的Cholesky分解

L <- chol(Sigma)

Eigendecomposition

特征分解

eig <- eigen(Sigma, symmetric = TRUE)

Check positive definiteness

检查正定性

all(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)

undefined

all(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)

undefined

Optimization Stability

优化稳定性

Gradient Computation

梯度计算

undefined

undefined

Numerical gradient (for verification)

数值梯度（用于验证）

numerical_grad <- function(f, x, h = sqrt(.Machine$double.eps)) { sapply(seq_along(x), function(i) { x_plus <- x_minus <- x x_plus[i] <- x[i] + h x_minus[i] <- x[i] - h (f(x_plus) - f(x_minus)) / (2 * h) }) }

Central difference is O(h²) accurate

中心差分的精度为O(h²)

Forward difference is O(h) accurate

前向差分的精度为O(h)

undefined

undefined

Hessian Stability

海森矩阵稳定性

undefined

undefined

Check Hessian is positive definite at optimum

检查最优解处的海森矩阵是否为正定矩阵

check_hessian <- function(H, tol = 1e-8) { eigs <- eigen(H, symmetric = TRUE, only.values = TRUE)$values min_eig <- min(eigs)

list( positive_definite = min_eig > tol, min_eigenvalue = min_eig, condition_number = max(eigs) / min_eig ) }

undefined

check_hessian <- function(H, tol = 1e-8) { eigs <- eigen(H, symmetric = TRUE, only.values = TRUE)$values min_eig <- min(eigs)

list( positive_definite = min_eig > tol, min_eigenvalue = min_eig, condition_number = max(eigs) / min_eig ) }

undefined

Line Search

线搜索

For gradient descent stability:

backtracking_line_search <- function(f, x, d, grad, alpha = 1, rho = 0.5, c = 1e-4) {
  # Armijo condition
  while (f(x + alpha * d) > f(x) + c * alpha * sum(grad * d)) {
    alpha <- rho * alpha
    if (alpha < 1e-10) break
  }
  alpha
}

为梯度下降提供稳定性：

backtracking_line_search <- function(f, x, d, grad, alpha = 1, rho = 0.5, c = 1e-4) {
  # Armijo条件
  while (f(x + alpha * d) > f(x) + c * alpha * sum(grad * d)) {
    alpha <- rho * alpha
    if (alpha < 1e-10) break
  }
  alpha
}

Integration and Quadrature

积分与数值积分

Numerical Integration Guidelines

数值积分指南

undefined

undefined

Adaptive quadrature (default choice)

自适应积分（默认选择）

integrate(f, lower, upper)

For infinite limits

无限区间积分

integrate(f, -Inf, Inf)

For highly oscillatory or peaked functions

针对高振荡或峰值函数

Increase subdivisions

增加细分次数

integrate(f, lower, upper, subdivisions = 1000)

For known singularities, split the domain

对于已知奇点，拆分积分域

undefined

undefined

Monte Carlo Integration

蒙特卡洛积分

mc_integrate <- function(f, n, lower, upper) {
  x <- runif(n, lower, upper)
  fx <- sapply(x, f)

  estimate <- (upper - lower) * mean(fx)
  se <- (upper - lower) * sd(fx) / sqrt(n)

  list(value = estimate, se = se)
}

mc_integrate <- function(f, n, lower, upper) {
  x <- runif(n, lower, upper)
  fx <- sapply(x, f)

  estimate <- (upper - lower) * mean(fx)
  se <- (upper - lower) * sd(fx) / sqrt(n)

  list(value = estimate, se = se)
}

Root Finding

根查找

Newton-Raphson Stability

牛顿-拉夫逊法稳定性

newton_raphson <- function(f, df, x0, tol = 1e-8, max_iter = 100) {
  x <- x0
  for (i in 1:max_iter) {
    fx <- f(x)
    dfx <- df(x)

    # Check for near-zero derivative
    if (abs(dfx) < .Machine$double.eps * 100) {
      warning("Near-zero derivative")
      break
    }

    x_new <- x - fx / dfx

    if (abs(x_new - x) < tol) break
    x <- x_new
  }
  x
}

newton_raphson <- function(f, df, x0, tol = 1e-8, max_iter = 100) {
  x <- x0
  for (i in 1:max_iter) {
    fx <- f(x)
    dfx <- df(x)

    # 检查导数是否接近零
    if (abs(dfx) < .Machine$double.eps * 100) {
      warning("Near-zero derivative")
      break
    }

    x_new <- x - fx / dfx

    if (abs(x_new - x) < tol) break
    x <- x_new
  }
  x
}

Brent's Method

Brent方法

For robust root finding without derivatives:

uniroot(f, interval = c(lower, upper), tol = .Machine$double.eps^0.5)

无需导数的稳健根查找方法：

uniroot(f, interval = c(lower, upper), tol = .Machine$double.eps^0.5)

Statistical Computing Patterns

统计计算模式

Safe Likelihood Computation

安全的似然计算

undefined

undefined

Always work with log-likelihood

始终使用对数似然

log_lik <- function(theta, data) {

Compute log-likelihood, not likelihood

sum(dnorm(data, mean = theta[1], sd = theta[2], log = TRUE)) }

undefined

log_lik <- function(theta, data) {

计算对数似然而非似然

sum(dnorm(data, mean = theta[1], sd = theta[2], log = TRUE)) }

undefined

Robust Standard Errors

稳健标准误差

undefined

undefined

Sandwich estimator with numerical stability

具有数值稳定性的三明治估计量

sandwich_se <- function(score, hessian) {

Check Hessian conditioning

H_inv <- tryCatch( solve(hessian), error = function(e) MASS::ginv(hessian) )

meat <- crossprod(score) V <- H_inv %% meat %% H_inv

sqrt(diag(V)) }

undefined

sandwich_se <- function(score, hessian) {

检查海森矩阵的条件数

H_inv <- tryCatch( solve(hessian), error = function(e) MASS::ginv(hessian) )

meat <- crossprod(score) V <- H_inv %% meat %% H_inv

sqrt(diag(V)) }

undefined

Bootstrap with Error Handling

带错误处理的Bootstrap方法

safe_bootstrap <- function(data, statistic, R = 1000) {
  results <- numeric(R)
  failures <- 0

  for (i in 1:R) {
    boot_data <- data[sample(nrow(data), replace = TRUE), ]
    result <- tryCatch(
      statistic(boot_data),
      error = function(e) NA
    )
    results[i] <- result
    if (is.na(result)) failures <- failures + 1
  }

  if (failures > 0.1 * R) {
    warning(sprintf("%.1f%% bootstrap failures", 100 * failures / R))
  }

  list(
    estimate = mean(results, na.rm = TRUE),
    se = sd(results, na.rm = TRUE),
    failures = failures
  )
}

safe_bootstrap <- function(data, statistic, R = 1000) {
  results <- numeric(R)
  failures <- 0

  for (i in 1:R) {
    boot_data <- data[sample(nrow(data), replace = TRUE), ]
    result <- tryCatch(
      statistic(boot_data),
      error = function(e) NA
    )
    results[i] <- result
    if (is.na(result)) failures <- failures + 1
  }

  if (failures > 0.1 * R) {
    warning(sprintf("%.1f%% bootstrap failures", 100 * failures / R))
  }

  list(
    estimate = mean(results, na.rm = TRUE),
    se = sd(results, na.rm = TRUE),
    failures = failures
  )
}

Debugging Numerical Issues

数值问题调试

Diagnostic Checklist

诊断清单

Check for NaN/Inf:
```
any(is.nan(x))
```
,
```
any(is.infinite(x))
```
Check conditioning:
```
kappa(matrix)
```
Check eigenvalues: For PD matrices
Check gradients: Numerically vs analytically
Check scale: Variables on similar scales?

检查NaN/Inf：
```
any(is.nan(x))
```
,
```
any(is.infinite(x))
```
检查条件数：
```
kappa(matrix)
```
检查特征值：针对正定矩阵
检查梯度：数值梯度与解析梯度对比
检查尺度：变量是否处于相似尺度？

Debugging Functions

调试函数

undefined

undefined

Trace NaN/Inf sources

追踪NaN/Inf的来源

debug_numeric <- function(x, name = "x") { cat(sprintf("%s: range [%.3g, %.3g], ", name, min(x), max(x))) cat(sprintf("NaN: %d, Inf: %d, -Inf: %d\n", sum(is.nan(x)), sum(x == Inf), sum(x == -Inf))) }

Check relative error

检查相对误差

rel_error <- function(computed, true) { abs(computed - true) / max(abs(true), 1) }

undefined

rel_error <- function(computed, true) { abs(computed - true) / max(abs(true), 1) }

undefined

Best Practices Summary

最佳实践总结

Always work on log scale for products of probabilities
Use QR or Cholesky instead of matrix inversion
Check conditioning before solving linear systems
Center and scale predictors in regression
Handle edge cases (empty data, singular matrices)
Use existing implementations (LAPACK, BLAS) when possible
Test with extreme values (very small, very large, near-zero)
Compare analytical and numerical gradients
Monitor convergence in iterative algorithms
Document numerical assumptions and limitations

始终在对数尺度上操作概率的乘积
使用QR或Cholesky分解而非矩阵求逆
求解线性方程组前检查条件数
回归中的预测变量要中心化和标准化
处理边缘情况（空数据、奇异矩阵）
尽可能使用现有实现（LAPACK、BLAS）
用极端值测试（极小、极大、接近零的值）
对比解析梯度与数值梯度
监控迭代算法的收敛性
记录数值假设和局限性

Key References

关键参考文献

Higham
Golub & Van Loan

Higham
Golub & Van Loan