numerical-methods
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseNumerical Methods
数值方法
You are an expert in numerical stability and computational aspects of statistical methods.
您是数值稳定性和统计方法计算方面的专家。
Floating-Point Fundamentals
浮点数基础
IEEE 754 Double Precision
IEEE 754双精度
- Precision: ~15-17 significant decimal digits
- Range: ~10⁻³⁰⁸ to 10³⁰⁸
- Machine epsilon: ε ≈ 2.2 × 10⁻¹⁶
- Special values: Inf, -Inf, NaN
- 精度:约15-17位有效十进制数字
- 范围:约10⁻³⁰⁸ 至 10³⁰⁸
- 机器精度:ε ≈ 2.2 × 10⁻¹⁶
- 特殊值:Inf, -Inf, NaN
Key Constants in R
R语言中的关键常量
r
.Machine$double.eps # ~2.22e-16 (machine epsilon)
.Machine$double.xmax # ~1.80e+308 (max finite)
.Machine$double.xmin # ~2.23e-308 (min positive normalized)
.Machine$double.neg.eps # ~1.11e-16 (negative epsilon)r
.Machine$double.eps # ~2.22e-16 (机器精度)
.Machine$double.xmax # ~1.80e+308 (最大有限值)
.Machine$double.xmin # ~2.23e-308 (最小正规格化值)
.Machine$double.neg.eps # ~1.11e-16 (负精度值)Common Numerical Issues
常见数值问题
1. Catastrophic Cancellation
1. 灾难性抵消
When subtracting nearly equal numbers:
r
undefined当减去几乎相等的数字时:
r
undefinedBAD: loses precision
糟糕:丢失精度
x <- 1e10 + 1
y <- 1e10
result <- x - y # Should be 1, may have errors
x <- 1e10 + 1
y <- 1e10
result <- x - y # 应为1,但可能存在误差
BETTER: reformulate to avoid subtraction
更佳:重新表述以避免减法
Example: Computing variance
示例:计算方差
var_bad <- mean(x^2) - mean(x)^2 # Can be negative!
var_good <- sum((x - mean(x))^2) / (n-1) # Always non-negative
undefinedvar_bad <- mean(x^2) - mean(x)^2 # 可能为负数!
var_good <- sum((x - mean(x))^2) / (n-1) # 始终非负
undefined2. Overflow/Underflow
2. 上溢/下溢
r
undefinedr
undefinedBAD: overflow
糟糕:上溢
prod(1:200) # Inf
prod(1:200) # Inf
GOOD: work on log scale
更佳:在对数尺度上计算
sum(log(1:200)) # Then exp() if needed
sum(log(1:200)) # 如有需要再取exp()
BAD: underflow in probabilities
糟糕:概率计算中的下溢
prod(dnorm(x)) # 0 for large x
prod(dnorm(x)) # 当x很大时结果为0
GOOD: sum log probabilities
更佳:对对数概率求和
sum(dnorm(x, log = TRUE))
undefinedsum(dnorm(x, log = TRUE))
undefined3. Log-Sum-Exp Trick
3. Log-Sum-Exp技巧
Essential for working with log probabilities:
r
log_sum_exp <- function(log_x) {
max_log <- max(log_x)
if (is.infinite(max_log)) return(max_log)
max_log + log(sum(exp(log_x - max_log)))
}处理对数概率的必备技巧:
r
log_sum_exp <- function(log_x) {
max_log <- max(log_x)
if (is.infinite(max_log)) return(max_log)
max_log + log(sum(exp(log_x - max_log)))
}Example: log(exp(-1000) + exp(-1001))
示例:log(exp(-1000) + exp(-1001))
log_sum_exp(c(-1000, -1001)) # Correct: ~-999.69
log(exp(-1000) + exp(-1001)) # Wrong: -Inf
undefinedlog_sum_exp(c(-1000, -1001)) # 正确结果:~-999.69
log(exp(-1000) + exp(-1001)) # 错误结果:-Inf
undefined4. Softmax Stability
4. Softmax稳定性
r
undefinedr
undefinedBAD
糟糕的实现
softmax_bad <- function(x) exp(x) / sum(exp(x))
softmax_bad <- function(x) exp(x) / sum(exp(x))
GOOD
更佳的实现
softmax <- function(x) {
x_max <- max(x)
exp_x <- exp(x - x_max)
exp_x / sum(exp_x)
}
undefinedsoftmax <- function(x) {
x_max <- max(x)
exp_x <- exp(x - x_max)
exp_x / sum(exp_x)
}
undefinedMatrix Computations
矩阵计算
Conditioning
条件数
The condition number κ(A) measures sensitivity to perturbation:
- κ(A) = ‖A‖ · ‖A⁻¹‖
- Rule: Expect to lose log₁₀(κ) digits of accuracy
- κ > 10¹⁵ means matrix is numerically singular
r
undefined条件数κ(A)衡量矩阵对扰动的敏感度:
- κ(A) = ‖A‖ · ‖A⁻¹‖
- 规则:预计会丢失log₁₀(κ)位精度
- κ > 10¹⁵ 表示矩阵在数值上是奇异的
r
undefinedCheck condition number
检查条件数
kappa(X, exact = TRUE)
kappa(X, exact = TRUE)
For regression: check X'X conditioning
回归场景:检查X'X的条件数
kappa(crossprod(X))
undefinedkappa(crossprod(X))
undefinedSolving Linear Systems
线性方程组求解
Prefer: Decomposition methods over explicit inversion
r
undefined优先选择:分解方法而非显式求逆
r
undefinedBAD: explicit inverse
糟糕:显式求逆
beta <- solve(t(X) %% X) %% t(X) %*% y
beta <- solve(t(X) %% X) %% t(X) %*% y
GOOD: QR decomposition
更佳:QR分解
beta <- qr.coef(qr(X), y)
beta <- qr.coef(qr(X), y)
BETTER for positive definite: Cholesky
对正定矩阵更优:Cholesky分解
R <- chol(crossprod(X))
beta <- backsolve(R, forwardsolve(t(R), crossprod(X, y)))
R <- chol(crossprod(X))
beta <- backsolve(R, forwardsolve(t(R), crossprod(X, y)))
For ill-conditioned: SVD/pseudoinverse
针对病态矩阵:SVD/伪逆
beta <- MASS::ginv(X) %*% y
undefinedbeta <- MASS::ginv(X) %*% y
undefinedSymmetric Positive Definite Matrices
对称正定矩阵
Always use specialized methods:
r
undefined始终使用专门的方法:
r
undefinedCholesky for SPD
对称正定矩阵的Cholesky分解
L <- chol(Sigma)
L <- chol(Sigma)
Eigendecomposition
特征分解
eig <- eigen(Sigma, symmetric = TRUE)
eig <- eigen(Sigma, symmetric = TRUE)
Check positive definiteness
检查正定性
all(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)
undefinedall(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)
undefinedOptimization Stability
优化稳定性
Gradient Computation
梯度计算
r
undefinedr
undefinedNumerical gradient (for verification)
数值梯度(用于验证)
numerical_grad <- function(f, x, h = sqrt(.Machine$double.eps)) {
sapply(seq_along(x), function(i) {
x_plus <- x_minus <- x
x_plus[i] <- x[i] + h
x_minus[i] <- x[i] - h
(f(x_plus) - f(x_minus)) / (2 * h)
})
}
numerical_grad <- function(f, x, h = sqrt(.Machine$double.eps)) {
sapply(seq_along(x), function(i) {
x_plus <- x_minus <- x
x_plus[i] <- x[i] + h
x_minus[i] <- x[i] - h
(f(x_plus) - f(x_minus)) / (2 * h)
})
}
Central difference is O(h²) accurate
中心差分的精度为O(h²)
Forward difference is O(h) accurate
前向差分的精度为O(h)
undefinedundefinedHessian Stability
海森矩阵稳定性
r
undefinedr
undefinedCheck Hessian is positive definite at optimum
检查最优解处的海森矩阵是否为正定矩阵
check_hessian <- function(H, tol = 1e-8) {
eigs <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
min_eig <- min(eigs)
list(
positive_definite = min_eig > tol,
min_eigenvalue = min_eig,
condition_number = max(eigs) / min_eig
)
}
undefinedcheck_hessian <- function(H, tol = 1e-8) {
eigs <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
min_eig <- min(eigs)
list(
positive_definite = min_eig > tol,
min_eigenvalue = min_eig,
condition_number = max(eigs) / min_eig
)
}
undefinedLine Search
线搜索
For gradient descent stability:
r
backtracking_line_search <- function(f, x, d, grad, alpha = 1, rho = 0.5, c = 1e-4) {
# Armijo condition
while (f(x + alpha * d) > f(x) + c * alpha * sum(grad * d)) {
alpha <- rho * alpha
if (alpha < 1e-10) break
}
alpha
}为梯度下降提供稳定性:
r
backtracking_line_search <- function(f, x, d, grad, alpha = 1, rho = 0.5, c = 1e-4) {
# Armijo条件
while (f(x + alpha * d) > f(x) + c * alpha * sum(grad * d)) {
alpha <- rho * alpha
if (alpha < 1e-10) break
}
alpha
}Integration and Quadrature
积分与数值积分
Numerical Integration Guidelines
数值积分指南
r
undefinedr
undefinedAdaptive quadrature (default choice)
自适应积分(默认选择)
integrate(f, lower, upper)
integrate(f, lower, upper)
For infinite limits
无限区间积分
integrate(f, -Inf, Inf)
integrate(f, -Inf, Inf)
For highly oscillatory or peaked functions
针对高振荡或峰值函数
Increase subdivisions
增加细分次数
integrate(f, lower, upper, subdivisions = 1000)
integrate(f, lower, upper, subdivisions = 1000)
For known singularities, split the domain
对于已知奇点,拆分积分域
undefinedundefinedMonte Carlo Integration
蒙特卡洛积分
r
mc_integrate <- function(f, n, lower, upper) {
x <- runif(n, lower, upper)
fx <- sapply(x, f)
estimate <- (upper - lower) * mean(fx)
se <- (upper - lower) * sd(fx) / sqrt(n)
list(value = estimate, se = se)
}r
mc_integrate <- function(f, n, lower, upper) {
x <- runif(n, lower, upper)
fx <- sapply(x, f)
estimate <- (upper - lower) * mean(fx)
se <- (upper - lower) * sd(fx) / sqrt(n)
list(value = estimate, se = se)
}Root Finding
根查找
Newton-Raphson Stability
牛顿-拉夫逊法稳定性
r
newton_raphson <- function(f, df, x0, tol = 1e-8, max_iter = 100) {
x <- x0
for (i in 1:max_iter) {
fx <- f(x)
dfx <- df(x)
# Check for near-zero derivative
if (abs(dfx) < .Machine$double.eps * 100) {
warning("Near-zero derivative")
break
}
x_new <- x - fx / dfx
if (abs(x_new - x) < tol) break
x <- x_new
}
x
}r
newton_raphson <- function(f, df, x0, tol = 1e-8, max_iter = 100) {
x <- x0
for (i in 1:max_iter) {
fx <- f(x)
dfx <- df(x)
# 检查导数是否接近零
if (abs(dfx) < .Machine$double.eps * 100) {
warning("Near-zero derivative")
break
}
x_new <- x - fx / dfx
if (abs(x_new - x) < tol) break
x <- x_new
}
x
}Brent's Method
Brent方法
For robust root finding without derivatives:
r
uniroot(f, interval = c(lower, upper), tol = .Machine$double.eps^0.5)无需导数的稳健根查找方法:
r
uniroot(f, interval = c(lower, upper), tol = .Machine$double.eps^0.5)Statistical Computing Patterns
统计计算模式
Safe Likelihood Computation
安全的似然计算
r
undefinedr
undefinedAlways work with log-likelihood
始终使用对数似然
log_lik <- function(theta, data) {
Compute log-likelihood, not likelihood
sum(dnorm(data, mean = theta[1], sd = theta[2], log = TRUE))
}
undefinedlog_lik <- function(theta, data) {
计算对数似然而非似然
sum(dnorm(data, mean = theta[1], sd = theta[2], log = TRUE))
}
undefinedRobust Standard Errors
稳健标准误差
r
undefinedr
undefinedSandwich estimator with numerical stability
具有数值稳定性的三明治估计量
sandwich_se <- function(score, hessian) {
Check Hessian conditioning
H_inv <- tryCatch(
solve(hessian),
error = function(e) MASS::ginv(hessian)
)
meat <- crossprod(score)
V <- H_inv %% meat %% H_inv
sqrt(diag(V))
}
undefinedsandwich_se <- function(score, hessian) {
检查海森矩阵的条件数
H_inv <- tryCatch(
solve(hessian),
error = function(e) MASS::ginv(hessian)
)
meat <- crossprod(score)
V <- H_inv %% meat %% H_inv
sqrt(diag(V))
}
undefinedBootstrap with Error Handling
带错误处理的Bootstrap方法
r
safe_bootstrap <- function(data, statistic, R = 1000) {
results <- numeric(R)
failures <- 0
for (i in 1:R) {
boot_data <- data[sample(nrow(data), replace = TRUE), ]
result <- tryCatch(
statistic(boot_data),
error = function(e) NA
)
results[i] <- result
if (is.na(result)) failures <- failures + 1
}
if (failures > 0.1 * R) {
warning(sprintf("%.1f%% bootstrap failures", 100 * failures / R))
}
list(
estimate = mean(results, na.rm = TRUE),
se = sd(results, na.rm = TRUE),
failures = failures
)
}r
safe_bootstrap <- function(data, statistic, R = 1000) {
results <- numeric(R)
failures <- 0
for (i in 1:R) {
boot_data <- data[sample(nrow(data), replace = TRUE), ]
result <- tryCatch(
statistic(boot_data),
error = function(e) NA
)
results[i] <- result
if (is.na(result)) failures <- failures + 1
}
if (failures > 0.1 * R) {
warning(sprintf("%.1f%% bootstrap failures", 100 * failures / R))
}
list(
estimate = mean(results, na.rm = TRUE),
se = sd(results, na.rm = TRUE),
failures = failures
)
}Debugging Numerical Issues
数值问题调试
Diagnostic Checklist
诊断清单
- Check for NaN/Inf: ,
any(is.nan(x))any(is.infinite(x)) - Check conditioning:
kappa(matrix) - Check eigenvalues: For PD matrices
- Check gradients: Numerically vs analytically
- Check scale: Variables on similar scales?
- 检查NaN/Inf:,
any(is.nan(x))any(is.infinite(x)) - 检查条件数:
kappa(matrix) - 检查特征值:针对正定矩阵
- 检查梯度:数值梯度与解析梯度对比
- 检查尺度:变量是否处于相似尺度?
Debugging Functions
调试函数
r
undefinedr
undefinedTrace NaN/Inf sources
追踪NaN/Inf的来源
debug_numeric <- function(x, name = "x") {
cat(sprintf("%s: range [%.3g, %.3g], ", name, min(x), max(x)))
cat(sprintf("NaN: %d, Inf: %d, -Inf: %d\n",
sum(is.nan(x)), sum(x == Inf), sum(x == -Inf)))
}
debug_numeric <- function(x, name = "x") {
cat(sprintf("%s: range [%.3g, %.3g], ", name, min(x), max(x)))
cat(sprintf("NaN: %d, Inf: %d, -Inf: %d\n",
sum(is.nan(x)), sum(x == Inf), sum(x == -Inf)))
}
Check relative error
检查相对误差
rel_error <- function(computed, true) {
abs(computed - true) / max(abs(true), 1)
}
undefinedrel_error <- function(computed, true) {
abs(computed - true) / max(abs(true), 1)
}
undefinedBest Practices Summary
最佳实践总结
- Always work on log scale for products of probabilities
- Use QR or Cholesky instead of matrix inversion
- Check conditioning before solving linear systems
- Center and scale predictors in regression
- Handle edge cases (empty data, singular matrices)
- Use existing implementations (LAPACK, BLAS) when possible
- Test with extreme values (very small, very large, near-zero)
- Compare analytical and numerical gradients
- Monitor convergence in iterative algorithms
- Document numerical assumptions and limitations
- 始终在对数尺度上操作概率的乘积
- 使用QR或Cholesky分解而非矩阵求逆
- 求解线性方程组前检查条件数
- 回归中的预测变量要中心化和标准化
- 处理边缘情况(空数据、奇异矩阵)
- 尽可能使用现有实现(LAPACK、BLAS)
- 用极端值测试(极小、极大、接近零的值)
- 对比解析梯度与数值梯度
- 监控迭代算法的收敛性
- 记录数值假设和局限性
Key References
关键参考文献
- Higham
- Golub & Van Loan
- Higham
- Golub & Van Loan