Logging Best Practices

Apply these logging principles to ensure effective debugging, monitoring, and audit capabilities across applications and services.

Structured Logging

  • Use structured logging formats (JSON) for all log output
  • Include consistent fields across all log entries
  • Make logs machine-parseable while remaining human-readable
  • Use a logging library that supports structured output natively
  • Avoid string concatenation for log messages; use structured fields
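
As one way to picture the points above, here is a minimal sketch using Python's standard `logging` module. The `JsonFormatter` class, the `structured-demo` logger name, and the convention of passing fields through `extra={"fields": {...}}` are assumptions made for this illustration; a dedicated structured-logging library would normally provide this natively.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        # Structured fields arrive via extra={"fields": {...}}. This is a
        # convention chosen for this sketch, not a standard-library feature.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, default=str)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("structured-demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)

# Identifiers go into fields rather than being concatenated into the message.
log.info("order shipped", extra={"fields": {"order_id": "A-1001", "items": 3}})

entry = json.loads(buffer.getvalue())
print(entry)
```

Because the identifiers are separate keys, a log platform can filter on `order_id` directly instead of parsing it out of free text.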

Standard Log Fields

Include these fields in every log entry:
  • timestamp: ISO 8601 format with an explicit timezone offset (for example, 2024-05-01T12:00:00+00:00)
  • level: Log severity (DEBUG, INFO, WARN, ERROR, FATAL)
  • message: Human-readable description of the event
  • service: Name of the service or application
  • version: Application version or build identifier
  • trace_id: Distributed tracing correlation ID
  • span_id: Current span identifier
  • request_id: Unique identifier for the request
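
The field list above could be enforced with a small helper like the following sketch. The function names `make_entry` and `missing_fields`, the service name, and the way the IDs are generated are all hypothetical; in a real system the trace and span IDs would come from the tracing library rather than being generated locally.

```python
import uuid
from datetime import datetime, timezone

REQUIRED_FIELDS = (
    "timestamp", "level", "message", "service",
    "version", "trace_id", "span_id", "request_id",
)


def make_entry(level, message, *, service, version, trace_id, span_id, request_id):
    """Build a log entry that carries every standard field."""
    return {
        # Timezone-aware UTC timestamp, serialized as ISO 8601.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        "service": service,
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "request_id": request_id,
    }


def missing_fields(entry):
    """Return the names of standard fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not entry.get(name)]


entry = make_entry(
    "INFO",
    "checkout completed",
    service="checkout-api",          # hypothetical service name
    version="1.4.2",                 # hypothetical version
    trace_id=uuid.uuid4().hex,       # placeholder; normally set by the tracer
    span_id=uuid.uuid4().hex[:16],   # placeholder; normally set by the tracer
    request_id=str(uuid.uuid4()),
)
print(missing_fields(entry))
```

A check like `missing_fields` fits naturally in a unit test, so a service cannot silently drop one of the agreed fields.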

Log Levels

Use appropriate log levels consistently:
  • DEBUG: Detailed diagnostic information for development
  • INFO: Normal operational events and state changes
  • WARN: Unexpected situations that are handled gracefully
  • ERROR: Failures that abort the current operation but leave the service running
  • FATAL: Critical failures that threaten the service itself and require immediate attention
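
For illustration, the five levels can be mapped onto Python's standard `logging` constants as sketched below. The `LEVELS` table and the `levels-demo` logger name are assumptions for this example; note that Python itself reports these levels as WARNING and CRITICAL rather than WARN and FATAL.

```python
import logging

# Map the level names used in this guide to Python's numeric levels.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,    # Python's canonical name is WARNING
    "ERROR": logging.ERROR,
    "FATAL": logging.CRITICAL,  # Python's canonical name is CRITICAL
}

logger = logging.getLogger("levels-demo")
logger.setLevel(LEVELS["WARN"])

# With a WARN threshold, DEBUG and INFO are suppressed; WARN and above pass.
enabled = {name: logger.isEnabledFor(number) for name, number in LEVELS.items()}
print(enabled)
```

Keeping the mapping in one place helps every service agree on what each severity means before any threshold is configured.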

Context Propagation

  • Include request context in all log entries within a request lifecycle
  • Propagate trace IDs across service boundaries
  • Add user context (anonymized) for user-initiated actions
  • Include relevant business context for domain events
  • Use MDC (Mapped Diagnostic Context) or equivalent for context management
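
MDC is a Java logging concept; in Python, a rough equivalent can be sketched with `contextvars` and a `logging.Filter`, as below. The header names `X-Trace-Id` and `X-Request-Id`, the `handle_request` function, and the logger name are hypothetical choices for this example.

```python
import contextvars
import io
import logging

# Holds per-request context; each thread or async task sees its own value.
request_context = contextvars.ContextVar("request_context", default=None)


class ContextFilter(logging.Filter):
    """Copy the current request context onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        ctx = request_context.get() or {}
        record.trace_id = ctx.get("trace_id", "-")
        record.request_id = ctx.get("request_id", "-")
        return True


def handle_request(logger: logging.Logger, headers: dict) -> None:
    # Header names are assumptions; use whatever your tracing setup defines.
    token = request_context.set({
        "trace_id": headers.get("X-Trace-Id", "-"),
        "request_id": headers.get("X-Request-Id", "-"),
    })
    try:
        logger.info("request started")
        logger.info("request finished")
    finally:
        request_context.reset(token)  # do not leak context into later requests


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace=%(trace_id)s request=%(request_id)s %(message)s"))
handler.addFilter(ContextFilter())

logger = logging.getLogger("context-demo")
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False
logger.addHandler(handler)

handle_request(logger, {"X-Trace-Id": "t-123", "X-Request-Id": "r-456"})
lines = stream.getvalue().splitlines()
print(lines)
```

The calling code never passes the IDs explicitly; the filter attaches them, which is what keeps context consistent across every entry in the request lifecycle.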

Security and Privacy

  • Never log sensitive information (passwords, tokens, PII)
  • Mask or redact sensitive data when it must be referenced
  • Implement log access controls appropriate to data sensitivity
  • Consider data retention policies and compliance requirements
  • Audit log access for sensitive systems
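
Masking and redaction might look like the following sketch, applied to structured fields before they reach the logger. The key list, the email pattern, and the masking style are illustrative assumptions, not a complete PII policy.

```python
import re

# Field names treated as secrets in this sketch; extend to match your data.
SENSITIVE_KEYS = {"password", "token", "authorization", "ssn"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask_email(match: re.Match) -> str:
    """Keep the first character and the domain; hide the rest."""
    local, _, domain = match.group(0).partition("@")
    return local[0] + "***@" + domain


def redact(fields: dict) -> dict:
    """Return a copy of fields that is safe to log."""
    clean = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)  # recurse into nested structures
        elif isinstance(value, str):
            clean[key] = EMAIL_RE.sub(mask_email, value)
        else:
            clean[key] = value
    return clean


safe = redact({
    "user": "alice@example.com",
    "password": "hunter2",
    "nested": {"token": "abc"},
    "attempts": 3,
})
print(safe)
```

Redacting by key name catches known secrets; the pattern-based pass is a fallback for values that leak into free-text fields.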

Performance Considerations

  • Use asynchronous logging to avoid blocking application threads
  • Implement log sampling for high-volume debug logs in production
  • Buffer log writes, sizing the buffer to balance delivery latency against throughput
  • Monitor logging infrastructure for bottlenecks
  • Set appropriate log levels per environment
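
Asynchronous logging and DEBUG sampling can be sketched with the standard library's `QueueHandler` and `QueueListener`. The `SamplingFilter` class, the 1-in-10 rate, and the queue size below are assumptions for illustration; the counter is not thread-safe and would need a lock in a multi-threaded service.

```python
import io
import logging
import logging.handlers
import queue


class SamplingFilter(logging.Filter):
    """Keep every Nth DEBUG record; always keep INFO and above."""

    def __init__(self, keep_every: int):
        super().__init__()
        self.keep_every = keep_every
        self._seen = 0  # not thread-safe; fine for this single-threaded demo

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno > logging.DEBUG:
            return True
        self._seen += 1
        return (self._seen - 1) % self.keep_every == 0


sink = io.StringIO()
target = logging.StreamHandler(sink)
target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# The application thread only enqueues; a background thread does the I/O.
log_queue = queue.Queue(maxsize=1000)
queue_handler = logging.handlers.QueueHandler(log_queue)
queue_handler.addFilter(SamplingFilter(keep_every=10))
listener = logging.handlers.QueueListener(log_queue, target)

logger = logging.getLogger("async-demo")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False
logger.addHandler(queue_handler)

listener.start()
for i in range(100):
    logger.debug("cache probe %d", i)
logger.info("batch complete")
listener.stop()  # drains the queue before returning

lines = sink.getvalue().splitlines()
print(len(lines))
```

Sampling happens before the record is enqueued, so dropped DEBUG records cost neither queue space nor I/O.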

Log Aggregation

  • Centralize logs from all services into a single platform
  • Use consistent formatting across all services
  • Implement log rotation and retention policies
  • Enable full-text search and filtering capabilities
  • Set up log-based alerts for critical patterns
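
Log platforms usually provide filtering and alerting themselves, but the idea behind a log-based alert can be sketched in a few lines. The functions `parse_lines` and `services_over_threshold`, the sample log lines, and the threshold are hypothetical.

```python
import json
from collections import Counter


def parse_lines(lines):
    """Yield JSON objects from raw log lines, skipping anything unparseable."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict):
            yield entry


def services_over_threshold(entries, level="ERROR", threshold=2):
    """Return services whose count of entries at `level` meets the threshold."""
    counts = Counter(
        entry.get("service", "unknown")
        for entry in entries
        if entry.get("level") == level
    )
    return sorted(name for name, count in counts.items() if count >= threshold)


raw = [
    '{"level": "ERROR", "service": "checkout", "message": "payment failed"}',
    '{"level": "INFO", "service": "checkout", "message": "cart updated"}',
    '{"level": "ERROR", "service": "checkout", "message": "payment failed"}',
    '{"level": "ERROR", "service": "search", "message": "index timeout"}',
    "plain text line that is not JSON",
]
alerts = services_over_threshold(parse_lines(raw))
print(alerts)
```

This only works because every service emits the same field names, which is the practical payoff of consistent formatting.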

Error Logging

  • Include full error context: message, code, stack trace
  • Log the chain of errors in wrapped/nested exceptions
  • Include relevant request and state information
  • Avoid duplicate error logging across layers
  • Log error recovery actions and outcomes
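
The following sketch shows one way to capture an error code, the chain of wrapped exceptions, and the stack trace in a single log call at the outermost layer. `OrderError`, `load_order`, and `error_chain` are hypothetical names invented for this example.

```python
import io
import logging


class OrderError(Exception):
    """Domain error that carries a machine-readable code."""

    def __init__(self, message: str, code: str):
        super().__init__(message)
        self.code = code


def error_chain(exc: BaseException) -> list:
    """Walk wrapped exceptions from outermost to innermost."""
    chain = []
    current = exc
    while current is not None:
        chain.append({"type": type(current).__name__, "message": str(current)})
        if current.__cause__ is not None:
            current = current.__cause__
        elif not current.__suppress_context__:
            current = current.__context__
        else:
            current = None
    return chain


def load_order(order_id: str) -> str:
    try:
        return {"A-1": "widget"}[order_id]
    except KeyError as exc:
        # Wrap without logging here, so the error is logged only once.
        raise OrderError(f"order {order_id} not found", code="ORDER_NOT_FOUND") from exc


stream = io.StringIO()
logger = logging.getLogger("error-demo")
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

try:
    load_order("A-2")
except OrderError as exc:
    chain = error_chain(exc)
    # exc_info=True appends the full traceback, including the wrapped cause.
    logger.error("order lookup failed code=%s chain=%s", exc.code, chain, exc_info=True)

output = stream.getvalue()
print(output)
```

The inner layer raises and the outer layer logs, which avoids the same failure appearing several times at different depths.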

Best Practices

  • Log at service boundaries (entry and exit points)
  • Include timing information for performance analysis
  • Log configuration changes and deployments
  • Create actionable log messages that aid debugging
  • Review and clean up logging regularly to reduce noise
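
Logging at entry and exit points with timing can be sketched as a decorator. The `log_boundary` decorator, the `fetch_profile` function, and the message layout are assumptions for this example.

```python
import functools
import io
import logging
import time


def log_boundary(logger: logging.Logger):
    """Log entry, exit, outcome, and duration of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            logger.info("enter %s", func.__name__)
            outcome = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                outcome = "error"
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.info("exit %s outcome=%s duration_ms=%.1f",
                            func.__name__, outcome, elapsed_ms)
        return wrapper
    return decorator


stream = io.StringIO()
logger = logging.getLogger("boundary-demo")
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)


@log_boundary(logger)
def fetch_profile(user_id: str) -> dict:
    return {"user_id": user_id}


profile = fetch_profile("u-42")
lines = stream.getvalue().splitlines()
print(lines)
```

Putting this on handlers and outbound client calls gives timing data at each boundary without scattering log statements through the business logic.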

Log Message Guidelines

  • Write clear, descriptive messages
  • Include relevant identifiers (user ID, order ID, etc.)
  • Avoid generic messages like "Error occurred"
  • Use consistent terminology across the application
  • Include enough context to understand the event without additional lookups
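
To make the contrast concrete, the sketch below pairs a simple check for generic messages with a helper that builds a descriptive one. The phrase list and both function names are hypothetical; a real check would likely live in code review guidance or a lint rule.

```python
# Phrases that say nothing about what happened (illustrative list).
GENERIC_MESSAGES = {"error occurred", "something went wrong", "failed", "exception"}


def is_too_generic(message: str) -> bool:
    """Flag messages that carry no event-specific information."""
    return message.strip().rstrip(".!").lower() in GENERIC_MESSAGES


def payment_declined(order_id: str, user_id: str, reason: str):
    """Return a descriptive message plus structured fields with identifiers."""
    message = f"payment declined for order {order_id}: {reason}"
    fields = {"order_id": order_id, "user_id": user_id, "reason": reason}
    return message, fields


message, fields = payment_declined("A-1001", "u-42", "card expired")
print(is_too_generic("Error occurred."), is_too_generic(message))
print(message, fields)
```

The second message names the event, the affected order, and the cause, so a reader does not need another lookup to understand it.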

Environment-Specific Configuration

  • Development: DEBUG level, console output, verbose formatting
  • Staging: INFO level, structured JSON, full context
  • Production: INFO/WARN level, structured JSON, sampling for DEBUG
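
One possible way to express these per-environment settings is a single function that produces a `logging.config.dictConfig` dictionary. The `ENVIRONMENTS` table, `build_config`, and the minimal `JsonFormatter` are assumptions for this sketch; DEBUG sampling for production is omitted here.

```python
import json
import logging
import logging.config


class JsonFormatter(logging.Formatter):
    """Minimal JSON formatter used by the staging and production profiles."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        })


# Level and formatter per environment, following the list above.
ENVIRONMENTS = {
    "development": {"level": "DEBUG", "formatter": "verbose"},
    "staging": {"level": "INFO", "formatter": "json"},
    "production": {"level": "INFO", "formatter": "json"},
}


def build_config(env: str) -> dict:
    settings = ENVIRONMENTS[env]
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "verbose": {
                "format": "%(asctime)s %(levelname)-8s %(name)s %(message)s",
            },
            # "()" tells dictConfig to call this factory to build the formatter.
            "json": {"()": JsonFormatter},
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": settings["formatter"],
            },
        },
        "root": {"level": settings["level"], "handlers": ["console"]},
    }


logging.config.dictConfig(build_config("staging"))
logging.getLogger("env-demo").info("configured for staging")
```

In practice the environment name would typically come from deployment configuration, such as an environment variable, rather than being hard-coded.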