183-observability-tracing-opentelemetry

Java Distributed Tracing with OpenTelemetry

Implement robust distributed tracing in Java with OpenTelemetry by modeling meaningful spans, preserving context propagation, and instrumenting critical business and infrastructure paths with low-overhead, privacy-safe telemetry.
What is covered in this Skill?
  • OpenTelemetry tracing fundamentals for Java services
  • Span design: boundaries, parent/child relationships, and operation naming
  • Context propagation across HTTP, messaging, async tasks, and thread boundaries
  • Semantic conventions and stable attribute naming
  • Error/status/event recording best practices
  • Sampling strategy and performance/cost trade-offs
  • Privacy and security controls for trace attributes
  • Testing and verification of trace propagation and span correctness
Scope: Distributed tracing quality in application and integration layers, focused on diagnosability, consistency, and operational safety.
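As an illustration of the HTTP propagation item above: trace context is commonly carried between services in the W3C Trace Context traceparent header, which has the shape version-traceid-parentid-flags. The following is a minimal, stdlib-only sketch of parsing that header and building the value for an outgoing call. The class TraceParent and its methods are hypothetical names for this example; in a real service the OpenTelemetry SDK's propagators do this work, so this only shows what travels on the wire.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stdlib-only model of a W3C traceparent header value (version 00 only). */
final class TraceParent {

    // 00-<32 lowercase hex trace id>-<16 lowercase hex parent span id>-<2 hex flags>
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String parentSpanId;
    final boolean sampled;

    TraceParent(String traceId, String parentSpanId, boolean sampled) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    /** Returns empty when the header is missing or malformed, so the caller can start a new trace. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String traceId = m.group(1);
        String spanId = m.group(2);
        // All-zero identifiers are invalid in the W3C Trace Context format.
        if (isAllZeros(traceId) || isAllZeros(spanId)) {
            return Optional.empty();
        }
        // The lowest flag bit marks the trace as sampled.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return Optional.of(new TraceParent(traceId, spanId, sampled));
    }

    /**
     * Header value for an outgoing call: the trace ID is kept, and the caller's
     * own span ID becomes the parent. Only the sampled flag is carried over here.
     */
    String forChild(String currentSpanId) {
        return "00-" + traceId + "-" + currentSpanId + "-" + (sampled ? "01" : "00");
    }

    private static boolean isAllZeros(String hex) {
        return hex.chars().allMatch(c -> c == '0');
    }
}
```

A propagation test can then assert that the trace ID seen by a downstream service equals the one parsed upstream, while the parent span ID changes at each hop.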

Constraints

Tracing instrumentation must preserve context correctly and must not leak sensitive data. Over-instrumentation and high-cardinality attributes increase cost and degrade signal quality.
  • PROPAGATION FIRST: Ensure context propagation across all sync/async boundaries before adding extra span detail
  • NO SENSITIVE DATA: Never store secrets, credentials, tokens, raw payloads, or PII in span attributes/events
  • LOW CARDINALITY ATTRIBUTES: Avoid unbounded values in attributes that are used for aggregation/search
  • VERIFY: Run ./mvnw clean verify (or mvn clean verify if the project has no Maven wrapper) after applying tracing changes
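The NO SENSITIVE DATA and LOW CARDINALITY ATTRIBUTES constraints can be enforced in one place, before values ever reach a span. The sketch below is stdlib-only and makes several assumptions: the class SpanAttributeSanitizer, its deny-list of key fragments, and the {id} placeholder are all hypothetical choices for illustration, not part of OpenTelemetry. It drops attributes whose keys look sensitive and collapses identifier-like path segments into a route template.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper that filters candidate span attributes before they are recorded. */
final class SpanAttributeSanitizer {

    // Assumed deny-list of key fragments; adapt it to your own data classification.
    private static final Set<String> SENSITIVE_FRAGMENTS =
            Set.of("password", "secret", "token", "authorization", "cookie", "email");

    // Path segments that are purely numeric or UUID-shaped are treated as unbounded identifiers.
    private static final Pattern ID_SEGMENT = Pattern.compile(
            "^(\\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$");

    private SpanAttributeSanitizer() {}

    /** Returns a copy of the input without entries whose key contains a sensitive fragment. */
    static Map<String, String> dropSensitive(Map<String, String> candidates) {
        Map<String, String> safe = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : candidates.entrySet()) {
            String key = entry.getKey().toLowerCase(Locale.ROOT);
            boolean sensitive = SENSITIVE_FRAGMENTS.stream().anyMatch(key::contains);
            if (!sensitive) {
                safe.put(entry.getKey(), entry.getValue());
            }
        }
        return safe;
    }

    /** Collapses identifier-like path segments so the attribute value stays low-cardinality. */
    static String toRouteTemplate(String path) {
        String[] segments = path.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (ID_SEGMENT.matcher(segments[i]).matches()) {
                segments[i] = "{id}";
            }
        }
        return String.join("/", segments);
    }
}
```

With this sketch, toRouteTemplate("/orders/12345/items/9") yields "/orders/{id}/items/{id}", a value that is safe to aggregate on. A key-based deny-list only catches what it names, so it complements review of what is put into attributes rather than replacing it.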

When to use this skill

  • Improve tracing
  • Apply OpenTelemetry tracing
  • Add distributed tracing
  • Refactor tracing instrumentation

Workflow

  1. Define trace model and critical flows
Identify high-value request and async flows, define operation boundaries, and choose span names/attributes aligned with semantic conventions.
  2. Instrument and propagate context
Add OpenTelemetry spans to key boundaries and ensure trace context is propagated across HTTP clients/servers, messaging, and executor-based async work.
  3. Harden span data and sampling
Record status/errors/events consistently, remove sensitive data, control attribute cardinality, and configure sampling/exporters according to environment needs.
  4. Validate traces end-to-end
Verify parent-child relationships, propagation continuity, and backend visibility through tests and runtime checks.
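The executor-based part of the propagation step is where context is most often lost, because thread-bound context does not follow a task onto a pooled worker thread. The sketch below shows the underlying mechanism with the standard library only: capture the context on the submitting thread, restore it around the task, then put the worker's previous value back. TraceContextHolder and its methods are hypothetical names for this example. The OpenTelemetry Java context API provides wrapping helpers for this purpose (for example Context.current().wrap(...)); check the API of the version in use before relying on a specific signature.

```java
import java.util.concurrent.Callable;

/** Hypothetical thread-bound holder for the current trace ID, used only to show the mechanism. */
final class TraceContextHolder {

    private static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();

    private TraceContextHolder() {}

    static String current() {
        return CURRENT_TRACE_ID.get();
    }

    static void set(String traceId) {
        CURRENT_TRACE_ID.set(traceId);
    }

    static void clear() {
        CURRENT_TRACE_ID.remove();
    }

    /** Captures the submitting thread's trace ID and restores it around the task. */
    static <T> Callable<T> wrap(Callable<T> task) {
        // Read here, on the thread that submits the task.
        String captured = CURRENT_TRACE_ID.get();
        return () -> {
            // Runs on the worker thread: remember what it had before.
            String previous = CURRENT_TRACE_ID.get();
            CURRENT_TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the old value so pooled threads do not leak context between tasks.
                if (previous == null) {
                    CURRENT_TRACE_ID.remove();
                } else {
                    CURRENT_TRACE_ID.set(previous);
                }
            }
        };
    }
}
```

A task submitted to an ExecutorService without wrap(...) sees no trace ID on the worker thread, while the same task submitted through wrap(...) sees the submitter's ID. That difference is what an end-to-end validation test should assert for each async boundary.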

Reference

For detailed guidance, examples, and constraints, see references/183-observability-tracing-opentelemetry.md.