iii-observability
Observability
Comparable to: Datadog, Grafana, Honeycomb, Jaeger
Key Concepts
Use the concepts below when they fit the task. Not every worker needs custom spans or metrics.
- Built-in OpenTelemetry support across all SDKs — every function invocation is automatically traced
- The engine exports traces, metrics, and logs via OTLP to any compatible collector
- Workers propagate W3C trace context automatically across function invocations
- Prometheus metrics are exposed on port 9464
- `registerWorker()` with `otel` config enables telemetry per worker
- Custom spans via `withSpan(name, opts, fn)` wrap async work with trace context
- Custom metrics via `getMeter()` create counters and histograms
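The W3C trace context mentioned above travels in a `traceparent` header with four dash-separated fields: version, trace ID, parent span ID, and trace flags. The sketch below parses and formats that header by hand to show what the workers propagate. The helper names `parseTraceparent` and `formatTraceparent` are illustrative and are not part of the iii SDK, which handles propagation automatically.

```javascript
// W3C Trace Context header: version "-" trace-id "-" parent-id "-" trace-flags
// (2, 32, 16, and 2 lowercase hex characters respectively).
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Illustrative helper (not an iii SDK function): returns null for invalid input.
function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // The spec treats version "ff" and all-zero IDs as invalid.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // Bit 0 of trace-flags is the "sampled" flag.
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Illustrative helper: always emits a version-00 header.
function formatTraceparent({ traceId, parentId, sampled }) {
  return `00-${traceId}-${parentId}-${sampled ? '01' : '00'}`;
}

const example = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01';
console.log(parseTraceparent(example));
```

The `traceId` field is the value that ties spans from different workers into a single trace.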
Architecture
The worker SDK generates spans, metrics, and logs during function execution. These flow to the engine, which exports them via OTLP to a collector (Jaeger, Grafana, Datadog). The engine also exposes a Prometheus endpoint on port 9464 for scraping.
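To check what the Prometheus endpoint exposes, a small script can scrape it and parse the text exposition format. The sketch below makes several assumptions not stated in the source: the `/metrics` path, the `localhost` host, the metric name in the sample, and that samples carry no trailing timestamp. It relies on the global `fetch` available in Node 18 and later.

```javascript
// Parse Prometheus text exposition format into { "name{labels}": value }.
// Assumes samples have no trailing timestamp; non-numeric values are skipped.
function parsePrometheusText(text) {
  const samples = {};
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip HELP/TYPE comments
    const splitAt = line.lastIndexOf(' ');
    if (splitAt === -1) continue;
    const value = Number(line.slice(splitAt + 1));
    if (!Number.isNaN(value)) samples[line.slice(0, splitAt)] = value;
  }
  return samples;
}

// Hypothetical URL: port 9464 comes from the text above, the path is assumed.
async function scrapeMetrics(url = 'http://localhost:9464/metrics') {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`scrape failed: HTTP ${response.status}`);
  return parsePrometheusText(await response.text());
}

// Offline demonstration with a made-up metric.
const sample = [
  '# HELP orders_processed_total Orders processed (hypothetical metric)',
  '# TYPE orders_processed_total counter',
  'orders_processed_total{service="my-svc"} 42',
].join('\n');
console.log(parsePrometheusText(sample));
```

In production a Prometheus server would scrape the endpoint directly; a script like this is only useful for a quick local check.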
iii Primitives Used
| Primitive | Purpose |
|---|---|
| `registerWorker()` | Connect worker with telemetry config |
| `withSpan(name, opts, fn)` | Create a custom trace span |
|  | Access OpenTelemetry Tracer directly |
| `getMeter()` | Access OpenTelemetry Meter for custom metrics |
| `currentTraceId()` | Get active trace ID for correlation |
| `injectTraceparent()` | Inject W3C trace context into outbound calls |
| `onLog()` | Subscribe to log events |
|  | Graceful shutdown of telemetry pipeline |
Reference Implementation
See ../references/observability.js for the full working example: a worker with custom spans, metrics counters, trace propagation, and log subscriptions connected to an OTel collector.
Also available in Python: ../references/observability.py
Also available in Rust: ../references/observability.rs
Common Patterns
Code using this pattern commonly includes, when relevant:
- `registerWorker('ws://localhost:49134', { otel: { enabled: true, serviceName: 'my-svc' } })`: enable telemetry
- `withSpan('validate-order', {}, async (span) => { span.setAttribute('order.id', id); ... })`: custom span
- `getMeter().createCounter('orders.processed')`: custom counter metric
- `getMeter().createHistogram('request.duration')`: custom histogram metric
- `onLog((log) => { ... }, { level: 'warn' })`: subscribe to warnings and above
- `currentTraceId()`: get active trace ID for correlation with external systems
- `injectTraceparent()`: propagate trace context to outbound HTTP calls
- Disable telemetry: `registerWorker(url, { otel: { enabled: false } })` or `OTEL_ENABLED=false`
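A minimal sketch of how the span and metric patterns above might combine in one handler. To stay runnable without the SDK, the handler receives `withSpan` and `getMeter` as arguments; in a real worker these would be the SDK's own functions, whose import path the source does not show. `makeOrderHandler`, `makeFakeTelemetry`, and the order shape are hypothetical. The `add` and `record` calls assume `getMeter()` returns a standard OpenTelemetry Meter.

```javascript
// Build a handler from injected telemetry helpers (hypothetical wiring).
function makeOrderHandler({ withSpan, getMeter }) {
  const meter = getMeter();
  const processed = meter.createCounter('orders.processed');
  const duration = meter.createHistogram('request.duration');

  return async function handleOrder(order) {
    const started = Date.now();
    // Wrap the validation step in a custom span and tag it with the order ID.
    return withSpan('validate-order', {}, async (span) => {
      span.setAttribute('order.id', order.id);
      if (!Array.isArray(order.items) || order.items.length === 0) {
        throw new Error('order has no items');
      }
      processed.add(1); // count successful validations only
      duration.record(Date.now() - started); // elapsed milliseconds
      return { id: order.id, valid: true };
    });
  };
}

// In-memory stand-ins for local experimentation; NOT the iii SDK.
function makeFakeTelemetry() {
  const recorded = { spans: [], counters: {}, histograms: {} };
  return {
    recorded,
    withSpan: async (name, opts, fn) => {
      const attributes = {};
      recorded.spans.push({ name, attributes });
      return fn({ setAttribute: (key, value) => { attributes[key] = value; } });
    },
    getMeter: () => ({
      createCounter: (name) => ({
        add: (n) => { recorded.counters[name] = (recorded.counters[name] || 0) + n; },
      }),
      createHistogram: (name) => ({
        record: (v) => {
          if (!recorded.histograms[name]) recorded.histograms[name] = [];
          recorded.histograms[name].push(v);
        },
      }),
    }),
  };
}
```

Passing the helpers in keeps the handler testable: `makeOrderHandler(makeFakeTelemetry())` exercises the same code path that the real SDK functions would.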
Adapting This Pattern
Use the adaptations below when they apply to the task.
- Enable `otel` in `registerWorker()` config to start collecting traces automatically
- Add custom spans around expensive operations (DB queries, LLM calls, external APIs)
- Create domain-specific metrics (orders processed, payment failures, queue depth)
- Use `currentTraceId()` to correlate iii traces with external system logs
- Configure `OtelModule` in iii-config.yaml for engine-side exporter, sampling ratio, and alerts
- Point the OTLP endpoint at your collector (Jaeger, Grafana Tempo, Datadog Agent)
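One way to approach the log-correlation adaptation is to stamp every structured log record with the active trace ID, so an external log system can be searched by trace. In this sketch `buildLogRecord` and the field names are hypothetical, and `getTraceId` stands in for `currentTraceId()`, which is assumed here to return a string, or a falsy value when no trace is active.

```javascript
// Build a structured log record carrying the active trace ID.
// `getTraceId` is a stand-in for the SDK's currentTraceId() (assumed return
// value: a trace ID string, or a falsy value outside any trace).
function buildLogRecord(getTraceId, level, message, fields = {}) {
  const traceId = getTraceId();
  return {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...fields,
    // Omit the key entirely when no trace is active.
    ...(traceId ? { trace_id: traceId } : {}),
  };
}

const record = buildLogRecord(
  () => '4bf92f3577b34da6a3ce929d0e0e4736', // fixed ID for demonstration
  'warn',
  'payment retry',
  { orderId: 'o-1' },
);
console.log(JSON.stringify(record));
```

Searching the external system for the `trace_id` value then leads back to the matching trace in the collector.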
Engine Configuration
OtelModule must be enabled in iii-config.yaml for engine-side traces, metrics, and logs. See ../references/iii-config.yaml for the full annotated config reference.
Pattern Boundaries
模式边界
- For engine-side OtelModule YAML configuration, prefer `iii-engine-config`.
- For SDK init options and function registration, prefer `iii-functions-and-triggers`.
- Stay with `iii-observability` when the primary problem is SDK-level telemetry: spans, metrics, logs, and trace propagation.
When to Use
- Use this skill when the task is primarily about `iii-observability` in the iii engine.
- Triggers when the request directly asks for this pattern or an equivalent implementation.
Boundaries
- Never use this skill as a generic fallback for unrelated tasks.
- You must not apply this skill when a more specific iii skill is a better fit.
- Always verify environment and safety constraints before applying examples from this skill.