Observability
Comparable to: Datadog, Grafana, Honeycomb, Jaeger
Key Concepts
Use the concepts below when they fit the task. Not every worker needs custom spans or metrics.
- Built-in OpenTelemetry support across all SDKs — every function invocation is automatically traced
- The engine exports traces, metrics, and logs via OTLP to any compatible collector
- Workers propagate W3C trace context automatically across function invocations
- Prometheus metrics are exposed on port 9464
- with config enables telemetry per worker
- Custom spans via wrap async work with trace context
- Custom metrics via create counters and histograms
Architecture
The worker SDK generates spans, metrics, and logs during function execution. These flow to the engine, which exports them via OTLP to a collector (Jaeger, Grafana, Datadog). The engine also exposes a Prometheus endpoint on port 9464 for scraping.
iii Primitives Used
| Primitive | Purpose |
|---|
registerWorker(url, { otel })
| Connect worker with telemetry config |
| Create a custom trace span |
| Access OpenTelemetry Tracer directly |
| Access OpenTelemetry Meter for custom metrics |
| Get active trace ID for correlation |
| Inject W3C trace context into outbound calls |
onLog(callback, { level })
| Subscribe to log events |
| Graceful shutdown of telemetry pipeline |
Reference Implementation
See ../references/observability.js for the full working example — a worker with custom spans,
Also available in Python: ../references/observability.py
Also available in Rust: ../references/observability.rs
metrics counters, trace propagation, and log subscriptions connected to an OTel collector.
Common Patterns
Code using this pattern commonly includes, when relevant:
registerWorker('ws://localhost:49134', { otel: { enabled: true, serviceName: 'my-svc' } })
— enable telemetry
withSpan('validate-order', {}, async (span) => { span.setAttribute('order.id', id); ... })
— custom span
getMeter().createCounter('orders.processed')
— custom counter metric
getMeter().createHistogram('request.duration')
— custom histogram metric
onLog((log) => { ... }, { level: 'warn' })
— subscribe to warnings and above
- — get active trace ID for correlation with external systems
- — propagate trace context to outbound HTTP calls
- Disable telemetry:
registerWorker(url, { otel: { enabled: false } })
or
Adapting This Pattern
Use the adaptations below when they apply to the task.
- Enable in config to start collecting traces automatically
- Add custom spans around expensive operations (DB queries, LLM calls, external APIs)
- Create domain-specific metrics (orders processed, payment failures, queue depth)
- Use to correlate iii traces with external system logs
- Configure in iii-config.yaml for engine-side exporter, sampling ratio, and alerts
- Point the OTLP endpoint at your collector (Jaeger, Grafana Tempo, Datadog Agent)
Engine Configuration
OtelModule must be enabled in iii-config.yaml for engine-side traces, metrics, and logs. See ../references/iii-config.yaml for the full annotated config reference.
Pattern Boundaries
- For engine-side OtelModule YAML configuration, prefer .
- For SDK init options and function registration, prefer
iii-functions-and-triggers
.
- Stay with when the primary problem is SDK-level telemetry: spans, metrics, logs, and trace propagation.
When to Use
- Use this skill when the task is primarily about in the iii engine.
- Triggers when the request directly asks for this pattern or an equivalent implementation.
Boundaries
- Never use this skill as a generic fallback for unrelated tasks.
- You must not apply this skill when a more specific iii skill is a better fit.
- Always verify environment and safety constraints before applying examples from this skill.