Implement OpenTelemetry logs/metrics/traces, SLI/SLO gates, burn-rate alerts, and APM integrations. Use when adding or validating observability.
`npx skill4agent add vasilyu1983/ai-agents-public qa-observability`

`data/sources.json`, `traceparent`, `assets/checklists/template-observability-readiness-checklist.md`, `assets/monitoring/slo/*`

| Task | Recommended default | Notes |
|---|---|---|
| Tracing | OpenTelemetry + Jaeger/Tempo | Prefer OTLP exporters via Collector when possible |
| Metrics | Prometheus + Grafana | Use histograms for latency; watch cardinality |
| Logging | Structured JSON + correlation IDs | Never log secrets/PII; redact aggressively |
| Reliability gates | SLOs + error budgets + burn-rate alerts | Gate releases on sustained burn/regressions |
| Performance | Profiling + load tests + budgets | Add continuous profiling for intermittent issues |
| Zero-code visibility | eBPF (OpenTelemetry zero-code) + continuous profiling (Parca/Pyroscope) | Use when code changes are not feasible |
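The logging row above (structured JSON, correlation IDs, aggressive redaction) can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's own setup: the `SENSITIVE_KEYS` set, the email pattern, and the `correlation_id` and `fields` attribute names are assumptions chosen for the example.

```python
import json
import logging
import re

# Hypothetical list of field names treated as secrets; adjust to your own data.
SENSITIVE_KEYS = {"password", "token", "authorization", "api_key", "secret"}
# Simple email pattern used here as a stand-in for PII detection in messages.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(fields: dict) -> dict:
    """Return a copy of fields with the values of sensitive keys masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object carrying a correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            # Mask email addresses that slipped into the free-text message.
            "message": EMAIL_RE.sub("[REDACTED_EMAIL]", record.getMessage()),
            # Both attributes arrive through the `extra` argument of the log call.
            "correlation_id": getattr(record, "correlation_id", None),
            "fields": redact(getattr(record, "fields", {})),
        }
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str = "app") -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("login ok for %s", email, extra={"correlation_id": "req-123", "fields": {"user_id": 42, "token": "abc"}})` would emit one JSON line with the token and the email address masked. In a traced service the correlation ID would typically be the active trace ID, so logs and spans can be joined.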
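
| If the user needs... | Read | Also use |
|---|---|---|
| A minimal, production-ready baseline | `references/core-observability-patterns.md` | |
| Node/Python instrumentation setup | `references/opentelemetry-best-practices.md` | |
| Working trace propagation across services | `references/distributed-tracing-patterns.md` | |
| SLOs, burn-rate alerts, and release gates | `references/slo-design-guide.md` | |
| Profiling/load testing with evidence | `references/performance-profiling-guide.md` | |
| A maturity model and roadmap | `references/observability-maturity-model.md` | |
| What to avoid and how to fix it | `references/anti-patterns-best-practices.md` | |
| Alert design and fatigue reduction | `references/alerting-strategies.md` | |
| Dashboard hierarchy and layout | `references/dashboard-design-patterns.md` | |
| Structured logging and cost control | `references/log-aggregation-patterns.md` | |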
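For the SLO, burn-rate, and release-gate row, the underlying arithmetic is small enough to sketch directly. Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The example below is an assumption-laden illustration rather than the contents of the skill's alert rules: the `Window` type, the function names, and the default threshold of 14.4 (a commonly cited fast-burn value for a 30-day SLO) are all chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Request counts observed over one lookback window."""
    total: int
    errors: int


def burn_rate(window: Window, slo_target: float) -> float:
    """Observed error ratio divided by the error budget (1 - SLO target)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if window.total == 0:
        return 0.0  # no traffic in the window, so no budget was burned
    error_ratio = window.errors / window.total
    return error_ratio / (1.0 - slo_target)


def should_block_release(
    long_window: Window,
    short_window: Window,
    slo_target: float,
    threshold: float = 14.4,  # example value; tune per SLO period and policy
) -> bool:
    """Block only when both windows burn at or above the threshold.

    Requiring the short window as well keeps the gate from firing on a
    burst that has already ended, which is the "sustained burn" condition.
    """
    return (
        burn_rate(long_window, slo_target) >= threshold
        and burn_rate(short_window, slo_target) >= threshold
    )
```

With a 99.9% target, 200 errors in 10,000 requests is an error ratio of 2% against a 0.1% budget, a burn rate of roughly 20, so `should_block_release(Window(10000, 200), Window(1000, 20), 0.999)` blocks the release. If the short window has recovered to 1 error in 1,000 requests, the gate stays open.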

References:
- `references/core-observability-patterns.md`
- `references/opentelemetry-best-practices.md`
- `references/distributed-tracing-patterns.md`
- `references/slo-design-guide.md`
- `references/performance-profiling-guide.md`
- `references/observability-maturity-model.md`
- `references/anti-patterns-best-practices.md`
- `references/alerting-strategies.md`
- `references/dashboard-design-patterns.md`
- `references/log-aggregation-patterns.md`

Assets:
- `assets/checklists/template-observability-readiness-checklist.md`
- `assets/opentelemetry/nodejs/opentelemetry-nodejs-setup.md`
- `assets/opentelemetry/python/opentelemetry-python-setup.md`
- `assets/monitoring/slo/slo-definition.yaml`
- `assets/monitoring/slo/prometheus-alert-rules.yaml`
- `assets/monitoring/grafana/grafana-dashboard-slo.json`
- `assets/monitoring/grafana/template-grafana-dashboard-observability.json`
- `assets/load-testing/load-testing-k6.js`
- `assets/load-testing/template-load-test-artillery.yaml`
- `assets/performance/frontend/template-lighthouse-ci.json`
- `assets/performance/backend/template-nodejs-profiling-config.js`

Data:
- `data/sources.json`

Related skills:
- `../ops-devops-platform/SKILL.md`
- `../data-sql-optimization/SKILL.md`
- `../qa-debugging/SKILL.md`
- `../qa-testing-strategy/SKILL.md`
- `../qa-resilience/SKILL.md`
- `../software-architecture-design/SKILL.md`