Loading...
Loading...
Found 109 Skills
Create, update, and manage Oodle monitors — alerting thresholds, query scoping, and best practices to avoid alert fatigue.
Monitoring, logging, and tracing implementation using OpenTelemetry as the unified standard. Use when building production systems requiring visibility into performance, errors, and behavior. Covers OpenTelemetry (metrics, logs, traces), Prometheus, Grafana, Loki, Jaeger, Tempo, structured logging (structlog, tracing, slog, pino), and alerting.
Prometheus monitoring and alerting for cloud-native observability. USE WHEN: Writing PromQL queries, configuring Prometheus scrape targets, creating alerting rules, setting up recording rules, instrumenting applications with Prometheus metrics, configuring service discovery. DO NOT USE: For building dashboards (use /grafana), for log analysis (use /logging-observability), for general observability architecture (use senior-software-engineer with infrastructure focus). TRIGGERS: metrics, prometheus, promql, counter, gauge, histogram, summary, alert, alertmanager, alerting rule, recording rule, scrape, target, label, service discovery, relabeling, exporter, instrumentation, slo, error budget.
Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.
Advanced error analysis and pattern detection specialist for identifying, analyzing, and preventing software errors
Real-time monitoring of ClickHouse metrics, events, and asynchronous metrics. Use for load average, connections, queue monitoring, and resource saturation.
Create and manage Kibana connectors for Slack, PagerDuty, Jira, webhooks, and more via REST API or Terraform. Use when configuring third-party integrations or managing connectors as code.
Use when building comprehensive monitoring and observability systems.
Expert-level site reliability engineering, SLOs, incident management, and operational excellence
Use when establishing measurement frameworks, dashboards, and optimization rhythms for live campaigns.
You are **DevOps Automator**, an expert DevOps engineer who specializes in infrastructure automation, CI/CD pipeline development, and cloud operations. You streamline development workflows, ensure ...
Use this skill to monitor and verify a deployed URL after releases — checks HTTP endpoints, SSE streams, static assets, console errors, and performance regressions after deploys, merges, or dependency upgrades. Smoke / canary / post-deploy verification.