Found 112 Skills
CI/CD pipeline design, containerization, and infrastructure management. Handles Docker, Kubernetes, monitoring setup (Prometheus/Grafana), and infrastructure-as-code (Terraform/Pulumi).
Expert knowledge for Azure Monitor development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure Monitor applications. Not for Azure Managed Grafana (use azure-managed-grafana), Azure Network Watcher (use azure-network-watcher), Azure Service Health (use azure-service-health), Azure Defender For Cloud (use azure-defender-for-cloud).
Use this skill when implementing logging, metrics, distributed tracing, alerting, or defining SLOs. Triggers on structured logging, Prometheus, Grafana, OpenTelemetry, Datadog, distributed tracing, error tracking, dashboards, alert fatigue, SLIs, SLOs, error budgets, and any task requiring system observability or monitoring setup.
Export cost-tracking telemetry in Prometheus textfile or webhook JSON format for external observability tools (Grafana, Datadog, custom dashboards).
Use when operating production Kubernetes: Helm, autoscaling (HPA/VPA), resource management, StatefulSets, external-secrets, observability (Prometheus/Grafana/Loki), RBAC, Pod Security Standards, NetworkPolicies, admission control, backup (Velero), and cost control.
Guide for implementing HolmesGPT, an AI agent for troubleshooting cloud-native environments. Use when investigating Kubernetes issues, analyzing alerts from Prometheus/Alertmanager/PagerDuty, performing root cause analysis, configuring HolmesGPT installations (CLI/Helm/Docker), setting up AI providers (OpenAI/Anthropic/Azure), creating custom toolsets, or integrating with observability platforms (Grafana, Loki, Tempo, Datadog).
Prometheus monitoring and alerting for cloud-native observability. USE WHEN: Writing PromQL queries, configuring Prometheus scrape targets, creating alerting rules, setting up recording rules, instrumenting applications with Prometheus metrics, configuring service discovery. DO NOT USE: For building dashboards (use /grafana), for log analysis (use /logging-observability), for general observability architecture (use senior-software-engineer with infrastructure focus). TRIGGERS: metrics, prometheus, promql, counter, gauge, histogram, summary, alert, alertmanager, alerting rule, recording rule, scrape, target, label, service discovery, relabeling, exporter, instrumentation, slo, error budget.
Guided Perses dashboard creation: gather requirements (metrics, datasource, layout), generate CUE definition or JSON spec, validate with percli lint, deploy with percli apply or MCP perses_create_dashboard. Use when user wants to create a new Perses dashboard, build a monitoring dashboard, or generate dashboard definitions. Use for "create perses dashboard", "new dashboard", "perses new dashboard", "build dashboard". Do NOT use for Grafana migration (use perses-grafana-migrate) or plugin development (use perses-plugin-create).
Azure AD OAuth2/OIDC SSO integration for Kubernetes applications. Use when implementing Single Sign-On, configuring Azure AD App Registrations, restricting access by groups, or integrating tools (DefectDojo, Grafana, ArgoCD, Harbor, SonarQube) with Azure AD authentication.
Monitoring, logging, and tracing implementation using OpenTelemetry as the unified standard. Use when building production systems requiring visibility into performance, errors, and behavior. Covers OpenTelemetry (metrics, logs, traces), Prometheus, Grafana, Loki, Jaeger, Tempo, structured logging (structlog, tracing, slog, pino), and alerting.
Use this skill when working on infrastructure, DevOps, CI/CD, Kubernetes, cloud deployment, observability, or cost optimization. Activates on mentions of Kubernetes, Docker, Terraform, Pulumi, OpenTofu, GitOps, Argo CD, Flux, CI/CD, GitHub Actions, observability, OpenTelemetry, Prometheus, Grafana, AWS, GCP, Azure, infrastructure as code, platform engineering, FinOps, or cloud costs.
Use this skill when working with SigNoz - open-source observability platform for application monitoring, distributed tracing, log management, metrics, alerts, and dashboards. Triggers on SigNoz setup, OpenTelemetry instrumentation for SigNoz, sending traces/logs/metrics to SigNoz, creating SigNoz dashboards, configuring SigNoz alerts, exception monitoring, and migrating from Datadog/Grafana/New Relic to SigNoz.