Found 17 Skills
Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.
Set up comprehensive infrastructure monitoring with Prometheus, Grafana, and alerting systems for metrics, health checks, and performance tracking.
Author monitoring resources: PrometheusRules, ServiceMonitors, PodMonitors, AlertmanagerConfig, Silence CRs, and canary-checker health checks. Use when: (1) Creating or modifying alert rules (PrometheusRule), (2) Adding scrape targets (ServiceMonitor/PodMonitor), (3) Configuring Alertmanager routing or silences, (4) Writing canary-checker health checks, (5) Creating recording rules, (6) Adding monitoring for a new application or platform component. Triggers: "create alert", "add alerting", "PrometheusRule", "ServiceMonitor", "PodMonitor", "AlertmanagerConfig", "silence alert", "canary check", "recording rule", "add monitoring", "scrape target", "alert rule", "prometheus rule", "health check canary"
Expert knowledge for Azure Monitor development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure Monitor applications. Not for Azure Managed Grafana (use azure-managed-grafana), Azure Network Watcher (use azure-network-watcher), Azure Service Health (use azure-service-health), Azure Defender For Cloud (use azure-defender-for-cloud).
Write, validate, and optimise PromQL queries for Prometheus and Grafana Cloud Metrics. Use when the user asks to query metrics, write a PromQL expression, calculate rates, aggregate across labels, build histogram quantiles, create recording rules, debug query performance, or understand metric cardinality. Triggers on phrases like "PromQL", "Prometheus query", "write a metric query", "calculate rate", "histogram_quantile", "recording rule", "metric cardinality", "sum by", "rate vs irate", "absent()", or "query is slow".
Monitoring and alerting
Monitoring and observability with OpenTelemetry, Prometheus, Grafana dashboards, and structured logging
Set up comprehensive observability for Groq integrations with metrics, traces, and alerts. Use when implementing monitoring for Groq operations, setting up dashboards, or configuring alerting for Groq integration health. Trigger with phrases like "groq monitoring", "groq metrics", "groq observability", "monitor groq", "groq alerts", "groq tracing".
Prometheus and Grafana Cloud Metrics overview including PromQL query language, Metrics Drilldown, alerting, recording rules, and integration patterns. Use when working with Prometheus, writing PromQL queries, configuring alerting, or discussing metrics architecture and best practices.
Help me troubleshoot service issues based on Prometheus metrics
Prometheus/Grafana metrics analysis and PromQL queries. Use when investigating latency, error rates, resource usage, or any time-series metrics.
Use this skill when implementing logging, metrics, distributed tracing, alerting, or defining SLOs. Triggers on structured logging, Prometheus, Grafana, OpenTelemetry, Datadog, distributed tracing, error tracking, dashboards, alert fatigue, SLIs, SLOs, error budgets, and any task requiring system observability or monitoring setup.