Search Results: alerting

Found 109 Skills

DevOps & Cloud Services | akin-ozer/cc-devops-skill...

promql-validator

Validate, lint, audit, or fix PromQL queries and alerting rules; detects anti-patterns.

DevOps & Cloud Services | control-theory/dstl8

dstl8

Set up and use Dstl8 for observability. Triggers: install or configure Dstl8 (CLI, sources, MCP); incident triage and investigation; root cause analysis; checking whether a deploy fixed an issue; alerting on recurring patterns; cross-environment correlation; pre-coding context on past incidents and recent issues.

DevOps & Cloud Services | mjunaidca/mjs-agent-skill...

operating-production-services

SRE patterns for production service reliability: SLOs, error budgets, postmortems, and incident response. Use when defining reliability targets, writing postmortems, implementing SLO alerting, or establishing on-call practices. NOT for initial service development (use scaffolding skills instead).

1 script | Checked

DevOps & Cloud Services | g1joshi/agent-skills

prometheus

Prometheus monitoring and alerting with PromQL. Use for metrics collection.

Automation | guia-matthieu/clawfu-skil...

competitor-monitor

Monitor competitor websites for changes. Use when tracking competitor pricing changes, monitoring new features, watching for content updates, alerting on website changes, or gathering competitive intelligence.

1 script | Attention

DevOps & Cloud Services | rightnow-ai/openfang

prometheus

Prometheus monitoring expert for PromQL, alerting rules, Grafana dashboards, and observability.

Tools & Utilities | yonatangross/orchestkit

web-research-workflow

Unified decision tree for web research and competitive monitoring. Auto-selects WebFetch, Tavily, or agent-browser based on target site characteristics and available API keys. Includes competitor page tracking, snapshot diffing, and change alerting. Use when researching web content, scraping, extracting raw markdown, capturing documentation, or monitoring competitor changes.

Data Processing | finsilabs/awesome-ecommer...

financial-analytics-dashboard

Build interactive financial KPI dashboards with customizable metrics, drill-down analysis, variance explanations, and automated threshold-based alerting.

AI & Machine Learning | pproenca/dot-skills

marketplace-search-recsys-planning

Use this skill whenever planning, designing, reviewing, or improving search and recommendation systems for a two-sided trust marketplace built on OpenSearch — covers user-intent framing, product-surface architecture, index design, query understanding, retrieval strategy, ranking, search-plus-recs blending, measurement, and a dashboard-and-alerting layer for ongoing decision making. Triggers on tasks involving marketplace search, homefeeds, ranking, relevance tuning, OpenSearch query DSL, analyzers, synonyms, golden sets, NDCG, A/B testing, or diagnosing an existing retrieval system. Use this skill BEFORE marketplace-personalisation when planning new work; hand off when the diagnosed bottleneck is personalisation-specific.

Code Quality | erichowens/some_claude_sk...

error-handling-patterns

Design error handling strategies for TypeScript and Python applications — exception hierarchies, Result/Either types, retry patterns, error boundaries, and structured error logging. Use when designing error handling architecture, choosing between exceptions and Result types, implementing retry logic, or building error recovery flows. Activate on "error handling", "exception hierarchy", "Result type", "retry pattern", "circuit breaker", "error boundary", "Pokemon exception". NOT for debugging specific runtime errors, logging infrastructure setup, or monitoring/alerting configuration.

DevOps & Cloud Services | daemon-blockint-tech/agen...

site-reliability-engineer

Guides Site Reliability Engineering—SLI/SLO and error budgets, reliability dashboards and burn-rate alerting, production readiness reviews, capacity planning for availability, toil reduction, dependency and failure-mode analysis, release reliability (canaries, rollback criteria), and service-owner incident mitigation tied to customer impact. Use when defining or operating SLOs, measuring error budget burn, improving service reliability, running PRRs before launch, planning scalable resilient capacity, or leading technical mitigation during outages—not for CI/CD pipeline implementation (devops), incident program and paging policy design (incident-management-engineer), cloud access and patch tickets (cloud-system-administrator), load-test profiling (performance-engineer), rollout cutover strategy (deployment-strategist), or greenfield cloud build-out (cloud-engineer).

DevOps & Cloud Services | harness/harness-skills

ai-operations

Configure Harness AI-powered operations (AIDA) via MCP. Set up predictive failure analysis with ML models for memory leaks, disk exhaustion, connection pool saturation, and latency degradation. Configure intelligent alert correlation and noise reduction to reduce alert volume. Use when asked to set up predictive failure analysis, configure AI-powered alerting, reduce alert noise, or enable ML-based anomaly detection. Do NOT use for pipeline debugging (use debug-pipeline instead) or SLO management (use manage-slos instead). Trigger phrases: AIDA, predictive failure, alert correlation, noise reduction, anomaly detection, AI ops, predictive analysis, alert fatigue, ML alerting, intelligent alerting.
