Search Results: observability

Found 356 Skills

DevOps & Cloud Services | sickn33/antigravity-aweso...

devops-troubleshooter

Expert DevOps troubleshooter specializing in rapid incident response, advanced debugging, and modern observability. Masters log analysis, distributed tracing, Kubernetes debugging, performance optimization, and root cause analysis. Handles production outages, system reliability, and preventive monitoring. Use PROACTIVELY for debugging, incident response, or system troubleshooting.

DevOps & Cloud Services | samber/cc-skills

promql-cli

CLI for querying Prometheus and PromQL-compatible engines (Thanos, Cortex, VictoriaMetrics, Grafana Mimir, Grafana Tempo...): instant queries, range queries, metric discovery (metrics/labels/meta subcommands), and output formats (table/csv/json/graph). Apply when executing PromQL queries, troubleshooting performance issues in software that has observability instrumentation, investigating latency, error rates, or saturation, or analyzing time series data.

DevOps & Cloud Services | wshobson/agents

grafana-dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

DevOps & Cloud Services | zhanghandong/rust-skills

domain-cloud-native

Use when building cloud-native apps. Keywords: kubernetes, k8s, docker, container, grpc, tonic, microservice, service mesh, observability, tracing, metrics, health check, cloud, deployment, 云原生 (cloud native), 微服务 (microservices), 容器 (containers)

DevOps & Cloud Services | aj-geddes/useful-ai-promp...

prometheus-monitoring

Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.

AI & Machine Learning | portkey-ai/skills

portkey-python-sdk

Complete reference for the Portkey AI Gateway Python SDK with unified API access to 200+ LLMs, automatic fallbacks, caching, and full observability. Use when building Python applications that need LLM integration with production-grade reliability.

Backend Development | mindrally/skills

microservices

Guidelines for building production-grade microservices with FastAPI/Python and Go, covering serverless patterns, clean architecture, observability, and resilience.

DevOps & Cloud Services | dash0hq/agent-skills

otel-instrumentation

Expert guidance for emitting high-quality, cost-efficient OpenTelemetry telemetry. Use when instrumenting applications with traces, metrics, or logs. Triggers on requests for observability, telemetry, tracing, metrics collection, logging integration, or OTel setup.

Backend Development | vasilyu1983/ai-agents-pub...

software-backend

Production-grade backend service development across Node.js (Express/Fastify/NestJS/Hono), Bun, Python (FastAPI), Go, and Rust (Axum), with PostgreSQL and common ORMs (Prisma/Drizzle/SQLAlchemy/GORM/SeaORM). Use for REST/GraphQL/tRPC APIs, auth (OIDC/OAuth), caching, background jobs, observability (OpenTelemetry), testing, deployment readiness, and zero-trust defaults.

AI & Machine Learning | sickn33/antigravity-aweso...

incident-response-smart-fix

[Extended thinking: This workflow implements a sophisticated debugging and resolution pipeline that leverages AI-assisted debugging tools and observability platforms to systematically diagnose and res

DevOps & Cloud Services | secondsky/claude-skills

workers-observability

Cloudflare Workers observability with logging, Analytics Engine, Tail Workers, metrics, and alerting. Use for monitoring, debugging, and tracing, or when encountering log parsing, metric aggregation, or alert configuration errors.

Backend Development | ahgraber/skills

python-runtime-operations

Use when building or reviewing service, job, or CLI runtime behavior in Python: designing startup validation, shutdown sequences, observability, and structured logging. Also use when startup crashes from late config, shutdown leaves orphaned processes, terminal states are implicit, or logs lack structure.
