Loading...
Loading...
Found 389 Skills
Use when building cloud-native apps. Keywords: kubernetes, k8s, docker, container, grpc, tonic, microservice, service mesh, observability, tracing, metrics, health check, cloud, deployment, 云原生, 微服务, 容器
Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, building agents, or setting up LLM observability.
Production-grade backend service development across Node.js (Express/Fastify/NestJS/Hono), Bun, Python (FastAPI), Go, and Rust (Axum), with PostgreSQL and common ORMs (Prisma/Drizzle/SQLAlchemy/GORM/SeaORM). Use for REST/GraphQL/tRPC APIs, auth (OIDC/OAuth), caching, background jobs, observability (OpenTelemetry), testing, deployment readiness, and zero-trust defaults.
AWS CloudFormation patterns for CloudWatch monitoring, metrics, alarms, dashboards, logs, and observability. Use when creating CloudWatch metrics, alarms, dashboards, log groups, log subscriptions, anomaly detection, synthesized canaries, Application Signals, and implementing template structure with Parameters, Outputs, Mappings, Conditions, cross-stack references, and CloudWatch best practices for monitoring production infrastructure.
Use when building or reviewing service, job, or CLI runtime behavior in Python — designing startup validation, shutdown sequences, observability, and structured logging. Also use when startup crashes from late config, shutdown leaves orphaned processes, terminal states are implicit, or logs lack structure.
DigitalOcean management services for monitoring, uptime checks, and resource organization with Projects. Use when setting up observability, alerts, and operational visibility on DigitalOcean.
Use this skill when designing backend systems, databases, APIs, or services. Triggers on schema design, database migrations, indexing strategies, distributed systems architecture, microservices, caching, message queues, observability setup, logging, metrics, tracing, SLO/SLI definition, performance optimization, query tuning, security hardening, authentication, authorization, API design (REST, GraphQL, gRPC), rate limiting, pagination, and failure handling patterns. Acts as a senior backend engineering advisor for mid-level engineers leveling up.
Complete reference for the Galileo AI platform Python SDK for evaluating, observing, and protecting GenAI applications. Use when building Python applications that need LLM evaluation, production observability, tracing, or runtime guardrails with Galileo.
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
Implement service mesh (Istio, Linkerd) for service-to-service communication, traffic management, security, and observability.
You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.