Loading...
Loading...
Found 65 Skills
Expert performance engineer specializing in modern observability, application optimization, and scalable system performance. Masters OpenTelemetry, distributed tracing, load testing, multi-tier caching, Core Web Vitals, and performance monitoring. Handles end-to-end optimization, real user monitoring, and scalability patterns. Use PROACTIVELY for performance optimization, observability, or scalability challenges.
Production-grade backend service development across Node.js (Express/Fastify/NestJS/Hono), Bun, Python (FastAPI), Go, and Rust (Axum), with PostgreSQL and common ORMs (Prisma/Drizzle/SQLAlchemy/GORM/SeaORM). Use for REST/GraphQL/tRPC APIs, auth (OIDC/OAuth), caching, background jobs, observability (OpenTelemetry), testing, deployment readiness, and zero-trust defaults.
Expert guidance for building production-ready FastAPI applications with modular architecture where each business domain is an independent module with own routes, models, schemas, services, cache, and migrations. Uses UV + pyproject.toml for modern Python dependency management, project name subdirectory for clean workspace organization, structlog (JSON+colored logging), pydantic-settings configuration, auto-discovery module loader, async SQLAlchemy with PostgreSQL, per-module Alembic migrations, Redis/memory cache with module-specific namespaces, central httpx client, OpenTelemetry/Prometheus observability, conversation ID tracking (X-Conversation-ID header+cookie), conditional Keycloak/app-based RBAC authentication, DDD/clean code principles, and automation scripts for rapid module development. Use when user requests FastAPI project setup, modular architecture, independent module development, microservice architecture, async database operations, caching strategies, logging patterns, configuration management, authentication systems, observability implementation, or enterprise Python web services. Supports max 3-4 route nesting depth, cache invalidation patterns, inter-module communication via service layer, and comprehensive error handling workflows.
Use this skill when working on infrastructure, DevOps, CI/CD, Kubernetes, cloud deployment, observability, or cost optimization. Activates on mentions of Kubernetes, Docker, Terraform, Pulumi, OpenTofu, GitOps, Argo CD, Flux, CI/CD, GitHub Actions, observability, OpenTelemetry, Prometheus, Grafana, AWS, GCP, Azure, infrastructure as code, platform engineering, FinOps, or cloud costs.
Use when adding logging to services, setting up monitoring, creating alerts, debugging production issues, designing SLIs/SLOs, or implementing structured logging (Pino, Winston), metrics (Prometheus, DataDog, CloudWatch), or distributed tracing (OpenTelemetry).
Guide for implementing Grafana Tempo - a high-scale distributed tracing backend for OpenTelemetry traces. Use when configuring Tempo deployments, setting up storage backends (S3, Azure Blob, GCS), writing TraceQL queries, deploying via Helm, understanding trace structure, or troubleshooting Tempo issues on Kubernetes.
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
Implement distributed tracing with correlation IDs, trace propagation, and span tracking across microservices. Use when debugging distributed systems, monitoring request flows, or implementing observability.
Production observability with structured logging, metrics collection, distributed tracing, and alerting
Migrate a Java application from the classic Elastic APM Java agent to the EDOT Java agent. Use when switching from elastic-apm-agent.jar to elastic-otel-javaagent.jar.
Senior Site Reliability Engineer & Debug Architect. Expert in AI-assisted observability, distributed tracing, and autonomous incident remediation in 2026.
Set up comprehensive observability for Groq integrations with metrics, traces, and alerts. Use when implementing monitoring for Groq operations, setting up dashboards, or configuring alerting for Groq integration health. Trigger with phrases like "groq monitoring", "groq metrics", "groq observability", "monitor groq", "groq alerts", "groq tracing".