Found 19 Skills
Configure Istio traffic management including routing, load balancing, circuit breakers, and canary deployments. Use when implementing service mesh traffic policies, progressive delivery, or resilience patterns.
Build production-ready systems with stability patterns: circuit breakers, bulkheads, timeouts, and retry logic. Use when the user mentions "production outage", "circuit breaker", "timeout strategy", "deployment pipeline", or "chaos engineering". Covers capacity planning, health checks, and anti-fragility patterns. For data systems, see ddia-systems. For system architecture, see system-design.
Use when designing domain error handling. Keywords: domain error, error categorization, error classification, recovery strategy, retry, fallback, domain error hierarchy, user-facing vs internal errors, error code design, circuit breaker, graceful degradation, resilience, error context, backoff, retry with backoff, error recovery, transient vs permanent error
Implement comprehensive API error handling with standardized error responses, logging, monitoring, and user-friendly messages. Use when building resilient APIs, debugging issues, or improving error reporting.
This skill should be used when implementing fault tolerance and resilience patterns in Spring Boot applications using the Resilience4j library. Apply this skill to add circuit breaker, retry, rate limiter, bulkhead, time limiter, and fallback mechanisms to prevent cascading failures, handle transient errors, and manage external service dependencies gracefully in microservices architectures.
Implements standardized API error responses with proper status codes, logging, and user-friendly messages. Use when building production APIs, implementing error recovery patterns, or integrating error monitoring services.
Provides comprehensive guidance for Spring Cloud microservices including service discovery, configuration management, load balancing, circuit breakers, API gateways, and distributed tracing. Use when the user asks about Spring Cloud, needs to build microservices, implement service discovery, or work with Spring Cloud components.
Error handling best practices across languages — error types, recovery strategies, user-facing messages, and logging. Reference when implementing error handling or designing error flows.
Implements reliability patterns including circuit breakers, retries, fallbacks, bulkheads, and SLO definitions. Provides failure mode analysis and incident response plans. Use for "SRE", "reliability", "resilience", or "failure handling".
Expert in making multi-agent systems resilient. Specializes in detecting loops, hallucinations, and failures, and implementing self-healing workflows. Use when designing error handling for agent systems, implementing retry strategies, or building resilient AI workflows.
Production-grade fault tolerance for distributed systems. Use when implementing circuit breakers, retry with exponential backoff, bulkhead isolation patterns, or building resilience into LLM API integrations.
Provides robust error handling strategies and patterns. Use when the user mentions resilience, error handling, fallbacks, or debugging failures.