Found 21 Skills
Chaos engineering principles, controlled failure injection, resilience testing, and system recovery validation. Use when testing distributed systems, building confidence in fault tolerance, or validating disaster recovery.
Implements security chaos engineering experiments that deliberately disable or degrade security controls to verify detection and response capabilities. Tests WAF bypass, firewall rule removal, log pipeline disruption, and EDR disablement scenarios using boto3 and subprocess. Use when validating SOC detection coverage and resilience.
Use this skill when implementing chaos engineering practices, designing fault injection experiments, running game days, or improving system resilience. Triggers on chaos engineering, fault injection, Chaos Monkey, Litmus, game days, resilience testing, failure modes, blast radius, and any task requiring controlled failure experimentation.
Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience testing, blast radius control, game days, antifragile systems.
Use when defining SLIs/SLOs, managing error budgets, or building reliable systems at scale. Invoke for incident management, chaos engineering, toil reduction, capacity planning.
Expert-level site reliability engineering, SLOs, incident management, and operational excellence.
Use when building reliable and scalable distributed systems.
Expert in resilience testing, fault injection, and building anti-fragile systems using controlled experiments.
Apply Gremlin's enterprise chaos engineering methodology. Emphasizes categorized failure injection, safety controls, and structured experimentation. Use when implementing chaos engineering in enterprise environments with compliance requirements.
Build production-ready systems with stability patterns: circuit breakers, bulkheads, timeouts, and retry logic. Use when the user mentions "production outage", "circuit breaker", "timeout strategy", "deployment pipeline", or "chaos engineering". Covers capacity planning, health checks, and anti-fragility patterns. For data systems, see ddia-systems. For system architecture, see system-design.
Use when the user wants to deploy and run a prepared AWS FIS experiment. Triggers on "execute FIS experiment", "run FIS experiment", "start chaos experiment", "deploy FIS template", "启动 FIS 实验", "运行混沌实验", "执行故障注入实验", "deploy and run the experiment in [directory]". Expects a prepared experiment directory (from aws-fis-experiment-prepare or manually created) containing experiment-template.json, iam-policy.json, cfn-template.yaml, and alarm configs. Deploys resources via CLI or CloudFormation, starts the experiment only after strict user confirmation, monitors progress, and generates a results report.
Injects managed chaos into environments to test system resilience. Validates that self-healing and monitoring systems work as expected under stress.