Loading...
Loading...
Found 62 Skills
Create a blameless postmortem when the user asks to write a postmortem, document what went wrong, analyze an incident, or run a 5 Whys analysis
Rapid decision-making loop for dynamic situations. Use for incident response, competitive scenarios, time-sensitive decisions, and situations requiring quick adaptation.
Creates safe rollback procedures for deployments with automated workflows, rollback runbooks, version management, and incident response. Use for "rollback automation", "deployment recovery", "incident response", or "production rollback".
Expert SRE incident responder specializing in rapid problem resolution, modern observability, and comprehensive incident management. Masters incident command, blameless post-mortems, error budget management, and system reliability patterns. Handles critical outages, communication strategies, and continuous improvement. Use IMMEDIATELY for production incidents or SRE practices.
Respond to Kubernetes incidents with runbooks and diagnostics. Use for outages, pod failures, node issues, network problems, and emergency response.
SRE patterns for production service reliability: SLOs, error budgets, postmortems, and incident response. Use when defining reliability targets, writing postmortems, implementing SLO alerting, or establishing on-call practices. NOT for initial service development (use scaffolding skills instead).
Guide incident response from detection to post-mortem using SRE principles, severity classification, on-call management, blameless culture, and communication protocols. Use when setting up incident processes, designing escalation policies, or conducting post-mortems.
Expert SRE incident responder specializing in rapid problem resolution.
Operational runbook and procedure documentation specialist. Use when creating incident response procedures, operational playbooks, or system maintenance guides.
Analyze volatile memory (RAM) dumps for forensic investigation. Use when investigating malware infections, rootkits, process injection, credential theft, or any incident requiring analysis of system memory state. Supports Windows, Linux, and macOS memory images.
Use when incidents occur and you need pre-approved workflows, templates, and escalation paths.
Use for structured technical SEO audits, incident response, and validation.