Search Results: incident-response

Found 83 Skills

postmortem-writing

Write effective blameless postmortems with root cause analysis, timelines, and action items. Use when conducting incident reviews, writing postmortem documents, or improving incident response processes.

DevOps & Cloud Services | aj-geddes/useful-ai-promp...

runbook-creation

Create operational runbooks, playbooks, standard operating procedures (SOPs), and incident response guides. Use when documenting operational procedures, on-call guides, or incident response processes.

DevOps & Cloud Services | wshobson/agents

incident-runbook-templates

Create structured incident response runbooks with step-by-step procedures, escalation paths, and recovery actions. Use when building runbooks, responding to incidents, or establishing incident response procedures.

Testing & QA | vasilyu1983/ai-agents-pub...

qa-debugging

Systematic debugging playbook for application errors and incidents: crashes, regressions, intermittent failures, production-only bugs, performance issues, stack traces, log/trace analysis, profiling, and distributed systems root cause analysis.

DevOps & Cloud Services | majiayu000/claude-arsenal

observability-sre

Observability and SRE expert. Use when setting up monitoring, logging, tracing, defining SLOs, or managing incidents. Covers Prometheus, Grafana, OpenTelemetry, and incident response best practices.

DevOps & Cloud Services | wshobson/agents

on-call-handoff-patterns

Master on-call shift handoffs with context transfer, escalation procedures, and documentation. Use when transitioning on-call responsibilities, documenting shift summaries, or improving on-call processes.

DevOps & Cloud Services | sickn33/antigravity-aweso...

error-debugging-error-analysis

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

Security & Compliance | davila7/claude-code-templ...

security-compliance

Guides security professionals in implementing defense-in-depth security architectures, achieving compliance with industry frameworks (SOC2, ISO27001, GDPR, HIPAA), conducting threat modeling and risk assessments, managing security operations and incident response, and embedding security throughout the SDLC.

2 scripts/Checked

DevOps & Cloud Services | davila7/claude-code-templ...

it-operations

Manages IT infrastructure, monitoring, incident response, and service reliability. Provides frameworks for ITIL service management, observability strategies, automation, backup/recovery, capacity planning, and operational excellence practices.

Security & Compliance | rysweet/amplihack

cybersecurity-analyst

Analyzes events through cybersecurity lens using threat modeling, attack surface analysis, defense-in-depth, zero-trust architecture, and risk-based frameworks (CIA triad, STRIDE, MITRE ATT&CK). Provides insights on vulnerabilities, attack vectors, defense strategies, incident response, and security posture. Use when: Security incidents, vulnerability assessments, threat analysis, security architecture, compliance. Evaluates: Confidentiality, integrity, availability, threat actors, attack patterns, controls, residual risk.

DevOps & Cloud Services | sickn33/antigravity-aweso...

incident-responder

Expert SRE incident responder specializing in rapid problem resolution, modern observability, and comprehensive incident management. Masters incident command, blameless post-mortems, error budget management, and system reliability patterns. Handles critical outages, communication strategies, and continuous improvement. Use IMMEDIATELY for production incidents or SRE practices.

DevOps & Cloud Services | rohitg00/kubectl-mcp-serv...

k8s-incident

Respond to Kubernetes incidents with runbooks and diagnostics. Use for outages, pod failures, node issues, network problems, and emergency response.

1 scripts/Checked