Search Results: sre-practices

Found 14 Skills

DevOps & Cloud Services | jeffallan/claude-skills

sre-engineer

Use when defining SLIs/SLOs, managing error budgets, or building reliable systems at scale. Invoke for incident management, chaos engineering, toil reduction, capacity planning.

DevOps & Cloud Services | sickn33/antigravity-aweso...

observability-engineer

Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.

DevOps & Cloud Services | sickn33/antigravity-aweso...

incident-response-incident-response

Use when working with incident response.

DevOps & Cloud Services | pjt222/development-guides

write-incident-runbook

Create structured incident runbooks with diagnostic steps, resolution procedures, escalation paths, and communication templates for effective incident response. Use when documenting response procedures for recurring alerts, standardizing incident response across an on-call rotation, reducing MTTR with clear diagnostic steps, creating training materials for new team members, or linking alert annotations directly to resolution procedures.

DevOps & Cloud Services | personamanagmentlayer/pcl

sre-expert

Expert-level site reliability engineering: SLOs, incident management, and operational excellence.

DevOps & Cloud Services | 404kidwiz/claude-supercod...

devops-incident-responder

Expert in SRE practices, incident management, root cause analysis, and automated remediation.

DevOps & Cloud Services | dokhacgiakhoa/antigravity...

incident-responder

Expert SRE incident responder specializing in rapid problem resolution.

2 scripts/Checked

DevOps & Cloud Services | 404kidwiz/claude-supercod...

devops-engineer

Senior DevOps Engineer with expertise in CI/CD automation, infrastructure as code, monitoring, and SRE practices. Proficient in cloud platforms, containerization, configuration management, and building scalable DevOps pipelines with focus on automation and operational excellence.

DevOps & Cloud Services | davincidreams/agent-team-...

monitoring-observability

Covers Prometheus, Grafana, CloudWatch, Azure Monitor, Stackdriver, logging, alerting, and SRE practices.

DevOps & Cloud Services | first-fluke/fullstack-sta...

devops-iac-engineer

Expert guidance for designing, implementing, and maintaining cloud infrastructure using Infrastructure as Code (IaC) principles. Use this skill for architecting cloud solutions, setting up CI/CD pipelines, implementing observability, and following SRE best practices.

DevOps & Cloud Services | onewave-ai/claude-skills

incident-responder

Production incident response automation. Reads logs, checks recent deploys, identifies root cause, suggests fixes, drafts incident comms, creates post-mortem templates. Severity classification (SEV1-4), escalation paths, status page updates. Generates incident-report.md with timeline, root cause, impact assessment, remediation steps, and prevention measures.

DevOps & Cloud Services | alirezarezvani/claude-ski...

slo-architect

Use when defining, reviewing, or operating SLOs/SLIs/error budgets. Triggers on "define an SLO", "what should our SLO be", "error budget", "burn rate", "SLI", "service level objective", "Google SRE workbook", "multi-window burn-rate alert", or any reliability-target question. Ships an SLO designer, an error-budget calculator with multi-window burn-rate thresholds, and an SLO reviewer that catches common bugs (target too aggressive, window too short, conflicting SLOs, no SLI definition). Includes 4 references covering SLO principles, SLI design, error budget math, and composition with feature-flags-architect/chaos-engineering/kubernetes-operator. NOT a generic observability skill; specifically the SLO discipline.

3 scripts/Checked
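
For readers unfamiliar with the "error budget" and "multi-window burn-rate" terms in the slo-architect entry, the arithmetic can be sketched roughly as below. This is not the skill's own calculator, whose code is not shown in these results. The function names, the 30-day SLO period, and the "page only when both windows exceed the threshold" rule are illustrative assumptions based on the commonly described multi-window approach.

```python
# Illustrative sketch only; names and defaults are assumptions, not the
# slo-architect skill's actual API.

def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. 0.999 -> about 0.001."""
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    return error_ratio / error_budget(slo_target)


def burn_rate_threshold(budget_fraction: float,
                        alert_window_hours: float,
                        slo_period_hours: float = 30 * 24) -> float:
    """Burn rate that spends `budget_fraction` of the budget within the window."""
    return budget_fraction * slo_period_hours / alert_window_hours


def should_alert(long_window_rate: float,
                 short_window_rate: float,
                 threshold: float) -> bool:
    """Multi-window rule: both the long and short window must exceed the threshold."""
    return long_window_rate >= threshold and short_window_rate >= threshold


if __name__ == "__main__":
    slo = 0.999
    # Spending 2% of a 30-day budget within 1 hour.
    threshold = burn_rate_threshold(0.02, 1)
    # Hypothetical observed error ratios for a 1-hour and a 5-minute window.
    long_rate = burn_rate(0.02, slo)
    short_rate = burn_rate(0.03, slo)
    print(f"threshold={threshold:.1f} long={long_rate:.1f} short={short_rate:.1f}")
    print("alert:", should_alert(long_rate, short_rate, threshold))
```

Requiring the short window to agree with the long one is what lets the alert clear soon after the error rate recovers, rather than staying active until the long window drains.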