Loading...
Loading...
Found 67 Skills
An engineering runbook — service overview, alerts table, dashboards links, common procedures with copy-pasteable commands, on-call rotation, and an incident-response checklist. Use when the brief mentions "runbook", "ops doc", "on-call guide", "SRE doc", or "运维手册".
Application monitoring and observability setup for Python/React projects. Use when configuring logging, metrics collection, health checks, alerting rules, or dashboard creation. Covers structured logging with structlog, Prometheus metrics for FastAPI, health check endpoints, alert threshold design, Grafana dashboard patterns, error tracking with Sentry, and uptime monitoring. Does NOT cover incident response procedures (use incident-response) or deployment (use deployment-pipeline).
Deployment procedures and CI/CD pipeline configuration for Python/React projects. Use when deploying to staging or production, creating CI/CD pipelines with GitHub Actions, troubleshooting deployment failures, or planning rollbacks. Covers pipeline stages (build/test/staging/production), environment promotion, pre-deployment validation, health checks, canary deployment, rollback procedures, and GitHub Actions workflows. Does NOT cover Docker image building (use docker-best-practices) or incident response (use incident-response).
Security-focused code review checklist and automated scanning patterns. Use when reviewing pull requests for security issues, auditing authentication/authorization code, checking for OWASP Top 10 vulnerabilities, or validating input sanitization. Covers SQL injection prevention, XSS protection, CSRF tokens, authentication flow review, secrets detection, dependency vulnerability scanning, and secure coding patterns for Python (FastAPI) and React. Does NOT cover deployment security (use docker-best-practices) or incident handling (use incident-response).
Expert DevOps troubleshooter specializing in rapid incident response, advanced debugging, and modern observability. Masters log analysis, distributed tracing, Kubernetes debugging, performance optimization, and root cause analysis. Handles production outages, system reliability, and preventive monitoring. Use PROACTIVELY for debugging, incident response, or system troubleshooting.
Create security policies, guidelines, compliance documentation, and security best practices. Use when documenting security policies, compliance requirements, or security guidelines.
Master smart contract security with auditing, vulnerability detection, and incident response
Write effective blameless postmortems with root cause analysis, timelines, and action items. Use when conducting incident reviews, writing postmortem documents, or improving incident response processes.
Create operational runbooks, playbooks, standard operating procedures (SOPs), and incident response guides. Use when documenting operational procedures, on-call guides, or incident response processes.
Systematic debugging playbook for application errors and incidents: crashes, regressions, intermittent failures, production-only bugs, performance issues, stack traces, log/trace analysis, profiling, and distributed systems root cause analysis.
Master memory forensics techniques including memory acquisition, process analysis, and artifact extraction using Volatility and related tools. Use when analyzing memory dumps, investigating incidents, or performing malware analysis from RAM captures.
Guides security professionals in implementing defense-in-depth security architectures, achieving compliance with industry frameworks (SOC2, ISO27001, GDPR, HIPAA), conducting threat modeling and risk assessments, managing security operations and incident response, and embedding security throughout the SDLC.