linux-troubleshooting

Linux Troubleshooting Workflow

Overview

A specialized workflow for diagnosing and resolving Linux system issues, including performance problems, service failures, network issues, and resource constraints.

When to Use This Workflow

Use this workflow when:
  • Diagnosing system performance issues
  • Troubleshooting service failures
  • Investigating network problems
  • Resolving disk space issues
  • Debugging application errors

Workflow Phases

Phase 1: Initial Assessment

Skills to Invoke

  • bash-linux
    - Linux commands
  • devops-troubleshooter
    - Troubleshooting

Actions

  1. Check system uptime
  2. Review recent changes
  3. Identify symptoms
  4. Gather error messages
  5. Document findings

Commands

bash
uptime
hostnamectl
cat /etc/os-release
dmesg | tail -50
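
To support the "Document findings" action, the commands above can be captured into a single file. The following is a minimal sketch, not part of the original workflow: the function names `run_section` and `collect_snapshot` and the output path are hypothetical, and it assumes some commands (for example `hostnamectl` on non-systemd hosts) may be missing.

```shell
#!/bin/sh
# Run one command under a labeled header. A missing or failing command
# is reported in the output instead of aborting the whole snapshot.
run_section() {
    label=$1
    shift
    printf '== %s ==\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(command exited with status %s)\n' "$?"
    else
        printf '(%s not available on this host)\n' "$1"
    fi
    printf '\n'
}

# Write the Phase 1 commands to the file given as the first argument.
collect_snapshot() {
    out=$1
    {
        run_section "uptime" uptime
        run_section "hostnamectl" hostnamectl
        run_section "os-release" cat /etc/os-release
        run_section "dmesg (last 50 lines)" sh -c 'dmesg | tail -50'
    } > "$out"
}
```

A call such as `collect_snapshot "/tmp/snapshot-$(date +%Y%m%dT%H%M%S).txt"` would produce one timestamped file per assessment; the path is only an example.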

Copy-Paste Prompts

Use @bash-linux to gather system information

Phase 2: Resource Analysis

Skills to Invoke

  • bash-linux
    - Resource commands
  • performance-engineer
    - Performance analysis

Actions

  1. Check CPU usage
  2. Analyze memory
  3. Review disk space
  4. Monitor I/O
  5. Check network

Commands

bash
top -bn1 | head -20
free -h
df -h
iostat -x 1 5
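
The disk-space review can be turned into a pass/fail check. The sketch below assumes POSIX-format `df -P` output on standard input and a threshold chosen by the caller; the function name `disks_over_threshold` and the 90 percent figure in the usage note are illustrative assumptions.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print every mount point whose usage
# is at or above the percentage given as the first argument.
# Assumes mount points contain no spaces.
disks_over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5                 # Capacity column, e.g. "95%"
        sub(/%$/, "", use)       # strip the trailing percent sign
        if (use + 0 >= limit + 0) printf "%s %s%%\n", $6, use
    }'
}
```

Typical use would be `df -P | disks_over_threshold 90`; empty output means no filesystem crossed the threshold.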

Copy-Paste Prompts

Use @performance-engineer to analyze system resources

Phase 3: Process Investigation

Skills to Invoke

  • bash-linux
    - Process commands
  • server-management
    - Process management

Actions

  1. List running processes
  2. Identify resource hogs
  3. Check process status
  4. Review process trees
  5. Analyze strace output

Commands

bash
ps aux --sort=-%cpu | head -10
pstree -p
lsof -p PID
strace -p PID
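
The `PID` placeholder in the commands above has to come from somewhere. One possible way to pick it, sketched under the assumption that the input is `ps -eo pid,pcpu,comm` output (a header row, then PID, CPU percentage, and command name); the function name `top_cpu_pid` is hypothetical.

```shell
#!/bin/sh
# Read `ps -eo pid,pcpu,comm` output on stdin and print the PID of the
# process with the highest CPU percentage. Prints nothing if there are
# no process rows.
top_cpu_pid() {
    awk 'NR > 1 && (pid == "" || $2 + 0 > max + 0) {
        max = $2
        pid = $1
    }
    END { if (pid != "") print pid }'
}
```

The result could then feed the later commands, for example `pid=$(ps -eo pid,pcpu,comm | top_cpu_pid)` followed by `lsof -p "$pid"`. Attaching `strace` to another process usually requires root or matching ptrace permissions.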

Copy-Paste Prompts

Use @server-management to investigate processes

Phase 4: Log Analysis

Skills to Invoke

  • bash-linux
    - Log commands
  • error-detective
    - Error detection

Actions

  1. Check system logs
  2. Review application logs
  3. Search for errors
  4. Analyze log patterns
  5. Correlate events

Commands

bash
journalctl -xe
tail -f /var/log/syslog
grep -i error /var/log/*
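
A raw `grep` for errors can return thousands of near-identical lines. To illustrate the "Analyze log patterns" action, the sketch below groups error lines that differ only in numbers (timestamps, PIDs, durations) and counts each group. The function name `error_summary` and the digit-masking heuristic are assumptions, not part of the original workflow.

```shell
#!/bin/sh
# Read log lines on stdin, keep those containing "error" (any case),
# mask every run of digits as "N" so similar messages collapse together,
# then print the most frequent patterns first.
# Optional first argument: how many patterns to show (default 10).
error_summary() {
    grep -i 'error' \
        | sed -E 's/[0-9]+/N/g' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "${1:-10}"
}
```

For example, `error_summary 5 < /var/log/syslog` would list the five most common error patterns. Masking digits is a coarse heuristic and can merge messages that differ only in a meaningful number.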

Copy-Paste Prompts

Use @error-detective to analyze log files

Phase 5: Network Diagnostics

Skills to Invoke

  • bash-linux
    - Network commands
  • network-engineer
    - Network troubleshooting

Actions

  1. Check network interfaces
  2. Test connectivity
  3. Analyze connections
  4. Review firewall rules
  5. Check DNS resolution

Commands

bash
ip addr show
ss -tulpn
curl -v http://target
dig domain
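
When analyzing connections, a common first question is whether anything is bound to the expected port. A minimal sketch, assuming `ss -tuln` output on standard input (both `-t` and `-u` given, so the leading Netid column is present and the local address is the fifth field); the function name `port_is_listening` is hypothetical.

```shell
#!/bin/sh
# Read `ss -tuln` output on stdin and exit 0 if any socket's local
# address ends in the port given as the first argument, else exit 1.
# Splitting on ":" and taking the last piece also handles IPv6
# addresses such as [::]:80.
port_is_listening() {
    awk -v port="$1" 'NR > 1 {
        n = split($5, parts, ":")
        if (parts[n] == port) found = 1
    }
    END { exit found ? 0 : 1 }'
}
```

For example, `ss -tuln | port_is_listening 443 || echo "nothing bound to 443"`. Adding `-p` appends a process column at the end of each row and does not change the field used here.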

Copy-Paste Prompts

Use @network-engineer to diagnose network issues

Phase 6: Service Troubleshooting

Skills to Invoke

  • server-management
    - Service management
  • systematic-debugging
    - Debugging

Actions

  1. Check service status
  2. Review service logs
  3. Test service restart
  4. Verify dependencies
  5. Check configuration

Commands

bash
systemctl status service
journalctl -u service -f
systemctl restart service
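
A restart that returns successfully does not guarantee the service stays up. The sketch below restarts a unit, waits briefly, and then confirms it with `systemctl is-active`. The function name `restart_and_verify` and the `SYSTEMCTL` and `SETTLE_SECONDS` variables are hypothetical conveniences (the first lets the command be swapped out for dry runs), not systemd features.

```shell
#!/bin/sh
# Command used to talk to systemd; overridable for dry runs or testing.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Restart the unit named in the first argument, wait for it to settle,
# and report whether it is still active. On failure, print the unit
# status to stderr and return non-zero.
restart_and_verify() {
    unit=$1
    "$SYSTEMCTL" restart "$unit" || return 1
    sleep "${SETTLE_SECONDS:-2}"
    if "$SYSTEMCTL" is-active --quiet "$unit"; then
        echo "$unit is active after restart"
    else
        echo "$unit failed to stay active" >&2
        "$SYSTEMCTL" status "$unit" --no-pager >&2 || true
        return 1
    fi
}
```

Usage would look like `restart_and_verify nginx.service`, where the unit name is only an example. A two-second settle time is an arbitrary default; services with slow startup may need longer.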

Copy-Paste Prompts

Use @systematic-debugging to troubleshoot service issues

Phase 7: Resolution

Skills to Invoke

  • incident-responder
    - Incident response
  • bash-pro
    - Fix implementation

Actions

  1. Implement fix
  2. Verify resolution
  3. Monitor stability
  4. Document solution
  5. Create prevention plan
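
The "Verify resolution" and "Monitor stability" actions can be approximated by repeating a health check several times before declaring the fix good. This is a minimal sketch; the function name `check_stable`, the number of checks, and the interval are all assumptions to adapt to the incident at hand.

```shell
#!/bin/sh
# Usage: check_stable CHECKS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to CHECKS times, sleeping INTERVAL_SECONDS between
# runs. Returns non-zero as soon as one run fails.
check_stable() {
    checks=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$checks" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "check $i of $checks failed" >&2
            return 1
        fi
        # Sleep between checks, but not after the last one.
        if [ "$i" -lt "$checks" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    echo "stable for $checks consecutive checks"
}
```

For example, `check_stable 5 60 systemctl is-active --quiet service` would require five healthy results spaced one minute apart, with `service` standing in for the real unit name.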

Copy-Paste Prompts

Use @incident-responder to implement resolution

Troubleshooting Checklist

  • System information gathered
  • Resources analyzed
  • Logs reviewed
  • Network tested
  • Services verified
  • Issue resolved
  • Documentation created

Quality Gates

  • Root cause identified
  • Fix verified
  • Monitoring in place
  • Documentation complete

Related Workflow Bundles

  • os-scripting
    - OS scripting
  • bash-scripting
    - Bash scripting
  • cloud-devops
    - DevOps