openclaw-self-healing
OpenClaw Self-Healing System
"The system that heals itself — or calls for help when it can't."
A 4-tier autonomous self-healing system for OpenClaw Gateway.
Architecture
Level 1: Watchdog (180s) → Process monitoring (OpenClaw built-in)
Level 2: Health Check (300s) → HTTP 200 + 3 retries
Level 3: Claude Recovery → 30min AI-powered diagnosis 🧠
Level 4: Discord Alert → Human escalation
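
The four tiers above form an escalation ladder: each level hands off to the next only when recovery has not succeeded. A minimal sketch of that hand-off follows. The tier names and timings come from the list above; the function names `next_tier` and `tier_name` and the exact decision order are illustrative assumptions, not the project's actual code.

```bash
#!/usr/bin/env bash
# Sketch only: function names and decision order are assumptions.

# tier_name TIER -> prints the label used in the Architecture list.
tier_name() {
  case "$1" in
    1) echo "Watchdog (180s)" ;;
    2) echo "Health Check (300s)" ;;
    3) echo "Claude Recovery (30min)" ;;
    4) echo "Discord Alert" ;;
    *) echo "unknown" ;;
  esac
}

# next_tier CURRENT_TIER RECOVERED -> prints what should act next.
# RECOVERED is "yes" when the current tier brought the gateway back.
next_tier() {
  local tier="$1" recovered="$2"
  if [ "$recovered" = "yes" ]; then
    echo "done"              # no further escalation needed
  elif [ "$tier" -lt 4 ]; then
    echo $((tier + 1))       # hand off to the next level
  else
    echo "human"             # Level 4 has alerted; a person takes over
  fi
}
```

For example, `next_tier 2 no` prints `3`, meaning a failed health check hands off to Claude Recovery.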
What's Special (v2.0)
- World's first use of Claude Code as a Level 3 emergency doctor
- Persistent Learning - Automatic recovery documentation (symptom → cause → solution → prevention)
- Reasoning Logs - Explainable AI decision-making process
- Multi-Channel Alerts - Discord + Telegram support
- Metrics Dashboard - Success rate, recovery time, trending analysis
- Production-tested (verified recovery Feb 5-6, 2026)
- macOS LaunchAgent integration
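
The Persistent Learning item above describes recovery records of the form symptom → cause → solution → prevention. A minimal sketch of appending one such record to a notes file might look like this. The function name `record_recovery`, the Markdown layout, and the idea of a single append-only file are all assumptions for illustration; the project's real format is not shown in this document.

```bash
#!/usr/bin/env bash
# Sketch only: record layout and function name are assumptions.

# record_recovery FILE SYMPTOM CAUSE SOLUTION PREVENTION
# Appends one timestamped entry with the four fields named in the feature list.
record_recovery() {
  local file="$1" symptom="$2" cause="$3" solution="$4" prevention="$5"
  {
    printf '## %s\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    # "--" stops printf from reading the leading "-" as an option.
    printf -- '- Symptom: %s\n' "$symptom"
    printf -- '- Cause: %s\n' "$cause"
    printf -- '- Solution: %s\n' "$solution"
    printf -- '- Prevention: %s\n' "$prevention"
    printf '\n'
  } >> "$file"
}
```

Appending rather than overwriting keeps earlier incidents available for later diagnosis runs.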
Quick Setup
1. Install Dependencies
bash
brew install tmux
npm install -g @anthropic-ai/claude-code
2. Configure Environment
bash
# Copy template to OpenClaw config directory
cp .env.example ~/.openclaw/.env
# Edit and add your Discord webhook (optional)
nano ~/.openclaw/.env
3. Install Scripts
bash
# Copy scripts
cp scripts/*.sh ~/openclaw/scripts/
chmod +x ~/openclaw/scripts/*.sh
# Install LaunchAgent
cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist
4. Verify
bash
# Check Health Check is running
launchctl list | grep openclaw.healthcheck
# View logs
tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log
Scripts
| Script | Level | Description |
|---|---|---|
| | 2 | HTTP 200 check + 3 retries + escalation |
| | 3 | Claude Code PTY session for AI diagnosis (v1) |
| | 3 | Enhanced with learning + reasoning logs (v2) ⭐ |
| | 4 | Discord/Telegram notification on failure |
| | - | Visualize recovery statistics (NEW) |
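
The Level 4 row above describes a Discord notification on failure. A minimal sketch of such an alert is shown here, assuming a standard Discord webhook that accepts a JSON body with a `content` field. The function names `json_escape`, `discord_payload`, and `send_alert` are hypothetical, and the project's own alert script may be structured differently.

```bash
#!/usr/bin/env bash
# Sketch only: function names are assumptions, not the project's script.

# json_escape STRING -> prints STRING with backslashes, quotes and newlines escaped.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  printf '%s' "$s"
}

# discord_payload MESSAGE -> prints the JSON body for a webhook post.
discord_payload() {
  printf '{"content":"%s"}' "$(json_escape "$1")"
}

# send_alert WEBHOOK_URL MESSAGE
# The webhook is optional in this README, so an empty URL is a silent no-op.
send_alert() {
  local webhook="$1" message="$2"
  [ -n "$webhook" ] || return 0
  curl -sS -H 'Content-Type: application/json' \
    -d "$(discord_payload "$message")" "$webhook"
}
```

Building the payload in a separate function keeps the escaping testable without making a network call.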
Configuration
All settings are configured via environment variables in ~/.openclaw/.env:

| Variable | Default | Description |
|---|---|---|
| | (none) | Discord webhook for alerts |
| | | Gateway health check URL |
| | | Restart attempts before escalation |
| | | Claude recovery timeout (30 min) |
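
As a rough illustration of how a script could read these settings, the sketch below sources an env file and falls back to defaults. The variable names in this document's table were not preserved, so every `EXAMPLE_*` name here is a placeholder, and the default URL is invented for the example. The retry count of 3 and the 1800-second timeout are assumptions based on the "3 retries" and "30 min" figures mentioned in this README.

```bash
#!/usr/bin/env bash
# Sketch only: all EXAMPLE_* names and the default URL are placeholders,
# not the project's real variable names or defaults.

# load_env [ENV_FILE] -> sources the file if present, then applies defaults.
load_env() {
  local env_file="${1:-$HOME/.openclaw/.env}"
  if [ -f "$env_file" ]; then
    set -a              # export every variable the file assigns
    . "$env_file"
    set +a
  fi
  : "${EXAMPLE_WEBHOOK:=}"                                   # optional, no default
  : "${EXAMPLE_HEALTH_URL:=http://localhost:8080/health}"    # placeholder URL
  : "${EXAMPLE_MAX_RETRIES:=3}"                              # assumed from "3 retries"
  : "${EXAMPLE_TIMEOUT_SECONDS:=1800}"                       # assumed from "30 min"
}
```

Values set in the file take precedence; the `:=` expansions only fill in variables that are still unset or empty.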
Testing
Test Level 2 (Health Check)
bash
# Run manually
bash ~/openclaw/scripts/gateway-healthcheck.sh
# Expected output:
# ✅ Gateway healthy
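
The Level 2 check exercised above is described in this README as "HTTP 200 + 3 retries". A minimal sketch of that logic follows. It is not the contents of gateway-healthcheck.sh: the function names are hypothetical, and the reading of "3 retries" as one initial probe plus up to three more is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: not the project's script; names and retry semantics are assumptions.

# http_status URL -> prints the HTTP status code (000 if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# gateway_healthy URL -> succeeds only when the gateway answers HTTP 200.
gateway_healthy() {
  [ "$(http_status "$1")" = "200" ]
}

# check_with_retries PROBE ARG MAX_RETRIES [DELAY_SECONDS]
# Runs PROBE ARG once, then up to MAX_RETRIES more times until it succeeds.
check_with_retries() {
  local probe="$1" arg="$2" max="$3" delay="${4:-5}" attempt=0
  while [ "$attempt" -le "$max" ]; do
    if "$probe" "$arg"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max" ]; then
      sleep "$delay"
    fi
  done
  return 1    # the caller would escalate to Level 3 here
}
```

Passing the probe as a parameter lets the retry loop be exercised with a stub, for example `check_with_retries gateway_healthy "$url" 3` in real use and a fake probe in tests.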
Test Level 3 (Claude Recovery)
bash
# Inject a config error (back up first!)
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak
# Wait for Health Check to detect and escalate (~8 min)
tail -f ~/openclaw/memory/emergency-recovery-*.log
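
Because this test deliberately breaks the config, it helps to have a matching restore step for when the test is finished. The sketch below pairs the backup command above with a restore helper, reusing the same `.bak` suffix. The function names `backup_config` and `restore_config` are hypothetical and are not part of the project.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are assumptions; the .bak suffix matches the command above.

# backup_config FILE -> copies FILE to FILE.bak.
backup_config() {
  cp "$1" "$1.bak"
}

# restore_config FILE -> moves FILE.bak back over FILE; fails if no backup exists.
restore_config() {
  if [ -f "$1.bak" ]; then
    mv "$1.bak" "$1"
  else
    echo "no backup found for $1" >&2
    return 1
  fi
}
```

A typical sequence would be `backup_config ~/.openclaw/openclaw.json`, then the injected error and the observation period, then `restore_config ~/.openclaw/openclaw.json`.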
Links
License
MIT License - do whatever you want with it.
Built by @ramsbaby + Jarvis 🦞