infrastructure-monitoring-setup
Works with infrastructure-monitor.sh script, systemd timer, ntfy.sh push notifications,
Infrastructure Monitoring Setup Skill
Complete setup and configuration of automated infrastructure monitoring with mobile push notifications and auto-recovery capabilities.
Quick Start
Quick setup for monitoring (5 minutes):
```bash
# 1. Create unique ntfy topic
TOPIC="infra-$(openssl rand -hex 8)"
echo "Your topic: $TOPIC"

# 2. Add to .env
echo "ALERT_ENABLED=true" >> /home/dawiddutoit/projects/network/.env
echo "NTFY_SERVER=https://ntfy.sh" >> /home/dawiddutoit/projects/network/.env
echo "NTFY_TOPIC=$TOPIC" >> /home/dawiddutoit/projects/network/.env
echo "AUTO_RECOVER=true" >> /home/dawiddutoit/projects/network/.env

# 3. Install systemd service
sudo cp /home/dawiddutoit/projects/network/systemd/infrastructure-monitor.* /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now infrastructure-monitor.timer

# 4. Test
/home/dawiddutoit/projects/network/scripts/infrastructure-monitor.sh
```
Then install the ntfy app on your phone and subscribe to your topic.
Table of Contents
- When to Use This Skill
- What This Skill Does
- Instructions
- 3.1 Install ntfy Mobile App
- 3.2 Configure Monitoring in .env
- 3.3 Install Systemd Timer
- 3.4 Test Monitoring and Alerts
- 3.5 Configure Home Assistant Integration (Optional)
- 3.6 Verify Auto-Recovery
- 3.7 View Monitoring Logs
- Supporting Files
- Expected Outcomes
- Requirements
- Red Flags to Avoid
When to Use This Skill
Explicit Triggers:
- "Setup monitoring"
- "Configure mobile alerts"
- "Enable auto-recovery"
- "Setup ntfy notifications"
- "Configure Home Assistant alerts"
Implicit Triggers:
- Want to be notified of infrastructure failures
- Need automated recovery for common issues
- Infrastructure has been down without detection
- Want proactive monitoring
Debugging Triggers:
- "Why am I not getting alerts?"
- "Is monitoring working?"
- "How to test notifications?"
What This Skill Does
- Mobile Alerts - Configures ntfy.sh push notifications to phone
- Auto-Recovery - Enables automatic fixes for common failures
- HA Integration - Optional Home Assistant notification integration
- Systemd Service - Installs timer to run monitoring every 5 minutes
- Tests Setup - Verifies notifications and recovery work
- Logs Access - Shows how to view monitoring logs
- Troubleshooting - Diagnoses alert delivery issues
Instructions
3.1 Install ntfy Mobile App
Install app:
- Android: https://play.google.com/store/apps/details?id=io.heckel.ntfy
- iOS: https://apps.apple.com/us/app/ntfy/id1625396347
Subscribe to topic:
- Open ntfy app
- Tap "+" to add subscription
- Enter topic: infra-YOUR-RANDOM-ID (you'll generate this in step 3.2)
- Server: https://ntfy.sh
- Tap "Subscribe"
Note: You need the topic ID from step 3.2 before subscribing. Come back here after generating it.
3.2 Configure Monitoring in .env
Generate unique topic ID:
```bash
TOPIC="infra-$(openssl rand -hex 8)"
echo "Your unique topic: $TOPIC"
```
Save this topic ID - you'll use it in the ntfy app.
Add monitoring configuration to .env:
```bash
# Navigate to project directory
cd /home/dawiddutoit/projects/network

# Add monitoring variables
cat >> .env << EOF
# Monitoring & Alerts
ALERT_ENABLED=true
NTFY_SERVER=https://ntfy.sh
NTFY_TOPIC=$TOPIC
AUTO_RECOVER=true
EOF
```
**Verify configuration:**
```bash
grep -A4 "Monitoring & Alerts" /home/dawiddutoit/projects/network/.env
```
Expected:
```
# Monitoring & Alerts
ALERT_ENABLED=true
NTFY_SERVER=https://ntfy.sh
NTFY_TOPIC=infra-a3f7d92b4c8e1f56
AUTO_RECOVER=true
```
**Configuration options:**
| Variable | Purpose | Default |
|----------|---------|---------|
| `ALERT_ENABLED` | Enable mobile push notifications | `false` |
| `NTFY_SERVER` | ntfy.sh server URL | `https://ntfy.sh` |
| `NTFY_TOPIC` | Unique topic for your alerts | None (required) |
| `AUTO_RECOVER` | Enable automatic recovery | `true` |
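Before relying on the timer, it can help to confirm that the variables in the table above are actually set. The following is a minimal sketch, not part of the project's scripts: the function name `check_monitor_env` is hypothetical, and it assumes the `.env` file uses plain `KEY=value` lines as shown in this section.

```bash
# Hypothetical helper (not part of infrastructure-monitor.sh): checks that a
# .env file defines the monitoring variables sensibly.
check_monitor_env() {
  local env_file="$1" key value status=0

  if [ ! -r "$env_file" ]; then
    echo "cannot read $env_file" >&2
    return 2
  fi

  # NTFY_TOPIC has no default, so it must be present and non-empty.
  # tail -n 1 picks the last definition in case the file was appended to twice.
  value="$(grep -E '^NTFY_TOPIC=' "$env_file" | tail -n 1 | cut -d= -f2-)"
  if [ -z "$value" ]; then
    echo "NTFY_TOPIC is missing or empty" >&2
    status=1
  fi

  # The two switches should be exactly true or false when present.
  for key in ALERT_ENABLED AUTO_RECOVER; do
    value="$(grep -E "^${key}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      true|false|"") ;;
      *) echo "$key should be true or false, got: $value" >&2; status=1 ;;
    esac
  done

  return "$status"
}
```

A possible call would be `check_monitor_env /home/dawiddutoit/projects/network/.env && echo "config looks complete"`.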
**To disable auto-recovery but keep alerts:**
```bash
# Edit .env
nano /home/dawiddutoit/projects/network/.env

# Change: AUTO_RECOVER=false
```
3.3 Install Systemd Timer
Install systemd service and timer to run monitoring every 5 minutes:
```bash
# Copy service files
sudo cp /home/dawiddutoit/projects/network/systemd/infrastructure-monitor.service /etc/systemd/system/
sudo cp /home/dawiddutoit/projects/network/systemd/infrastructure-monitor.timer /etc/systemd/system/

# Reload systemd
sudo systemctl daemon-reload

# Enable and start timer
sudo systemctl enable infrastructure-monitor.timer
sudo systemctl start infrastructure-monitor.timer
```
**Verify timer is active:**
```bash
# Check timer status
systemctl list-timers infrastructure-monitor.timer

# Check service status
sudo systemctl status infrastructure-monitor.timer
```
Expected:
```
● infrastructure-monitor.timer - Run infrastructure monitoring every 5 minutes
     Loaded: loaded (/etc/systemd/system/infrastructure-monitor.timer; enabled)
     Active: active (waiting) since...
```
**Timer configuration:**
- Runs every 5 minutes
- Starts 1 minute after boot
- Persistent (survives reboots)
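The project's actual unit file is not reproduced in this document, but a timer with a five-minute interval and a one-minute boot delay could look roughly like the sketch below. The directive values are assumptions inferred from the timer configuration list, and `write_example_timer` is a hypothetical helper; compare against the output of `systemctl cat infrastructure-monitor.timer` on your own system. The function writes into a directory of your choice so the result can be inspected without touching /etc/systemd/system.

```bash
# Sketch only: writes an example timer unit into the given directory.
# The directive values are assumptions, not a copy of the project's real
# infrastructure-monitor.timer.
write_example_timer() {
  local dir="$1"
  cat > "$dir/infrastructure-monitor.timer" <<'EOF'
[Unit]
Description=Run infrastructure monitoring every 5 minutes

[Timer]
# First run one minute after boot
OnBootSec=1min
# Then again five minutes after the service was last activated
OnUnitActiveSec=5min
Unit=infrastructure-monitor.service

[Install]
WantedBy=timers.target
EOF
}
```

With a layout like this, surviving reboots comes from `systemctl enable`, which hooks the timer into `timers.target`. systemd's own `Persistent=` directive is a separate feature: it only affects `OnCalendar=` timers, where it catches up on runs missed while the machine was off.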
3.4 Test Monitoring and Alerts
Test monitoring script:
```bash
# Run monitoring manually
/home/dawiddutoit/projects/network/scripts/infrastructure-monitor.sh
```
Expected output shows:
- Docker containers checked
- Tunnel connectivity tested
- Service health verified
- Network interface status
- Alert sent to ntfy topic
**Test alert delivery:**
Within 30 seconds, you should receive a push notification on your phone with the infrastructure status.
**If no notification received:**
Check ntfy topic subscription:
```bash
# Test sending to topic directly
curl -d "Test from infrastructure monitoring" "https://ntfy.sh/$TOPIC"
```
If direct curl works but monitoring doesn't:
- Check ALERT_ENABLED=true in .env
- Verify NTFY_TOPIC matches app subscription
- Check script has network access
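The checks above map onto the way a script would typically gate and publish an alert. The sketch below is one possible shape for that logic, not the code inside infrastructure-monitor.sh: `send_ntfy_alert` and the `NTFY_DRY_RUN` switch are hypothetical names, while `Title` and `Priority` are standard ntfy publish headers.

```bash
# Hypothetical sender, not the project's implementation. Reads ALERT_ENABLED,
# NTFY_SERVER and NTFY_TOPIC from the environment (for example after sourcing
# .env). NTFY_DRY_RUN is an invented switch for testing without network access.
send_ntfy_alert() {
  local title="$1" message="$2" priority="${3:-default}"
  local server="${NTFY_SERVER:-https://ntfy.sh}"

  # Alerts are off unless explicitly enabled.
  if [ "${ALERT_ENABLED:-false}" != "true" ]; then
    echo "alerts disabled (ALERT_ENABLED is not true)" >&2
    return 0
  fi
  if [ -z "${NTFY_TOPIC:-}" ]; then
    echo "NTFY_TOPIC is not set" >&2
    return 1
  fi

  # Dry run: show what would be published instead of calling curl.
  if [ "${NTFY_DRY_RUN:-false}" = "true" ]; then
    echo "POST $server/$NTFY_TOPIC [$priority] $title: $message"
    return 0
  fi

  # --fail turns HTTP errors into a non-zero exit status.
  curl --fail --silent --show-error --max-time 10 \
    -H "Title: $title" \
    -H "Priority: $priority" \
    -d "$message" \
    "$server/$NTFY_TOPIC" > /dev/null
}
```

Running it once with `NTFY_DRY_RUN=true` shows which server and topic would be used, which makes a mismatch with the app subscription easy to spot.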
3.5 Configure Home Assistant Integration (Optional)
Why use Home Assistant integration:
- Centralized home automation alerts
- Can trigger automations based on infrastructure status
- Redundancy with ntfy.sh
- Integration with existing HA notifications
Prerequisites:
- Home Assistant running and accessible
- HA mobile app installed (for notify.mobile_app_* service)
Step 1: Create Long-Lived Access Token
- Go to Home Assistant: http://192.168.68.123:8123
- Click your profile (bottom left)
- Scroll to "Long-Lived Access Tokens"
- Click "Create Token"
- Name: "Infrastructure Monitoring"
- Copy token (shown only once)
Step 2: Find Notification Service Name
- In Home Assistant: Developer Tools → Services
- Filter by "notify"
- Find your mobile app service:
notify.mobile_app_your_phone
Step 3: Add to .env
```bash
# Edit .env
nano /home/dawiddutoit/projects/network/.env

# Add HA configuration
HA_NOTIFICATIONS_ENABLED=true
HA_BASE_URL=http://192.168.68.123:8123
HA_ACCESS_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
HA_NOTIFY_SERVICE=notify.mobile_app_your_phone
```
**Step 4: Test HA Notifications**
```bash
# Run monitoring (should send to both ntfy and HA)
/home/dawiddutoit/projects/network/scripts/infrastructure-monitor.sh
```
Check that you receive the notification in the Home Assistant companion app.
**Troubleshooting HA notifications:**
```bash
# Test HA API access
curl -H "Authorization: Bearer YOUR_TOKEN" \
  http://192.168.68.123:8123/api/

# Test notification service
curl -X POST \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Test from infrastructure monitoring"}' \
  http://192.168.68.123:8123/api/services/notify/mobile_app_your_phone
```
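Note that `HA_NOTIFY_SERVICE` uses a dot (`notify.mobile_app_your_phone`) while the REST path uses a slash (`notify/mobile_app_your_phone`). The sketch below shows one way a script might bridge the two; both function names are hypothetical and this is not taken from infrastructure-monitor.sh.

```bash
# Hypothetical helper: turns a service name such as
# notify.mobile_app_your_phone into the REST URL used by the curl test.
ha_service_url() {
  local base_url="${1%/}" service="$2"   # strip one trailing slash from the base URL
  # Replace the first dot with a slash: notify.x -> notify/x
  echo "$base_url/api/services/${service/.//}"
}

# Sends a message through Home Assistant. Expects HA_BASE_URL, HA_ACCESS_TOKEN
# and HA_NOTIFY_SERVICE in the environment, as configured in step 3.
# Assumes the message contains no double quotes or backslashes, since it is
# placed into the JSON body without escaping.
send_ha_notification() {
  local message="$1"
  [ "${HA_NOTIFICATIONS_ENABLED:-false}" = "true" ] || return 0
  curl --fail --silent --show-error --max-time 10 -X POST \
    -H "Authorization: Bearer $HA_ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"message\": \"$message\"}" \
    "$(ha_service_url "$HA_BASE_URL" "$HA_NOTIFY_SERVICE")" > /dev/null
}
```

If the token or service name is wrong, `--fail` makes curl exit non-zero, so a caller can log the failure rather than silently dropping the notification.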
3.6 Verify Auto-Recovery
Monitor logs to see auto-recovery in action:
```bash
# View live monitoring logs
sudo journalctl -u infrastructure-monitor.service -f

# Or check persistent log
tail -f /var/log/infrastructure-monitor.log
```
**Auto-recovery capabilities:**
| Issue | Detection | Recovery Action |
|-------|-----------|----------------|
| Stuck cloudflared | No registrations in 10 min | Restart cloudflared container |
| Docker network isolation | Ping fails between containers | Recreate bridge network |
| Inactive Ethernet | WiFi used instead of eth0 | Activate Ethernet connection |
| Service failures | HTTP health checks fail | Restart affected containers |
**Test auto-recovery:**
```bash
# Simulate stuck tunnel
docker stop cloudflared

# Wait 5 minutes (next monitoring run)
# Check logs - should show tunnel restarted

# Verify tunnel recovered
docker ps | grep cloudflared
# docker logs replays the container's stderr on stderr, so merge streams before grep
docker logs cloudflared 2>&1 | grep "Registered tunnel"
```
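To make the detect-then-recover pattern concrete, here is a minimal sketch for the simplest case, a container that is not running. It is an illustration under stated assumptions, not the logic of infrastructure-monitor.sh: `recover_container` is a hypothetical name, and the real script's detection (for example tunnel registrations or HTTP health checks) is more involved.

```bash
# Hypothetical recovery step for one container. Not the real script.
recover_container() {
  local name="$1" running

  # docker inspect prints "true" or "false" for .State.Running;
  # a missing container makes it fail, which is treated as not running.
  running="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null || echo false)"
  if [ "$running" = "true" ]; then
    echo "ok: $name is running"
    return 0
  fi

  # Respect the AUTO_RECOVER switch from .env (default true).
  if [ "${AUTO_RECOVER:-true}" != "true" ]; then
    echo "critical: $name is down and AUTO_RECOVER is disabled"
    return 1
  fi

  # docker start brings back a stopped container without recreating it.
  if docker start "$name" > /dev/null 2>&1; then
    echo "warning: $name was down, started again"
    return 0
  fi
  echo "critical: could not start $name"
  return 1
}
```

After `docker stop cloudflared`, calling `recover_container cloudflared` would be expected to report the warning line and leave the container running again.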
3.7 View Monitoring Logs
View systemd service logs:
```bash
# Live monitoring logs
sudo journalctl -u infrastructure-monitor.service -f

# Last 50 lines
sudo journalctl -u infrastructure-monitor.service -n 50

# Logs from today
sudo journalctl -u infrastructure-monitor.service --since today

# Logs with timestamps
sudo journalctl -u infrastructure-monitor.service -o short-iso
```
**View persistent log file:**
```bash
# Live tail
tail -f /var/log/infrastructure-monitor.log

# Last 100 lines
tail -n 100 /var/log/infrastructure-monitor.log

# Search for errors
grep -i error /var/log/infrastructure-monitor.log

# Search for recoveries
grep -i "recovered" /var/log/infrastructure-monitor.log
```
**Check timer schedule:**
```bash
# Show next run time
systemctl list-timers infrastructure-monitor.timer

# Show timer configuration
systemctl cat infrastructure-monitor.timer
```
**Monitoring controls:**
```bash
# Stop monitoring temporarily
sudo systemctl stop infrastructure-monitor.timer

# Restart monitoring
sudo systemctl start infrastructure-monitor.timer

# Disable monitoring (survives reboot)
sudo systemctl disable infrastructure-monitor.timer

# Re-enable monitoring
sudo systemctl enable infrastructure-monitor.timer
```
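For a quick overview instead of reading raw output, the two grep searches can be combined into a one-line summary. This is a small sketch with a hypothetical function name; it assumes the log uses the words "error" and "recovered" as in the commands in this section.

```bash
# Hypothetical helper: counts error and recovery lines in a log file, using
# the same case-insensitive search terms as the grep commands in this section.
summarize_monitor_log() {
  local log_file="${1:-/var/log/infrastructure-monitor.log}" errors recoveries
  # grep -c prints the number of matching lines (0 when nothing matches).
  errors="$(grep -ci 'error' "$log_file")"
  recoveries="$(grep -ci 'recovered' "$log_file")"
  echo "errors=$errors recoveries=$recoveries"
}
```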
Supporting Files
| File | Purpose |
|---|---|
|  | Monitoring architecture, recovery strategies, ntfy.sh details |
|  | Example configurations, alert formats, log outputs |
|  | Test script for alert delivery |
Expected Outcomes
Success:
- ntfy app receives push notifications
- Monitoring runs every 5 minutes
- Auto-recovery fixes common failures within 5 minutes
- Logs show monitoring activity
- Home Assistant notifications working (if configured)
Partial Success:
- Monitoring runs but alerts not received (check topic subscription)
- Alerts received but auto-recovery disabled (set AUTO_RECOVER=true)
Failure Indicators:
- No notifications received after 10 minutes
- Timer not running (check systemctl status)
- Script fails with errors (check logs)
- HA notifications not working (check token/service name)
Requirements
- Infrastructure server running Linux with systemd
- Mobile device with ntfy app installed
- Internet connectivity for ntfy.sh
- .env file with monitoring configuration
- Home Assistant (optional, for HA integration)
Red Flags to Avoid
- Do not use public/guessable ntfy topic (security risk)
- Do not share ntfy topic publicly (anyone who knows the topic name can subscribe to it and publish to it)
- Do not disable monitoring without alternative alerting
- Do not ignore persistent alerts (investigate root cause)
- Do not run monitoring script too frequently (causes noise)
- Do not commit .env with ntfy topic to git (privacy)
- Do not use AUTO_RECOVER=false without manual monitoring
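The rule about not committing .env can be checked mechanically. The sketch below uses `git check-ignore`, which exits with status 0 when a path is matched by an ignore rule; the function name `env_is_git_ignored` is hypothetical and not part of the project.

```bash
# Hypothetical check: succeeds only if the given file is covered by an ignore
# rule in the git repository that contains it.
env_is_git_ignored() {
  local env_file="$1"
  # Run git from the file's directory and pass only the file name, so the
  # check works for both absolute and relative paths.
  git -C "$(dirname "$env_file")" check-ignore -q -- "$(basename "$env_file")"
}
```

For example, `env_is_git_ignored /home/dawiddutoit/projects/network/.env || echo "add .env to .gitignore"`. If .env was already committed, adding it to .gitignore is not enough: it also needs to be untracked with `git rm --cached .env`.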
Notes
- Monitoring checks run every 5 minutes via systemd timer
- ntfy.sh is free and doesn't require an account
- Topic ID should be random and private (security by obscurity)
- Auto-recovery attempts fixes before alerting as critical
- Alert levels: 🔴 Critical (manual intervention), ⚠️ Warning (recovery in progress)
- HA integration is optional and works alongside ntfy.sh
- Logs persist across reboots at /var/log/infrastructure-monitor.log
- Maximum detection time: 5 minutes (timer interval)
- Monitoring survives server reboots (systemd timer enabled)
- Use infrastructure-health-check skill for manual on-demand checks