infra-manage-ssh-services
Works with SSH commands, Docker remote management, and infrastructure health checks.
Infrastructure SSH Service Management
Quick Start
**Discover available infrastructure:**

```bash
# List all hosts and their status
ping -c 1 -W 1 infra.local && echo "✅ infra.local (primary)" || echo "❌ infra.local"
ping -c 1 -W 1 192.168.68.135 && echo "✅ deus (development)" || echo "❌ deus"
ping -c 1 -W 1 homeassistant.local && echo "✅ homeassistant.local" || echo "❌ homeassistant.local"
```

**Check primary infrastructure services:**

```bash
# View all running Docker services on infra.local
ssh infra "docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'"

# Quick MongoDB health check (MongoDB 4.4 uses 'mongo' not 'mongosh')
ssh infra "docker exec local-infra-mongodb-1 mongo off --quiet --eval 'db.runCommand({ping: 1})'" 2>/dev/null
```

**Before using remote MongoDB (NomNom project):**

```bash
# Verify MongoDB is accessible
nc -z infra.local 27017 && echo "✅ MongoDB port open" || echo "❌ MongoDB unreachable"
```
Connection Reference

To connect to infra.local, you have three equivalent options:

```bash
# Option 1: Use the connect function (recommended)
connect infra

# Option 2: Use the SSH alias from ~/.ssh/config
ssh infra

# Option 3: Use the full hostname
ssh dawiddutoit@infra.local
```

**All three commands do the same thing:**
- Connect to `infra.local`
- Authenticate as user `dawiddutoit`
- Use SSH key `~/.ssh/id_ed25519`

**First-time setup (if SSH key not yet copied):**
```bash
connect infra --setup
# This copies your SSH public key to infra.local for passwordless authentication
```

**For other hosts:**
```bash
connect deus      # or: ssh deus      # or: ssh dawiddutoit@192.168.68.135
connect ha        # or: ssh ha        # or: ssh root@homeassistant.local
connect motor     # or: ssh motor     # or: ssh dawiddutoit@pi4-motor.local
connect armitage  # or: ssh unit@armitage.local
```

**Running commands on infra.local (without interactive shell):**

```bash
# Execute single command
ssh infra "docker ps"

# Execute multiple commands
ssh infra "cd ~/projects/local-infra && docker compose ps"

# Chain commands
ssh infra "docker ps -f name=mongodb && docker logs --tail 10 local-infra-mongodb-1"
```
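The alias-to-target mapping above can be captured in a small lookup. This is a minimal sketch, not the actual `connect` function (which lives in `~/.zshrc` and is not reproduced in this document); `resolve_host` is a hypothetical name, and the user and host values are copied from the commands listed above.

```bash
# Hypothetical helper: map a short alias to its "user@host" SSH target.
# The real `connect` function may resolve hosts differently.
resolve_host() {
  case "$1" in
    infra)    echo "dawiddutoit@infra.local" ;;
    deus)     echo "dawiddutoit@192.168.68.135" ;;
    ha)       echo "root@homeassistant.local" ;;
    motor)    echo "dawiddutoit@pi4-motor.local" ;;
    armitage) echo "unit@armitage.local" ;;
    *)        echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# Example: open a session using the resolved target
# ssh "$(resolve_host infra)"
```

An unknown alias returns a non-zero status, so a typo fails before any SSH attempt is made.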
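Table of Contents
- When to Use This Skill
- What This Skill Does
- Instructions
  - 3.1 Discovery Phase - Find available hosts and services
  - 3.2 Health Check Phase - Verify connectivity
  - 3.3 Execution Phase - Manage services
- Supporting Files
- Common Workflows
- Expected Outcomes
- Integration Points
- Expected Benefits
- Requirements
- Red Flags to Avoid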
When to Use This Skill
Explicit Triggers (User Requests)
- "Check infrastructure status"
- "Connect to infra/deus/ha"
- "View Docker services on infra"
- "Test MongoDB connectivity"
- "What services are running on infra.local?"
- "Troubleshoot remote MongoDB connection"
- "Check Langfuse status"
- "View OTLP collector logs"
Implicit Triggers (Contextual Needs)
- Before using remote MongoDB in NomNom project
- When remote service connection fails (MongoDB, Neo4j, Langfuse)
- Before starting development session that uses remote resources
- When planning to use OpenTelemetry/Langfuse observability
- When investigating service availability for integration work
Debugging/Troubleshooting Triggers
- Connection refused errors to infra.local services
- MongoDB ServerSelectionTimeoutError
- SSH authentication failures
- Docker container not responding
- Service appears running but not accessible
- Neo4j or Infinity in restart loop
What This Skill Does
This skill provides systematic workflows for:
- Service Discovery - Identify available hosts (5 total) and running services (16+ on infra.local)
- Connectivity Testing - Verify network reachability, port availability, SSH access
- Docker Management - View, restart, and monitor remote Docker containers
- Health Verification - Check service health status and logs
- Troubleshooting - Diagnose connection issues and service failures
- Infrastructure Integration - Ensure remote resources (MongoDB, Langfuse, OTLP) are ready for use
Instructions
3.1 Discovery Phase
**Step 1: Identify Target Host**

Use the `connect` function to determine which host you need:

```bash
# View available hosts
connect
# Output: Hosts: infra, armitage, deus, ha, motor
```

**Infrastructure Inventory:**

| Host | Connection | Status | Primary Services |
|------|------------|--------|------------------|
| **infra.local** | `connect infra` | ✅ Online | MongoDB, Langfuse, OTLP, Jaeger, Neo4j, Infinity, PostgreSQL, Redis, MinIO, Mosquitto, Caddy |
| **deus** | `connect deus` | ✅ Online | None detected (development machine) |
| **homeassistant.local** | `connect ha` | ✅ Online | Home Assistant (port 8123) |
| **pi4-motor.local** | `connect motor` | ❌ Offline | Motor control (Raspberry Pi 4) |
| **armitage.local** | `connect armitage` | ❌ Offline | Neo4j, Infinity Embeddings (WSL2 PC) |

**Step 2: Test Host Reachability**
```bash
# Quick network ping test
ping -c 1 -W 1 infra.local

# Test specific port availability
nc -z infra.local 27017  # MongoDB
nc -z infra.local 3000   # Langfuse
nc -z infra.local 4317   # OTLP Collector
nc -z infra.local 7687   # Neo4j (if not in restart loop)
```
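The four port probes above follow the same pattern, so they can be folded into one loop. The sketch below assumes a helper named `check_ports` (hypothetical) and reuses the ports and labels from Step 2; it returns non-zero if any port is closed.

```bash
# Sketch: probe each service port on one host and report the result.
# check_ports is a hypothetical name; ports and labels come from Step 2.
check_ports() {
  local host="$1" failed=0 entry port label
  for entry in "27017:MongoDB" "3000:Langfuse" "4317:OTLP Collector" "7687:Neo4j"; do
    port="${entry%%:*}"   # text before the first colon
    label="${entry#*:}"   # text after the first colon
    if nc -z "$host" "$port" 2>/dev/null; then
      echo "✅ $label ($port)"
    else
      echo "❌ $label ($port)"
      failed=1
    fi
  done
  return "$failed"
}

# Example:
# check_ports infra.local
```

Because the function keeps going after a failure, one run shows every closed port instead of stopping at the first.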
**Step 3: Discover Running Services**
```bash
# View all Docker containers on infra.local
ssh infra "docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'"

# Count running services
ssh infra "docker ps --format '{{.Names}}' | wc -l"

# Check specific service
ssh infra "docker ps -f name=mongodb"
```
3.2 Health Check Phase

**Step 1: Verify SSH Connectivity**

```bash
# Test basic SSH connection
ssh infra "echo 'Connection OK'"

# If SSH fails, check SSH agent
ssh-add -l

# Copy SSH key if needed (first-time setup)
connect infra --setup
```

**Step 2: Check Service Health**
```bash
# MongoDB health check
ssh infra "docker inspect --format='{{.State.Health.Status}}' local-infra-mongodb-1"
ssh infra "docker exec local-infra-mongodb-1 mongo off --quiet --eval 'db.runCommand({ping: 1})'"

# Langfuse health check (HTTP)
curl -s -o /dev/null -w "%{http_code}" http://infra.local:3000

# OTLP Collector health check
ssh infra "docker inspect --format='{{.State.Status}}' local-infra-otel-collector-1"

# View container logs for errors
ssh infra "docker logs --tail 50 local-infra-mongodb-1"
```
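`docker inspect` prints a bare status string, which is easy to mishandle in scripts. A minimal sketch of mapping that string to an exit code follows; `health_verdict` is a hypothetical helper, and the fallback branch assumes that an empty or unexpected value means the container defines no HEALTHCHECK or the inspect call failed.

```bash
# Sketch: turn a Docker health status string into a message and exit code.
# "healthy", "unhealthy", and "starting" are the values Docker reports
# for containers that define a HEALTHCHECK.
health_verdict() {
  case "$1" in
    healthy)   echo "✅ healthy" ;;
    starting)  echo "⏳ still starting"; return 2 ;;
    unhealthy) echo "❌ unhealthy"; return 1 ;;
    *)         echo "❓ no health status reported"; return 3 ;;
  esac
}

# Example, assuming the SSH alias and container name used above:
# health_verdict "$(ssh infra "docker inspect --format='{{.State.Health.Status}}' local-infra-mongodb-1" 2>/dev/null)"
```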
**Step 3: Verify Application-Level Connectivity**

For **MongoDB** (NomNom project):
```bash
# Test from application environment
cd ~/projects/play/nomnom
python -c "
import asyncio
from motor.motor_asyncio import AsyncIOMotorClient
async def ping():
    await AsyncIOMotorClient('mongodb://infra.local:27017').admin.command('ping')
asyncio.run(ping())
" && echo "✅ MongoDB reachable"
```

For **Langfuse**:
```bash
# Check web UI accessibility
curl -I http://infra.local:3000 | grep "HTTP"
```
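The Langfuse check in Step 2 prints only the HTTP status code. As a rough guide to reading it, the sketch below treats 2xx and 3xx as reachable; curl reports `000` when it receives no HTTP response at all. `http_verdict` is a hypothetical name.

```bash
# Sketch: interpret the status code printed by curl -w "%{http_code}".
http_verdict() {
  case "$1" in
    2??|3??) echo "✅ reachable (HTTP $1)" ;;
    000)     echo "❌ no HTTP response"; return 1 ;;
    *)       echo "⚠️ responded with HTTP $1"; return 1 ;;
  esac
}

# Example:
# http_verdict "$(curl -s -o /dev/null -w "%{http_code}" http://infra.local:3000)"
```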
3.3 Execution Phase

**Service Management Commands:**

```bash
# Restart single service
ssh infra "cd ~/projects/local-infra && docker compose restart mongodb"

# Restart all services
ssh infra "cd ~/projects/local-infra && docker compose restart"

# Stop service
ssh infra "cd ~/projects/local-infra && docker compose stop mongodb"

# Start service
ssh infra "cd ~/projects/local-infra && docker compose up -d mongodb"

# View Docker Compose configuration
ssh infra "cd ~/projects/local-infra && docker compose config"
```
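A restart only helps if the container comes back healthy, so it can be worth polling the health status afterwards. The following is a sketch under stated assumptions: `restart_and_wait` is a hypothetical helper, the container is assumed to be named `local-infra-<service>-1`, and the service is assumed to define a HEALTHCHECK (as the MongoDB commands above imply).

```bash
# Sketch: restart one Compose service, then poll until Docker reports it healthy.
# Usage: restart_and_wait <service> [max_checks] [seconds_between_checks]
restart_and_wait() {
  local service="$1" tries="${2:-10}" delay="${3:-3}" status="" i=1
  ssh infra "cd ~/projects/local-infra && docker compose restart $service" || return 1
  while [ "$i" -le "$tries" ]; do
    status="$(ssh infra "docker inspect --format='{{.State.Health.Status}}' local-infra-$service-1" 2>/dev/null)" || status=""
    if [ "$status" = "healthy" ]; then
      echo "✅ $service healthy after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "❌ $service not healthy after $tries checks (last status: ${status:-unknown})"
  return 1
}

# Example:
# restart_and_wait mongodb
```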
**Monitoring Commands:**
```bash
# Follow logs in real-time
ssh infra "docker logs -f local-infra-mongodb-1"

# View last 100 lines
ssh infra "docker logs --tail 100 local-infra-langfuse-web-1"

# View logs for all services
ssh infra "cd ~/projects/local-infra && docker compose logs -f"

# Check resource usage
ssh infra "docker stats --no-stream"
```

**File Synchronization:**
```bash
# Push file to infra.local
syncpi push ~/path/to/file

# Pull file from infra.local
syncpi pull ~/path/to/file

# Sync zsh configuration
syncpi zsh push
syncpi zsh pull
```
Supporting Files

references/infrastructure_guide.md

Complete infrastructure documentation - Read this for:
- Detailed service inventory with ports and URLs
- Environment variable mappings
- Docker Compose management on infra.local
- Troubleshooting guides for specific services
- Security notes and credential locations

When to read: Before performing any infrastructure operations, when troubleshooting connection issues, or when needing detailed service information.

Location: `/Users/dawiddutoit/.claude/artifacts/2026-01-01/infrastructure/SSH_INFRASTRUCTURE_GUIDE.md`

scripts/health_check.sh

Quick health check script - Automated connectivity and service status checks.

Usage: see the Utility Scripts section.

See references/detailed-workflows.md for:
- 7 comprehensive workflows (NomNom setup, connection debugging, service discovery, restart loop diagnosis, SSH setup, OTLP verification, file syncing)
- Expected outcomes (successful/failed health checks, restart loop diagnosis)
- Integration examples (NomNom, observability, Home Assistant, quality gates)
- Troubleshooting guide (connection refused, permission denied, restart loops, slow SSH)
- Advanced techniques (complex commands, real-time monitoring, batch health checks)

Environment variables:

```env
MONGODB_URL=mongodb://infra.local:27017
MONGODB_DATABASE=off
```
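When the endpoint comes from `MONGODB_URL` rather than being typed by hand, the host and port for `nc -z` can be derived from the URL. This is a sketch that assumes the simple `mongodb://host[:port]` form shown above; `mongo_endpoint` is a hypothetical helper and does not handle credentials or multiple hosts.

```bash
# Sketch: print "host port" for a simple mongodb:// URL.
# Falls back to MongoDB's default port 27017 when the URL names no port.
mongo_endpoint() {
  local url="${1#mongodb://}"   # strip the scheme
  url="${url%%/*}"              # drop any /database suffix
  local host="${url%%:*}" port="27017"
  case "$url" in
    *:*) port="${url##*:}" ;;
  esac
  echo "$host $port"
}

# Example: probe whatever MONGODB_URL points at
# set -- $(mongo_endpoint "$MONGODB_URL") && nc -z "$1" "$2"
```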
With Observability Skills

Before using observability skills:

```bash
# Verify OTLP Collector is running
ssh infra "docker ps -f name=otel-collector -q" | grep -q . || echo "⚠️ OTLP Collector offline"

# Then use skills:
# - observability-analyze-logs
# - observability-analyze-session-logs
```
With Home Assistant Skills

Before using HA skills:

```bash
# Verify Home Assistant is accessible
curl -s -H "Authorization: Bearer $HA_LONG_LIVED_TOKEN" http://192.168.68.123:8123/api/ | grep -q "message" && echo "✅ HA API accessible"

# Then use skills:
# - ha-dashboard-create
# - ha-custom-cards
# - ha-mushroom-cards
```
With Quality Gates

Infrastructure verification as quality gate:

```bash
# Add to pre-start checks
if ! nc -z infra.local 27017; then
  echo "❌ QUALITY GATE FAILED: MongoDB unreachable"
  echo "Run: ssh infra 'cd ~/projects/local-infra && docker compose restart mongodb'"
  exit 1
fi
```
Expected Benefits
| Metric | Before Skill | After Skill | Improvement |
|---|---|---|---|
| Discovery Time | 5-10 min (manual SSH, guessing) | 30 sec (automated checks) | 10-20x faster |
| Troubleshooting Time | 10-30 min (trial and error) | 2-5 min (systematic workflow) | 5-6x faster |
| Connection Failures | 30-40% (no verification) | <5% (proactive health checks) | 6-8x reduction |
| Service Availability Awareness | Unknown until failure | Real-time status | Proactive visibility |
| Documentation Access | Search files, guess locations | Single skill reference | Immediate context |
Success Metrics
- Discovery Success Rate - Can identify all online hosts and services in <30 seconds
- Health Check Coverage - Verify critical services (MongoDB, Langfuse, OTLP) before use
- Troubleshooting Efficiency - Resolve 80% of connection issues within 5 minutes
- Proactive Usage - Check infrastructure before remote operations (NomNom, observability)
- Zero Surprise Failures - No "connection refused" errors due to unchecked infrastructure
Requirements
Tools
- Bash (for SSH commands and connectivity tests)
- Read (for comprehensive infrastructure guide)

Environment
- SSH access to remote hosts (via `~/.ssh/config`)
- SSH keys configured (use `connect <host> --setup` if needed)
- Network connectivity to infra.local (primary), deus, homeassistant.local
- `connect` function in `~/.zshrc` (lines 290-306)
- Optional: `syncpi` function for file synchronization
Knowledge
- Basic SSH command syntax
- Understanding of Docker and Docker Compose
- Familiarity with port-based service discovery (nc, curl)
- Environment variables for service endpoints
Utility Scripts
scripts/health_check.sh

Purpose: Run comprehensive health checks across all infrastructure hosts

Usage:

```bash
# Check all hosts
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh

# Check specific host
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh infra

# Verbose output
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh --verbose
```

**Checks performed:**
1. Network reachability (ping)
2. SSH connectivity
3. Docker daemon status
4. Container health for critical services
5. Port availability for key services
6. Service-specific health endpoints
Red Flags to Avoid
- Assuming local MongoDB - MongoDB runs on infra.local, NOT localhost
- Skipping connectivity checks - Always verify before using remote services
- Ignoring offline hosts - armitage.local and pi4-motor.local are offline (environment variables may point to them)
- Missing SSH key setup - Run `connect <host> --setup` on first use
- Not checking container health - Container "Up" ≠ healthy (use `docker inspect` for health)
- Hardcoding IPs - Use hostnames (infra.local, homeassistant.local) for mDNS resolution
- Ignoring restart loops - Neo4j and Infinity are restarting on infra.local (check logs)
- Skipping logs when debugging - Always view logs before restarting services
- Not testing ports - Use `nc -z` to verify port availability before connection attempts
- Missing Docker Compose context - Always `cd ~/projects/local-infra` before Docker Compose commands
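Two of the red flags above (ignoring restart loops, skipping logs) can be addressed in one pass. The sketch below lists containers that Docker currently reports as restarting and prints the tail of each one's logs; `show_restarting` is a hypothetical name. A container caught between restarts may briefly show as running, so an empty result is not proof that no loop exists.

```bash
# Sketch: find containers in the "restarting" state on infra.local and
# show recent log lines for each, before deciding whether to restart anything.
show_restarting() {
  local names name
  names="$(ssh infra "docker ps --filter status=restarting --format '{{.Names}}'")" || return 1
  if [ -z "$names" ]; then
    echo "✅ no containers restarting"
    return 0
  fi
  for name in $names; do
    echo "=== $name ==="
    ssh infra "docker logs --tail 20 $name" 2>&1
  done
}

# Example:
# show_restarting
```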
Notes
Key Infrastructure Facts:
- Primary Host: infra.local (16+ services, always online)
- MongoDB: 632K OpenFoodFacts products already imported
- Telemetry: All Claude Code sessions automatically send OTLP to infra.local:4317
- Offline Services: Neo4j and Infinity Embeddings in restart loop on infra.local
- Alternative Endpoints: armitage.local has Neo4j/Infinity but is currently offline
- Home Assistant: Separate host with 16 related skills in ~/.claude/skills/
Environment Variable Locations:
- SSH config: `~/.ssh/config`
- Secrets: `~/.zshrc` (lines 366-540)
- Project .env: `~/projects/play/nomnom/.env` (MongoDB URL)

Related Documentation:
- Complete infrastructure guide: `/Users/dawiddutoit/.claude/artifacts/2026-01-01/infrastructure/SSH_INFRASTRUCTURE_GUIDE.md`
- NomNom MongoDB setup: `/Users/dawiddutoit/projects/play/nomnom/CLAUDE.md` (lines 224-235)
- Observability skills: `~/.claude/CLAUDE.md` (search "observability-*")
- Home Assistant skills: `~/.claude/CLAUDE.md` (search "ha-*")