eve-troubleshooting
Eve Troubleshooting
Use CLI-first diagnostics. Do not assume cluster access.
Quick Triage Checklist
bash
eve system health
eve auth status
eve job list --phase active
Common Issues and Fixes
Auth Fails or "Not authenticated"
bash
eve auth logout
eve auth login
eve auth status
If the SSH key is missing, ask the admin to register it, or follow the CLI prompt to fetch it from GitHub.
Secret Missing / Interpolation Error
bash
eve secrets list --project proj_xxx
eve secrets set MISSING_KEY "value" --project proj_xxx
Verify that .eve/secrets.yaml exists for local interpolation.
Deploy Job Failed
bash
eve job follow <job-id>
eve job diagnose <job-id>
eve job result <job-id>
Check for registry auth errors, missing secrets, or healthcheck failures.
GHCR Push Fails with UNAUTHORIZED
If build jobs fail with "UNAUTHORIZED: authentication required" when pushing to GHCR:
- Verify secrets are set: eve secrets list --project proj_xxx
- Confirm token scopes include read:packages + write:packages
- Check if the package is linked to the repo in GitHub Packages settings
- Add OCI source label to Dockerfile: LABEL org.opencontainers.image.source="https://github.com/ORG/REPO"
Unlinked packages in GHCR only allow pushes from the package owner. Linking a package to a repo makes it inherit the repository's collaborator permissions.
Build Failures
Symptoms
- Pipeline fails at build step
- eve build diagnose shows run status = failed
Triage
bash
eve build list --project <id> # Find recent builds
eve build diagnose <build_id> # Full state dump
eve build logs <build_id> # Raw build output
Common Causes
Registry authentication:
- Verify GHCR_USERNAME and GHCR_TOKEN (or GITHUB_TOKEN) secrets are set
- Token needs write:packages scope for GHCR pushes
- Check: eve secrets list --project <id>
Dockerfile issues:
- Service must have build.context in the manifest, pointing to the directory that contains the Dockerfile
- Dockerfile path defaults to <context>/Dockerfile
- Multi-stage builds work with BuildKit; they may fail with Kaniko
Workspace/clone errors:
- Build requires workspace at the correct git SHA
- Check eve build diagnose for workspace preparation errors
Image push failures:
- OCI labels help link packages to repos: add LABEL org.opencontainers.image.source="https://github.com/OWNER/REPO" to Dockerfile
- Ensure registry host matches manifest registry.host
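The registry host check in the last bullet can be sketched as a small shell helper. This is a minimal sketch, not part of the Eve CLI: the function names image_registry_host and registry_host_matches are hypothetical, and the host is derived using the usual container image reference convention (the first path component counts as a registry host only if it contains a dot or a colon, or is "localhost"; otherwise Docker Hub is implied).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not Eve commands): compare the registry host embedded
# in an image reference with the value configured as registry.host.

# Print the registry host of an image reference such as ghcr.io/org/app:tag.
image_registry_host() {
  ref="$1"
  case "$ref" in
    */*) first="${ref%%/*}" ;;          # text before the first slash
    *)   echo "docker.io"; return 0 ;;  # no slash: bare name, Docker Hub implied
  esac
  case "$first" in
    *.*|*:*|localhost) echo "$first" ;; # looks like a hostname (or host:port)
    *)                 echo "docker.io" ;;
  esac
}

# Succeed (exit 0) when the image's registry host equals the expected host.
registry_host_matches() {
  [ "$(image_registry_host "$1")" = "$2" ]
}
```

For example, registry_host_matches ghcr.io/OWNER/REPO:tag ghcr.io succeeds, while the same image compared against a different host fails, which points to a mismatch between the image name and the manifest's registry.host.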
Job Stuck or Blocked
bash
eve job show <job-id>
eve job dep list <job-id>
Resolve dependencies or update phase with eve job update if appropriate.
App Not Reachable After Deploy
- Confirm deploy job succeeded (eve job result).
- Validate ingress host pattern: {service}.{orgSlug}-{projectSlug}-{env}.{domain}.
- Ensure service port matches x-eve.ingress.port.
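To validate the ingress host, it can help to build the expected hostname from its parts and compare it with the URL you are actually requesting. The helper below is a minimal sketch that only fills in the pattern above; the function name ingress_host and the sample values are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Eve command): build the expected ingress host
# from the pattern {service}.{orgSlug}-{projectSlug}-{env}.{domain}.
ingress_host() {
  if [ "$#" -ne 5 ]; then
    echo "usage: ingress_host <service> <orgSlug> <projectSlug> <env> <domain>" >&2
    return 2
  fi
  printf '%s.%s-%s-%s.%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with made-up values:
#   ingress_host api acme shop staging example.com
#   prints: api.acme-shop-staging.example.com
```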
Escalation
If CLI output is insufficient, collect:
- Output of eve system health
- Output of eve job diagnose <job-id>
- manifest diff (recent changes)
Then hand off to the platform operator.
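The collection step could be scripted so the operator receives one directory of files. This is a sketch under stated assumptions: collect_escalation_bundle is a hypothetical function name, the manifest path is passed in by the caller because its location is not specified here, and the manifest is assumed to be tracked in git so that git log can show its recent changes. Only the two eve commands listed above are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: gather escalation evidence into a directory.
# Assumptions: "eve" is on PATH and the manifest file is tracked in git.
collect_escalation_bundle() {
  job_id="$1"
  manifest_path="$2"
  out_dir="${3:-eve-escalation}"

  if [ -z "$job_id" ] || [ -z "$manifest_path" ]; then
    echo "usage: collect_escalation_bundle <job-id> <manifest-path> [out-dir]" >&2
    return 2
  fi

  mkdir -p "$out_dir" || return 1

  # Keep going when a command fails: the error output is evidence too.
  eve system health > "$out_dir/system-health.txt" 2>&1 || true
  eve job diagnose "$job_id" > "$out_dir/job-diagnose.txt" 2>&1 || true

  # Recent manifest changes: the last few commits touching the file, with diffs.
  git log -p -n 5 -- "$manifest_path" > "$out_dir/manifest.diff" 2>&1 || true

  # Print the bundle location so it can be attached to the handoff.
  echo "$out_dir"
}
```

Calling collect_escalation_bundle <job-id> <manifest-path> writes three files into eve-escalation/ (or the directory given as the third argument) and prints that directory's path.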