eve-troubleshooting

Eve Troubleshooting

Use CLI-first diagnostics. Do not assume cluster access.

Quick Triage Checklist

bash
eve system health
eve auth status
eve job list --phase active
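
The checklist above can be wrapped in a small helper that keeps going after a failure and reports which step broke. This is only a sketch: it assumes each eve command exits non-zero when its check fails, which the text above does not confirm, and the function name quick_triage is hypothetical.

```bash
# Run the three triage commands in order and report the ones that fail.
# Assumption: eve exits non-zero when a check fails.
quick_triage() {
  local failed=0
  local cmd
  for cmd in "system health" "auth status" "job list --phase active"; do
    echo "== eve $cmd =="
    # $cmd is intentionally unquoted so it splits into separate arguments.
    if ! eve $cmd; then
      echo "FAILED: eve $cmd" >&2
      failed=$((failed + 1))
    fi
  done
  # The exit status is the number of failed steps (0 means all passed).
  return "$failed"
}
```

Calling quick_triage prints each command's output under a header, so the first failing step is easy to spot.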

Common Issues and Fixes

Auth Fails or "Not authenticated"

bash
eve auth logout
eve auth login
eve auth status
If the SSH key is missing, register it with the admin or follow the CLI prompt to fetch it from GitHub.

Secret Missing / Interpolation Error

bash
eve secrets list --project proj_xxx
eve secrets set MISSING_KEY "value" --project proj_xxx
Verify that .eve/secrets.yaml exists for local interpolation.

Deploy Job Failed

bash
eve job follow <job-id>
eve job diagnose <job-id>
eve job result <job-id>
Check for registry auth errors, missing secrets, or healthcheck failures.

GHCR Push Fails with UNAUTHORIZED

If build jobs fail with "UNAUTHORIZED: authentication required" when pushing to GHCR:
  1. Verify secrets are set:
    eve secrets list --project proj_xxx
  2. Confirm token scopes include read:packages and write:packages
  3. Check if the package is linked to the repo in GitHub Packages settings
  4. Add OCI source label to Dockerfile:
    LABEL org.opencontainers.image.source="https://github.com/ORG/REPO"
Unlinked packages in GHCR only allow pushes from the package owner. Linking a package to a repo lets it inherit that repository's collaborator permissions.
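
Step 4 is easy to verify before rebuilding. The helper below is a minimal sketch that greps a Dockerfile for the source label; the function name has_source_label is hypothetical, and the check assumes the label sits on a single LABEL line rather than a multi-line continuation.

```bash
# Succeed if the Dockerfile declares the OCI source label on a LABEL line.
# Usage: has_source_label path/to/Dockerfile
has_source_label() {
  # -i because Dockerfile instructions are case-insensitive.
  grep -Eiq '^[[:space:]]*LABEL[[:space:]].*org\.opencontainers\.image\.source=' "$1"
}
```

For example, `has_source_label ./Dockerfile || echo "source label missing"` flags a Dockerfile that still needs the label.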

Build Failures

Symptoms

  • Pipeline fails at build step
  • eve build diagnose shows run status = failed

Triage

bash
eve build list --project <id>          # Find recent builds
eve build diagnose <build_id>          # Full state dump
eve build logs <build_id>              # Raw build output
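
When the output is long, it can help to save the diagnose dump and the raw logs to files and search them. The sketch below does that for one build; the function name capture_build, the file names, and the search terms are illustrative assumptions, not part of the eve CLI.

```bash
# Save diagnose output and raw logs for one build, then list suspicious lines.
# Usage: capture_build <build_id> [out_dir]
capture_build() {
  local build_id="$1"
  local out_dir="${2:-.}"
  mkdir -p "$out_dir" || return 1
  eve build diagnose "$build_id" > "$out_dir/$build_id.diagnose.txt" 2>&1
  eve build logs "$build_id" > "$out_dir/$build_id.logs.txt" 2>&1
  # Search terms are guesses at common failure markers; adjust as needed.
  grep -inE 'unauthorized|denied|not found|error|failed' "$out_dir/$build_id".*.txt
}
```

The function exits non-zero when nothing matches, so an empty result means the search terms need widening rather than that the build is healthy.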

Common Causes

Registry authentication:
  • Verify GHCR_USERNAME and GHCR_TOKEN (or GITHUB_TOKEN) secrets are set
  • Token needs write:packages scope for GHCR pushes
  • Check: eve secrets list --project <id>
Dockerfile issues:
  • Service must have build.context in manifest pointing to directory with Dockerfile
  • Dockerfile path defaults to <context>/Dockerfile
  • Multi-stage builds work with BuildKit; may fail with Kaniko
Workspace/clone errors:
  • Build requires workspace at the correct git SHA
  • Check eve build diagnose for workspace preparation errors
Image push failures:
  • OCI labels help link packages to repos: add LABEL org.opencontainers.image.source="https://github.com/OWNER/REPO" to Dockerfile
  • Ensure registry host matches manifest registry.host
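
The registry authentication check can be scripted. This sketch assumes that eve secrets list prints each secret name as a whole word somewhere in its output; the actual output format is not documented here, so treat the matching as a guess. The function name check_registry_secrets is hypothetical.

```bash
# Report which registry credentials are absent from the project's secret list.
# Usage: check_registry_secrets <project-id>
check_registry_secrets() {
  local project="$1"
  local listing
  local missing=0
  listing=$(eve secrets list --project "$project") || return 2
  if ! printf '%s\n' "$listing" | grep -qw 'GHCR_USERNAME'; then
    echo "missing: GHCR_USERNAME" >&2
    missing=1
  fi
  # Either token name satisfies the requirement.
  if ! printf '%s\n' "$listing" | grep -qwE 'GHCR_TOKEN|GITHUB_TOKEN'; then
    echo "missing: GHCR_TOKEN (or GITHUB_TOKEN)" >&2
    missing=1
  fi
  return "$missing"
}
```

This only confirms that the secrets exist. Token scopes such as write:packages still have to be checked on the GitHub side.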

Job Stuck or Blocked

bash
eve job show <job-id>
eve job dep list <job-id>
Resolve the blocking dependencies, or update the job's phase with eve job update if appropriate.

App Not Reachable After Deploy

  • Confirm deploy job succeeded (eve job result).
  • Validate ingress host pattern: {service}.{orgSlug}-{projectSlug}-{env}.{domain}.
  • Ensure service port matches x-eve.ingress.port.
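
To validate the host pattern, it can help to build the expected host from its parts and compare it with the URL being requested. The helper below is a small sketch; the function name ingress_host and the example values (api, acme, shop, staging, example.com) are hypothetical.

```bash
# Build the expected ingress host from its five parts.
# Usage: ingress_host <service> <orgSlug> <projectSlug> <env> <domain>
ingress_host() {
  if [ "$#" -ne 5 ]; then
    echo "usage: ingress_host <service> <orgSlug> <projectSlug> <env> <domain>" >&2
    return 2
  fi
  # Pattern: {service}.{orgSlug}-{projectSlug}-{env}.{domain}
  printf '%s.%s-%s-%s.%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with made-up values:
#   ingress_host api acme shop staging example.com
#   prints: api.acme-shop-staging.example.com
```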

Escalation

If CLI output is insufficient, collect:
  • eve system health
  • eve job diagnose <job-id>
  • manifest diff (recent changes)
Then hand off to the platform operator.
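
The three items above can be gathered into one directory for the handoff. This is a sketch under assumptions: the function name collect_escalation and the output file names are hypothetical, and the manifest diff is taken from git history (the last three commits touching a manifest path you supply), which may not match how your project tracks manifest changes.

```bash
# Collect escalation artifacts into a directory and print its path.
# Usage: collect_escalation <job-id> [out_dir] [manifest_path]
collect_escalation() {
  local job_id="$1"
  local out_dir="${2:-eve-escalation}"
  local manifest="${3:-}"
  mkdir -p "$out_dir" || return 1
  eve system health > "$out_dir/system-health.txt" 2>&1
  eve job diagnose "$job_id" > "$out_dir/job-diagnose.txt" 2>&1
  # Optional: recent manifest changes, only when a path is given.
  if [ -n "$manifest" ]; then
    git log -p -n 3 -- "$manifest" > "$out_dir/manifest-diff.txt" 2>&1
  fi
  echo "$out_dir"
}
```

Review the collected files for secret values before sharing them with the platform operator.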