Eve Verification Plans

Author agentic verification plans — markdown documents that fully specify the steps to verify an Eve-compatible app works correctly AND conforms to Eve platform conventions. Plans are actionable by humans or agents.

When to Use

Building verification for a new Eve-compatible app
Auditing an existing app for Eve platform conformance
After significant feature work that needs structured validation
Before handoff — proving the app works the Eve way, end to end
Onboarding a new team to understand what "correctly built on Eve" looks like

Quick Start

```bash
# 1. Create the verification directory in your project
mkdir -p e2e-verification/00-smoke

# 2. Copy the smoke template
cp templates/00-smoke-test-plan.md e2e-verification/00-smoke/

# 3. Customize for your app (endpoints, services, agents)

# 4. Run it
```

The Six Verification Dimensions

Every Eve app has up to six dimensions to verify. Cover all that apply.

Dimension	Tool	When	Conformance Check
Platform conformance	Eve CLI + manifest inspection	Always	CLI parity, manifest conventions, secrets model
Service layer	Eve CLI + REST API	Always	Every endpoint reachable via CLI; no kubectl needed
Input / ingestion	Repo fixtures + upload commands	When app accepts files	Deterministic fixtures, real parsing flows
Data layer	Eve CLI + DB migrations	When app has DB	Migrations via Eve pipeline, not manual SQL
UI / visual	agent-browser or Playwright	When frontend exists	SSO login, dark/light mode, agent-testable
Agent behavior	`eve job follow` + optimization	When app has agents	Efficient completion, no blind alleys

Verification Plan Format

Each scenario is a self-contained markdown document:

````markdown
# Scenario NN: <Name>

**Time:** ~Nm
**Environment:** staging | local | both
**Parallel Safe:** Yes/No
**Requires:** LLM | Browser | None

<one-paragraph description>

## Prerequisites
- What must be true before running
- Required secrets, auth, prior scenarios

## Fixtures
- File paths used by this scenario
- Provenance or generation command for each
- Why these files are representative

## Setup
```bash
# Environment detection + auth
# Project/org setup
# Fixture validation or generation
```

## Phases

### Phase 1: <Name>
```bash
# Commands to execute
```

**Expected:**
- Bullet list of assertions
- Each assertion is pass/fail verifiable

### Phase 2: ...

## Success Criteria
- [ ] Checkboxes for every pass/fail assertion
- [ ] Grouped by phase

## Debugging
| Symptom | Diagnostic | Fix |
|---------|------------|-----|
| ... | ... | ... |

## Cleanup
```bash
# Teardown commands
```
````

Format Rules

Environment-aware: Every plan starts with environment detection; `EVE_API_URL` determines cloud vs local
Self-contained: No assumed state beyond documented prerequisites
Fixture-explicit: Every uploaded/imported artifact is checked in or generated from documented commands
Phased: Break into phases that can run independently (parallel where safe)
Assertion-driven: Every step has explicit `Expected:` with pass/fail criteria
Debuggable: Troubleshooting section with symptom → diagnostic → fix
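
The environment-aware rule can be sketched as a small shell function. This is an illustration only: `detect_env` is a hypothetical helper, and the assumption that local clusters are reached through `localhost` or `127.0.0.1` URLs may not match your setup.

```bash
#!/usr/bin/env bash
# Classify the target environment from EVE_API_URL.
# Assumption: local clusters use localhost/127.0.0.1 style URLs;
# adjust the patterns to match your own local setup.
detect_env() {
  local url="${1:-${EVE_API_URL:-}}"
  case "$url" in
    "") echo "unset" ;;
    *localhost*|*127.0.0.1*) echo "local" ;;
    *) echo "cloud" ;;
  esac
}

EVE_ENV="$(detect_env)"
echo "Detected environment: $EVE_ENV"
```

A plan's Setup block could branch on `EVE_ENV` to pick the cloud or local deploy loop, and stop early when the value is `unset`.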

Platform Conformance Verification

Before testing functionality, verify the app is built the Eve way. Every verification suite starts with this checklist:

`.eve/manifest.yaml` exists and passes `eve project sync --dry-run`
Manifest follows current conventions (`name` preferred over legacy `project`)
All services have health endpoints reachable via Eve ingress
CLI can interact with every API endpoint (no "UI-only" functionality)
Secrets managed via `eve secrets`, not hardcoded or env-file-based
DB migrations run as pipeline steps, not manual scripts
Agents (if any) defined in `agents.yaml` with harness profiles
Pipelines (if any) defined in manifest and runnable via `eve pipeline run`
Frontend (if any) authenticates via Eve SSO, not custom auth
Upload/import flows (if any) have deterministic fixtures checked in or generated locally

See `references/eve-conformance-checks.md` for the full checklist with rationale.
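
The first two checklist items can be pre-checked locally before running `eve project sync --dry-run`. The sketch below is a hypothetical helper, not an Eve CLI feature, and it assumes the manifest is YAML whose identifier appears as a top-level `name:` key (current) or `project:` key (legacy). It does not replace the dry-run.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check for the manifest items of the checklist.
# Assumption: the identifier is a top-level `name:` or `project:` key.
check_manifest() {
  local manifest="${1:-.eve/manifest.yaml}"
  if [ ! -f "$manifest" ]; then
    echo "FAIL: $manifest not found"
    return 1
  fi
  if grep -qE '^name:' "$manifest"; then
    echo "PASS: manifest uses name"
  elif grep -qE '^project:' "$manifest"; then
    echo "WARN: manifest uses legacy project key"
  else
    echo "FAIL: neither name nor project found"
    return 1
  fi
}
```

Calling `check_manifest` with no argument from the project root inspects `.eve/manifest.yaml`; a non-zero exit status marks the checklist item as failed.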

Service Layer Verification

Test every API surface CLI-first:

```bash
TOKEN=$(eve auth token --raw)

# Health check (always first)
curl -sf "${APP_SCHEME}://api.${APP_DOMAIN}/health" | jq '.'

# App API via auth token
curl -sf -H "Authorization: Bearer $TOKEN" \
  "${APP_SCHEME}://api.${APP_DOMAIN}/endpoint" | jq '.field'
```

**CLI parity assertion**: For every `curl` call in a test plan, ask: "Can this also be done via a CLI command?" If not, file an issue rather than accepting the gap.

Pattern:
1. **Eve CLI** commands where they exist (deploy, env, job, secrets)
2. **App CLI** if the app follows `eve-app-cli` patterns
3. **REST API** via `curl` with auth tokens for custom endpoints
4. **Auth**: Mint tokens via `eve auth mint` or `eve auth token --raw`
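
Checks like the ones above become pass/fail assertions when each command is wrapped in a small helper. The sketch below is one possible shape; `expect`, `PASS`, and `FAIL` are hypothetical names, and the two example assertions use only shell built-ins so the snippet runs without an Eve environment.

```bash
#!/usr/bin/env bash
# Hypothetical assertion helper for plan phases (not an Eve CLI feature).
PASS=0
FAIL=0

# expect "<description>" <command...>
# Runs the command; records PASS when it exits 0, FAIL otherwise.
expect() {
  local desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    PASS=$((PASS + 1))
    echo "PASS: $desc"
  else
    FAIL=$((FAIL + 1))
    echo "FAIL: $desc"
  fi
}

# Example assertions; the second fails by design to show the FAIL path
expect "root directory exists" test -d /
expect "file /nonexistent/path exists" test -e /nonexistent/path

echo "passed=$PASS failed=$FAIL"
```

In a real plan the wrapped command would be the health check itself, for example `expect "health endpoint responds" curl -sf "${APP_SCHEME}://api.${APP_DOMAIN}/health"`.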

Input / Ingestion Verification

When the app accepts uploads, imports, or document bundles, verification must include a fixture plan.

Fixture Selection Order

Reuse existing repo fixtures if they match accepted file types and are deterministic
Manufacture synthetic fixtures locally with committed scripts
Source small public-domain fixtures only when local manufacture would reduce realism

Fixture Matrix

Minimal valid: smallest acceptable file that exercises the happy path
Typical real-world: representative document/media/import file
Boundary / invalid: wrong type, malformed structure, or size edge
Cross-format: if the app accepts multiple types (PDF + Markdown + CSV), verify each

What to Check

File accepted through the real app surface (CLI, REST endpoint, or browser upload)
MIME/type detection and metadata are correct
Storage/persistence path is correct
Downstream processing produces expected results
Error handling is explicit for rejected or malformed fixtures

Rule: If a plan says "upload a sample PDF", it must include an actual file path or generator step. "Find a PDF online" is not acceptable.

See `references/fixture-patterns.md` for detailed guidance.
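
A committed generator script is one way to satisfy the "manufacture synthetic fixtures locally" option. The sketch below covers three rows of the fixture matrix for a CSV import; the function name, file names, and columns are illustrative assumptions, not an Eve convention.

```bash
#!/usr/bin/env bash
# Hypothetical fixture generator (e.g. fixtures/scripts/make-fixtures.sh).
# Output is deterministic: no timestamps, no random data.
make_fixtures() {
  local out="$1" i
  mkdir -p "$out"

  # Minimal valid: header plus a single row
  printf 'id,name\n1,alpha\n' > "$out/minimal-valid.csv"

  # Typical: several rows generated from a fixed sequence
  {
    printf 'id,name\n'
    for i in 1 2 3 4 5; do
      printf '%d,item-%d\n' "$i" "$i"
    done
  } > "$out/typical.csv"

  # Boundary / invalid: data row with a missing column
  printf 'id,name\n1\n' > "$out/invalid-missing-column.csv"
}
```

Because repeated runs produce byte-identical files, the plan can reference the generated paths directly and `fixtures/README.md` can record this script as their provenance.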

UI Verification

When the app has a frontend, verify visual quality and interaction flows.

SSO Token Injection

```bash
# Mint an SSO token via CLI
SSO_TOKEN=$(eve auth mint --email user@example.com --org "$ORG_ID" --format sso-jwt)

# Use agent-browser with the token
agent-browser --session verify open "${APP_URL}/auth/callback?token=${SSO_TOKEN}"
agent-browser --session verify wait --url "**/dashboard"
agent-browser --session verify screenshot ./e2e-verification/artifacts/dashboard.png
```

What to Check

Pages render without console errors
Dark mode and light mode both work (screenshot both)
Key user flows complete (login → dashboard → action → result)
Responsive layout at standard breakpoints
Forms submit correctly and validation fires

See `references/ui-verification-patterns.md` for tool choice guidance and patterns.

Agent Verification

When the app includes Eve agents, verification extends to behavior quality.

1. Create a job that exercises the agent's primary workflow
2. Follow execution: `eve job follow <job-id>`
3. Check receipt: `eve job receipt <job-id>` (tokens, cost, duration)
4. Apply the `eve-agent-optimisation` diagnostic workflow
5. Record baseline metrics in the test plan for regression detection
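
Regression detection against a recorded baseline can be a simple integer comparison. The sketch below assumes the token count has already been read from the receipt by hand or by your own parsing, since the receipt's output format is not described here. `within_budget` and the numbers are placeholders.

```bash
#!/usr/bin/env bash
# within_budget <baseline> <current> <tolerance_pct>
# Succeeds when current <= baseline * (100 + tolerance_pct) / 100.
within_budget() {
  local baseline="$1" current="$2" tolerance="$3"
  local limit=$(( baseline * (100 + tolerance) / 100 ))
  [ "$current" -le "$limit" ]
}

# Placeholder numbers; in a real plan these come from the recorded baseline
# and from the latest job receipt.
BASELINE_TOKENS=12000
CURRENT_TOKENS=12900

if within_budget "$BASELINE_TOKENS" "$CURRENT_TOKENS" 10; then
  echo "PASS: token usage within 10% of baseline"
else
  echo "FAIL: token usage regressed beyond 10% of baseline"
fi
```

The same check works for cost or duration as long as the values are expressed as integers (for example cents or seconds).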

What to Check

Agent completes its task (correct output)
Token usage within acceptable bounds
No unnecessary tool calls or blind alleys
Error cases handled gracefully (bad input, missing secrets)
Multi-agent coordination works (jobs complete in dependency order)

See `references/agent-verification-patterns.md` for integration with optimization.

Deploy Cycle Patterns

Verification often reveals issues. The fix/deploy loop:

Cloud (Staging): Default

discover bug → fix code → commit → tag release-v* → push tag →
  wait for CI (publish-images → infra dispatch → deploy) →
  re-run failed scenario

Local (k3d)

discover bug → fix code → pnpm build →
  ./bin/eh k8s-image push → ./bin/eh k8s deploy →
  re-run failed scenario

See `references/deploy-cycle-patterns.md` for environment detection and wait patterns.
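
Both loops end with a wait before the failed scenario is re-run. A generic polling helper is one way to express that wait; `wait_for` below is a hypothetical function, not something the referenced document is known to define, and it expects whole-second integers for the timeout and interval.

```bash
#!/usr/bin/env bash
# wait_for <timeout_seconds> <interval_seconds> <command...>
# Re-runs the command until it succeeds or the timeout is reached.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "TIMEOUT after ${elapsed}s: $*"
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "READY after ${elapsed}s: $*"
}
```

For example, `wait_for 300 10 curl -sf "${APP_SCHEME}://api.${APP_DOMAIN}/health"` would poll the health endpoint every 10 seconds for up to 5 minutes before the scenario is re-run.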

Scenario Discovery

How to identify what scenarios to create for a given app:

Read the manifest — every service is a verification target
Read agents config — every agent needs a behavioral test
Read the API spec — every endpoint group is a potential scenario
Check upload/import surfaces — every accepted file class needs fixtures
Check pipelines — build/deploy/workflow pipelines need end-to-end verification
Check the UI — every page/route needs visual verification
Check integrations — webhooks, chat gateways, external APIs

Minimum Scenario Set

Scenario	Required	What It Covers
`00-smoke`	Always	Health, auth, connectivity + Eve conformance checklist
`01-deploy`	Always	Build, release, deploy via pipeline, verify endpoints
`02-core-flow`	Always	Primary user journey end-to-end (CLI + API)
`03-ui-visual`	If frontend	Screenshot verification, SSO login, dark/light mode
`04-input-ingestion`	If uploads/imports	Fixture upload, parsing, storage, error handling
`05-data-layer`	If database	Migrations via pipeline, schema correct, data integrity
`06-agent-execution`	If agents	Agents complete primary tasks correctly
`07-agent-optimization`	If agents	Baseline metrics + optimization pass
`08-pipeline-flows`	If pipelines	Each pipeline runs end-to-end, steps succeed in order

Running Verification Plans

Sequential Execution

```bash
# Run scenarios in order
for plan in e2e-verification/*/; do
  echo "=== Running: $plan ==="
  # Agent reads and executes the plan document
done
```

Parallel Execution

Scenarios marked `Parallel Safe: Yes` can run concurrently. Typically:

`00-smoke` runs first (validates prerequisites)
`01-deploy` through `03-ui-visual` can parallelize
Agent scenarios (`06-07`) depend on deploy completing
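
The ordering above can be sketched with shell background jobs. `run_plan` is a stand-in for however your agent or runner executes a plan document, and `run_suite` is a hypothetical wrapper; only the scenario names come from this document.

```bash
#!/usr/bin/env bash
# Stand-in for the real plan runner (an agent, a script, or a human).
run_plan() {
  echo "running $1"
}

run_suite() {
  local plan pid pids="" failed=0

  # Smoke first: stop early if prerequisites are not met
  run_plan "00-smoke" || return 1

  # Start the parallel-safe scenarios in the background
  for plan in "01-deploy" "02-core-flow" "03-ui-visual"; do
    run_plan "$plan" &
    pids="$pids $!"
  done

  # Wait for each one and remember whether any failed
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  [ "$failed" -eq 0 ] || return 1

  # Agent scenarios run only after the earlier scenarios have finished
  run_plan "06-agent-execution"
  run_plan "07-agent-optimization"
}
```

Calling `run_suite` returns non-zero as soon as the smoke scenario or any parallel scenario fails, so the agent scenarios never run against a broken deploy.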

CI Integration

```bash
# In CI, set environment and run
export EVE_API_URL=https://api.eh1.incept5.dev
eve auth login --email "$CI_EMAIL" --ssh-key "$CI_SSH_KEY"

# Run all scenarios, collect artifacts
mkdir -p e2e-verification/artifacts
# Agent executes each plan, screenshots/logs go to artifacts/
```

File Structure

<project-root>/
  e2e-verification/
    README.md                    # Index of all scenarios
    00-smoke/
      00-smoke-test-plan.md
    01-deploy/
      01-deploy-test-plan.md
    02-core-flow/
      02-core-flow-test-plan.md
      fixtures/
        README.md
        test-data.json
    03-ui-visual/
      03-ui-visual-test-plan.md
    04-input-ingestion/
      04-input-ingestion-test-plan.md
      fixtures/
        README.md
        sample-document.pdf
        sample-import.csv
        scripts/
          make-fixtures.sh
    artifacts/                   # Generated during runs (gitignored)

Numbering: `NN-kebab-name/` directories with matching `NN-kebab-name-test-plan.md`. Numbers imply execution order. Gaps are fine; they allow insertion without renaming.

Fixtures: Every scenario that depends on uploaded or imported inputs gets a `fixtures/` directory next to its test plan. `fixtures/README.md` records provenance, generation steps, and sensitivity notes.
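
The numbering convention is easy to check mechanically. The sketch below walks a verification directory and reports any `NN-kebab-name/` directory that lacks its matching plan file; `check_layout` is a hypothetical helper written for this illustration.

```bash
#!/usr/bin/env bash
# check_layout <verification-dir>
# Reports every NN-*/ directory that lacks NN-*-test-plan.md.
# Directories without a two-digit prefix (such as artifacts/) are skipped.
check_layout() {
  local root="$1" status=0 dir name
  for dir in "$root"/[0-9][0-9]-*/; do
    [ -d "$dir" ] || continue
    name="$(basename "$dir")"
    if [ -f "${dir}${name}-test-plan.md" ]; then
      echo "OK: $name"
    else
      echo "MISSING: $name/$name-test-plan.md"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_layout e2e-verification` from the project root returns non-zero when any scenario directory is missing its plan, which makes it usable as a first step in the smoke scenario.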

References

`references/test-plan-format.md`: Full format specification with annotated examples
`references/eve-conformance-checks.md`: The Eve way: what to verify and why
`references/fixture-patterns.md`: Fixture sourcing, manufacture, and documentation
`references/deploy-cycle-patterns.md`: Fix/deploy loop for cloud and local
`references/ui-verification-patterns.md`: Browser automation and SSO auth patterns
`references/agent-verification-patterns.md`: Agent testing and optimization integration

Templates

`templates/00-smoke-test-plan.md`: Starter smoke + conformance template
`templates/scenario-test-plan.md`: General scenario skeleton
`templates/upload-ingest-test-plan.md`: Input-heavy scenario with fixture matrix

Skill	Relationship
`eve-agent-optimisation`	Called from agent verification scenarios
`eve-web-ui-testing-agent-browser`	UI verification tool
`eve-deploy-debugging`	Deploy cycle troubleshooting
`eve-cli-primitives`	CLI commands used in service-layer tests
`eve-manifest-authoring`	Manifest conventions that conformance checks validate
`eve-app-cli`	App CLI patterns — verification asserts CLI parity
`eve-pipelines-workflows`	Pipeline conventions tested by pipeline scenarios
`eve-auth-and-secrets`	Auth + secrets model validated by conformance checks
`eve-troubleshooting`	General debugging when verification fails
`eve-read-eve-docs`	Reference source for current CLI/manifest/auth behavior
`eve-agent-native-design`	Design principles encoded in conformance checks
