eve-verification-plans
Eve Verification Plans
Author agentic verification plans — markdown documents that fully specify the steps to verify an Eve-compatible app works correctly AND conforms to Eve platform conventions. Plans are actionable by humans or agents.
When to Use
- Building verification for a new Eve-compatible app
- Auditing an existing app for Eve platform conformance
- After significant feature work that needs structured validation
- Before handoff — proving the app works the Eve way, end to end
- Onboarding a new team to understand what "correctly built on Eve" looks like
Quick Start
```bash
# 1. Create the verification directory in your project
mkdir -p e2e-verification/00-smoke

# 2. Copy the smoke template
cp templates/00-smoke-test-plan.md e2e-verification/00-smoke/

# 3. Customize for your app (endpoints, services, agents)
# 4. Run it
```
The Six Verification Dimensions
Every Eve app has up to six dimensions to verify. Cover all that apply.
| Dimension | Tool | When | Conformance Check |
|---|---|---|---|
| Platform conformance | Eve CLI + manifest inspection | Always | CLI parity, manifest conventions, secrets model |
| Service layer | Eve CLI + REST API | Always | Every endpoint reachable via CLI; no kubectl needed |
| Input / ingestion | Repo fixtures + upload commands | When app accepts files | Deterministic fixtures, real parsing flows |
| Data layer | Eve CLI + DB migrations | When app has DB | Migrations via Eve pipeline, not manual SQL |
| UI / visual | agent-browser or Playwright | When frontend exists | SSO login, dark/light mode, agent-testable |
| Agent behavior | | When app has agents | Efficient completion, no blind alleys |
Verification Plan Format
Each scenario is a self-contained markdown document:
````markdown
# Scenario NN: <Name>

Time: ~Nm
Environment: staging | local | both
Parallel Safe: Yes/No
Requires: LLM | Browser | None

<one-paragraph description>

## Prerequisites
- What must be true before running
- Required secrets, auth, prior scenarios

## Fixtures
- File paths used by this scenario
- Provenance or generation command for each
- Why these files are representative

## Setup
```bash
# Environment detection + auth
# Project/org setup
# Fixture validation or generation
```

## Phases

### Phase 1: <Name>
```bash
# Commands to execute
```

Expected:
- Bullet list of assertions
- Each assertion is pass/fail verifiable

### Phase 2: ...

## Success Criteria
- Checkboxes for every pass/fail assertion
- Grouped by phase

## Debugging
| Symptom | Diagnostic | Fix |
|---|---|---|
| ... | ... | ... |

## Cleanup
```bash
# Teardown commands
```
````
Format Rules
- Environment-aware: Every plan starts with environment detection; `EVE_API_URL` determines cloud vs local
- Self-contained: No assumed state beyond documented prerequisites
- Fixture-explicit: Every uploaded/imported artifact is checked in or generated from documented commands
- Phased: Break into phases that can run independently (parallel where safe)
- Assertion-driven: Every step has explicit `Expected:` with pass/fail criteria
- Debuggable: Troubleshooting section with symptom → diagnostic → fix
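The environment-aware rule can be sketched as a small shell helper. This is only an illustration: the `detect_env` name and the host patterns treated as "local" (loopback and `lvh.me` style hosts) are assumptions, and only `EVE_API_URL` comes from the rules above.

```bash
# Classify the target environment from an API URL.
# Assumption: loopback or lvh.me hosts mean "local"; anything else is "cloud".
detect_env() {
  local url="${1:-${EVE_API_URL:-}}"
  case "$url" in
    "") echo "unset" ;;
    *://localhost*|*://127.0.0.1*|*.lvh.me*) echo "local" ;;
    *) echo "cloud" ;;
  esac
}

# Usage: branch the plan's setup phase on the result.
# ENVIRONMENT=$(detect_env)
```

A plan's Setup block could call this once and skip cloud-only or local-only phases based on the result.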
Platform Conformance Verification
Before testing functionality, verify the app is built the Eve way. Every verification suite starts with this checklist:
- `.eve/manifest.yaml` exists and passes `eve project sync --dry-run`
- Manifest follows current conventions (`name` preferred over legacy `project`)
- All services have health endpoints reachable via Eve ingress
- CLI can interact with every API endpoint (no "UI-only" functionality)
- Secrets managed via `eve secrets`, not hardcoded or env-file-based
- DB migrations run as pipeline steps, not manual scripts
- Agents (if any) defined in `agents.yaml` with harness profiles
- Pipelines (if any) defined in manifest and runnable via `eve pipeline run`
- Frontend (if any) authenticates via Eve SSO, not custom auth
- Upload/import flows (if any) have deterministic fixtures checked in or generated locally
See `references/eve-conformance-checks.md` for the full checklist with rationale.
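The first checklist item lends itself to a quick pre-flight check. The sketch below assumes the plan runs from (or is pointed at) the project root; `check_manifest` is a hypothetical helper name, and the dry-run sync is only attempted when an `eve` binary is on the PATH.

```bash
# Pre-flight for the conformance checklist: manifest presence plus dry-run sync.
# Assumption: the argument (default ".") is the project root.
check_manifest() {
  local root="${1:-.}"
  if [ ! -f "$root/.eve/manifest.yaml" ]; then
    echo "FAIL: $root/.eve/manifest.yaml not found"
    return 1
  fi
  if command -v eve >/dev/null 2>&1; then
    # Only attempted when the Eve CLI is installed.
    (cd "$root" && eve project sync --dry-run) || { echo "FAIL: dry-run sync"; return 1; }
  else
    echo "SKIP: eve CLI not on PATH, dry-run sync not attempted"
  fi
  echo "PASS: manifest present"
}
```

The remaining checklist items still need manual or plan-specific verification; this covers only the manifest line.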
Service Layer Verification
Test every API surface CLI-first:
```bash
TOKEN=$(eve auth token --raw)

# Health check (always first)
curl -sf "${APP_SCHEME}://api.${APP_DOMAIN}/health" | jq '.'

# App API via auth token
curl -sf -H "Authorization: Bearer $TOKEN" \
  "${APP_SCHEME}://api.${APP_DOMAIN}/endpoint" | jq '.field'
```

**CLI parity assertion**: For every `curl` call in a test plan, ask: "Can this also be done via a CLI command?" If not, file an issue; don't accept it.

Pattern:
1. **Eve CLI** commands where they exist (deploy, env, job, secrets)
2. **App CLI** if the app follows `eve-app-cli` patterns
3. **REST API** via `curl` with auth tokens for custom endpoints
4. **Auth**: Mint tokens via `eve auth mint` or `eve auth token --raw`
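The CLI parity review can be made mechanical by listing every `curl` invocation in a plan. The helper below is a rough sketch; `list_curl_calls` is a hypothetical name, and the pattern only catches `curl` used as a command word, not prose mentions.

```bash
# List every curl invocation in a plan so each can be reviewed for CLI parity.
# Prints "<line-number>:<line>" per match; exit status 1 means no curl calls.
list_curl_calls() {
  grep -nE '(^|[[:space:]|;(])curl[[:space:]]' "$1"
}

# Usage:
# list_curl_calls e2e-verification/00-smoke/00-smoke-test-plan.md
```

Each reported line is a prompt to ask whether an Eve CLI or app CLI command covers the same operation.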
Input / Ingestion Verification
When the app accepts uploads, imports, or document bundles, verification must include a fixture plan.
Fixture Selection Order
- Reuse existing repo fixtures if they match accepted file types and are deterministic
- Manufacture synthetic fixtures locally with committed scripts
- Source small public-domain fixtures only when local manufacture would reduce realism
Fixture Matrix
- Minimal valid — smallest acceptable file that exercises the happy path
- Typical real-world — representative document/media/import file
- Boundary / invalid — wrong type, malformed structure, or size edge
- Cross-format — if the app accepts multiple types (PDF + Markdown + CSV), verify each
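A committed generator script is one way to manufacture such a matrix deterministically. The sketch below assumes the app accepts CSV imports; the function name, file names, and columns are all hypothetical.

```bash
# Generate a small deterministic CSV fixture matrix.
# Assumption: the app accepts CSV imports; file names and columns are hypothetical.
make_csv_fixtures() {
  local dir="$1"
  mkdir -p "$dir"
  # Minimal valid: header plus one row.
  printf 'id,name\n1,alpha\n' > "$dir/minimal-valid.csv"
  # Typical: a few representative rows.
  printf 'id,name\n1,alpha\n2,beta\n3,gamma\n' > "$dir/typical.csv"
  # Boundary / invalid: a row with a missing column.
  printf 'id,name\n1\n' > "$dir/invalid-missing-column.csv"
}

# Usage:
# make_csv_fixtures e2e-verification/04-input-ingestion/fixtures
```

Because the content is fixed, repeated runs produce byte-identical files, which keeps assertions on parsing results stable.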
What to Check
- File accepted through the real app surface (CLI, REST endpoint, or browser upload)
- MIME/type detection and metadata are correct
- Storage/persistence path is correct
- Downstream processing produces expected results
- Error handling is explicit for rejected or malformed fixtures
Rule: If a plan says "upload a sample PDF", it must include an actual file path or generator step. "Find a PDF online" is not acceptable.
See `references/fixture-patterns.md` for detailed guidance.
UI Verification
When the app has a frontend, verify visual quality and interaction flows.
SSO Token Injection
```bash
# Mint an SSO token via CLI
SSO_TOKEN=$(eve auth mint --email user@example.com --org $ORG_ID --format sso-jwt)

# Use agent-browser with the token
agent-browser --session verify open "${APP_URL}/auth/callback?token=${SSO_TOKEN}"
agent-browser --session verify wait --url "**/dashboard"
agent-browser --session verify screenshot ./e2e-verification/artifacts/dashboard.png
```
What to Check
- Pages render without console errors
- Dark mode and light mode both work (screenshot both)
- Key user flows complete (login → dashboard → action → result)
- Responsive layout at standard breakpoints
- Forms submit correctly and validation fires
See `references/ui-verification-patterns.md` for tool choice guidance and patterns.
Agent Verification
When the app includes Eve agents, verification extends to behavior quality.
- Create a job that exercises the agent's primary workflow
- Follow execution: `eve job follow <job-id>`
- Check receipt: `eve job receipt <job-id>` (tokens, cost, duration)
- Apply `eve-agent-optimisation` diagnostic workflow
- Record baseline metrics in the test plan for regression detection
What to Check
- Agent completes its task (correct output)
- Token usage within acceptable bounds
- No unnecessary tool calls or blind alleys
- Error cases handled gracefully (bad input, missing secrets)
- Multi-agent coordination works (jobs complete in dependency order)
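The "token usage within acceptable bounds" check can be expressed as a comparison against the recorded baseline. In the sketch below, `within_token_budget` and the 20% default tolerance are illustrative assumptions; the actual token count would be read from the job receipt, whose output format is not assumed here.

```bash
# Pass when token usage stays within a tolerance of the recorded baseline.
# Args: actual tokens, baseline tokens, allowed increase in percent (default 20).
within_token_budget() {
  local actual="$1" baseline="$2" tolerance="${3:-20}"
  local limit=$(( baseline + baseline * tolerance / 100 ))
  [ "$actual" -le "$limit" ]
}

# Usage (values copied by hand from a receipt and the plan's baseline):
# within_token_budget 11500 10000 || echo "FAIL: token regression"
```

Integer arithmetic keeps the check dependency-free; a tighter tolerance can be passed as the third argument once the baseline is stable.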
See `references/agent-verification-patterns.md` for integration with optimization.
Deploy Cycle Patterns
Verification often reveals issues. The fix/deploy loop:
Cloud (Staging) — Default
discover bug → fix code → commit → tag release-v* → push tag →
wait for CI (publish-images → infra dispatch → deploy) →
re-run failed scenario
Local (k3d)
discover bug → fix code → pnpm build →
./bin/eh k8s-image push → ./bin/eh k8s deploy →
re-run failed scenario
See `references/deploy-cycle-patterns.md` for environment detection and wait patterns.
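Both loops include a wait before the scenario is re-run. One possible wait pattern is a bounded retry around a readiness command; `wait_for` below is a hypothetical helper, not part of the Eve CLI, and the attempt count and delay are arbitrary.

```bash
# Retry a command until it succeeds or attempts run out.
# Args: max attempts, delay in seconds, then the command to run.
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    # Sleep only when another attempt remains.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
}

# Usage (health URL pattern taken from the service-layer examples):
# wait_for 30 10 curl -sf "${APP_SCHEME}://api.${APP_DOMAIN}/health"
```

A bounded loop fails the scenario with a clear status instead of hanging when a deploy never becomes healthy.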
Scenario Discovery
How to identify what scenarios to create for a given app:
- Read the manifest — every service is a verification target
- Read agents config — every agent needs a behavioral test
- Read the API spec — every endpoint group is a potential scenario
- Check upload/import surfaces — every accepted file class needs fixtures
- Check pipelines — build/deploy/workflow pipelines need end-to-end verification
- Check the UI — every page/route needs visual verification
- Check integrations — webhooks, chat gateways, external APIs
Minimum Scenario Set
| Scenario | Required | What It Covers |
|---|---|---|
| | Always | Health, auth, connectivity + Eve conformance checklist |
| | Always | Build, release, deploy via pipeline, verify endpoints |
| | Always | Primary user journey end-to-end (CLI + API) |
| | If frontend | Screenshot verification, SSO login, dark/light mode |
| | If uploads/imports | Fixture upload, parsing, storage, error handling |
| | If database | Migrations via pipeline, schema correct, data integrity |
| | If agents | Agents complete primary tasks correctly |
| | If agents | Baseline metrics + optimization pass |
| | If pipelines | Each pipeline runs end-to-end, steps succeed in order |
Running Verification Plans
Sequential Execution
```bash
# Run scenarios in order
for plan in e2e-verification/*/; do
  echo "=== Running: $plan ==="
  # Agent reads and executes the plan document
done
```
Parallel Execution
Scenarios marked `Parallel Safe: Yes` can run concurrently. Typically:
- `00-smoke` runs first (validates prerequisites)
- `01-deploy` through `03-ui-visual` can parallelize
- Agent scenarios (`06-07`) depend on deploy completing
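Selecting the concurrent set can be done by reading each plan's header. The sketch below assumes plans live at `e2e-verification/NN-name/*-test-plan.md` and declare `Parallel Safe: Yes` on a single line; `parallel_safe_plans` is a hypothetical helper name.

```bash
# Print plan files whose header declares "Parallel Safe: Yes".
# Assumption: plans live at <root>/NN-name/*-test-plan.md.
parallel_safe_plans() {
  local root="${1:-e2e-verification}"
  # \** tolerates bold markers around the label, e.g. "**Parallel Safe:** Yes".
  grep -lE 'Parallel Safe:\**[[:space:]]*Yes' "$root"/*/*-test-plan.md 2>/dev/null | sort
}

# Usage:
# parallel_safe_plans | while read -r plan; do echo "can run concurrently: $plan"; done
```

The dependency ordering described above (smoke first, agent scenarios after deploy) still has to be applied by whoever schedules the runs.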
CI Integration
```bash
# In CI, set environment and run
export EVE_API_URL=https://api.eh1.incept5.dev
eve auth login --email $CI_EMAIL --ssh-key $CI_SSH_KEY

# Run all scenarios, collect artifacts
mkdir -p e2e-verification/artifacts
# Agent executes each plan, screenshots/logs go to artifacts/
```
File Structure
```
<project-root>/
  e2e-verification/
    README.md                          # Index of all scenarios
    00-smoke/
      00-smoke-test-plan.md
    01-deploy/
      01-deploy-test-plan.md
    02-core-flow/
      02-core-flow-test-plan.md
      fixtures/
        README.md
        test-data.json
    03-ui-visual/
      03-ui-visual-test-plan.md
    04-input-ingestion/
      04-input-ingestion-test-plan.md
      fixtures/
        README.md
        sample-document.pdf
        sample-import.csv
    scripts/
      make-fixtures.sh
    artifacts/                         # Generated during runs (gitignored)
```
Numbering: `NN-kebab-name/` directories with matching `NN-kebab-name-test-plan.md`. Numbers imply execution order. Gaps are fine; they allow insertion without renaming.
Fixtures: Every scenario that depends on uploaded or imported inputs gets a sibling `fixtures/` directory. `fixtures/README.md` records provenance, generation steps, and sensitivity notes.
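The numbering convention can be checked with a short script. This is a sketch under the layout described above; `check_plan_layout` is a hypothetical helper, and it only verifies that each numbered directory contains a plan file with the matching name.

```bash
# Check that each NN-kebab-name/ directory holds NN-kebab-name-test-plan.md.
# Prints one line per mismatch and returns non-zero if any were found.
check_plan_layout() {
  local root="${1:-e2e-verification}" status=0 dir name
  for dir in "$root"/[0-9][0-9]-*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    if [ ! -f "$dir$name-test-plan.md" ]; then
      echo "MISSING: $dir$name-test-plan.md"
      status=1
    fi
  done
  return "$status"
}

# Usage:
# check_plan_layout e2e-verification
```

Directories without a numeric prefix, such as `artifacts/`, are skipped by the glob and never reported.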
References
- `references/test-plan-format.md`: Full format specification with annotated examples
- `references/eve-conformance-checks.md`: The Eve way: what to verify and why
- `references/fixture-patterns.md`: Fixture sourcing, manufacture, and documentation
- `references/deploy-cycle-patterns.md`: Fix/deploy loop for cloud and local
- `references/ui-verification-patterns.md`: Browser automation and SSO auth patterns
- `references/agent-verification-patterns.md`: Agent testing and optimization integration
Templates
- `templates/00-smoke-test-plan.md`: Starter smoke + conformance template
- `templates/scenario-test-plan.md`: General scenario skeleton
- `templates/upload-ingest-test-plan.md`: Input-heavy scenario with fixture matrix
Related Skills
| Skill | Relationship |
|---|---|
| | Called from agent verification scenarios |
| | UI verification tool |
| | Deploy cycle troubleshooting |
| | CLI commands used in service-layer tests |
| | Manifest conventions that conformance checks validate |
| | App CLI patterns; verification asserts CLI parity |
| | Pipeline conventions tested by pipeline scenarios |
| | Auth + secrets model validated by conformance checks |
| | General debugging when verification fails |
| | Reference source for current CLI/manifest/auth behavior |
| | Design principles encoded in conformance checks |