eve-agent-optimisation

Eve Agent Optimisation

The goal: get the agent to its objective in the fewest tool calls, fewest tokens, shortest time. Find where it wastes effort and eliminate it.

Hard Rule: Recommend, Don't Change

Never change the harness, model, reasoning effort, or permission policy without asking the user first. These are cost and capability decisions that belong to the project owner. Diagnose, explain the tradeoff, and recommend — then wait for approval.

What You're Looking For

Analyse agent execution logs to identify:

Wrong turns: agent tried an approach that couldn't work and had to backtrack.
Blind alleys: agent spent tokens exploring something irrelevant to the goal.
Unnecessary tool calls: agent read files it didn't need, ran commands that gave no useful information, or repeated calls with slight variations.
Missing context: agent had to discover something through trial and error that should have been stated in the SKILL.md or job description.
Wrong tool for the job: agent used a slow or fragile tool when a faster native alternative exists (e.g., shelling out to `pdftotext` when the LLM reads PDFs natively).
Excessive reading: agent read entire large files when it only needed a section, or read many files looking for something that could have been found with a targeted search.
Verbose output: agent explained its reasoning at length when the task only needed a concise result.
Retry loops: agent repeated the same failing operation, hoping for a different result.

Diagnostic Workflow

Step 1: Get the Execution Record

```bash
eve job diagnose <job-id>          # Full timeline, routing, errors
eve job show <job-id> --verbose    # Phase, attempts, harness, agent
eve job receipt <job-id>           # Token usage + cost
```

Key numbers:

Input tokens — how much the agent read. High = reading too much.
Output tokens — how much it wrote. High = verbose or excessive reasoning.
Attempt count — more than 1 means the agent crashed or timed out.
Duration — compare against what a focused agent should take.

Step 2: Stream or Replay the Logs

```bash
eve job follow <job-id>            # Real-time (if still active)
eve job logs <job-id>              # Historical
```

Read the log sequentially. For each tool call, ask:

Did this advance the goal? If not, it's waste.
Could this have been avoided? If the SKILL.md had told the agent where to look, would it have skipped this?
Was this the right tool? Could a different approach have gotten the same information faster?
Was the scope right? Did the agent read an entire file when it needed 10 lines?

Step 3: Map the Critical Path

Identify the minimum set of tool calls needed to achieve the goal:

What files actually mattered?
What commands actually produced useful output?
What decisions were correct on first attempt?

Everything else is waste. Quantify: how many tool calls were on the critical path vs total? What percentage of tokens were spent on productive work?
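
The quantification above can be sketched in a few lines of Python. This is only an illustration, not part of the Eve CLI: the `ToolCall` record, its `on_critical_path` flag, and the sample numbers are all hypothetical, and the sketch assumes you label each call by hand while reading the log.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One tool call from the log (hypothetical record shape)."""
    name: str
    tokens: int
    on_critical_path: bool  # labelled by hand while reading the log


def efficiency(calls: list[ToolCall]) -> dict[str, float]:
    """Return the share of calls and tokens spent on the critical path."""
    total_calls = len(calls)
    total_tokens = sum(c.tokens for c in calls)
    productive = [c for c in calls if c.on_critical_path]
    productive_tokens = sum(c.tokens for c in productive)
    return {
        "call_pct": 100 * len(productive) / total_calls if total_calls else 0.0,
        "token_pct": 100 * productive_tokens / total_tokens if total_tokens else 0.0,
    }


# Made-up sample: 2 of 4 calls mattered, carrying 3000 of 10000 tokens.
calls = [
    ToolCall("Read README.md", 4000, False),
    ToolCall("Read src/config/auth.ts", 1000, True),
    ToolCall("Grep 'token'", 3000, False),
    ToolCall("Edit src/config/auth.ts", 2000, True),
]
result = efficiency(calls)
print(result)  # {'call_pct': 50.0, 'token_pct': 30.0}
```

Reporting both percentages is deliberate: a run can look efficient by call count while a single oversized read dominates the token bill.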

Step 4: Identify Root Causes

For each category of waste, trace back to the root cause:

| Waste | Root Cause | Fix |
| --- | --- | --- |
| Agent explored wrong files | SKILL.md doesn't say where to look | Add specific file paths or search patterns to SKILL.md |
| Agent tried wrong approach first | SKILL.md doesn't state the preferred approach | Add explicit instructions: "Do X, not Y" |
| Agent read files it didn't need | Job description too vague | Narrow the description; specify exact scope |
| Agent retried failing command | No error handling guidance | Add failure mode instructions to SKILL.md |
| Agent used wrong tool for file type | SKILL.md doesn't mention native capabilities | Add file-type routing: "PDFs: read natively. Images: view directly." |
| Agent read entire large file | No guidance on targeted reading | Add instructions: "Read only lines 1-50" or "Search for X" |
| Agent verbose in output | No output format specified | Specify exact format: JSON schema, attachment name, concise summary |
| Agent lacks context for decisions | Missing resource refs or env vars | Attach the right resources; ensure `with_apis` is configured |
| Agent re-discovers known facts | No persistent memory strategy | Use org docs, KV store, or attachments to carry forward knowledge |
| Agent slow due to provisioning | Too many resources, large clone, unnecessary toolchains | Trim resource refs, configure shallow clone, remove unused toolchains |

The Fix Is Almost Always the SKILL.md

The SKILL.md is the highest-leverage optimisation target. A precise SKILL.md eliminates entire categories of wasted tool calls.

Write for Efficiency

State the goal in one sentence. The agent should know exactly what it's trying to achieve before doing anything.
Name specific files and paths. "Check the auth config" wastes tool calls searching. "Read `src/config/auth.ts` lines 1-30" is one tool call.
State the approach explicitly. "Use native PDF reading via the Read tool — do NOT shell out to conversion tools" prevents the agent from trying the wrong path.
Specify what NOT to do. If there's a common wrong turn, block it. "Do not read the entire test suite; only read the failing test file."
Define the output format. "Write a JSON attachment named `findings.json` with schema `{issues: [{file, line, severity, message}]}`." This eliminates formatting deliberation.
Tell the agent what context it has. "The resource index at `.eve/resources/index.json` lists all attached documents with mime_type. Read it first to determine processing strategy."

Provide decision trees for branches. Instead of "handle different file types appropriately":

Check mime_type in resource index:
- application/pdf → read natively, use page ranges for >10 pages
- text/* → read directly
- image/* → view directly (multimodal)
- other → describe and note for human review

Keep it short. Every word the agent reads consumes input tokens. Cut filler. Use tables and lists over prose.
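
To see why the decision tree above beats "handle different file types appropriately", it helps to note that it translates directly into a total function with no ambiguous branch. The sketch below is only an illustration of that property: `processing_strategy` and its `page_count` parameter are hypothetical names, and the agent itself follows the SKILL.md text rather than running this code.

```python
def processing_strategy(mime_type: str, page_count: int = 0) -> str:
    """Map a mime_type to the handling rule from the decision tree."""
    if mime_type == "application/pdf":
        # Long PDFs are read in page ranges rather than all at once.
        if page_count > 10:
            return "read natively, using page ranges"
        return "read natively"
    if mime_type.startswith("text/"):
        return "read directly"
    if mime_type.startswith("image/"):
        return "view directly (multimodal)"
    # Anything unrecognised is flagged instead of guessed at.
    return "describe and note for human review"


for mime, pages in [("application/pdf", 25), ("text/markdown", 0), ("application/zip", 0)]:
    print(mime, "->", processing_strategy(mime, pages))
```

Every input lands in exactly one branch, including the fallback, so the agent never has to deliberate about an unlisted type.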

Test the SKILL.md

After rewriting, run the same job again and compare:

Fewer tool calls?
Fewer tokens?
Faster completion?
Correct result on first attempt?

```bash
eve job compare <old-job-id> <new-job-id>   # Compare receipts
```
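
When reporting the comparison to the user, relative change is usually easier to read than raw numbers. The following sketch assumes you have copied the figures by hand from the two receipts. The metric names (`tool_calls`, `tokens`, `seconds`) and the sample values are hypothetical and do not reflect the actual output format of `eve job compare`.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative = reduction)."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return 100 * (new - old) / old


def compare_runs(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Percent change for every metric present in both runs."""
    return {k: round(percent_change(old[k], new[k]), 1) for k in old if k in new}


# Hypothetical numbers copied by hand from two receipts.
before = {"tool_calls": 40, "tokens": 120_000, "seconds": 300}
after = {"tool_calls": 12, "tokens": 45_000, "seconds": 90}
changes = compare_runs(before, after)
print(changes)  # {'tool_calls': -70.0, 'tokens': -62.5, 'seconds': -70.0}
```

A single pair of runs is noisy, so treat these percentages as indicative rather than as a measured effect of the SKILL.md change.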

Beyond the SKILL.md

When SKILL.md changes aren't sufficient, look at these levers (all require user approval to change):

Harness and Model

If the agent is consistently:

Too slow for the task → recommend a faster model (e.g., sonnet → haiku).
Not capable enough → recommend a more capable model (e.g., sonnet → opus).
Using too many thinking tokens → recommend lower reasoning effort.
Not thinking enough → recommend higher reasoning effort.

Present the tradeoff (speed vs cost vs quality) and let the user decide.

Permission Policy

If the agent is blocked waiting for approvals on every file edit:

Recommend `yolo` for automated batch work.
Recommend `auto_edit` for supervised coding.
Explain the security implications.

Resource Refs

If provisioning is slow:

Remove resource refs the agent doesn't actually use.
Mark optional context as `required: false`.
Thread `mime_type` so the agent doesn't need to probe file types.

Git Controls

If the agent wastes time on git operations:

`commit: auto` + `push: on_success` eliminates manual git ceremony.
`create_branch: if_missing` avoids branch creation failures.
`ref_policy: auto` minimises clone scope.

Job Scope

If the agent is doing too much in one job:

Split into focused children via orchestration.
Each child gets a narrow scope and specialised SKILL.md.
Cheaper models for simpler children; capable models only where needed.

Team Coordination

If child agents duplicate work:

Ensure skills read `.eve/coordination-inbox.md` at startup.
Wire `depends_on` for sequential steps.
Use attachments (not prose) for passing data between jobs.

Optimisation Report Template

After analysing an agent's execution, present findings in this format:

Agent Optimisation Report: <job-id>

Goal: <what the agent was trying to do>
Result: <succeeded/failed> in <duration> using <tokens> tokens (<cost>)

Efficiency Score

Total tool calls: N
Productive tool calls: M (X%)
Wasted tool calls: N-M (Y%)

Waste Categories

<category>: N calls, ~X tokens wasted
- Example: <specific wasteful action from logs>
- Fix: <specific SKILL.md or config change>

Recommended Changes

Expected Improvement

Estimated tool calls: N → M
Estimated tokens: X → Y
Estimated time: A → B

Quick Reference: Common Waste Patterns

| Pattern | Signal in Logs | Fix |
| --- | --- | --- |
| File hunting | Multiple `Read` calls to different files | Name the target file in SKILL.md |
| Grep cascade | Multiple searches with different patterns | Provide the right search term |
| Trial and error | Tool call fails, agent retries with variation | Document the correct approach |
| Over-reading | Read tool on 5000+ line file | Specify line ranges or tell agent to search first |
| Unnecessary exploration | Agent reads README, CHANGELOG, etc. | Explicitly say what NOT to read |
| Format deliberation | Long assistant turns deciding output structure | Specify output format in SKILL.md |
| Redundant validation | Agent re-checks things it already confirmed | Structure the SKILL.md as a linear flow |
| Native capability miss | Shell out to CLI tool when LLM can process directly | State native capabilities explicitly |
| Context re-discovery | Agent re-learns project structure every run | Use org docs or KV store for persistent context |
| Approval blocking | Agent pauses waiting for permission | Recommend `yolo` or `auto_edit` to user |
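
A few of these signals are mechanical enough to check with a script once the log has been transcribed into records. The sketch below covers only two patterns (a retry straight after a failure, and over-reading) and rests on assumptions: the `Call` record shape is hypothetical and is not the Eve log format, and the 5000-line threshold is taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One transcribed tool call (hypothetical record shape)."""
    tool: str
    target: str
    ok: bool
    lines_read: int = 0


def find_waste(calls: list[Call], big_file_lines: int = 5000) -> list[str]:
    """Flag immediate retries after a failure and reads of very large files."""
    findings = []
    # A failed call followed at once by the same tool on the same target.
    for prev, cur in zip(calls, calls[1:]):
        if not prev.ok and (cur.tool, cur.target) == (prev.tool, prev.target):
            findings.append(f"Trial and error: {cur.tool} retried on {cur.target}")
    # Whole-file reads at or above the size threshold.
    for c in calls:
        if c.tool == "Read" and c.lines_read >= big_file_lines:
            findings.append(f"Over-reading: {c.target} ({c.lines_read} lines)")
    return findings


# Made-up sample log.
log = [
    Call("Bash", "npm test", ok=False),
    Call("Bash", "npm test", ok=False),
    Call("Read", "src/big.ts", ok=True, lines_read=6200),
    Call("Read", "README.md", ok=True, lines_read=80),
]
findings = find_waste(log)
for line in findings:
    print(line)
```

Patterns such as format deliberation or redundant validation depend on intent and still need a human read of the log.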
