evanflow-iterate
EvanFlow: Iterate
Vocabulary
See the `evanflow` meta-skill. Key terms: deep modules, deletion test, vertical slice.
When to Use
- After `evanflow-executing-plans` finishes all tasks
- After any non-trivial implementation
- When asked to "polish this" / "review this" / "make sure it's clean"
SKIP when: the change is one line or trivially correct.
The Loop
Repeat until stopping condition met:
1. Run All Quality Checks
Run the project's quality checks — exact commands are project-specific (see CLAUDE.md or the project's README). Typical examples across stacks:
```bash
# typecheck (one of):
tsc --noEmit       # TypeScript
pnpm typecheck     # if scripted
cargo check        # Rust
go vet ./...       # Go

# lint (one of):
pnpm lint
eslint .
cargo clippy
ruff check .

# test (one of):
pnpm test
pytest
cargo test
go test ./...
```
If any check fails: fix and restart the loop. Don't proceed to step 2 with broken checks.
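The "fix and restart" rule can be sketched as a small runner that executes checks in order and stops at the first failure. This is a minimal sketch, not part of the skill itself: `run_checks` is a hypothetical helper, and the commands passed to it are whatever the project actually defines.

```bash
# run_checks CMD...
# Runs each check command in order and stops at the first failure.
# (Hypothetical helper; the commands are supplied by the caller.)
run_checks() {
  for check in "$@"; do
    printf 'running: %s\n' "$check"
    if ! sh -c "$check"; then
      printf 'FAILED: %s (fix it, then restart the loop)\n' "$check" >&2
      return 1
    fi
  done
  printf 'all checks passed\n'
}

# Usage, assuming the project defines these pnpm scripts:
#   run_checks "pnpm typecheck" "pnpm lint" "pnpm test"
```

Stopping at the first failure keeps later checks from running against code that is already known to be broken, which matches the rule of not proceeding to step 2 with failing checks.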
2. Re-Read the Diff With Fresh Eyes
```bash
git diff                 # working-tree changes
git diff HEAD~N..HEAD    # if reviewing a series of past commits
```
For each changed file, look critically for:
- Dead code: leftover console.logs, commented-out blocks, unused imports/vars
- Naming: does the name match what the code does? (Ubiquitous language matters; see `evanflow-glossary`.)
- Deletion test: does each new module earn its existence? Could removing it improve the code?
- Magic strings/numbers: should be enums or constants per CLAUDE.md
- Error handling: boundary inputs validated? External calls wrapped? Loading/error/empty states in UI?
- Type safety: any `any`, `as`, `@ts-ignore`? Justified?
- Security: `authenticatedProcedure` where needed? Resource ownership re-derived from `ctx.user`? Per CLAUDE.md.
- Test coverage: does the new behavior have a test? Does the test verify behavior, not internals?
- Test assertion correctness: research shows 62% of LLM-generated assertions are wrong. For each assertion, would a one-character bug in the implementation still let it pass? If yes, the assertion is too weak.
- Scope creep: anything in the diff that wasn't in the plan?
- Comments: only WHY notes that explain non-obvious constraints. Delete WHAT comments.
Fix what you find. Then restart from step 1.
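The one-character-bug question from the assertion item above can be made concrete with a toy example. Everything here is hypothetical and exists only for illustration: `add` stands in for the implementation, `add_buggy` is the same function with `+` changed to `-`, and the two `*_assert` helpers stand in for test assertions.

```bash
# Implementation under test (hypothetical).
add() { echo $(( $1 + $2 )); }

# The same implementation with a one-character bug: '+' became '-'.
add_buggy() { echo $(( $1 - $2 )); }

# Weak assertion: only checks that some output was produced.
weak_assert() { [ -n "$("$1" 2 3)" ]; }

# Strong assertion: pins the exact expected value.
strong_assert() { [ "$("$1" 2 3)" = "5" ]; }

# weak_assert add_buggy    succeeds, so the weak assertion misses the bug
# strong_assert add_buggy  fails, so the strong assertion catches it
```

If an assertion still passes against the mutated implementation, it is checking that the code ran rather than what it computed.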
2.5. Five Failure Modes Check
Industry research identifies five predictable failure modes in agentic coding. After step 2's diff review, do an explicit pass against each:
- (a) Hallucinated actions: did the implementation invent file paths, env vars, IDs, function names, library APIs, or other external values that aren't authoritatively confirmed? (Example: a `process.env.STRIPE_SECRET_KEY` reference when the actual var name is `STRIPE_SK`.)
- (b) Scope creep: does the diff touch files or behaviors not in the plan? Bundled refactors or stylistic changes that should be separate PRs?
- (c) Cascading errors: was a failure suppressed/caught/wrapped in a way that hides root cause from callers? Are there silent fallbacks that mask bugs (try/catch returning empty arrays, default values that paper over missing data)?
- (d) Context loss: does the diff contradict earlier decisions in the session, the plan, CLAUDE.md, or `CONTEXT.md`? Names, conventions, invariants?
- (e) Tool misuse: used the wrong tool (e.g., Bash for file reads, MCP server when CLI was simpler), or used a tool with wrong parameters (e.g., grep without proper escaping, Edit without reading first)?
For each mode flagged, fix and restart from step 1.
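For mode (a), one way to catch an invented env var name is to compare the names referenced in the diff against a file that declares the real ones. The sketch below is an assumption-heavy illustration: `undeclared_env_vars` is a hypothetical helper, and it assumes the project keeps an authoritative `NAME=value` list (for example a `.env.example`), which the skill itself does not require.

```bash
# undeclared_env_vars DECLARED_FILE < text
# Reads text (e.g. a diff) on stdin and prints every NAME referenced as
# process.env.NAME that has no NAME=... line in DECLARED_FILE.
# (Hypothetical helper; DECLARED_FILE is an assumed source of truth.)
undeclared_env_vars() {
  declared_file=$1
  grep -o 'process\.env\.[A-Z_][A-Z0-9_]*' |
    sed 's/^process\.env\.//' |
    sort -u |
    while read -r name; do
      grep -q "^${name}=" "$declared_file" || printf '%s\n' "$name"
    done
}

# Usage, assuming a .env.example exists at the repo root:
#   git diff | undeclared_env_vars .env.example
```

Any name it prints is worth confirming by hand; an empty result only means the referenced names are declared, not that they are used correctly.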
3. (UI work only) Visual Verification
If the diff touches frontend page or component files and the change has visible output:
Default approach (no Playwright needed):
```bash
# Make sure your dev server is running first (e.g., pnpm dev, npm run dev, etc.)
chromium --headless --no-sandbox \
  --screenshot=/tmp/iter-$(date +%s).png \
  --window-size=1440,900 \
  http://localhost:<port>/<route>
```
(If your project doesn't have `chromium`, substitute `google-chrome --headless` or `chrome --headless` with the same flags.)
Then read the screenshot:
Read /tmp/iter-*.png
Check against:
- Any brainstorm mockup or design comp the project maintains
- The project's design system (colors, spacing, typography, component patterns documented in CLAUDE.md)
- Responsive behavior: also screenshot at `--window-size=390,844` (mobile)
**If you need interaction** (click, fill, observe modal): use Playwright MCP. If MCP fails with "chrome not found", configure it to use your installed Chromium binary by adding `"--executable-path", "/path/to/chromium"` to args in the Playwright `.mcp.json`. Don't fight the MCP; fix it once, then use it.
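The desktop and mobile screenshots described above can be taken in one pass. This is a sketch under assumptions: `screenshot_viewports` is a hypothetical helper, `CHROME_BIN` is an assumed override variable that defaults to `chromium`, and the viewport width is appended to the filename so the two captures don't overwrite each other.

```bash
# screenshot_viewports URL
# Captures URL at desktop (1440x900) and mobile (390x844) window sizes,
# using the same flags as the single-screenshot command.
# CHROME_BIN is an assumed override; it defaults to `chromium`.
screenshot_viewports() {
  url=$1
  stamp=$(date +%s)
  for size in 1440,900 390,844; do
    "${CHROME_BIN:-chromium}" --headless --no-sandbox \
      --screenshot="/tmp/iter-${stamp}-${size%,*}.png" \
      --window-size="$size" \
      "$url" || return 1
  done
}

# Usage, with the dev server already running:
#   screenshot_viewports "http://localhost:<port>/<route>"
```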
4. Stopping Condition
Stop the loop when all are true:
- All quality checks pass
- Re-read the diff and find no new issues you'd want to fix
- (UI) Screenshot matches expectation, OR you've confirmed with the user
Hard cap: 5 iterations. If you're still finding issues at iteration 5, the original plan was wrong — stop and ask the user. Don't iterate forever.
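The control flow of the loop, including the hard cap, can be sketched as follows. `iterate_until_clean` is a hypothetical helper, and the single check command stands in for the full set of conditions above (quality checks, diff re-read, screenshot review), most of which need judgment rather than an exit code.

```bash
MAX_ITERATIONS=5   # hard cap from the stopping condition

# iterate_until_clean CHECK_CMD
# Re-runs CHECK_CMD until it succeeds or the cap is reached.
# Returns 0 on convergence, 1 when the cap is hit with issues remaining.
iterate_until_clean() {
  i=1
  while [ "$i" -le "$MAX_ITERATIONS" ]; do
    if sh -c "$1"; then
      printf 'converged on iteration %s\n' "$i"
      return 0
    fi
    # In practice, fixes are applied here before the next pass.
    i=$((i + 1))
  done
  printf 'hit cap of %s iterations; stop and ask the user\n' "$MAX_ITERATIONS" >&2
  return 1
}
```

A clean first pass returns immediately, and hitting the cap is reported as a failure rather than retried.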
Hard Rules
- Don't iterate just to iterate. If everything is clean on the first pass, stop. Don't invent issues.
- Fix root causes, not symptoms. A linter warning that you suppress instead of fix is debt.
- Never auto-commit, never auto-stage, never auto-finish. Iteration produces a clean working tree. After convergence, report what was done and stop. The user decides whether to commit, refactor further, or change direction.
- Never iterate past the user. If the user says "good enough," stop. Their judgment beats the loop.
- Visual verification requires a running dev server. If the dev server isn't up, ask the user to start it (don't try to start it yourself unless the project has a documented "start dev" skill).
Hand-offs
- Loop converged, all clean → report what was done and STOP. Await user direction. No auto-finish, no staging, no commit.
- Loop hit cap with issues remaining → back to `evanflow-writing-plans` (plan was wrong)
- Found architectural issues → `evanflow-improve-architecture`
- Found a bug → `evanflow-debug`