agile-tdd

TDD (Test-Driven Development)

Guide the Red-Green-Refactor cycle and a pragmatic testing strategy: "Write tests. Not too many. Mostly integration."
Initial context received via slash command: $ARGUMENTS
If $ARGUMENTS is filled (e.g., a module name or feature description), use it as the starting point. If it is empty, ask what will be tested.

Language

Write artifacts and test descriptions in the user's language. When in doubt, ask. Test code itself (function names, assertions) stays in English.

Project root

This skill writes artifacts at paths relative to the project root (the repo where the work happens), not the agent's current working directory.
  • If invoked from inside the project, use the relative paths shown in this skill.
  • If invoked from another directory (e.g., a sibling repo, or when the project lives elsewhere), prepend <project-root>/ to every artifact path.
  • When the project root is ambiguous, confirm with the user via the harness question tool before writing.

Prompting

Follow the project-wide convention in CLAUDE.md / AGENTS.md ("Skill Prompting Conventions"). Use the harness's structured-question tool (AskUserQuestion in Claude Code, ask_user_question in Codex, or question in OpenCode) for the decision points below. Use free-form text only where a path, name, or value cannot be enumerated.

| Decision point | Why structured | Suggested options |
| --- | --- | --- |
| Enforcement mode (when installing) | Hard-to-undo policy choice | warn · block · keep current |
| Test strategy (when ambiguous) | Affects file layout | sibling · sibling_dir · tests_root |
| Exempt a specific path | Edits guardrails config | yes · no · review later |

Free-form prompts (no structured tool):
  • Test descriptions
  • Exemption rationale
No-pause mode: if the user has explicitly disabled mid-skill clarification, convert every structured prompt into an entry under Open questions (or equivalent) and proceed without blocking.

When to use

  • Starting a new feature with TDD
  • Adding tests to existing code
  • Establishing test coverage for a module
  • Unclear whether something needs unit, integration, or E2E tests

When NOT to use

  • Quick prototypes where tests add no value -- use /agile-proto
  • Throwaway scripts
  • Pure documentation changes

TDD cycle

  1. Red -- write a failing test that describes the desired behavior
  2. Green -- write the minimum code to make it pass
  3. Refactor -- improve structure without changing behavior
  4. Repeat
Present each step explicitly. Do not skip Red -- the test must fail first.
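A minimal sketch of one pass through the cycle, assuming a hypothetical slugify helper that is not part of any project referenced here. The expectEqual function is a tiny local stand-in for a runner assertion, so the snippet runs on its own; in a real project the test would live in a .test.ts file under the detected runner.

```typescript
// Tiny stand-in for a test-runner assertion (hypothetical helper).
function expectEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Red: this test is written first. It fails while slugify is missing
// or returns the wrong value.
function testSlugifyJoinsWordsWithHyphens(): void {
  expectEqual(slugify("Hello World"), "hello-world", "joins words with hyphens");
}

// Green: the minimum code that makes the test pass.
// Refactor: with the test green, the internals can be restructured freely
// as long as the test keeps passing.
function slugify(input: string): string {
  return input.trim().toLowerCase().split(/\s+/).join("-");
}

testSlugifyJoinsWordsWithHyphens();
```

Running the test before slugify exists (or with an empty body) is what confirms the Red step; a test that never failed proves little.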

Test pyramid (pragmatic)

| Layer | Target | Focus |
| --- | --- | --- |
| Unit | 60% | Pure functions, transformers, utils |
| Integration | 30% | Services, DB interactions, API routes |
| E2E | 10% | Critical user flows |

Overall coverage target: 75%+.
For front-end work, treat these percentages as risk guidance, not quotas. Prefer integration tests that exercise user behavior, validation, local state, API contracts, permissions, offline/sync behavior, and critical flows. Avoid tests that only assert static text, that a button rendered, or implementation details of a design-system component.
When a project keeps business rules in planning/<initiative>/business/*.md, use those rule IDs to decide what deserves tests. Tests should prove behavior behind important rules, not restate the rule text.

File structure

  • Unit: co-located with source (foo.test.ts beside foo.ts)
  • Integration/E2E: tests/ with integration/, e2e/, helpers/, fixtures/, mocks/
  • Naming: .test.ts (unit/integration), .e2e.test.ts (E2E)
  • Never .spec.ts

Rules

  • AAA pattern (Arrange / Act / Assert)
  • One concept per test
  • Descriptive names that read as sentences
  • Always use factories (e.g., built on faker) over hardcoded data
  • Isolate with beforeEach -- no shared state between tests
  • Test behavior, not implementation details
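The rules above can be sketched as follows. Everything here is hypothetical: makeUser is a hand-rolled factory standing in for a faker-based one, canDeleteProject is an invented behavior under test, and plain throw statements replace runner assertions so the snippet is self-contained.

```typescript
interface User {
  id: number;
  name: string;
  role: "admin" | "member";
}

// Hypothetical factory: each call builds a fresh object, and a test overrides
// only the fields that matter for the behavior it checks.
function makeUser(overrides: Partial<User> = {}): User {
  const id = Math.floor(Math.random() * 1000000);
  return { id, name: `user-${id}`, role: "member", ...overrides };
}

// Hypothetical behavior under test.
function canDeleteProject(user: User): boolean {
  return user.role === "admin";
}

// The name reads as a sentence, and the test covers one concept.
function testAdminsCanDeleteProjects(): void {
  // Arrange
  const admin = makeUser({ role: "admin" });
  // Act
  const allowed = canDeleteProject(admin);
  // Assert
  if (allowed !== true) throw new Error("admins can delete projects");
}

function testMembersCannotDeleteProjects(): void {
  // Arrange
  const member = makeUser();
  // Act
  const allowed = canDeleteProject(member);
  // Assert
  if (allowed !== false) throw new Error("members cannot delete projects");
}

testAdminsCanDeleteProjects();
testMembersCannotDeleteProjects();
```

Because each test builds its own user, neither depends on state left behind by the other, which is the same isolation beforeEach is meant to provide.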

Anti-patterns (avoid)

  • Interdependent or order-dependent tests (test A depends on test B running first)
  • Arbitrary sleep(ms) -- use proper waits
  • Testing private methods -- test through the public API
  • console.log in tests -- use proper assertions
  • Mocking what you own (mock external dependencies, not your own code)

Coverage targets (granular)

| Area | Target |
| --- | --- |
| Transformers / pure functions | 90%+ |
| Utils | 85%+ |
| Services | 80%+ |
| Routes / handlers | 70%+ |
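
As an illustration of checking a coverage report against these targets, here is a small sketch. The area keys and the measured numbers are made up for the example; a real report would come from the project's coverage tool and would need mapping onto these areas.

```typescript
// Targets copied from the table above, as percentages.
// The short area keys are hypothetical labels chosen for this sketch.
const coverageTargets: Record<string, number> = {
  transformers: 90,
  utils: 85,
  services: 80,
  routes: 70,
};

// Returns the areas whose measured coverage is below target, largest gap first.
// An area missing from the report is treated as 0% covered.
function coverageGaps(
  measured: Record<string, number>
): Array<{ area: string; gap: number }> {
  return Object.entries(coverageTargets)
    .map(([area, target]) => ({ area, gap: target - (measured[area] ?? 0) }))
    .filter((entry) => entry.gap > 0)
    .sort((a, b) => b.gap - a.gap);
}

// Hypothetical measurements.
const gaps = coverageGaps({ transformers: 93, utils: 80, services: 62, routes: 71 });
console.log(gaps);
```

With these sample numbers only services and utils fall short, so those are the areas to look at before adding tests elsewhere.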

Commands (Bun)

bun test
bun test --watch
bun test --coverage
bun test --test-name-pattern "name"
bun test src/dir/
Adjust for other test runners (vitest, jest) as needed. Detect the project's test runner from package.json or config files before suggesting commands.

Process

1. Understand what to test

Explore the code to understand:
  • What module or feature needs tests
  • What behaviors are critical
  • What is already covered (check existing tests)
  • Which business rule IDs, acceptance criteria, or prototype flows the change must satisfy

2. Choose the right test type

Use the test pyramid as guide:
  • Pure function with no side effects? Unit test.
  • Service that talks to DB or external API? Integration test.
  • Critical user flow that spans multiple systems? E2E test.
  • Front-end behavior with validation, API contract, permission, optimistic update, or offline/sync state? Integration test.
  • Static copy, simple rendering, or visual-only detail with no rule? Usually no test unless it protects a known regression.
For local-first products, give priority to tests that cover command validation, optimistic state, offline queue persistence, reconciliation, conflict handling, permissions, and audit events.
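The questions above can be read as a small decision function. This is only a sketch: the Subject fields are hypothetical names for the traits listed, and the final fallback to an integration test is an assumption drawn from the "mostly integration" motto, not a rule stated by this skill.

```typescript
type TestType = "unit" | "integration" | "e2e" | "none";

// Hypothetical description of the code under consideration.
interface Subject {
  pure?: boolean; // pure function with no side effects
  touchesDbOrExternalApi?: boolean; // service talking to a DB or external API
  spansMultipleSystems?: boolean; // critical user flow across systems
  frontEndBehavior?: boolean; // validation, API contract, permission, optimistic or offline state
  staticOnly?: boolean; // static copy, simple rendering, visual-only detail
  protectsKnownRegression?: boolean;
}

function chooseTestType(s: Subject): TestType {
  if (s.spansMultipleSystems) return "e2e";
  if (s.touchesDbOrExternalApi || s.frontEndBehavior) return "integration";
  // Static details get no test unless they guard a known regression.
  if (s.staticOnly && !s.protectsKnownRegression) return "none";
  if (s.pure) return "unit";
  // Assumed default when nothing else applies.
  return "integration";
}
```

For example, chooseTestType({ pure: true }) yields "unit", while a subject flagged only as staticOnly yields "none".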

3. Execute the TDD cycle

For each behavior:
  1. Write the failing test (Red)
  2. Implement the minimum code (Green)
  3. Refactor if needed
  4. Verify the test still passes
Record the business rule ID or acceptance criterion in the test description or surrounding story artifact when that mapping helps future refinement.

4. Verify coverage

Run coverage and check against targets. Fill gaps in critical areas first.

Chaining

  • During feature implementation: work inside the /agile-story checklist
  • After implementation: /agile-refinement to review test quality
  • Before closing: ensure tests are part of /agile-status (closure mode) verification
  • If the TDD workflow exposes repeated friction, missing guidance, weak templates, or unclear verification, capture a concise skill feedback note with the affected skill/template, evidence, proposed change, and validation artifact.
  • If repeated TDD friction suggests a skill/template change, use /agile-skill-feedback before editing the process library.

Enforcement (optional, opt-in per project)

Beyond advisory guidance, this skill ships hook templates that turn the TDD rule into a project-level guardrail. When a project enables them, every implementation session is checked at the file-write level — without the agent having to remember to invoke this skill.

What gets enforced

  • PreToolUse on Write|Edit|MultiEdit: if the target file matches source_paths in .tdd-guardrails.yml and a companion test does not exist, the hook warns (mode: warn) or blocks the tool call (mode: block).
  • Stop hook: at session end, scans the git diff and reports source files touched without a companion test.
  • SessionStart hook: announces "TDD enforcement active" so the agent knows the rule is in force.
The hook templates live at skills/agile-tdd/templates/hooks/*.sh.tmpl and the config schema at skills/agile-tdd/templates/tdd-guardrails.yml.tmpl.

Config (.tdd-guardrails.yml)

| Key | Meaning |
| --- | --- |
| enabled | Global on/off |
| mode | warn (stderr only) or block (PreToolUse rejects the call) |
| source_paths | Globs that require a companion test |
| test_strategy | sibling (foo.ts → foo.test.ts), sibling_dir (__tests__/foo.test.ts), or tests_root (<app>/tests/integration/foo.test.ts) |
| exemptions | Globs allowed without a test (entry points, generated files, UI primitives) |
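
To make the three test_strategy values concrete, here is a sketch of the companion path each one implies for a given source file. It is an illustration, not the hook's actual logic (the hooks are shell scripts and may resolve paths differently); companionTestPath and its appRoot parameter, which stands in for the <app> segment, are hypothetical.

```typescript
type TestStrategy = "sibling" | "sibling_dir" | "tests_root";

// Maps a source file to the companion test path a strategy implies.
// Only .ts and .tsx sources are handled in this sketch.
function companionTestPath(
  source: string,
  strategy: TestStrategy,
  appRoot: string = "."
): string {
  const slash = source.lastIndexOf("/");
  const dir = slash >= 0 ? source.slice(0, slash) : ".";
  const file = slash >= 0 ? source.slice(slash + 1) : source;
  const testFile = file.replace(/\.(tsx?)$/, ".test.$1");

  if (strategy === "sibling") return `${dir}/${testFile}`;
  if (strategy === "sibling_dir") return `${dir}/__tests__/${testFile}`;
  // tests_root: tests are collected under the app's tests/integration folder.
  return `${appRoot}/tests/integration/${testFile}`;
}
```

For src/foo.ts, sibling gives src/foo.test.ts and sibling_dir gives src/__tests__/foo.test.ts; with appRoot set to apps/server, tests_root maps apps/server/src/auth/handler.ts to apps/server/tests/integration/handler.test.ts.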
Pattern semantics: the hook scripts use bash case globs, not extended globstar. In bash case patterns, * matches any sequence of characters including /, so ** adds nothing beyond what * already does. That means apps/*/src/*.ts matches both apps/server/src/handler.ts and apps/server/src/auth/handler.ts. Do not use ** in source_paths or exemptions.
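The matching rule can be emulated in a few lines to check a pattern before adding it to the config. This is a sketch of the semantics described above, not the hook's implementation; caseGlobToRegExp is a hypothetical helper, and it handles only * and ?, not bracket expressions.

```typescript
// Emulates how a bash `case` pattern treats `*` and `?`:
// both may match "/", unlike pathname globbing.
function caseGlobToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "*") return ".*"; // any sequence, slashes included
      if (ch === "?") return "."; // any single character
      return ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    })
    .join("");
  return new RegExp(`^${source}$`);
}

const sourcePattern = caseGlobToRegExp("apps/*/src/*.ts");
console.log(sourcePattern.test("apps/server/src/handler.ts"));
console.log(sourcePattern.test("apps/server/src/auth/handler.ts"));
console.log(sourcePattern.test("apps/server/lib/handler.ts"));
```

The first two paths match and the third does not, because it has no /src/ segment. A consequence worth remembering is that a broad pattern also covers every nested directory, so exemptions for subfolders have to be listed explicitly.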

Enforcement caveats

The hook checks file-pair existence — it does not, and cannot, verify:
  • That the test was written before the source (no Red-before-Green order check).
  • That the test actually exercises the source (no semantic match).
  • That the test currently passes (no test execution).
Semantic discipline (one behavior per test, factories over hardcoded data, descriptive names, AAA) still belongs to the agent. The hooks are guardrails, not a guarantee.

Manual install (until a tdd-init script exists)

Per-harness mechanics differ — Claude Code and Codex run shell hooks directly; OpenCode runs a JS plugin that orchestrates the same shell scripts.
  1. Copy templates/tdd-guardrails.yml.tmpl to <project-root>/.tdd-guardrails.yml and edit source_paths and exemptions to match the repo layout. Then run the project-type detectors to populate project_types:
    bash
    # Run every project-type detector against the current repo
    for type in .claude/skills/agile-tdd/templates/project-types/*/; do
      name="$(basename "$type")"
      [ -f "$type/detect.sh" ] || continue   # skip folders without a detector
      bash "$type/detect.sh" "$PWD" && echo "detected: $name"
    done
    For each detected type, append its guardrails.partial.yml to the newly created .tdd-guardrails.yml (or merge if the block already exists) and add the type slug to the project_types list.
  2. Copy the hook templates from templates/hooks/ to <project-root>/.claude/hooks/, <project-root>/.codex/hooks/, and <project-root>/.opencode/hooks/ (the OpenCode plugin invokes the same scripts via node:child_process.spawn). Drop the .tmpl suffix and chmod +x each script.
    Also copy the shared helpers:
    templates/lib/audit-helpers.sh.tmpl → .claude/hooks/lib/audit-helpers.sh
    and the project-types directory:
    templates/project-types/ → .claude/hooks/project-types/
    The session-audit script searches both locations. Codex/OpenCode should mirror under .codex/hooks/ and .opencode/hooks/.
  3. Register the hooks in each harness config:
    • Claude Code (.claude/settings.json): add a PreToolUse entry matching Write|Edit|MultiEdit calling $CLAUDE_PROJECT_DIR/.claude/hooks/tdd-pre-write.sh, a Stop entry calling tdd-session-audit.sh, and a SessionStart entry calling tdd-announce.sh. Merge with existing hooks (e.g. wiki-init); do not replace them.
    • Codex (.codex/hooks.json): add PreToolUse (matching apply_patch|Edit|Write|MultiEdit), Stop, and SessionStart entries. Use bash "$(git rev-parse --show-toplevel)/.codex/hooks/<script>.sh" as the command form (Codex pattern). Include a statusMessage field for each.
    • OpenCode: copy templates/opencode-plugin.js.tmpl to <project-root>/.opencode/plugins/tdd-guardrails.js. The plugin subscribes to tool.execute.before (PreToolUse equivalent), session.created (SessionStart equivalent), and session.idle (closest to Stop; the audit script is idempotent via a tmp state file, so firing more than once is safe). The plugin spawns the same .opencode/hooks/tdd-*.sh scripts via node:child_process.spawn. OpenCode does not invoke shell scripts directly; the plugin is the entry point.
  4. Append the contents of templates/agents-block.md.tmpl to AGENTS.md and CLAUDE.md so the agent is told the project has TDD enforcement.
    For each detected project type, also append its templates/project-types/<type>/agents-block.partial.md to the same files (preserving the <!-- agile-tdd:<type>:start --> / :end markers so re-installs can update in place).
  5. Register the type-specific matchers in .claude/settings.json:
    • For tauri: a PostToolUse entry matching mcp__tauri__webview_screenshot|mcp__tauri__webview_execute_js|mcp__tauri__webview_dom_snapshot|mcp__tauri__manage_window calling $CLAUDE_PROJECT_DIR/.claude/hooks/tdd-record-mcp.sh.
    The Codex equivalent goes in .codex/hooks.json; for OpenCode, subscribe to tool.execute.after filtered by tool name and spawn the same shell script via node:child_process.spawn.
The hooks are guardrails, not a guarantee — they check file-pair existence, not test quality. Semantic discipline (one behavior per test, factories over hardcoded data) still belongs to the agent.
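The "merge, do not replace" instruction for Claude Code hooks can be sketched as below. The entry shape (an event name mapping to a list of matcher and command entries) is an assumption about the hooks section of .claude/settings.json and should be checked against the harness documentation; mergeHooks, the full script paths for Stop and SessionStart, and the wiki-init.sh command are hypothetical.

```typescript
interface HookCommand { type: "command"; command: string }
interface HookEntry { matcher?: string; hooks: HookCommand[] }
type HooksConfig = Record<string, HookEntry[]>;

// Appends new entries per event and keeps everything already registered.
// Entries that are already present (by exact content) are not added twice.
function mergeHooks(existing: HooksConfig, additions: HooksConfig): HooksConfig {
  const merged: HooksConfig = {};
  for (const [event, entries] of Object.entries(existing)) {
    merged[event] = [...entries];
  }
  for (const [event, entries] of Object.entries(additions)) {
    const current = merged[event] ?? [];
    const seen = new Set(current.map((e) => JSON.stringify(e)));
    merged[event] = [...current, ...entries.filter((e) => !seen.has(JSON.stringify(e)))];
  }
  return merged;
}

const hooksDir = "$CLAUDE_PROJECT_DIR/.claude/hooks";
const tddHooks: HooksConfig = {
  PreToolUse: [
    { matcher: "Write|Edit|MultiEdit", hooks: [{ type: "command", command: `${hooksDir}/tdd-pre-write.sh` }] },
  ],
  Stop: [{ hooks: [{ type: "command", command: `${hooksDir}/tdd-session-audit.sh` }] }],
  SessionStart: [{ hooks: [{ type: "command", command: `${hooksDir}/tdd-announce.sh` }] }],
};

// Hypothetical pre-existing config with one SessionStart hook.
const existingHooks: HooksConfig = {
  SessionStart: [{ hooks: [{ type: "command", command: `${hooksDir}/wiki-init.sh` }] }],
};

const mergedHooks = mergeHooks(existingHooks, tddHooks);
console.log(JSON.stringify(mergedHooks, null, 2));
```

The existing SessionStart hook stays first and the TDD announcement is appended after it; running the merge a second time adds nothing, which makes re-installs safe.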

Harness compatibility matrix

| Harness | Entry point | Pre-write event | Stop equivalent | Session start |
| --- | --- | --- | --- | --- |
| Claude Code | .claude/settings.json hooks | PreToolUse (matcher Write\|Edit\|MultiEdit) | Stop | SessionStart |
| Codex | .codex/hooks.json hooks | PreToolUse (matcher apply_patch\|Edit\|Write\|MultiEdit) | Stop | SessionStart |
| OpenCode | .opencode/plugins/tdd-guardrails.js JS plugin | tool.execute.before | session.idle (closest available; audit is idempotent) | session.created |

Bypassing intentionally

  • For one path: add it to .tdd-guardrails.yml → exemptions.
  • For one session: temporarily set enabled: false (and revert before commit).
  • Never delete the test file just to silence the hook — that defeats the point.

Project-type templates

Beyond the base companion-test rule, agile-tdd ships project-type templates under templates/project-types/<type>/. Each template opts into project-specific evidence that the Stop hook checks in addition to companion tests. The session-audit script composes the base check with one fragment per active type. Modes are independent per type, so you can run the base TDD check in warn and a stricter type in block.

Available types

| Type | Detects | Enforces (in addition to companion tests) |
| --- | --- | --- |
| tauri | src-tauri/Cargo.toml (single-repo or monorepo) | cargo check fresh + bun run typecheck fresh + ≥1 mcp__tauri__webview_screenshot per affected app after the last edit |
| pwa (planned) | manifest.webmanifest + service worker | service worker rebuilt + offline smoke test |
| mobile (planned) | app.json (Expo) / Podfile / android/build.gradle | platform build green + simulator screenshot |
| desktop (planned) | electron/forge config | electron-builder validate + window screenshot |

Active types are listed in .tdd-guardrails.yml → project_types. At install, each templates/project-types/<type>/detect.sh runs against the repo; types that succeed get appended to the config along with their guardrails.partial.yml block.

Anatomy of a type

A type is a folder with:
  • detect.sh: exits 0 when the type applies to the repo.
  • guardrails.partial.yml: block appended to .tdd-guardrails.yml.
  • agents-block.partial.md: section appended to CLAUDE.md / AGENTS.md inside <!-- agile-tdd:<type>:start --> markers.
  • audit.partial.sh: sourced by tdd-session-audit.sh. Exports check_<type>_evidence "$ROOT", which returns textual violations. Sets <TYPE>_MODE (uppercase) so the framework can decide block vs warn.
  • hooks/*.sh.tmpl: optional extra hooks the type registers (e.g. PostToolUse evidence recorder).
  • README.md: human description of the type.
The framework calls helpers from templates/lib/audit-helpers.sh for yml parsing, glob matching, app-root discovery, and freshness checks. Adding a new type does not require touching the framework script.

Installing types

The base agile-tdd install already follows the "Manual install" steps above. For each detected type:
  1. Append templates/project-types/<type>/guardrails.partial.yml to .tdd-guardrails.yml (or merge if the block already exists).
  2. Append agents-block.partial.md to CLAUDE.md / AGENTS.md between the marker pair.
  3. Copy hooks/*.sh.tmpl to .claude/hooks/, .codex/hooks/, and .opencode/hooks/, dropping the .tmpl suffix and running chmod +x on each.
  4. Add the type's matcher to each harness config:
    • Claude Code (.claude/settings.json, PostToolUse): add a matcher entry for the type's PostToolUse hooks. For tauri: mcp__tauri__webview_screenshot|mcp__tauri__webview_execute_js|mcp__tauri__webview_dom_snapshot|mcp__tauri__manage_window calling $CLAUDE_PROJECT_DIR/.claude/hooks/tdd-record-mcp.sh.
    • Codex (.codex/hooks.json): same matcher pattern; command uses bash "$(git rev-parse --show-toplevel)/.codex/hooks/tdd-record-mcp.sh".
    • OpenCode: extend .opencode/plugins/tdd-guardrails.js to subscribe to tool.execute.after with a filter on tool name, spawning the same shell script via node:child_process.spawn.
  5. Append the type slug (e.g. tauri) to project_types in .tdd-guardrails.yml. The session audit reads this list at run time.

Tauri MCP validation (project type)

Know-how consolidated from real Tauri debug/test sessions. Read this before touching src-tauri/src/** or any src/{routes,components,hooks}/** in a Tauri-detected repo.

Toolbox (canonical references)

  • mcp__tauri__driver_session(start): always invoke first; subsequent webview tools assume an active session.
  • mcp__tauri__webview_screenshot: the canonical "I opened the screen" signal. Pass windowId="main" (Tauri apps usually have main + tray-panel).
  • mcp__tauri__webview_execute_js: for invokes that take longer than the JS executor timeout (a few seconds), fire-and-forget via window.__TAURI_INTERNALS__.invoke(cmd, args).then(r => window.__last = r); poll the DB or take a screenshot instead of awaiting the result.
  • mcp__tauri__webview_interact / webview_find_element / refs in webview_dom_snapshot: these depend on window.__MCP__.resolveRef, which is undefined after a dev-server HMR reload. Prefer document.querySelector(...).click() via webview_execute_js.

Rebuild flow

  • Rust change → touch src-tauri/src/<file.rs> forces tauri-dev to rebuild (cargo's mtime watcher).
  • Monitor the dev process output for Finished + MCP Bridge plugin initialized before re-running MCP calls; HMR only covers the frontend.
  • In Claude Code, prefer Monitor with the filter tail -f tauri.log | grep --line-buffered -E "Finished|error\\[|MCP Bridge plugin initialized".

Validation patterns

  • DB direct read (cheaper than invoke() for state assertions): sqlite3 <workspace>/<app>.db "SELECT ...".
  • GPU offload check: ps -p $(pgrep -x <app-name>) -o pcpu,etime. Apps using Metal/CUDA show low CPU (5-15%) when the GPU is doing the work; CPU-only fallback shows 200-400%.
  • Visual check: location.assign('/route/...') via webview_execute_js, wait ~500ms, then webview_screenshot.

Known gotchas (cross-project patterns)

Each project keeps its own list of in-tree fixes (e.g. in a wiki/technical/tauri-gotchas.md for projects that follow the LLM wiki pattern). The items below describe the patterns; the actual file paths vary per project.
  • whisper-rs 0.14.x set_abort_callback_safe type confusion: in v0.14.4 the trampoline is instantiated with the original closure type, while user_data is actually written as Box<dyn FnMut() -> bool>. The callback dereferences the wrong memory layout and returns garbage bools, aborting whisper at 0–5% with error -6. Workaround: use the unsafe variant set_abort_callback + set_abort_callback_user_data with a hand-written trampoline that matches the stored type:
    rust
    // Assumes `use std::sync::atomic::Ordering;` and that `cancel` is an Arc<AtomicBool>.
    unsafe extern "C" fn abort_trampoline(ud: *mut std::ffi::c_void) -> bool {
        // Cast back to exactly the type that was boxed below.
        let f = &mut *(ud as *mut Box<dyn FnMut() -> bool>);
        f()
    }
    let closure: Box<dyn FnMut() -> bool> = Box::new({
        let cancel = cancel.clone();
        move || cancel.load(Ordering::Relaxed)
    });
    // Double-box so the raw pointer is thin; this allocation leaks unless it is
    // reclaimed with Box::from_raw once the callback can no longer fire.
    let ud = Box::into_raw(Box::new(closure)) as *mut std::ffi::c_void;
    unsafe {
        params.set_abort_callback(Some(abort_trampoline));
        params.set_abort_callback_user_data(ud);
    }
  • Background jobs stuck
    running
    after binary kill
    — a tokio task that dies mid-flight (cargo restart, app crash) never updates its DB row to a terminal state. Add a reconcile pass at pool-open:
    rust
    // db::open_pool, after schema migration
    sqlx::query(
        "UPDATE jobs SET status='cancelled',
             error_message=COALESCE(error_message,'interrupted by app restart'),
             finished_at=?
         WHERE status IN ('running','pending')"
    ).bind(now_iso()).execute(&pool).await?;
  • Modal libraries that couple
    open
    to data state
    (e.g. Base UI Dialog, some Radix patterns) — when the data prop goes
    null
    in the same render that
    open
    flips false, the modal unmounts before the close animation finishes; pointer-events stay trapped on
    document.body
    and the page becomes unclickable. Decouple
    open
    (boolean) from the data, then defer the data clear with
    setTimeout(..., 200)
    (or use an
    onCloseComplete
    callback when the library exposes one).
  • CREATE TABLE IF NOT EXISTS
    is a no-op on existing tables
    — reused workspaces / DBs do NOT pick up new columns added in the schema file. Symptom: runtime
    no such column
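    after upgrade. Fix options: for each drifted column, check
    PRAGMA table_info(<table>)
    and run a plain
    ALTER TABLE … ADD COLUMN
    only when the column is missing (SQLite does not accept
    IF NOT EXISTS
    on ADD COLUMN, and re-adding an existing column fails with a duplicate column error); OR, if the DB has no production data, drop and re-bootstrap.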
  • hound::WavReader
    only decodes WAV
    — for
    .mp3
    /
    .mp4
    /
    .m4a
    /
    .webm
    , pipe through ffmpeg to
    f32le @ 16 kHz mono
    before feeding Whisper:
    rust
    // ffmpeg -hide_banner -loglevel error -nostdin -i <input>
    //   -ac 1 -ar 16000 -f f32le -
    let mut child = Command::new(ffmpeg)
        .args(["-hide_banner","-loglevel","error","-nostdin","-i"])
        .arg(input)
        .args(["-ac","1","-ar","16000","-f","f32le","-"])
        .stdout(Stdio::piped()).spawn()?;
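    // read all of stdout, then decode each 4-byte little-endian chunk into an f32
    // (casting the byte buffer straight to &[f32] is not guaranteed to be aligned)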
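The last step of the ffmpeg item above, turning the raw stdout bytes into samples, can be sketched with the standard library alone. The function name `decode_f32le` is hypothetical. It assumes the buffer holds the complete `f32le` stream and silently ignores a trailing partial sample.

```rust
/// Decode raw `f32le` PCM bytes (as written by `ffmpeg -f f32le`) into samples.
/// Trailing bytes that do not form a complete 4-byte sample are ignored.
fn decode_f32le(raw: &[u8]) -> Vec<f32> {
    raw.chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}
```

Copying through `from_le_bytes` avoids the alignment and endianness assumptions that a pointer cast would carry, at the cost of one extra allocation. At 16 kHz mono, the sample count divided by 16000 gives the clip duration in seconds.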

Definition of Done (Tauri)

For each change that touches
affected_paths
, confirm all of the following in every affected app (in a monorepo, per app):
  1. cargo check --manifest-path <app>/src-tauri/Cargo.toml --lib
    green.
  2. bun run typecheck
    (or
    tsc --noEmit
    ) green.
  3. tauri-dev
    showed
    Finished
    after the last
    touch
    in
    <app>/src-tauri/src/**
    .
  4. Companion test exists (base TDD rule, if applicable).
  5. ≥1
    mcp__tauri__webview_screenshot
    call after the last edit under
    <app>/{src-tauri,src}/**
    .
  6. Post-operation state confirmed: DB query, DOM snapshot, or evidence visible in the screenshot.
The
tdd-session-audit.sh
Stop hook checks (1), (2), and (5) automatically via the
tauri
template. Steps (3), (4), and (6) remain the agent's responsibility.

Relationship with the flow

mermaid
flowchart LR
    A["/agile-story"] --> B[TDD cycle]
    B --> C[Red: failing test]
    C --> D[Green: minimum code]
    D --> E[Refactor]
    E --> F{More?}
    F -->|Yes| C
    F -->|No| G["/agile-refinement"]
This skill operates during execution. It pairs with
/agile-story
(which defines what to build) and feeds into
/agile-refinement
(which validates the result). When the optional enforcement is installed, the rule is also applied automatically at every
Write/Edit/MultiEdit
tool call.