tdd

**Who you are:** If `.helpmetest/SOUL.md` exists, read it — it defines your character.
**No MCP?** Use `helpmetest <command>` instead of MCP tools.

Tests — Write, Generate, Fix


Orient First (Always)


Before doing anything, check what already exists:

```
helpmetest_status()
helpmetest_search_artifacts({ query: "" })
helpmetest_search_artifacts({ type: "Tasks" })
```

- Tests already failing? → that's the priority, not creating new ones
- Tasks artifact in progress? → resume it, don't start over
- Feature artifacts exist? → use them, don't re-discover


Use Cases


"I need to build something" (TDD)


New feature, bug fix, or refactor. Tests come first — they define what "done" means.
1. Create a Tasks artifact to track the work:
```json
{
  "id": "tasks-[feature-name]",
  "type": "Tasks",
  "content": {
    "overview": "What this implements and why",
    "tasks": [
      { "id": "1.0", "title": "Write all tests first", "status": "pending", "priority": "critical" },
      { "id": "2.0", "title": "Implement to make tests pass", "status": "pending" },
      { "id": "3.0", "title": "All green — review for gaps", "status": "pending" }
    ]
  }
}
```
2. Create a Feature artifact with all scenarios before writing a single test:
```json
{
  "id": "feature-[name]",
  "type": "Feature",
  "content": {
    "goal": "What this feature does",
    "functional": [
      { "name": "User can do X", "given": "...", "when": "...", "then": "...", "tags": ["priority:critical"], "test_ids": [] }
    ],
    "edge_cases": [],
    "bugs": []
  }
}
```
3. Write ALL tests — happy paths, edge cases, errors — before implementing anything. Failing tests are your spec.
4. Implement incrementally — pick the highest-priority failing test, make it pass, move to the next.
5. Done when all tests are green and you've reviewed for missing edge cases.
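As a sketch, a first failing test from step 3 might look like this. The persona, URL, and selector are hypothetical placeholders; the step keywords (`As`, `Go To`, `Create Fake Email`, `Fill Text`) are the ones used elsewhere in this skill:

```
As  RegisteredUser                      # hypothetical saved auth state
Go To  https://app.example.com/profile  # hypothetical URL
${email}=  Create Fake Email
Fill Text  input[name=email]  ${email}  # hypothetical selector
# Then: assert the new email is saved and displayed —
# this test fails by design until task 2.0 implements the feature
```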


"Write tests for an existing feature"


Feature exists (or was just built by someone else). Your job is tests only.
1. Read the Feature artifact: `helpmetest_get_artifact({ id: "feature-X" })`. If none exists, create one first based on what you know.
2. Explore interactively before writing — run the scenario step by step using `helpmetest_run_interactive_command`. A test written after seeing real behavior uses real selectors and reflects actual timing. A test written from a description is a guess.
```
As  <persona>
Go To  <url>
```

Execute each Given/When/Then step, observe what actually happens



**3. Write tests** for `priority:critical` scenarios first, then high, then medium. For each:
- 5+ meaningful steps
- Verify business outcomes (data saved, state changed) — not just that an element is visible
- Use `Create Fake Email` for any registration/email fields — never hardcode

**4. Link tests back** — add each test ID to `scenario.test_ids` in the Feature artifact.

**5. Validate** each test with `validate-tests` before running. Catches tests that pass when the feature is broken.

**6. Run and fix** — see "Fix broken tests" below if a newly-written test fails.

---


"Fix broken tests" / "Tests are failing"


**First: understand the failure pattern**

Check recent code changes:

```bash
git diff --stat HEAD
git log --oneline -5
```

Map changed files to likely causes:
- `components/`, `pages/` → selector changes
- `auth/`, `session/` → auth state issues
- `api/`, `routes/` → backend errors or changed response shapes

Then get test history:

```
helpmetest_status({ id: "test-id", testRunLimit: 10 })
```

Classify:
- Consistent failure after a code change → selector/behavior changed
- Intermittent PASS/FAIL with changing errors → isolation issue (shared state, test order dependency)
- Timeout / element not visible → timing issue
- Auth/session error → state not restored correctly
- Backend error in test output → real bug, not a test issue
**Reproduce interactively** — always do this before fixing. Run the failing steps one at a time:

```
As  <persona>
Go To  <url>
```
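Continuing the session one step at a time might look like this — the selector and value are hypothetical; the point is to stop at the failing step and observe before touching the test:

```
Fill Text  input[name=quantity]  3   # hypothetical step copied from the failing test
# pause here: is the element present? did the value stick? what does the page actually show?
```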

Execute each step, observe what actually happens at the point of failure



For "element not found": list all elements of that type, try alternate selectors.
For "wrong value": check what's actually displayed vs what the test expected.
For timeouts: try longer waits, check whether the element ever appears.

**Decide: test issue or app bug?**

- **Test issue** (selector changed, timing, wrong expectation) → fix the test, validate the fix interactively before saving
- **App bug** (feature is actually broken) → document in `feature.bugs[]`, update Feature.status to "broken" or "partial"

**Many tests broke after a UI change?**

Work through them systematically one by one. For each:
1. Classify the failure (usually selector or timing)
2. Reproduce interactively
3. Fix
4. Run to confirm

Don't shotgun-fix by guessing — one wrong fix creates two broken tests.

**Tests are out of date after a refactor?**

1. Get test list: `helpmetest_status()`
2. For each failing test, check whether the Feature artifact scenario still matches intended behavior
3. If the code is the source of truth → update the test
4. If the test was right and the refactor broke behavior → document the regression

---


Writing Tests


Structure


```
As  <persona>          # auth state — always first
Go To  <url>

# Given — establish precondition
<steps>

# When — perform the action
<steps>

# Then — verify the outcome
<assertions>

# Persistence check (if relevant)
Reload  <re-assert that state survived>
```
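Putting the skeleton together, a complete test might read like this. The persona, URL, and selector are hypothetical; `Create Fake Email`, `Fill Text`, and `Reload` are the keywords this skill uses elsewhere:

```
As  RegisteredUser                      # hypothetical saved auth state
Go To  https://app.example.com/profile  # hypothetical URL

# Given — a fresh fake email to switch to
${email}=  Create Fake Email

# When — the user changes the address
Fill Text  input[name=email]  ${email}  # hypothetical selector

# Then — the new address is shown (a business outcome, not mere visibility)
# <assert the page shows ${email}>

# Persistence check
Reload  <re-assert that ${email} survived the reload>
```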

What makes a good test


✅ Verifies a business outcome — data saved, filter applied, order created
✅ Would FAIL if the feature is broken
✅ 5+ meaningful steps
✅ Checks state change, not just that a button exists

❌ Just navigates to a page and counts elements
❌ Clicks something without checking what happened
❌ Passes when the feature is broken

Test naming


Format: `User can <action>` or `<Feature> <behavior>`
- User can update profile email
- Cart total updates when quantity changes
- MyApp Login Test
- SiteName Checkout

Auth


Use `Save As <StateName>` once to capture auth state. Reuse with `As <StateName>` in every test — never re-authenticate inside tests.
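A sketch of the split, assuming a hypothetical `LoggedInUser` state name and hypothetical URLs:

```
# One-time setup run — authenticate, then capture the state
Go To  https://app.example.com/login    # hypothetical URL
# <perform the login steps once>
Save As  LoggedInUser

# Every subsequent test — reuse the state, never re-authenticate
As  LoggedInUser
Go To  https://app.example.com/dashboard  # hypothetical URL
```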

Emails


Use `Create Fake Email` — never hardcode `test@example.com`. Hardcoded emails break on second run.

```
${email}=  Create Fake Email
Fill Text  input[name=email]  ${email}
${code}=   Get Email Verification Code  ${email}
```

Localhost


If testing a local server, set up the proxy first:

```
helpmetest_proxy({ action: "start", domain: "dev.local", sourcePort: 3000 })
```

Verify it works before writing any tests. See the `proxy` skill for details.


Tags


- `priority:critical|high|medium|low`
- `feature:[feature-name]`
- `type:e2e|smoke|regression`


Done means


- ✅ All tests passing
- ✅ All `priority:critical` scenarios have `test_ids`
- ✅ Bugs documented in `feature.bugs[]`
- ✅ Feature.status updated (`working` / `broken` / `partial`)
- ✅ Tasks artifact all done