tdd

**Who you are:** If `.helpmetest/SOUL.md` exists, read it — it defines your character.
**No MCP?** Use `helpmetest <command>` instead of MCP tools.

Tests — Write, Generate, Fix


Orient First (Always)


Before doing anything, check what already exists:

```
helpmetest_status()
helpmetest_search_artifacts({ query: "" })
helpmetest_search_artifacts({ type: "Tasks" })
```

- Tests already failing? → that's the priority, not creating new ones
- Tasks artifact in progress? → resume it, don't start over
- Feature artifacts exist? → use them, don't re-discover


Use Cases


"I need to build something" (TDD)


New feature, bug fix, or refactor. Tests come first — they define what "done" means.
1. Create a Tasks artifact to track the work:
```json
{
  "id": "tasks-[feature-name]",
  "type": "Tasks",
  "content": {
    "overview": "What this implements and why",
    "tasks": [
      { "id": "1.0", "title": "Write all tests first", "status": "pending", "priority": "critical" },
      { "id": "2.0", "title": "Implement to make tests pass", "status": "pending" },
      { "id": "3.0", "title": "All green — review for gaps", "status": "pending" }
    ]
  }
}
```
2. Create a Feature artifact with all scenarios before writing a single test:
```json
{
  "id": "feature-[name]",
  "type": "Feature",
  "content": {
    "goal": "What this feature does",
    "functional": [
      { "name": "User can do X", "given": "...", "when": "...", "then": "...", "tags": ["priority:critical"], "test_ids": [] }
    ],
    "edge_cases": [],
    "bugs": []
  }
}
```
3. Write ALL tests — happy paths, edge cases, errors — before implementing anything. Failing tests are your spec.
4. Implement incrementally — pick the highest-priority failing test, make it pass, move to the next.
5. Done when all tests are green and you've reviewed for missing edge cases.
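As a sketch, a first failing test from step 3 might look like this. The persona, URL, and selector are hypothetical placeholders; the step keywords (`As`, `Go To`, `Create Fake Email`, `Fill Text`) are the ones used elsewhere in this skill:

```
As  RegisteredUser                      # hypothetical saved auth state
Go To  https://app.example.com/profile  # hypothetical URL
${email}=  Create Fake Email
Fill Text  input[name=email]  ${email}  # hypothetical selector
# Then: assert the new email is saved and displayed —
# this test fails by design until task 2.0 implements the feature
```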


"Write tests for an existing feature"


Feature exists (or was just built by someone else). Your job is tests only.
1. Read the Feature artifact: `helpmetest_get_artifact({ id: "feature-X" })`. If none exists, create one first based on what you know.
2. Explore interactively before writing — run the scenario step by step using `helpmetest_run_interactive_command`. A test written after seeing real behavior uses real selectors and reflects actual timing. A test written from a description is a guess.
```
As  <persona>
Go To  <url>
```

Execute each Given/When/Then step, observe what actually happens



**3. Write tests** for `priority:critical` scenarios first, then high, then medium. For each:
- 5+ meaningful steps
- Verify business outcomes (data saved, state changed) — not just that an element is visible
- Use `Create Fake Email` for any registration/email fields — never hardcode

**4. Link tests back** — add each test ID to `scenario.test_ids` in the Feature artifact.

**5. Validate** each test with `validate-tests` before running. Catches tests that pass when the feature is broken.

**6. Run and fix** — see "Fix broken tests" below if a newly-written test fails.

---


"Fix broken tests" / "Tests are failing"


**First: understand the failure pattern**

Check recent code changes:

```bash
git diff --stat HEAD
git log --oneline -5
```

Map changed files to likely causes:
- `components/`, `pages/` → selector changes
- `auth/`, `session/` → auth state issues
- `api/`, `routes/` → backend errors or changed response shapes

Then get test history:

```
helpmetest_status({ id: "test-id", testRunLimit: 10 })
```

Classify:
- Consistent failure after a code change → selector/behavior changed
- Intermittent PASS/FAIL with changing errors → isolation issue (shared state, test order dependency)
- Timeout / element not visible → timing issue
- Auth/session error → state not restored correctly
- Backend error in test output → real bug, not a test issue
**Reproduce interactively** — always do this before fixing. Run the failing steps one at a time:

```
As  <persona>
Go To  <url>
```
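Continuing the session one step at a time might look like this — the selector and value are hypothetical; the point is to stop at the failing step and observe before touching the test:

```
Fill Text  input[name=quantity]  3   # hypothetical step copied from the failing test
# pause here: is the element present? did the value stick? what does the page actually show?
```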

Execute each step, observe what actually happens at the point of failure



For "element not found": list all elements of that type, try alternate selectors.
For "wrong value": check what's actually displayed vs what the test expected.
For timeouts: try longer waits, check whether the element ever appears.

**Decide: test issue or app bug?**

- **Test issue** (selector changed, timing, wrong expectation) → fix the test, validate the fix interactively before saving
- **App bug** (feature is actually broken) → document in `feature.bugs[]`, update Feature.status to "broken" or "partial"

**Many tests broke after a UI change?**

Work through them systematically one by one. For each:
1. Classify the failure (usually selector or timing)
2. Reproduce interactively
3. Fix
4. Run to confirm

Don't shotgun-fix by guessing — one wrong fix creates two broken tests.

**Tests are out of date after a refactor?**

1. Get test list: `helpmetest_status()`
2. For each failing test, check whether the Feature artifact scenario still matches intended behavior
3. If the code is the source of truth → update the test
4. If the test was right and the refactor broke behavior → document the regression

---


Writing Tests


Structure


```
As  <persona>          # auth state — always first
Go To  <url>

# Given — establish precondition
<steps>

# When — perform the action
<steps>

# Then — verify the outcome
<assertions>

# Persistence check (if relevant)
Reload  <re-assert that state survived>
```
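Putting the skeleton together, a complete test might read like this. The persona, URL, and selector are hypothetical; `Create Fake Email`, `Fill Text`, and `Reload` are the keywords this skill uses elsewhere:

```
As  RegisteredUser                      # hypothetical saved auth state
Go To  https://app.example.com/profile  # hypothetical URL

# Given — a fresh fake email to switch to
${email}=  Create Fake Email

# When — the user changes the address
Fill Text  input[name=email]  ${email}  # hypothetical selector

# Then — the new address is shown (a business outcome, not mere visibility)
# <assert the page shows ${email}>

# Persistence check
Reload  <re-assert that ${email} survived the reload>
```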

What makes a good test


✅ Verifies a business outcome — data saved, filter applied, order created
✅ Would FAIL if the feature is broken
✅ 5+ meaningful steps
✅ Checks state change, not just that a button exists

❌ Just navigates to a page and counts elements
❌ Clicks something without checking what happened
❌ Passes when the feature is broken

Test naming


Format: `User can <action>` or `<Feature> <behavior>`
- User can update profile email
- Cart total updates when quantity changes
- MyApp Login Test
- SiteName Checkout

Auth


Use `Save As <StateName>` once to capture auth state. Reuse with `As <StateName>` in every test — never re-authenticate inside tests.
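A sketch of the split, assuming a hypothetical `LoggedInUser` state name and hypothetical URLs:

```
# One-time setup run — authenticate, then capture the state
Go To  https://app.example.com/login    # hypothetical URL
# <perform the login steps once>
Save As  LoggedInUser

# Every subsequent test — reuse the state, never re-authenticate
As  LoggedInUser
Go To  https://app.example.com/dashboard  # hypothetical URL
```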

Emails


Use `Create Fake Email` — never hardcode `test@example.com`. Hardcoded emails break on second run.

```
${email}=  Create Fake Email
Fill Text  input[name=email]  ${email}
${code}=   Get Email Verification Code  ${email}
```

Localhost


If testing a local server, set up the proxy first:

```
helpmetest_proxy({ action: "start", domain: "dev.local", sourcePort: 3000 })
```

Verify it works before writing any tests. See the `proxy` skill for details.


Tags


- `priority:critical|high|medium|low`
- `feature:[feature-name]`
- `type:e2e|smoke|regression`


Done means


- ✅ All tests passing
- ✅ All `priority:critical` scenarios have `test_ids`
- ✅ Bugs documented in `feature.bugs[]`
- ✅ Feature.status updated (`working` / `broken` / `partial`)
- ✅ Tasks artifact all done