validate-tests

Who you are: If .helpmetest/SOUL.md exists in this project, read it before starting. It defines your character and shapes how you work.

No MCP? The CLI has full feature parity: use helpmetest <command> instead of MCP tools. See the CLI reference.

QA Validator

Validates and scores test quality. Rejects tests that don't meet quality standards.

Prerequisites

how_to({ type: "context_discovery" })
how_to({ type: "test_quality_guardrails" })

context_discovery identifies the Feature artifact the test should link to. After validation passes, add the test_id to the scenario's test_ids array so future sessions know this scenario is covered.
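The linking step can be sketched as follows. This is a minimal illustration, not the tool's actual API: the artifact layout (a `content.scenarios` list whose entries carry `name` and `test_ids`), the function name, and the sample IDs are all assumptions made for the example.

```python
from copy import deepcopy


def link_test_to_scenario(feature: dict, scenario_name: str, test_id: str) -> dict:
    """Return a copy of the Feature artifact with test_id recorded on the named scenario."""
    updated = deepcopy(feature)
    # Assumed layout: feature["content"]["scenarios"] is a list of dicts
    # with a "name" and a "test_ids" list. Adjust to the real artifact shape.
    for scenario in updated["content"]["scenarios"]:
        if scenario["name"] == scenario_name:
            ids = scenario.setdefault("test_ids", [])
            if test_id not in ids:  # avoid duplicate entries on re-validation
                ids.append(test_id)
            return updated
    raise KeyError(f"scenario not found: {scenario_name}")


# Hypothetical Feature artifact and test ID, for illustration only.
feature = {
    "id": "feature-profile",
    "content": {"scenarios": [{"name": "User updates first name", "test_ids": []}]},
}
linked = link_test_to_scenario(feature, "User updates first name", "test-profile-update")
print(linked["content"]["scenarios"][0]["test_ids"])
```

Returning a copy keeps the original artifact untouched until the validation result is final.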

Tasks Artifact

For batch validation (3+ tests): Create a Tasks artifact to track which tests have been validated:

json

{
  "id": "tasks-validate-[feature-name]",
  "type": "Tasks",
  "name": "Tasks: Validate Tests for [Feature Name]",
  "content": {
    "overview": "Validate all tests for [Feature Name]. Each subtask is one test — PASS or REJECT.",
    "source_artifact_ids": ["feature-[name]"],
    "tasks": [
      { "id": "1.0", "title": "Validate all tests", "status": "pending", "priority": "critical",
        "subtasks": [
          { "id": "1.1", "title": "[test-id]: [test name]", "status": "pending" },
          { "id": "1.2", "title": "[test-id]: [test name]", "status": "pending" }
        ]
      }
    ],
    "notes": []
  }
}

Mark each subtask done (PASS) or blocked (REJECT), with notes explaining the rejection reason. For single-test validation, Tasks is optional.
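As a rough sketch of that bookkeeping, the helper below updates a Tasks artifact shaped like the JSON above. The function name is hypothetical, and storing rejection reasons as strings in `content.notes` is an assumption; the source only says that notes should explain the rejection.

```python
def record_result(tasks_artifact: dict, subtask_id: str, passed: bool, reason: str = "") -> None:
    """Mark one subtask done (PASS) or blocked (REJECT) in a Tasks artifact, in place."""
    content = tasks_artifact["content"]
    for task in content["tasks"]:
        for subtask in task["subtasks"]:
            if subtask["id"] == subtask_id:
                subtask["status"] = "done" if passed else "blocked"
                if not passed:
                    # Assumption: rejection reasons are kept as plain strings in "notes".
                    content["notes"].append(f"{subtask_id} REJECT: {reason}")
                return
    raise KeyError(f"subtask not found: {subtask_id}")


# Trimmed-down artifact with placeholder titles, for illustration only.
tasks = {
    "id": "tasks-validate-profile",
    "type": "Tasks",
    "content": {
        "tasks": [
            {"id": "1.0", "title": "Validate all tests", "status": "pending",
             "subtasks": [
                 {"id": "1.1", "title": "test-a: profile update", "status": "pending"},
                 {"id": "1.2", "title": "test-b: profile form", "status": "pending"},
             ]}
        ],
        "notes": [],
    },
}
record_result(tasks, "1.1", passed=True)
record_result(tasks, "1.2", passed=False, reason="only counts form fields")
print([s["status"] for s in tasks["content"]["tasks"][0]["subtasks"]])
```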

Input

Test ID or test content to validate
Feature artifact it should test

Validation Workflow

Step 1: The Business Value Question (MOST IMPORTANT)

Before checking anything else, answer these two questions:

1. "What business capability does this test verify?"
2. "Could this test pass while the feature is broken?"

If the answer to #2 is YES → IMMEDIATE REJECTION

Question #2 is the ONLY one that truly matters. A test that passes when the feature is broken is worthless.

Examples of worthless tests:

Test only counts form fields → REJECT (form could be broken, test still passes)
Test clicks button, waits for same element → REJECT (button could do nothing, test still passes)
Test navigates, verifies title → REJECT (navigation works, feature could be broken)

Step 2: Check for Anti-Patterns (Auto-Reject)

Check for these bullshit patterns:

❌ Only navigation + element counting (no actual feature usage)
❌ Click + Wait for element that was already visible (no state change)
❌ Form field presence check without filling + submission
❌ Page load + title check (no business transaction)
❌ UI element verification without verifying element WORKS

If ANY anti-pattern is found → IMMEDIATE REJECTION
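Two of these patterns are mechanical enough to pre-screen for automatically. The sketch below is a heuristic under stated assumptions, not part of the tool: it assumes test steps are Robot Framework lines with cells separated by two or more spaces, and it treats only `Fill Text` and `Click` as state-changing keywords (extend the set for a real suite). A clean result does not mean the test is good; it only means these two obvious patterns were not detected.

```python
import re

# Keywords assumed to change application state; illustrative, not exhaustive.
ACTION_KEYWORDS = {"Fill Text", "Click"}


def parse_steps(robot_body: str) -> list[list[str]]:
    """Split each non-empty line into cells on runs of two or more whitespace characters."""
    return [re.split(r"\s{2,}", line.strip()) for line in robot_body.splitlines() if line.strip()]


def find_anti_patterns(steps: list[list[str]]) -> list[str]:
    """Return a list of detected anti-patterns (empty if none of the heuristics fire)."""
    findings = []
    if not any(step[0] in ACTION_KEYWORDS for step in steps):
        findings.append("no interaction: only navigation or presence checks")
    for prev, cur in zip(steps, steps[1:]):
        same_target = len(prev) > 1 and len(cur) > 1 and prev[1] == cur[1]
        if prev[0] == "Click" and cur[0].startswith("Wait For") and same_target:
            findings.append("click followed by a wait on the same element")
    return findings


counting_only = """
Go To  /profile
Get Element Count  input[placeholder='John']  ==  1
Get Element Count  button[type='submit']  ==  1
"""
click_then_wait = """
Go To  /videos
Click  [data-testid='category-python']
Wait For Elements State  [data-testid='category-python']  visible
"""
print(find_anti_patterns(parse_steps(counting_only)))
print(find_anti_patterns(parse_steps(click_then_wait)))
```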

Step 3: Check Minimum Quality Requirements

Step count >= 5 meaningful steps?
Has >= 2 assertions (Get Text, Should Be, Wait For)?
Verifies state change (before/after OR API response OR data persistence)?
Tests scenario's Given/When/Then, not just "page loads"?
Uses stable selectors?
Has [Documentation]?
Tags use category:value format (e.g. priority:high)?
Has the required priority: tag?
Tags include a feature: tag?
No invalid tags?

If ANY requirement fails → REJECT with specific feedback
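The tag requirements in this checklist lend themselves to a simple automated check. The following is a minimal sketch; the exact characters allowed in a category or value are an assumption (the source only specifies the category:value shape and the priority: and feature: categories), and the function name is hypothetical.

```python
import re

# Assumed grammar: lowercase category, a single colon, then a value without spaces or colons.
TAG_PATTERN = re.compile(r"^[a-z][a-z0-9_-]*:[^\s:]+$")
REQUIRED_CATEGORIES = ("priority", "feature")


def check_tags(tags: list[str]) -> list[str]:
    """Return specific, human-readable problems with a test's tags (empty list if none)."""
    problems = [
        f"tag not in category:value format: {tag}"
        for tag in tags
        if not TAG_PATTERN.match(tag)
    ]
    categories = {tag.split(":", 1)[0] for tag in tags if ":" in tag}
    for required in REQUIRED_CATEGORIES:
        if required not in categories:
            problems.append(f"missing required tag: {required}:")
    return problems


print(check_tags(["priority:high", "feature:profile"]))
print(check_tags(["smoke", "priority:high"]))
```

Each returned string can be passed straight into the rejection feedback, which keeps the feedback specific rather than a generic "tags invalid".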

Step 4: Generate Validation Report

Output either:

✅ PASS: Test verifies feature works, would fail if feature broken
❌ REJECT: [Specific reason] - Test doesn't verify feature functionality

Include:

Test ID
Feature ID
Scenario name
Status (PASS/REJECT)
If REJECT: specific feedback on what needs to be fixed
If PASS: any optional recommendations for improvement
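One possible way to assemble such a report is sketched below. The class name, field names, and line layout are assumptions for illustration; only the headline wording and the list of included items come from the text above. The constructor refuses a REJECT without feedback, mirroring the requirement that rejections carry specific reasons.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    test_id: str
    feature_id: str
    scenario: str
    passed: bool
    feedback: list[str] = field(default_factory=list)  # fixes if REJECT, suggestions if PASS

    def __post_init__(self) -> None:
        if not self.passed and not self.feedback:
            raise ValueError("a REJECT report needs at least one specific reason")

    def render(self) -> str:
        if self.passed:
            headline = "✅ PASS: Test verifies feature works, would fail if feature broken"
        else:
            headline = f"❌ REJECT: {self.feedback[0]} - Test doesn't verify feature functionality"
        lines = [
            headline,
            f"Test ID: {self.test_id}",
            f"Feature ID: {self.feature_id}",
            f"Scenario: {self.scenario}",
            f"Status: {'PASS' if self.passed else 'REJECT'}",
        ]
        if self.feedback:
            lines.append("Recommendations:" if self.passed else "Required fixes:")
            lines.extend(f"- {item}" for item in self.feedback)
        return "\n".join(lines)


# Hypothetical IDs and feedback, for illustration only.
report = ValidationReport(
    test_id="test-profile-form",
    feature_id="feature-profile",
    scenario="User updates first name",
    passed=False,
    feedback=["Only counts elements", "Fill the form, submit, and verify the saved value"],
)
print(report.render())
```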

Output

Validation status: PASS or REJECT
Specific feedback (why rejected OR recommendations if passed)
Updated Feature artifact if PASS (add test_id to scenario.test_ids)

Rejection Examples

REJECT: Element Counting

robot

Go To  /profile
Get Element Count  input[placeholder='John']  ==  1
Get Element Count  button[type='submit']  ==  1

Reason: Only counts elements, doesn't test if form works. Test passes even if form submission broken.

REJECT: Click Without Verification

robot

Go To  /videos
Click  [data-testid='category-python']
Wait For Elements State  [data-testid='category-python']  visible

Reason: Waits for element that was already visible. Doesn't verify videos were filtered. Test passes even if filter broken.

REJECT: Navigation Only

robot

Go To  /checkout
Get Title  ==  Checkout
Get Element Count  input[name='address']  ==  1

Reason: Only navigation + element existence. Doesn't test checkout works. Test passes even if checkout endpoint broken.

REJECT: Form Display Without Submission

robot

Go To  /register
Get Element Count  input[type='email']  ==  1
Get Element Count  input[type='password']  ==  1

Reason: Only checks form exists, doesn't test registration. Test passes even if registration endpoint returns 500.

PASS: Complete Workflow

robot

Go To  /profile
Fill Text  input[name='firstName']  John
Click  button[type='submit']
Wait For Response  url=/api/profile  status=200
Reload
Get Attribute  input[name='firstName']  value  ==  John

Reason: Tests complete workflow - user can update AND data persists. Would fail if feature broken.

Version: 0.1
