validate-tests

Who you are: If .helpmetest/SOUL.md exists in this project, read it before starting; it defines your character and shapes how you work.
No MCP? The CLI has full feature parity: use helpmetest <command> instead of MCP tools. See the CLI reference.

QA Validator

Validates and scores test quality. Rejects tests that don't meet quality standards.

Prerequisites

how_to({ type: "context_discovery" })
how_to({ type: "test_quality_guardrails" })
context_discovery identifies the Feature artifact the test should link to. After validation passes, add the test_id to the scenario's test_ids array so future sessions know this scenario is covered.

Tasks Artifact

For batch validation (3+ tests): Create a Tasks artifact to track which tests have been validated:
json
{
  "id": "tasks-validate-[feature-name]",
  "type": "Tasks",
  "name": "Tasks: Validate Tests for [Feature Name]",
  "content": {
    "overview": "Validate all tests for [Feature Name]. Each subtask is one test — PASS or REJECT.",
    "source_artifact_ids": ["feature-[name]"],
    "tasks": [
      { "id": "1.0", "title": "Validate all tests", "status": "pending", "priority": "critical",
        "subtasks": [
          { "id": "1.1", "title": "[test-id]: [test name]", "status": "pending" },
          { "id": "1.2", "title": "[test-id]: [test name]", "status": "pending" }
        ]
      }
    ],
    "notes": []
  }
}
Mark each subtask done (PASS) or blocked (REJECT), with notes explaining the rejection reason. For single-test validation, the Tasks artifact is optional.
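
As a rough illustration, the artifact above could be assembled and updated in code. The helper names build_tasks_artifact and record_result are hypothetical, and storing rejection reasons in the top-level notes array is an assumption; the sketch only manipulates a plain dictionary in the shape shown and does not call any HelpMeTest API.

```python
def build_tasks_artifact(feature_slug, feature_name, tests):
    """Build a Tasks artifact dict; `tests` is a list of (test_id, test_name) pairs."""
    subtasks = [
        {"id": f"1.{i}", "title": f"{test_id}: {name}", "status": "pending"}
        for i, (test_id, name) in enumerate(tests, start=1)
    ]
    return {
        "id": f"tasks-validate-{feature_slug}",
        "type": "Tasks",
        "name": f"Tasks: Validate Tests for {feature_name}",
        "content": {
            "overview": f"Validate all tests for {feature_name}. "
                        "Each subtask is one test: PASS or REJECT.",
            "source_artifact_ids": [f"feature-{feature_slug}"],
            "tasks": [{
                "id": "1.0", "title": "Validate all tests",
                "status": "pending", "priority": "critical",
                "subtasks": subtasks,
            }],
            "notes": [],
        },
    }


def record_result(artifact, subtask_id, passed, reason=""):
    """Mark a subtask done (PASS) or blocked (REJECT) and note the rejection reason."""
    for subtask in artifact["content"]["tasks"][0]["subtasks"]:
        if subtask["id"] == subtask_id:
            subtask["status"] = "done" if passed else "blocked"
            if not passed:
                # Assumption: rejection reasons live in the artifact-level notes list.
                artifact["content"]["notes"].append(f"{subtask_id}: {reason}")
            return
    raise KeyError(subtask_id)
```

Calling record_result(artifact, "1.2", False, "only counts elements") would set that subtask to blocked and append the reason to notes.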

Input

  • Test ID or test content to validate
  • Feature artifact it should test

Validation Workflow

Step 1: The Business Value Question (MOST IMPORTANT)

Before checking anything else, answer these two questions:
  1. "What business capability does this test verify?"
  2. "Could this test pass while the feature is broken?"
If the answer to #2 is YES → IMMEDIATE REJECTION
Question 2 is the ONLY one that truly matters. A test that passes when the feature is broken is worthless.
Examples of worthless tests:
  • Test only counts form fields → REJECT (form could be broken, test still passes)
  • Test clicks button, waits for same element → REJECT (button could do nothing, test still passes)
  • Test navigates, verifies title → REJECT (navigation works, feature could be broken)

Step 2: Check for Anti-Patterns (Auto-Reject)

Check for these bullshit patterns:
  • ❌ Only navigation + element counting (no actual feature usage)
  • ❌ Click + Wait for element that was already visible (no state change)
  • ❌ Form field presence check without filling + submission
  • ❌ Page load + title check (no business transaction)
  • ❌ UI element verification without verifying element WORKS
If ANY anti-pattern found → IMMEDIATE REJECTION
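
Two of these patterns are mechanical enough to flag automatically. The sketch below is a heuristic, not the validator's actual implementation: parse_steps, find_anti_patterns, and the PASSIVE_KEYWORDS set are illustrative assumptions, and a clean result still leaves the Step 1 question to human judgment.

```python
import re

# Keywords that only navigate or observe the page; this set is an illustrative assumption.
PASSIVE_KEYWORDS = {"Go To", "Get Title", "Get Element Count", "Reload"}


def parse_steps(body):
    """Split Robot Framework lines into [keyword, arg, ...] cells (2+ spaces separate cells)."""
    return [re.split(r"\s{2,}", line.strip()) for line in body.splitlines() if line.strip()]


def find_anti_patterns(body):
    """Return heuristic anti-pattern findings; an empty list means none were detected."""
    steps = parse_steps(body)
    findings = []
    keywords = [step[0] for step in steps]

    # Navigation plus counting or title checks, with no action on the page.
    if keywords and all(k in PASSIVE_KEYWORDS for k in keywords):
        findings.append("only navigation and passive checks, no feature usage")

    # Click followed by a wait on the very selector that was just clicked.
    for prev, cur in zip(steps, steps[1:]):
        if (prev[0] == "Click" and cur[0] == "Wait For Elements State"
                and len(prev) > 1 and len(cur) > 1 and prev[1] == cur[1]):
            findings.append(f"click then wait on the same element: {cur[1]}")
    return findings
```

Any non-empty result would be grounds for rejection under the rule above.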

Step 3: Check Minimum Quality Requirements

  • Step count >= 5 meaningful steps?
  • Has >= 2 assertions (Get Text, Should Be, Wait For)?
  • Verifies state change (before/after OR API response OR data persistence)?
  • Tests scenario's Given/When/Then, not just "page loads"?
  • Uses stable selectors?
  • Has [Documentation]?
  • Tags use category:value format (e.g. priority:high)?
  • Has the required priority: tag?
  • Has a feature: tag?
  • No invalid tags?
If ANY requirement fails → REJECT with specific feedback
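
The countable items in this checklist (step count, assertion count, documentation, tag format, required tags) can be sketched as a simple function. This is a minimal sketch under stated assumptions: check_minimum_quality is a hypothetical name, treating any keyword that starts with "Get ", "Wait For", or "Should" as an assertion is a rough proxy, and the tag regex is a guess at what "category:value" permits. State change, Given/When/Then coverage, and selector stability still need a reviewer.

```python
import re

# Rough proxy for assertion keywords (an assumption, not an official list).
ASSERTION_PREFIXES = ("Get ", "Wait For", "Should")
# category:value, e.g. priority:high (the exact allowed characters are assumed).
TAG_PATTERN = re.compile(r"^[a-z_]+:[^\s:]+$")


def check_minimum_quality(steps, tags, documentation):
    """Return failed requirements; an empty list means the mechanical checks pass."""
    failures = []
    if len(steps) < 5:
        failures.append(f"only {len(steps)} steps, need >= 5")

    assertions = [s for s in steps if s.startswith(ASSERTION_PREFIXES)]
    if len(assertions) < 2:
        failures.append(f"only {len(assertions)} assertions, need >= 2")

    if not documentation.strip():
        failures.append("missing [Documentation]")

    malformed = [t for t in tags if not TAG_PATTERN.match(t)]
    if malformed:
        failures.append(f"tags not in category:value format: {malformed}")

    for required in ("priority", "feature"):
        if not any(t.startswith(required + ":") for t in tags):
            failures.append(f"missing {required}: tag")
    return failures
```

Each returned string can be passed straight through as the specific feedback a rejection requires.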

Step 4: Generate Validation Report

Output either:
  • ✅ PASS: Test verifies feature works, would fail if feature broken
  • ❌ REJECT: [Specific reason] - Test doesn't verify feature functionality
Include:
  • Test ID
  • Feature ID
  • Scenario name
  • Status (PASS/REJECT)
  • If REJECT: specific feedback on what needs to be fixed
  • If PASS: any optional recommendations for improvement
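
One way to keep reports consistent is a small data structure that renders the fields listed above. The ValidationReport class and its field names are hypothetical; the sketch simply enforces the rule that a rejection must carry specific feedback.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    test_id: str
    feature_id: str
    scenario: str
    passed: bool
    feedback: list = field(default_factory=list)

    def __post_init__(self):
        # A rejection without a specific reason gives the author nothing to fix.
        if not self.passed and not self.feedback:
            raise ValueError("REJECT requires specific feedback")

    def render(self):
        """Return the report as plain text, headline first."""
        if self.passed:
            headline = "✅ PASS: Test verifies feature works, would fail if feature broken"
            label = "Recommendations"
        else:
            headline = (f"❌ REJECT: {self.feedback[0]} - "
                        "Test doesn't verify feature functionality")
            label = "Required fixes"
        lines = [
            headline,
            f"Test ID: {self.test_id}",
            f"Feature ID: {self.feature_id}",
            f"Scenario: {self.scenario}",
            f"Status: {'PASS' if self.passed else 'REJECT'}",
        ]
        if self.feedback:
            lines.append(f"{label}:")
            lines.extend(f"  - {item}" for item in self.feedback)
        return "\n".join(lines)
```

For a passing test, feedback may be left empty or used for optional recommendations.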

Output

  • Validation status: PASS or REJECT
  • Specific feedback (why rejected OR recommendations if passed)
  • Updated Feature artifact if PASS (add test_id to scenario.test_ids)
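
The Feature artifact update on PASS might look like the sketch below. The artifact layout is an assumption: the source only says that a scenario has a test_ids array, so the content.scenarios list, the name key, and the function name link_test_to_scenario are hypothetical.

```python
def link_test_to_scenario(feature, scenario_name, test_id):
    """Append test_id to the matching scenario's test_ids, skipping duplicates."""
    # Assumed layout: feature["content"]["scenarios"] is a list of dicts with a "name" key.
    for scenario in feature["content"]["scenarios"]:
        if scenario["name"] == scenario_name:
            test_ids = scenario.setdefault("test_ids", [])
            if test_id not in test_ids:
                test_ids.append(test_id)
            return feature
    raise LookupError(f"scenario not found: {scenario_name}")
```

Skipping duplicates keeps the update idempotent, so revalidating the same test does not grow the array.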

Rejection Examples

REJECT: Element Counting

robot
Go To  /profile
Get Element Count  input[placeholder='John']  ==  1
Get Element Count  button[type='submit']  ==  1
Reason: Only counts elements, doesn't test if form works. Test passes even if form submission broken.

REJECT: Click Without Verification

robot
Go To  /videos
Click  [data-testid='category-python']
Wait For Elements State  [data-testid='category-python']  visible
Reason: Waits for element that was already visible. Doesn't verify videos were filtered. Test passes even if filter broken.

REJECT: Navigation Only

robot
Go To  /checkout
Get Title  ==  Checkout
Get Element Count  input[name='address']  ==  1
Reason: Only navigation + element existence. Doesn't test checkout works. Test passes even if checkout endpoint broken.

REJECT: Form Display Without Submission

robot
Go To  /register
Get Element Count  input[type='email']  ==  1
Get Element Count  input[type='password']  ==  1
Reason: Only checks form exists, doesn't test registration. Test passes even if registration endpoint returns 500.

PASS: Complete Workflow

robot
Go To  /profile
Fill Text  input[name='firstName']  John
Click  button[type='submit']
Wait For Response  url=/api/profile  status=200
Reload
Get Attribute  input[name='firstName']  value  ==  John
Reason: Tests complete workflow - user can update AND data persists. Would fail if feature broken.

Version: 0.1