spec-kitty-checklist

Compare original and translation side by side


Checklist Purpose: "Unit Tests for English"

检查清单目的:「英文需求的单元测试」

CRITICAL CONCEPT: Checklists are UNIT TESTS FOR REQUIREMENTS WRITING - they validate the quality, clarity, and completeness of requirements in a given domain.
NOT for verification/testing:
  • ❌ NOT "Verify the button clicks correctly"
  • ❌ NOT "Test error handling works"
  • ❌ NOT "Confirm the API returns 200"
  • ❌ NOT checking if code/implementation matches the spec
FOR requirements quality validation:
  • ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
  • ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
  • ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
  • ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
  • ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
Metaphor: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
核心概念:检查清单是需求写作的单元测试——它们用于验证特定领域中需求的质量、清晰度和完整性。
不用于验证/测试
  • ❌ 不包含「验证按钮点击正常」
  • ❌ 不包含「测试错误处理功能正常」
  • ❌ 不包含「确认API返回200状态码」
  • ❌ 不检查代码/实现是否符合规格说明书
用于需求质量验证
  • ✅ 「是否为所有卡片类型定义了视觉层级需求?」(完整性)
  • ✅ 「是否用具体的尺寸/位置量化了“突出显示”的要求?」(清晰度)
  • ✅ 「所有交互元素的悬停状态需求是否一致?」(一致性)
  • ✅ 「是否为键盘导航定义了无障碍需求?」(覆盖范围)
  • ✅ 「规格说明书是否定义了Logo图片加载失败时的处理逻辑?」(边缘情况)
类比:如果你的需求规格说明书是用英文写的“代码”,那么检查清单就是它的单元测试套件。你要测试的是需求是否撰写得当、完整、明确,是否已准备好进入实现阶段——而不是实现是否能正常工作。

User Input

用户输入

```text
$ARGUMENTS
```
You MUST consider the user input before proceeding (if not empty).
```text
$ARGUMENTS
```
在继续之前,你必须考虑用户输入(如果不为空)。

Execution Steps

执行步骤

  1. Setup: Run spec-kitty agent feature check-prerequisites --json from repo root and parse the JSON for feature_dir and the available_docs list.
    • All file paths must be absolute.
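A minimal sketch of this setup step, assuming the JSON shape implied above (the field names come from this document and are not verified against spec-kitty's actual output):

```python
import json

# Hypothetical example payload; in practice `raw` would be the stdout of:
#   spec-kitty agent feature check-prerequisites --json
raw = '{"feature_dir": "/repo/specs/001-landing", "available_docs": ["spec.md", "plan.md"]}'

info = json.loads(raw)
feature_dir = info["feature_dir"]        # base directory for this feature
available_docs = info["available_docs"]  # which of spec.md/plan.md/tasks.md exist

# The step above requires absolute paths.
assert feature_dir.startswith("/"), "all file paths must be absolute"
```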
  2. Clarify intent (dynamic): Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
    • Be generated from the user's phrasing + extracted signals from spec/plan/tasks
    • Only ask about information that materially changes checklist content
    • Be skipped individually if already unambiguous in $ARGUMENTS
    • Prefer precision over breadth
    Generation algorithm:
    1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
    2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
    3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
    4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
    5. Formulate questions chosen from these archetypes:
      • Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
      • Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
      • Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
      • Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
      • Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
      • Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
    Question formatting rules:
    • If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
    • Limit to A–E options maximum; omit table if a free-form answer is clearer
    • Never ask the user to restate what they already said
    • Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
    Defaults when interaction impossible:
    • Depth: Standard
    • Audience: Reviewer (PR) if code-related; Author otherwise
    • Focus: Top 2 relevance clusters
    Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
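The signal-extraction step above can be sketched as follows; the keyword sets are illustrative assumptions, not the tool's actual vocabulary:

```python
# Hypothetical keyword sets mirroring the examples in step 2.1 above.
DOMAIN_KEYWORDS = {"auth", "latency", "ux", "api"}
RISK_INDICATORS = {"critical", "must", "compliance"}

def extract_signals(text: str) -> dict:
    """Return domain and risk signals found in free-form user input."""
    words = {w.strip('.,;:"()').lower() for w in text.split()}
    return {
        "domains": sorted(words & DOMAIN_KEYWORDS),
        "risks": sorted(words & RISK_INDICATORS),
    }

signals = extract_signals("Harden the auth API; a compliance review is critical.")
```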
  3. Understand user request: Combine $ARGUMENTS + clarifying answers:
    • Derive checklist theme (e.g., security, review, deploy, ux)
    • Consolidate explicit must-have items mentioned by user
    • Map focus selections to category scaffolding
    • Infer any missing context from spec/plan/tasks (do NOT hallucinate)
  4. Load feature context: Read from feature_dir:
    • spec.md: Feature requirements and scope
    • plan.md (if exists): Technical details, dependencies
    • tasks.md (if exists): Implementation tasks
    Context Loading Strategy:
    • Load only necessary portions relevant to active focus areas (avoid full-file dumping)
    • Prefer summarizing long sections into concise scenario/requirement bullets
    • Use progressive disclosure: add follow-on retrieval only if gaps detected
    • If source docs are large, generate interim summary items instead of embedding raw text
  5. Generate checklist - Create "Unit Tests for Requirements":
    • Create the feature_dir/checklists/ directory if it doesn't exist
    • Generate a unique checklist filename:
      • Use a short, descriptive name based on domain (e.g., ux.md, api.md, security.md)
      • Format: [domain].md
      • If the file already exists, append to it
    • Number items sequentially starting from CHK001
    • Each /spec-kitty.checklist run creates a new file or appends to an existing one; it never overwrites existing checklist content
    CORE PRINCIPLE - Test the Requirements, Not the Implementation: Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
    • Completeness: Are all necessary requirements present?
    • Clarity: Are requirements unambiguous and specific?
    • Consistency: Do requirements align with each other?
    • Measurability: Can requirements be objectively verified?
    • Coverage: Are all scenarios/edge cases addressed?
    Category Structure - Group items by requirement quality dimensions:
    • Requirement Completeness (Are all necessary requirements documented?)
    • Requirement Clarity (Are requirements specific and unambiguous?)
    • Requirement Consistency (Do requirements align without conflicts?)
    • Acceptance Criteria Quality (Are success criteria measurable?)
    • Scenario Coverage (Are all flows/cases addressed?)
    • Edge Case Coverage (Are boundary conditions defined?)
    • Non-Functional Requirements (Performance, Security, Accessibility, etc. - are they specified?)
    • Dependencies & Assumptions (Are they documented and validated?)
    • Ambiguities & Conflicts (What needs clarification?)
    HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English":
    WRONG (Testing implementation):
    • "Verify landing page displays 3 episode cards"
    • "Test hover states work on desktop"
    • "Confirm logo click navigates home"
    CORRECT (Testing requirements quality):
    • "Are the exact number and layout of featured episodes specified?" [Completeness]
    • "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
    • "Are hover state requirements consistent across all interactive elements?" [Consistency]
    • "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
    • "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
    • "Are loading states defined for asynchronous episode data?" [Completeness]
    • "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
    ITEM STRUCTURE: Each item should follow this pattern:
    • Question format asking about requirement quality
    • Focus on what's WRITTEN (or not written) in the spec/plan
    • Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
    • Reference the spec section [Spec §X.Y] when checking existing requirements
    • Use the [Gap] marker when checking for missing requirements
    EXAMPLES BY QUALITY DIMENSION:
    Completeness:
    • "Are error handling requirements defined for all API failure modes? [Gap]"
    • "Are accessibility requirements specified for all interactive elements? [Completeness]"
    • "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
    Clarity:
    • "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
    • "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
    • "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
    Consistency:
    • "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
    • "Are card component requirements consistent between landing and detail pages? [Consistency]"
    Coverage:
    • "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
    • "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
    • "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
    Measurability:
    • "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
    • "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
    Scenario Classification & Coverage (Requirements Quality Focus):
    • Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
    • For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
    • If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
    • Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
    Traceability Requirements:
    • MINIMUM: ≥80% of items MUST include at least one traceability reference
    • Each item should reference a spec section [Spec §X.Y], or use markers: [Gap], [Ambiguity], [Conflict], [Assumption]
    • If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
    Surface & Resolve Issues (Requirements Quality Problems): Ask questions about the requirements themselves:
    • Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
    • Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
    • Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
    • Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
    • Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
    Content Consolidation:
    • Soft cap: If raw candidate items > 40, prioritize by risk/impact
    • Merge near-duplicates checking the same requirement aspect
    • If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
    🚫 ABSOLUTELY PROHIBITED - These make it an implementation test, not a requirements test:
    • ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
    • ❌ References to code execution, user actions, system behavior
    • ❌ "Displays correctly", "works properly", "functions as expected"
    • ❌ "Click", "navigate", "render", "load", "execute"
    • ❌ Test cases, test plans, QA procedures
    • ❌ Implementation details (frameworks, APIs, algorithms)
    ✅ REQUIRED PATTERNS - These test requirements quality:
    • ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
    • ✅ "Is [vague term] quantified/clarified with specific criteria?"
    • ✅ "Are requirements consistent between [section A] and [section B]?"
    • ✅ "Can [requirement] be objectively measured/verified?"
    • ✅ "Are [edge cases/scenarios] addressed in requirements?"
    • ✅ "Does the spec define [missing aspect]?"
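The file-naming and ID-numbering rules above can be sketched as follows; the helper names are hypothetical, not spec-kitty internals:

```python
from pathlib import Path

def checklist_path(feature_dir: str, domain: str) -> Path:
    """Return [domain].md under feature_dir/checklists/, creating the directory if needed."""
    d = Path(feature_dir) / "checklists"
    d.mkdir(parents=True, exist_ok=True)
    return d / f"{domain}.md"

def next_chk_id(existing_count: int) -> str:
    """Globally incrementing, zero-padded IDs starting at CHK001."""
    return f"CHK{existing_count + 1:03d}"
```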
  6. Structure Reference: Generate the checklist following the canonical template in .kittify/templates/checklist-template.md for title, meta section, category headings, and ID formatting. If the template is unavailable, use: an H1 title, purpose/created meta lines, and ## category sections containing - [ ] CHK### <requirement item> lines with globally incrementing IDs starting at CHK001.
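When the canonical template is missing, the fallback structure described above might look like this (the title, dates, and items are placeholders, not prescribed content):

```markdown
# UX Requirements Quality Checklist

**Purpose**: Validate quality of UX requirements for this feature
**Created**: YYYY-MM-DD

## Requirement Clarity

- [ ] CHK001 - Is "prominent display" quantified with specific sizing/positioning? [Clarity, Spec §FR-4]

## Requirement Completeness

- [ ] CHK002 - Are loading states defined for asynchronous episode data? [Gap]
```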
  7. Report: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
    • Focus areas selected
    • Depth level
    • Actor/timing
    • Any explicit user-specified must-have items incorporated
Important: Each /spec-kitty.checklist command invocation creates a checklist file using a short, descriptive name unless the file already exists. This allows:
  • Multiple checklists of different types (e.g., ux.md, test.md, security.md)
  • Simple, memorable filenames that indicate checklist purpose
  • Easy identification and navigation in the checklists/ folder
To avoid clutter, use descriptive types and clean up obsolete checklists when done.
  1. 准备工作:从仓库根目录运行 spec-kitty agent feature check-prerequisites --json 命令,并解析返回的JSON以获取feature_dir和可用文档列表。
    • 所有文件路径必须为绝对路径。
  2. 明确意图(动态生成):推导最多三个初始上下文澄清问题(无预定义目录)。问题必须:
    • 从用户表述+从规格/计划/任务中提取的信号生成
    • 仅询问会实质性改变检查清单内容的信息
    • 如果 $ARGUMENTS 中已明确,则跳过对应问题
    • 优先精准性而非广泛性
    生成算法:
    1. 提取信号:功能领域关键词(如auth、latency、UX、API)、风险指标(“critical”、“must”、“compliance”)、利益相关者提示(“QA”、“review”、“security team”)和明确交付物(“a11y”、“rollback”、“contracts”)。
    2. 将信号聚类为候选焦点领域(最多4个),并按相关性排序。
    3. 如果未明确,推断可能的受众和时机(作者、评审者、QA、发布阶段)。
    4. 检测缺失维度:范围广度、深度/严谨性、风险重点、排除边界、可衡量的验收标准。
    5. 从以下原型中构建问题:
      • 范围细化(如:“是否应包含与X和Y的集成触点,还是仅局限于本地模块正确性?”)
      • 风险优先级(如:“这些潜在风险领域中哪些应设置强制检查关卡?”)
      • 深度校准(如:“这是轻量级的提交前检查清单,还是正式的发布关卡?”)
      • 受众定位(如:“仅由作者使用,还是供同行在PR评审时使用?”)
      • 边界排除(如:“本次是否应明确排除性能调优项?”)
      • 场景类别缺口(如:“未检测到恢复流程——回滚/部分失败路径是否在范围内?”)
    问题格式规则:
    • 如果提供选项,生成紧凑表格,列标题为:选项 | 候选内容 | 重要性说明
    • 选项最多限制为A-E;如果自由形式回答更清晰,则省略表格
    • 切勿要求用户重复已说明的内容
    • 避免推测性分类(不要虚构内容)。如有疑问,直接询问:“确认X是否在范围内。”
    无法交互时的默认值:
    • 深度:标准
    • 受众:与代码相关则为评审者(PR阶段);否则为作者
    • 焦点:前2个相关性最高的聚类
    输出问题(标记为Q1/Q2/Q3)。获取答案后:如果仍有≥2个场景类别(替代/异常/恢复/非功能领域)不明确,可额外提出最多2个针对性跟进问题(Q4/Q5),每个问题附带一行理由(如:“未解决的恢复路径风险”)。问题总数不得超过5个。如果用户明确拒绝更多问题,则跳过。
  3. 理解用户请求:结合 $ARGUMENTS 与澄清答案:
    • 推导检查清单主题(如security、review、deploy、ux)
    • 整合用户提到的明确必备项
    • 将焦点选择映射到类别框架
    • 从规格/计划/任务中推断缺失的上下文(切勿虚构)
  4. 加载功能上下文:从feature_dir读取:
    • spec.md:功能需求和范围
    • plan.md(如果存在):技术细节、依赖关系
    • tasks.md(如果存在):实现任务
    上下文加载策略
    • 仅加载与当前焦点领域相关的必要部分(避免全文件导入)
    • 优先将长章节总结为简洁的场景/需求要点
    • 逐步披露:仅在检测到缺口时补充后续检索
    • 如果源文档较大,生成临时摘要项而非嵌入原始文本
  5. 生成检查清单 - 创建「需求的单元测试」:
    • 如果 feature_dir/checklists/ 目录不存在则创建
    • 生成唯一的检查清单文件名:
      • 使用基于领域的简短描述性名称(如 ux.md、api.md、security.md)
      • 格式:[domain].md
      • 如果文件已存在,追加到现有文件
    • 从CHK001开始按顺序编号条目
    • 每次运行 /spec-kitty.checklist 会创建新文件或追加到现有文件,绝不覆盖现有检查清单内容
    核心原则 - 测试需求,而非实现: 每个检查清单条目必须评估需求本身的以下维度:
    • 完整性:所有必要需求是否都已存在?
    • 清晰度:需求是否明确且具体?
    • 一致性:需求之间是否一致?
    • 可衡量性:需求是否可客观验证?
    • 覆盖范围:所有场景/边缘情况是否都已覆盖?
    类别结构 - 按需求质量维度分组条目:
    • 需求完整性(所有必要需求是否已文档化?)
    • 需求清晰度(需求是否具体且无歧义?)
    • 需求一致性(需求是否一致无冲突?)
    • 验收标准质量(成功标准是否可衡量?)
    • 场景覆盖范围(所有流程/案例是否都已覆盖?)
    • 边缘情况覆盖范围(边界条件是否已定义?)
    • 非功能需求(性能、安全、无障碍等是否已明确?)
    • 依赖关系与假设(是否已文档化并验证?)
    • 歧义与冲突(哪些内容需要澄清?)
    如何撰写检查清单条目 - 「英文需求的单元测试」
    错误(测试实现):
    • 「验证着陆页显示3个剧集卡片」
    • 「测试桌面端悬停状态正常工作」
    • 「确认Logo点击可导航至首页」
    正确(测试需求质量):
    • 「是否明确指定了推荐剧集的确切数量和布局?」[完整性]
    • 「是否用具体的尺寸/位置量化了“突出显示”的要求?」[清晰度]
    • 「所有交互元素的悬停状态需求是否一致?」[一致性]
    • 「是否为所有交互式UI定义了键盘导航需求?」[覆盖范围]
    • 「规格说明书是否定义了Logo图片加载失败时的回退(fallback)行为?」[边缘情况]
    • 「是否为异步剧集数据定义了加载状态需求?」[完整性]
    • 「规格说明书是否定义了竞争UI元素的视觉层级?」[清晰度]
    条目结构: 每个条目应遵循以下模式:
    • 询问需求质量的问题格式
    • 聚焦于规格/计划中已撰写(或未撰写)的内容
    • 在括号中包含质量维度[完整性/清晰度/一致性等]
    • 检查现有需求时引用规格章节 [Spec §X.Y]
    • 检查缺失需求时使用 [Gap] 标记
    按质量维度分类的示例
    完整性:
    • 「是否为所有API失败模式定义了错误处理需求?[Gap]」
    • 「是否为所有交互元素指定了无障碍需求?[完整性]」
    • 「是否为响应式布局定义了移动端断点需求?[Gap]」
    清晰度:
    • 「是否用具体的时间阈值量化了“快速加载”的要求?[清晰度, Spec §NFR-2]」
    • 「是否明确定义了“相关剧集”的选择标准?[清晰度, Spec §FR-5]」
    • 「是否用可衡量的视觉属性定义了“突出”的要求?[歧义, Spec §FR-4]」
    一致性:
    • 「所有页面的导航需求是否一致?[一致性, Spec §FR-10]」
    • 「着陆页和详情页的卡片组件需求是否一致?[一致性]」
    覆盖范围:
    • 「是否为零状态场景(无剧集)定义了需求?[覆盖范围, 边缘情况]」
    • 「是否解决了并发用户交互场景?[覆盖范围, Gap]」
    • 「是否为部分数据加载失败定义了需求?[覆盖范围, 异常流程]」
    可衡量性:
    • 「视觉层级需求是否可衡量/可测试?[验收标准, Spec §FR-1]」
    • 「“平衡视觉权重”是否可客观验证?[可衡量性, Spec §FR-2]」
    场景分类与覆盖范围(聚焦需求质量):
    • 检查是否存在针对以下场景的需求:主流程、替代流程、异常/错误流程、恢复流程、非功能场景
    • 针对每个场景类别,询问:「[场景类型]需求是否完整、清晰且一致?」
    • 如果场景类别缺失:「[场景类型]需求是有意排除还是遗漏?[Gap]」
    • 当存在状态变更时,包含弹性/回滚需求:「是否为迁移失败定义了回滚需求?[Gap]」
    可追溯性要求
    • 最低要求:≥80%的条目必须包含至少一个可追溯性引用
    • 每个条目应引用规格章节 [Spec §X.Y],或使用标记:[Gap]、[Ambiguity]、[Conflict]、[Assumption]
    • 如果没有ID系统:「是否已建立需求与验收标准的ID体系?[可追溯性]」
    识别并解决问题(需求质量问题): 针对需求本身提出问题:
    • 歧义:「“快速”一词是否用具体指标量化?[歧义, Spec §NFR-1]」
    • 冲突:「§FR-10和§FR-10a中的导航需求是否冲突?[冲突]」
    • 假设:「“播客API始终可用”的假设是否已验证?[假设]」
    • 依赖关系:「是否已文档化外部播客API的需求?[依赖关系, Gap]」
    • 缺失定义:「是否用可衡量的标准定义了“视觉层级”?[Gap]」
    内容整合
    • 软限制:如果候选条目超过40个,按风险/影响优先级排序
    • 合并检查同一需求方面的近似重复项
    • 如果低影响边缘案例超过5个,合并为一个条目:「需求中是否涵盖了边缘案例X、Y、Z?[覆盖范围]」
    🚫 绝对禁止 - 这些会使其变为实现测试,而非需求测试:
    • ❌ 任何以“Verify”、“Test”、“Confirm”、“Check”+实现行为开头的条目
    • ❌ 提及代码执行、用户操作、系统行为
    • ❌ “显示正常”、“工作正常”、“符合预期”
    • ❌ “点击”、“导航”、“渲染”、“加载”、“执行”
    • ❌ 测试用例、测试计划、QA流程
    • ❌ 实现细节(框架、API、算法)
    ✅ 要求模式 - 这些用于测试需求质量:
    • ✅ 「是否为[场景]定义/明确/文档化了[需求类型]?」
    • ✅ 「是否用具体标准量化/澄清了[模糊术语]?」
    • ✅ 「[章节A]和[章节B]的需求是否一致?」
    • ✅ 「[需求]是否可客观衡量/验证?」
    • ✅ 「需求中是否涵盖了[边缘情况/场景]?」
    • ✅ 「规格说明书是否定义了[缺失方面]?」
  6. 结构参考:按照 .kittify/templates/checklist-template.md 中的标准模板生成检查清单,包括标题、元数据部分、类别标题和ID格式。如果模板不可用,使用:H1标题、目的/创建时间元数据行,以及包含 - [ ] CHK### <需求条目> 行的 ## 类别章节,ID从CHK001开始全局递增。
  7. 报告:输出创建的检查清单的完整路径、条目数量,并提醒用户每次运行都会创建新文件。总结:
    • 选择的焦点领域
    • 深度级别
    • 参与者/时机
    • 纳入的所有用户明确指定的必备项
重要提示:每次调用 /spec-kitty.checklist 命令都会使用简短的描述性名称创建检查清单文件(除非文件已存在)。这样做的好处:
  • 支持不同类型的多个检查清单(如 ux.md、test.md、security.md)
  • 文件名简单易记,可明确检查清单的用途
  • 便于在 checklists/ 文件夹中识别和导航
为避免混乱,请使用描述性类型,并在完成后清理过时的检查清单。

Example Checklist Types & Sample Items

检查清单类型示例及样例条目

UX Requirements Quality (ux.md)
Sample items (testing the requirements, NOT the implementation):
  • "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
  • "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
  • "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
  • "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
  • "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
  • "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
API Requirements Quality (api.md)
Sample items:
  • "Are error response formats specified for all failure scenarios? [Completeness]"
  • "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
  • "Are authentication requirements consistent across all endpoints? [Consistency]"
  • "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
  • "Is versioning strategy documented in requirements? [Gap]"
Performance Requirements Quality (performance.md)
Sample items:
  • "Are performance requirements quantified with specific metrics? [Clarity]"
  • "Are performance targets defined for all critical user journeys? [Coverage]"
  • "Are performance requirements under different load conditions specified? [Completeness]"
  • "Can performance requirements be objectively measured? [Measurability]"
  • "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
Security Requirements Quality (security.md)
Sample items:
  • "Are authentication requirements specified for all protected resources? [Coverage]"
  • "Are data protection requirements defined for sensitive information? [Completeness]"
  • "Is the threat model documented and requirements aligned to it? [Traceability]"
  • "Are security requirements consistent with compliance obligations? [Consistency]"
  • "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
UX需求质量(ux.md)
样例条目(测试需求,而非实现):
  • 「是否用可衡量的标准定义了视觉层级需求?[清晰度, Spec §FR-1]」
  • 「是否明确指定了UI元素的数量和位置?[完整性, Spec §FR-1]」
  • 「交互状态需求(悬停、聚焦、激活)是否一致定义?[一致性]」
  • 「是否为所有交互元素指定了无障碍需求?[覆盖范围, Gap]」
  • 「是否定义了图片加载失败时的回退(fallback)行为?[边缘情况, Gap]」
  • 「“突出显示”是否可客观衡量?[可衡量性, Spec §FR-4]」
API需求质量(api.md)
样例条目:
  • 「是否为所有失败场景指定了错误响应格式?[完整性]」
  • 「是否用具体阈值量化了速率限制需求?[清晰度]」
  • 「所有端点的认证需求是否一致?[一致性]」
  • 「是否为外部依赖定义了重试/超时需求?[覆盖范围, Gap]」
  • 「需求中是否文档化了版本控制策略?[Gap]」
性能需求质量(performance.md)
样例条目:
  • 「是否用具体指标量化了性能需求?[清晰度]」
  • 「是否为所有关键用户旅程定义了性能目标?[覆盖范围]」
  • 「是否指定了不同负载条件下的性能需求?[完整性]」
  • 「性能需求是否可客观衡量?[可衡量性]」
  • 「是否为高负载场景定义了性能降级需求?[边缘情况, Gap]」
安全需求质量(security.md)
样例条目:
  • 「是否为所有受保护资源指定了认证需求?[覆盖范围]」
  • 「是否为敏感信息定义了数据保护需求?[完整性]」
  • 「是否文档化了威胁模型,且需求与之对齐?[可追溯性]」
  • 「安全需求是否符合合规义务?[一致性]」
  • 「是否定义了安全故障/违规响应需求?[Gap, 异常流程]」

Anti-Examples: What NOT To Do

反例:切勿这样做

❌ WRONG - These test implementation, not requirements:
```markdown
- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
```
✅ CORRECT - These test requirements quality:
```markdown
- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
- [ ] CHK004 - Are the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
```
Key Differences:
  • Wrong: Tests if the system works correctly
  • Correct: Tests if the requirements are written correctly
  • Wrong: Verification of behavior
  • Correct: Validation of requirement quality
  • Wrong: "Does it do X?"
  • Correct: "Is X clearly specified?"
❌ 错误 - 这些测试实现,而非需求
```markdown
- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
```
✅ 正确 - 这些测试需求质量
```markdown
- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
- [ ] CHK004 - Are the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
```
核心差异
  • 错误:测试系统是否正常工作
  • 正确:测试需求是否撰写正确
  • 错误:验证行为
  • 正确:验证需求质量
  • 错误:“它是否能做X?”
  • 正确:“X是否被明确指定?”