task-quality-kpi

Task Quality KPI Framework

Overview

The Task Quality KPI Framework provides objective, quantitative metrics for evaluating task implementation quality.
Key architecture: a hook generates the KPIs automatically, so you read the results instead of running scripts.
┌─────────────────────────────────────────────────────────────┐
│  HOOK (auto-executes)                                       │
│  Trigger: PostToolUse on TASK-*.md                          │
│  Script: task-kpi-analyzer.py                               │
│  Output: TASK-XXX--kpi.json                                 │
├─────────────────────────────────────────────────────────────┤
│  SKILL / AGENT (reads output)                               │
│  Input: TASK-XXX--kpi.json                                  │
│  Action: Make evaluation decisions                          │
└─────────────────────────────────────────────────────────────┘

Why This Architecture?

| Problem | Solution |
| --- | --- |
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |

KPI File Location

After any task file modification, find KPI data at:
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
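As a rough sketch, the path can be assembled from a spec ID and a task ID. The helper name `kpi_path_for` and its `root` parameter are hypothetical and used only for illustration; the directory layout is the one shown above.

```python
from pathlib import Path


def kpi_path_for(spec_id: str, task_id: str, root: Path = Path("docs/specs")) -> Path:
    """Build the expected KPI file path for a task (hypothetical helper)."""
    return root / spec_id / "tasks" / f"{task_id}--kpi.json"


print(kpi_path_for("001-feature", "TASK-001").as_posix())
```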

KPI Categories

┌─────────────────────────────────────────────────────────────┐
│                    OVERALL SCORE (0-10)                     │
├─────────────────────────────────────────────────────────────┤
│  Spec Compliance (30%)                                      │
│  ├── Acceptance Criteria Met (0-10)                         │
│  ├── Requirements Coverage (0-10)                           │
│  └── No Scope Creep (0-10)                                  │
├─────────────────────────────────────────────────────────────┤
│  Code Quality (25%)                                         │
│  ├── Static Analysis (0-10)                                 │
│  ├── Complexity (0-10)                                      │
│  └── Patterns Alignment (0-10)                              │
├─────────────────────────────────────────────────────────────┤
│  Test Coverage (25%)                                        │
│  ├── Unit Tests Present (0-10)                              │
│  ├── Test/Code Ratio (0-10)                                 │
│  └── Coverage Percentage (0-10)                             │
├─────────────────────────────────────────────────────────────┤
│  Contract Fulfillment (20%)                                 │
│  ├── Provides Verified (0-10)                               │
│  └── Expects Satisfied (0-10)                               │
└─────────────────────────────────────────────────────────────┘

Category Weights

| Category | Weight | Why |
| --- | --- | --- |
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
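The weights above suggest that the overall score is a weighted average of the four category scores. The sketch below assumes exactly that; `CATEGORY_WEIGHTS` and `overall_score` are hypothetical names, and the real `task-kpi-analyzer.py` may combine or round the values differently.

```python
# Hypothetical sketch: combine category scores using the documented weights.
CATEGORY_WEIGHTS = {
    "Spec Compliance": 30,
    "Code Quality": 25,
    "Test Coverage": 25,
    "Contract Fulfillment": 20,
}


def overall_score(category_scores: dict) -> float:
    """Return the weighted 0-10 score, rounded to one decimal place."""
    total = sum(
        category_scores[name] * weight / 100
        for name, weight in CATEGORY_WEIGHTS.items()
    )
    return round(total, 1)


example = {
    "Spec Compliance": 8.5,   # contributes 8.5 * 0.30 = 2.55
    "Code Quality": 7.0,      # contributes 1.75
    "Test Coverage": 8.0,     # contributes 2.00
    "Contract Fulfillment": 9.0,  # contributes 1.80
}
print(overall_score(example))
```

Because Spec Compliance carries the largest weight, a one-point change there moves the overall score by 0.3, versus 0.2 for Contract Fulfillment.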

When to Use

  • Reading KPI data for task quality evaluation
  • Understanding quality metrics and scoring breakdown
  • Deciding whether to iterate or approve based on quantitative data
  • Integrating KPI checks into automated loops (agents_loop.py)
  • Generating evidence-based evaluation reports

Instructions

1. Reading KPI Data (Primary Use)

DO NOT run scripts - read the auto-generated file:
```markdown
Read the KPI file:
  docs/specs/001-feature/tasks/TASK-001--kpi.json
```

2. Understanding the Data

The KPI file contains:
```json
{
  "task_id": "TASK-001",
  "evaluated_at": "2026-01-15T10:30:00Z",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {
      "category": "Spec Compliance",
      "weight": 30,
      "score": 8.5,
      "weighted_score": 2.55,
      "metrics": {
        "acceptance_criteria_met": 9.0,
        "requirements_coverage": 8.0,
        "no_scope_creep": 8.5
      },
      "evidence": [
        "Acceptance criteria: 9/10 checked",
        "Requirements coverage: 8/10"
      ]
    }
  ],
  "recommendations": [
    "Code Quality: Moderate improvements possible"
  ],
  "summary": "Score: 8.2/10 - PASSED"
}
```
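One way to consume this file from Python is to parse it into a small typed record. The sketch below is illustrative only: `KpiSummary`, `parse_kpi` and `load_kpi` are hypothetical names, and it reads just the top-level fields shown in the example above.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class KpiSummary:
    """Top-level fields of a KPI file (hypothetical container)."""
    task_id: str
    overall_score: float
    threshold: float
    passed_threshold: bool
    recommendations: list


def parse_kpi(raw: str) -> KpiSummary:
    """Parse the JSON text of a KPI file into a KpiSummary."""
    data = json.loads(raw)
    return KpiSummary(
        task_id=data["task_id"],
        overall_score=float(data["overall_score"]),
        threshold=float(data["threshold"]),
        passed_threshold=bool(data["passed_threshold"]),
        # Treat a missing recommendations list as empty (an assumption).
        recommendations=list(data.get("recommendations", [])),
    )


def load_kpi(path: Path) -> KpiSummary:
    """Read and parse a KPI file from disk."""
    return parse_kpi(path.read_text(encoding="utf-8"))
```

Keeping parsing separate from file access makes the parser easy to exercise with an inline JSON string.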

3. Making Decisions

Use overall_score and passed_threshold:
IF passed_threshold == true:
  → Task meets quality standards
  → Approve and proceed

IF passed_threshold == false:
  → Task needs improvement
  → Check recommendations for specific targets
  → Create fix specification

Integration with Workflow

In Task Review (evaluator-agent)

Review Process

  1. Read KPI file: TASK-XXX--kpi.json
  2. Extract overall_score and kpi_scores
  3. Read task file to validate
  4. Generate evaluation report
  5. Decision based on passed_threshold

In agents_loop

```python
import json

# Check KPI file exists.
# spec_path, task_id, advance_state, create_fix_task and log_warning
# are provided by the surrounding agents_loop code.
kpi_path = spec_path / "tasks" / f"{task_id}--kpi.json"
if kpi_path.exists():
    kpi_data = json.loads(kpi_path.read_text())
    if kpi_data["passed_threshold"]:
        # Quality threshold met
        advance_state("update_done")
    else:
        # Need more work
        fix_targets = kpi_data["recommendations"]
        create_fix_task(fix_targets)
        advance_state("fix")
else:
    # KPI not generated yet - task may not be implemented
    log_warning("No KPI data found")
```

Multi-Iteration Loop

Instead of stopping after a maximum of 3 retries, iterate until the quality threshold is met:
Iteration 1: Score 6.2 → FAILED → Fix: Improve test coverage
Iteration 2: Score 7.1 → FAILED → Fix: Refactor complex functions  
Iteration 3: Score 7.8 → PASSED → Proceed
Each iteration updates the KPI file automatically on task save.
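The loop above can be sketched as a scan over successive scores. This is a simplified illustration that assumes a task passes when its score is at or above the threshold; in practice each score would come from re-reading the regenerated KPI file, and `iterations_to_pass` is a hypothetical name.

```python
from typing import Optional


def iterations_to_pass(scores: list, threshold: float = 7.5) -> Optional[int]:
    """Return the 1-based iteration that first meets the threshold, or None."""
    for iteration, score in enumerate(scores, start=1):
        passed = score >= threshold
        print(f"Iteration {iteration}: Score {score} -> {'PASSED' if passed else 'FAILED'}")
        if passed:
            return iteration
    return None


iterations_to_pass([6.2, 7.1, 7.8])
```

A real orchestrator would probably still want some upper bound or a stall check, so that a task whose score never improves does not loop forever.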

Threshold Guidelines

| Score | Quality Level | Action |
| --- | --- | --- |
| 9.0-10.0 | Exceptional | Approve, document best practices |
| 8.0-8.9 | Good | Approve with minor notes |
| 7.0-7.9 | Acceptable | Approve if the score meets the configured threshold (e.g. 7.5) |
| 6.0-6.9 | Below Standard | Request specific improvements |
| < 6.0 | Poor | Significant rework required |
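The bands in this table can be expressed as a simple lookup. The sketch assumes each band is inclusive of its lower bound, so a score such as 8.95 falls into "Good"; `quality_level` is a hypothetical helper name.

```python
def quality_level(score: float) -> str:
    """Map a 0-10 score to the quality level named in the table."""
    if score >= 9.0:
        return "Exceptional"
    if score >= 8.0:
        return "Good"
    if score >= 7.0:
        return "Acceptable"
    if score >= 6.0:
        return "Below Standard"
    return "Poor"


print(quality_level(7.8))
```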

Recommended Thresholds

| Project Type | Threshold | Rationale |
| --- | --- | --- |
| Production MVP | 8.0 | High quality required |
| Internal Tool | 7.0 | Good enough |
| Prototype | 6.0 | Functional over perfect |
| Critical System | 8.5 | No compromises |

Metric Details

Spec Compliance Metrics

Acceptance Criteria Met
  • Calculates: (checked_criteria / total_criteria) * 10
  • Source: Task file checkbox count
  • Example: 9/10 checked = 9.0
Requirements Coverage
  • Calculates: Count of REQ-IDs this task covers
  • Source: traceability-matrix.md
  • Example: 4 requirements covered = 8.0
No Scope Creep
  • Calculates: (implemented_files / expected_files) * 10
  • Source: Task "Files to Create" vs actual files
  • Penalizes: Missing files or unexpected additions
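To make the Acceptance Criteria Met formula concrete, here is a minimal sketch that counts checkboxes in a task file's text. It assumes GitHub-style `- [x]` / `- [ ]` checkboxes and returns 0.0 when no checkboxes are found; both choices, and the name `acceptance_criteria_score`, are assumptions rather than the hook's documented behavior.

```python
import re

# Matches "- [ ]", "- [x]" or "* [X]" at the start of a line (assumed format).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]", re.MULTILINE)


def acceptance_criteria_score(task_markdown: str) -> float:
    """Return (checked_criteria / total_criteria) * 10, rounded to one decimal."""
    marks = CHECKBOX.findall(task_markdown)
    if not marks:
        return 0.0  # assumption: no checkboxes means nothing was verified
    checked = sum(1 for mark in marks if mark.lower() == "x")
    return round(checked / len(marks) * 10, 1)


sample = "\n".join(["- [x] criterion"] * 9 + ["- [ ] criterion"])
print(acceptance_criteria_score(sample))
```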

Code Quality Metrics

Static Analysis
  • Java: Maven Checkstyle
  • TypeScript: ESLint
  • Python: ruff
  • Score: 10 if passes, 5 if issues found
Complexity
  • Calculates: Functions >50 lines
  • Score: 10 - (long_functions_ratio * 5)
  • Penalizes: Large, complex functions
Patterns Alignment
  • Checks: Knowledge Graph patterns
  • Source: knowledge-graph.json
  • Validates: Implementation follows project patterns
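The Static Analysis and Complexity scores above can be sketched as two small functions. The sketch assumes `long_functions_ratio` is the share of functions longer than 50 lines and that a task with no functions scores 10; the function names are hypothetical.

```python
def static_analysis_score(passed: bool) -> float:
    """10 if the linter passes, 5 if it reports issues."""
    return 10.0 if passed else 5.0


def complexity_score(function_lengths: list, max_lines: int = 50) -> float:
    """Return 10 - (long_functions_ratio * 5) for the given function lengths."""
    if not function_lengths:
        return 10.0  # assumption: nothing to penalize
    long_count = sum(1 for length in function_lengths if length > max_lines)
    ratio = long_count / len(function_lengths)
    return round(10 - ratio * 5, 1)


# Two of four functions exceed 50 lines, so the ratio is 0.5.
print(complexity_score([10, 60, 20, 80]))
```

Under this formula the Complexity score never drops below 5.0, even when every function is long.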

Test Coverage Metrics

Unit Tests Present
  • Calculates: min(10, test_files * 5)
  • 2 test files = maximum score
  • Penalizes: Missing tests
Test/Code Ratio
  • Calculates: (test_count / code_count) * 10
  • 1:1 ratio = 10/10
  • Ideal: At least 1 test file per code file
Coverage Percentage
  • Source: Coverage reports (JaCoCo, lcov, etc.)
  • Calculates: coverage_percent / 10
  • 80% coverage = 8.0
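The three formulas above translate directly into code. In this sketch the ratio score is capped at 10 and a task with no code files scores 0; both are assumptions, since the text only states that a 1:1 ratio earns 10/10. The function names are hypothetical.

```python
def unit_tests_present(test_files: int) -> float:
    """min(10, test_files * 5): two or more test files earn the maximum."""
    return float(min(10, test_files * 5))


def ratio_score(test_count: int, code_count: int) -> float:
    """(test_count / code_count) * 10, capped at 10 (the cap is an assumption)."""
    if code_count == 0:
        return 0.0  # assumption: no code files means nothing to compare against
    return round(min(10.0, test_count / code_count * 10), 1)


def coverage_score(coverage_percent: float) -> float:
    """coverage_percent / 10, so 80% coverage gives 8.0."""
    return round(coverage_percent / 10, 1)


print(unit_tests_present(2), ratio_score(3, 3), coverage_score(80))
```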

Contract Fulfillment Metrics

Provides Verified
  • Checks: Files exist and export expected symbols
  • Source: Task provides frontmatter
  • Validates: Contract satisfied
Expects Satisfied
  • Checks: Dependencies provide required files/symbols
  • Source: Task expects frontmatter
  • Validates: Prerequisites met

When KPI File is Missing

If TASK-XXX--kpi.json doesn't exist, the likely causes are:
  1. Task was never modified - the hook runs only on file save
  2. Hook failed - check the Claude Code logs
  3. Task is new - save the file first to trigger the hook
Do not try to calculate KPIs manually. The hook runs automatically when:
  • Task file is saved (Write tool)
  • Task file is edited (Edit tool)

Best Practices

最佳实践

1. Always Check KPI File Exists

Before evaluating:
```markdown
Check if KPI file exists:
  docs/specs/[ID]/tasks/TASK-XXX--kpi.json

If missing:
  - Task may not be implemented yet
  - Ask user to save the task file first
```

2. Trust the Metrics

The KPIs are objective. Override them only with documented evidence, such as:
  • Critical security issue not in metrics
  • Logic error not caught by static analysis
  • Exceptional quality not measured

3. Iterate on Low KPIs

Target specific categories:
❌ "Fix code quality issues"
✅ "Improve Code Quality KPI from 5.2 to 7.0:
    - Complexity: Refactor processData() (5→8)
    - Patterns: Add error handling (6→8)"
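A targeted fix request like the one above could be derived from the `kpi_scores` list in the KPI file. The sketch below picks the lowest-scoring category and formats a target; `lowest_category`, `fix_target` and the default goal of 7.0 are hypothetical choices for illustration.

```python
def lowest_category(kpi_scores: list) -> dict:
    """Return the kpi_scores entry with the lowest score."""
    return min(kpi_scores, key=lambda entry: entry["score"])


def fix_target(kpi_scores: list, goal: float = 7.0) -> str:
    """Phrase a specific improvement target for the weakest category."""
    worst = lowest_category(kpi_scores)
    return f'Improve {worst["category"]} KPI from {worst["score"]} to {goal}'


scores = [
    {"category": "Spec Compliance", "score": 7.0},
    {"category": "Code Quality", "score": 5.2},
    {"category": "Test Coverage", "score": 6.0},
]
print(fix_target(scores))
```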

4. Track KPI Trends

Monitor quality over time:
Sprint 1: Average KPI 6.8
Sprint 2: Average KPI 7.3 (+0.5)
Sprint 3: Average KPI 7.9 (+0.6)
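The sprint-over-sprint deltas shown above are just differences between consecutive averages. A minimal sketch, with `sprint_deltas` as a hypothetical helper name:

```python
def sprint_deltas(averages: list) -> list:
    """Return the change in average KPI between consecutive sprints."""
    return [round(later - earlier, 1) for earlier, later in zip(averages, averages[1:])]


print(sprint_deltas([6.8, 7.3, 7.9]))
```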

Troubleshooting

KPI File Not Generated

Check:
  1. Hook is enabled in hooks.json
  2. Task file name matches the pattern TASK-*.md
  3. File was actually saved (not just viewed)
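The file name check in step 2 can be approximated locally. This sketch assumes the hook's matcher behaves like a case-sensitive shell glob applied to the file name alone, which may not be exactly how the hook configuration matches paths; `triggers_hook` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def triggers_hook(path: str) -> bool:
    """Return True if the file name matches the TASK-*.md pattern."""
    return fnmatchcase(PurePosixPath(path).name, "TASK-*.md")


print(triggers_hook("docs/specs/001-feature/tasks/TASK-001.md"))
```

Under this assumption the generated `TASK-XXX--kpi.json` file does not match the pattern, so writing it would not trigger the hook again.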

KPI Scores Seem Wrong

Validate:
  1. Check evidence field for data sources
  2. Verify files exist at expected paths
  3. Some metrics need build tools (Maven, npm)

Low Scores Despite Good Code

Possible causes:
  • Missing test files
  • No coverage report generated
  • Acceptance criteria not checked
  • Lint rules too strict
Fix the root cause, not just the score.

Examples

Example 1: Reading KPI Data

```markdown
Read the KPI file to evaluate task quality:
  docs/specs/001-feature/tasks/TASK-042--kpi.json

Based on the data:
- Overall score: 6.8/10 (below threshold)
- Lowest KPI: Test Coverage (5.0/10)
- Recommendation: Add unit tests

Decision: REQUEST FIXES - target Test Coverage improvement
```

Example 2: Iteration Decision

```markdown
Iteration 1 KPI: Score 6.2 → FAILED
- Spec Compliance: 7.0 ✓
- Code Quality: 5.5 ✗
- Test Coverage: 6.0 ✗

Fix targets:
1. Refactor complex functions (Code Quality)
2. Add test coverage (Test Coverage)

Iteration 2 KPI: Score 7.8 → PASSED ✓
```

Example 3: agents_loop Integration

```python
import json

# In agents_loop, after implementation step.
# spec_dir, task_id and advance_state are provided by the surrounding loop.
kpi_file = spec_dir / "tasks" / f"{task_id}--kpi.json"
if kpi_file.exists():
    kpi = json.loads(kpi_file.read_text())
    if kpi["passed_threshold"]:
        print(f"✅ Task passed quality check: {kpi['overall_score']}/10")
        advance_state("update_done")
    else:
        print(f"❌ Task failed quality check: {kpi['overall_score']}/10")
        print("Recommendations:")
        for rec in kpi["recommendations"]:
            print(f"  - {rec}")
        advance_state("fix")
```

References

  • evaluator-agent.md - Agent that uses KPI data for evaluation
  • hooks.json - Hook configuration for auto-generation
  • task-kpi-analyzer.py - Hook script (do not execute directly)
  • agents_loop.py - Orchestrator that reads KPI for decisions