task-quality-kpi
Task Quality KPI Framework
Overview
The Task Quality KPI Framework provides objective, quantitative metrics for evaluating task implementation quality.
Key Architecture: KPIs are auto-generated by a hook. You read the results; you never run the scripts yourself.
┌─────────────────────────────────────────────────────────────┐
│ HOOK (auto-executes) │
│ Trigger: PostToolUse on TASK-*.md │
│ Script: task-kpi-analyzer.py │
│ Output: TASK-XXX--kpi.json │
├─────────────────────────────────────────────────────────────┤
│ SKILL / AGENT (reads output) │
│ Input: TASK-XXX--kpi.json │
│ Action: Make evaluation decisions │
└─────────────────────────────────────────────────────────────┘
Why This Architecture?
| Problem | Solution |
|---|---|
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |
KPI File Location
After any task file modification, find KPI data at:
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
KPI Categories
┌─────────────────────────────────────────────────────────────┐
│ OVERALL SCORE (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Spec Compliance (30%) │
│ ├── Acceptance Criteria Met (0-10) │
│ ├── Requirements Coverage (0-10) │
│ └── No Scope Creep (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Code Quality (25%) │
│ ├── Static Analysis (0-10) │
│ ├── Complexity (0-10) │
│ └── Patterns Alignment (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Test Coverage (25%) │
│ ├── Unit Tests Present (0-10) │
│ ├── Test/Code Ratio (0-10) │
│ └── Coverage Percentage (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Contract Fulfillment (20%) │
│ ├── Provides Verified (0-10) │
│ └── Expects Satisfied (0-10) │
└─────────────────────────────────────────────────────────────┘
Category Weights
| Category | Weight | Why |
|---|---|---|
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
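To make the weighting concrete, here is a sketch of how the analyzer plausibly combines category scores; the hook's task-kpi-analyzer.py remains the source of truth.

```python
# Hypothetical reimplementation of the weighting table above (a sketch,
# not the analyzer's actual code).
WEIGHTS = {
    "Spec Compliance": 30,
    "Code Quality": 25,
    "Test Coverage": 25,
    "Contract Fulfillment": 20,
}

def overall_score(category_scores: dict) -> float:
    """Weighted sum of per-category 0-10 scores, rounded to one decimal."""
    return round(sum(category_scores[c] * w / 100 for c, w in WEIGHTS.items()), 1)

print(overall_score({"Spec Compliance": 8.5, "Code Quality": 8.0,
                     "Test Coverage": 8.0, "Contract Fulfillment": 8.2}))  # 8.2
```

Note that 8.5 × 30% contributes 2.55, which matches the weighted_score field that appears in generated KPI files.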
When to Use
- Reading KPI data for task quality evaluation
- Understanding quality metrics and scoring breakdown
- Deciding whether to iterate or approve based on quantitative data
- Integrating KPI checks into automated loops (agents_loop.py)
- Generating evidence-based evaluation reports
Instructions
1. Reading KPI Data (Primary Use)
DO NOT run scripts - read the auto-generated file:
```markdown
Read the KPI file:
docs/specs/001-feature/tasks/TASK-001--kpi.json
```
2. Understanding the Data
The KPI file contains:
```json
{
  "task_id": "TASK-001",
  "evaluated_at": "2026-01-15T10:30:00Z",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {
      "category": "Spec Compliance",
      "weight": 30,
      "score": 8.5,
      "weighted_score": 2.55,
      "metrics": {
        "acceptance_criteria_met": 9.0,
        "requirements_coverage": 8.0,
        "no_scope_creep": 8.5
      },
      "evidence": [
        "Acceptance criteria: 9/10 checked",
        "Requirements coverage: 8/10"
      ]
    }
  ],
  "recommendations": [
    "Code Quality: Moderate improvements possible"
  ],
  "summary": "Score: 8.2/10 - PASSED"
}
```
3. Making Decisions
Use overall_score and passed_threshold:
IF passed_threshold == true:
→ Task meets quality standards
→ Approve and proceed
IF passed_threshold == false:
→ Task needs improvement
→ Check recommendations for specific targets
→ Create fix specification
Integration with Workflow
In Task Review (evaluator-agent)
```markdown
Review Process
- Read KPI file: TASK-XXX--kpi.json
- Extract overall_score and kpi_scores
- Read task file to validate
- Generate evaluation report
- Decision based on passed_threshold
```
In agents_loop
```python
# Check KPI file exists
kpi_path = spec_path / "tasks" / f"{task_id}--kpi.json"
if kpi_path.exists():
    kpi_data = json.loads(kpi_path.read_text())
    if kpi_data["passed_threshold"]:
        # Quality threshold met
        advance_state("update_done")
    else:
        # Need more work
        fix_targets = kpi_data["recommendations"]
        create_fix_task(fix_targets)
        advance_state("fix")
else:
    # KPI not generated yet - task may not be implemented
    log_warning("No KPI data found")
```
Multi-Iteration Loop
Instead of capping at a maximum of 3 retries, iterate until the quality threshold is met:
Iteration 1: Score 6.2 → FAILED → Fix: Improve test coverage
Iteration 2: Score 7.1 → FAILED → Fix: Refactor complex functions
Iteration 3: Score 7.8 → PASSED → Proceed
Each iteration updates the KPI file automatically on task save.
Threshold Guidelines
| Score | Quality Level | Action |
|---|---|---|
| 9.0-10.0 | Exceptional | Approve, document best practices |
| 8.0-8.9 | Good | Approve with minor notes |
| 7.0-7.9 | Acceptable | Approve (if threshold 7.5) |
| 6.0-6.9 | Below Standard | Request specific improvements |
| < 6.0 | Poor | Significant rework required |
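The banding above translates directly into code; quality_level() is a possible helper, not part of the framework:

```python
def quality_level(score: float) -> str:
    """Map a 0-10 overall score to the quality levels in the table above."""
    if score >= 9.0:
        return "Exceptional"
    if score >= 8.0:
        return "Good"
    if score >= 7.0:
        return "Acceptable"
    if score >= 6.0:
        return "Below Standard"
    return "Poor"

print(quality_level(8.2))  # Good
```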
Recommended Thresholds
| Project Type | Threshold | Rationale |
|---|---|---|
| Production MVP | 8.0 | High quality required |
| Internal Tool | 7.0 | Good enough |
| Prototype | 6.0 | Functional over perfect |
| Critical System | 8.5 | No compromises |
Metric Details
Spec Compliance Metrics
Acceptance Criteria Met
- Calculates: (checked_criteria / total_criteria) * 10
- Source: Task file checkbox count
- Example: 9/10 checked = 9.0
Requirements Coverage
- Calculates: Count of REQ-IDs this task covers
- Source: traceability-matrix.md
- Example: 4 requirements covered = 8.0
No Scope Creep
- Calculates: (implemented_files / expected_files) * 10
- Source: Task "Files to Create" vs actual files
- Penalizes: Missing files or unexpected additions
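For example, the checkbox-based metric can be approximated by counting markdown task checkboxes; this is a sketch, and the hook's actual parsing may differ:

```python
import re

def acceptance_criteria_score(task_markdown: str) -> float:
    """(checked_criteria / total_criteria) * 10 from markdown checkboxes."""
    checked = len(re.findall(r"- \[[xX]\]", task_markdown))
    unchecked = len(re.findall(r"- \[ \]", task_markdown))
    total = checked + unchecked
    return round(checked / total * 10, 1) if total else 0.0

task = "\n".join(["- [x] criterion"] * 9 + ["- [ ] criterion"])
print(acceptance_criteria_score(task))  # 9.0
```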
Code Quality Metrics
Static Analysis
- Java: Maven Checkstyle
- TypeScript: ESLint
- Python: ruff
- Score: 10 if passes, 5 if issues found
Complexity
- Calculates: Functions >50 lines
- Score: 10 - (long_functions_ratio * 5)
- Penalizes: Large, complex functions
Patterns Alignment
- Checks: Knowledge Graph patterns
- Source: knowledge-graph.json
- Validates: Implementation follows project patterns
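The complexity metric can be sketched with the standard ast module; this approximates the rule described above and is not the analyzer's actual implementation:

```python
import ast

def complexity_score(source: str) -> float:
    """10 - (long_functions_ratio * 5), where a 'long' function exceeds 50 lines."""
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    if not funcs:
        return 10.0
    long_funcs = sum(1 for f in funcs if (f.end_lineno - f.lineno + 1) > 50)
    return round(10 - (long_funcs / len(funcs)) * 5, 1)
```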
Test Coverage Metrics
Unit Tests Present
- Calculates: min(10, test_files * 5)
- 2 test files = maximum score
- Penalizes: Missing tests
Test/Code Ratio
- Calculates: (test_count / code_count) * 10
- 1:1 ratio = 10/10
- Ideal: At least 1 test file per code file
Coverage Percentage
- Source: Coverage reports (JaCoCo, lcov, etc.)
- Calculates: coverage_percent / 10
- 80% coverage = 8.0
Contract Fulfillment Metrics
Provides Verified
- Checks: Files exist and export expected symbols
- Source: Task frontmatter provides
- Validates: Contract satisfied
Expects Satisfied
- Checks: Dependencies provide required files/symbols
- Source: Task frontmatter expects
- Validates: Prerequisites met
When KPI File is Missing
If TASK-XXX--kpi.json doesn't exist:
- Task was never modified - Hook runs on file save
- Hook failed - Check Claude Code logs
- Task is new - Save the file first to trigger hook
DO NOT try to calculate KPIs manually. The hook runs automatically when:
- Task file is saved (Write tool)
- Task file is edited (Edit tool)
Best Practices
1. Always Check KPI File Exists
Before evaluating:
```markdown
Check if KPI file exists:
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
If missing:
- Task may not be implemented yet
- Ask user to save the task file first
```
2. Trust the Metrics
The KPIs are objective. Only override with documented evidence:
- Critical security issue not in metrics
- Logic error not caught by static analysis
- Exceptional quality not measured
3. Iterate on Low KPIs
Target specific categories:
❌ "Fix code quality issues"
✅ "Improve Code Quality KPI from 5.2 to 7.0:
- Complexity: Refactor processData() (5→8)
- Patterns: Add error handling (6→8)"
4. Track KPI Trends
Monitor quality over time:
Sprint 1: Average KPI 6.8
Sprint 2: Average KPI 7.3 (+0.5)
Sprint 3: Average KPI 7.9 (+0.6)
Troubleshooting
KPI File Not Generated
Check:
- Hook enabled in hooks.json
- Task file name matches pattern TASK-*.md
- File was actually saved (not just viewed)
KPI Scores Seem Wrong
Validate:
- Check evidence field for data sources
- Verify files exist at expected paths
- Some metrics need build tools (Maven, npm)
Low Scores Despite Good Code
Possible causes:
- Missing test files
- No coverage report generated
- Acceptance criteria not checked
- Lint rules too strict
Fix the root cause, not just the score.
Examples
Example 1: Reading KPI Data
```markdown
Read the KPI file to evaluate task quality:
docs/specs/001-feature/tasks/TASK-042--kpi.json
Based on the data:
- Overall score: 6.8/10 (below threshold)
- Lowest KPI: Test Coverage (5.0/10)
- Recommendation: Add unit tests
Decision: REQUEST FIXES - target Test Coverage improvement
```
Example 2: Iteration Decision
```markdown
Iteration 1 KPI: Score 6.2 → FAILED
- Spec Compliance: 7.0 ✓
- Code Quality: 5.5 ✗
- Test Coverage: 6.0 ✗
Fix targets:
1. Refactor complex functions (Code Quality)
2. Add test coverage (Test Coverage)
Iteration 2 KPI: Score 7.8 → PASSED ✓
```
Example 3: agents_loop Integration
```python
# In agents_loop, after the implementation step
kpi_file = spec_dir / "tasks" / f"{task_id}--kpi.json"
if kpi_file.exists():
    kpi = json.loads(kpi_file.read_text())
    if kpi["passed_threshold"]:
        print(f"✅ Task passed quality check: {kpi['overall_score']}/10")
        advance_state("update_done")
    else:
        print(f"❌ Task failed quality check: {kpi['overall_score']}/10")
        print("Recommendations:")
        for rec in kpi["recommendations"]:
            print(f"  - {rec}")
        advance_state("fix")
```
References
- evaluator-agent.md - Agent that uses KPI data for evaluation
- hooks.json - Hook configuration for auto-generation
- task-kpi-analyzer.py - Hook script (do not execute directly)
- agents_loop.py - Orchestrator that reads KPI for decisions