quality-capture-baseline

Capture Quality Baseline

Purpose

Establish a quality metrics baseline at the start of a feature or refactor so that subsequent quality gate runs can detect regressions. This skill is a critical component of the project's regression detection strategy.

When to Use

Use this skill when:
  • Starting any new feature (before writing code)
  • Beginning refactor work (to document pre-state)
  • After major changes (dependency upgrades, architecture shifts)
  • When @planner creates a new todo.md
  • When @implementer starts task 0 of a feature
  • User asks to "capture baseline" or "establish baseline metrics"
  • Before major architectural changes to track impact

Quick Start

When to invoke this skill:
  • ✅ At the START of any new feature (before writing code)
  • ✅ Before beginning refactor work (to document pre-state)
  • ✅ After major changes (dependency upgrades, architecture shifts)
  • ❌ NOT after you've already started coding (too late)
Basic usage:
1. Run quality checks: ./scripts/check_all.sh
2. Parse metrics (tests, coverage, type errors, linting, dead code)
3. Create memory entity: baseline_{feature}_{date}
4. Return baseline reference for future comparison

Instructions

Step 1: Prepare Context

Before running quality checks, gather:
  • Feature name or refactor identifier (e.g., "auth", "service-result-migration")
  • Current git commit hash: `git rev-parse HEAD`
  • Current timestamp: `date +"%Y-%m-%d %H:%M:%S"`
Example:
```bash
FEATURE="user_profile"
GIT_COMMIT=$(git rev-parse HEAD)
TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
DATE=$(date +"%Y-%m-%d")
```

Step 2: Run Quality Checks

Primary method (recommended):
```bash
./scripts/check_all.sh
```
Fallback (if check_all.sh fails):
```bash
# Run individual checks
uv run pytest tests/ -v --cov=src --cov-report=term-missing
uv run pyright src/
uv run ruff check src/
uv run vulture src/
```

**Capture output:**
- Store stdout and stderr
- Measure execution time
- Note any failures (document them, don't skip)
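
One way to capture stdout, stderr, and execution time for a single check is sketched below in Python. It is an illustration only, not the skill's actual implementation: `CheckResult` and `run_check` are hypothetical names, and recording exit code 127 for a missing command is an assumption borrowed from shell convention.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    """Captured result of one quality-check command (hypothetical helper type)."""
    command: List[str]
    returncode: int
    stdout: str
    stderr: str
    seconds: float


def run_check(command: List[str]) -> CheckResult:
    """Run one command and keep its stdout, stderr, exit code, and wall time."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(command, capture_output=True, text=True, check=False)
        code, out, err = proc.returncode, proc.stdout, proc.stderr
    except OSError as exc:
        # Command missing or not executable: record the failure instead of skipping it.
        code, out, err = 127, "", str(exc)
    elapsed = round(time.perf_counter() - start, 1)
    return CheckResult(command, code, out, err, elapsed)


# Example: run a trivial command through the same path a real check would take.
demo = run_check([sys.executable, "-c", "print('ok')"])
print(demo.returncode, demo.stdout.strip(), f"{demo.seconds}s")
```

A failing check still yields a `CheckResult`, which matches the guidance to document failures rather than skip them.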

Step 3: Parse Metrics

Use the parsing patterns from `references/parsing.md`. Quick reference:

**Tests:**
```bash
# Pattern: "X passed" or "X failed" or "X skipped"
PASSED=$(grep -oE "[0-9]+ passed" output.txt | grep -oE "[0-9]+")
FAILED=$(grep -oE "[0-9]+ failed" output.txt | grep -oE "[0-9]+")
SKIPPED=$(grep -oE "[0-9]+ skipped" output.txt | grep -oE "[0-9]+")
```

**Coverage:**
```bash
# Pattern: "TOTAL ... X%"
# Strip the % sign, because the observation template in Step 4 appends it again.
COVERAGE=$(grep "TOTAL" output.txt | grep -oE "[0-9]+%" | tail -1 | tr -d '%')
```

**Type Errors:**
```bash
# Pattern: "X errors, Y warnings" or "0 errors"
TYPE_ERRORS=$(grep -oE "[0-9]+ error" output.txt | grep -oE "[0-9]+" | head -1)
```

**Linting:**
```bash
# Pattern: "Found X errors"
LINTING_ERRORS=$(grep -oE "Found [0-9]+ error" output.txt | grep -oE "[0-9]+")
```

**Dead Code:**
```bash
# Pattern: Count vulture output lines or "X% unused"
DEAD_CODE=$(vulture src/ 2>&1 | grep -v "^$" | wc -l | xargs)
```

**Execution Time:**
```bash
# Measure with time command or calculate from timestamps
EXEC_TIME=$(echo "scale=1; $END_TIME - $START_TIME" | bc)
```
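
The same patterns can be sketched in Python, which makes it easier to default a missing value to zero instead of leaving it empty. This is a minimal sketch under assumptions: `first_int`, `parse_metrics`, and `count_dead_code` are hypothetical helpers, and the sample text only approximates typical pytest, pyright, and ruff summary lines.

```python
import re
from typing import Optional


def first_int(pattern: str, text: str, default: int = 0) -> int:
    """Return the first captured integer for pattern, or default when absent."""
    match = re.search(pattern, text)
    return int(match.group(1)) if match else default


def parse_metrics(output: str) -> dict:
    """Extract baseline metrics from the combined quality-check output."""
    coverage: Optional[int] = None
    for line in output.splitlines():
        if line.startswith("TOTAL"):
            percents = re.findall(r"(\d+)%", line)
            if percents:
                coverage = int(percents[-1])  # last percentage on the TOTAL row
    return {
        "passed": first_int(r"(\d+) passed", output),
        "failed": first_int(r"(\d+) failed", output),
        "skipped": first_int(r"(\d+) skipped", output),
        "coverage": coverage,
        "type_errors": first_int(r"(\d+) errors?, \d+ warnings?", output),
        "linting_errors": first_int(r"Found (\d+) errors?", output),
    }


def count_dead_code(vulture_output: str) -> int:
    """Count non-empty lines of vulture output, one per reported item."""
    return sum(1 for line in vulture_output.splitlines() if line.strip())


# Illustrative sample only; real tool output may be formatted differently.
SAMPLE_OUTPUT = "\n".join([
    "===== 145 passed, 2 skipped in 6.21s =====",
    "TOTAL    1200    156    87%",
    "3 errors, 1 warning, 0 informations",
    "Found 2 errors.",
])
metrics = parse_metrics(SAMPLE_OUTPUT)
print(metrics)
```

Leaving `coverage` as `None` when no `TOTAL` row is found keeps "not measured" distinguishable from a real 0%.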

Step 4: Create Memory Entity

Entity structure:
```yaml
name: baseline_{feature}_{date}
type: quality_baseline
observations:
  - "Tests: {passed} passed, {failed} failed, {skipped} skipped"
  - "Coverage: {coverage}%"
  - "Type errors: {type_errors}"
  - "Linting errors: {linting_errors}"
  - "Dead code: {dead_code} items"
  - "Execution time: {exec_time}s"
  - "Git commit: {git_hash}"
  - "Timestamp: {timestamp}"
  - "Feature: {feature_name}"
```
Example:
```python
mcp__memory__create_entities({
    "entities": [{
        "name": f"baseline_{feature}_{date}",
        "type": "quality_baseline",
        "observations": [
            f"Tests: {passed} passed, {failed} failed, {skipped} skipped",
            f"Coverage: {coverage}%",
            f"Type errors: {type_errors}",
            f"Linting errors: {linting_errors}",
            f"Dead code: {dead_code} items",
            f"Execution time: {exec_time}s",
            f"Git commit: {git_commit}",
            f"Timestamp: {timestamp}",
            f"Feature: {feature}"
        ]
    }]
})
```
If pre-existing issues exist, add an additional observation:
"Pre-existing issues: 3 type errors in legacy code (documented, will not be fixed)"

Step 5: Return Baseline Reference

Success output format:
```
✅ Baseline captured: baseline_{feature}_{date}

Metrics:
- Tests: {passed} passed ({coverage}% coverage)
- Type safety: {status} ({type_errors} errors)
- Code quality: {status} ({linting_errors} linting, {dead_code} dead code items)
- Execution: {exec_time}s
- Git: {git_hash}

Use this baseline for regression detection.
```
With pre-existing issues:
```
✅ Baseline captured: baseline_{feature}_{date}

Metrics:
- Tests: 152 passed (89% coverage)
- Type safety: 3 errors (pre-existing, documented)
- Code quality: Clean
- Git: abc123f

⚠️  Note: 3 pre-existing type errors documented, will not cause regression failures.

Use this baseline for regression detection.
```

Agent Integration

@planner Usage

When: At plan creation, BEFORE creating todo.md
Workflow:
  1. Receive feature request from user
  2. Analyze requirements
  3. → Invoke capture-quality-baseline
  4. Receive baseline reference
  5. Create todo.md with baseline in header
  6. Create memory entities linking to baseline
Example:
```markdown
# Todo: User Profile Feature

Baseline: baseline_user_profile_2025-10-16
Created: 2025-10-16

## Tasks

- Task 1 (references baseline for quality gates)
...
```

@implementer Usage

When: At feature start (task 0) or before refactor work
Workflow:
  1. Validate ADR (if refactor) using validate-refactor-adr
  2. → Invoke capture-quality-baseline
  3. Receive baseline reference
  4. Reference baseline in quality gate runs
  5. Compare final metrics against baseline
Example:
User: "Start implementing authentication"

@implementer:
1. Invoke capture-quality-baseline
2. Receives: baseline_auth_2025-10-16
3. Begin task 1 implementation
4. After each task: Run quality gates, compare to baseline
5. Report: "All metrics maintained or improved from baseline"
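
The "compare to baseline" step in the workflow above could look roughly like the following Python sketch. It is not the run-quality-gates implementation: `find_regressions` is a hypothetical helper, and which metrics count as "higher is better" or "lower is better" is an assumption. A project may well treat a lower test count as acceptable, for example after removing deprecated tests.

```python
from typing import Dict, List


def find_regressions(baseline: Dict[str, int], current: Dict[str, int]) -> List[str]:
    """List the metrics that got worse relative to the baseline."""
    regressions = []
    # Assumption: a drop in these metrics is a regression.
    for key in ("passed", "coverage"):
        if current[key] < baseline[key]:
            regressions.append(f"{key}: {baseline[key]} -> {current[key]}")
    # Assumption: a rise in these metrics is a regression.
    for key in ("failed", "type_errors", "linting_errors", "dead_code"):
        if current[key] > baseline[key]:
            regressions.append(f"{key}: {baseline[key]} -> {current[key]}")
    return regressions


baseline = {"passed": 152, "failed": 0, "coverage": 89,
            "type_errors": 3, "linting_errors": 0, "dead_code": 0}
current = dict(baseline, coverage=85, type_errors=4)
print(find_regressions(baseline, current))
```

Because the comparison is relative, the 3 pre-existing type errors recorded in the baseline do not register as a regression unless the count rises above 3.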

@statuser Usage

When: User requests baseline recapture or progress check
Workflow:
  1. Receive progress check request
  2. If major changes detected: → Invoke capture-quality-baseline
  3. Compare new baseline to original
  4. Report delta and recommend re-baseline if appropriate
Example:
User: "Check progress after dependency upgrade"

@statuser:
1. Detects major change (dependency upgrade)
2. Invokes capture-quality-baseline
3. Receives: baseline_post_upgrade_2025-10-16
4. Compares to original baseline
5. Reports: "2 tests removed (deprecated APIs), coverage maintained, type errors resolved"

Edge Cases

Quality Checks Fail

Scenario: Some checks fail during baseline capture
Handling:
  1. Still capture the baseline (document failures)
  2. Add observation: "Baseline captured WITH failures"
  3. Flag in output: "⚠️ Baseline has failures - must fix before feature work"
  4. Return baseline reference (still usable for comparison)
Example:
✅ Baseline captured: baseline_auth_2025-10-16

⚠️  WARNING: Baseline has failures

Metrics:
- Tests: 140 passed, 5 FAILED, 2 skipped
- Type safety: 2 errors
- Code quality: 3 linting errors

🛑 FIX these failures before starting feature work.

No Git Repository

Scenario: Not in a git repository (or git not available)
Handling:
  1. Skip git commit capture
  2. Continue with other metrics
  3. Add observation: "No git commit (not in repository)"
  4. Note in output: "Git: Not available"
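
A hedged Python sketch of this fallback follows. `current_git_commit` and `git_observation` are hypothetical helper names. The only external command used is `git rev-parse HEAD`, as in Step 1.

```python
import subprocess
from typing import Optional


def current_git_commit(cwd: str = ".") -> Optional[str]:
    """Return the HEAD commit hash, or None when git or the repository is unavailable."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=cwd, capture_output=True, text=True, check=False,
        )
    except OSError:
        # git is not installed, or the working directory cannot be entered.
        return None
    if proc.returncode != 0:
        # Not inside a repository, or the repository has no commits yet.
        return None
    return proc.stdout.strip()


def git_observation(commit: Optional[str]) -> str:
    """Build the observation line recorded in the baseline entity."""
    return f"Git commit: {commit}" if commit else "No git commit (not in repository)"


print(git_observation(current_git_commit()))
```

Returning `None` instead of raising lets the remaining metrics be captured, as step 2 of the handling requires.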

Script Not Found

Scenario: `./scripts/check_all.sh` doesn't exist
Handling:
  1. Try individual checks (pytest, pyright, ruff, vulture)
  2. If any work: Use those results
  3. If none work: Report error and abort
Error message:
❌ Cannot capture baseline: Quality check scripts not found

Tried:
- ./scripts/check_all.sh (not found)
- uv run pytest (not found)
- uv run pyright (not found)

Cannot establish baseline without quality checks.
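
The selection logic above might be sketched as follows. This simplifies the handling steps: it only checks that `uv` is on the PATH rather than trying each tool in turn, and `select_checks`, `CHECK_SCRIPT`, and `INDIVIDUAL_CHECKS` are hypothetical names introduced for illustration.

```python
import os
import shutil

CHECK_SCRIPT = "./scripts/check_all.sh"
# Individual commands taken from the fallback list in Step 2.
INDIVIDUAL_CHECKS = [
    ["uv", "run", "pytest", "tests/", "-v", "--cov=src", "--cov-report=term-missing"],
    ["uv", "run", "pyright", "src/"],
    ["uv", "run", "ruff", "check", "src/"],
    ["uv", "run", "vulture", "src/"],
]


def select_checks(script=CHECK_SCRIPT, which=shutil.which):
    """Pick the commands to run: the script if present, else the individual checks."""
    if os.path.isfile(script):
        return [[script]]
    if which("uv") is not None:
        # Simplification: assumes the tools are runnable whenever uv is on PATH.
        return [list(cmd) for cmd in INDIVIDUAL_CHECKS]
    raise RuntimeError("Cannot capture baseline: Quality check scripts not found")


try:
    for command in select_checks():
        print(" ".join(command))
except RuntimeError as error:
    print(f"❌ {error}")
```

Passing `which` as a parameter is a design choice that lets the fallback path be exercised in tests without changing the real PATH.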

Memory Service Unavailable

Scenario: Cannot create memory entity (MCP server down)
Handling:
  1. Store baseline in local file: `.quality-baseline-{feature}-{date}.json`
  2. Return file path as reference
  3. Note in output: "Baseline stored locally (memory unavailable)"
Fallback file format:
```json
{
  "baseline_name": "baseline_auth_2025-10-16",
  "metrics": {
    "tests": {"passed": 145, "failed": 0, "skipped": 2},
    "coverage": 87,
    "type_errors": 0,
    "linting_errors": 0,
    "dead_code": 3,
    "execution_time": 8.3,
    "git_commit": "abc123f",
    "timestamp": "2025-10-16 10:30:00"
  }
}
```
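
Writing that fallback file could be sketched in Python as shown here. `store_baseline_locally` and its `directory` parameter are hypothetical. The file name and JSON layout follow the format above.

```python
import json
from pathlib import Path


def store_baseline_locally(feature: str, date: str, metrics: dict, directory: str = ".") -> Path:
    """Write the baseline to .quality-baseline-{feature}-{date}.json and return its path."""
    payload = {
        "baseline_name": f"baseline_{feature}_{date}",
        "metrics": metrics,
    }
    path = Path(directory) / f".quality-baseline-{feature}-{date}.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path
```

The returned path then serves as the baseline reference in place of a memory entity name.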

Anti-Patterns

DON'T:
  • Skip baseline capture (regression detection requires it)
  • Reuse old baselines for new features (each feature needs its own)
  • Capture baseline after starting work (too late)
  • Ignore failing checks in baseline (document them)
  • Forget to reference baseline in todo.md
DO:
  • Capture baseline BEFORE any feature work
  • Create unique baseline per feature
  • Document pre-existing issues in baseline
  • Reference baseline in todo.md and memory
  • Re-baseline after major changes (upgrades, migrations)

Integration with Other Skills

Skill: run-quality-gates
  • capture-quality-baseline: Establishes the reference point
  • run-quality-gates: Compares current metrics against baseline
Skill: validate-refactor-adr
  • validate-refactor-adr: Validates ADR completeness
  • capture-quality-baseline: Documents pre-refactor state
Skill: manage-refactor-markers
  • capture-quality-baseline: Establishes pre-refactor metrics
  • manage-refactor-markers: Tracks refactor progress
Skill: manage-todo
  • capture-quality-baseline: Provides baseline reference
  • manage-todo: Stores baseline name in todo.md header

Examples

See `examples.md` for comprehensive scenarios:
  1. Feature start baseline (clean state)
  2. Refactor start baseline (with pre-existing issues)
  3. Re-baseline after major change
  4. Baseline with failures
  5. Agent auto-invocation patterns
  6. Cross-agent baseline sharing
  7. Baseline comparison workflows
  8. Recovery from failed baseline capture

Reference

  • Metrics parsing: See `references/parsing.md` for detailed parsing logic
  • Baseline lifecycle: See `references/lifecycle.md` for management best practices
  • Quality gates: See the project's `../run-quality-gates/references/shared-quality-gates.md` for gate definitions

Success Criteria

  • Quality checks execute successfully (or failures documented)
  • All 5 metrics extracted (tests, coverage, type errors, linting, dead code)
  • Memory entity created with baseline data
  • Baseline name returned for reference
  • Execution time < 30 seconds (typically ~8s)
  • Git commit captured (if available)
  • Timestamp recorded
  • Pre-existing issues documented (if any)

Troubleshooting

Problem: Metrics parsing fails (empty values)
Solution:
  1. Check quality check output format (may have changed)
  2. Review parsing patterns in `references/parsing.md`
  3. Update regex patterns if needed
  4. Fallback: Manually parse critical metrics
Problem: Baseline creation succeeds but can't be found later
Solution:
  1. Verify memory entity name format: `baseline_{feature}_{date}`
  2. Search memory: `mcp__memory__search_memories(query="baseline")`
  3. Check fallback file: `.quality-baseline-*.json`
Problem: Quality checks take too long (> 30s)
Solution:
  1. Check if Neo4j is running (connection timeout)
  2. Skip slow tests: `pytest -m "not slow"`
  3. Run checks in parallel (check_all.sh does this)
  4. Report timing issue to user

Last Updated: 2025-10-16
Priority: High (critical for regression detection)
Dependencies: Quality gate scripts, memory MCP server