nemoclaw-maintainer-cross-issue-sweep

Cross-Issue Regression Sweep

Surfaces the issues a single PR may also fix or accidentally break beyond the one it claims to address. Two outputs:
  • Adjacent fixes — "PR may also close #X" → bundling intel (ship one PR, close multiple issues)
  • Contradicting risks — "PR may break what #Y wants" → coordination needed before merge

Prerequisites

  • gh CLI authenticated
  • A target repository with open issues
  • An open PR to scan

Repo policy

Defaults assume NemoClaw conventions. Edit repo-policy.md to override per-repo (bot logins, candidate caps, language regex).

Workflow

Copy this checklist into your response and check off each step:
```text
Cross-issue sweep progress:
- [ ] Step 1: Extract fingerprint (files, symbols, error strings, primary issue)
- [ ] Step 2: Search candidate issues (capped at 30, primary excluded)
- [ ] Step 3: Classify each candidate (4-class with evidence)
- [ ] Step 4: Apply reverse-link boost
- [ ] Step 5: Filter (drop UNRELATED, SAME_ISSUE_DIFF, low-confidence)
- [ ] Step 6: Render report using templates/report.md
```

Step 1: Extract fingerprint

```bash
scripts/extract-fingerprint.sh <pr-number>
```
Pulls four dimensions: touched files, touched symbols (per-language regex), error-string tokens, and the PR's primary linked issue (for exclusion). See checks/fingerprint-extraction.md.
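
To make the four dimensions concrete, here is a minimal Python sketch of what such an extraction could look like. It is not the logic of scripts/extract-fingerprint.sh: the `Fingerprint` fields, the `SYMBOL_PATTERNS` regexes, the quoted-string heuristic for error strings, and the closing-keyword pattern for the primary issue are all assumptions made for illustration.

```python
import os
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical per-language patterns; real ones would come from repo-policy.md.
SYMBOL_PATTERNS = {
    ".py": re.compile(r"^\s*(?:async\s+)?(?:def|class)\s+([A-Za-z_]\w*)"),
    ".sh": re.compile(r"^\s*(?:function\s+)?([A-Za-z_]\w*)\s*\(\)"),
}
# Assumed heuristic: the primary issue is the first one after a closing keyword.
ISSUE_LINK = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)
# Assumed heuristic: quoted strings of 12+ characters on error-looking lines.
QUOTED = re.compile(r"""["']([^"']{12,})["']""")


@dataclass
class Fingerprint:
    files: List[str] = field(default_factory=list)
    symbols: List[str] = field(default_factory=list)
    error_strings: List[str] = field(default_factory=list)
    primary_issue: Optional[int] = None


def extract_fingerprint(diff_text: str, pr_body: str) -> Fingerprint:
    fp = Fingerprint()
    current_ext = ""
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            path = line[len("+++ b/"):]
            fp.files.append(path)
            current_ext = os.path.splitext(path)[1]
            continue
        if line.startswith(("+++", "---")):
            continue  # other file headers, e.g. "--- a/..." or "+++ /dev/null"
        if not line.startswith(("+", "-")):
            continue  # context lines and hunk headers are not "touched"
        body = line[1:]
        pattern = SYMBOL_PATTERNS.get(current_ext)
        match = pattern.match(body) if pattern else None
        if match and match.group(1) not in fp.symbols:
            fp.symbols.append(match.group(1))
        if "error" in body.lower() or "raise" in body:
            for text in QUOTED.findall(body):
                if text not in fp.error_strings:
                    fp.error_strings.append(text)
    link = ISSUE_LINK.search(pr_body)
    fp.primary_issue = int(link.group(1)) if link else None
    return fp
```

Because the extraction is pure regex over the diff and PR body, the same input always yields the same fingerprint, which keeps this step deterministic.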

Step 2: Search candidate issues

```bash
scripts/search-candidate-issues.sh <fingerprint-json>
```
Three search dimensions, capped at 30 total candidates:
  • Per symbol: top 10 by recency
  • Per file path: top 5 by recency
  • Per error string: top 5 by recency
Results are deduplicated, and the PR's primary linked issue is excluded.
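
The capping and deduplication rules above can be sketched in Python as follows. This is an illustration, not the contents of scripts/search-candidate-issues.sh: the `hits` input shape, the `updated_at` field used for recency, and the choice to keep the most recent issues when the total exceeds 30 are assumptions.

```python
PER_TERM_CAP = {"symbol": 10, "file": 5, "error": 5}
TOTAL_CAP = 30


def merge_candidates(hits, primary_issue):
    """Merge per-term search hits into one capped, deduplicated list.

    hits maps (dimension, term) to a list of issue dicts, each with a
    "number" and an ISO 8601 UTC "updated_at" string (hypothetical shape).
    """
    seen = {}
    for (dimension, _term), issues in hits.items():
        # Drop the PR's primary issue before applying the per-term cap.
        eligible = [i for i in issues if i["number"] != primary_issue]
        # ISO 8601 timestamps in one timezone sort correctly as strings.
        newest = sorted(eligible, key=lambda i: i["updated_at"], reverse=True)
        for issue in newest[: PER_TERM_CAP[dimension]]:
            seen.setdefault(issue["number"], issue)  # dedupe by issue number
    merged = sorted(seen.values(), key=lambda i: i["updated_at"], reverse=True)
    return merged[:TOTAL_CAP]
```

Deduplicating by issue number before applying the total cap means an issue matched by several terms takes only one of the 30 slots.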

Step 3: Classify each candidate

For each candidate, the LLM assigns one of four classes per checks/relationship-judgment.md:
  • ADJACENT_FIX — PR's changes likely also resolve this issue
  • CONTRADICTING — PR's approach blocks what this issue wants
  • SAME_ISSUE_DIFF — same root bug as PR's primary issue (dedup filter)
  • UNRELATED — no meaningful relationship
Required for ADJACENT_FIX or CONTRADICTING:
  • Cite specific PR diff line
  • Cite specific issue symptom
  • Confidence: high / medium / low
If no specific evidence can be cited, the LLM must answer UNRELATED. This requirement limits hallucinated relationships.
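
One way to enforce the evidence requirement mechanically is to validate each LLM answer after the fact and downgrade unsupported claims. The sketch below assumes a hypothetical `Judgment` record with `diff_line` and `symptom` fields; the actual schema is defined by checks/relationship-judgment.md and may differ.

```python
from dataclasses import dataclass, replace
from typing import Optional

CLASSES = {"ADJACENT_FIX", "CONTRADICTING", "SAME_ISSUE_DIFF", "UNRELATED"}
EVIDENCE_REQUIRED = {"ADJACENT_FIX", "CONTRADICTING"}
CONFIDENCE = {"high", "medium", "low"}


@dataclass(frozen=True)
class Judgment:
    issue: int
    label: str
    confidence: str = "low"
    diff_line: Optional[str] = None  # cited PR diff line (hypothetical field)
    symptom: Optional[str] = None    # cited issue symptom (hypothetical field)


def enforce_evidence(judgment: Judgment) -> Judgment:
    """Downgrade to UNRELATED when a claim lacks either citation."""
    if judgment.label not in CLASSES:
        raise ValueError(f"unknown class: {judgment.label}")
    if judgment.confidence not in CONFIDENCE:
        raise ValueError(f"unknown confidence: {judgment.confidence}")
    needs_evidence = judgment.label in EVIDENCE_REQUIRED
    if needs_evidence and not (judgment.diff_line and judgment.symptom):
        return replace(judgment, label="UNRELATED", diff_line=None, symptom=None)
    return judgment
```

Running this check outside the model means a missing citation cannot slip through, even if the LLM ignores the instruction.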

Step 4: Reverse-link boost

If the candidate issue's body or comments already mention this PR's number, the relationship is already in someone's mental model. Boost confidence by one tier (low → medium, medium → high).
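
A minimal sketch of the boost, assuming the issue body and comments are available as plain strings. The function names, the mention pattern (`#N` or a `/pull/N` URL fragment), and the choice to leave `high` unchanged are assumptions; the text above only specifies the low → medium and medium → high steps.

```python
import re

TIERS = ["low", "medium", "high"]


def mentions_pr(texts, pr_number):
    """True if any text references the PR as #N or via a /pull/N URL."""
    pattern = re.compile(rf"(?:#|/pull/){pr_number}\b")
    return any(pattern.search(text or "") for text in texts)


def boost_confidence(confidence, texts, pr_number):
    """Raise confidence by one tier when the issue already mentions the PR."""
    if not mentions_pr(texts, pr_number):
        return confidence
    index = TIERS.index(confidence)
    # Assumption: "high" has no tier above it, so it stays "high".
    return TIERS[min(index + 1, len(TIERS) - 1)]
```

The trailing `\b` keeps a mention of `#570` from counting as a mention of PR 57.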

Step 5: Filter

  • Suppress UNRELATED + SAME_ISSUE_DIFF
  • Drop low-confidence judgments
  • Keep ADJACENT_FIX and CONTRADICTING with high or medium confidence
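
The three filter rules reduce to a single predicate. A short sketch, assuming each judgment is a dict with hypothetical `label` and `confidence` keys:

```python
KEEP_LABELS = {"ADJACENT_FIX", "CONTRADICTING"}
KEEP_CONFIDENCE = {"high", "medium"}


def filter_judgments(judgments):
    """Keep only reportable judgments: an actionable class at high or medium confidence."""
    return [
        j for j in judgments
        if j.get("label") in KEEP_LABELS and j.get("confidence") in KEEP_CONFIDENCE
    ]
```

Note that the filter runs after the reverse-link boost, so a low-confidence judgment boosted to medium in Step 4 survives.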

Step 6: Render report

```bash
scripts/render-report.py < classifications.json
```
See templates/report.md for the format.

Reference files

  • repo-policy.md: configurable per-repo defaults
  • relationship-rules.md: 4-class definitions with worked examples
  • checks/fingerprint-extraction.md: what to pull from the diff, per language
  • checks/relationship-judgment.md: LLM judgment criteria + evidence requirement
  • templates/report.md: output template
  • validation/backtest.md: backtest the skill against historical PRs

Scripts (execute, do not read)

  • scripts/extract-fingerprint.sh: symbols + paths + error strings, deterministic
  • scripts/search-candidate-issues.sh: GitHub Search wrapper, dedupe, cap
  • scripts/render-report.py: report renderer

Composition with other skills

The pr-comparator (nemoclaw-maintainer-pr-comparator) calls this skill as a sub-step when comparing competing PRs. Adjacent-fix counts feed Tier 3 tiebreakers; contradicting hits factor into Tier 2 quality scoring.

What this skill does NOT do (deferred)

These would raise the ceiling but require infrastructure beyond GitHub API + LLM:
  • Run PR code against adversarial inputs (sandboxed)
  • Static-analyzer dataflow tracing (CodeQL, Semgrep)
  • ML-based symbol disambiguation across codebases