auto-bug-fixer
Bug Localization and Fix Skill
Overview
This skill takes an error symptom through a structured process of error analysis and root cause localization, then automatically generates a minimal diff-format fix patch and regression test cases. The result is end-to-end automation from error symptom to complete fix.
Application Scenarios
Use this skill when you need to:
- Analyze error symptoms such as program crashes, thrown exceptions, or test failures
- Locate the root cause of a bug (especially one introduced by a code change)
- Generate a code fix patch that follows engineering standards
- Create executable regression test cases
- Produce a complete bug analysis report (root cause, fix, tests, and prevention suggestions)
Quick Reference

| Scenario | Localization Method | Applicable Conditions |
|---|---|---|
| General bug analysis | Top-down tracing method | No clear code change/PR information |
| Bug introduced by a code change | Git bisect method | `code_change_info` is provided and the specific PR/commit must be identified |

| Error Code | Meaning | Handling Method |
|---|---|---|
| E001 | Input information is missing or cannot be parsed | Supply the required inputs (error_phenomenon/reproduce_steps) |
| E002 | Root cause is ambiguous and cannot be located accurately | Provide more context or code change information |
| E003 | Patch generation failed | Confirm the root cause has been pinned down; check the code context |
| E004 | Test case generation failed | Confirm the fix patch is valid; specify the test framework |
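
The two tables can be read as a small lookup, sketched below in Python. The names `choose_method` and `ERROR_HANDLING` are hypothetical and exist only for illustration; the input field `code_change_info`, the two method names, and the error codes come from the tables.

```python
from typing import Optional

# Error codes and handling hints, taken from the Quick Reference table.
ERROR_HANDLING = {
    "E001": "Supply the required inputs (error_phenomenon/reproduce_steps)",
    "E002": "Provide more context or code change information",
    "E003": "Confirm the root cause has been pinned down; check the code context",
    "E004": "Confirm the fix patch is valid; specify the test framework",
}


def choose_method(code_change_info: Optional[str]) -> str:
    """Pick a localization method (hypothetical helper, not part of the skill)."""
    # Git bisect only applies when code change/PR information was supplied.
    if code_change_info and code_change_info.strip():
        return "git_bisect"
    return "top_down_tracing"
```

How the real agent inspects `code_change_info` is not specified in the source; treating a blank string as "no information" is an assumption of this sketch.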
Core Capabilities (Action + Result, No Overlap, Nothing Out of Scope)
- Parse error symptoms from multiple programming languages, and verify that the required inputs are complete and valid
- Locate the root cause with the top-down tracing method in general scenarios and the Git bisect method in code change/PR scenarios, then verify that the root cause is reproducible
- Generate a minimal diff-format fix patch for the root cause, verify that the patch is syntactically correct, and follow engineering standards
- Create directly executable regression test cases for mainstream test frameworks, covering the core normal, boundary, and exception scenarios
- Output a structured Markdown bug analysis report containing the error symptom description, reproduction steps, root cause analysis, fix patch, test cases, and prevention suggestions
Workflow (Steps + Verification + Interruption + Feedback, Directly Executable by an Agent)
Step 1: Information Collection and Verification
✅ Checkpoint: Verify that the required inputs (error_phenomenon/reproduce_steps) are complete and recognizable
❌ Interrupt condition: A required input is missing, or the error symptom cannot be parsed (e.g., garbled or incomplete text) → raise error code E001 and stop
📝 Feedback: Output "Information collection complete; ready for root cause analysis" or "Required input missing, error code E001"
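
The Step 1 checkpoint could be sketched as follows. The function `validate_inputs` and the constant `REQUIRED_FIELDS` are hypothetical names; only the two field names and the E001 code come from the text. The sketch checks for missing or blank fields only and does not attempt to detect garbled text.

```python
REQUIRED_FIELDS = ("error_phenomenon", "reproduce_steps")


def validate_inputs(inputs: dict) -> tuple:
    """Return (ok, feedback_message) for the Step 1 checkpoint (illustrative only)."""
    # A field counts as missing when it is absent, None, or whitespace only.
    missing = [
        name for name in REQUIRED_FIELDS
        if not str(inputs.get(name) or "").strip()
    ]
    if missing:
        return False, "Required input missing, error code E001: " + ", ".join(missing)
    return True, "Information collection complete; ready for root cause analysis"
```

Returning a flag plus a message, rather than raising, keeps the feedback line and the interrupt decision in one place; a real agent might raise instead.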
Step 2: Root Cause Analysis and Confirmation
✅ Checkpoint: The root cause is reproducible and pinned down to a specific code line, file, PR, or commit
❌ Interrupt condition: Too little information leaves the root cause ambiguous and impossible to locate accurately → raise error code E002 and stop
📝 Feedback: Output "Root cause located: XXX (associated PR/commit: XXX)" or "Root cause ambiguous, error code E002"
Localization Methods (the agent selects one automatically based on the input; Git bisect is the preferred way to pinpoint a PR/commit)
Method 1: Top-down Tracing (general scenario, no clear code change/PR information)
- Analyze the error stack trace to find the specific code line/file where the exception is thrown or the error is triggered
- Check whether the input parameters, data flow, and dependency calls at that location match the expected business behavior
- If an input or dependency is abnormal, trace upstream, level by level, to the data source or caller
- Repeat these steps until the root cause is found (the first place where the data or logic deviates from expectations)
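
As a minimal illustration of the first tracing step, the sketch below uses Python's standard `traceback` module to recover the call chain of a failure. The functions `parse_ratio`, `load_ratio`, and `locate_failure` are hypothetical and stand in for real project code.

```python
import traceback


def parse_ratio(text: str) -> float:
    """Hypothetical function where the exception surfaces."""
    numerator, denominator = text.split("/")
    return int(numerator) / int(denominator)


def load_ratio(config: dict) -> float:
    """Hypothetical caller that forwards unvalidated data."""
    return parse_ratio(config["ratio"])


def locate_failure(func, *args):
    """Run func; if it raises, return (exception name, call chain, outermost first)."""
    try:
        func(*args)
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        # The first frame is this helper; the last frame is where the exception was raised.
        return type(exc).__name__, [frame.name for frame in frames]
    return None, []
```

Calling `locate_failure(load_ratio, {"ratio": "1/0"})` ends the chain at `parse_ratio`, which is where inspection starts; walking the chain backwards leads to `load_ratio`, the caller that passed the bad value, which mirrors the "trace upstream" step.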
Method 2: Git Bisect Localization (bug introduced by a code change/PR; the goal is to find the offending PR/commit)
- From the supplied code_change_info, identify the last known-good version/PR/commit (bug absent) and the first known-bad version/PR/commit (bug present)
- Run the Git bisect commands in sequence: `git bisect start` → `git bisect bad [bad version/commit hash]` → `git bisect good [good version/commit hash]`
- Git then checks out commits between the two endpoints; mark each one as good or bad (or automate the check with `git bisect run`) until Git reports the first bad commit, which identifies the commit/associated PR that introduced the bug
- Analyze the changes in that commit/PR, pin down the root cause code line, and verify it: revert the commit/PR and confirm that the bug disappears
- If multiple PRs/commits are involved, rank them by impact and provide a way to verify them one by one
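
The idea behind bisection can be sketched as a binary search over a linear history. This is an illustration of the principle only, not Git's implementation (Git also handles branching histories); `first_bad_commit` and `is_bad` are hypothetical names.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history for the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the bug was introduced at mid or earlier
        else:
            good = mid  # the bug was introduced after mid
    return commits[bad]
```

Each call to `is_bad` corresponds to one manual `git bisect good`/`git bisect bad` verdict, so the number of builds to test grows with the logarithm of the number of commits between the two endpoints.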
Step 3: Minimal Diff-format Patch Generation and Verification
✅ Checkpoint: The patch is in standard diff format, changes as little as possible, has no syntax errors, and fixes only the root cause
❌ Interrupt condition: The root cause has not been pinned down, or there is no valid code context → raise error code E003 and stop
📝 Feedback: Output "Diff-format fix patch generated, X changes in total" or "Patch generation failed, error code E003"
Patch Generation Principles (Mandatory)
- Minimal modification: Change only code related to the root cause; no unrelated logic or formatting changes
- Root cause fix: Solve the underlying problem instead of working around the symptom
- Defensive programming: Add input validation, boundary checks, and exception handling where necessary
- Engineering standards: Follow the project's coding conventions and introduce no new syntax or logic problems
- Traceability: Add a single-line comment at each key fix point explaining the reason for the fix and the root cause it addresses
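
To make "minimal diff" concrete, the sketch below produces a unified diff with Python's standard `difflib.unified_diff`. The file name `stats.py`, the `average` function, and the choice to return `0.0` for an empty list are all hypothetical; the source does not say how the skill itself generates patches.

```python
import difflib


def make_patch(before: str, after: str, path: str) -> str:
    """Return a unified diff between two versions of one file."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical buggy source and its fix.
BEFORE = "def average(values):\n    return sum(values) / len(values)\n"
AFTER = (
    "def average(values):\n"
    "    # Fix: an empty list caused ZeroDivisionError (root cause)\n"
    "    if not values:\n"
    "        return 0.0\n"
    "    return sum(values) / len(values)\n"
)

patch = make_patch(BEFORE, AFTER, "stats.py")
print(patch)
```

The resulting patch adds three lines and removes none, which matches the principles: only the root cause is touched, a guard is added defensively, and a single-line comment records why.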
Step 4: Regression Test Case Generation and Verification
✅ Checkpoint: The test cases are directly executable, cover the core scenarios related to the root cause, and match the test framework the project actually uses
❌ Interrupt condition: There is no valid fix patch, or no test framework was specified → raise error code E004 and stop
📝 Feedback: Output "Regression test cases generated, X in total (adapted to the XXX framework)" or "Test case generation failed, error code E004"
Core Requirements for Test Cases
- Reproducible: Accurately reproduces the original bug (fails before the fix)
- Verifiable: Passes after the fix, demonstrating that the patch works
- Full coverage: Covers the normal, boundary, and exception scenarios related to the root cause
- Executable: No syntax errors; adapted to mainstream test frameworks (gtest/pytest/JUnit/GoTest/Jest/Vue Test Utils)
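
A small set of tests meeting these four requirements might look like the sketch below. The `average` function and its empty-list fix are hypothetical; the tests use plain `assert` statements so they run under pytest or on their own.

```python
def average(values):
    """Hypothetical function under test, shown after the fix."""
    # Fix: an empty list used to raise ZeroDivisionError.
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_reproduces_original_bug():
    """Reproducible and verifiable: raised ZeroDivisionError before the fix."""
    assert average([]) == 0.0


def test_normal_case():
    """Normal scenario."""
    assert average([2, 4, 6]) == 4.0


def test_boundary_single_element():
    """Boundary scenario: smallest non-empty input."""
    assert average([5]) == 5.0


def test_invalid_input_raises():
    """Exception scenario: non-numeric data should raise TypeError."""
    try:
        average(["a", "b"])
    except TypeError:
        return
    raise AssertionError("expected TypeError")
```

In a project that already depends on pytest, the last test would more commonly be written with `pytest.raises(TypeError)`; the try/except form is used here only to keep the sketch free of third-party imports.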
Output Format (Structured Markdown, Readable by Both Machines and Humans; Fixed Template, Do Not Modify Arbitrarily)
Bug Analysis Report
1. Error Symptoms
- Error Description: [Clearly describe the error type, error message, impact scope, and associated business scenario]
- Reproduction Steps: [Step-by-step reproduction method, including runtime environment, test data, operation steps, and associated version]
- Input Validation: [Pass/Fail; on failure, give the error code and the specific reason]
- Code Change Information: [PR/commit/branch information extracted from the input, with the details needed for bisect localization marked]
2. Root Cause Analysis
- Root Cause: [The specific root cause, such as a division-by-zero error, null pointer exception, or logic error; if it is tied to a PR/commit, give the PR number/commit hash]
- Localization Method: [Top-down tracing method / Git bisect method (associated PR: XXX, commit hash: XXX)]
- Tracing Process: [Briefly describe the data flow, call chain, or Git bisect run, and mark the key steps that pinned down the root cause]
- Impact Scope: [Affected code modules/PRs/branches/business scenarios/user groups]
3. Minimal Diff-format Fix Patch
[Standard diff format, including file path, line numbers, and modified content; add explanatory comments at key fix points]
```diff
diff --git a/.github/workflows/nightly_benchmarks.yaml b/.github/workflows/nightly_benchmarks.yaml
index 123456..789abc 100644
--- a/.github/workflows/nightly_benchmarks.yaml
+++ b/.github/workflows/nightly_benchmarks.yaml
@@ -15,3 +15,6 @@
       - name: Install dependencies
         run: |
           pip install -e .
+          # Install msprobe-related dependencies (root cause fix: the msprobe feature introduced by PR#4241 requires these dependencies)
+          pip install mindstudio-probe==8.3.0 || echo "msprobe installation skipped"
+          pip install tb_graph_ascend || echo "tb_graph_ascend installation skipped"
```
4. Regression Test Cases
[Executable test code, adapted to the test framework the project actually uses, covering normal/boundary/exception scenarios]
Example: pytest test cases
```python
def test_bug_fix_reproduction():
    """Reproduce the original bug (fails before the fix)."""
    # Test code... (set up `result` and `expected` here)
    assert result == expected


def test_bug_fix_validation():
    """Verify the fix (passes after the fix)."""
    # Test code... (set up `fixed_result` and `expected` here)
    assert fixed_result == expected


def test_edge_cases():
    """Boundary scenario tests."""
    # Test code...
    pass
```
5. Prevention Suggestions
- Code Level: [Code improvements that target the root cause, such as adding parameter validation or exception handling]
- Process Level: [Development process improvements, such as code review checkpoints or test coverage requirements]
- Tool Level: [Tooling or monitoring suggestions, such as static analysis or enhanced logging]