triage-ci-flake

Triage CI Failure

Overview

Systematic workflow for triaging and fixing test failures in CI, especially flaky tests that pass locally but fail in CI. Tests that made it to `main` are usually flaky due to timing, bundling, or environment differences.

CRITICAL RULE: You MUST run the reproduction workflow before proposing any fixes. No exceptions.

When to Use

- CI test fails on `main` branch after PR was merged
- Test passes locally but fails in CI
- Test failure labeled as "flaky" or intermittent
- E2E or integration test timing out in CI only

MANDATORY First Steps

YOU MUST EXECUTE THESE COMMANDS. Reading code or analyzing logs does NOT count as reproduction.

1. Extract suite name, test name, and error from CI logs
2. EXECUTE: Kill port 3000 to avoid conflicts
3. EXECUTE: `pnpm dev $SUITE_NAME` (use run_in_background=true)
4. EXECUTE: Wait for server to be ready (check with curl or sleep)
5. EXECUTE: Run the specific failing test with Playwright directly (`npx playwright test test/TEST_SUITE_NAME/e2e.spec.ts:31:3 --headed -g "TEST_DESCRIPTION_TARGET_GOES_HERE"`)
6. If test passes, EXECUTE: `pnpm prepare-run-test-against-prod`
7. EXECUTE: `pnpm dev:prod $SUITE_NAME` and run test again

Only after EXECUTING these commands and seeing their output can you proceed to analysis and fixes.

"Analysis from logs" is NOT reproduction. You must RUN the commands.

Core Workflow

```dot
digraph triage_ci {
    "CI failure reported" [shape=box];
    "Extract details from CI logs" [shape=box];
    "Identify suite and test name" [shape=box];
    "Run dev server: pnpm dev $SUITE" [shape=box];
    "Run specific test by name" [shape=box];
    "Did test fail?" [shape=diamond];
    "Debug with dev code" [shape=box];
    "Run prepare-run-test-against-prod" [shape=box];
    "Run: pnpm dev:prod $SUITE" [shape=box];
    "Run specific test again" [shape=box];
    "Did test fail now?" [shape=diamond];
    "Debug bundling issue" [shape=box];
    "Unable to reproduce - check logs" [shape=box];
    "Fix and verify" [shape=box];

    "CI failure reported" -> "Extract details from CI logs";
    "Extract details from CI logs" -> "Identify suite and test name";
    "Identify suite and test name" -> "Run dev server: pnpm dev $SUITE";
    "Run dev server: pnpm dev $SUITE" -> "Run specific test by name";
    "Run specific test by name" -> "Did test fail?";
    "Did test fail?" -> "Debug with dev code" [label="yes"];
    "Did test fail?" -> "Run prepare-run-test-against-prod" [label="no"];
    "Run prepare-run-test-against-prod" -> "Run: pnpm dev:prod $SUITE";
    "Run: pnpm dev:prod $SUITE" -> "Run specific test again";
    "Run specific test again" -> "Did test fail now?";
    "Did test fail now?" -> "Debug bundling issue" [label="yes"];
    "Did test fail now?" -> "Unable to reproduce - check logs" [label="no"];
    "Debug with dev code" -> "Fix and verify";
    "Debug bundling issue" -> "Fix and verify";
}
```

Step-by-Step Process

1. Extract CI Details

From CI logs or GitHub Actions URL, identify:

- Suite name: Directory name (e.g., `i18n`, `fields`, `lexical`)
- Test file: Full path (e.g., `test/i18n/e2e.spec.ts`)
- Test name: Exact test description
- Error message: Full stack trace
- Test type: E2E (Playwright) or integration (Vitest)
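
As a small illustration of the first item, the suite name can be derived mechanically from the test file path reported in the CI log. This is a minimal sketch, not part of the repository: `suite_from_path` is a hypothetical helper, and it assumes the `test/<suite>/<file>` layout shown in the examples above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a repo script): derive the suite name from a test
# file path as it appears in CI logs, assuming the layout test/<suite>/<file>.
suite_from_path() {
  local path="${1%%:*}"   # drop a trailing :line[:column] suffix, if any
  path="${path#./}"       # tolerate a leading ./
  case "$path" in
    test/*/*)
      path="${path#test/}"          # strip the test/ prefix
      printf '%s\n' "${path%%/*}"   # keep only the first directory: the suite
      ;;
    *)
      echo "unrecognized test path: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `SUITE_NAME="$(suite_from_path test/i18n/e2e.spec.ts:124)"` would set `SUITE_NAME` to `i18n` for use in the commands of the following steps.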

2. Reproduce with Dev Code

CRITICAL: Always run the specific test by name, not the full suite.

SERVER MANAGEMENT RULES:

- ALWAYS kill all servers before starting a new one
- NEVER assume ports are free
- ALWAYS wait for server ready confirmation before running tests
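
The first two rules can be wrapped in a small function so the port is always checked rather than assumed free. This is a sketch under assumptions: `free_port` is a hypothetical name, it relies on `lsof` being installed, and it uses the same `lsof -ti:<port>` lookup as the kill command in this workflow, which may also match client processes connected to that port.

```bash
#!/usr/bin/env bash
# Hypothetical helper: kill whatever lsof reports on a port, or say it is clear.
free_port() {
  local port="$1" pids
  # -t prints bare PIDs; errors (including a missing lsof) are treated as "nothing found".
  pids="$(lsof -ti:"$port" 2>/dev/null || true)"
  if [ -z "$pids" ]; then
    echo "Port $port clear"
    return 0
  fi
  # Word splitting is intentional here: one PID per word.
  kill -9 $pids 2>/dev/null
  echo "Killed processes on port $port: $(echo $pids)"
}
```

Calling `free_port 3000` before every server start gives an explicit message in both cases, which makes it easier to see in the session output that the step was actually executed.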

```bash
# ========================================
# STEP 2A: STOP ALL SERVERS
# ========================================
lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"

# ========================================
# STEP 2B: START DEV SERVER
# ========================================
# Start dev server with the suite (in background with run_in_background=true)
pnpm dev $SUITE_NAME

# ========================================
# STEP 2C: WAIT FOR SERVER READY
# ========================================
# Wait for server to be ready (REQUIRED - do not skip)
until curl -s http://localhost:3000/admin > /dev/null 2>&1; do sleep 1; done && echo "Server ready"

# ========================================
# STEP 2D: RUN SPECIFIC TEST
# ========================================
# Run ONLY the specific failing test using Playwright directly
# For E2E tests (DO NOT use pnpm test:e2e as it spawns its own server):
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"

# For integration tests:
pnpm test:int $SUITE_NAME -t "exact test name"
```

**Did the test fail?**

- ✅ **YES**: You reproduced it! Proceed to debug with dev code.
- ❌ **NO**: Continue to step 3 (bundled code test).
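
The readiness loop in this step waits indefinitely if the server never comes up (for example, when the build fails). A bounded variant might look like the sketch below. `wait_until` is a hypothetical helper, and the 120-second limit in the usage note is an arbitrary assumption, not a value from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) is reached.
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout="$1"; shift
  local waited=0
  until "$@" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "Ready after ${waited}s"
}
```

With this helper, the check from step 2C could be written as `wait_until 120 curl -s http://localhost:3000/admin && echo "Server ready"`. A non-zero exit then signals that the server itself needs attention before any test is run.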

3. Reproduce with Bundled Code

If test passed with dev code, the issue is likely in bundled/production code.

IMPORTANT: You MUST stop the dev server before starting prod server.

```bash
# ========================================
# STEP 3A: STOP ALL SERVERS (INCLUDING DEV SERVER FROM STEP 2)
# ========================================
lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"

# ========================================
# STEP 3B: BUILD AND PACK FOR PROD
# ========================================
# Build all packages and pack them (this takes time - be patient)
pnpm prepare-run-test-against-prod

# ========================================
# STEP 3C: START PROD SERVER
# ========================================
# Start prod dev server (in background with run_in_background=true)
pnpm dev:prod $SUITE_NAME

# ========================================
# STEP 3D: WAIT FOR SERVER READY
# ========================================
# Wait for server to be ready (REQUIRED - do not skip)
until curl -s http://localhost:3000/admin > /dev/null 2>&1; do sleep 1; done && echo "Server ready"

# ========================================
# STEP 3E: RUN SPECIFIC TEST
# ========================================
# Run the specific test again using Playwright directly
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"

# OR for integration tests:
pnpm test:int $SUITE_NAME -t "exact test name"
```

**Did the test fail now?**

- ✅ **YES**: Bundling or production build issue. Look for:
  - Missing exports in package.json
  - Build configuration problems
  - Code that behaves differently when bundled
- ❌ **NO**: Unable to reproduce locally. Proceed to step 4.
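
Steps 2 and 3 together lead to one of three outcomes, matching the branches of the workflow diagram. The mapping from test exit codes to the next action can be sketched as follows. `classify_repro` is a hypothetical helper for illustration, and the exit codes are assumed to be those of the Playwright or integration test commands run in each step.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the exit codes of the dev-code run and (optionally)
# the bundled-code run to the next action in the workflow.
# Usage: classify_repro DEV_EXIT [PROD_EXIT]
classify_repro() {
  local dev_exit="$1" prod_exit="${2:-}"
  if [ "$dev_exit" -ne 0 ]; then
    echo "reproduced with dev code: debug with dev code"
  elif [ -z "$prod_exit" ]; then
    echo "passed with dev code: rerun against the prod build"
  elif [ "$prod_exit" -ne 0 ]; then
    echo "reproduced with bundled code: debug bundling issue"
  else
    echo "unable to reproduce: proceed to step 4"
  fi
}
```

A caller would capture `$?` right after each test command, for example `dev_exit=$?`, and pass the values in. The function only records the decision; it does not replace actually running the commands.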

4. Unable to Reproduce

If you cannot reproduce locally after both attempts:

- Review CI logs more carefully for environment differences
- Check for race conditions (run test multiple times: `for i in {1..10}; do pnpm test:e2e...; done`)
- Look for CI-specific constraints (memory, CPU, timing)
- Consider if it's a true race condition that's highly timing-dependent
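
When hunting an intermittent failure, it helps to know how often the test fails rather than stopping at the first failure. The loop above can be extended to count results, as in this sketch. `repeat_test` is a hypothetical helper, and the run count is whatever the caller chooses.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command N times, report each failing run on
# stderr, and print a pass count. Returns non-zero if any run failed.
# Usage: repeat_test RUNS COMMAND [ARGS...]
repeat_test() {
  local runs="$1"; shift
  local i=1 failed=0
  while [ "$i" -le "$runs" ]; do
    if ! "$@"; then
      failed=$((failed + 1))
      echo "run $i: FAILED" >&2
    fi
    i=$((i + 1))
  done
  echo "$((runs - failed))/$runs runs passed"
  [ "$failed" -eq 0 ]
}
```

For example, `repeat_test 10 pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"` keeps going after a failure, so a result such as 8 of 10 passing gives a rough failure rate to compare against after the fix.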

Common Flaky Test Patterns

Race Conditions

- Page navigating while assertions run
- Network requests not settled before assertions
- State updates not completed

Fix patterns:

- Use Playwright's web-first assertions (`toBeVisible()`, `toHaveText()`)
- Wait for specific conditions, not arbitrary timeouts
- Use `waitForFunction()` with condition checks

Test Pollution

- Tests leaving data in database
- Shared state between tests
- Missing cleanup in `afterEach`

Fix patterns:

- Track created IDs and clean up in `afterEach`
- Use isolated test data
- Don't use `deleteAll` that affects other tests

Timing Issues

- `setTimeout`/`sleep` instead of condition-based waiting
- Not waiting for page stability
- Animations/transitions not complete

Fix patterns:

- Use `waitForPageStability()` helper
- Wait for specific DOM states
- Use Playwright's built-in waiting mechanisms

Linting Considerations

When fixing e2e tests, be aware of these eslint rules:

- `playwright/no-networkidle` - Avoid `waitForLoadState('networkidle')` (use condition-based waiting instead)
- `payload/no-wait-function` - Avoid custom `wait()` functions (use Playwright's built-in waits)
- `payload/no-flaky-assertions` - Avoid non-retryable assertions
- `playwright/prefer-web-first-assertions` - Use built-in Playwright assertions

Existing code may violate these rules - when adding new code, follow the rules even if existing code doesn't.

Verification

After fixing:

```bash
# Ensure dev server is running on port 3000
# Run test multiple times to confirm stability
for i in {1..10}; do
  pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name" || break
done

# Run full suite
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts

# If you modified bundled code, test with prod build
lsof -ti:3000 | xargs kill -9 2>/dev/null
pnpm prepare-run-test-against-prod
# Start the prod server in the background (run_in_background=true), as in step 3C
pnpm dev:prod $SUITE_NAME
until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts
```

The Iron Law

NO FIX WITHOUT REPRODUCTION FIRST

If you propose a fix before completing steps 1-3 of the workflow, you've violated this skill.

This applies even when:

- The fix seems obvious from the logs
- You've seen this error before
- The team is applying time pressure
- You're confident about the root cause
- The logs show clear stack traces

No exceptions. Run the reproduction workflow first.

Rationalization Table

Every excuse for skipping reproduction, and why it's wrong:

| Rationalization | Reality |
| --- | --- |
| "The logs show the exact error" | Logs show symptoms, not root cause. Reproduce. |
| "I can see the problem in the code" | You're guessing. Reproduce to confirm. |
| "This is obviously a race condition" | Maybe. Reproduce to be sure. |
| "I've seen this error before" | This might be different. Reproduce. |
| "The stack trace is clear" | Stack trace shows where, not why. Reproduce. |
| "Time pressure - need to fix fast" | Reproducing IS fast. Guessing wastes time. |
| "The test file shows the issue" | Reading ≠ running. Execute the commands. |
| "I'll analyze the code first" | Code analysis comes AFTER reproduction. |
| "Let me investigate the root cause" | Reproduction IS the investigation. |
| "I need to understand the error" | Understanding comes from seeing it fail. |

Red Flags - STOP

These thoughts mean you're about to violate the skill:

"Let me just fix the obvious issue" → STOP. Execute reproduction first.
"I'll increase the timeout" → STOP. Find root cause via reproduction.
"This looks like X from the logs" → STOP. Run the test.
"I'll run the full suite" → STOP. Run specific test first.
"Let me check the code" → STOP. Run reproduction first.
"I can see the problem in the stack trace" → STOP. Reproduce to confirm.
"The logs are clear enough" → STOP. Execute the commands.
"Let me analyze..." → STOP. Analysis comes after reproduction.

All of these mean: Stop immediately. Execute the reproduction commands. No analysis without execution.

Example Session

CORRECT approach - executes commands first:

User: CI failed on i18n suite - https://github.com/.../actions/runs/XXX

Agent:
1. Extracted details from CI logs:
   - Suite: i18n
   - Test: "ensure translations update correctly when switching language"
   - File: test/i18n/e2e.spec.ts:124
   - Error: "Error loading language: en" TypeError: network error

2. Executing reproduction workflow:
   [Uses Bash tool to kill ports]
   $ lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"

   [Uses Bash tool with run_in_background=true]
   $ pnpm dev i18n

   [Waits for server to start with curl check]
   $ until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done && echo "Server ready"

   [Uses Bash tool to run Playwright directly]
   $ pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language"

   Output shows: Test PASSED locally

3. Test passed with dev code, trying bundled code:
   [Kills ports again]
   $ lsof -ti:3000 | xargs kill -9 2>/dev/null

   [Uses Bash tool]
   $ pnpm prepare-run-test-against-prod

   [Uses Bash tool with run_in_background=true]
   $ pnpm dev:prod i18n

   [Waits for server]
   $ until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done

   [Uses Bash tool]
   $ pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language"

   Output shows: Test FAILED - reproduced!

4. Now that I've reproduced it, analyzing the failure...
   [Code analysis, root cause identification]

5. Implementing fix: [specific changes]

6. Verification:
   [Uses Bash tool]
   $ for i in {1..10}; do pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language"; done
   All 10 runs passed!

INCORRECT approach - analyzes without executing:

User: CI failed on i18n suite

Agent: Looking at the error logs, I can see this is a race condition...
[Analyzes code, proposes fix without running any commands]

❌ WRONG - This violates the skill. Must execute reproduction commands first.

Common Mistakes

| Mistake | Fix |
| --- | --- |
| Running full test suite first | Run specific test by name |
| Skipping dev code reproduction | Always try dev code first |
| Not testing with bundled code | If dev passes, test with `prepare-run-test-against-prod` |
| Proposing fix without reproducing | Follow the workflow - reproduce first |
| Using `networkidle` in new code | Use condition-based waiting with `waitForFunction()` |
| Adding arbitrary `wait()` calls | Use Playwright's built-in assertions and waits |

Key Principles

- Reproduce before fixing: Never propose a fix without reproducing the issue
- Test specifically: Run the exact failing test, not the full suite
- Dev first, prod second: Check dev code before bundled code
- Follow the workflow: No shortcuts - the steps exist to save time
- Verify stability: Run tests multiple times to confirm fix

Completion: Creating a PR

After you have:

✅ Reproduced the issue
✅ Implemented a fix
✅ Verified the fix passes locally (multiple runs)
✅ Tested with prod build (if applicable)

You MUST prompt the user to create a PR:

The fix has been verified and is ready for review. Would you like me to create a PR with these changes?

Summary of changes:
- [List files modified]
- [Brief description of the fix]
- [Verification results]

IMPORTANT:

- DO NOT automatically create a PR - always ask the user first
- Provide a clear summary of what was changed and why
- Include verification results (number of test runs, pass rate)
- Let the user decide whether to create the PR immediately or make additional changes first

This ensures the user has visibility and control over what gets submitted for review.
