fix

Fix Failing or Flaky Tests

Diagnose and fix a Playwright test that fails or passes intermittently using a systematic taxonomy.

Input

`$ARGUMENTS` contains:

- A test file path: `e2e/login.spec.ts`
- A test name: `"should redirect after login"`
- A description: "the checkout test fails in CI but passes locally"

Steps

1. Reproduce the Failure

Run the test to capture the error:

```bash
npx playwright test <file> --reporter=list
```

If the test passes, it's likely flaky. Run burn-in:

```bash
npx playwright test <file> --repeat-each=10 --reporter=list
```

If it still passes, try with parallel workers:

```bash
npx playwright test --fully-parallel --workers=4 --repeat-each=5
```

2. Capture Trace

Run with full tracing:

```bash
npx playwright test <file> --trace=on --retries=0
```

Read the trace output. Use `/debug` to analyze trace files if available.

3. Categorize the Failure

Load `flaky-taxonomy.md` from this skill directory.

Every failing test falls into one of four categories:

| Category | Symptom | Diagnosis |
|---|---|---|
| Timing/Async | Fails intermittently everywhere | `--repeat-each=20` reproduces locally |
| Test Isolation | Fails in suite, passes alone | `--workers=1 --grep "test name"` passes |
| Environment | Fails in CI, passes locally | Compare CI vs local screenshots/traces |
| Infrastructure | Random, no pattern | Error references browser internals |
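The table can be read as a small decision procedure. The sketch below is one possible encoding of it; the `Observations` fields, the `categorize` function, and the order in which the checks run are assumptions made for illustration and are not part of `flaky-taxonomy.md`.

```typescript
// Hypothetical helper: maps what steps 1 and 2 revealed to a taxonomy category.
type Category = "Timing/Async" | "Test Isolation" | "Environment" | "Infrastructure";

interface Observations {
  failsOnlyInCI: boolean;            // fails in CI, passes locally
  failsInSuitePassesAlone: boolean;  // --workers=1 --grep "test name" passes
  reproducesWithRepeatEach: boolean; // --repeat-each=20 reproduces locally
}

function categorize(o: Observations): Category {
  // Assumed priority: the most specific symptom wins.
  if (o.failsOnlyInCI) return "Environment";
  if (o.failsInSuitePassesAlone) return "Test Isolation";
  if (o.reproducesWithRepeatEach) return "Timing/Async";
  // No recognizable pattern is left: treat it as infrastructure.
  return "Infrastructure";
}
```

When more than one symptom applies, it may be safer to investigate each matching category rather than trust a fixed priority.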

4. Apply Targeted Fix

Timing/Async:

- Replace `waitForTimeout()` with web-first assertions
- Add missing `await` to Playwright calls
- Wait for specific network responses before asserting
- Use `toBeVisible()` before interacting with elements
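To see why a retrying assertion is more robust than a fixed sleep, here is a library-free sketch of the idea. `expectEventually` is a hypothetical helper written for this illustration, not part of Playwright; in a real test you would use Playwright's own web-first assertions, which retry until the condition holds or the timeout expires.

```typescript
// Hypothetical helper: retries a check until it passes or the deadline is reached.
// A fixed sleep waits the full duration every time and still fails if the page is slower.
async function expectEventually(
  check: () => boolean | Promise<boolean>,
  { timeoutMs = 5000, intervalMs = 50 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return; // condition met: stop waiting immediately
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Pause briefly before polling again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// In a Playwright test the equivalent is a web-first assertion, for example:
//   await expect(page.getByRole("button", { name: "Submit" })).toBeVisible();
```

The helper returns as soon as the condition is true, so a fast page is not slowed down and a slow page gets the whole timeout budget.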

Test Isolation:

- Remove shared mutable state between tests
- Create test data per-test via API or fixtures
- Use unique identifiers (timestamps, random strings) for test data
- Check for database state leaks
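As a minimal sketch of the unique-identifier advice, the factory below builds a fresh record for every call. The names `uniqueId`, `buildTestUser`, and `TestUser`, as well as the `example.test` domain, are hypothetical and only illustrate the pattern.

```typescript
// Per-process counter: guards against two IDs created in the same millisecond.
let counter = 0;

// Combines a timestamp, the counter, and a random suffix so parallel workers do not collide.
function uniqueId(prefix: string): string {
  counter += 1;
  const random = Math.random().toString(36).slice(2, 8);
  return `${prefix}-${Date.now()}-${counter}-${random}`;
}

interface TestUser {
  email: string;
  name: string;
}

// Each test calls this to get its own user instead of sharing a fixed account.
function buildTestUser(): TestUser {
  const id = uniqueId("user");
  return { email: `${id}@example.test`, name: `Test ${id}` };
}
```

A test would typically pass the result to whatever API or fixture creates the record, so that no two tests ever read or modify the same row.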

Environment:

- Match viewport sizes between local and CI
- Account for font rendering differences in screenshots
- Use `docker` locally to match CI environment
- Check for timezone-dependent assertions
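One way to pin the viewport and timezone is in the shared config, so local and CI runs use the same values. The fragment below is a sketch; the specific viewport size, timezone, and locale are illustrative assumptions to adapt to your project.

```typescript
// playwright.config.ts (fragment; the values are illustrative assumptions)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    viewport: { width: 1280, height: 720 }, // same size locally and in CI
    timezoneId: "UTC",                      // date assertions no longer depend on the host timezone
    locale: "en-US",                        // fixes number and date formatting
  },
});
```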

Infrastructure:

- Increase timeout for slow CI runners
- Add retries in CI config (`retries: 2`)
- Check for browser OOM (reduce parallel workers)
- Ensure browser dependencies are installed
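The timeout, retry, and worker settings can be made CI-specific in the config. This is a sketch under assumptions: it relies on the CI system setting a `CI` environment variable, and the timeout and worker numbers are placeholders, not recommendations.

```typescript
// playwright.config.ts (fragment; numbers are placeholders)
import { defineConfig } from "@playwright/test";

// Assumption: the CI system exports CI=true (many do, but check yours).
const isCI = !!process.env.CI;

export default defineConfig({
  timeout: isCI ? 60_000 : 30_000, // per-test timeout in milliseconds
  retries: isCI ? 2 : 0,           // retry only in CI so local failures stay visible
  workers: isCI ? 2 : undefined,   // fewer workers lowers memory pressure; undefined keeps the default
});
```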

5. Verify the Fix

Run the test 10 times to confirm stability:

```bash
npx playwright test <file> --repeat-each=10 --reporter=list
```

All 10 must pass. If any fail, go back to step 3.

6. Prevent Recurrence

Suggest:

- Add `retries: 2` to the CI config if it is not already set
- Enable `trace: 'on-first-retry'` in config
- Add the fix pattern to the project's test conventions doc
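The two config suggestions work together: `on-first-retry` records a trace only when a test is retried, so it produces nothing if retries are disabled. A minimal sketch, assuming the CI system sets a `CI` environment variable:

```typescript
// playwright.config.ts (fragment)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // assumption: CI is set by the CI system
  use: {
    trace: "on-first-retry",       // record a trace the first time a failing test is retried
  },
});
```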

Output

- Root cause category and specific issue
- The fix applied (with diff)
- Verification result (10/10 passes)
- Prevention recommendation
