fix-flaky-tests
Fix Flaky Test
Quick Start
You'll typically receive a Tuist test case URL or identifier. Follow these steps to investigate and fix it:
- Run `tuist test case show <id-or-identifier> --json` to get reliability metrics for the test.
- Run `tuist test case run list Module/Suite/TestCase --flaky --json` to see flaky run patterns.
- Run `tuist test case run show <run-id> --json` on failing flaky runs to get failure messages and file paths.
- Read the test source at the reported path and line, identify the flaky pattern, and fix it.
- Verify by running the test multiple times to confirm it passes consistently.
Investigation
1. Get test case metrics
You can pass either the UUID or the `Module/Suite/TestCase` identifier:
```bash
tuist test case show <id> --json
tuist test case show Module/Suite/TestCase --json
```
Key fields:
- `reliability_rate`: percentage of successful runs (higher is better)
- `flakiness_rate`: percentage of runs marked flaky in the last 30 days
- `total_runs` / `failed_runs`: volume context
- `last_status`: current state
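To skim these fields without reading the whole payload, the output can be piped through `jq`. The following is a minimal sketch, not part of Tuist: it assumes `jq` is installed and that the fields listed above are top-level keys of the JSON object, which this document does not confirm. `summarize_test_case` is a hypothetical helper name, and the sample values (including the `last_status` value) are made up.

```bash
# Hypothetical helper: print the key fields of a test case on one line.
# Assumption: the fields are top-level keys of the JSON object.
summarize_test_case() {
  jq -r '"reliability=\(.reliability_rate) flakiness=\(.flakiness_rate) failed=\(.failed_runs)/\(.total_runs) last=\(.last_status)"'
}

# Made-up sample standing in for the output of `tuist test case show <id> --json`.
sample='{"reliability_rate": 92.5, "flakiness_rate": 4.5, "total_runs": 200, "failed_runs": 15, "last_status": "success"}'
printf '%s\n' "$sample" | summarize_test_case
```

In real use the sample would be replaced by the command itself, for example `tuist test case show <id> --json | summarize_test_case`.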
2. View flaky run history
```bash
tuist test case run list Module/Suite/TestCase --flaky --json
```
The identifier uses the format `ModuleName/SuiteName/TestCaseName`, or `ModuleName/TestCaseName` when there is no suite. This returns only runs that were detected as flaky.
3. View full run history
```bash
tuist test case run list Module/Suite/TestCase --json --page-size 20
```
Look for patterns:
- Does it fail on specific branches?
- Does it fail only on CI (`is_ci: true`) or also locally?
- Are failures clustered around specific commits?
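The CI-versus-local question can be answered by counting failed runs per environment. This is a rough sketch under several assumptions this document does not confirm: that the command prints a top-level JSON array, and that each run carries a `status` field whose failing value is `"failure"`. Only `is_ci` is taken from the text above. `failures_by_environment` is a hypothetical helper and the sample data is invented.

```bash
# Hypothetical helper: count failed runs, split by CI vs. local.
# Assumptions: top-level JSON array; each run has `status` ("failure" when it
# failed) and the documented `is_ci` flag.
failures_by_environment() {
  jq -r '
    map(select(.status == "failure"))
    | group_by(.is_ci)
    | map("is_ci=\(.[0].is_ci) failures=\(length)")
    | .[]'
}

# Invented sample standing in for the run list output.
sample='[{"status":"failure","is_ci":true},{"status":"success","is_ci":true},{"status":"failure","is_ci":true},{"status":"failure","is_ci":false}]'
printf '%s\n' "$sample" | failures_by_environment
```

If the real payload wraps the runs in an object or uses different field names, the `jq` filter needs adjusting accordingly.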
4. Get failure details
```bash
tuist test case run show <run-id> --json
```
Key fields:
- `failures[].message`: the assertion or error message
- `failures[].path`: source file path
- `failures[].line_number`: exact line of failure
- `failures[].issue_type`: type of issue (`assertion_failure`, etc.)
- `repetitions`: if present, shows retry behavior (pass/fail sequence)
- `test_run_id`: the broader test run this execution belongs to
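To turn the failure fields into something you can jump to in an editor, they can be flattened into `path:line` form. A minimal sketch, assuming `jq` is available and that `failures` is a top-level array in the JSON object (the field names come from the list above; the nesting is an assumption). `list_failures` is a hypothetical helper, and the sample path and message are invented.

```bash
# Hypothetical helper: print each failure as "path:line: [issue_type] message".
# Assumption: `failures` is a top-level array of the JSON object.
list_failures() {
  jq -r '.failures[] | "\(.path):\(.line_number): [\(.issue_type)] \(.message)"'
}

# Invented sample standing in for `tuist test case run show <run-id> --json`.
sample='{"failures":[{"message":"XCTAssertEqual failed","path":"Tests/CartTests.swift","line_number":42,"issue_type":"assertion_failure"}]}'
printf '%s\n' "$sample" | list_failures
```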
Code Analysis
- Open the file at `failures[0].path` and go to `failures[0].line_number`.
- Read the full test function and its setup/teardown.
- Identify which of the common flaky patterns below applies.
- Check if the test shares state with other tests in the same suite.
Common Flaky Patterns
Timing and async issues
- Missing waits: Test checks a result before an async operation completes. Fix: use `await`, expectations with timeouts, or polling.
- Race conditions: Multiple concurrent operations access shared state. Fix: synchronize access or use serial queues.
- Hardcoded timeouts: `sleep(1)` or fixed delays that are too short on CI. Fix: use condition-based waits instead of fixed delays.
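The difference between a fixed delay and a condition-based wait does not depend on the language. The sketch below shows the idea in shell to match the commands in this document; in test code the equivalent would be an expectation or polling helper from the test framework. `wait_until` is a hypothetical helper, and the marker file only simulates an asynchronous operation.

```bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Polls COMMAND until it succeeds; returns 1 if the timeout elapses first.
wait_until() {
  wu_timeout=$1; shift
  wu_deadline=$(( $(date +%s) + wu_timeout ))
  until "$@"; do
    [ "$(date +%s)" -ge "$wu_deadline" ] && return 1
    sleep 0.1
  done
}

# Simulated async work: a background job creates a marker file after a delay.
work_dir=$(mktemp -d)
( sleep 0.3; touch "$work_dir/ready" ) &

# Instead of `sleep 1` and hoping, wait for the condition with a generous cap.
if wait_until 5 test -e "$work_dir/ready"; then
  echo "condition met"
else
  echo "timed out"
fi
rm -rf "$work_dir"
```

The wait returns as soon as the condition holds, so a generous timeout costs nothing on fast machines while still tolerating a slow CI runner.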
Shared state
- Test pollution: One test modifies global/static state that another test depends on. Fix: reset state in setUp/tearDown or use unique instances per test.
- Singleton contamination: Shared singletons carry state between tests. Fix: inject dependencies or reset singletons.
- File system leftovers: Tests leave files that affect subsequent runs. Fix: use temporary directories and clean up.
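As an illustration of the temporary-directory fix, here is a small shell sketch of the pattern for scripts that drive or prepare tests; inside test code the same idea applies through whatever temporary-directory facility the test framework offers. The names `scratch_dir` and `input.txt` are placeholders.

```bash
# Create a private scratch directory and remove it on exit, even on failure.
scratch_dir=$(mktemp -d)
trap 'rm -rf "$scratch_dir"' EXIT

# Everything the run writes goes under the scratch directory,
# so nothing is left behind for the next run to trip over.
echo "fixture" > "$scratch_dir/input.txt"
cat "$scratch_dir/input.txt"
```

Because each run gets a fresh directory from `mktemp -d`, concurrent runs also cannot collide on the same path.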
Environment dependencies
- Network calls: Tests hit real services that may be slow or unavailable. Fix: mock network calls.
- Date/time sensitivity: Tests depend on current time or timezone. Fix: inject a clock or freeze time.
- File system paths: Hardcoded paths that differ between environments. Fix: use relative paths or temp directories.
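Injecting a clock means the code under test receives the current time as an input instead of reading it directly. A minimal shell sketch of that idea follows; `is_expired` is a hypothetical function, and real test code would typically pass a clock object or protocol rather than an epoch number.

```bash
# is_expired EXPIRY_EPOCH [NOW_EPOCH]
# NOW_EPOCH defaults to the real clock; tests pass a fixed value instead,
# so the result no longer depends on when or where the test runs.
is_expired() {
  ie_expiry=$1
  ie_now=${2:-$(date +%s)}
  [ "$ie_now" -ge "$ie_expiry" ]
}

# Deterministic checks with a frozen "now" of 1000:
is_expired 900 1000 && echo "expired"
is_expired 1100 1000 || echo "still valid"
```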
Order dependence
- Implicit ordering: Test passes only when run after another test that sets up required state. Fix: make each test self-contained.
- Parallel execution conflicts: Tests that work in isolation but fail when run concurrently. Fix: use unique resources per test.
Fix Implementation
After identifying the pattern:
- Apply the smallest fix that addresses the root cause.
- Do not refactor unrelated code.
- If the fix requires a test utility (like a mock or helper), check if one already exists before creating a new one.
Verification
Run the specific test repeatedly until failure using `xcodebuild`'s built-in repetition support:
```bash
xcodebuild test -workspace <workspace> -scheme <scheme> -only-testing <module>/<suite>/<test> -test-iterations <count> -run-tests-until-failure
```
This runs the test up to `<count>` times and stops at the first failure. Choose the iteration count based on how long the test takes: for fast unit tests use 50–100, for slower integration or acceptance tests use 2–5.
Done Checklist
- Identified the root cause of flakiness
- Applied a targeted fix
- Verified the test passes consistently (multiple runs)
- Did not introduce new test dependencies or shared state
- Committed the fix with a descriptive message