fix-flaky-tests


# Fix Flaky Test


## Quick Start


You'll typically receive a Tuist test case URL or identifier. Follow these steps to investigate and fix it:

1. Run `tuist test case show <id-or-identifier> --json` to get reliability metrics for the test.
2. Run `tuist test case run list Module/Suite/TestCase --flaky --json` to see flaky run patterns.
3. Run `tuist test case run show <run-id> --json` on failing flaky runs to get failure messages and file paths.
4. Read the test source at the reported path and line, identify the flaky pattern, and fix it.
5. Verify by running the test multiple times to confirm it passes consistently.

## Investigation


### 1. Get test case metrics


You can pass either the UUID or the `Module/Suite/TestCase` identifier:

```bash
tuist test case show <id> --json
tuist test case show Module/Suite/TestCase --json
```

Key fields:

- `reliability_rate`: percentage of successful runs (higher is better)
- `flakiness_rate`: percentage of runs marked flaky in the last 30 days
- `total_runs` / `failed_runs`: volume context
- `last_status`: current state

### 2. View flaky run history


```bash
tuist test case run list Module/Suite/TestCase --flaky --json
```

The identifier uses the format `ModuleName/SuiteName/TestCaseName`, or `ModuleName/TestCaseName` when there is no suite. This returns only runs that were detected as flaky.

### 3. View full run history


```bash
tuist test case run list Module/Suite/TestCase --json --page-size 20
```

Look for patterns:

- Does it fail on specific branches?
- Does it fail only on CI (`is_ci: true`) or also locally?
- Are failures clustered around specific commits?

### 4. Get failure details


```bash
tuist test case run show <run-id> --json
```

Key fields:

- `failures[].message`: the assertion or error message
- `failures[].path`: source file path
- `failures[].line_number`: exact line of failure
- `failures[].issue_type`: type of issue (`assertion_failure`, etc.)
- `repetitions`: if present, shows retry behavior (pass/fail sequence)
- `test_run_id`: the broader test run this execution belongs to
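If you script the investigation, the fields above can be decoded directly. A minimal sketch, assuming only the field names documented here (the overall JSON shape beyond `failures` is an assumption):

```swift
import Foundation

// Decodable sketch for the run payload; property names match the
// documented keys, so no key-decoding strategy is needed.
struct TestCaseRun: Decodable {
    struct Failure: Decodable {
        let message: String
        let path: String
        let line_number: Int
        let issue_type: String
    }
    let failures: [Failure]
}

// Sample payload for illustration; real output comes from
// `tuist test case run show <run-id> --json`.
let json = Data("""
{"failures": [{"message": "XCTAssertEqual failed",
               "path": "Tests/CacheTests.swift",
               "line_number": 42,
               "issue_type": "assertion_failure"}]}
""".utf8)

let run = try JSONDecoder().decode(TestCaseRun.self, from: json)
if let first = run.failures.first {
    // Prints the location to open in step 1 of Code Analysis.
    print("\(first.path):\(first.line_number) - \(first.message)")
}
```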

## Code Analysis


1. Open the file at `failures[0].path` and go to `failures[0].line_number`.
2. Read the full test function and its setup/teardown.
3. Identify which of the common flaky patterns below applies.
4. Check if the test shares state with other tests in the same suite.

## Common Flaky Patterns


### Timing and async issues


- Missing waits: Test checks a result before an async operation completes. Fix: use `await`, expectations with timeouts, or polling.
- Race conditions: Multiple concurrent operations access shared state. Fix: synchronize access or use serial queues.
- Hardcoded timeouts: `sleep(1)` or fixed delays that are too short on CI. Fix: use condition-based waits instead of fixed delays.
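A minimal sketch of the condition-based wait fix; the `Uploader` type and its completion callback are hypothetical stand-ins for the system under test:

```swift
import Foundation
import XCTest

// Stand-in for an async system under test (hypothetical).
final class Uploader {
    private(set) var isFinished = false
    var onFinish: (() -> Void)?
    func start() {
        DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
            self.isFinished = true
            self.onFinish?()
        }
    }
}

final class UploaderTests: XCTestCase {
    // Flaky version: `sleep(1)` can be too short on a loaded CI machine,
    // and wastes a full second on a fast one.
    //
    //   uploader.start()
    //   sleep(1)
    //   XCTAssertTrue(uploader.isFinished)

    // Fix: wait on the condition itself, with a generous timeout.
    // The wait returns as soon as the expectation is fulfilled.
    func testUploadFinishes() {
        let uploader = Uploader()
        let finished = expectation(description: "upload finishes")
        uploader.onFinish = { finished.fulfill() }
        uploader.start()
        wait(for: [finished], timeout: 10)
        XCTAssertTrue(uploader.isFinished)
    }
}
```

The timeout is an upper bound, not a delay: a generous value tolerates slow CI without slowing down the happy path.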

### Shared state


- Test pollution: One test modifies global/static state that another test depends on. Fix: reset state in setUp/tearDown or use unique instances per test.
- Singleton contamination: Shared singletons carry state between tests. Fix: inject dependencies or reset singletons.
- File system leftovers: Tests leave files that affect subsequent runs. Fix: use temporary directories and clean up.
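A sketch of the temporary-directory fix (the `CacheTests` name and manifest file are illustrative):

```swift
import Foundation
import XCTest

final class CacheTests: XCTestCase {
    private var directory: URL!

    override func setUpWithError() throws {
        // A fresh UUID-named directory per test means earlier runs can
        // never leave files behind that this test would observe.
        directory = FileManager.default.temporaryDirectory
            .appendingPathComponent(UUID().uuidString)
        try FileManager.default.createDirectory(
            at: directory, withIntermediateDirectories: true)
    }

    override func tearDownWithError() throws {
        // Clean up so later tests never see this test's files.
        try FileManager.default.removeItem(at: directory)
    }

    func testWritesManifest() throws {
        let manifest = directory.appendingPathComponent("manifest.json")
        try Data("{}".utf8).write(to: manifest)
        XCTAssertTrue(FileManager.default.fileExists(atPath: manifest.path))
    }
}
```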

### Environment dependencies


- Network calls: Tests hit real services that may be slow or unavailable. Fix: mock network calls.
- Date/time sensitivity: Tests depend on current time or timezone. Fix: inject a clock or freeze time.
- File system paths: Hardcoded paths that differ between environments. Fix: use relative paths or temp directories.
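A sketch of clock injection for the date/time case; the `Ticket` type is hypothetical:

```swift
import Foundation
import XCTest

struct Ticket {
    let issuedAt: Date
    let lifetime: TimeInterval

    // `now` defaults to the real clock in production code,
    // but a test can pass a frozen value instead of calling Date().
    func isExpired(now: () -> Date = Date.init) -> Bool {
        now().timeIntervalSince(issuedAt) > lifetime
    }
}

final class TicketTests: XCTestCase {
    func testExpiryIsDeterministic() {
        let issued = Date(timeIntervalSince1970: 1_000)
        let ticket = Ticket(issuedAt: issued, lifetime: 60)
        // Freeze "now" so the result never depends on wall-clock
        // time, timezone, or how long the test runner takes.
        XCTAssertFalse(ticket.isExpired(now: { issued.addingTimeInterval(59) }))
        XCTAssertTrue(ticket.isExpired(now: { issued.addingTimeInterval(61) }))
    }
}
```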

### Order dependence


- Implicit ordering: Test passes only when run after another test that sets up required state. Fix: make each test self-contained.
- Parallel execution conflicts: Tests that work in isolation but fail when run concurrently. Fix: use unique resources per test.
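One way to give each test a unique resource, sketched with `UserDefaults` as the shared resource (the suite and key names are illustrative):

```swift
import Foundation
import XCTest

final class QueueTests: XCTestCase {
    func testEnqueue() throws {
        // A UUID-suffixed suite name keeps parallel test processes
        // from colliding on the same defaults domain.
        let suiteName = "QueueTests.testEnqueue.\(UUID().uuidString)"
        let defaults = try XCTUnwrap(UserDefaults(suiteName: suiteName))
        // Remove the domain afterwards so no state leaks to other tests.
        defer { defaults.removePersistentDomain(forName: suiteName) }

        defaults.set(["job-1"], forKey: "pending")
        XCTAssertEqual(defaults.stringArray(forKey: "pending"), ["job-1"])
    }
}
```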

## Fix Implementation


After identifying the pattern:

1. Apply the smallest fix that addresses the root cause.
2. Do not refactor unrelated code.
3. If the fix requires a test utility (like a mock or helper), check if one already exists before creating a new one.

## Verification


Run the specific test repeatedly until failure using `xcodebuild`'s built-in repetition support:

```bash
xcodebuild test -workspace <workspace> -scheme <scheme> -only-testing <module>/<suite>/<test> -test-iterations <count> -run-tests-until-failure
```

This runs the test up to `<count>` times and stops at the first failure. Choose the iteration count based on how long the test takes: for fast unit tests use 50–100, for slower integration or acceptance tests use 2–5.

## Done Checklist


- Identified the root cause of flakiness
- Applied a targeted fix
- Verified the test passes consistently (multiple runs)
- Did not introduce new test dependencies or shared state
- Committed the fix with a descriptive message