compare-test-case

Compare Test Case

Quick Start

You'll typically receive a test case identifier and two branches. Follow these steps:
  1. Run tuist test case show <id-or-identifier> --json to get the test case metrics.
  2. Run tuist test case run list <identifier> --json to see runs across branches.
  3. Compare behavior between base and head branches.
  4. Inspect failures with tuist test case run show <run-id> --json.
  5. Summarize findings with root cause analysis.

Step 1: Resolve the Test Case

By ID or dashboard URL

bash
tuist test case show <test-case-id> --json

By identifier (Module/Suite/TestCase)

bash
tuist test case show Module/Suite/TestCase --json

If no test case is provided

Discover flaky or failing tests to investigate:
bash
tuist test case list --flaky --json --page-size 10
Key fields from the response:
  • id -- unique identifier for subsequent commands
  • name, module_name, suite_name -- the test identity
  • reliability_rate -- percentage of successful runs
  • flakiness_rate -- percentage of flaky runs in the last 30 days
  • total_runs / failed_runs -- volume context
  • is_flaky / is_quarantined -- current flags
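Triage over those fields can be sketched in Python. The field names match the list above, but the response envelope (a top-level array of test cases) and the sample values are assumptions for illustration:

```python
import json

# Sample output in the assumed shape of `tuist test case list --flaky --json`:
# a top-level array of test case objects (envelope and values are assumptions).
response = json.loads("""
[
  {"id": "tc-1", "name": "test_login", "module_name": "AuthModuleTests",
   "suite_name": "LoginTests", "reliability_rate": 85.0, "flakiness_rate": 12.0,
   "total_runs": 20, "failed_runs": 3, "is_flaky": true, "is_quarantined": false},
  {"id": "tc-2", "name": "test_logout", "module_name": "AuthModuleTests",
   "suite_name": "LoginTests", "reliability_rate": 98.0, "flakiness_rate": 1.0,
   "total_runs": 50, "failed_runs": 1, "is_flaky": false, "is_quarantined": false}
]
""")

# Investigate the least reliable, most flaky tests first.
candidates = sorted(response, key=lambda t: (t["reliability_rate"], -t["flakiness_rate"]))
for t in candidates:
    identifier = f"{t['module_name']}/{t['suite_name']}/{t['name']}"
    print(f"{identifier}: reliability {t['reliability_rate']}%, flakiness {t['flakiness_rate']}%")
```

The id of the top-ranked candidate is what you feed into the subsequent show and run list commands.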

Step 2: Get Runs on Each Branch

List test case runs filtered by the test case, and look at the git_branch field:
bash
tuist test case run list <identifier> --json --page-size 20
Separate runs by branch. For each branch, compute:
  • Pass rate: passed_runs / total_runs * 100
  • Average duration
  • Flaky run count
  • Most recent status
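The per-branch computation can be sketched as follows. Only git_branch is confirmed above; the status, duration, and flaky field names, plus the newest-first ordering of runs, are assumptions:

```python
from collections import defaultdict

# Sample runs in the assumed shape of `tuist test case run list --json` output.
runs = [
    {"git_branch": "main", "status": "success", "duration": 0.3, "flaky": False},
    {"git_branch": "main", "status": "success", "duration": 0.4, "flaky": False},
    {"git_branch": "feature/x", "status": "failure", "duration": 2.1, "flaky": True},
    {"git_branch": "feature/x", "status": "success", "duration": 0.5, "flaky": False},
]

# Group runs by branch, then compute the four metrics per branch.
by_branch = defaultdict(list)
for run in runs:
    by_branch[run["git_branch"]].append(run)

metrics = {}
for branch, branch_runs in by_branch.items():
    passed = sum(1 for r in branch_runs if r["status"] == "success")
    metrics[branch] = {
        "pass_rate": passed / len(branch_runs) * 100,
        "avg_duration": sum(r["duration"] for r in branch_runs) / len(branch_runs),
        "flaky_runs": sum(1 for r in branch_runs if r["flaky"]),
        "last_status": branch_runs[0]["status"],  # assumes newest-first ordering
    }
```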

Defaults

  • If no base branch is provided, use the project's default branch (usually main).
  • If no head branch is provided, detect the current git branch.

Step 3: Compare Branch Behavior

Metric          Base branch    Head branch    Verdict
Pass rate       e.g. 100%      e.g. 60%       REGRESSION
Avg duration    e.g. 0.5s      e.g. 2.1s      REGRESSION
Flaky runs      0              3              NEW FLAKINESS
Last status     success        failure        REGRESSION
Classify the change:
  • Newly failing: 100% pass rate on base, <100% on head
  • Newly flaky: No flaky runs on base, flaky runs on head
  • Duration regression: >50% increase in average duration
  • Fixed: Failing on base, passing on head
  • Stable: Same behavior on both branches
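The classification rules above can be sketched as a small function over the per-branch metrics. The rules do not specify a precedence when several apply (a test can be both newly failing and newly flaky), so the ordering below is a judgment call, not part of the spec:

```python
def classify(base, head):
    """Classify the change between per-branch metric dicts
    (keys: pass_rate, avg_duration, flaky_runs)."""
    if base["pass_rate"] < 100 and head["pass_rate"] == 100:
        return "FIXED"
    if base["pass_rate"] == 100 and head["pass_rate"] < 100:
        return "NEWLY FAILING"
    if base["flaky_runs"] == 0 and head["flaky_runs"] > 0:
        return "NEWLY FLAKY"
    if head["avg_duration"] > base["avg_duration"] * 1.5:  # >50% increase
        return "DURATION REGRESSION"
    return "STABLE"

# Example: a test that passed 100% on base but only 60% on head.
verdict = classify(
    {"pass_rate": 100.0, "avg_duration": 0.5, "flaky_runs": 0},
    {"pass_rate": 60.0, "avg_duration": 2.1, "flaky_runs": 3},
)
print(verdict)  # NEWLY FAILING
```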

Step 4: Inspect Failures

For each failing run on the head branch:
bash
tuist test case run show <test-case-run-id> --json
Examine:
  • failures[].message -- the assertion or error message
  • failures[].path -- source file path
  • failures[].line_number -- exact line of failure
  • failures[].issue_type -- type of issue
  • repetitions -- shows retry behavior (e.g., pass-fail-pass means flaky)
  • crash_report -- crash data if the test runner crashed
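A sketch of pulling those fields out of a run payload. The failures[] keys come from the list above; representing repetitions as a list of status strings, and the sample values, are assumptions:

```python
# Sample payload in the assumed shape of `tuist test case run show --json` output.
run = {
    "failures": [
        {"message": "Expected status 401, got nil",
         "path": "Tests/AuthModuleTests/LoginTests.swift",
         "line_number": 42,
         "issue_type": "assertion_failure"},
    ],
    "repetitions": ["success", "failure", "success"],
    "crash_report": None,
}

# Print each failure as path:line -- message, ready for the summary.
for failure in run["failures"]:
    print(f"{failure['path']}:{failure['line_number']} -- {failure['message']}")

# Mixed statuses across repetitions (pass-fail-pass) indicate flakiness.
is_intermittent = len(set(run["repetitions"])) > 1
```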

Step 5: Identify Root Cause

Based on the comparison:

Newly failing

  • Check commits between base and head branches for changes to the test file or the code under test.
  • Look at the failure message for clues about what changed.

Newly flaky

  • Common patterns: timing/async issues, shared state, environment dependencies.
  • Check if repetitions show intermittent pass/fail patterns.
  • See the fix-flaky-tests skill for detailed flaky test analysis patterns.

Duration regression

  • Check if setup/teardown time increased.
  • Check if the test is doing more work (new assertions, larger data sets).
  • Check if a dependency became slower.

Summary Format

Produce a summary with:
  1. Test case info: Name, module, suite, overall reliability.
  2. Base branch behavior: Pass rate, avg duration, flaky count.
  3. Head branch behavior: Pass rate, avg duration, flaky count.
  4. Verdict: What changed and classification.
  5. Root cause: Hypothesis based on failure analysis.
  6. Recommendations: Specific file paths, line numbers, and fix suggestions.
Example:
Test Case Comparison: AuthModuleTests/LoginTests/test_login_with_expired_token

Overall reliability: 85% (was 100% before head branch)

Base (main):
  Pass rate: 100% (15/15 runs)
  Avg duration: 0.3s
  Flaky: No

Head (feature/auth-refactor):
  Pass rate: 60% (3/5 runs)
  Avg duration: 0.5s
  Flaky: Yes (2 flaky runs)

Verdict: NEWLY FLAKY -- test was stable on main but intermittently fails on feature branch

Root cause: The auth refactor introduced an async token refresh that races with the
test's synchronous assertion. Failures show "Expected status 401, got nil" at
Tests/AuthModuleTests/LoginTests.swift:42, suggesting the response arrives before
the token refresh completes.

Recommendations:
- Add an await/expectation before the assertion at LoginTests.swift:42
- Consider mocking the token refresh to make the test deterministic

Done Checklist

  • Resolved the test case identity
  • Gathered runs on both branches
  • Compared pass rates, durations, and flakiness
  • Inspected failure details for failing runs
  • Identified root cause with file paths and line numbers
  • Provided actionable fix recommendations