compare-test-case

Compare Test Case

Quick Start

You'll typically receive a test case identifier and two branches. Follow these steps:
  1. Run tuist test case show <id-or-identifier> --json to get the test case metrics.
  2. Run tuist test case run list <identifier> --json to see runs across branches.
  3. Compare behavior between base and head branches.
  4. Inspect failures with tuist test case run show <run-id> --json.
  5. Summarize findings with root cause analysis.

Step 1: Resolve the Test Case

By ID or dashboard URL

bash
tuist test case show <test-case-id> --json

By identifier (Module/Suite/TestCase)

bash
tuist test case show Module/Suite/TestCase --json

If no test case is provided

Discover flaky or failing tests to investigate:
bash
tuist test case list --flaky --json --page-size 10
Key fields from the response:
  • id -- unique identifier for subsequent commands
  • name, module_name, suite_name -- the test identity
  • reliability_rate -- percentage of successful runs
  • flakiness_rate -- percentage of flaky runs in the last 30 days
  • total_runs / failed_runs -- volume context
  • is_flaky / is_quarantined -- current flags
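Triage over those fields can be sketched in Python. The field names match the list above, but the response envelope (a top-level array of test cases) and the sample values are assumptions for illustration:

```python
import json

# Sample output in the assumed shape of `tuist test case list --flaky --json`:
# a top-level array of test case objects (envelope and values are assumptions).
response = json.loads("""
[
  {"id": "tc-1", "name": "test_login", "module_name": "AuthModuleTests",
   "suite_name": "LoginTests", "reliability_rate": 85.0, "flakiness_rate": 12.0,
   "total_runs": 20, "failed_runs": 3, "is_flaky": true, "is_quarantined": false},
  {"id": "tc-2", "name": "test_logout", "module_name": "AuthModuleTests",
   "suite_name": "LoginTests", "reliability_rate": 98.0, "flakiness_rate": 1.0,
   "total_runs": 50, "failed_runs": 1, "is_flaky": false, "is_quarantined": false}
]
""")

# Investigate the least reliable, most flaky tests first.
candidates = sorted(response, key=lambda t: (t["reliability_rate"], -t["flakiness_rate"]))
for t in candidates:
    identifier = f"{t['module_name']}/{t['suite_name']}/{t['name']}"
    print(f"{identifier}: reliability {t['reliability_rate']}%, flakiness {t['flakiness_rate']}%")
```

The id of the top-ranked candidate is what you feed into the subsequent show and run list commands.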

Step 2: Get Runs on Each Branch

List test case runs filtered by the test case, and look at the git_branch field:
bash
tuist test case run list <identifier> --json --page-size 20
Separate runs by branch. For each branch, compute:
  • Pass rate: passed_runs / total_runs * 100
  • Average duration
  • Flaky run count
  • Most recent status
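The per-branch computation can be sketched as follows. Only git_branch is confirmed above; the status, duration, and flaky field names, plus the newest-first ordering of runs, are assumptions:

```python
from collections import defaultdict

# Sample runs in the assumed shape of `tuist test case run list --json` output.
runs = [
    {"git_branch": "main", "status": "success", "duration": 0.3, "flaky": False},
    {"git_branch": "main", "status": "success", "duration": 0.4, "flaky": False},
    {"git_branch": "feature/x", "status": "failure", "duration": 2.1, "flaky": True},
    {"git_branch": "feature/x", "status": "success", "duration": 0.5, "flaky": False},
]

# Group runs by branch, then compute the four metrics per branch.
by_branch = defaultdict(list)
for run in runs:
    by_branch[run["git_branch"]].append(run)

metrics = {}
for branch, branch_runs in by_branch.items():
    passed = sum(1 for r in branch_runs if r["status"] == "success")
    metrics[branch] = {
        "pass_rate": passed / len(branch_runs) * 100,
        "avg_duration": sum(r["duration"] for r in branch_runs) / len(branch_runs),
        "flaky_runs": sum(1 for r in branch_runs if r["flaky"]),
        "last_status": branch_runs[0]["status"],  # assumes newest-first ordering
    }
```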

Defaults

  • If no base branch is provided, use the project's default branch (usually main).
  • If no head branch is provided, detect the current git branch.

Step 3: Compare Branch Behavior

Metric          Base branch    Head branch    Verdict
Pass rate       e.g. 100%      e.g. 60%       REGRESSION
Avg duration    e.g. 0.5s      e.g. 2.1s      REGRESSION
Flaky runs      0              3              NEW FLAKINESS
Last status     success        failure        REGRESSION
Classify the change:
  • Newly failing: 100% pass rate on base, <100% on head
  • Newly flaky: No flaky runs on base, flaky runs on head
  • Duration regression: >50% increase in average duration
  • Fixed: Failing on base, passing on head
  • Stable: Same behavior on both branches
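The classification rules above can be sketched as a small function over the per-branch metrics. The rules do not specify a precedence when several apply (a test can be both newly failing and newly flaky), so the ordering below is a judgment call, not part of the spec:

```python
def classify(base, head):
    """Classify the change between per-branch metric dicts
    (keys: pass_rate, avg_duration, flaky_runs)."""
    if base["pass_rate"] < 100 and head["pass_rate"] == 100:
        return "FIXED"
    if base["pass_rate"] == 100 and head["pass_rate"] < 100:
        return "NEWLY FAILING"
    if base["flaky_runs"] == 0 and head["flaky_runs"] > 0:
        return "NEWLY FLAKY"
    if head["avg_duration"] > base["avg_duration"] * 1.5:  # >50% increase
        return "DURATION REGRESSION"
    return "STABLE"

# Example: a test that passed 100% on base but only 60% on head.
verdict = classify(
    {"pass_rate": 100.0, "avg_duration": 0.5, "flaky_runs": 0},
    {"pass_rate": 60.0, "avg_duration": 2.1, "flaky_runs": 3},
)
print(verdict)  # NEWLY FAILING
```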

Step 4: Inspect Failures

For each failing run on the head branch:
bash
tuist test case run show <test-case-run-id> --json
Examine:
  • failures[].message -- the assertion or error message
  • failures[].path -- source file path
  • failures[].line_number -- exact line of failure
  • failures[].issue_type -- type of issue
  • repetitions -- shows retry behavior (e.g., pass-fail-pass means flaky)
  • crash_report -- crash data if the test runner crashed
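A sketch of pulling those fields out of a run payload. The failures[] keys come from the list above; representing repetitions as a list of status strings, and the sample values, are assumptions:

```python
# Sample payload in the assumed shape of `tuist test case run show --json` output.
run = {
    "failures": [
        {"message": "Expected status 401, got nil",
         "path": "Tests/AuthModuleTests/LoginTests.swift",
         "line_number": 42,
         "issue_type": "assertion_failure"},
    ],
    "repetitions": ["success", "failure", "success"],
    "crash_report": None,
}

# Print each failure as path:line -- message, ready for the summary.
for failure in run["failures"]:
    print(f"{failure['path']}:{failure['line_number']} -- {failure['message']}")

# Mixed statuses across repetitions (pass-fail-pass) indicate flakiness.
is_intermittent = len(set(run["repetitions"])) > 1
```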

Step 5: Identify Root Cause

Based on the comparison:

Newly failing

  • Check commits between base and head branches for changes to the test file or the code under test.
  • Look at the failure message for clues about what changed.

Newly flaky

  • Common patterns: timing/async issues, shared state, environment dependencies.
  • Check if repetitions show intermittent pass/fail patterns.
  • See the fix-flaky-tests skill for detailed flaky test analysis patterns.

Duration regression

  • Check if setup/teardown time increased.
  • Check if the test is doing more work (new assertions, larger data sets).
  • Check if a dependency became slower.

Summary Format

Produce a summary with:
  1. Test case info: Name, module, suite, overall reliability.
  2. Base branch behavior: Pass rate, avg duration, flaky count.
  3. Head branch behavior: Pass rate, avg duration, flaky count.
  4. Verdict: What changed and classification.
  5. Root cause: Hypothesis based on failure analysis.
  6. Recommendations: Specific file paths, line numbers, and fix suggestions.
Example:
Test Case Comparison: AuthModuleTests/LoginTests/test_login_with_expired_token

Overall reliability: 85% (was 100% before head branch)

Base (main):
  Pass rate: 100% (15/15 runs)
  Avg duration: 0.3s
  Flaky: No

Head (feature/auth-refactor):
  Pass rate: 60% (3/5 runs)
  Avg duration: 0.5s
  Flaky: Yes (2 flaky runs)

Verdict: NEWLY FLAKY -- test was stable on main but intermittently fails on feature branch

Root cause: The auth refactor introduced an async token refresh that races with the
test's synchronous assertion. Failures show "Expected status 401, got nil" at
Tests/AuthModuleTests/LoginTests.swift:42, suggesting the response arrives before
the token refresh completes.

Recommendations:
- Add an await/expectation before the assertion at LoginTests.swift:42
- Consider mocking the token refresh to make the test deterministic

Done Checklist

  • Resolved the test case identity
  • Gathered runs on both branches
  • Compared pass rates, durations, and flakiness
  • Inspected failure details for failing runs
  • Identified root cause with file paths and line numbers
  • Provided actionable fix recommendations