test-anti-patterns


# Test Anti-Pattern Detection


Quick, pragmatic analysis of .NET test code for anti-patterns and quality issues that undermine test reliability, maintainability, and diagnostic value.

## When to Use


- User asks to review test quality or find test smells
- User wants to know why tests are flaky or unreliable
- User asks "are my tests good?" or "what's wrong with my tests?"
- User requests a test audit or test code review
- User wants to improve existing test code

## When Not to Use


- User wants to write new tests from scratch (use `writing-mstest-tests`)
- User wants to run or execute tests (use `run-tests`)
- User wants to migrate between test frameworks or versions (use migration skills)
- User wants to measure code coverage (out of scope)
- User wants a deep formal test smell audit with academic taxonomy and extended catalog (use `exp-test-smell-detection`)

## Inputs


| Input | Required | Description |
| --- | --- | --- |
| Test code | Yes | One or more test files or classes to analyze |
| Production code | No | The code under test, for context on what tests should verify |
| Specific concern | No | A focused area like "flakiness" or "naming" to narrow the review |

## Workflow


### Step 1: Gather the test code


Read the test files the user wants reviewed. If the user points to a directory or project, scan for all test files using the framework-specific markers in the `dotnet-test-frameworks` skill (e.g., `[TestClass]`, `[Fact]`, `[Test]`).

If production code is available, read it too -- this is critical for detecting tests that are coupled to implementation details rather than behavior.
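The scan described above can be sketched in C#. This is a minimal illustration, not the skill's actual implementation; the `tests` directory path is an assumption to adjust to the user's project layout:

```csharp
using System.IO;
using System.Linq;

// Sketch: find candidate test files by framework-specific attribute markers.
// A plain substring check is enough for a first pass; false positives (e.g.,
// markers inside comments) can be filtered during the detailed review.
var markers = new[] { "[TestClass]", "[Fact]", "[Test]" };
var testFiles = Directory
    .EnumerateFiles("tests", "*.cs", SearchOption.AllDirectories)
    .Where(path => markers.Any(m => File.ReadAllText(path).Contains(m)))
    .ToList();
```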

### Step 2: Scan for anti-patterns


Check each test file against the anti-pattern catalog below. Report findings grouped by severity.

#### Critical -- Tests that give false confidence


| Anti-Pattern | What to Look For |
| --- | --- |
| No assertions | Test methods that execute code but never assert anything. A passing test without assertions proves nothing. |
| Swallowed exceptions | `try { ... } catch { }` or `catch (Exception)` without rethrowing or asserting. Failures are silently hidden. |
| Assert in catch block only | `try { Act(); } catch (Exception ex) { Assert.Fail(ex.Message); }` -- use `Assert.ThrowsException` or equivalent instead. The test passes when no exception is thrown, even if the result is wrong. |
| Always-true assertions | `Assert.IsTrue(true)`, `Assert.AreEqual(x, x)`, or conditions that can never fail. |
| Commented-out assertions | Assertions that were disabled but the test still runs, giving the illusion of coverage. |
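As a before/after sketch of the "assert in catch block only" pattern (combined with a missing result assertion), assuming a hypothetical `Calculator` type inside an MSTest test class:

```csharp
// BEFORE (anti-pattern): the only assertion lives in the catch block, so the
// test passes whenever no exception is thrown -- even if the result is wrong.
[TestMethod]
public void Add_TwoNumbers_AntiPattern()
{
    try
    {
        var result = Calculator.Add(2, 3); // result is never checked
    }
    catch (Exception ex)
    {
        Assert.Fail(ex.Message);
    }
}

// AFTER: let unexpected exceptions fail the test naturally, and assert the outcome.
[TestMethod]
public void Add_TwoNumbers_ReturnsSum()
{
    var result = Calculator.Add(2, 3);

    Assert.AreEqual(5, result);
}
```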

#### High -- Tests likely to cause pain


| Anti-Pattern | What to Look For |
| --- | --- |
| Flakiness indicators | `Thread.Sleep(...)`, `Task.Delay(...)` for synchronization, `DateTime.Now`/`DateTime.UtcNow` without abstraction, `Random` without a seed, environment-dependent paths. |
| Test ordering dependency | Static mutable fields modified across tests, `[TestInitialize]` that doesn't fully reset state, tests that fail when run individually but pass in the suite (or vice versa). |
| Over-mocking | More mock setup lines than actual test logic. Verifying exact call sequences on mocks rather than outcomes. Mocking types the test owns. For a deep mock audit, use `exp-mock-usage-analysis`. |
| Implementation coupling | Testing private methods via reflection, asserting on internal state, verifying exact method call counts on collaborators instead of observable behavior. |
| Broad exception assertions | `Assert.ThrowsException<Exception>(...)` instead of the specific exception type. Also: `[ExpectedException(typeof(Exception))]`. |
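A common fix for `DateTime.Now` flakiness is a clock abstraction. This is a minimal sketch under stated assumptions: `IClock`, `FakeClock`, and `AuthToken` are all hypothetical names introduced only for illustration:

```csharp
// Hypothetical abstraction: inject the clock so tests control time deterministically.
public interface IClock
{
    DateTime UtcNow { get; }
}

public sealed class FakeClock : IClock
{
    public DateTime UtcNow { get; set; } =
        new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
}

// Hypothetical production type that takes the clock as a dependency.
public sealed class AuthToken
{
    private readonly IClock _clock;
    private readonly DateTime _expiresAt;

    public AuthToken(IClock clock, TimeSpan lifetime)
    {
        _clock = clock;
        _expiresAt = clock.UtcNow + lifetime;
    }

    public bool IsExpired => _clock.UtcNow >= _expiresAt;
}

// The test pins time instead of racing the wall clock with Thread.Sleep.
[TestMethod]
public void Token_ExpiresAfterOneHour()
{
    var clock = new FakeClock();
    var token = new AuthToken(clock, lifetime: TimeSpan.FromHours(1));

    clock.UtcNow = clock.UtcNow.AddMinutes(61);

    Assert.IsTrue(token.IsExpired);
}
```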

#### Medium -- Maintainability and clarity issues


| Anti-Pattern | What to Look For |
| --- | --- |
| Poor naming | Test names like `Test1`, `TestMethod`, names that don't describe the scenario or expected outcome. Good: `Add_NegativeNumber_ThrowsArgumentException`. |
| Magic values | Unexplained numbers or strings in arrange/assert: `Assert.AreEqual(42, result)` -- what does 42 mean? |
| Duplicate tests | Three or more test methods with near-identical bodies that differ only in a single input value. Should be data-driven (`[DataRow]`, `[Theory]`, `[TestCase]`). For a detailed duplication analysis, use `exp-test-maintainability`. Note: two tests covering distinct boundary conditions (e.g., zero vs. negative) are NOT duplicates -- separate tests for different edge cases provide clearer failure diagnostics and are a valid practice. |
| Giant tests | Test methods exceeding ~30 lines or testing multiple behaviors at once. Hard to diagnose when they fail. |
| Assertion messages that repeat the assertion | `Assert.AreEqual(expected, actual, "Expected and actual are not equal")` adds no information. Messages should describe the business meaning. |
| Missing AAA separation | Arrange, Act, Assert phases are interleaved or indistinguishable. |
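Near-identical tests can be consolidated into one data-driven method. A minimal MSTest sketch, again assuming a hypothetical `Calculator` type:

```csharp
// Replaces three copy-pasted tests that differed only in their input values.
// [DataTestMethod] is used for broad MSTest version compatibility.
[DataTestMethod]
[DataRow(1, 2, 3)]
[DataRow(0, 5, 5)]
[DataRow(-2, 2, 0)]
public void Add_KnownInputs_ReturnsExpectedSum(int a, int b, int expected)
{
    var result = Calculator.Add(a, b);

    Assert.AreEqual(expected, result);
}
```

Each `[DataRow]` still reports as its own test case in the results, so failure diagnostics stay per-input.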

#### Low -- Style and hygiene


| Anti-Pattern | What to Look For |
| --- | --- |
| Unused test infrastructure | `[TestInitialize]`/`[SetUp]` that does nothing, test helper methods that are never called. |
| IDisposable not disposed | Test creates `HttpClient`, `Stream`, or other disposable objects without `using` or cleanup. |
| Console.WriteLine debugging | Leftover `Console.WriteLine` or `Debug.WriteLine` statements used during test development. |
| Inconsistent naming convention | Mix of naming styles in the same test class (e.g., some use `Method_Scenario_Expected`, others use `ShouldDoSomething`). |
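The disposal fix is usually a one-line change to a `using` declaration. A sketch, where `ReportExporter` is a hypothetical type under test:

```csharp
// BEFORE (anti-pattern): the stream is never disposed, and leaks if the
// assertion throws.
[TestMethod]
public void Export_WritesHeader_AntiPattern()
{
    var stream = new MemoryStream();
    ReportExporter.Export(stream);
    Assert.IsTrue(stream.Length > 0);
}

// AFTER: a using declaration (C# 8+) disposes the stream at the end of the
// method, even when the test fails.
[TestMethod]
public void Export_WritesHeader()
{
    using var stream = new MemoryStream();

    ReportExporter.Export(stream);

    Assert.IsTrue(stream.Length > 0);
}
```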

### Step 3: Calibrate severity honestly


Before reporting, re-check each finding against these severity rules:

- Critical/High: Only for issues that cause tests to give false confidence or be unreliable. A test that always passes regardless of correctness is Critical. Flaky shared state is High.
- Medium: Only for issues that actively harm maintainability -- 5+ nearly-identical tests, truly meaningless names like `Test1`.
- Low: Cosmetic naming mismatches, minor style preferences, assertion messages that could be better. When in doubt, rate Low.
- Not an issue: Separate tests for distinct boundary conditions (zero vs. negative vs. null). Explicit per-test setup instead of `[TestInitialize]` (this improves isolation). Tests that are short and clear but could theoretically be consolidated.

IMPORTANT: If the tests are well-written, say so clearly up front. Do not inflate severity to justify the review. A review that finds zero Critical/High issues and only minor Low suggestions is a valid and valuable outcome. Lead with what the tests do well.

### Step 4: Report findings


Present findings in this structure:

1. Summary -- Total issues found, broken down by severity (Critical / High / Medium / Low). If tests are well-written, lead with that assessment.
2. Critical and High findings -- List each with:
   - The anti-pattern name
   - The specific location (file, method name, line)
   - A brief explanation of why it's a problem
   - A concrete fix (show before/after code when helpful)
3. Medium and Low findings -- Summarize in a table unless the user wants full detail
4. Positive observations -- Call out things the tests do well (sealed classes, specific exception types, data-driven tests, clear AAA structure, proper use of fakes, good naming). Don't only report negatives.

### Step 5: Prioritize recommendations


If there are many findings, recommend which to fix first:

1. Critical -- Fix immediately; these tests may be giving false confidence
2. High -- Fix soon; these cause flakiness or maintenance burden
3. Medium/Low -- Fix opportunistically during related edits

## Validation


- Every finding includes a specific location (not just a general warning)
- Every Critical/High finding includes a concrete fix
- Report covers all categories (assertions, isolation, naming, structure)
- Positive observations are included alongside problems
- Recommendations are prioritized by severity

## Common Pitfalls


| Pitfall | Solution |
| --- | --- |
| Reporting style issues as critical | Naming and formatting are Medium/Low, never Critical |
| Suggesting rewrites instead of targeted fixes | Show minimal diffs -- change the assertion, not the whole test |
| Flagging intentional design choices | If `Thread.Sleep` is in an integration test exercising actual timing, that's not an anti-pattern. Consider context. |
| Inventing false positives on clean code | If tests follow best practices, say so. A review finding "0 Critical, 0 High, 1 Low" is perfectly valid. Don't inflate findings to justify the review. |
| Flagging separate boundary tests as duplicates | Two tests for zero and negative inputs test different edge cases. Only flag as duplicates when 3+ tests have truly identical bodies differing by a single value. |
| Rating cosmetic issues as Medium | Naming mismatches (e.g., the method name says `ArgumentException` but the test asserts `ArgumentOutOfRangeException`) are Low, not Medium -- the test still works correctly. |
| Ignoring the test framework | xUnit uses `[Fact]`/`[Theory]`, NUnit uses `[Test]`/`[TestCase]`, MSTest uses `[TestMethod]`/`[DataRow]` -- use correct terminology |
| Missing the forest for the trees | If 80% of tests have no assertions, lead with that systemic issue rather than listing every instance |