ai-generated-ut-code-review

AI UT Code Review

Overview

Review AI-generated unit tests for effectiveness, coverage, assertions, negative cases, determinism, and maintainability. Output a 0-10 score, a risk level, and a must-fix checklist. Overall line coverage must be >= 80%; otherwise risk is at least High.

When to Use

AI-generated UT/test code review or quality evaluation
Need scoring, risk level, or must-fix checklist
Questions about coverage or assertion validity

Workflow

Confirm tests target the intended business code and key paths.
Check overall line coverage (>= 80% required).
Inspect assertions for behavioral validity; flag missing/ineffective assertions.
Verify negative/edge cases and determinism (no env/time dependency).
Score by rubric, assign risk, list must-fix items with evidence.

Scoring (0-10)

Each dimension 0-2 points. Sum = total score.

| Dimension | 0 | 1 | 2 |
| --- | --- | --- | --- |
| Coverage | < 80% | 80%+ but shallow | 80%+ and meaningful |
| Assertion Quality | No/invalid assertions | Some weak assertions | Behavior-anchored assertions |
| Negative & Edge | Missing | Partial | Comprehensive |
| Data & Isolation | Flaky/env-dependent | Mixed | Deterministic, isolated |
| Maintainability | Hard to read/modify | Mixed quality | Clear structure & naming |

Risk Levels

Blocker: Coverage < 80% AND key paths untested, or tests have no meaningful assertions
High: Coverage < 80% OR assertions largely ineffective
Medium: Coverage OK but weak edge cases or fragile design
Low: Minor improvements
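
The rubric and the risk rules can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the names `total_score`, `Findings`, and `risk_level` are hypothetical, and qualitative judgments such as "key paths untested" or "assertions largely ineffective" are assumed to be supplied by the reviewer as flags rather than derived automatically.

```python
from dataclasses import dataclass

# The five rubric dimensions, each scored 0, 1, or 2.
DIMENSIONS = (
    "coverage",
    "assertion_quality",
    "negative_edge",
    "data_isolation",
    "maintainability",
)


def total_score(scores: dict[str, int]) -> int:
    """Sum the five dimension scores into a 0-10 total."""
    for name in DIMENSIONS:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")
    return sum(scores[name] for name in DIMENSIONS)


@dataclass
class Findings:
    """Reviewer findings (hypothetical structure for this sketch)."""
    line_coverage_pct: float
    key_paths_untested: bool = False
    no_meaningful_assertions: bool = False
    assertions_largely_ineffective: bool = False
    weak_edges_or_fragile_design: bool = False


def risk_level(f: Findings) -> str:
    """Apply the risk rules from most to least severe."""
    low_coverage = f.line_coverage_pct < 80.0
    if (low_coverage and f.key_paths_untested) or f.no_meaningful_assertions:
        return "Blocker"
    if low_coverage or f.assertions_largely_ineffective:
        return "High"
    if f.weak_edges_or_fragile_design:
        return "Medium"
    return "Low"
```

Checking the most severe level first means a review that matches several rules always reports the highest applicable risk.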

Must-Fix Checklist

Overall line coverage >= 80%
Each test has at least one behavior-relevant assertion
Negative/exception cases exist for core logic
Tests are deterministic and repeatable
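
As an illustration of the determinism item, the sketch below shows one common way to keep time out of a test: the function under test receives the current time as a parameter, and the test passes a fixed value. The function `is_expired` and the test class are hypothetical and exist only for this example.

```python
import unittest
from datetime import datetime, timezone


def is_expired(expires_at: datetime, now: datetime) -> bool:
    """Hypothetical function under test.

    The current time is passed in instead of being read from the
    system clock, so callers (and tests) control it.
    """
    return now >= expires_at


class IsExpiredTest(unittest.TestCase):
    # A fixed instant: the test gives the same result on every run.
    FIXED_NOW = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

    def test_not_expired_before_deadline(self):
        deadline = datetime(2024, 1, 1, 12, 0, 1, tzinfo=timezone.utc)
        self.assertFalse(is_expired(deadline, now=self.FIXED_NOW))

    def test_expired_at_deadline(self):
        # Boundary case: the deadline itself counts as expired.
        self.assertTrue(is_expired(self.FIXED_NOW, now=self.FIXED_NOW))
```

A test that instead calls `datetime.now()` inside the assertion depends on when it runs, which is the kind of dependency this checklist item is meant to catch.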

AI-Generated Test Pitfalls (Check Explicitly)

No assertions or assertions unrelated to behavior (e.g., only not-null)
Over-mocking hides real behavior
Only happy-path coverage
Tests depend on time/network/env
Missing verification of side effects
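
The contrast between a weak test and behavior-anchored tests might look like the sketch below. The `parse_config` function is a hypothetical stand-in written for this example (a JSON parser that requires a positive integer `port`); it is not taken from any real code base.

```python
import json
import unittest


def parse_config(text: str) -> dict:
    """Hypothetical function under test."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on malformed input
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    port = data.get("port")
    if not isinstance(port, int) or isinstance(port, bool) or port <= 0:
        raise ValueError("port must be a positive integer")
    return {"port": port, "debug": bool(data.get("debug", False))}


class WeakTest(unittest.TestCase):
    def test_parse(self):
        # Pitfall: a not-None check passes for almost any implementation
        # that does not crash, so it says little about behavior.
        self.assertIsNotNone(parse_config('{"port": 8080}'))


class BehaviorAnchoredTest(unittest.TestCase):
    def test_returns_parsed_values_with_defaults(self):
        # Asserts the exact output, including the default for "debug".
        self.assertEqual(
            parse_config('{"port": 8080}'),
            {"port": 8080, "debug": False},
        )

    def test_rejects_malformed_json(self):
        # Negative case: truncated input must raise, not return garbage.
        with self.assertRaises(ValueError):
            parse_config('{"port": ')

    def test_rejects_non_positive_port(self):
        # Edge case: 0 is an integer but not a valid port here.
        with self.assertRaises(ValueError):
            parse_config('{"port": 0}')
```

Both classes pass and both execute lines of `parse_config`, but only the second would fail if the function returned wrong values or silently accepted malformed input.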

Output Format (Required, Semi-fixed)

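`Score`: x/10 — Coverage x, Assertion Quality x, Negative & Edge x, Data & Isolation x, Maintainability x
`Risk`: Low/Medium/High/Blocker — brief reason (1 line)
`Must-fix`:
- [action + evidence]
- [action + evidence]
`Key Evidence`:
- Cite specific test case names or a coverage report summary (1-2 items)
`Notes`:
- Minimal fix suggestion or alternative (1-2 lines)

Rules:

Coverage < 80%: risk is at least High, and the coverage gap must be listed under `Must-fix`
Missing or ineffective assertions raise the risk level and must be listed under `Must-fix`
Provide at least 2 pieces of evidence; if evidence is insufficient, say so and lower the score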
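
These rules can be checked mechanically before a review is submitted. The sketch below is one possible approach under stated assumptions: the function `rule_violations` and its parameters are hypothetical, and the keyword matching on Must-fix items ("coverage", "assert") is a crude heuristic chosen for brevity, not a reliable classifier.

```python
# Risk levels ordered from least to most severe.
RISK_ORDER = ["Low", "Medium", "High", "Blocker"]


def rule_violations(
    line_coverage_pct: float,
    risk: str,
    must_fix: list[str],
    evidence: list[str],
    assertions_missing_or_invalid: bool,
) -> list[str]:
    """Return the rules that a drafted review breaks (empty list if none)."""
    problems = []
    must_fix_text = " ".join(must_fix).lower()

    if line_coverage_pct < 80:
        if RISK_ORDER.index(risk) < RISK_ORDER.index("High"):
            problems.append("coverage < 80% requires risk of at least High")
        if "coverage" not in must_fix_text:
            problems.append("coverage gap must be listed under Must-fix")

    # Heuristic: look for the word stem "assert" in the Must-fix items.
    if assertions_missing_or_invalid and "assert" not in must_fix_text:
        problems.append("assertion problems must be listed under Must-fix")

    if len(evidence) < 2:
        problems.append("fewer than 2 pieces of evidence: explain and lower the score")

    return problems
```

For instance, a draft with 62% coverage, risk High, and Must-fix items that mention only assertions and negative cases would be flagged for omitting the coverage gap.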

Common Mistakes

Reporting coverage only, without evaluating assertion validity
Treating log output as an assertion
Ignoring failure and exception paths

Example (Concise)

Score: 4/10 (Coverage 0, Assertion 0, Negative 1, Data 2, Maintainability 1)
Risk: High
Must-fix:
- Line coverage is 62%, below the required 80%
- Tests for `parseConfig()` contain no behavior assertions (only logs)
- No negative cases for malformed input
Key Evidence:
- `parseConfig()` tests only assert no crash
- Coverage report shows 62% lines
Notes:
- Add assertions on outputs and side effects; add invalid input tests.
