sherlock-review

Sherlock Review

<default_to_action>
When investigating code claims:

1. OBSERVE: Gather all evidence (code, tests, history, behavior)
2. DEDUCE: What does evidence actually show vs. what was claimed?
3. ELIMINATE: Rule out what cannot be true
4. CONCLUDE: Does evidence support the claim?
5. DOCUMENT: Findings with proof, not assumptions

**The 3-Step Investigation:**

```bash
# 1. OBSERVE: Gather evidence
git diff <commit>
npm test -- --coverage

# 2. DEDUCE: Compare claim vs reality
# Does code match description?
# Do tests prove the fix/feature?

# 3. CONCLUDE: Verdict with evidence
# SUPPORTED / PARTIALLY SUPPORTED / NOT SUPPORTED
```

**Holmesian Principles:**
- "Data! Data! Data!" - Collect before concluding
- "Eliminate the impossible" - What cannot be true?
- "You see, but do not observe" - Run code, don't just read
- Trust only reproducible evidence
</default_to_action>

Quick Reference Card

Evidence Collection Checklist

| Category | What to Check | How |
|----------|---------------|-----|
| Claim | PR description, commit messages | Read thoroughly |
| Code | Actual file changes | `git diff` |
| Tests | Coverage, assertions | Run independently |
| Behavior | Runtime output | Execute locally |
| Timeline | When things happened | `git log`, `git blame` |

Verdict Levels

| Verdict | Meaning |
|---------|---------|
| ✓ TRUE | Evidence fully supports claim |
| ⚠ PARTIALLY TRUE | Claim accurate but incomplete |
| ✗ FALSE | Evidence contradicts claim |
| ? NONSENSICAL | Claim doesn't apply to context |
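
One way to keep verdicts consistent between reviewers is to derive them from a few explicit yes/no judgments. The sketch below is illustrative only: `ClaimAssessment` and `decideVerdict` are hypothetical names, and treating a claim that has no supporting evidence as FALSE is an assumption of this sketch (the verdict table only defines FALSE as contradicted by evidence).

```typescript
// Only the four verdict labels come from the verdict table; everything else is hypothetical.
type Verdict = "TRUE" | "PARTIALLY TRUE" | "FALSE" | "NONSENSICAL";

interface ClaimAssessment {
  applicable: boolean;   // does the claim make sense in this context at all?
  contradicted: boolean; // does reproducible evidence contradict the claim?
  supported: boolean;    // does reproducible evidence back the core of the claim?
  complete: boolean;     // is the claim accurate as stated, with nothing left out?
}

function decideVerdict(a: ClaimAssessment): Verdict {
  if (!a.applicable) return "NONSENSICAL";
  // Assumption: a claim without supporting evidence is handled like a contradicted one.
  if (a.contradicted || !a.supported) return "FALSE";
  return a.complete ? "TRUE" : "PARTIALLY TRUE";
}
```

For example, a claim that is applicable and supported but describes a different mechanism than the one in the code would be assessed with `complete: false` and come out as PARTIALLY TRUE.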

Investigation Template

```markdown
## Sherlock Investigation: [Claim]

### The Claim
"[What PR/commit claims to do]"

### Evidence Examined
- Code changes: [files, lines]
- Tests added: [count, coverage]
- Behavior observed: [what actually happens]

### Deductive Analysis

**Claim**: [specific assertion]
**Evidence**: [what you found]
**Deduction**: [logical conclusion]
**Verdict**: ✓/⚠/✗

### Findings
- What works: [with evidence]
- What doesn't: [with evidence]
- What's missing: [gaps in implementation/testing]

### Recommendations
1. [Action based on findings]
```
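
If investigations are recorded as structured data, the template can be filled in mechanically. The following is a minimal sketch under stated assumptions: the `Investigation` shape and the `renderInvestigation` helper are hypothetical and not part of any agent API, and the heading levels it emits are a guess at the template's layout.

```typescript
// Hypothetical data shape mirroring the template sections; not an existing API.
interface ClaimAnalysis {
  claim: string;
  evidence: string;
  deduction: string;
  verdict: "✓" | "⚠" | "✗";
}

interface Investigation {
  title: string;
  claim: string;
  codeChanges: string;
  testsAdded: string;
  behaviorObserved: string;
  analysis: ClaimAnalysis[];
  works: string;
  doesNotWork: string;
  missing: string;
  recommendations: string[];
}

// Renders one investigation as a markdown report, section by section.
function renderInvestigation(inv: Investigation): string {
  const lines: string[] = [
    `## Sherlock Investigation: ${inv.title}`,
    "",
    "### The Claim",
    `"${inv.claim}"`,
    "",
    "### Evidence Examined",
    `- Code changes: ${inv.codeChanges}`,
    `- Tests added: ${inv.testsAdded}`,
    `- Behavior observed: ${inv.behaviorObserved}`,
    "",
    "### Deductive Analysis",
  ];
  inv.analysis.forEach((a) => {
    lines.push(
      "",
      `**Claim**: ${a.claim}`,
      `**Evidence**: ${a.evidence}`,
      `**Deduction**: ${a.deduction}`,
      `**Verdict**: ${a.verdict}`
    );
  });
  lines.push(
    "",
    "### Findings",
    `- What works: ${inv.works}`,
    `- What doesn't: ${inv.doesNotWork}`,
    `- What's missing: ${inv.missing}`,
    "",
    "### Recommendations"
  );
  inv.recommendations.forEach((r, i) => {
    lines.push(`${i + 1}. ${r}`);
  });
  return lines.join("\n");
}
```

Keeping the evidence fields mandatory in the type is a deliberate choice in this sketch: a report cannot be rendered until each evidence slot has been filled with something.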

---

Investigation Scenarios

Scenario 1: "This Fixed the Bug"

**Steps:**
1. Reproduce bug on commit before fix
2. Verify bug is gone on commit with fix
3. Check if fix addresses root cause or symptom
4. Test edge cases not in original report

**Red Flags:**
- Fix that just removes error logging
- Works only for specific test case
- Workarounds instead of root cause fix
- No regression test added
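
The first two steps and the last red flag can be expressed as a small decision helper. This is a sketch, not an established tool: `FixEvidence`, `verifyFix`, and the `reproduces` callback are hypothetical, and the mapping onto SUPPORTED / PARTIALLY SUPPORTED / NOT SUPPORTED is an assumption. The callback stands in for whatever checks out a commit and runs the reproduction steps.

```typescript
type Support = "SUPPORTED" | "PARTIALLY SUPPORTED" | "NOT SUPPORTED";

interface FixEvidence {
  bugCommit: string; // commit before the fix
  fixCommit: string; // commit containing the fix
  // Hypothetical callback: runs the reproduction steps at `commit`
  // and returns true if the bug shows up.
  reproduces: (commit: string) => boolean;
  regressionTestAdded: boolean;
}

function verifyFix(e: FixEvidence): { verdict: Support; reason: string } {
  // Step 1: without a reproduction on the old commit, the fix proves nothing.
  if (!e.reproduces(e.bugCommit)) {
    return { verdict: "NOT SUPPORTED", reason: "Bug does not reproduce before the fix" };
  }
  // Step 2: the same reproduction must now come up clean.
  if (e.reproduces(e.fixCommit)) {
    return { verdict: "NOT SUPPORTED", reason: "Bug still reproduces on the fix commit" };
  }
  // Red flag: a fix with no regression test can silently break again.
  if (!e.regressionTestAdded) {
    return { verdict: "PARTIALLY SUPPORTED", reason: "Bug is gone but no regression test was added" };
  }
  return { verdict: "SUPPORTED", reason: "Bug reproduces before the fix and not after" };
}
```

Root-cause analysis and edge cases outside the original report (steps 3 and 4) still need a human or agent judgment; the helper only covers what can be checked mechanically.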

Scenario 2: "Improved Performance by 50%"

**Steps:**
1. Run benchmark on baseline commit
2. Run same benchmark on optimized commit
3. Compare in identical conditions
4. Verify measurement methodology

**Red Flags:**
- Tested only on toy data
- Different comparison conditions
- Trade-offs not mentioned
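
Once both benchmarks have been run, the comparison itself is simple arithmetic. The sketch below assumes the claim means "median runtime dropped by 50%"; a PR could equally mean throughput, which is exactly the kind of methodology question step 4 asks about. The function names and the 5-point tolerance are hypothetical choices.

```typescript
// Median is less sensitive to a single slow outlier run than the mean.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Percentage reduction in median runtime; positive means the optimized commit is faster.
function runtimeReductionPercent(baselineMs: number[], optimizedMs: number[]): number {
  const base = median(baselineMs);
  return ((base - median(optimizedMs)) / base) * 100;
}

// Hypothetical tolerance: how many percentage points below the claim still count as support.
function supportsClaim(claimedPercent: number, measuredPercent: number, tolerancePoints = 5): boolean {
  return measuredPercent >= claimedPercent - tolerancePoints;
}
```

With baseline runs of 100, 102, and 98 ms and optimized runs of 50, 51, and 49 ms, the medians are 100 and 50, so `runtimeReductionPercent` returns 50 and a "50%" claim is supported. Both sample sets should come from the same machine, data set, and load conditions.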

Scenario 3: "Handles All Edge Cases"

**Steps:**
1. List all edge cases in code path
2. Check each has test coverage
3. Test boundary conditions
4. Verify error handling paths

**Red Flags:**
- `catch {}` swallowing errors
- Generic error messages
- No logging of critical errors
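
The first red flag is easy to screen for automatically. The helper below is a rough heuristic sketch, not a real linter: `findEmptyCatches` is a hypothetical name, the regular expression only catches blocks that are literally empty, and it can also fire on matching text inside strings or comments.

```typescript
// Returns the 1-based line numbers where an empty catch block starts.
function findEmptyCatches(source: string): number[] {
  // Matches `catch {}` and `catch (e) {}` with only whitespace between the braces.
  const pattern = /\bcatch\s*(?:\([^)]*\))?\s*\{\s*\}/g;
  const lines: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    // Count the newlines before the match to get its line number.
    lines.push(source.slice(0, match.index).split("\n").length);
  }
  return lines;
}
```

A hit is a prompt to read the surrounding code, not a verdict: some empty catches are intentional, and a catch that only contains a comment will be missed by this pattern.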

Example Investigation

```markdown
## Case: PR #123 "Fix race condition in async handler"

### Claims Examined:
1. "Eliminates race condition"
2. "Adds mutex locking"
3. "100% thread safe"

### Evidence:
- File: src/handlers/async-handler.js
- Changes: Added `async/await`, removed callbacks
- Tests: 2 new tests for async flow
- Coverage: 85% (was 75%)

### Analysis:

**Claim 1: "Eliminates race condition"**
Evidence: Added `await` to sequential operations. No actual mutex.
Deduction: Race avoided by removing concurrency, not synchronization.
Verdict: ⚠ PARTIALLY TRUE (solved differently than claimed)

**Claim 2: "Adds mutex locking"**
Evidence: No mutex library, no lock variables, no sync primitives.
Verdict: ✗ FALSE

**Claim 3: "100% thread safe"**
Evidence: The handler runs on JavaScript's single-threaded event loop. No worker threads used.
Verdict: ? NONSENSICAL (meaningless in this context)

### Conclusion:
Fix works but not for reasons claimed. Race condition avoided by making operations sequential, not by adding synchronization.

### Recommendations:
1. Update PR description to accurately reflect solution
2. Add test for concurrent request handling
3. Remove incorrect technical claims
```
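
The recommended concurrency test could look roughly like the sketch below. Even without threads, async functions interleave at every `await`, so two overlapping calls can still lose an update. Everything here is hypothetical: `handleRequest` is a stand-in for the real handler in `src/handlers/async-handler.js` (whose code is not shown), and `withLock` is a minimal promise-chain lock written for illustration, not a library API.

```typescript
// Hypothetical stand-in for the handler under review: read, yield, then write.
let counter = 0;

async function handleRequest(): Promise<void> {
  const seen = counter;    // read shared state
  await Promise.resolve(); // yield: other pending calls run here
  counter = seen + 1;      // write based on a possibly stale read
}

// Minimal promise-chain lock: each task starts only after the previous one settles.
let tail: Promise<void> = Promise.resolve();
function withLock(task: () => Promise<void>): Promise<void> {
  const run = tail.then(task);
  tail = run.catch(() => {}); // keep the chain alive if a task rejects
  return run;
}

// Fires two overlapping calls and reports the final counter for each variant.
async function concurrencyCheck(): Promise<{ unlocked: number; locked: number }> {
  counter = 0;
  await Promise.all([handleRequest(), handleRequest()]);
  const unlocked = counter;

  counter = 0;
  await Promise.all([withLock(handleRequest), withLock(handleRequest)]);
  return { unlocked, locked: counter };
}
```

In this sketch the unlocked run ends with a counter of 1 (one update lost) and the locked run with 2. A test of the real handler would fire overlapping requests the same way and assert on the final state.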

---

Agent Integration

```typescript
// Evidence-based code review
await Task("Sherlock Review", {
  prNumber: 123,
  claims: [
    "Fixes memory leak",
    "Improves performance 30%"
  ],
  verifyReproduction: true,
  testEdgeCases: true
}, "qe-code-reviewer");

// Bug fix verification
// `steps` holds the reproduction steps and is assumed to be defined by the caller
await Task("Verify Fix", {
  bugCommit: 'abc123',
  fixCommit: 'def456',
  reproductionSteps: steps,
  testBoundaryConditions: true
}, "qe-code-reviewer");
```

Agent Coordination Hints

Memory Namespace

```
aqe/sherlock/
├── investigations/*   - Investigation reports
├── evidence/*         - Collected evidence
├── verdicts/*         - Claim verdicts
└── patterns/*         - Common deception patterns
```

Fleet Coordination

```typescript
const investigationFleet = await FleetManager.coordinate({
  strategy: 'evidence-investigation',
  agents: [
    'qe-code-reviewer',        // Code analysis
    'qe-security-auditor',     // Security claim verification
    'qe-performance-validator' // Performance claim verification
  ],
  topology: 'parallel'
});
```

Related Skills

Remember

"It is a capital mistake to theorize before one has data." Trust only reproducible evidence. Don't trust commit messages, documentation, or "works on my machine."

The Sherlock Standard: Every claim must be verified empirically. What does the evidence actually show?
