sherlock-review

Sherlock Review

<default_to_action>
When investigating code claims:
  1. OBSERVE: Gather all evidence (code, tests, history, behavior)
  2. DEDUCE: What does the evidence actually show vs. what was claimed?
  3. ELIMINATE: Rule out what cannot be true
  4. CONCLUDE: Does the evidence support the claim?
  5. DOCUMENT: Record findings with proof, not assumptions

**The 3-Step Investigation:**
```bash
# 1. OBSERVE: Gather evidence
git diff <commit>
npm test -- --coverage

# 2. DEDUCE: Compare claim vs reality
# Does code match description?
# Do tests prove the fix/feature?

# 3. CONCLUDE: Verdict with evidence
# SUPPORTED / PARTIALLY SUPPORTED / NOT SUPPORTED
```

**Holmesian Principles:**
- "Data! Data! Data!" - Collect before concluding
- "Eliminate the impossible" - What cannot be true?
- "You see, but you do not observe" - Run code, don't just read
- Trust only reproducible evidence
</default_to_action>
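
The ELIMINATE step can be made mechanical by writing down what each competing explanation predicts and discarding those that disagree with what was observed. The sketch below is only an illustration: `Hypothesis`, `Observations`, `eliminate`, and the observation names are invented for this example and belong to no tool mentioned here.

```typescript
// Illustrative only: these types and names are assumptions made for this sketch.
type Observations = { [observation: string]: boolean };

interface Hypothesis {
  label: string;
  predicts: Observations; // what we would expect to observe if this were true
}

// Keep only the hypotheses whose predictions agree with everything observed so far.
function eliminate(candidates: Hypothesis[], observed: Observations): Hypothesis[] {
  return candidates.filter((h) =>
    Object.keys(h.predicts).every(
      (key) => !(key in observed) || observed[key] === h.predicts[key]
    )
  );
}

const candidates: Hypothesis[] = [
  { label: "fix adds a lock", predicts: { lockPrimitiveInDiff: true, raceReproduces: false } },
  { label: "fix serializes the operations", predicts: { lockPrimitiveInDiff: false, raceReproduces: false } },
  { label: "fix is ineffective", predicts: { raceReproduces: true } },
];

// Observations gathered by reading the diff and running the reproduction.
const observedFacts: Observations = { lockPrimitiveInDiff: false, raceReproduces: false };

const surviving = eliminate(candidates, observedFacts).map((h) => h.label);
console.log(surviving); // only "fix serializes the operations" remains
```

A hypothesis that makes no prediction about an observation is never eliminated by it, which keeps the procedure conservative: only direct disagreement with evidence rules something out.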

Quick Reference Card

Evidence Collection Checklist

| Category | What to Check | How |
|----------|---------------|-----|
| Claim | PR description, commit messages | Read thoroughly |
| Code | Actual file changes | `git diff` |
| Tests | Coverage, assertions | Run independently |
| Behavior | Runtime output | Execute locally |
| Timeline | When things happened | `git log`, `git blame` |
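
The checklist can be turned into a repeatable command plan so that every investigation gathers the same evidence. The following is a minimal sketch under assumptions: `EvidenceStep` and `evidencePlan` are hypothetical names, the exact `git log` flags are one reasonable choice, and `npm test -- --coverage` assumes an npm project as in the 3-step investigation. It only builds the command strings for a reviewer to run; it does not execute them.

```typescript
// Hypothetical helper: lists the commands a reviewer would run for each category.
type EvidenceCategory = "Code" | "Tests" | "Timeline";

interface EvidenceStep {
  category: EvidenceCategory;
  command: string;
}

// `commit` and `file` are inserted verbatim, so pass only trusted values.
function evidencePlan(commit: string, file: string): EvidenceStep[] {
  return [
    { category: "Code", command: `git diff ${commit}` },
    { category: "Tests", command: "npm test -- --coverage" },
    { category: "Timeline", command: `git log --oneline ${commit}` },
    { category: "Timeline", command: `git blame ${file}` },
  ];
}

evidencePlan("abc123", "src/app.ts").forEach((step) => {
  console.log(`[${step.category}] ${step.command}`);
});
```

The Claim and Behavior rows are left out on purpose: reading the PR description and exercising the running system are manual steps that a command list cannot replace.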

Verdict Levels

| Verdict | Meaning |
|---------|---------|
| ✓ TRUE | Evidence fully supports claim |
| ⚠ PARTIALLY TRUE | Claim accurate but incomplete |
| ✗ FALSE | Evidence contradicts claim |
| ? NONSENSICAL | Claim doesn't apply to context |
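
One way to apply these levels consistently is to split a claim into verifiable parts and derive the verdict from how many parts the evidence supports. This is a sketch of one possible mapping, not a rule stated by the table: `ClaimChecks`, `CheckResult`, and `decideVerdict` are assumed names, and treating "some parts supported" as PARTIALLY TRUE is an interpretation.

```typescript
type Verdict = "TRUE" | "PARTIALLY TRUE" | "FALSE" | "NONSENSICAL";
type CheckResult = "supported" | "contradicted";

interface ClaimChecks {
  appliesToContext: boolean; // false when the claim is meaningless in this codebase
  results: CheckResult[];    // one entry per verifiable part of the claim
}

function decideVerdict(claim: ClaimChecks): Verdict {
  if (!claim.appliesToContext) return "NONSENSICAL";
  if (claim.results.length === 0) {
    // Refuse to conclude without data.
    throw new Error("no evidence collected: gather data before concluding");
  }
  const supported = claim.results.filter((r) => r === "supported").length;
  if (supported === claim.results.length) return "TRUE";
  if (supported === 0) return "FALSE";
  return "PARTIALLY TRUE";
}

// "Eliminates race condition": the race is gone, but not by the claimed mechanism.
console.log(decideVerdict({ appliesToContext: true, results: ["supported", "contradicted"] }));
// prints "PARTIALLY TRUE"
```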

Investigation Template

```markdown
## Sherlock Investigation: [Claim]

### The Claim
"[What PR/commit claims to do]"

### Evidence Examined
- Code changes: [files, lines]
- Tests added: [count, coverage]
- Behavior observed: [what actually happens]

### Deductive Analysis
Claim: [specific assertion]
Evidence: [what you found]
Deduction: [logical conclusion]
Verdict: ✓/⚠/✗

### Findings
- What works: [with evidence]
- What doesn't: [with evidence]
- What's missing: [gaps in implementation/testing]

### Recommendations
1. [Action based on findings]
```
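
If reports are produced by a script or an agent, the template can be filled from structured data so that no section is silently skipped. The sketch below is an assumption-laden illustration: `InvestigationReport`, `DeductiveAnalysis`, and `renderReport` are hypothetical names, and the `##`/`###` heading levels are a guess at the template's intended markup.

```typescript
type VerdictMark = "✓" | "⚠" | "✗";

interface DeductiveAnalysis {
  claim: string;
  evidence: string;
  deduction: string;
  verdict: VerdictMark;
}

interface InvestigationReport {
  title: string;
  claim: string;
  codeChanges: string;
  testsAdded: string;
  behaviorObserved: string;
  analyses: DeductiveAnalysis[];
  works: string;
  broken: string;
  missing: string;
  recommendations: string[];
}

// Every field is required by the type, so an incomplete report fails to compile.
function renderReport(r: InvestigationReport): string {
  const lines: string[] = [
    `## Sherlock Investigation: ${r.title}`,
    "",
    "### The Claim",
    `"${r.claim}"`,
    "",
    "### Evidence Examined",
    `- Code changes: ${r.codeChanges}`,
    `- Tests added: ${r.testsAdded}`,
    `- Behavior observed: ${r.behaviorObserved}`,
    "",
    "### Deductive Analysis",
  ];
  r.analyses.forEach((a) => {
    lines.push(
      `Claim: ${a.claim}`,
      `Evidence: ${a.evidence}`,
      `Deduction: ${a.deduction}`,
      `Verdict: ${a.verdict}`,
      ""
    );
  });
  lines.push(
    "### Findings",
    `- What works: ${r.works}`,
    `- What doesn't: ${r.broken}`,
    `- What's missing: ${r.missing}`,
    "",
    "### Recommendations"
  );
  r.recommendations.forEach((rec, i) => {
    lines.push(`${i + 1}. ${rec}`);
  });
  return lines.join("\n");
}
```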

---

Investigation Scenarios

Scenario 1: "This Fixed the Bug"

Steps:
  1. Reproduce the bug on the commit before the fix
  2. Verify the bug is gone on the commit with the fix
  3. Check whether the fix addresses the root cause or only a symptom
  4. Test edge cases not in the original report

Red Flags:
  • Fix that just removes error logging
  • Works only for a specific test case
  • Workaround instead of a root-cause fix
  • No regression test added
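
The steps and red flags above can be condensed into a small decision function once the reproduction has been run on both commits. This is a sketch under stated assumptions: `FixEvidence` and `assessFix` are hypothetical names, and treating an unreproduced bug as NOT SUPPORTED is one reading of "trust only reproducible evidence".

```typescript
type FixVerdict = "SUPPORTED" | "PARTIALLY SUPPORTED" | "NOT SUPPORTED";

interface FixEvidence {
  bugSeenBeforeFix: boolean;   // did the reproduction trigger the bug on the pre-fix commit?
  bugSeenAfterFix: boolean;    // does the same reproduction still trigger it on the fix commit?
  failingEdgeCases: string[];  // edge cases outside the original report that still fail
  regressionTestAdded: boolean;
}

interface FixAssessment {
  verdict: FixVerdict;
  reasons: string[];
}

function assessFix(e: FixEvidence): FixAssessment {
  if (!e.bugSeenBeforeFix) {
    return {
      verdict: "NOT SUPPORTED",
      reasons: ["bug not reproduced on the pre-fix commit, so the fix cannot be credited"],
    };
  }
  if (e.bugSeenAfterFix) {
    return { verdict: "NOT SUPPORTED", reasons: ["bug still reproduces on the fix commit"] };
  }
  const reasons: string[] = [];
  if (e.failingEdgeCases.length > 0) {
    reasons.push("edge cases still fail: " + e.failingEdgeCases.join(", "));
  }
  if (!e.regressionTestAdded) {
    reasons.push("no regression test added");
  }
  return { verdict: reasons.length === 0 ? "SUPPORTED" : "PARTIALLY SUPPORTED", reasons };
}
```

Whether the fix addresses the root cause or a symptom (step 3) still needs a human reading of the diff; the function only records what the runs showed.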

Scenario 2: "Improved Performance by 50%"

Steps:
  1. Run the benchmark on the baseline commit
  2. Run the same benchmark on the optimized commit
  3. Compare under identical conditions
  4. Verify the measurement methodology

Red Flags:
  • Tested only on toy data
  • Baseline and optimized runs measured under different conditions
  • Trade-offs not mentioned
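
Checking a percentage claim comes down to simple arithmetic on repeated measurements. The sketch below assumes the claim means a reduction in median run time (it could instead mean throughput, which would need a different formula); `median`, `improvementPercent`, `supportsClaim`, the sample timings, and the 2-point tolerance are all illustrative choices.

```typescript
// Median is less sensitive to a single slow run than the mean.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Percentage reduction in median run time, relative to the baseline.
function improvementPercent(baselineMs: number[], optimizedMs: number[]): number {
  const base = median(baselineMs);
  return ((base - median(optimizedMs)) * 100) / base;
}

// The claim holds if the measured gain is within `tolerancePoints` of the claimed gain.
function supportsClaim(measured: number, claimed: number, tolerancePoints: number = 2): boolean {
  return measured >= claimed - tolerancePoints;
}

// Made-up timings in milliseconds, five runs per commit.
const baselineRuns = [100, 102, 98, 101, 99];
const optimizedRuns = [70, 72, 68, 71, 69];

const gain = improvementPercent(baselineRuns, optimizedRuns);
console.log(gain);                    // 30
console.log(supportsClaim(gain, 50)); // false
```

With these sample numbers the medians are 100 ms and 70 ms, so a "50%" claim would be contradicted while a "30%" claim would be supported.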

Scenario 3: "Handles All Edge Cases"

Steps:
  1. List all edge cases in the code path
  2. Check that each has test coverage
  3. Test boundary conditions
  4. Verify error handling paths

Red Flags:
  • `catch {}` swallowing errors
  • Generic error messages
  • No logging of critical errors
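
A boundary table makes an "all edge cases" claim testable, and it also exposes a swallowing `catch {}` immediately. The example below is hypothetical throughout: `parsePort`, `parsePortSwallowing`, and the port range stand in for whatever function the PR under review actually changes.

```typescript
// Accepts only whole numbers in the valid TCP port range 1..65535.
function parsePort(input: string): number {
  const port = Number(input);
  if (!isFinite(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: "${input}"`);
  }
  return port;
}

// Red flag: the empty catch hides every failure behind a default value.
function parsePortSwallowing(input: string): number {
  try {
    return parsePort(input);
  } catch {
    return 0;
  }
}

interface BoundaryCase {
  input: string;
  shouldThrow: boolean;
}

// Values on and just outside each boundary, plus malformed input.
const boundaryCases: BoundaryCase[] = [
  { input: "0", shouldThrow: true },
  { input: "1", shouldThrow: false },
  { input: "65535", shouldThrow: false },
  { input: "65536", shouldThrow: true },
  { input: "", shouldThrow: true },
  { input: "abc", shouldThrow: true },
];

// Returns the inputs for which the parser did not behave as the table requires.
function failedBoundaryCases(parse: (input: string) => number): string[] {
  const failures: string[] = [];
  boundaryCases.forEach((c) => {
    let threw = false;
    try {
      parse(c.input);
    } catch (err) {
      threw = err instanceof RangeError;
    }
    if (threw !== c.shouldThrow) failures.push(c.input);
  });
  return failures;
}

console.log(failedBoundaryCases(parsePort).length);           // 0
console.log(failedBoundaryCases(parsePortSwallowing).length); // 4
```

The swallowing variant fails exactly the four cases that should raise an error, which is the kind of evidence that turns "handles all edge cases" into a checkable statement.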

Example Investigation

```markdown
## Case: PR #123 "Fix race condition in async handler"

### Claims Examined:
1. "Eliminates race condition"
2. "Adds mutex locking"
3. "100% thread safe"

### Evidence:
- File: src/handlers/async-handler.js
- Changes: Added `async/await`, removed callbacks
- Tests: 2 new tests for async flow
- Coverage: 85% (was 75%)

### Analysis:

Claim 1: "Eliminates race condition"
Evidence: Added `await` so the operations run sequentially. No actual mutex.
Deduction: Race avoided by removing concurrency, not by adding synchronization.
Verdict: ⚠ PARTIALLY TRUE (solved differently than claimed)

Claim 2: "Adds mutex locking"
Evidence: No mutex library, no lock variables, no sync primitives.
Verdict: ✗ FALSE

Claim 3: "100% thread safe"
Evidence: JavaScript is single-threaded. No worker threads used.
Verdict: ? NONSENSICAL (meaningless in this context)

### Conclusion:
The fix works, but not for the reasons claimed. The race condition is avoided by making operations sequential, not by adding synchronization.

### Recommendations:
1. Update PR description to accurately reflect solution
2. Add test for concurrent request handling
3. Remove incorrect technical claims
```
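
A concurrent-request test of the kind recommended above might look like the following sketch. Single-threaded JavaScript can still interleave async operations at every `await`, so a read-modify-write split across an `await` can lose updates. Everything here is hypothetical: `processed`, `unsafeHandle`, `serializedHandle`, and `runConcurrently` are stand-ins, not code from the PR in the example.

```typescript
// Hypothetical shared state standing in for whatever the real handler mutates.
let processed = 0;

// Read-modify-write split across an await: other calls can run in between.
async function unsafeHandle(): Promise<void> {
  const seen = processed;  // read
  await Promise.resolve(); // yield point where other pending calls proceed
  processed = seen + 1;    // write based on a possibly stale read
}

// Serialized variant: each call starts only after the previous one has finished.
let tail: Promise<void> = Promise.resolve();
function serializedHandle(): Promise<void> {
  tail = tail.then(unsafeHandle);
  return tail;
}

// Starts `calls` invocations without awaiting in between, then reports the final count.
async function runConcurrently(handler: () => Promise<void>, calls: number): Promise<number> {
  processed = 0;
  const pending: Promise<void>[] = [];
  for (let i = 0; i < calls; i++) {
    pending.push(handler());
  }
  await Promise.all(pending);
  return processed;
}
```

Run one after the other, `runConcurrently(unsafeHandle, 10)` resolves to 1 because all ten calls read the counter before any of them writes, while `runConcurrently(serializedHandle, 10)` resolves to 10. A test asserting the second result, and failing on the first, is reproducible evidence that the sequential fix actually removes the race.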

---

Agent Integration

```typescript
// Evidence-based code review
await Task("Sherlock Review", {
  prNumber: 123,
  claims: [
    "Fixes memory leak",
    "Improves performance 30%"
  ],
  verifyReproduction: true,
  testEdgeCases: true
}, "qe-code-reviewer");

// Bug fix verification
// `steps` is assumed to hold the bug's reproduction steps, defined elsewhere
await Task("Verify Fix", {
  bugCommit: 'abc123',
  fixCommit: 'def456',
  reproductionSteps: steps,
  testBoundaryConditions: true
}, "qe-code-reviewer");
```

Agent Coordination Hints

Memory Namespace

```
aqe/sherlock/
├── investigations/*   - Investigation reports
├── evidence/*         - Collected evidence
├── verdicts/*         - Claim verdicts
└── patterns/*         - Common deception patterns
```

Fleet Coordination

```typescript
const investigationFleet = await FleetManager.coordinate({
  strategy: 'evidence-investigation',
  agents: [
    'qe-code-reviewer',        // Code analysis
    'qe-security-auditor',     // Security claim verification
    'qe-performance-validator' // Performance claim verification
  ],
  topology: 'parallel'
});
```

Related Skills

  • brutal-honesty-review - Direct technical criticism
  • context-driven-testing - Adapt to context
  • bug-reporting-excellence - Document findings

Remember

"It is a capital mistake to theorize before one has data." Trust only reproducible evidence. Don't trust commit messages, documentation, or "works on my machine."

The Sherlock Standard: Every claim must be verified empirically. What does the evidence actually show?