sherlock-review
Sherlock Review
<default_to_action>
When investigating code claims:
- OBSERVE: Gather all evidence (code, tests, history, behavior)
- DEDUCE: What does evidence actually show vs. what was claimed?
- ELIMINATE: Rule out what cannot be true
- CONCLUDE: Does evidence support the claim?
- DOCUMENT: Findings with proof, not assumptions
The 3-Step Investigation:
```bash
# 1. OBSERVE: Gather evidence
git diff <commit>
npm test -- --coverage

# 2. DEDUCE: Compare claim vs. reality
# Does the code match the description?
# Do the tests prove the fix/feature?

# 3. CONCLUDE: Verdict with evidence
# SUPPORTED / PARTIALLY SUPPORTED / NOT SUPPORTED
```

**Holmesian Principles:**
- "Data! Data! Data!" - Collect before concluding
- "Eliminate the impossible" - What cannot be true?
- "You see, but do not observe" - Run code, don't just read
- Trust only reproducible evidence
</default_to_action>

Quick Reference Card
Evidence Collection Checklist
| Category | What to Check | How |
|---|---|---|
| Claim | PR description, commit messages | Read thoroughly |
| Code | Actual file changes | `git diff` |
| Tests | Coverage, assertions | Run independently |
| Behavior | Runtime output | Execute locally |
| Timeline | When things happened | `git log` |
Verdict Levels
| Verdict | Meaning |
|---|---|
| ✓ TRUE | Evidence fully supports claim |
| ⚠ PARTIALLY TRUE | Claim accurate but incomplete |
| ✗ FALSE | Evidence contradicts claim |
| ? NONSENSICAL | Claim doesn't apply to context |
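The four verdict levels above can be encoded as a small type so that reports stay consistent. The following is a minimal sketch, not part of the skill itself: `Verdict`, `Evidence`, and `decideVerdict` are hypothetical names introduced for illustration, and the ordering of the checks is one plausible reading of the table.

```typescript
// Hypothetical types for illustration; none of these names come from the skill.
type Verdict = "TRUE" | "PARTIALLY TRUE" | "FALSE" | "NONSENSICAL";

interface Evidence {
  applicable: boolean;   // does the claim make sense in this context at all?
  contradicted: boolean; // does reproducible evidence contradict the claim?
  complete: boolean;     // does the evidence support every part of the claim?
}

function decideVerdict(e: Evidence): Verdict {
  // A claim that does not apply cannot be true or false.
  if (!e.applicable) return "NONSENSICAL";
  // Any reproducible contradiction outweighs partial support.
  if (e.contradicted) return "FALSE";
  return e.complete ? "TRUE" : "PARTIALLY TRUE";
}
```

Checking applicability first mirrors the "Eliminate the impossible" principle: a claim that cannot apply is ruled out before its accuracy is weighed.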
Investigation Template
```markdown
Sherlock Investigation: [Claim]

The Claim
"[What PR/commit claims to do]"

Evidence Examined
- Code changes: [files, lines]
- Tests added: [count, coverage]
- Behavior observed: [what actually happens]

Deductive Analysis
Claim: [specific assertion]
Evidence: [what you found]
Deduction: [logical conclusion]
Verdict: ✓/⚠/✗/?

Findings
- What works: [with evidence]
- What doesn't: [with evidence]
- What's missing: [gaps in implementation/testing]

Recommendations
- [Action based on findings]
```

---

Investigation Scenarios
Scenario 1: "This Fixed the Bug"
Steps:
- Reproduce bug on commit before fix
- Verify bug is gone on commit with fix
- Check if fix addresses root cause or symptom
- Test edge cases not in original report
Red Flags:
- Fix that just removes error logging
- Works only for specific test case
- Workarounds instead of root cause fix
- No regression test added
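The steps and red flags for this scenario can be recorded as structured observations and turned into a verdict. This is a rough sketch under stated assumptions: `FixObservation` and `assessFixClaim` are hypothetical names, the observations are assumed to have been gathered by hand (for example by checking out each commit and running the reproduction), and treating any red flag as "partially true" is an illustrative policy, not a rule from the skill.

```typescript
// Hypothetical observation record; field names are assumptions for illustration.
interface FixObservation {
  bugReproducedBeforeFix: boolean; // bug seen on the commit before the fix
  bugReproducedAfterFix: boolean;  // bug still seen on the commit with the fix
  regressionTestAdded: boolean;    // a test now guards against the bug
  edgeCasesPassing: boolean;       // cases outside the original report also pass
}

function assessFixClaim(o: FixObservation): { verdict: string; redFlags: string[] } {
  const redFlags: string[] = [];
  if (!o.bugReproducedBeforeFix) redFlags.push("Bug was never reproduced on the baseline commit");
  if (!o.regressionTestAdded) redFlags.push("No regression test added");
  if (!o.edgeCasesPassing) redFlags.push("Works only for the reported case");

  // If the bug still reproduces, the claim is contradicted outright.
  if (o.bugReproducedAfterFix) return { verdict: "✗ FALSE", redFlags };
  // Otherwise any red flag downgrades the claim to partially true.
  return { verdict: redFlags.length === 0 ? "✓ TRUE" : "⚠ PARTIALLY TRUE", redFlags };
}
```

Whether a fix addresses the root cause or only the symptom still requires reading the change; the sketch only captures what can be observed by running the code.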
Scenario 2: "Improved Performance by 50%"
Steps:
- Run benchmark on baseline commit
- Run same benchmark on optimized commit
- Compare results under identical conditions
- Verify measurement methodology
Red Flags:
- Tested only on toy data
- Different comparison conditions
- Trade-offs not mentioned
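Once both benchmarks have been run, the comparison itself is simple arithmetic. The sketch below assumes that "improved performance by 50%" means a 50% reduction in run time, that the samples are wall-clock times in milliseconds collected under identical conditions, and that a small tolerance for noise is acceptable; `median`, `improvementPercent`, and `supportsClaim` are hypothetical helpers, not part of the skill.

```typescript
// The median is less sensitive to outlier runs than the mean.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Percentage reduction in median run time from baseline to optimized.
function improvementPercent(baselineMs: number[], optimizedMs: number[]): number {
  const before = median(baselineMs);
  const after = median(optimizedMs);
  return ((before - after) / before) * 100;
}

// The claim is supported only if the measured improvement reaches it,
// allowing a tolerance (in percentage points) for measurement noise.
function supportsClaim(measured: number, claimed: number, tolerance = 2): boolean {
  return measured >= claimed - tolerance;
}

// Example with made-up sample times: medians are 200 ms and 100 ms.
const measuredImprovement = improvementPercent([200, 210, 190], [100, 105, 95]);
const claimHolds = supportsClaim(measuredImprovement, 50);
```

If the claim instead refers to throughput, the formula changes, which is one reason to verify the measurement methodology before comparing numbers.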
Scenario 3: "Handles All Edge Cases"
Steps:
- List all edge cases in code path
- Check each has test coverage
- Test boundary conditions
- Verify error handling paths
Red Flags:
- `catch {}` swallowing errors
- Generic error messages
- No logging of critical errors
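The first red flag, an empty `catch {}` that swallows errors, can be spotted mechanically. The following is a rough heuristic sketch, not a full parser: `findEmptyCatches` is a hypothetical helper, and the regular expression can misreport matches inside strings or comments and does not flag catch blocks that contain only a comment.

```typescript
// Returns the 1-based line numbers where an empty catch block starts.
function findEmptyCatches(source: string): number[] {
  // Matches `catch {}` and `catch (e) {}` with only whitespace in the body.
  const emptyCatch = /catch\s*(\([^)]*\))?\s*\{\s*\}/g;
  const lines: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = emptyCatch.exec(source)) !== null) {
    // Count the newlines before the match to get its line number.
    lines.push(source.slice(0, match.index).split("\n").length);
  }
  return lines;
}
```

A hit is a starting point for investigation rather than a verdict: the reviewer still needs to run the error path and observe what actually happens.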
Example Investigation
```markdown
Case: PR #123 "Fix race condition in async handler"

Claims Examined:
- "Eliminates race condition"
- "Adds mutex locking"
- "100% thread safe"

Evidence:
- File: src/handlers/async-handler.js
- Changes: Added `async/await`, removed callbacks
- Tests: 2 new tests for async flow
- Coverage: 85% (was 75%)

Analysis:

Claim 1: "Eliminates race condition"
Evidence: Added `await` to sequential operations. No actual mutex.
Deduction: Race avoided by removing concurrency, not by adding synchronization.
Verdict: ⚠ PARTIALLY TRUE (solved differently than claimed)

Claim 2: "Adds mutex locking"
Evidence: No mutex library, no lock variables, no sync primitives.
Verdict: ✗ FALSE

Claim 3: "100% thread safe"
Evidence: JavaScript is single-threaded. No worker threads used.
Verdict: ? NONSENSICAL (meaningless in this context)

Conclusion:
Fix works, but not for the reasons claimed. Race condition avoided by
making operations sequential, not by adding synchronization.

Recommendations:
- Update PR description to accurately reflect solution
- Add test for concurrent request handling
- Remove incorrect technical claims
```
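The deduction in this example, that sequential `await` avoids the race without any lock, can be reproduced in a few lines, and the same shape works as a test for concurrent request handling. This is an illustrative sketch only: `counter`, `increment`, and the two runner functions are hypothetical stand-ins for the handler's shared state, not code from the PR.

```typescript
// Hypothetical shared state standing in for whatever the handler mutates.
let counter = 0;

// Yield to the event loop, simulating an I/O wait inside the handler.
const tick = (): Promise<void> => Promise.resolve();

// Read-modify-write with an await in the middle: the lost-update shape.
async function increment(): Promise<void> {
  const current = counter;
  await tick();
  counter = current + 1;
}

// Concurrent: both calls read 0 before either writes, so one update is lost.
async function runConcurrently(): Promise<number> {
  counter = 0;
  await Promise.all([increment(), increment()]);
  return counter; // 1, not 2
}

// Sequential: each call finishes before the next starts, so nothing is lost.
async function runSequentially(): Promise<number> {
  counter = 0;
  await increment();
  await increment();
  return counter; // 2
}
```

The concurrent run shows that single-threaded JavaScript can still interleave at `await` points, which is why Claim 1 is only partially true: the fix holds as long as callers stay sequential.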
---

Agent Integration
```typescript
// Evidence-based code review
await Task("Sherlock Review", {
  prNumber: 123,
  claims: [
    "Fixes memory leak",
    "Improves performance 30%"
  ],
  verifyReproduction: true,
  testEdgeCases: true
}, "qe-code-reviewer");

// Bug fix verification
// `steps` holds the reproduction steps and is defined elsewhere.
await Task("Verify Fix", {
  bugCommit: 'abc123',
  fixCommit: 'def456',
  reproductionSteps: steps,
  testBoundaryConditions: true
}, "qe-code-reviewer");
```

Agent Coordination Hints
Memory Namespace
```
aqe/sherlock/
├── investigations/*   - Investigation reports
├── evidence/*         - Collected evidence
├── verdicts/*         - Claim verdicts
└── patterns/*         - Common deception patterns
```

Fleet Coordination
```typescript
const investigationFleet = await FleetManager.coordinate({
  strategy: 'evidence-investigation',
  agents: [
    'qe-code-reviewer',         // Code analysis
    'qe-security-auditor',      // Security claim verification
    'qe-performance-validator'  // Performance claim verification
  ],
  topology: 'parallel'
});
```

Related Skills
- brutal-honesty-review - Direct technical criticism
- context-driven-testing - Adapt to context
- bug-reporting-excellence - Document findings
Remember
"It is a capital mistake to theorize before one has data." Trust only reproducible evidence. Don't trust commit messages, documentation, or "works on my machine."
The Sherlock Standard: Every claim must be verified empirically. What does the evidence actually show?