meta-cognitive-reasoning
Meta-Cognitive Reasoning
This skill provides disciplined reasoning frameworks for avoiding cognitive failures in analysis, reviews, and decision-making. It enforces evidence-based conclusions, multiple hypothesis generation, and systematic verification.
When to Use This Skill
- Before making claims about code, systems, or versions
- When conducting code reviews or architectural assessments
- When debugging issues with multiple possible causes
- When encountering unfamiliar patterns or versions
- When making recommendations that could have significant impact
- When pattern matching triggers immediate conclusions
- When analyzing documentation or specifications
- During any task requiring rigorous analytical reasoning
What This Skill Does
- Evidence-Based Reasoning: Enforces showing evidence before interpretation
- Multiple Hypothesis Generation: Prevents premature commitment to single explanation
- Temporal Knowledge Verification: Handles knowledge cutoff limitations
- Cognitive Failure Prevention: Recognizes and prevents common reasoning errors
- Self-Correction Protocol: Provides framework for transparent error correction
- Scope Discipline: Allocates cognitive effort appropriately
Core Principles
1. Evidence-Based Reasoning Protocol
Universal Rule: Never conclude without proof
MANDATORY SEQUENCE:
1. Show tool output FIRST
2. Quote specific evidence
3. THEN interpret
Forbidden Phrases:
- "I assume"
- "typically means"
- "appears to"
- "Tests pass" (without output)
- "Meets standards" (without evidence)
Required Phrases:
- "Command shows: 'actual output' - interpretation"
- "Line N: 'code snippet' - meaning"
- "Let me verify..." -> tool output -> interpretation
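The "tool output first, interpretation second" sequence can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: the function name `evidence_line` and the output format are assumptions modeled on the required phrases above.

```python
import subprocess
import sys


def evidence_line(cmd: list[str], interpretation: str) -> str:
    """Run cmd, then report its real output before the interpretation."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Fall back to stderr so a failing command still yields quotable evidence.
    output = (result.stdout or result.stderr).strip()
    # Evidence is quoted first; the interpretation comes last.
    return f"Command shows: '{output}' - {interpretation}"


if __name__ == "__main__":
    print(evidence_line([sys.executable, "--version"], "interpreter is available"))
```

Because the interpretation is only ever appended to captured output, a claim cannot be emitted without the command having actually run.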
2. Multiple Working Hypotheses
When identical observations can arise from different mechanisms with opposite implications - investigate before concluding.
Three-Layer Reasoning Model:
Layer 1: OBSERVATION (What do I see?)
Layer 2: MECHANISM (How/why does this exist?)
Layer 3: ASSESSMENT (Is this good/bad/critical?)
FAILURE: Jump from Layer 1 -> Layer 3 (skip mechanism)
CORRECT: Layer 1 -> Layer 2 (investigate) -> Layer 3 (assess with context)
Decision Framework:
1. Recognize multiple hypotheses exist
   - What mechanisms could produce this observation?
   - Which mechanisms have opposite implications?
2. Generate competing hypotheses explicitly
   - Hypothesis A: [mechanism] -> [implication]
   - Hypothesis B: [different mechanism] -> [opposite implication]
3. Identify discriminating evidence
   - What single observation would prove/disprove each?
4. Gather discriminating evidence
   - Run the specific test that distinguishes hypotheses
5. Assess with mechanism context
   - Same observation + different mechanism = different assessment
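One way to make the three layers explicit is to represent each hypothesis as data and refuse to assess until exactly one mechanism is supported. The sketch below is illustrative only: the `Hypothesis` class, the `assess` function, and the `is_symlink` evidence key are hypothetical names, not something the skill defines.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hypothesis:
    name: str
    mechanism: str
    implication: str
    # Returns True when the gathered evidence supports this mechanism.
    supported_by: Callable[[dict], bool]


def assess(observation: str, hypotheses: list[Hypothesis], evidence: dict) -> str:
    """Layer 1 -> Layer 2 -> Layer 3: pick a mechanism before assessing."""
    matches = [h for h in hypotheses if h.supported_by(evidence)]
    if len(matches) != 1:
        # Evidence does not discriminate yet, so no assessment is issued.
        return f"{observation}: inconclusive, gather more evidence"
    h = matches[0]
    return f"{observation}: {h.mechanism} (Hypothesis {h.name}) -> {h.implication}"


hypotheses = [
    Hypothesis("A", "duplicated copies", "consolidate",
               lambda e: e.get("is_symlink") is False),
    Hypothesis("B", "symlinks to one source", "keep as-is",
               lambda e: e.get("is_symlink") is True),
]
```

With no evidence gathered, `assess` returns the inconclusive message instead of jumping from observation to assessment.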
3. Temporal Knowledge Currency
Training data has a timestamp; absence of knowledge ≠ evidence of absence
Critical Context Check:
Before making claims about what exists:
1. What is my knowledge cutoff date?
2. What is today's date?
3. How much time has elapsed?
4. Could versions/features beyond my training exist?
High Risk Domains (always verify):
- Package versions (npm, pip, maven)
- Framework versions (React, Vue, Django)
- Language versions (Python, Node, Go)
- Cloud service features (AWS, GCP, Azure)
- API versions and tool versions
Anti-Patterns:
- "Version X doesn't exist" (without verification)
- "Latest is Y" (based on stale training data)
- "CRITICAL/BLOCKER" without evidence
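The context check above amounts to simple date arithmetic plus a domain lookup. A rough sketch follows, assuming a hypothetical `currency_check` helper and an illustrative set of domain labels; the cutoff and current dates are supplied by the caller and are not facts about any particular model.

```python
from datetime import date

# Illustrative labels for the high-risk domains listed above.
HIGH_RISK_DOMAINS = {"package", "framework", "language", "cloud", "api", "tool"}


def currency_check(cutoff: date, today: date, domain: str) -> tuple[int, bool]:
    """Return (days elapsed since cutoff, whether the claim must be verified)."""
    elapsed_days = (today - cutoff).days
    # High-risk domains are always verified; anything else once time has passed.
    verify = domain in HIGH_RISK_DOMAINS or elapsed_days > 0
    return elapsed_days, verify
```

In practice the second element is almost always `True`, which is the point: an existence claim about a version should default to verification.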
4. Self-Correction Protocol
When discovering errors in previous output:
STEP 1: ACKNOWLEDGE EXPLICITLY
- Lead with "CRITICAL CORRECTION"
- Make it impossible to miss
STEP 2: STATE PREVIOUS CLAIM
- Quote exact wrong statement
STEP 3: PROVIDE EVIDENCE
- Show what proves the correction
STEP 4: EXPLAIN ERROR CAUSE
- Root cause: temporal gap? assumption?
STEP 5: CLEAR ACTION
- "NO CHANGE NEEDED" or "Revert suggestion"
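The five steps can be rendered as a fixed template so that none is skipped. This is a sketch under assumptions: `format_correction` and its field labels are hypothetical, chosen only to mirror the step order above.

```python
def format_correction(previous_claim: str, evidence: str, cause: str, action: str) -> str:
    """Render the five steps in order, leading with an unmissable header."""
    return "\n".join([
        "CRITICAL CORRECTION",                  # Step 1: acknowledge explicitly
        f'Previous claim: "{previous_claim}"',  # Step 2: quote the wrong statement
        f"Evidence: {evidence}",                # Step 3: show what proves the correction
        f"Error cause: {cause}",                # Step 4: root cause
        f"Action: {action}",                    # Step 5: clear action
    ])


message = format_correction(
    previous_claim="Version 2025.10.5 doesn't exist",
    evidence='uv.lock shows version = "2025.10.5"',
    cause="temporal gap in training data",
    action="NO CHANGE NEEDED",
)
```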
5. Cognitive Resource Allocation
Parsimony Principle:
- Choose simplest approach that satisfies requirements
- Simple verification first, complexity only when simple fails
Scope Discipline:
- Allocate resources to actual requirements, not hypothetical ones
- "Was this explicitly requested?"
Information Economy:
- Reuse established facts
- Re-verify when context changes
Atomicity Principle:
- Each action should have one clear purpose
- If description requires "and" between distinct purposes, split it
- Benefits: clearer failure diagnosis, easier progress tracking, better evidence attribution
6. Systematic Completion Discipline
Never declare success until ALL requirements verified
High-Risk Scenarios for Premature Completion:
- Multi-step tasks with many quality gates
- After successfully fixing major issues (cognitive reward triggers)
- When tools show many errors (avoidance temptation)
- Near end of session (completion pressure)
Completion Protocol:
- Break requirements into explicit checkpoints
- Complete each gate fully before proceeding
- Show evidence at each checkpoint
- Resist "good enough" shortcuts
Warning Signs:
- Thinking "good enough" instead of checking all requirements
- Applying blanket solutions without individual analysis
- Skipping systematic verification
- Declaring success while evidence shows otherwise
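The completion protocol can be pictured as an ordered list of gates, each of which must produce evidence before the next one runs. The following is a minimal sketch; `run_gates` and the `Checkpoint` alias are hypothetical names, and each check is assumed to return a pass flag together with the evidence it observed.

```python
from typing import Callable

# A checkpoint is a name plus a check returning (passed, evidence).
Checkpoint = tuple[str, Callable[[], tuple[bool, str]]]


def run_gates(checkpoints: list[Checkpoint]) -> tuple[bool, list[str]]:
    """Run checkpoints in order; stop at the first gate that fails."""
    log: list[str] = []
    for name, check in checkpoints:
        passed, evidence = check()
        log.append(f"{name}: {'PASS' if passed else 'FAIL'} - {evidence}")
        if not passed:
            return False, log  # Do not proceed past an incomplete gate.
    return True, log  # Success only after every gate showed evidence.
```

Success is returned only from the last line, so there is no path that declares completion while a gate is still failing or unchecked.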
7. Individual Analysis Over Batch Processing
Core Principle: Every item deserves individual attention
Apply to:
- Error messages (read each one individually)
- Review items (analyze each line/file)
- Decisions (don't apply blanket rules)
- Suppressions (justify each one specifically)
Anti-Patterns:
- Bulk categorization without reading details
- Blanket solutions applied without context
- Batch processing of unique situations
8. Semantic vs Literal Analysis
Look for conceptual overlap, not just text/pattern duplication
Key Questions:
- What is the actual PURPOSE here?
- Does this serve a functional need or just match a pattern?
- What would be LOST if I removed/changed this?
- Is this the same CONCEPT expressed differently?
Applications:
- Documentation: Identify semantic duplication across hierarchy levels
- Code review: Understand intent before suggesting changes
- Optimization: Analyze actual necessity before improving
How to Use
Verify Before Claiming
Verify that package X version Y exists before recommending changes
Check if this file structure is symlinks or duplicates before recommending consolidation
Generate Multiple Hypotheses
The tests are failing with timeout errors. What are the possible mechanisms?
These three files have identical content. What could explain this?
Conduct Evidence-Based Review
Review this code and show evidence for every claim
Reasoning Workflows
Verification Workflow
When encountering unfamiliar versions/features:
- Recognize uncertainty: "I don't recall X from training"
- Form hypotheses: A) doesn't exist, B) exists but new, C) is current
- Verify before concluding: Check authoritative source
- Show evidence, then interpret: Command output -> conclusion
Assessment Workflow
When analyzing code, architecture, or configurations:
- Observe: What do I see?
- Investigate mechanism: HOW does this exist?
- Then assess: Based on mechanism, is this good/bad?
Review Workflow
For code reviews, documentation reviews, or any analysis:
- Clarify scope: Ask before assuming
- Show evidence for every claim: File:line:code
- Generate hypotheses before concluding
- Distinguish mechanism from observation
- Reserve strong language for verified issues
Cognitive Failure Patterns
Pattern 1: Scanning Instead of Reading
- Missing obvious issues while finding minor ones
- Prevention: Read every line/error individually
Pattern 2: Pattern Matching Without Context
- Applying solutions without understanding problems
- Prevention: Analyze actual purpose before applying templates
Pattern 3: Assumption-Based Conclusions
- Guessing instead of verifying
- Prevention: Evidence-based verification required
Pattern 4: Premature Success Declaration
- "Task complete" ≠ "Requirements verified"
- Prevention: Show tool output proving completion
Pattern 5: Temporal Knowledge Decay
- Confusing "I don't know" with "doesn't exist"
- Prevention: Verify version/feature currency
Pattern 6: Overconfidence Cascade
False premise: "X doesn't exist" (unverified)
↓
Amplified: "This is CRITICAL/BLOCKER"
↓
Harmful: "Change X to older version Y"
↓
Impact: Downgrade from newer to older
BREAK THE CASCADE: Verify the premise first
Examples
Example 1: Version Verification
Situation: Reviewing code that uses certifi==2025.10.5
Wrong Approach:
    CRITICAL: Version 2025.10.5 doesn't exist!
    Action: Change to 2024.x.x
Correct Approach:
    Let me verify this version exists...
    $ grep -A 3 'name = "certifi"' uv.lock
    name = "certifi"
    version = "2025.10.5"
    source = { registry = "https://pypi.org/simple" }
    Evidence shows version EXISTS on PyPI.
    Action: NO CHANGE NEEDED - version is valid.
Example 2: File Structure Analysis
Observation: Three files with identical 1494-line content
- specs/AGENTS.md
- specs/CLAUDE.md
- specs/GEMINI.md
Wrong Approach:
    CRITICAL: File duplication - consolidate to single file
Correct Approach:
    Competing hypotheses:
    A: Duplicated copies -> CRITICAL: Consolidate
    B: Symlinks to single source -> EXCELLENT: Keep
    Gathering discriminating evidence:
    $ ls -la specs/
    lrwxrwxrwx CLAUDE.md -> AGENTS.md
    lrwxrwxrwx GEMINI.md -> AGENTS.md
    Mechanism: Symlinks (Hypothesis B confirmed)
    Assessment: EXCELLENT architecture - agent-specific entry points with single source of truth
    Action: Keep as-is
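The discriminating evidence in this example (symlink or copy?) can also be gathered in code. This is a sketch for POSIX systems, where creating symlinks needs no special privileges; `classify_files` and its return labels are hypothetical.

```python
import os
import tempfile


def classify_files(paths: list[str]) -> str:
    """Report whether identical-looking files share one source via symlinks."""
    # realpath resolves symlinks, so linked files collapse to one target.
    targets = {os.path.realpath(p) for p in paths}
    links = [p for p in paths if os.path.islink(p)]
    if len(targets) == 1 and len(links) == len(paths) - 1:
        return "symlinks to single source"
    if not links:
        return "independent copies"
    return "mixed: inspect each file"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "AGENTS.md")
        with open(src, "w") as f:
            f.write("shared content\n")
        link = os.path.join(d, "CLAUDE.md")
        os.symlink("AGENTS.md", link)  # relative link, as in the listing above
        print(classify_files([src, link]))
```

The function reports only the mechanism; whether that mechanism is good or bad remains a separate assessment step.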
Example 3: Test Failure Analysis
Observation: 5 tests failing with "connection timeout"
Hypotheses:
- A: Single dependency down (fix one thing)
- B: Multiple independent timeouts (fix five things)
- C: Test infrastructure issue (fix setup)
- D: Environment config missing (fix config)
Investigation:
- Check test dependencies
- Check error timestamps (simultaneous vs sequential)
- Run tests in isolation
Then conclude based on evidence.
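The timestamp check in the investigation list can be sketched as a spread calculation. The `failure_spread` helper and the window size are assumptions for illustration; the result is a hint about which hypotheses to test next, not proof of any of them.

```python
from datetime import datetime, timedelta


def failure_spread(timestamps: list[datetime], window: timedelta) -> str:
    """Classify failures as simultaneous or sequential by their time spread."""
    spread = max(timestamps) - min(timestamps)
    # A tight cluster hints at one shared cause (Hypotheses A, C, D);
    # a wide spread is more consistent with independent timeouts (B).
    return "simultaneous" if spread <= window else "sequential"


start = datetime(2025, 1, 1, 12, 0, 0)  # hypothetical test-run start time
clustered = [start + timedelta(seconds=s) for s in (0, 1, 2, 1, 0)]
scattered = [start + timedelta(seconds=s) for s in (0, 60, 120, 180, 240)]
```

Running the tests in isolation is still needed afterwards, since tests executed back to back can fail close together for unrelated reasons.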
Anti-Patterns
DO NOT:
- "File X doesn't exist" without: ls X
- "Function not used" without: grep -r "function_name"
- "Version invalid" without: checking registry/lockfile
- "Tests fail" without: running tests
- "CRITICAL/BLOCKER" without verification
- Use strong language without evidence
- Skip mechanism investigation
- Pattern match to first familiar case
DO:
- Show grep/ls/find output BEFORE claiming
- Quote actual lines: "file.py:123: 'code here' - issue"
- Check lockfiles for resolved versions
- Run available tools and show output
- Reserve strong language for evidence-proven issues
- "Let me verify..." -> tool output -> interpretation
- Generate multiple hypotheses before gathering evidence
- Distinguish observation from mechanism
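As an alternative to `grep -r`, a reference search can be scripted so that the quoted evidence already has the `file:line: 'code'` shape. This is a rough sketch: `find_references` is a hypothetical helper, it only scans `*.py` files, and it uses plain substring matching, so it can over-report.

```python
import tempfile
from pathlib import Path


def find_references(root: str, name: str) -> list[str]:
    """List every line under root that mentions name, as file:line: 'code'."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append(f"{path.name}:{number}: '{line.strip()}'")
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        Path(d, "a.py").write_text("def helper():\n    pass\n")
        Path(d, "b.py").write_text("x = 1\nhelper()\n")
        for hit in find_references(d, "helper"):
            print(hit)
```

An empty result supports "function not used" only for the file types and directory that were actually searched.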
Clarifying Questions
Before proceeding with complex tasks, ask:
- What is the primary goal/context?
- What scope is expected (simple fix vs comprehensive)?
- What are the success criteria?
- What constraints exist?
For reviews specifically:
- Scope: All changed files or specific ones?
- Depth: Quick feedback or comprehensive analysis?
- Focus: Implementation quality, standards, or both?
- Output: List of issues or prioritized roadmap?
Task Management Patterns
Review Request Interpretation
Universal Rule: ALL reviews are comprehensive unless explicitly scoped
Never assume limited scope based on:
- Recent conversation topics
- Previously completed partial work
- Specific words that seem to narrow scope
- Apparent simplicity of request
Always include:
- All applicable quality gates
- Evidence for every claim
- Complete verification of requirements
- Systematic coverage (not spot-checking)
Context Analysis Decision Framework
Universal Process:
- Analyze actual purpose (don't assume from patterns)
- Check consistency with actual usage
- Verify with evidence (read/test to confirm)
- Ask before acting when uncertain
Recognition Pattern:
WRONG: "Other components do X, so this needs X"
RIGHT: "Let me analyze if this component actually needs X for its purpose"
Related Use Cases
- Code reviews requiring evidence-based claims
- Version verification before recommendations
- Architectural assessments
- Debugging with multiple possible causes
- Documentation analysis
- Security audits
- Performance investigations
- Any analysis requiring rigorous reasoning