meta-cognitive-reasoning

Meta-Cognitive Reasoning

This skill provides disciplined reasoning frameworks for avoiding cognitive failures in analysis, reviews, and decision-making. It enforces evidence-based conclusions, multiple hypothesis generation, and systematic verification.

When to Use This Skill

  • Before making claims about code, systems, or versions
  • When conducting code reviews or architectural assessments
  • When debugging issues with multiple possible causes
  • When encountering unfamiliar patterns or versions
  • When making recommendations that could have significant impact
  • When pattern matching triggers immediate conclusions
  • When analyzing documentation or specifications
  • During any task requiring rigorous analytical reasoning

What This Skill Does

  1. Evidence-Based Reasoning: Enforces showing evidence before interpretation
  2. Multiple Hypothesis Generation: Prevents premature commitment to a single explanation
  3. Temporal Knowledge Verification: Handles knowledge cutoff limitations
  4. Cognitive Failure Prevention: Recognizes and prevents common reasoning errors
  5. Self-Correction Protocol: Provides a framework for transparent error correction
  6. Scope Discipline: Allocates cognitive effort appropriately

Core Principles

1. Evidence-Based Reasoning Protocol

Universal Rule: Never conclude without proof
MANDATORY SEQUENCE:
1. Show tool output FIRST
2. Quote specific evidence
3. THEN interpret
Forbidden Phrases:
  • "I assume"
  • "typically means"
  • "appears to"
  • "Tests pass" (without output)
  • "Meets standards" (without evidence)
Required Phrases:
  • "Command shows: 'actual output' - interpretation"
  • "Line N: 'code snippet' - meaning"
  • "Let me verify..." -> tool output -> interpretation
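As a rough illustration of the sequence above, the following Python sketch runs a command, puts its output first, and only then attaches an interpretation. The helper names `find_forbidden_phrases` and `evidence_line` are hypothetical and not part of the skill; the phrase list is copied from the rules above.

```python
import subprocess
import sys

# Phrases the protocol forbids (copied from the list above).
FORBIDDEN_PHRASES = ("i assume", "typically means", "appears to")


def find_forbidden_phrases(text: str) -> list[str]:
    """Return the forbidden phrases that occur in `text`, case-insensitively."""
    lowered = text.lower()
    return [phrase for phrase in FORBIDDEN_PHRASES if phrase in lowered]


def evidence_line(command: list[str], interpretation: str) -> str:
    """Run `command`, then format the result as output first, interpretation second."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    output = result.stdout.strip() or result.stderr.strip()
    return f"Command shows: '{output}' - {interpretation}"


if __name__ == "__main__":
    # Hypothetical check: ask the running interpreter for its version instead of assuming it.
    print(evidence_line([sys.executable, "--version"], "this is the interpreter in use"))
```

The point of the design is ordering: the interpretation is only ever emitted alongside the output that supports it.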

2. Multiple Working Hypotheses

When identical observations can arise from different mechanisms with opposite implications, investigate before concluding.
Three-Layer Reasoning Model:
Layer 1: OBSERVATION (What do I see?)
Layer 2: MECHANISM (How/why does this exist?)
Layer 3: ASSESSMENT (Is this good/bad/critical?)

FAILURE: Jump from Layer 1 -> Layer 3 (skip mechanism)
CORRECT: Layer 1 -> Layer 2 (investigate) -> Layer 3 (assess with context)
Decision Framework:
  1. Recognize multiple hypotheses exist
    • What mechanisms could produce this observation?
    • Which mechanisms have opposite implications?
  2. Generate competing hypotheses explicitly
    • Hypothesis A: [mechanism] -> [implication]
    • Hypothesis B: [different mechanism] -> [opposite implication]
  3. Identify discriminating evidence
    • What single observation would prove/disprove each?
  4. Gather discriminating evidence
    • Run the specific test that distinguishes hypotheses
  5. Assess with mechanism context
    • Same observation + different mechanism = different assessment
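The decision framework above can be sketched in Python as a small filter over explicit hypotheses. This is only an illustration: the `Hypothesis` class, the `predicts` callback, and the `is_symlink` evidence key are assumptions made for the example, not something the skill defines.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hypothesis:
    """One candidate mechanism and the assessment it would lead to."""
    name: str
    mechanism: str
    implication: str
    # Returns True when the gathered evidence is consistent with this mechanism.
    predicts: Callable[[dict], bool]


def surviving_hypotheses(hypotheses: list[Hypothesis], evidence: dict) -> list[Hypothesis]:
    """Keep only the hypotheses that the discriminating evidence does not rule out."""
    return [h for h in hypotheses if h.predicts(evidence)]


# Hypothetical example: identical files that are either copies or symlinks.
hypotheses = [
    Hypothesis("A", "duplicated copies", "consolidate",
               predicts=lambda ev: not ev["is_symlink"]),
    Hypothesis("B", "symlinks to one source", "keep as-is",
               predicts=lambda ev: ev["is_symlink"]),
]

# Layer 2: in practice this value would come from a real check such as `ls -la`.
evidence = {"is_symlink": True}

for h in surviving_hypotheses(hypotheses, evidence):
    print(f"Hypothesis {h.name}: {h.mechanism} -> {h.implication}")
```

Writing the hypotheses down before gathering evidence makes it harder to skip from observation straight to assessment.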

3. Temporal Knowledge Currency

Training data has a timestamp; absence of knowledge ≠ evidence of absence
Critical Context Check:
Before making claims about what exists:
1. What is my knowledge cutoff date?
2. What is today's date?
3. How much time has elapsed?
4. Could versions/features beyond my training exist?
High Risk Domains (always verify):
  • Package versions (npm, pip, maven)
  • Framework versions (React, Vue, Django)
  • Language versions (Python, Node, Go)
  • Cloud service features (AWS, GCP, Azure)
  • API versions and tool versions
Anti-Patterns:
  • "Version X doesn't exist" (without verification)
  • "Latest is Y" (based on stale training data)
  • "CRITICAL/BLOCKER" without evidence
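The context check above is mostly date arithmetic, which a minimal Python sketch can make concrete. The function names are hypothetical, and any cutoff date passed in is a placeholder supplied by the caller rather than a real model's cutoff.

```python
from datetime import date


def months_elapsed(cutoff: date, today: date) -> int:
    """Whole calendar months between the knowledge cutoff and today (never negative)."""
    months = (today.year - cutoff.year) * 12 + (today.month - cutoff.month)
    if today.day < cutoff.day:
        months -= 1  # the current month is not complete yet
    return max(months, 0)


def must_verify(cutoff: date, today: date) -> bool:
    """Any elapsed time means versions or features beyond the cutoff could exist."""
    return today > cutoff
```

A caller would pass `date.today()` as `today`; whenever `must_verify` returns True, claims in the high-risk domains listed above should be checked against an authoritative source rather than memory.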

4. Self-Correction Protocol

When discovering errors in previous output:
STEP 1: ACKNOWLEDGE EXPLICITLY
- Lead with "CRITICAL CORRECTION"
- Make it impossible to miss

STEP 2: STATE PREVIOUS CLAIM
- Quote exact wrong statement

STEP 3: PROVIDE EVIDENCE
- Show what proves the correction

STEP 4: EXPLAIN ERROR CAUSE
- Root cause: temporal gap? assumption?

STEP 5: CLEAR ACTION
- "NO CHANGE NEEDED" or "Revert suggestion"
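One way to keep all five steps present and in order is to render the correction from a fixed structure. The sketch below is an assumption about how that could look; the `Correction` class and its field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    previous_claim: str  # Step 2: the exact wrong statement
    evidence: str        # Step 3: what proves the correction
    root_cause: str      # Step 4: why the error happened
    action: str          # Step 5: what should happen now

    def render(self) -> str:
        """Lay out the five steps in order, leading with the acknowledgement."""
        return "\n".join([
            "CRITICAL CORRECTION",  # Step 1: impossible to miss
            f'Previous claim: "{self.previous_claim}"',
            f"Evidence: {self.evidence}",
            f"Root cause: {self.root_cause}",
            f"Action: {self.action}",
        ])
```

Because every field is required, a correction cannot be constructed without the evidence or the root cause.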

5. Cognitive Resource Allocation

Parsimony Principle:
  • Choose the simplest approach that satisfies the requirements
  • Simple verification first, complexity only when simple fails
Scope Discipline:
  • Allocate resources to actual requirements, not hypothetical ones
  • "Was this explicitly requested?"
Information Economy:
  • Reuse established facts
  • Re-verify when context changes
Atomicity Principle:
  • Each action should have one clear purpose
  • If the description requires "and" between distinct purposes, split it
  • Benefits: clearer failure diagnosis, easier progress tracking, better evidence attribution

6. Systematic Completion Discipline

Never declare success until ALL requirements are verified
High-Risk Scenarios for Premature Completion:
  • Multi-step tasks with many quality gates
  • After successfully fixing major issues (cognitive reward triggers)
  • When tools show many errors (avoidance temptation)
  • Near end of session (completion pressure)
Completion Protocol:
  1. Break requirements into explicit checkpoints
  2. Complete each gate fully before proceeding
  3. Show evidence at each checkpoint
  4. Resist "good enough" shortcuts
Warning Signs:
  • Thinking "good enough" instead of checking all requirements
  • Applying blanket solutions without individual analysis
  • Skipping systematic verification
  • Declaring success while evidence shows otherwise
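The completion protocol can be pictured as a tracker that refuses to report success while any checkpoint lacks evidence. The following Python sketch is illustrative only; `Checkpoint` and `CompletionTracker` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    requirement: str
    evidence: str = ""    # tool output proving the requirement; empty until gathered
    passed: bool = False


@dataclass
class CompletionTracker:
    checkpoints: list[Checkpoint] = field(default_factory=list)

    def record(self, requirement: str, evidence: str, passed: bool) -> None:
        """Attach evidence and a result to the named checkpoint."""
        for cp in self.checkpoints:
            if cp.requirement == requirement:
                cp.evidence, cp.passed = evidence, passed
                return
        raise KeyError(f"unknown requirement: {requirement}")

    def unverified(self) -> list[str]:
        """Requirements that lack evidence or did not pass."""
        return [cp.requirement for cp in self.checkpoints
                if not (cp.passed and cp.evidence)]

    def is_complete(self) -> bool:
        """Success only when there is at least one checkpoint and none is unverified."""
        return bool(self.checkpoints) and not self.unverified()
```

A checkpoint marked as passed but without evidence still counts as unverified, which mirrors the rule that a claim such as "tests pass" needs output behind it.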

7. Individual Analysis Over Batch Processing

Core Principle: Every item deserves individual attention
Apply to:
  • Error messages (read each one individually)
  • Review items (analyze each line/file)
  • Decisions (don't apply blanket rules)
  • Suppressions (justify each one specifically)
Anti-Patterns:
  • Bulk categorization without reading details
  • Blanket solutions applied without context
  • Batch processing of unique situations

8. Semantic vs Literal Analysis

Look for conceptual overlap, not just text/pattern duplication
Key Questions:
  • What is the actual PURPOSE here?
  • Does this serve a functional need or just match a pattern?
  • What would be LOST if I removed/changed this?
  • Is this the same CONCEPT expressed differently?
Applications:
  • Documentation: Identify semantic duplication across hierarchy levels
  • Code review: Understand intent before suggesting changes
  • Optimization: Analyze actual necessity before improving

How to Use

Verify Before Claiming

Verify that package X version Y exists before recommending changes
Check if this file structure is symlinks or duplicates before recommending consolidation

Generate Multiple Hypotheses

The tests are failing with timeout errors. What are the possible mechanisms?
These three files have identical content. What could explain this?

Conduct Evidence-Based Review

Review this code and show evidence for every claim

Reasoning Workflows

Verification Workflow

When encountering unfamiliar versions/features:
  1. Recognize uncertainty: "I don't recall X from training"
  2. Form hypotheses: A) doesn't exist, B) exists but is newer than my training data, C) is the current version
  3. Verify before concluding: Check authoritative source
  4. Show evidence, then interpret: Command output -> conclusion

Assessment Workflow

When analyzing code, architecture, or configurations:
  1. Observe: What do I see?
  2. Investigate mechanism: HOW does this exist?
  3. Then assess: Based on mechanism, is this good/bad?

Review Workflow

For code reviews, documentation reviews, or any analysis:
  1. Clarify scope: Ask before assuming
  2. Show evidence for every claim: File:line:code
  3. Generate hypotheses before concluding
  4. Distinguish mechanism from observation
  5. Reserve strong language for verified issues
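Steps 2 and 5 of the review workflow can be combined in a small helper that quotes the real line from the file and withholds strong labels until a finding is verified. This Python sketch is an assumption about one possible shape; `quote_line`, `format_finding`, and the `UNVERIFIED` label are hypothetical.

```python
from pathlib import Path


def quote_line(path: str, line_no: int) -> str:
    """Return 'path:line: code' using the text actually found on that line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if not 1 <= line_no <= len(lines):
        raise ValueError(f"{path} has no line {line_no}")
    return f"{path}:{line_no}: {lines[line_no - 1].strip()!r}"


def format_finding(path: str, line_no: int, issue: str,
                   severity: str, verified: bool) -> str:
    """Attach quoted evidence to a claim; the severity label appears only once verified."""
    label = severity if verified else "UNVERIFIED"
    return f"[{label}] {quote_line(path, line_no)} - {issue}"
```

Reading the line from disk, rather than retyping it, means a finding that points at a missing line fails loudly instead of being reported.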

Cognitive Failure Patterns

Pattern 1: Scanning Instead of Reading

  • Missing obvious issues while finding minor ones
  • Prevention: Read every line/error individually

Pattern 2: Pattern Matching Without Context

  • Applying solutions without understanding problems
  • Prevention: Analyze actual purpose before applying templates

Pattern 3: Assumption-Based Conclusions

  • Guessing instead of verifying
  • Prevention: Evidence-based verification required

Pattern 4: Premature Success Declaration

  • "Task complete" ≠ "Requirements verified"
  • Prevention: Show tool output proving completion

Pattern 5: Temporal Knowledge Decay

  • Confusing "I don't know" with "doesn't exist"
  • Prevention: Verify version/feature currency

Pattern 6: Overconfidence Cascade

False premise: "X doesn't exist" (unverified)
Amplified: "This is CRITICAL/BLOCKER"
Harmful: "Change X to older version Y"
Impact: Downgrade from newer to older

BREAK THE CASCADE: Verify the premise first

Examples

Example 1: Version Verification

Situation: Reviewing code that uses certifi==2025.10.5
Wrong Approach:
CRITICAL: Version 2025.10.5 doesn't exist!
Action: Change to 2024.x.x
Correct Approach:
Let me verify this version exists...

$ grep -A 3 'name = "certifi"' uv.lock
name = "certifi"
version = "2025.10.5"
source = { registry = "https://pypi.org/simple" }

Evidence: the lockfile resolved version 2025.10.5 from the PyPI registry, so the version exists.
Action: NO CHANGE NEEDED - version is valid.
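The same lockfile lookup can be scripted. The Python sketch below mimics the `grep -A 3` check on lockfile text; it assumes the entry layout shown in the output above (a `name = "..."` line followed within a few lines by `version = "..."`), and `locked_version` is a hypothetical helper name.

```python
import re
from typing import Optional


def locked_version(lock_text: str, package: str) -> Optional[str]:
    """Return the version recorded for `package` in lockfile text, or None if absent."""
    lines = lock_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == f'name = "{package}"':
            # Inspect the next three lines, as `grep -A 3` would.
            for follow in lines[i + 1:i + 4]:
                match = re.match(r'\s*version = "([^"]+)"', follow)
                if match:
                    return match.group(1)
    return None
```

A `None` result only means the package is not recorded in this lockfile; it says nothing about whether a version exists on the registry.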

Example 2: File Structure Analysis

Observation: Three files with identical 1494-line content
  • specs/AGENTS.md
  • specs/CLAUDE.md
  • specs/GEMINI.md
Wrong Approach:
CRITICAL: File duplication - consolidate to single file
Correct Approach:
Competing hypotheses:
A: Duplicated copies -> CRITICAL: Consolidate
B: Symlinks to single source -> EXCELLENT: Keep

Gathering discriminating evidence:
$ ls -la specs/
lrwxrwxrwx CLAUDE.md -> AGENTS.md
lrwxrwxrwx GEMINI.md -> AGENTS.md

Mechanism: Symlinks (Hypothesis B confirmed)
Assessment: EXCELLENT architecture - agent-specific entry points with single source of truth
Action: Keep as-is
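The discriminating check in this example can also be done programmatically. The Python sketch below uses the standard `os.path` functions to tell symlinks from independent copies; the helper names `describe_entry` and `share_one_source` are hypothetical.

```python
import os


def describe_entry(path: str) -> str:
    """Report whether `path` is a symlink (and to what) or a regular file."""
    if os.path.islink(path):
        return f"{path} -> {os.readlink(path)} (symlink)"
    return f"{path} (regular file)"


def share_one_source(paths: list[str]) -> bool:
    """True when every path resolves to the same underlying file."""
    resolved = {os.path.realpath(p) for p in paths}
    return len(resolved) == 1
```

Identical content alone cannot separate the two hypotheses, so the check looks at how each path resolves rather than comparing file contents.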

Example 3: Test Failure Analysis

Observation: 5 tests failing with "connection timeout"
Hypotheses:
  • A: Single dependency down (fix one thing)
  • B: Multiple independent timeouts (fix five things)
  • C: Test infrastructure issue (fix setup)
  • D: Environment config missing (fix config)
Investigation:
  • Check test dependencies
  • Check error timestamps (simultaneous vs sequential)
  • Run tests in isolation
Then conclude based on evidence.
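The timestamp check in the investigation list can be sketched as a simple clustering test. This is a heuristic under stated assumptions: the function name `failures_cluster` and the five-second default window are invented for illustration, and the result is a hint about where to look next, not proof of any hypothesis.

```python
from datetime import datetime, timedelta


def failures_cluster(timestamps: list[datetime],
                     window: timedelta = timedelta(seconds=5)) -> bool:
    """True when all failures fall within `window` of each other.

    A tight cluster is consistent with a shared cause (such as hypothesis A);
    a wide spread is more consistent with independent timeouts (hypothesis B).
    """
    if len(timestamps) < 2:
        return False  # a single failure cannot show a pattern
    return max(timestamps) - min(timestamps) <= window
```

Whatever the result, the remaining steps still apply: check the test dependencies and run the tests in isolation before concluding.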

Anti-Patterns

DO NOT:
- "File X doesn't exist" without: ls X
- "Function not used" without: grep -r "function_name"
- "Version invalid" without: checking registry/lockfile
- "Tests fail" without: running tests
- "CRITICAL/BLOCKER" without verification
- Use strong language without evidence
- Skip mechanism investigation
- Pattern match to first familiar case

DO:
- Show grep/ls/find output BEFORE claiming
- Quote actual lines: "file.py:123: 'code here' - issue"
- Check lockfiles for resolved versions
- Run available tools and show output
- Reserve strong language for evidence-proven issues
- "Let me verify..." -> tool output -> interpretation
- Generate multiple hypotheses before gathering evidence
- Distinguish observation from mechanism
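For the "Function not used" case, a scripted equivalent of `grep -r` keeps the evidence in `file:line: text` form. The Python sketch below is a minimal illustration; `find_references` is a hypothetical helper, and restricting the search to `*.py` files by default is an assumption.

```python
import re
from pathlib import Path


def find_references(root: str, name: str, pattern: str = "*.py") -> list[str]:
    """List 'file:line: text' for every line under `root` that mentions `name` as a whole word."""
    word = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if word.search(line):
                hits.append(f"{path}:{line_no}: {line.strip()}")
    return hits
```

The hits include the definition itself, so a function is a candidate for "not used" only when its definition is the sole match. A plain text search can still miss dynamic references, so an empty or single-hit result is evidence to quote, not a final verdict.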

Clarifying Questions

Before proceeding with complex tasks, ask:
  1. What is the primary goal/context?
  2. What scope is expected (simple fix vs comprehensive)?
  3. What are the success criteria?
  4. What constraints exist?
For reviews specifically:
  • Scope: All changed files or specific ones?
  • Depth: Quick feedback or comprehensive analysis?
  • Focus: Implementation quality, standards, or both?
  • Output: List of issues or prioritized roadmap?

Task Management Patterns

Review Request Interpretation

Universal Rule: ALL reviews are comprehensive unless explicitly scoped
Never assume limited scope based on:
  • Recent conversation topics
  • Previously completed partial work
  • Specific words that seem to narrow scope
  • Apparent simplicity of request
Always include:
  • All applicable quality gates
  • Evidence for every claim
  • Complete verification of requirements
  • Systematic coverage (not spot-checking)

Context Analysis Decision Framework

Universal Process:
  1. Analyze actual purpose (don't assume from patterns)
  2. Check consistency with actual usage
  3. Verify with evidence (read/test to confirm)
  4. Ask before acting when uncertain
Recognition Pattern:
WRONG: "Other components do X, so this needs X"
RIGHT: "Let me analyze if this component actually needs X for its purpose"

Related Use Cases

  • Code reviews requiring evidence-based claims
  • Version verification before recommendations
  • Architectural assessments
  • Debugging with multiple possible causes
  • Documentation analysis
  • Security audits
  • Performance investigations
  • Any analysis requiring rigorous reasoning