quality-reflective-questions


Reflective Questions for Work Completeness

工作完成度反思提问框架

Quick Start

快速开始

Before marking ANYTHING as "done", ask yourself these questions and provide HONEST answers:
在将任何工作标记为“已完成”之前,问自己以下问题并给出诚实的答案:

The Four Mandatory Questions

四个必问问题

  1. How do I trigger this? (What's the entry point?)
  2. What connects it to the system? (Where's the wiring?)
  3. What evidence proves it runs? (Show me the logs)
  4. What shows it works correctly? (What's the outcome?)
If you cannot answer ALL FOUR with specific, concrete details, the work is NOT complete.
  1. 如何触发该功能?(入口点是什么?)
  2. 它与系统的连接点在哪里?(关联逻辑在哪里?)
  3. 有什么证据能证明它已运行?(展示相关日志)
  4. 有什么能证明它运行正常?(结果是什么?)
如果你无法针对所有四个问题给出具体、详实的细节,那么这项工作尚未完成。
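The four questions can also be mechanized as a small completion gate. The sketch below is purely illustrative (the `VAGUE_MARKERS` list and the function names are inventions of this example, not part of any tool): it refuses to mark work complete unless every question has a specific, non-vague answer.

```python
# Minimal sketch of the four-question gate. The question text comes from this
# document; VAGUE_MARKERS and the function names are illustrative assumptions.

QUESTIONS = (
    "How do I trigger this?",
    "What connects it to the system?",
    "What evidence proves it runs?",
    "What shows it works correctly?",
)

# Phrases that signal guessing rather than evidence.
VAGUE_MARKERS = ("should", "probably", "i think", "it works", "automatic")

def is_specific(answer: str) -> bool:
    """A specific answer is non-empty and free of vague language."""
    text = answer.strip().lower()
    return bool(text) and not any(marker in text for marker in VAGUE_MARKERS)

def work_is_complete(answers: dict) -> bool:
    """Complete only if ALL FOUR questions have specific answers."""
    return all(is_specific(answers.get(q, "")) for q in QUESTIONS)
```

A vague answer such as "It should work" fails `is_specific`, so the gate reports the work as not complete.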

The Honesty Test

诚实性测试

Replace vague answers with specific evidence:
  • ❌ Bad (vague): "It's integrated" → ✅ Good (specific): "Imported in builder.py line 45"
  • ❌ Bad (vague): "It works" → ✅ Good (specific): "Logs show execution at 10:30:45"
  • ❌ Bad (vague): "Tests pass" → ✅ Good (specific): "46 unit tests + 2 integration tests pass"
用具体证据替代模糊的回答:
  • ❌ 错误示例(模糊):“已完成集成” → ✅ 正确示例(具体):“在builder.py第45行导入”
  • ❌ 错误示例(模糊):“可以正常运行” → ✅ 正确示例(具体):“日志显示10:30:45时已执行”
  • ❌ 错误示例(模糊):“测试已通过” → ✅ 正确示例(具体):“46个单元测试 + 2个集成测试全部通过”

Table of Contents

目录

  1. When to Use This Skill
  2. What This Skill Does
  3. The Four Mandatory Questions (Deep Dive)
  4. Category-Specific Questions
  5. Red Flag Questions
  6. The Honesty Checklist
  7. Common Self-Deception Patterns
  8. Supporting Files
  9. Expected Outcomes
  10. Requirements
  11. Red Flags to Avoid
  1. 何时使用该框架
  2. 该框架的作用
  3. 四个必问问题(深度解析)
  4. 特定分类的补充问题
  5. 警示性问题
  6. 诚实性检查清单
  7. 常见自我欺骗模式
  8. 支持文件
  9. 预期成果
  10. 必要条件
  11. 需要避免的警示信号

When to Use This Skill

何时使用该框架

Explicit Triggers

明确触发场景

  • "Challenge my assumptions about completeness"
  • "Ask me reflective questions about my work"
  • "Self-review my implementation"
  • "Is this really done?"
  • "Verify my work is complete"
  • "Question my completion claims"
  • “挑战我对完成度的假设”
  • “问我关于工作的反思性问题”
  • “自我评审我的实现代码”
  • “这真的完成了吗?”
  • “验证我的工作是否已完成”
  • “质疑我关于工作完成的宣称”

Implicit Triggers (PROACTIVE)

主动触发场景(预防性)

  • Before marking any task complete (every single time)
  • Before moving ADR from in_progress to completed
  • Before claiming "feature works"
  • Before self-approving work
  • After implementing any feature
  • When about to say "all tests passing ✅"
  • 在标记任何任务为已完成之前(每次都要)
  • 在将ADR从进行中转为已完成状态之前
  • 在宣称“功能可正常运行”之前
  • 在自我批准工作之前
  • 在完成任何功能实现之后
  • 当你准备说“所有测试已通过 ✅”时

Debugging Triggers

调试触发场景

  • "Why do I feel uncertain about this?"
  • "Something seems incomplete but I can't pinpoint it"
  • "I want to mark this done but have doubts"
  • "Am I missing something?"
  • “为什么我对这项工作感到不确定?”
  • “感觉有些地方不完整,但我无法明确指出”
  • “我想标记为已完成,但心存疑虑”
  • “我是不是遗漏了什么?”

What This Skill Does

该框架的作用

This skill provides a structured framework of reflective questions that:
  1. Challenges assumptions about what "done" means
  2. Exposes gaps between claimed completion and actual completion
  3. Forces specificity instead of vague assurances
  4. Prevents premature completion by requiring evidence
  5. Catches integration failures before they become incidents
This skill complements
quality-verify-implementation-complete
by providing the mental framework for self-questioning BEFORE running technical verification.
本框架提供一套结构化的反思性提问体系,能够:
  1. 挑战对“已完成”定义的假设
  2. 暴露宣称的完成度与实际完成度之间的差距
  3. 强制要求给出具体细节,而非模糊的保证
  4. 通过要求提供证据防止过早标记完成
  5. 在问题演变为事故之前发现集成故障
本框架是
quality-verify-implementation-complete
的补充,它在执行技术验证之前,提供自我反思的思维框架。

The Four Mandatory Questions (Deep Dive)

四个必问问题(深度解析)

These questions MUST be answered for EVERY piece of work before claiming "done".
在宣称工作“已完成”之前,必须针对每一项工作回答这些问题。

Question 1: How do I trigger this?

问题1:如何触发该功能?

Purpose: Verify the feature has a reachable entry point
What it really asks:
  • Can a user/system actually invoke this code?
  • Is there a documented way to make this execute?
  • Could someone else trigger this without asking me?
Good Answers (Specific):
  • ✅ "Run:
    uv run temet-run -a talky -p 'analyze code'
    "
  • ✅ "Call:
    curl -X POST /api/endpoint -d '{...}'
    "
  • ✅ "Import:
    from myapp import MyService; MyService().method()
    "
  • ✅ "Event: Coordinator triggers when
    should_review_architecture()
    returns True"
Bad Answers (Vague):
  • ❌ "Run the system"
  • ❌ "It's automatic"
  • ❌ "The coordinator calls it"
  • ❌ "When needed"
Follow-up Questions:
  • "Can you show me the EXACT command right now?"
  • "What arguments/parameters are required?"
  • "Under what conditions does this trigger?"
  • "Could you trigger this in the next 30 seconds if asked?"
If you cannot answer specifically: The feature has no entry point → NOT COMPLETE
**目的:**验证功能有可访问的入口点
本质是问:
  • 用户/系统是否真的能调用这段代码?
  • 是否有文档说明如何执行该功能?
  • 其他人无需询问我就能触发该功能吗?
优秀回答(具体):
  • ✅ “执行命令:
    uv run temet-run -a talky -p 'analyze code'
    ”
  • ✅ “调用方式:
    curl -X POST /api/endpoint -d '{...}'
    ”
  • ✅ “导入方式:
    from myapp import MyService; MyService().method()
    ”
  • ✅ “触发事件:当
    should_review_architecture()
    返回True时,协调器会触发该功能”
糟糕回答(模糊):
  • ❌ “运行系统即可”
  • ❌ “它是自动触发的”
  • ❌ “协调器会调用它”
  • ❌ “需要时就会触发”
跟进问题:
  • “你现在能给出确切的触发命令吗?”
  • “需要哪些参数?”
  • “在什么条件下会触发该功能?”
  • “如果现在要求你触发,你能在30秒内完成吗?”
**如果无法给出具体回答:**功能没有可访问的入口点 → 未完成
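One way to make Question 1 falsifiable is to actually run the exact trigger command and check its exit code. This is a hedged, stdlib-only sketch; the helper name is an invention of this example, and the command you pass is your project's own entry point (for instance, the `uv run temet-run …` example above).

```python
# Sketch: prove the entry point is reachable by invoking the EXACT command.
# Uses only the standard library; the helper name is illustrative.
import subprocess

def can_trigger(command: list, timeout: float = 60.0) -> bool:
    """Run the exact trigger command; exit code 0 is the minimum proof
    that the entry point exists and is reachable right now."""
    result = subprocess.run(command, capture_output=True, text=True,
                            timeout=timeout)
    return result.returncode == 0
```

If you cannot produce a concrete `command` list to pass here, you have answered Question 1 vaguely.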

Question 2: What connects it to the system?

问题2:它与系统的连接点在哪里?

Purpose: Verify the artifact is actually wired into the codebase
What it really asks:
  • Where is the import statement?
  • Where is the registration/initialization?
  • Where is the configuration that enables this?
  • Can you show me the LINE NUMBER where this is connected?
Good Answers (Specific):
  • ✅ "builder.py line 45:
    from .architecture_nodes import create_review_node
    "
  • ✅ "main.py line 12:
    app.add_command(my_command)
    "
  • ✅ "container.py line 67:
    container.register(MyService, scope=Scope.SINGLETON)
    "
  • ✅ "routes.py line 23:
    router.add_route('/endpoint', handler)
    "
Bad Answers (Vague):
  • ❌ "It's imported"
  • ❌ "It's in the builder"
  • ❌ "It's registered"
  • ❌ "It's wired up"
Follow-up Questions:
  • "Can you paste the EXACT import line?"
  • "What FILE and LINE NUMBER has the registration?"
  • "Can you show me with grep output?"
  • "Could I find this connection in 60 seconds if I looked?"
If you cannot answer specifically: The artifact is orphaned → NOT COMPLETE
**目的:**验证该组件已正确接入代码库
本质是问:
  • 导入语句在哪里?
  • 注册/初始化逻辑在哪里?
  • 启用该功能的配置在哪里?
  • 你能展示连接点所在的具体行号吗?
优秀回答(具体):
  • ✅ “builder.py第45行:
    from .architecture_nodes import create_review_node
    ”
  • ✅ “main.py第12行:
    app.add_command(my_command)
    ”
  • ✅ “container.py第67行:
    container.register(MyService, scope=Scope.SINGLETON)
    ”
  • ✅ “routes.py第23行:
    router.add_route('/endpoint', handler)
    ”
糟糕回答(模糊):
  • ❌ “已经导入了”
  • ❌ “在builder文件里”
  • ❌ “已经注册过了”
  • ❌ “已经关联好了”
跟进问题:
  • “你能粘贴确切的导入代码行吗?”
  • “注册逻辑在哪个文件的哪一行?”
  • “你能用grep命令的输出展示吗?”
  • “如果我现在去查找,能在60秒内找到这个连接点吗?”
**如果无法给出具体回答:**该组件是孤立的 → 未完成
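Question 2 demands a file and line number, which is exactly what a grep pass produces. The sketch below performs the same search in pure Python; the function name and the exclusion of test directories are assumptions of this example, not a prescribed tool.

```python
# Sketch: find every production line mentioning a symbol, with file and
# line number -- the concrete answer Question 2 demands. Stdlib only.
from pathlib import Path

def find_wiring(symbol: str, root: str = ".") -> list:
    """Return (file, line_number, line) tuples from production files only."""
    hits = []
    for path in Path(root).rglob("*.py"):
        # Call-sites that exist only under test directories are not wiring.
        if any(part.startswith("test") for part in path.parts):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result for a symbol you believed was wired is precisely the orphaned-artifact signal this question exists to catch.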

Question 3: What evidence proves it runs?

问题3:有什么证据能证明它已运行?

Purpose: Verify the code actually executes at runtime
What it really asks:
  • Have you ACTUALLY triggered this and observed execution?
  • What logs/traces show this code path was hit?
  • Can you show me timestamped evidence of execution?
  • Did you observe this with your own eyes (or grep)?
Good Answers (Specific):
  • ✅ "Logs:
    [2025-12-07 10:30:45] INFO architecture_review_triggered agent=talky
    "
  • ✅ "Output:
    ✓ Task completed successfully
    (from CLI run at 10:30)"
  • ✅ "Trace: OpenTelemetry span
    architecture_review
    with duration 1.2s"
  • ✅ "Debug: Added print statement, saw output 'Node executed'"
Bad Answers (Vague):
  • ❌ "It should run"
  • ❌ "Tests pass"
  • ❌ "No errors when I ran it"
  • ❌ "The system works"
Follow-up Questions:
  • "Can you paste the ACTUAL log line showing execution?"
  • "What TIMESTAMP did this execute?"
  • "Did you observe this directly or are you assuming?"
  • "Could you trigger this RIGHT NOW and show me the logs?"
If you cannot answer specifically: No execution proof → NOT COMPLETE
**目的:**验证代码在运行时确实被执行
本质是问:
  • 你是否真的触发过该功能并观察到执行过程?
  • 哪些日志/追踪信息显示该代码路径已被执行?
  • 你能展示带时间戳的执行证据吗?
  • 你是亲眼看到(或通过grep命令确认)的吗?
优秀回答(具体):
  • ✅ “日志:
    [2025-12-07 10:30:45] INFO architecture_review_triggered agent=talky
    ”
  • ✅ “输出:
    ✓ Task completed successfully
    (10:30时通过CLI执行)”
  • ✅ “追踪信息:OpenTelemetry的
    architecture_review
    span,耗时1.2秒”
  • ✅ “调试:添加了打印语句,看到输出‘Node executed’”
糟糕回答(模糊):
  • ❌ “应该能运行”
  • ❌ “测试已通过”
  • ❌ “我运行时没有报错”
  • ❌ “系统能正常工作”
跟进问题:
  • “你能粘贴显示执行的实际日志行吗?”
  • “执行的时间戳是什么?”
  • “你是直接观察到的,还是只是假设?”
  • “你现在能触发该功能并展示日志吗?”
**如果无法给出具体回答:**没有执行证据 → 未完成
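Question 3 asks for timestamped proof, not absence of errors. A sketch of filtering captured logs down to exactly the lines worth pasting into a review; the `[YYYY-MM-DD HH:MM:SS]` format matches the example log line above and is an assumption about your logging setup.

```python
# Sketch: extract timestamped proof-of-execution lines from captured logs.
# The [YYYY-MM-DD HH:MM:SS] prefix is an assumption; adapt to your logger.
import re

TIMESTAMP = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]")

def execution_evidence(log_text: str, marker: str) -> list:
    """Return only the timestamped lines containing `marker` -- concrete
    evidence, as opposed to 'no errors when I ran it'."""
    return [line for line in log_text.splitlines()
            if marker in line and TIMESTAMP.search(line)]
```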

Question 4: What shows it works correctly?

问题4:有什么能证明它运行正常?

Purpose: Verify the code produces the expected outcome
What it really asks:
  • What observable outcome proves correct behavior?
  • What state changed as a result of execution?
  • What output/artifact was created?
  • How do you KNOW it did the right thing?
Good Answers (Specific):
  • ✅ "State:
    result.architecture_review = ArchitectureReviewResult(status=APPROVED, violations=[])
    "
  • ✅ "Database: Row inserted with ID 123, verified with query"
  • ✅ "File: Created
    output.txt
    with expected contents (see: cat output.txt)"
  • ✅ "API: Returned HTTP 200 with JSON body containing expected fields"
Bad Answers (Vague):
  • ❌ "It works"
  • ❌ "No errors"
  • ❌ "Tests pass"
  • ❌ "Everything looks good"
Follow-up Questions:
  • "Can you show me the EXACT output/state change?"
  • "What VALUE did this produce?"
  • "How do you KNOW this is correct vs just 'no errors'?"
  • "Could you demonstrate correct behavior RIGHT NOW?"
If you cannot answer specifically: No outcome proof → NOT COMPLETE
**目的:**验证代码产生了预期的结果
本质是问:
  • 有哪些可观察的结果能证明行为正确?
  • 执行后哪些状态发生了变化?
  • 生成了哪些输出/产物?
  • 你如何确认它做了正确的事情?
优秀回答(具体):
  • ✅ “状态:
    result.architecture_review = ArchitectureReviewResult(status=APPROVED, violations=[])
    ”
  • ✅ “数据库:插入了ID为123的行,已通过查询验证”
  • ✅ “文件:生成了
    output.txt
    ,内容符合预期(可查看:cat output.txt)”
  • ✅ “API:返回HTTP 200状态码,JSON响应包含预期字段”
糟糕回答(模糊):
  • ❌ “能正常运行”
  • ❌ “没有报错”
  • ❌ “测试已通过”
  • ❌ “看起来一切正常”
跟进问题:
  • “你能展示确切的输出/状态变化吗?”
  • “它具体产生了什么值/输出?”
  • “你如何确认这是正确的,而不仅仅是‘没有报错’?”
  • “你现在能演示它的正确行为吗?”
**如果无法给出具体回答:**没有结果正确性的证据 → 未完成
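Question 4 is answered by asserting on the actual outcome, field by field. A minimal sketch follows; the dict shape mirrors the `ArchitectureReviewResult` example above, but it is a plain-dict stand-in for illustration, not the real class.

```python
# Sketch: positive proof of correctness -- every expected field must be
# present with the expected value. "No errors" alone never passes this check.

def outcome_matches(result: dict, expected: dict) -> bool:
    """True only when each expected field exists in `result` with the
    expected value; a missing field is a failure, not a pass."""
    return all(key in result and result[key] == value
               for key, value in expected.items())
```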

Category-Specific Questions

特定分类的补充问题

Apply the Four Questions framework to specific implementation types. For detailed questions by category, see references/category-specific-questions.md.
Categories covered:
  • Modules/Files: Import verification, call-site validation
  • LangGraph Nodes: Graph registration, edge connectivity
  • CLI Commands: Registration, --help visibility, execution
  • Service Classes (DI): Container registration, injection points
  • API Endpoints: Route registration, response validation
将四个必问问题框架应用到特定类型的实现中。关于各分类的详细问题,请查看references/category-specific-questions.md。
覆盖的分类:
  • 模块/文件:导入验证、调用点确认
  • LangGraph节点:图注册、边连接验证
  • CLI命令:注册状态、--help可见性、执行验证
  • 服务类(DI):容器注册、注入点确认
  • API端点:路由注册、响应验证

Red Flag Questions

警示性问题

These questions expose common self-deception patterns. If you answer "yes" to any, stop and investigate.
这些问题能暴露常见的自我欺骗模式。如果任何问题的答案是“是”,请立即停止并调查。

Integration Red Flags

集成相关警示信号

  1. "Did I only test this in isolation?"
    • If YES: You might have orphaned code
    • Action: Add integration test, verify in real system
  2. "Am I assuming something is connected without verifying?"
    • If YES: Assumption might be wrong
    • Action: Grep for imports, verify connection exists
  3. "Did I only run unit tests, not integration tests?"
    • If YES: Integration might be broken
    • Action: Create/run integration tests
  4. "Am I relying on 'should' or 'probably' language?"
    • If YES: You're guessing, not verifying
    • Action: Replace guesses with evidence
  5. "Could this code exist and never execute?"
    • If YES: It might be orphaned
    • Action: Verify call-sites exist in production code
  1. “我只在隔离环境中测试过该功能?”
    • 如果是:你的代码可能是孤立的
    • 行动:添加集成测试,在真实系统中验证
  2. “我假设它已正确连接,但没有实际验证?”
    • 如果是:你的假设可能是错误的
    • 行动:用grep命令查找导入语句,验证连接是否存在
  3. “我只运行了单元测试,没有运行集成测试?”
    • 如果是:集成可能存在问题
    • 行动:创建/运行集成测试
  4. “我使用了‘应该’或‘可能’这类表述?”
    • 如果是:你在猜测,而非验证
    • 行动:用证据替代猜测
  5. “这段代码可能存在但永远不会被执行?”
    • 如果是:它可能是孤立代码
    • 行动:验证生产代码中是否存在调用点

Execution Red Flags

执行相关警示信号

  1. "Have I not actually triggered this feature?"
    • If YES: You don't know if it works
    • Action: Trigger it, observe execution
  2. "Am I claiming it works based on 'no errors' vs positive proof?"
    • If YES: Absence of errors ≠ presence of success
    • Action: Show positive evidence of correct behavior
  3. "Did I forget to check logs after running?"
    • If YES: No execution proof
    • Action: Run again, capture logs
  4. "Am I trusting tests alone without manual verification?"
    • If YES: Tests might be mocked/isolated
    • Action: Manual E2E test, verify in real environment
  5. "Could this feature be wired but the conditional never triggers?"
    • If YES: Dead code path
    • Action: Verify the condition is reachable
  1. “我从未实际触发过该功能?”
    • 如果是:你无法确认它是否能正常运行
    • 行动:触发该功能,观察执行过程
  2. “我仅根据‘没有报错’就宣称它能正常运行,而非有正面证据?”
    • 如果是:没有报错 ≠ 运行正常
    • 行动:提供能证明行为正确的正面证据
  3. “我运行后忘记检查日志了?”
    • 如果是:没有执行证据
    • 行动:重新运行并捕获日志
  4. “我只依赖测试结果,没有进行手动验证?”
    • 如果是:测试可能被模拟/隔离,无法反映真实情况
    • 行动:进行手动端到端测试,在真实环境中验证
  5. “该功能已接入系统,但触发条件永远无法满足?”
    • 如果是:这是死代码路径
    • 行动:验证触发条件是否可达

Completion Red Flags

完成度相关警示信号

  1. "Am I rushing to mark this complete?"
    • If YES: Slow down, verify properly
    • Action: Run through Four Questions again
  2. "Do I have doubts I'm ignoring?"
    • If YES: Your instinct is usually right
    • Action: Investigate the doubt before proceeding
  3. "Would I bet $1000 this works end-to-end?"
    • If NO: You're not confident
    • Action: Find out why, verify until confident
  4. "Could someone else verify this works without asking me?"
    • If NO: Insufficient documentation/evidence
    • Action: Document entry point, provide evidence
  5. "Am I self-approving without external review?"
    • If YES: You might miss blind spots
    • Action: Request reviewer agent or peer review
  1. “我急于将这项工作标记为已完成?”
    • 如果是:放慢速度,进行充分验证
    • 行动:重新过一遍四个必问问题
  2. “我有疑虑但选择忽略?”
    • 如果是:你的直觉通常是对的
    • 行动:在继续之前先调查疑虑点
  3. “我愿意赌1000美元它能端到端正常运行?”
    • 如果否:你并不确定
    • 行动:找出不确定的原因,验证到你有信心为止
  4. “其他人无需询问我就能验证它是否正常运行?”
    • 如果否:文档/证据不足
    • 行动:记录入口点,提供验证证据
  5. “我在没有外部评审的情况下自我批准了工作?”
    • 如果是:你可能遗漏了盲点
    • 行动:请求评审Agent或同行评审

The Honesty Checklist

诚实性检查清单

Before marking ANYTHING complete, answer these honestly:
在标记任何工作为已完成之前,诚实地回答以下问题:

Evidence Requirements

证据要求

  • I can paste the exact command to trigger this feature (Not "run the system" - the EXACT command with args)
  • I can show the file and line number where this is imported/registered (Not "it's in builder.py" - the EXACT line number)
  • I have actual logs showing this code executed (Not "it should log" - actual timestamped log lines)
  • I can show the specific output/state change this produced (Not "it works" - the EXACT output/data)
  • I triggered this manually and observed it work (Not "tests pass" - I personally ran it)
  • 我能粘贴触发该功能的确切命令 (不是“运行系统”,而是带参数的确切命令)
  • 我能展示该组件被导入/注册的文件和行号 (不是“在builder.py里”,而是确切的行号)
  • 我有显示该代码已执行的实际日志 (不是“应该会有日志”,而是带时间戳的实际日志行)
  • 我能展示它产生的具体输出/状态变化 (不是“能正常运行”,而是确切的输出/数据)
  • 我手动触发过该功能并观察到它正常运行 (不是“测试已通过”,而是我亲自运行过)

Integration Requirements

集成要求

  • This code is imported in at least one production file (grep output shows import, not just tests)
  • This code has call-sites in production paths (grep output shows calls, not just definitions)
  • This code is registered/wired where it needs to be (container, graph, router, CLI - verified)
  • Integration tests verify this component is in the system (Not just unit tests - integration/E2E tests exist)
  • 这段代码已在至少一个生产文件中被导入 (grep命令输出显示存在导入,而非仅在测试文件中)
  • 这段代码在生产路径中有调用点 (grep命令输出显示存在调用,而非仅定义)
  • 这段代码已在所需位置完成注册/关联 (容器、图、路由、CLI - 已验证)
  • 有集成测试验证该组件已接入系统 (不仅是单元测试,还存在集成/端到端测试)

Outcome Requirements

结果要求

  • I can demonstrate this works to someone else right now (Could walk someone through triggering and observing)
  • The behavior matches the specification (Not just "no errors" - correct behavior observed)
  • I would bet money this works end-to-end (Confident enough to stake reputation on it)
  • I have answered all Four Questions with specific details (No vague answers, all concrete)
  • 我现在就能向他人演示它的正常运行 (能引导他人完成触发和观察的全过程)
  • 行为符合需求规格 (不仅是“没有报错”,而是观察到了正确的行为)
  • 我愿意为它的端到端正常运行下注 (足够自信,愿意为此赌上自己的信誉)
  • 我已用具体细节回答了所有四个必问问题 (没有模糊回答,全部是具体内容)

If ANY checkbox is unchecked: NOT COMPLETE

如果有任何一个复选框未勾选:工作未完成

Common Self-Deception Patterns

常见自我欺骗模式

Be aware of these patterns that lead to premature completion claims. For detailed analysis and fixes, see references/self-deception-patterns.md.
Common Patterns:
  1. "Tests Pass" Syndrome - Unit tests pass but integration untested
  2. "Should Work" Fallacy - Using assumptions instead of evidence
  3. "No Errors" Confusion - Equating silence with correctness
  4. "File Exists" Completion - Code written but not integrated
  5. "Looks Good" Approval - Vague approval without specifics
  6. "I Remember Doing It" - Trusting memory over verification
  7. "Later Will Be Fine" - Deferring critical verification steps
请注意这些会导致过早宣称工作完成的模式。关于详细分析和解决方法,请查看references/self-deception-patterns.md。
常见模式:
  1. “测试通过”综合征 - 单元测试通过,但集成未测试
  2. “应该能运行”谬误 - 用假设替代证据
  3. “没有报错”误区 - 将无报错等同于运行正常
  4. “文件已存在”式完成 - 代码已编写但未接入系统
  5. “看起来不错”式批准 - 模糊的批准,没有具体依据
  6. “我记得做过” - 相信记忆而非当前验证
  7. “以后再处理” - 推迟关键的验证步骤

Usage

使用方法

  1. Before marking work complete, run through the Four Questions
  2. Check the Honesty Checklist - all boxes must be checked
  3. Verify no Red Flags are present
  4. If uncertain, review references/category-specific-questions.md for your implementation type
Supporting Files:
  • references/category-specific-questions.md - Detailed questions by category
  • references/self-deception-patterns.md - Pattern recognition and fixes
  1. 在标记工作为已完成之前,过一遍四个必问问题
  2. 检查诚实性检查清单 - 所有复选框必须勾选
  3. 确认没有警示信号存在
  4. 如果不确定,查看references/category-specific-questions.md中对应实现类型的补充问题
支持文件:
  • references/category-specific-questions.md - 各分类的详细问题
  • references/self-deception-patterns.md - 模式识别与解决方法

Expected Outcomes

预期成果

Successful Self-Review

成功的自我评审

Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-07

FOUR MANDATORY QUESTIONS:

1. How do I trigger this?
   ✅ SPECIFIC: uv run temet-run -a talky -p "Write a function"
   When should_review_architecture() returns True (when code_changes detected)

2. What connects it to the system?
   ✅ SPECIFIC: builder.py line 12: from .architecture_nodes import create_architecture_review_node
   builder.py line 146: graph.add_node("architecture_review", review_node)
   builder.py line 189: Conditional edge from "query_claude"

3. What evidence proves it runs?
   ✅ SPECIFIC: Logs from execution at 2025-12-07 10:30:45:
   [INFO] architecture_review_triggered agent=talky session=abc123
   [INFO] architecture_review_complete status=approved violations=0

4. What shows it works correctly?
   ✅ SPECIFIC: state.architecture_review = ArchitectureReviewResult(
       status=ReviewStatus.APPROVED,
       violations=[],
       recommendations=["Code follows Clean Architecture"]
   )

HONESTY CHECKLIST:
✅ All evidence specific, not vague
✅ All connections verified with grep
✅ Execution observed directly
✅ Outcome matches specification

SELF-DECEPTION CHECK:
✅ Not relying on "tests pass" alone
✅ Not using "should" or "probably"
✅ Not assuming - all verified
✅ Would bet $1000 this works

DECISION: ✅ WORK IS COMPLETE
Ready to mark as done.

Failed Self-Review (Catches Incompleteness)

失败的自我评审(发现不完整问题)

Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-05 (BEFORE FIX)

FOUR MANDATORY QUESTIONS:

1. How do I trigger this?
   ⚠️  VAGUE: "Run the coordinator"
   FOLLOW-UP: What's the EXACT command?
   RE-ANSWER: uv run temet-run -a talky -p "..."
   ⚠️  STILL VAGUE: What prompts trigger the node?

2. What connects it to the system?
   ❌ VAGUE: "It should be in builder.py"
   FOLLOW-UP: Can you show me the line number?
   RE-CHECK: grep "architecture_nodes" builder.py
   RESULT: (empty) ❌
   CRITICAL: MODULE IS NOT IMPORTED

3. What evidence proves it runs?
   ❌ ASSUMPTION: "Tests pass so it should run"
   FOLLOW-UP: Have you actually run it and seen logs?
   RE-ANSWER: "No, just ran unit tests"
   CRITICAL: NO EXECUTION PROOF

4. What shows it works correctly?
   ❌ ASSUMPTION: "Tests verify behavior"
   FOLLOW-UP: What actual output did you observe?
   RE-ANSWER: "Just the test assertions passing"
   CRITICAL: NO RUNTIME OUTCOME PROOF

HONESTY CHECKLIST:
❌ Using vague language ("should", "I think")
❌ No specific line numbers or imports shown
❌ No execution logs captured
❌ Relying on tests, not runtime verification

SELF-DECEPTION CHECK:
❌ Relying on "tests pass" only
❌ Using "should" repeatedly
❌ Assuming instead of verifying
❌ Would NOT bet $1000 (honest answer: no)

DECISION: ❌ WORK IS NOT COMPLETE
Critical issues found:
1. Module not imported in builder.py
2. No runtime execution proof
3. No integration test

DO NOT mark as done. Fix integration first.

Requirements

必要条件

Tools Required

所需工具

  • None (this is a mental framework)
  • 无(这是一个思维框架)

Knowledge Required

所需知识

  • Understanding of what "done" means in your domain
  • Willingness to be honest with yourself
  • Ability to distinguish vague from specific answers
  • 理解你所在领域中“已完成”的定义
  • 愿意对自己诚实
  • 能够区分模糊回答和具体回答

Mindset Required

所需心态

  • Intellectual honesty - Admit when you don't know
  • Rigor - Don't accept vague answers from yourself
  • Patience - Take time to verify properly
  • Courage - Admit incompleteness vs rushing to "done"
  • 理智诚实 - 承认自己不知道的事情
  • 严谨性 - 不接受自己给出的模糊回答
  • 耐心 - 花时间进行充分验证
  • 勇气 - 承认工作未完成,而非急于标记为“已完成”

Red Flags to Avoid

需要避免的警示信号

Do Not

不要

  • ❌ Accept vague answers from yourself
  • ❌ Use "should", "probably", "I think" language
  • ❌ Rush through questions to mark done faster
  • ❌ Skip questions that feel uncomfortable
  • ❌ Trust memory instead of current verification
  • ❌ Assume connection without grep proof
  • ❌ Claim execution without logs
  • ❌ Rely on unit tests alone for integration work
  • ❌ 接受自己给出的模糊回答
  • ❌ 使用“应该”“可能”“我认为”这类表述
  • ❌ 为了更快标记为已完成而仓促回答问题
  • ❌ 跳过让你感到不适的问题
  • ❌ 相信记忆而非当前的验证结果
  • ❌ 假设已连接但不用grep命令验证
  • ❌ 没有日志就宣称已执行
  • ❌ 仅依赖单元测试来验证集成工作

Do

要做

  • ✅ Answer all Four Questions with specific details
  • ✅ Replace assumptions with evidence
  • ✅ Be honest about gaps and uncertainties
  • ✅ Verify current state, don't trust memory
  • ✅ Show concrete proof (line numbers, logs, output)
  • ✅ Admit incompleteness when found
  • ✅ Fix gaps before marking complete
  • ✅ Use this framework for EVERY completion claim
  • ✅ 用具体细节回答所有四个必问问题
  • ✅ 用证据替代假设
  • ✅ 诚实地面对差距和不确定性
  • ✅ 验证当前状态,不相信记忆
  • ✅ 展示具体证据(行号、日志、输出)
  • ✅ 发现不完整时勇于承认
  • ✅ 标记为已完成之前先修复问题
  • ✅ 对每一项完成宣称都使用该框架

Notes

说明

  • This skill was created in response to ADR-013 (2025-12-07)
  • The pattern: Self-deception about completeness led to orphaned code
  • This skill provides the mental framework BEFORE technical verification
  • Pair this with
    quality-verify-implementation-complete
    for full coverage
  • The Four Questions are the MINIMUM bar, not the complete verification
  • Honesty with yourself is the foundation of quality work
Remember: The person you're most likely to deceive is yourself. These questions force honesty.
  • 该框架是为响应ADR-013(2025-12-07)而创建的
  • 背景:对完成度的自我欺骗导致了孤立代码的产生
  • 该框架在技术验证之前提供思维层面的准备
  • 与
    quality-verify-implementation-complete
    配合使用可实现全面覆盖
  • 四个必问问题是最低标准,而非完整的验证流程
  • 对自己诚实是高质量工作的基础
**请记住:**你最容易欺骗的人就是自己。这些问题能迫使你保持诚实。