quality-reflective-questions
Reflective Questions for Work Completeness
Quick Start
Before marking ANYTHING as "done", ask yourself these questions and provide HONEST answers:
The Four Mandatory Questions
- How do I trigger this? (What's the entry point?)
- What connects it to the system? (Where's the wiring?)
- What evidence proves it runs? (Show me the logs)
- What shows it works correctly? (What's the outcome?)
If you cannot answer ALL FOUR with specific, concrete details, the work is NOT complete.
The Honesty Test
Replace vague answers with specific evidence:
❌ Bad (vague): "It's integrated" → ✅ Good (specific): "Imported in builder.py line 45"
❌ Bad (vague): "It works" → ✅ Good (specific): "Logs show execution at 10:30:45"
❌ Bad (vague): "Tests pass" → ✅ Good (specific): "46 unit tests + 2 integration tests pass"
Table of Contents
- When to Use This Skill
- What This Skill Does
- The Four Mandatory Questions (Deep Dive)
- Category-Specific Questions
- Red Flag Questions
- The Honesty Checklist
- Common Self-Deception Patterns
- Supporting Files
- Expected Outcomes
- Requirements
- Red Flags to Avoid
When to Use This Skill
Explicit Triggers
- "Challenge my assumptions about completeness"
- "Ask me reflective questions about my work"
- "Self-review my implementation"
- "Is this really done?"
- "Verify my work is complete"
- "Question my completion claims"
Implicit Triggers (PROACTIVE)
- Before marking any task complete (every single time)
- Before moving ADR from in_progress to completed
- Before claiming "feature works"
- Before self-approving work
- After implementing any feature
- When about to say "all tests passing ✅"
Debugging Triggers
- "Why do I feel uncertain about this?"
- "Something seems incomplete but I can't pinpoint it"
- "I want to mark this done but have doubts"
- "Am I missing something?"
What This Skill Does
This skill provides a structured framework of reflective questions that:
- Challenges assumptions about what "done" means
- Exposes gaps between claimed completion and actual completion
- Forces specificity instead of vague assurances
- Prevents premature completion by requiring evidence
- Catches integration failures before they become incidents
This skill complements quality-verify-implementation-complete by providing the mental framework for self-questioning BEFORE running technical verification.
The Four Mandatory Questions (Deep Dive)
These questions MUST be answered for EVERY piece of work before claiming "done".
Question 1: How do I trigger this?
Purpose: Verify the feature has a reachable entry point
What it really asks:
- Can a user/system actually invoke this code?
- Is there a documented way to make this execute?
- Could someone else trigger this without asking me?
Good Answers (Specific):
- ✅ "Run: "
uv run temet-run -a talky -p 'analyze code' - ✅ "Call: "
curl -X POST /api/endpoint -d '{...}' - ✅ "Import: "
from myapp import MyService; MyService().method() - ✅ "Event: Coordinator triggers when returns True"
should_review_architecture()
Bad Answers (Vague):
- ❌ "Run the system"
- ❌ "It's automatic"
- ❌ "The coordinator calls it"
- ❌ "When needed"
Follow-up Questions:
- "Can you show me the EXACT command right now?"
- "What arguments/parameters are required?"
- "Under what conditions does this trigger?"
- "Could you trigger this in the next 30 seconds if asked?"
If you cannot answer specifically: The feature has no entry point → NOT COMPLETE
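The "could you trigger this in the next 30 seconds" test can be made mechanical by actually invoking the entry point from a subprocess. This is a minimal sketch under assumptions: the command passed in is a placeholder, and the demo uses the Python interpreter as a stand-in for a real CLI such as the `temet-run` example above.

```python
# Minimal sketch: prove an entry point is reachable by actually invoking it.
# The command is a placeholder; substitute your feature's EXACT trigger command.
import subprocess
import sys

def entry_point_reachable(cmd: list[str], timeout: int = 30) -> bool:
    """Run the exact trigger command; True only if it executes and exits 0."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # command missing or hung: the entry point is not live
    return result.returncode == 0

if __name__ == "__main__":
    # Self-demo: the interpreter stands in for a real CLI.
    print(entry_point_reachable([sys.executable, "-c", "print('triggered')"]))
```

If you cannot write the `cmd` list for your feature, that is itself the answer: there is no entry point.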
Question 2: What connects it to the system?
Purpose: Verify the artifact is actually wired into the codebase
What it really asks:
- Where is the import statement?
- Where is the registration/initialization?
- Where is the configuration that enables this?
- Can you show me the LINE NUMBER where this is connected?
Good Answers (Specific):
- ✅ "builder.py line 45: "
from .architecture_nodes import create_review_node - ✅ "main.py line 12: "
app.add_command(my_command) - ✅ "container.py line 67: "
container.register(MyService, scope=Scope.SINGLETON) - ✅ "routes.py line 23: "
router.add_route('/endpoint', handler)
Bad Answers (Vague):
- ❌ "It's imported"
- ❌ "It's in the builder"
- ❌ "It's registered"
- ❌ "It's wired up"
Follow-up Questions:
- "Can you paste the EXACT import line?"
- "What FILE and LINE NUMBER has the registration?"
- "Can you show me with grep output?"
- "Could I find this connection in 60 seconds if I looked?"
If you cannot answer specifically: The artifact is orphaned → NOT COMPLETE
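Answering "where is the wiring?" with a file and line number is exactly what a grep over production code produces. The sketch below is the programmatic equivalent; `architecture_nodes` and the directory layout in the comment are assumptions taken from the examples above, not a real project.

```python
# Minimal sketch: locate the import that wires an artifact into production code,
# returning file + line number (the grep output the follow-up questions ask for).
from pathlib import Path

def find_wiring(root: str, symbol: str) -> list[tuple[str, int, str]]:
    """Return (file, line_number, line) for each non-test import of `symbol`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        if "test" in path.name:
            continue  # production call-sites only; test-only imports don't count
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if symbol in line and "import" in line:
                hits.append((str(path), lineno, line.strip()))
    return hits

# An empty result means the artifact is orphaned -> NOT COMPLETE.
```

Note the test-file exclusion: an import that exists only in tests does not count as wiring.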
Question 3: What evidence proves it runs?
Purpose: Verify the code actually executes at runtime
What it really asks:
- Have you ACTUALLY triggered this and observed execution?
- What logs/traces show this code path was hit?
- Can you show me timestamped evidence of execution?
- Did you observe this with your own eyes (or grep)?
Good Answers (Specific):
- ✅ "Logs: "
[2025-12-07 10:30:45] INFO architecture_review_triggered agent=talky - ✅ "Output: (from CLI run at 10:30)"
✓ Task completed successfully - ✅ "Trace: OpenTelemetry span with duration 1.2s"
architecture_review - ✅ "Debug: Added print statement, saw output 'Node executed'"
Bad Answers (Vague):
- ❌ "It should run"
- ❌ "Tests pass"
- ❌ "No errors when I ran it"
- ❌ "The system works"
Follow-up Questions:
- "Can you paste the ACTUAL log line showing execution?"
- "What TIMESTAMP did this execute?"
- "Did you observe this directly or are you assuming?"
- "Could you trigger this RIGHT NOW and show me the logs?"
If you cannot answer specifically: No execution proof → NOT COMPLETE
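Execution proof means a log line that carries both a timestamp and the marker for your code path. This sketch filters for exactly that; the `[YYYY-MM-DD HH:MM:SS]` format and the marker string mirror the example log above and are assumptions, so adapt the regex to your own logging format.

```python
# Minimal sketch: extract timestamped proof-of-execution lines from log output.
# The timestamp format and marker string are assumptions from the example above.
import re

TIMESTAMP = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]")

def execution_evidence(log_lines: list[str], marker: str) -> list[str]:
    """Return lines carrying BOTH a timestamp and the execution marker."""
    return [ln.rstrip() for ln in log_lines
            if marker in ln and TIMESTAMP.search(ln)]

# An empty result is the honest answer "no execution proof" -> NOT COMPLETE.
```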
Question 4: What shows it works correctly?
Purpose: Verify the code produces the expected outcome
What it really asks:
- What observable outcome proves correct behavior?
- What state changed as a result of execution?
- What output/artifact was created?
- How do you KNOW it did the right thing?
Good Answers (Specific):
- ✅ "State: "
result.architecture_review = ArchitectureReviewResult(status=APPROVED, violations=[]) - ✅ "Database: Row inserted with ID 123, verified with query"
- ✅ "File: Created with expected contents (see: cat output.txt)"
output.txt - ✅ "API: Returned HTTP 200 with JSON body containing expected fields"
Bad Answers (Vague):
- ❌ "It works"
- ❌ "No errors"
- ❌ "Tests pass"
- ❌ "Everything looks good"
Follow-up Questions:
- "Can you show me the EXACT output/state change?"
- "What VALUE did this produce?"
- "How do you KNOW this is correct vs just 'no errors'?"
- "Could you demonstrate correct behavior RIGHT NOW?"
If you cannot answer specifically: No outcome proof → NOT COMPLETE
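The distinction between "no errors" and positive proof can be encoded directly: an outcome check must assert that the expected artifact exists with the expected content, never merely that nothing crashed. A minimal sketch, where the file name and expected fragment are illustrative assumptions:

```python
# Minimal sketch: "no errors" vs positive outcome proof.
# The artifact path and expected fragment are illustrative assumptions.
from pathlib import Path

def outcome_proven(artifact: Path, expected_fragment: str) -> bool:
    """True only on POSITIVE evidence: the artifact exists AND holds the
    expected content. 'Nothing crashed' alone never returns True here."""
    return artifact.exists() and expected_fragment in artifact.read_text()
```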
Category-Specific Questions
Apply the Four Questions framework to specific implementation types. For detailed questions by category, see references/category-specific-questions.md.
Categories covered:
- Modules/Files: Import verification, call-site validation
- LangGraph Nodes: Graph registration, edge connectivity
- CLI Commands: Registration, --help visibility, execution
- Service Classes (DI): Container registration, injection points
- API Endpoints: Route registration, response validation
Red Flag Questions
These questions expose common self-deception patterns. If you answer "yes" to any, stop and investigate.
Integration Red Flags
- "Did I only test this in isolation?"
  - If YES: You might have orphaned code
  - Action: Add integration test, verify in real system
- "Am I assuming something is connected without verifying?"
  - If YES: Assumption might be wrong
  - Action: Grep for imports, verify connection exists
- "Did I only run unit tests, not integration tests?"
  - If YES: Integration might be broken
  - Action: Create/run integration tests
- "Am I relying on 'should' or 'probably' language?"
  - If YES: You're guessing, not verifying
  - Action: Replace guesses with evidence
- "Could this code exist and never execute?"
  - If YES: It might be orphaned
  - Action: Verify call-sites exist in production code
Execution Red Flags
- "Have I not actually triggered this feature?"
  - If YES: You don't know if it works
  - Action: Trigger it, observe execution
- "Am I claiming it works based on 'no errors' vs positive proof?"
  - If YES: Absence of errors ≠ presence of success
  - Action: Show positive evidence of correct behavior
- "Did I forget to check logs after running?"
  - If YES: No execution proof
  - Action: Run again, capture logs
- "Am I trusting tests alone without manual verification?"
  - If YES: Tests might be mocked/isolated
  - Action: Manual E2E test, verify in real environment
- "Could this feature be wired but the conditional never triggers?"
  - If YES: Dead code path
  - Action: Verify the condition is reachable
Completion Red Flags
- "Am I rushing to mark this complete?"
  - If YES: Slow down, verify properly
  - Action: Run through Four Questions again
- "Do I have doubts I'm ignoring?"
  - If YES: Your instinct is usually right
  - Action: Investigate the doubt before proceeding
- "Would I bet $1000 this works end-to-end?"
  - If NO: You're not confident
  - Action: Find out why, verify until confident
- "Could someone else verify this works without asking me?"
  - If NO: Insufficient documentation/evidence
  - Action: Document entry point, provide evidence
- "Am I self-approving without external review?"
  - If YES: You might miss blind spots
  - Action: Request reviewer agent or peer review
The Honesty Checklist
Before marking ANYTHING complete, answer these honestly:
Evidence Requirements
- [ ] I can paste the exact command to trigger this feature (Not "run the system" - the EXACT command with args)
- [ ] I can show the file and line number where this is imported/registered (Not "it's in builder.py" - the EXACT line number)
- [ ] I have actual logs showing this code executed (Not "it should log" - actual timestamped log lines)
- [ ] I can show the specific output/state change this produced (Not "it works" - the EXACT output/data)
- [ ] I triggered this manually and observed it work (Not "tests pass" - I personally ran it)
Integration Requirements
- [ ] This code is imported in at least one production file (grep output shows import, not just tests)
- [ ] This code has call-sites in production paths (grep output shows calls, not just definitions)
- [ ] This code is registered/wired where it needs to be (container, graph, router, CLI - verified)
- [ ] Integration tests verify this component is in the system (Not just unit tests - integration/E2E tests exist)
Outcome Requirements
- [ ] I can demonstrate this works to someone else right now (Could walk someone through triggering and observing)
- [ ] The behavior matches the specification (Not just "no errors" - correct behavior observed)
- [ ] I would bet money this works end-to-end (Confident enough to stake reputation on it)
- [ ] I have answered all Four Questions with specific details (No vague answers, all concrete)
If ANY checkbox is unchecked: NOT COMPLETE
Common Self-Deception Patterns
Be aware of these patterns that lead to premature completion claims. For detailed analysis and fixes, see references/self-deception-patterns.md.
Common Patterns:
- "Tests Pass" Syndrome - Unit tests pass but integration untested
- "Should Work" Fallacy - Using assumptions instead of evidence
- "No Errors" Confusion - Equating silence with correctness
- "File Exists" Completion - Code written but not integrated
- "Looks Good" Approval - Vague approval without specifics
- "I Remember Doing It" - Trusting memory over verification
- "Later Will Be Fine" - Deferring critical verification steps
Usage
- Before marking work complete, run through the Four Questions
- Check the Honesty Checklist - all boxes must be checked
- Verify no Red Flags are present
- If uncertain, review references/category-specific-questions.md for your implementation type
Supporting Files:
- references/category-specific-questions.md - Detailed questions by category
- references/self-deception-patterns.md - Pattern recognition and fixes
Expected Outcomes
Successful Self-Review
Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-07
FOUR MANDATORY QUESTIONS:
1. How do I trigger this?
✅ SPECIFIC: uv run temet-run -a talky -p "Write a function"
When should_review_architecture() returns True (when code_changes detected)
2. What connects it to the system?
✅ SPECIFIC: builder.py line 12: from .architecture_nodes import create_architecture_review_node
builder.py line 146: graph.add_node("architecture_review", review_node)
builder.py line 189: Conditional edge from "query_claude"
3. What evidence proves it runs?
✅ SPECIFIC: Logs from execution at 2025-12-07 10:30:45:
[INFO] architecture_review_triggered agent=talky session=abc123
[INFO] architecture_review_complete status=approved violations=0
4. What shows it works correctly?
✅ SPECIFIC: state.architecture_review = ArchitectureReviewResult(
status=ReviewStatus.APPROVED,
violations=[],
recommendations=["Code follows Clean Architecture"]
)
HONESTY CHECKLIST:
✅ All evidence specific, not vague
✅ All connections verified with grep
✅ Execution observed directly
✅ Outcome matches specification
SELF-DECEPTION CHECK:
✅ Not relying on "tests pass" alone
✅ Not using "should" or "probably"
✅ Not assuming - all verified
✅ Would bet $1000 this works
DECISION: ✅ WORK IS COMPLETE
Ready to mark as done.
Failed Self-Review (Catches Incompleteness)
Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-05 (BEFORE FIX)
FOUR MANDATORY QUESTIONS:
1. How do I trigger this?
⚠️ VAGUE: "Run the coordinator"
FOLLOW-UP: What's the EXACT command?
RE-ANSWER: uv run temet-run -a talky -p "..."
⚠️ STILL VAGUE: What prompts trigger the node?
2. What connects it to the system?
❌ VAGUE: "It should be in builder.py"
FOLLOW-UP: Can you show me the line number?
RE-CHECK: grep "architecture_nodes" builder.py
RESULT: (empty) ❌
CRITICAL: MODULE IS NOT IMPORTED
3. What evidence proves it runs?
❌ ASSUMPTION: "Tests pass so it should run"
FOLLOW-UP: Have you actually run it and seen logs?
RE-ANSWER: "No, just ran unit tests"
CRITICAL: NO EXECUTION PROOF
4. What shows it works correctly?
❌ ASSUMPTION: "Tests verify behavior"
FOLLOW-UP: What actual output did you observe?
RE-ANSWER: "Just the test assertions passing"
CRITICAL: NO RUNTIME OUTCOME PROOF
HONESTY CHECKLIST:
❌ Using vague language ("should", "I think")
❌ No specific line numbers or imports shown
❌ No execution logs captured
❌ Relying on tests, not runtime verification
SELF-DECEPTION CHECK:
❌ Relying on "tests pass" only
❌ Using "should" repeatedly
❌ Assuming instead of verifying
❌ Would NOT bet $1000 (honest answer: no)
DECISION: ❌ WORK IS NOT COMPLETE
Critical issues found:
1. Module not imported in builder.py
2. No runtime execution proof
3. No integration test
DO NOT mark as done. Fix integration first.
Requirements
Tools Required
- None (this is a mental framework)
Knowledge Required
- Understanding of what "done" means in your domain
- Willingness to be honest with yourself
- Ability to distinguish vague from specific answers
Mindset Required
- Intellectual honesty - Admit when you don't know
- Rigor - Don't accept vague answers from yourself
- Patience - Take time to verify properly
- Courage - Admit incompleteness vs rushing to "done"
Red Flags to Avoid
Do Not
- ❌ Accept vague answers from yourself
- ❌ Use "should", "probably", "I think" language
- ❌ Rush through questions to mark done faster
- ❌ Skip questions that feel uncomfortable
- ❌ Trust memory instead of current verification
- ❌ Assume connection without grep proof
- ❌ Claim execution without logs
- ❌ Rely on unit tests alone for integration work
Do
- ✅ Answer all Four Questions with specific details
- ✅ Replace assumptions with evidence
- ✅ Be honest about gaps and uncertainties
- ✅ Verify current state, don't trust memory
- ✅ Show concrete proof (line numbers, logs, output)
- ✅ Admit incompleteness when found
- ✅ Fix gaps before marking complete
- ✅ Use this framework for EVERY completion claim
Notes
- This skill was created in response to ADR-013 (2025-12-07)
- The pattern: Self-deception about completeness led to orphaned code
- This skill provides the mental framework BEFORE technical verification
- Pair this with quality-verify-implementation-complete for full coverage
- The Four Questions are the MINIMUM bar, not the complete verification
- Honesty with yourself is the foundation of quality work
Remember: The person you're most likely to deceive is yourself. These questions force honesty.