model-first-reasoning
Model-First Reasoning (MFR)
A rigorous methodology that REQUIRES constructing an explicit problem MODEL before any reasoning or implementation. The model becomes a frozen contract that governs all downstream work.
Based on Kumar & Rana (2025), "Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling" (arXiv:2512.14474)
Why MFR Works
Hallucination is not merely the generation of false statements—it is a symptom of reasoning performed without a clearly defined model of the problem space.
Reasoning does not create structure; it operates on structure. When that structure is implicit or unstable, reasoning becomes unreliable. MFR provides "soft symbolic grounding"—enough structure to stabilize reasoning without imposing rigid formalism.
Core Principle
Phase 1 produces the MODEL. Phase 2 reasons/implements ONLY within the model.
This prevents the common failure mode where reasoning introduces ad-hoc decisions, missing constraints, or invented behavior not grounded in the problem definition.
Non-Negotiable Rules
- Phase 1 (Model) produces NO code and no solution steps, only the formal model
- Phase 2 (Implement) may NOT introduce new entities, state, actions, or constraints
- If you need something not in the model: output exactly `MODEL INCOMPLETE` plus what to add, then STOP
- No invented APIs or dependencies. If not provided, either ask (unknowns) or create a stub clearly marked `STUB`
The Model as Contract
After creating the model, run a MODEL AUDIT before coding:
Audit Checks
| Check | Description |
|---|---|
| Coverage | Every user requirement is represented in exactly one of: a constraint, the goal/acceptance criteria, or an action precondition/effect |
| Operability | Every operation your plan would require is present as an action |
| Consistency | Constraints don't contradict each other; action effects don't violate invariants |
| Testability | Every constraint has ≥1 test oracle |
If any audit check fails, revise the model (still Phase 1) until it passes.
Freeze Rule
Once the audit passes, treat the model as a read-only source of truth.
If you later discover missing information during implementation:
- Emit a `MODEL PATCH` (minimal change)
- Restart Phase 2 from scratch using the updated model
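A minimal Python sketch of the freeze rule, assuming the model is held as a JSON-compatible dict. The helper names (`fingerprint`, `freeze_model`, `apply_model_patch`), the hash-based fingerprint, and the patch format (top-level list fields to extend) are illustrative assumptions, not something MFR prescribes.

```python
import copy
import hashlib
import json
from collections.abc import Mapping
from types import MappingProxyType


def fingerprint(model: Mapping) -> str:
    """Stable hash of the model, so any later change is detectable."""
    canonical = json.dumps(dict(model), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def freeze_model(model: Mapping) -> MappingProxyType:
    """Return a read-only view over a private deep copy of the model.

    Only top-level assignment is blocked; nested lists stay mutable,
    which is why the fingerprint is useful as a second check.
    """
    return MappingProxyType(copy.deepcopy(dict(model)))


def apply_model_patch(frozen: Mapping, patch: dict) -> MappingProxyType:
    """Build a NEW frozen model from a minimal patch (assumed format).

    The old model is left untouched; the caller restarts Phase 2.
    """
    updated = copy.deepcopy(dict(frozen))
    for field_name, additions in patch.items():
        updated.setdefault(field_name, []).extend(additions)
    return freeze_model(updated)
```

Comparing `fingerprint(model)` before and after Phase 2 is one way to confirm that implementation work did not silently alter the model.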
Validation
After creating the model, write it to `model.json` and run the validator:
```bash
python scripts/validate-model.py model.json
```
Exit codes:
- `0` = Valid, ready for Phase 2
- `1` = Invalid structure (fix and retry)
- `2` = Valid but has unknowns (STOP after Phase 1)
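If the validator is driven from a script rather than by hand, the exit codes above can be mapped to the next step. This is a sketch only: the function names are hypothetical, and nothing is assumed about the validator beyond the script path and the three documented exit codes.

```python
import subprocess
import sys

# Exit codes exactly as documented above.
NEXT_STEP = {
    0: "proceed to Phase 2",
    1: "fix the model structure and re-run the validator",
    2: "stop after Phase 1 and resolve unknowns",
}


def next_step_for_exit_code(code: int) -> str:
    """Translate a validator exit code into the next MFR step."""
    if code not in NEXT_STEP:
        raise ValueError(f"undocumented validator exit code: {code}")
    return NEXT_STEP[code]


def run_validator(model_path: str = "model.json") -> str:
    """Run the validator script and report what to do next."""
    result = subprocess.run(
        [sys.executable, "scripts/validate-model.py", model_path],
        check=False,  # non-zero codes are meaningful here, not errors
    )
    return next_step_for_exit_code(result.returncode)
```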
Output Format
Phase 1: MODEL
The model may be expressed in natural language, semi-structured text, or JSON. Flexibility improves compliance—what matters is that the representation is explicit, inspectable, and stable.
For code generation tasks, the structured format below is recommended. Use MODEL_TEMPLATE.json as a reference:
```json
{
  "deliverable": {
    "description": "What we're building",
    "files_expected": ["path/to/file.ts", ...]
  },
  "entities": [
    {"name": "EntityName", "description": "...", "properties": [...]}
  ],
  "state_variables": [
    {"name": "varName", "type": "...", "initial": "...", "description": "..."}
  ],
  "actions": [
    {
      "name": "actionName",
      "description": "...",
      "preconditions": ["..."],
      "effects": ["..."],
      "parameters": [...]
    }
  ],
  "constraints": [
    {"id": "C1", "statement": "...", "type": "invariant|precondition|postcondition"}
  ],
  "initial_state": ["description of starting conditions"],
  "goal": ["acceptance criteria"],
  "assumptions": ["things we assume to be true"],
  "unknowns": ["questions that must be answered before proceeding"],
  "requirement_trace": [
    {
      "requirement": "<verbatim from user>",
      "represented_as": "goal|constraint|action",
      "ref": "C1|action_name|goal_item"
    }
  ],
  "test_oracles": [
    {"id": "T1", "maps_to": ["C1"], "description": "how to verify constraint"}
  ]
}
```
Critical: If `unknowns` is non-empty, STOP after Phase 1. Do not implement until unknowns are resolved.
Phase 1.5: MODEL AUDIT
Return:
```json
{
  "audit_pass": true|false,
  "issues": [
    {"type": "coverage|operability|consistency|testability", "detail": "..."}
  ]
}
```
If `audit_pass` is false, STOP and return to Phase 1 to revise the model.
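Two of the four audit checks are mechanical enough to sketch in Python. The function below is a hypothetical helper (not the project's validator) that assumes the field names from the structured model format. It checks testability, plus the part of coverage that can be automated: every `requirement_trace` entry must point at something that exists. It cannot tell whether a user requirement was left out of the trace, and operability and consistency still need human or model judgment.

```python
def audit_model(model: dict) -> dict:
    """Check testability and trace resolution; return a Phase 1.5 report."""
    issues = []

    constraint_ids = {c["id"] for c in model.get("constraints", [])}
    action_names = {a["name"] for a in model.get("actions", [])}
    goals = set(model.get("goal", []))

    # Testability: every constraint needs at least one test oracle.
    tested = {cid for t in model.get("test_oracles", []) for cid in t["maps_to"]}
    for cid in sorted(constraint_ids - tested):
        issues.append({"type": "testability", "detail": f"{cid} has no test oracle"})

    # Coverage (partial): every traced requirement must reference a
    # constraint id, action name, or goal item that is actually defined.
    known_refs = constraint_ids | action_names | goals
    for entry in model.get("requirement_trace", []):
        if entry["ref"] not in known_refs:
            issues.append({
                "type": "coverage",
                "detail": f"'{entry['requirement']}' references unknown '{entry['ref']}'",
            })

    return {"audit_pass": not issues, "issues": issues}
```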
Phase 2: IMPLEMENTATION
Using ONLY the frozen model:
A) PLAN
Numbered steps where each step must be an instance of a defined action:
```
Step 1: [action_name]
- Preconditions check: [list which preconditions are satisfied]
- Effects applied: [what state changes]
- Constraints check: [C1, C2, ...]
```
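The rule that every plan step is an instance of a defined action can be checked mechanically. A minimal Python sketch follows, assuming the plan is reduced to a list of action names. `check_plan` and the exact message wording after `MODEL INCOMPLETE` are hypothetical.

```python
def check_plan(model: dict, plan: list[str]) -> list[str]:
    """Return one MODEL INCOMPLETE message per step that uses an undefined action."""
    defined = {a["name"] for a in model.get("actions", [])}
    return [
        f"MODEL INCOMPLETE: step {i} uses undefined action '{name}'; add it to actions"
        for i, name in enumerate(plan, start=1)
        if name not in defined
    ]
```

An empty result means the plan stays inside the model. Any message means implementation should stop until the model is patched.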
B) CODE
Create all files in `deliverable.files_expected`:

| Model Element | Code Translation |
|---|---|
| entities / state_variables | Types, interfaces, data models |
| actions | Functions/modules with validation + explicit failure modes |
| constraints | Runtime checks, defensive parsing, invariants |
C) TESTS
Implement all `test_oracles`. Every constraint must be covered by ≥1 test.
D) VERIFICATION MAP
For each constraint, document:
- Where it is enforced in code (file:line)
- Which tests cover it
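One possible shape for the verification map is a dict keyed by constraint id. The structure, the key names (`enforced_at`, `tests`), and the sample locations below are assumptions for illustration. The source does not define a format.

```python
def uncovered_constraints(model: dict, verification_map: dict) -> list[str]:
    """List constraint ids lacking an enforcement location or a covering test."""
    missing = []
    for constraint in model.get("constraints", []):
        entry = verification_map.get(constraint["id"], {})
        if not entry.get("enforced_at") or not entry.get("tests"):
            missing.append(constraint["id"])
    return missing


# Hypothetical map: the file:line locations and test names are placeholders.
verification_map = {
    "C1": {"enforced_at": ["cart.py:42"], "tests": ["test_max_items"]},
    "C2": {"enforced_at": ["cart.py:45"], "tests": []},
}
```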
When to Use MFR
MFR is most valuable for:
- Complex state machines — where transitions must be valid
- Business logic with invariants — rules that must never be violated
- Data transformations — where input/output contracts matter
- Multi-step workflows — with dependencies between steps
- Safety-critical features — where bugs have high cost
- Collaborative specifications — where the model serves as documentation
When NOT to use: Simple, single-step tasks where the overhead of explicit modeling exceeds its benefit.
Relationship to Other Reasoning Strategies
MFR is complementary, not competing:
- With Chain-of-Thought: Use CoT within Phase 2 for step-by-step reasoning over the model
- With ReAct: Treat the model as persistent state that actions operate on
- With planning agents: The model provides the domain specification that planners reason over
MFR provides a foundational layer that improves robustness of any reasoning strategy in constraint-heavy domains.
Example Workflow
```
User: Build a shopping cart that enforces: max 10 items, no duplicate SKUs,
total can't exceed $1000

You: [PHASE 1: MODEL]
{
  "entities": [
    {"name": "CartItem", "properties": ["sku", "price", "quantity"]}
  ],
  "constraints": [
    {"id": "C1", "statement": "cart.items.length <= 10", "type": "invariant"},
    {"id": "C2", "statement": "all SKUs in cart are unique", "type": "invariant"},
    {"id": "C3", "statement": "sum(items.price * items.quantity) <= 1000", "type": "invariant"}
  ],
  "actions": [
    {
      "name": "addItem",
      "preconditions": ["cart.items.length < 10", "sku not in cart", "new total <= 1000"],
      "effects": ["item added to cart"]
    }
  ],
  ...
}

[PHASE 1.5: MODEL AUDIT]
{
  "audit_pass": true,
  "issues": []
}

[PHASE 2: IMPLEMENTATION]
// Now implementing strictly from the model...
```
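To make the Phase 2 step concrete, here is one way the cart model above might translate into Python. It is a sketch under stated assumptions: the `ConstraintViolation` exception, the use of `Decimal` for prices, and the snake_case name `add_item` for the `addItem` action are choices made here, not part of the model. Only the entity, the three constraints, and the one action shown in the model are implemented.

```python
from dataclasses import dataclass, field
from decimal import Decimal

MAX_ITEMS = 10               # C1
MAX_TOTAL = Decimal("1000")  # C3


class ConstraintViolation(Exception):
    """Raised when a precondition fails; the cart is left unchanged."""


@dataclass(frozen=True)
class CartItem:
    sku: str
    price: Decimal
    quantity: int


@dataclass
class Cart:
    items: list[CartItem] = field(default_factory=list)

    def total(self) -> Decimal:
        return sum((i.price * i.quantity for i in self.items), Decimal("0"))

    def add_item(self, item: CartItem) -> None:
        """The addItem action: check every precondition, then apply the effect."""
        if len(self.items) >= MAX_ITEMS:
            raise ConstraintViolation("C1: cart already holds 10 items")
        if any(existing.sku == item.sku for existing in self.items):
            raise ConstraintViolation(f"C2: SKU {item.sku} already in cart")
        if self.total() + item.price * item.quantity > MAX_TOTAL:
            raise ConstraintViolation("C3: total would exceed 1000")
        self.items.append(item)
```

Note what the sketch leaves out: there is no check that `quantity` is positive and no remove operation, because the model defines neither. Under MFR, needing either would call for a `MODEL PATCH` rather than an ad-hoc addition.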
Remember
The model is not overhead—it IS the specification. Most failures in complex reasoning are representational, not inferential: the reasoning was fine, but it operated on an incomplete or unstable understanding of the problem.
By externalizing the model, we make assumptions inspectable, constraints enforceable, and errors diagnosable. The model becomes the contract between intent and implementation.
Model first. Then reason. Never invert this.