ai-reasoning

Build AI That Reasons Through Hard Problems

Guide the user through making AI solve problems that need more than a simple answer. When a task requires planning, multi-step logic, or choosing the right approach, basic prompting fails. DSPy gives you composable reasoning strategies.

Step 1: Does the task need advanced reasoning?

Use this decision table:

Task type	Example	Best approach
Simple lookup / classification	"Is this email spam?"	`dspy.Predict`
Needs explanation or logic	"Why did the build fail?"	`dspy.ChainOfThought`
Math, counting, computation	"What's the total after discounts?"	`dspy.ProgramOfThought`
Needs to compare approaches	"Which database is best for this?"	`dspy.MultiChainComparison`
Complex multi-step, novel problems	"Plan a migration strategy"	Self-Discovery pattern

If the user isn't sure, start with `ChainOfThought`; it's the right default for most tasks.
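
The table can also be read as a lookup with a fallback. The sketch below is a hypothetical helper: the category labels and the `choose_approach` name are illustrative assumptions, not part of DSPy. It returns the name of the suggested approach and falls back to `dspy.ChainOfThought` when the category is unknown.

```python
# Hypothetical lookup mirroring the decision table above.
# The category keys are illustrative labels, not DSPy identifiers.
APPROACH_BY_CATEGORY = {
    "lookup": "dspy.Predict",
    "classification": "dspy.Predict",
    "explanation": "dspy.ChainOfThought",
    "computation": "dspy.ProgramOfThought",
    "comparison": "dspy.MultiChainComparison",
    "novel_multistep": "Self-Discovery pattern",
}


def choose_approach(category: str) -> str:
    """Return the suggested approach, defaulting to ChainOfThought."""
    key = category.strip().lower()
    return APPROACH_BY_CATEGORY.get(key, "dspy.ChainOfThought")


print(choose_approach("computation"))  # dspy.ProgramOfThought
print(choose_approach("not sure"))     # dspy.ChainOfThought
```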

Step 2: Basic reasoning patterns

ChainOfThought — think step by step

The workhorse. Adds intermediate reasoning before the final answer:

python

import dspy

class AnalyzeBug(dspy.Signature):
    """Analyze the bug report and determine root cause."""
    bug_report: str = dspy.InputField(desc="The bug report with error details")
    root_cause: str = dspy.OutputField(desc="The most likely root cause")
    fix_suggestion: str = dspy.OutputField(desc="Suggested fix")

analyzer = dspy.ChainOfThought(AnalyzeBug)
result = analyzer(bug_report="Users see 500 errors after deploying v2.3...")
print(result.reasoning)  # shows step-by-step thinking
print(result.root_cause)

ProgramOfThought — write code to compute the answer

When the answer requires calculation, let the AI write and execute code:

python

class CalculateMetrics(dspy.Signature):
    """Calculate business metrics from the provided data."""
    data_description: str = dspy.InputField(desc="Description of the data and what to calculate")
    result: str = dspy.OutputField(desc="The calculated result")

calculator = dspy.ProgramOfThought(CalculateMetrics)
result = calculator(data_description="Revenue was $50k in Jan, $63k in Feb, $58k in March. What's the average monthly growth rate?")

`ProgramOfThought` generates Python code, runs it in a sandbox, and returns the output. Use this for anything involving math, dates, data manipulation, or counting.
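
For the revenue question above, the generated program might look roughly like the sketch below. This is a hand-written illustration, not actual `ProgramOfThought` output, and the function names are assumptions. It also shows that "average monthly growth rate" is ambiguous: the arithmetic mean of the monthly changes and the compound rate give different numbers, which is one reason to compute rather than estimate.

```python
def month_over_month_growth(revenue):
    """Fractional change between each pair of consecutive months."""
    return [(curr - prev) / prev for prev, curr in zip(revenue, revenue[1:])]


def average_growth(revenue):
    """Arithmetic mean of the month-over-month growth rates."""
    rates = month_over_month_growth(revenue)
    return sum(rates) / len(rates)


def compound_growth(revenue):
    """Constant monthly rate that takes the first value to the last."""
    periods = len(revenue) - 1
    return (revenue[-1] / revenue[0]) ** (1 / periods) - 1


revenue = [50_000, 63_000, 58_000]  # Jan, Feb, Mar from the question above
print(f"arithmetic mean: {average_growth(revenue):.2%}")   # 9.03%
print(f"compound rate:   {compound_growth(revenue):.2%}")  # 7.70%
```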

MultiChainComparison — generate multiple answers, pick the best

When quality matters more than speed, reason multiple ways and compare:

python

class RecommendApproach(dspy.Signature):
    """Recommend the best technical approach for this problem."""
    problem: str = dspy.InputField()
    recommendation: str = dspy.OutputField()

problem = "We need to add real-time notifications to our app"

# MultiChainComparison compares existing attempts, so draft M candidates first
candidates = dspy.ChainOfThought(RecommendApproach, n=3)(problem=problem).completions
completions = [candidates[i] for i in range(len(candidates))]

recommender = dspy.MultiChainComparison(RecommendApproach, M=3)
result = recommender(completions, problem=problem)

`MultiChainComparison` takes several candidate chains of thought, asks the model to compare them, and returns a single refined answer.

When to use each

python

class SmartReasoner(dspy.Module):
    """Route to the best reasoning strategy based on the task."""
    def __init__(self):
        super().__init__()
        self.classify = dspy.Predict("question -> task_type: str")
        self.cot = dspy.ChainOfThought("question -> answer")
        self.pot = dspy.ProgramOfThought("question -> answer")
        # MultiChainComparison needs M candidate answers to compare
        self.drafts = dspy.ChainOfThought("question -> answer", n=3)
        self.mcc = dspy.MultiChainComparison("question -> answer", M=3)

    def forward(self, question):
        task_type = self.classify(question=question).task_type.lower()

        if "math" in task_type or "calcul" in task_type or "count" in task_type:
            return self.pot(question=question)
        elif "compare" in task_type or "recommend" in task_type or "best" in task_type:
            candidates = self.drafts(question=question).completions
            completions = [candidates[i] for i in range(len(candidates))]
            return self.mcc(completions, question=question)
        else:
            return self.cot(question=question)
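
The keyword routing in `SmartReasoner` can be checked without calling a model. The sketch below pulls the same substring rules into a standalone function; `route` and its return labels are hypothetical names introduced for illustration.

```python
def route(task_type: str) -> str:
    """Map a free-text task type to 'pot', 'mcc', or 'cot'.

    Uses the same substring rules as SmartReasoner.forward.
    """
    label = task_type.lower()
    if any(keyword in label for keyword in ("math", "calcul", "count")):
        return "pot"
    if any(keyword in label for keyword in ("compare", "recommend", "best")):
        return "mcc"
    return "cot"


print(route("Mathematical calculation"))  # pot
print(route("Recommendation"))            # mcc
print(route("Root-cause explanation"))    # cot
# Substring matching is loose: "accounting" contains "count".
print(route("Accounting policy question"))  # pot
```

Because the classifier output is free text, loose matches like the last one are worth covering in tests; constraining `task_type` to a fixed set of labels would avoid them.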

Step 3: Self-Discovery pattern

Use this for genuinely hard problems where the AI needs to figure out how to think, not just think harder. The pattern is inspired by research on Self-Discover prompting.

The 4-stage pipeline:

Select — pick relevant reasoning strategies from a library
Adapt — tailor those strategies to the specific task
Plan — create a structured reasoning plan
Execute — follow the plan to produce the answer

python

import dspy
from pydantic import BaseModel, Field

# Reasoning strategy library
REASONING_STRATEGIES = [
    "Break the problem into smaller sub-problems",
    "Think about edge cases and exceptions",
    "Work backwards from the desired outcome",
    "Consider analogies to simpler problems",
    "Identify constraints and requirements first",
    "Generate multiple hypotheses and evaluate each",
    "Think about what information is missing",
    "Check if the problem has been solved before in a different context",
    "Separate facts from assumptions",
    "Consider the problem from different stakeholder perspectives",
]

class SelectStrategies(dspy.Signature):
    """Select the most relevant reasoning strategies for this task."""
    task: str = dspy.InputField(desc="The problem to solve")
    available_strategies: list[str] = dspy.InputField()
    selected_strategies: list[str] = dspy.OutputField(
        desc="2-4 most relevant strategies for this task"
    )

class AdaptStrategies(dspy.Signature):
    """Adapt the selected strategies to this specific task."""
    task: str = dspy.InputField()
    strategies: list[str] = dspy.InputField(desc="Selected reasoning strategies")
    adapted_strategies: list[str] = dspy.OutputField(
        desc="Strategies rewritten for this specific problem"
    )

class ReasoningStep(BaseModel):
    step_number: int
    strategy: str = Field(description="Which reasoning strategy this step uses")
    description: str = Field(description="What to do in this step")

class CreatePlan(dspy.Signature):
    """Create a structured step-by-step reasoning plan."""
    task: str = dspy.InputField()
    adapted_strategies: list[str] = dspy.InputField()
    plan: list[ReasoningStep] = dspy.OutputField(desc="Ordered reasoning steps")

class ExecutePlan(dspy.Signature):
    """Execute the reasoning plan to solve the task."""
    task: str = dspy.InputField()
    plan: list[ReasoningStep] = dspy.InputField()
    step_results: list[str] = dspy.OutputField(desc="Result of each reasoning step")
    final_answer: str = dspy.OutputField(desc="The final answer based on all reasoning")

class SelfDiscoveryReasoner(dspy.Module):
    def __init__(self):
        super().__init__()
        self.select = dspy.ChainOfThought(SelectStrategies)
        self.adapt = dspy.ChainOfThought(AdaptStrategies)
        self.plan = dspy.ChainOfThought(CreatePlan)
        self.execute = dspy.ChainOfThought(ExecutePlan)

    # The input is named `question` so this module can be evaluated and optimized
    # with the same examples and metric as the "question -> answer" reasoners.
    def forward(self, question):
        # Stage 1: Select relevant strategies
        selected = self.select(
            task=question,
            available_strategies=REASONING_STRATEGIES,
        ).selected_strategies

        # Stage 2: Adapt to this task
        adapted = self.adapt(
            task=question,
            strategies=selected,
        ).adapted_strategies

        # Stage 3: Create reasoning plan
        plan = self.plan(
            task=question,
            adapted_strategies=adapted,
        ).plan

        # Stage 4: Execute the plan
        result = self.execute(task=question, plan=plan)

        return dspy.Prediction(
            strategies=selected,
            plan=plan,
            step_results=result.step_results,
            answer=result.final_answer,
        )

Step 4: Structured reasoning plans

For complex tasks, force the AI to show its work in a structured format:

python

import dspy
from pydantic import BaseModel, Field

class ReasoningTrace(BaseModel):
    step: str = Field(description="What this reasoning step does")
    observation: str = Field(description="What was observed or concluded")
    confidence: float = Field(description="0.0-1.0 confidence in this step")

class ReasonWithTrace(dspy.Signature):
    """Solve the problem step by step, showing reasoning at each stage."""
    question: str = dspy.InputField()
    trace: list[ReasoningTrace] = dspy.OutputField(desc="Step-by-step reasoning trace")
    answer: str = dspy.OutputField(desc="Final answer based on the reasoning trace")

def trace_quality(args, pred):
    """Reward 1.0 only if the trace has at least 2 steps and no low-confidence step."""
    enough_steps = len(pred.trace) >= 2
    confident = all(step.confidence > 0.3 for step in pred.trace)
    return float(enough_steps and confident)

class StructuredReasoner(dspy.Module):
    def __init__(self):
        super().__init__()
        # dspy.Suggest was removed in DSPy 2.6; dspy.Refine retries the module
        # (up to N times) until reward_fn reaches the threshold.
        self.reason = dspy.Refine(
            module=dspy.ChainOfThought(ReasonWithTrace),
            N=3,
            reward_fn=trace_quality,
            threshold=1.0,
        )

    def forward(self, question):
        return self.reason(question=question)
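
The two quality checks in this step (at least two steps, and no step at or below 0.3 confidence) can be exercised without a model. The sketch below is a minimal illustration that uses a stdlib dataclass as a stand-in for the `ReasoningTrace` model; `TraceStep` and `trace_problems` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TraceStep:
    """Stdlib stand-in for the ReasoningTrace model (assumption for this sketch)."""
    step: str
    observation: str
    confidence: float


def trace_problems(trace, min_steps=2, min_confidence=0.3):
    """Return human-readable problems; an empty list means the trace passes."""
    problems = []
    if len(trace) < min_steps:
        problems.append(f"only {len(trace)} step(s); expected at least {min_steps}")
    for index, item in enumerate(trace, start=1):
        if item.confidence <= min_confidence:
            problems.append(f"step {index} has low confidence ({item.confidence:.2f})")
    return problems


good = [
    TraceStep("Read the error log", "Timeout on DB connection", 0.9),
    TraceStep("Check recent deploys", "Pool size was reduced in v2.3", 0.8),
]
shaky = [TraceStep("Guess", "Probably a network issue", 0.2)]

print(trace_problems(good))   # []
print(trace_problems(shaky))  # two problems: too few steps, low confidence
```

Returning the list of problems rather than a bare boolean makes it easier to feed the reason for a failure back into a retry prompt or a log.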

Step 5: Evaluate reasoning quality

Don't just check the final answer — evaluate the reasoning process:

Judge intermediate steps

python

class JudgeReasoning(dspy.Signature):
    """Judge whether the reasoning process is sound."""
    question: str = dspy.InputField()
    reasoning_steps: list[str] = dspy.InputField(desc="The steps taken to reach the answer")
    answer: str = dspy.InputField()
    steps_are_logical: bool = dspy.OutputField(desc="Each step follows from the previous")
    no_logical_leaps: bool = dspy.OutputField(desc="No unjustified jumps in reasoning")
    answer_follows: bool = dspy.OutputField(desc="The answer follows from the reasoning")

def reasoning_quality_metric(example, prediction, trace=None):
    # Check final answer correctness
    correct = prediction.answer.strip().lower() == example.answer.strip().lower()

    # Also check reasoning quality
    judge = dspy.Predict(JudgeReasoning)
    quality = judge(
        question=example.question,
        reasoning_steps=prediction.step_results if hasattr(prediction, 'step_results') else [prediction.reasoning],
        answer=prediction.answer,
    )

    reasoning_score = (
        quality.steps_are_logical + quality.no_logical_leaps + quality.answer_follows
    ) / 3

    # Weight: 60% correct answer, 40% good reasoning
    return (0.6 * correct) + (0.4 * reasoning_score)
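
To see how the 60/40 weighting behaves, the arithmetic can be isolated from the judge call. This is a minimal sketch; `combined_score` is a hypothetical helper that takes the correctness flag and the three judge booleans directly.

```python
def combined_score(correct: bool, judge_flags: list[bool]) -> float:
    """60% weight on the final answer, 40% on the share of judge checks passed."""
    reasoning_score = sum(judge_flags) / len(judge_flags)
    return 0.6 * correct + 0.4 * reasoning_score


# Correct answer, one of three reasoning checks failed
print(round(combined_score(True, [True, True, False]), 4))   # 0.8667
# Wrong answer, but the reasoning passed every check
print(round(combined_score(False, [True, True, True]), 4))   # 0.4
# Correct answer reached through reasoning that failed every check
print(round(combined_score(True, [False, False, False]), 4)) # 0.6
```

With these weights a correct answer always outscores an incorrect one, since flawless reasoning alone tops out at 0.4.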

Compare reasoning approaches

Test which reasoning strategy works best for your task:

python

from dspy.evaluate import Evaluate

evaluator = Evaluate(devset=devset, metric=reasoning_quality_metric, num_threads=4)

# Test different approaches
cot = dspy.ChainOfThought("question -> answer")
pot = dspy.ProgramOfThought("question -> answer")
self_disc = SelfDiscoveryReasoner()

print("ChainOfThought:", evaluator(cot))
print("ProgramOfThought:", evaluator(pot))
print("SelfDiscovery:", evaluator(self_disc))
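
The `devset` passed to `Evaluate`, and the `trainset` used for optimization in the next step, are not defined in this guide. One way to produce them is a reproducible shuffle-and-split, sketched below with plain dictionaries; `split_examples` is a hypothetical helper and the records are placeholders. With DSPy the items would typically be `dspy.Example(question=..., answer=...).with_inputs("question")`.

```python
import random


def split_examples(examples, dev_fraction=0.3, seed=0):
    """Shuffle a copy of `examples` and split it into (trainset, devset)."""
    if not 0 < dev_fraction < 1:
        raise ValueError("dev_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_dev = max(1, round(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


# Placeholder records standing in for real labelled examples
examples = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(10)]
trainset, devset = split_examples(examples)
print(len(trainset), len(devset))  # 7 3
```

Keeping the dev set separate from the training set matters here because the optimizers in the next step pick demonstrations from the training examples.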

Step 6: Optimize reasoning

BootstrapFewShot per stage

For multi-stage reasoning (like Self-Discovery), optimize each stage:

python

optimizer = dspy.BootstrapFewShot(
    metric=reasoning_quality_metric,
    max_bootstrapped_demos=4,
)
optimized = optimizer.compile(SelfDiscoveryReasoner(), trainset=trainset)

MIPROv2 for instruction tuning

Automatically discover better instructions for the reasoning prompts:

python

optimizer = dspy.MIPROv2(metric=reasoning_quality_metric, auto="medium")
optimized = optimizer.compile(SelfDiscoveryReasoner(), trainset=trainset)

GEPA for reflective analysis

GEPA analyzes traces of successful and failed attempts to generate better instructions:

python

# GEPA passes five arguments to the metric and requires a budget plus a reflection model
gepa_metric = lambda gold, pred, trace=None, pred_name=None, pred_trace=None: reasoning_quality_metric(gold, pred, trace)
optimizer = dspy.GEPA(metric=gepa_metric, auto="light", reflection_lm=dspy.LM("openai/gpt-4o"))  # model name is a placeholder
optimized = optimizer.compile(SelfDiscoveryReasoner(), trainset=trainset)

Key patterns

Default to ChainOfThought — it's the right choice for most tasks that need reasoning
ProgramOfThought for computation — let the AI write code for math, dates, counting
MultiChainComparison for high stakes — generate multiple answers and pick the best
Self-Discovery for novel problems — dynamically select how to think, not just what to think
Evaluate the reasoning, not just the answer — good reasoning produces reliably correct answers
Structured traces — JSON reasoning steps make debugging and optimization easier

Additional resources

For worked examples (complex questions, data analysis, planning), see examples.md
Need AI to call APIs and use tools? Use `/ai-taking-actions`
Need multi-step pipelines with predetermined stages? Use `/ai-building-pipelines`
Next: `/ai-improving-accuracy` to measure and improve your reasoning system
