dspy-optimize-anything


GEPA optimize_anything


Goal


Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.

When to Use


  • Beyond prompt optimization — optimizing code, configs, SVGs, scheduling policies, etc.
  • Single hard problems — circle packing, kernel generation, algorithm discovery
  • Batch related problems — CUDA kernels, code generation tasks with cross-transfer
  • Generalization — agent skills, policies, or prompts that must transfer to unseen inputs
  • When you can express quality as a score and provide diagnostic feedback (ASI)

Inputs


| Input | Type | Description |
|---|---|---|
| `seed_candidate` | `str \| dict[str, str] \| None` | Starting artifact text, or `None` for seedless mode |
| `evaluator` | `Callable` | Returns score (higher = better), optionally with ASI dict |
| `dataset` | `list \| None` | Training examples (for multi-task and generalization modes) |
| `valset` | `list \| None` | Validation set (for generalization mode) |
| `objective` | `str \| None` | Natural language description of what to optimize for |
| `background` | `str \| None` | Domain knowledge and constraints |
| `config` | `GEPAConfig \| None` | Engine, reflection, and tracking settings |

Outputs


| Output | Type | Description |
|---|---|---|
| `result.best_candidate` | `str \| dict` | Best optimized artifact |

Workflow


Phase 1: Install


```bash
pip install gepa
```

Phase 2: Define Evaluator with ASI


The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):

```python
import gepa.optimize_anything as oa

def evaluate(candidate: str) -> float:
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
```

Rich evaluator (score + structured ASI):

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```

ASI can include open-ended text, structured data, multiple objectives (via `scores`), or images (via `gepa.Image`) for vision-capable LLMs.
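The multi-objective form mentioned above can be sketched as a plain evaluator that carries a `scores` entry in its ASI dict. This is a minimal illustration, not part of the gepa API: the sub-metrics, their weights, and the candidate checks here are all hypothetical stand-ins for real measurements.

```python
# Hedged sketch of a multi-objective evaluator. The "scores" ASI key carries
# per-objective values; a single scalar score is still returned for selection.
def evaluate_multi(candidate: str) -> tuple[float, dict]:
    correctness = 1.0 if "return" in candidate else 0.0  # hypothetical check
    brevity = max(0.0, 1.0 - len(candidate) / 1000)      # shorter is better
    scores = {"correctness": correctness, "brevity": brevity}
    overall = 0.8 * correctness + 0.2 * brevity          # scalar for selection
    return overall, {"scores": scores, "Length": f"{len(candidate)} chars"}
```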

Phase 3: Choose Optimization Mode


Mode 1 — Single-Task Search: Solve one hard problem. No dataset needed.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
)
```

Mode 2 — Multi-Task Search: Solve a batch of related problems with cross-transfer.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
)
```

Mode 3 — Generalization: Build a skill/prompt/policy that transfers to unseen problems.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
)
```

Seedless mode: Describe what you need instead of providing a seed.

```python
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
)
```

Phase 4: Use Results


```python
print(result.best_candidate)
```
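Since `best_candidate` mirrors the seed's shape (`str` or `dict`, per the Inputs table), a small persistence helper can handle both cases. This is an illustrative sketch, not part of the gepa API; the file names are arbitrary choices.

```python
from pathlib import Path

def save_candidate(best, out_dir: str = "artifacts") -> list[str]:
    """Write a best_candidate (str, or dict of named texts) to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    if isinstance(best, dict):
        # Dict candidates: one file per named component.
        paths = [out / f"{name}.txt" for name in best]
        for path, text in zip(paths, best.values()):
            path.write_text(text)
    else:
        # String candidates: a single file.
        paths = [out / "best_candidate.txt"]
        paths[0].write_text(best)
    return [str(p) for p in paths]
```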

Production Example


```python
import gepa.optimize_anything as oa
from gepa import Image
import logging

logger = logging.getLogger(__name__)
```

---------- SVG optimization with VLM feedback ----------


```python
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"

VISUAL_ASPECTS = [
    {"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]

def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }

result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
               "Improve anatomy, composition, and visual quality.",
)
logger.info(f"Best SVG:\n{result.best_candidate['svg_code']}")
```

---------- Code optimization (single-task) ----------


```python
def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
    import subprocess, json

    proc = subprocess.run(
        ["python", "-c", candidate],
        capture_output=True, text=True, timeout=30,
    )

    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}

    try:
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError) as e:
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdout": proc.stdout}

result = oa.optimize_anything(
    evaluator=evaluate_solver,
    objective="Write a Python solver for the bin packing problem that "
              "minimizes the number of bins. Output JSON with 'score' and 'solution'.",
    background="Use first-fit-decreasing as a starting heuristic. "
               "Higher score = fewer bins used.",
)
print(result.best_candidate)
```

---------- Agent architecture generalization ----------


```python
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
    """Run an agent architecture on a task and score it."""
    exec_globals = {}
    exec(candidate, exec_globals)
    agent_fn = exec_globals.get("solve")
    if agent_fn is None:
        return 0.0, {"Error": "No `solve` function defined"}

    try:
        prediction = agent_fn(example["input"])
        correct = prediction == example["expected"]
        score = 1.0 if correct else 0.0
        feedback = "Correct" if correct else (
            f"Expected '{example['expected']}', got '{prediction}'"
        )
        return score, {"Prediction": prediction, "Feedback": feedback}
    except Exception as e:
        return 0.0, {"Error": str(e)}

result = oa.optimize_anything(
    seed_candidate="def solve(input):\n    return input",
    evaluator=evaluate_agent,
    dataset=train_tasks,
    valset=val_tasks,
    background="Discover a Python agent function `solve(input)` that "
               "generalizes across unseen reasoning tasks.",
)
print(result.best_candidate)
```

Integration with DSPy


`optimize_anything` complements DSPy's built-in optimizers. Use DSPy optimizers (GEPA, MIPROv2, BootstrapFewShot) for DSPy programs, and `optimize_anything` for arbitrary text artifacts outside DSPy:

```python
import dspy
import gepa.optimize_anything as oa

# DSPy program optimization (use dspy.GEPA)
optimizer = dspy.GEPA(
    metric=gepa_metric,
    reflection_lm=dspy.LM("openai/gpt-4o"),
    auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)

# Non-DSPy artifact optimization (use optimize_anything)
result = oa.optimize_anything(
    seed_candidate=my_config_yaml,
    evaluator=eval_config,
    background="Optimize Kubernetes scheduling policy for cost.",
)
```

Best Practices


  1. Rich ASI — The more diagnostic feedback you provide, the better the proposer can reason about improvements
  2. Use `oa.log()` — Route prints to the proposer as ASI instead of stdout
  3. Structured returns — Return `(score, dict)` tuples for multi-faceted diagnostics
  4. Seedless for exploration — Use `objective=` when the solution space is large and unfamiliar
  5. Background context — Provide domain knowledge via `background=` to constrain the search
  6. Generalization mode — Always provide `valset` when the artifact must transfer to unseen inputs
  7. Images as ASI — Use `gepa.Image` to pass rendered outputs to vision-capable LLMs

Limitations


  • Requires the `gepa` package (`pip install gepa`)
  • Evaluator must be deterministic or low-variance for stable optimization
  • Compute cost scales with the number of candidates explored
  • Single-task mode does not generalize; use mode 3 with `valset` for transfer
  • Currently powered by the GEPA backend; the API is backend-agnostic for future strategies
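For the determinism caveat, one common workaround is to wrap a noisy evaluator so each candidate is scored several times and the mean is used. This is a minimal sketch under stated assumptions: the repeat count is arbitrary, and the wrapper guesses whether your evaluator returns a bare score or a `(score, ASI)` tuple.

```python
import statistics

def averaged(evaluator, n_repeats: int = 3):
    """Score each candidate n_repeats times and return the mean score,
    reducing run-to-run variance; keeps the last ASI dict seen (if any)."""
    def wrapped(candidate):
        scores, last_asi = [], {}
        for _ in range(n_repeats):
            result = evaluator(candidate)
            if isinstance(result, tuple):   # (score, ASI) form
                score, last_asi = result
            else:                           # bare-score form
                score = result
            scores.append(score)
        return statistics.mean(scores), last_asi
    return wrapped
```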