dspy-optimize-anything
GEPA optimize_anything
Goal
Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.
When to Use
- Beyond prompt optimization — optimizing code, configs, SVGs, scheduling policies, etc.
- Single hard problems — circle packing, kernel generation, algorithm discovery
- Batch related problems — CUDA kernels, code generation tasks with cross-transfer
- Generalization — agent skills, policies, or prompts that must transfer to unseen inputs
- When you can express quality as a score and provide diagnostic feedback (ASI)
Inputs
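| Input | Type | Description |
|---|---|---|
| `seed_candidate` | `str`, `dict`, or `None` | Starting artifact text, or `None` (omitted) for seedless mode |
| `evaluator` | callable | Returns score (higher=better), optionally with ASI dict |
| `dataset` | list | Training examples (for multi-task and generalization modes) |
| `valset` | list | Validation set (for generalization mode) |
| `objective` | `str` | Natural language description of what to optimize for |
| `background` | `str` | Domain knowledge and constraints |
| `config` | `GEPAConfig` | Engine, reflection, and tracking settings |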
Outputs
| Output | Type | Description |
|---|---|---|
| `result.best_candidate` | `str` or `dict` | Best optimized artifact |
Workflow
Phase 1: Install
```bash
pip install gepa
```
Phase 2: Define Evaluator with ASI
The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):
```python
import gepa.optimize_anything as oa

def evaluate(candidate: str) -> float:
    # run_my_system is a placeholder for your own scoring routine.
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
```
Rich evaluator (score + structured ASI):
```python
def evaluate(candidate: str) -> tuple[float, dict]:
    # execute_code is a placeholder for your own runner.
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```
ASI can include open-ended text, structured data, multi-objectives (via `scores`), or images (via `gepa.Image`) for vision-capable LLMs.
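As a rough illustration of the multi-objective case, the sketch below returns a scalar score together with a `scores` entry inside the ASI dict. Only the `scores` key is taken from the text above. The two toy objectives, the equal weighting, and the name `evaluate_layout` are assumptions made up for this example, and the exact shape GEPA expects should be checked against its documentation.

```python
def evaluate_layout(candidate: str) -> tuple[float, dict]:
    """Score a text artifact on two toy objectives (hypothetical example)."""
    lines = candidate.splitlines()
    # Objective 1: brevity, 1.0 for an empty artifact and falling as it grows.
    brevity = 1.0 / (1.0 + len(lines))
    # Objective 2: fraction of lines that fit in 80 columns.
    fits = sum(len(line) <= 80 for line in lines)
    readability = fits / len(lines) if lines else 1.0
    # Scalar score: an equal-weight average (an assumed weighting).
    score = 0.5 * brevity + 0.5 * readability
    asi = {
        "scores": {"brevity": brevity, "readability": readability},
        "Feedback": f"{len(lines) - fits} of {len(lines)} lines exceed 80 columns",
    }
    return score, asi
```

Keeping the per-objective values next to a text `Feedback` entry lets the proposer see both which objective is lagging and why.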
Phase 3: Choose Optimization Mode
Mode 1 (Single-Task Search): Solve one hard problem. No dataset needed.
```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
)
```
Mode 2 (Multi-Task Search): Solve a batch of related problems with cross-transfer.
```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
)
```
Mode 3 (Generalization): Build a skill/prompt/policy that transfers to unseen problems.
```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
)
```
Seedless mode: Describe what you need instead of providing a seed.
```python
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
)
```
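The call shapes above differ only in which arguments are present. The helper below is a hypothetical sketch (it is not part of `gepa`) that names the mode implied by a given argument combination. It treats seedless as independent of the three modes, which is an assumption; the source only shows seedless with no dataset.

```python
from typing import Optional, Sequence

def describe_mode(
    seed_candidate: Optional[object] = None,
    dataset: Optional[Sequence] = None,
    valset: Optional[Sequence] = None,
) -> str:
    """Name the optimization mode implied by the arguments (hypothetical helper)."""
    if valset is not None:
        mode = "generalization"      # dataset + valset
    elif dataset is not None:
        mode = "multi-task search"   # dataset only
    else:
        mode = "single-task search"  # no dataset
    # A missing seed means the artifact is generated from the objective text.
    return mode + (" (seedless)" if seed_candidate is None else "")
```

For example, `describe_mode("x", dataset=[1], valset=[2])` returns `"generalization"`.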
Phase 4: Use Results
```python
print(result.best_candidate)
```
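In the examples in this document the best candidate is either a plain string or a dict of named text fields (the SVG example reads `result.best_candidate["svg_code"]`). A minimal sketch for persisting either shape follows. `save_candidate`, the `.txt` suffix, and the default file stem are assumptions for illustration.

```python
from pathlib import Path

def save_candidate(candidate, out_dir: str) -> list[Path]:
    """Write a str candidate to one file, or a dict candidate to one file per key."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Normalize to {name: text}; "candidate" is an assumed default file stem.
    parts = candidate if isinstance(candidate, dict) else {"candidate": candidate}
    written = []
    for name, text in parts.items():
        path = out / f"{name}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Calling `save_candidate(result.best_candidate, "runs/best")` would then leave one file per field under that (hypothetical) directory.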
Production Example
```python
import gepa.optimize_anything as oa
from gepa import Image
import logging

logger = logging.getLogger(__name__)
```
SVG optimization with VLM feedback:
```python
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"

# Each dataset entry asks the VLM to judge one visual aspect.
VISUAL_ASPECTS = [
    {"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]

def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    # render_image and get_vlm_score_feedback are user-supplied helpers.
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }

result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
               "Improve anatomy, composition, and visual quality.",
)
logger.info("Best SVG:\n%s", result.best_candidate["svg_code"])
```
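The criteria strings above ask the VLM to answer with `SCORE: X/10`, so the `get_vlm_score_feedback` helper has to pull that number out of free text. The source does not show how it does so. The sketch below is one plausible way, with the name `parse_vlm_score` and the normalization to the range 0 to 1 both being assumptions.

```python
import re

# Matches "SCORE: 7/10" or "score: 7.5 / 10".
SCORE_PATTERN = re.compile(r"SCORE:\s*(\d+(?:\.\d+)?)\s*/\s*10", re.IGNORECASE)

def parse_vlm_score(reply: str) -> tuple[float, str]:
    """Extract 'SCORE: X/10' from a VLM reply and normalize it to [0, 1]."""
    match = SCORE_PATTERN.search(reply)
    if match is None:
        # No parseable score: return 0.0 and say why, so the proposer sees it.
        return 0.0, f"No 'SCORE: X/10' found in reply: {reply!r}"
    raw = float(match.group(1))
    # Clamp out-of-range answers before normalizing.
    score = min(max(raw, 0.0), 10.0) / 10.0
    return score, reply.strip()
```

Returning the full reply as feedback keeps the VLM's reasoning available as ASI instead of reducing it to a number.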
Code optimization (single-task):
```python
import json
import subprocess
import sys

def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
    try:
        # Run the candidate in a fresh interpreter, capped at 30 seconds.
        proc = subprocess.run(
            [sys.executable, "-c", candidate],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        oa.log("Timed out after 30s")
        return 0.0, {"Error": "Timed out after 30s"}
    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}
    try:
        # The candidate must print a JSON object with at least a "score" key.
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError, TypeError) as e:
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdout": proc.stdout}

result = oa.optimize_anything(
    evaluator=evaluate_solver,
    objective="Write a Python solver for the bin packing problem that "
              "minimizes the number of bins. Output JSON with 'score' and 'solution'.",
    background="Use first-fit-decreasing as a starting heuristic. "
               "Higher score = fewer bins used.",
)
print(result.best_candidate)
```
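For orientation, here is a sketch of the kind of program the optimizer might propose for this objective: a first-fit-decreasing packer that prints the JSON object the evaluator parses. The toy instance, the bin capacity, and the `1 / number_of_bins` scoring are assumptions chosen so that fewer bins yields a higher score. The source does not specify them.

```python
import json

def first_fit_decreasing(items: list[float], capacity: float) -> list[list[float]]:
    """Pack items into bins using the first-fit-decreasing heuristic."""
    bins: list[list[float]] = []
    free: list[float] = []  # remaining capacity of each open bin
    for item in sorted(items, reverse=True):
        if item > capacity:
            raise ValueError(f"item {item} exceeds bin capacity {capacity}")
        for i, room in enumerate(free):
            if item <= room:
                bins[i].append(item)
                free[i] -= item
                break
        else:
            # No open bin has room, so open a new one.
            bins.append([item])
            free.append(capacity - item)
    return bins

if __name__ == "__main__":
    items = [4, 8, 1, 4, 2, 1]  # assumed toy instance
    packed = first_fit_decreasing(items, capacity=10)
    # Assumed scoring: 1 / bins, so fewer bins gives a higher score.
    print(json.dumps({"score": 1.0 / len(packed), "solution": packed}))
```

Because `evaluate_solver` runs the candidate with `python -c`, the `__main__` guard is active and the JSON line reaches stdout.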
Agent architecture generalization:
```python
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
    """Run an agent architecture on a task and score it."""
    exec_globals = {}
    try:
        # The candidate is generated code, so loading it can itself fail.
        exec(candidate, exec_globals)
    except Exception as e:
        return 0.0, {"Error": f"Candidate failed to load: {e}"}
    agent_fn = exec_globals.get("solve")
    if agent_fn is None:
        return 0.0, {"Error": "No `solve` function defined"}
    try:
        prediction = agent_fn(example["input"])
        correct = prediction == example["expected"]
        score = 1.0 if correct else 0.0
        feedback = "Correct" if correct else (
            f"Expected '{example['expected']}', got '{prediction}'"
        )
        return score, {"Prediction": prediction, "Feedback": feedback}
    except Exception as e:
        return 0.0, {"Error": str(e)}

# train_tasks and val_tasks are lists of {"input": ..., "expected": ...} dicts.
result = oa.optimize_anything(
    seed_candidate="def solve(input):\n    return input",
    evaluator=evaluate_agent,
    dataset=train_tasks,
    valset=val_tasks,
    background="Discover a Python agent function that "
               "generalizes across unseen reasoning tasks.",
)
print(result.best_candidate)
```
Integration with DSPy
```python
import dspy
import gepa.optimize_anything as oa
```
DSPy program optimization (use dspy.GEPA):
```python
# gepa_metric, agent, and trainset come from your own DSPy project.
optimizer = dspy.GEPA(
    metric=gepa_metric,
    reflection_lm=dspy.LM("openai/gpt-4o"),
    auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)
```
Non-DSPy artifact optimization (use optimize_anything):
```python
# my_config_yaml is the YAML text to optimize; eval_config is your evaluator.
result = oa.optimize_anything(
    seed_candidate=my_config_yaml,
    evaluator=eval_config,
    background="Optimize Kubernetes scheduling policy for cost.",
)
```
Best Practices
- Rich ASI: The more diagnostic feedback you provide, the better the proposer can reason about improvements
- Use `oa.log()`: Route prints to the proposer as ASI instead of stdout
- Structured returns: Return `(score, dict)` tuples for multi-faceted diagnostics
- Seedless for exploration: Use `objective=` when the solution space is large and unfamiliar
- Background context: Provide domain knowledge via `background=` to constrain the search
- Generalization mode: Always provide `valset` when the artifact must transfer to unseen inputs
- Images as ASI: Use `gepa.Image` to pass rendered outputs to vision-capable LLMs
Limitations
- Requires the `gepa` package (`pip install gepa`)
- Evaluator must be deterministic or low-variance for stable optimization
- Compute cost scales with number of candidates explored
- Single-task mode does not generalize; use mode 3 with `valset` for transfer
- Currently powered by the GEPA backend; the API is backend-agnostic so that other strategies can be added later