dspy-optimize-anything


GEPA optimize_anything


Goal


Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.

When to Use


  • Beyond prompt optimization — optimizing code, configs, SVGs, scheduling policies, etc.
  • Single hard problems — circle packing, kernel generation, algorithm discovery
  • Batch related problems — CUDA kernels, code generation tasks with cross-transfer
  • Generalization — agent skills, policies, or prompts that must transfer to unseen inputs
  • When you can express quality as a score and provide diagnostic feedback (ASI)

Inputs


| Input | Type | Description |
|---|---|---|
| `seed_candidate` | `str \| dict[str, str] \| None` | Starting artifact text, or `None` for seedless mode |
| `evaluator` | `Callable` | Returns score (higher = better), optionally with ASI dict |
| `dataset` | `list \| None` | Training examples (for multi-task and generalization modes) |
| `valset` | `list \| None` | Validation set (for generalization mode) |
| `objective` | `str \| None` | Natural language description of what to optimize for |
| `background` | `str \| None` | Domain knowledge and constraints |
| `config` | `GEPAConfig \| None` | Engine, reflection, and tracking settings |

Outputs


| Output | Type | Description |
|---|---|---|
| `result.best_candidate` | `str \| dict` | Best optimized artifact |

Workflow


Phase 1: Install


```bash
pip install gepa
```

Phase 2: Define Evaluator with ASI


The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):

```python
import gepa.optimize_anything as oa

def evaluate(candidate: str) -> float:
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
```

Rich evaluator (score + structured ASI):

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```

ASI can include open-ended text, structured data, multiple objectives (via `scores`), or images (via `gepa.Image`) for vision-capable LLMs.
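The multi-objective form mentioned above can be sketched as a plain evaluator that carries a `scores` entry in its ASI dict. This is a minimal illustration, not part of the gepa API: the sub-metrics, their weights, and the candidate checks here are all hypothetical stand-ins for real measurements.

```python
# Hedged sketch of a multi-objective evaluator. The "scores" ASI key carries
# per-objective values; a single scalar score is still returned for selection.
def evaluate_multi(candidate: str) -> tuple[float, dict]:
    correctness = 1.0 if "return" in candidate else 0.0  # hypothetical check
    brevity = max(0.0, 1.0 - len(candidate) / 1000)      # shorter is better
    scores = {"correctness": correctness, "brevity": brevity}
    overall = 0.8 * correctness + 0.2 * brevity          # scalar for selection
    return overall, {"scores": scores, "Length": f"{len(candidate)} chars"}
```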

Phase 3: Choose Optimization Mode


Mode 1 — Single-Task Search: Solve one hard problem. No dataset needed.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
)
```

Mode 2 — Multi-Task Search: Solve a batch of related problems with cross-transfer.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
)
```

Mode 3 — Generalization: Build a skill/prompt/policy that transfers to unseen problems.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
)
```

Seedless mode: Describe what you need instead of providing a seed.

```python
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
)
```

Phase 4: Use Results


```python
print(result.best_candidate)
```
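Since `best_candidate` mirrors the seed's shape (`str` or `dict`, per the Inputs table), a small persistence helper can handle both cases. This is an illustrative sketch, not part of the gepa API; the file names are arbitrary choices.

```python
from pathlib import Path

def save_candidate(best, out_dir: str = "artifacts") -> list[str]:
    """Write a best_candidate (str, or dict of named texts) to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    if isinstance(best, dict):
        # Dict candidates: one file per named component.
        paths = [out / f"{name}.txt" for name in best]
        for path, text in zip(paths, best.values()):
            path.write_text(text)
    else:
        # String candidates: a single file.
        paths = [out / "best_candidate.txt"]
        paths[0].write_text(best)
    return [str(p) for p in paths]
```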

Production Example


```python
import gepa.optimize_anything as oa
from gepa import Image
import logging

logger = logging.getLogger(__name__)
```

---------- SVG optimization with VLM feedback ----------


```python
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"

VISUAL_ASPECTS = [
    {"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]

def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }

result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
               "Improve anatomy, composition, and visual quality.",
)
logger.info(f"Best SVG:\n{result.best_candidate['svg_code']}")
```

---------- Code optimization (single-task) ----------


```python
def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
    import subprocess, json

    proc = subprocess.run(
        ["python", "-c", candidate],
        capture_output=True, text=True, timeout=30,
    )

    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}

    try:
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError) as e:
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdout": proc.stdout}

result = oa.optimize_anything(
    evaluator=evaluate_solver,
    objective="Write a Python solver for the bin packing problem that "
              "minimizes the number of bins. Output JSON with 'score' and 'solution'.",
    background="Use first-fit-decreasing as a starting heuristic. "
               "Higher score = fewer bins used.",
)
print(result.best_candidate)
```

---------- Agent architecture generalization ----------


```python
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
    """Run an agent architecture on a task and score it."""
    exec_globals = {}
    exec(candidate, exec_globals)
    agent_fn = exec_globals.get("solve")
    if agent_fn is None:
        return 0.0, {"Error": "No `solve` function defined"}

    try:
        prediction = agent_fn(example["input"])
        correct = prediction == example["expected"]
        score = 1.0 if correct else 0.0
        feedback = "Correct" if correct else (
            f"Expected '{example['expected']}', got '{prediction}'"
        )
        return score, {"Prediction": prediction, "Feedback": feedback}
    except Exception as e:
        return 0.0, {"Error": str(e)}

result = oa.optimize_anything(
    seed_candidate="def solve(input):\n    return input",
    evaluator=evaluate_agent,
    dataset=train_tasks,
    valset=val_tasks,
    background="Discover a Python agent function `solve(input)` that "
               "generalizes across unseen reasoning tasks.",
)
print(result.best_candidate)
```

Integration with DSPy


`optimize_anything` complements DSPy's built-in optimizers. Use DSPy optimizers (GEPA, MIPROv2, BootstrapFewShot) for DSPy programs, and `optimize_anything` for arbitrary text artifacts outside DSPy:

```python
import dspy
import gepa.optimize_anything as oa

# DSPy program optimization (use dspy.GEPA)
optimizer = dspy.GEPA(
    metric=gepa_metric,
    reflection_lm=dspy.LM("openai/gpt-4o"),
    auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)

# Non-DSPy artifact optimization (use optimize_anything)
result = oa.optimize_anything(
    seed_candidate=my_config_yaml,
    evaluator=eval_config,
    background="Optimize Kubernetes scheduling policy for cost.",
)
```

Best Practices


  1. Rich ASI — The more diagnostic feedback you provide, the better the proposer can reason about improvements
  2. Use `oa.log()` — Route prints to the proposer as ASI instead of stdout
  3. Structured returns — Return `(score, dict)` tuples for multi-faceted diagnostics
  4. Seedless for exploration — Use `objective=` when the solution space is large and unfamiliar
  5. Background context — Provide domain knowledge via `background=` to constrain the search
  6. Generalization mode — Always provide `valset` when the artifact must transfer to unseen inputs
  7. Images as ASI — Use `gepa.Image` to pass rendered outputs to vision-capable LLMs

Limitations


  • Requires the `gepa` package (`pip install gepa`)
  • Evaluator must be deterministic or low-variance for stable optimization
  • Compute cost scales with the number of candidates explored
  • Single-task mode does not generalize; use mode 3 with `valset` for transfer
  • Currently powered by the GEPA backend; the API is backend-agnostic for future strategies
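For the determinism caveat, one common workaround is to wrap a noisy evaluator so each candidate is scored several times and the mean is used. This is a minimal sketch under stated assumptions: the repeat count is arbitrary, and the wrapper guesses whether your evaluator returns a bare score or a `(score, ASI)` tuple.

```python
import statistics

def averaged(evaluator, n_repeats: int = 3):
    """Score each candidate n_repeats times and return the mean score,
    reducing run-to-run variance; keeps the last ASI dict seen (if any)."""
    def wrapped(candidate):
        scores, last_asi = [], {}
        for _ in range(n_repeats):
            result = evaluator(candidate)
            if isinstance(result, tuple):   # (score, ASI) form
                score, last_asi = result
            else:                           # bare-score form
                score = result
            scores.append(score)
        return statistics.mean(scores), last_asi
    return wrapped
```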