opik-optimizer


Opik Optimizer

Purpose

Design, run, and interpret Opik Optimizer workflows for prompts, tools, and model parameters with consistent dataset/metric wiring and reproducible evaluation.

When to use

Use this skill when a user asks for:
  • Choosing and configuring Opik Optimizer algorithms for prompt/agent optimization.
  • Writing `ChatPrompt`-based optimization runs and custom metric functions.
  • Optimizing with tools (function calling or MCP), selected prompt roles, or prompt segments.
  • Tuning LLM call parameters with `optimize_parameter`.
  • Comparing optimizer outputs and interpreting `OptimizationResult`.

Workflow

  1. Select an optimizer strategy (`MetaPromptOptimizer`, `FewShotBayesianOptimizer`, `HRPO`, etc.) based on the optimization goal.
  2. Build the prompt/dataset/metric wiring and validate placeholder-field alignment.
  3. Run prompt, tool, or parameter optimization with explicit controls (`n_threads`, `n_samples`, `max_trials`, seed).
  4. Inspect `OptimizationResult` and compare score deltas against the initial baseline.
  5. Summarize recommendations, risks, and next experiments.

Inputs

  • Target optimization objective (prompt/tool/parameter) and success metric.
  • Dataset source and expected schema fields.
  • Model/provider constraints and runtime limits.
  • Optional scope constraints (`optimize_prompts` segments, tool fields, project names).

Outputs

  • Optimizer run configuration and rationale.
  • Result interpretation (`score`, `initial_score`, history trends).
  • Recommended next changes and follow-up experiment plan.
Use the reference files in this skill for details before implementing code:
  • references/algorithms.md
  • references/prompt_agent_workflow.md
  • references/example_patterns.md

Opik Optimizer quickstart

  1. Install and import:

```bash
pip install opik-optimizer
```

```python
from opik_optimizer import ChatPrompt, MetaPromptOptimizer, HRPO, FewShotBayesianOptimizer
from opik_optimizer import datasets
```

  2. Build a prompt and metric:

```python
from opik.evaluation.metrics import LevenshteinRatio

prompt = ChatPrompt(
    system="You are a concise answerer.",
    user="{question}",
)

def metric(dataset_item: dict, output: str) -> float:
    return LevenshteinRatio().score(
        reference=dataset_item["answer"],
        output=output,
    ).value
```

  3. Load dataset and run:

```python
dataset = datasets.hotpot(count=30)

result = MetaPromptOptimizer(model="openai/gpt-5-nano").optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=metric,
    n_samples=20,
    max_trials=10,
)
result.display()
```
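The metric callable in the quickstart must map `(dataset_item, output)` to a float. A minimal sketch of that contract, substituting the stdlib `difflib.SequenceMatcher` (also a normalized similarity in [0.0, 1.0]) for `LevenshteinRatio` so it runs without `opik` installed:

```python
from difflib import SequenceMatcher


def metric(dataset_item: dict, output: str) -> float:
    """Score one dataset item against the model output, higher is better."""
    reference = dataset_item["answer"]
    # ratio() returns 1.0 for identical strings, 0.0 for disjoint ones.
    return SequenceMatcher(None, reference, output).ratio()


print(metric({"answer": "Paris"}, "Paris"))  # 1.0
```

Any callable with this shape works; swapping the body back to `LevenshteinRatio().score(...).value` restores the quickstart behavior.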

Core workflow you should follow

  1. Pick the optimizer class:
    • Few-shot examples + Bayesian selection: `FewShotBayesianOptimizer`
    • LLM meta-reasoning: `MetaPromptOptimizer`
    • Genetic + MOO / LLM crossover: `EvolutionaryOptimizer`
    • Hierarchical reflective diagnostics: `HierarchicalReflectiveOptimizer` (`HRPO`)
    • Pareto-based genetic strategy: `GepaOptimizer`
    • Parameter tuning only: `ParameterOptimizer`
  2. Define a single `ChatPrompt` (or a dict of prompts for multi-prompt cases).
  3. Provide a dataset from `opik_optimizer.datasets`.
  4. Provide a metric callable with the signature `(dataset_item, llm_output) -> float` (or `ScoreResult` / list of `ScoreResult`).
  5. Set optimizer controls (`n_threads`, `n_samples`, `max_trials`, seed, etc.).
  6. Run one of:
    • `optimize_prompt(...)` for prompt/system behavior changes.
    • `optimize_parameter(...)` for model-call hyperparameters.
  7. Inspect `OptimizationResult` (`score`, `initial_score`, `history`, `optimization_id`, `get_optimized_parameters`).
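Step 7's inspection usually reduces to comparing `score` against `initial_score` and scanning the trial history for the best point. A sketch over a plain dict whose keys mirror those field names (the dict shape is illustrative, not the real `OptimizationResult` object):

```python
# Illustrative stand-in: the real OptimizationResult exposes score,
# initial_score, and a per-trial history.
result = {
    "initial_score": 0.41,
    "score": 0.63,
    "history": [0.41, 0.48, 0.55, 0.63],
}

delta = result["score"] - result["initial_score"]
relative = delta / result["initial_score"]
# Index of the best-scoring trial in the history.
best_trial = max(range(len(result["history"])), key=result["history"].__getitem__)

print(f"absolute gain: {delta:+.2f} ({relative:+.0%}), best trial #{best_trial}")
```

Reporting both the absolute and relative delta keeps small baselines from exaggerating the improvement.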

Key execution details to enforce

  • Prefer an explicit `project_name` for Opik tracking if you are using org-level observability.
  • Keep placeholders in prompts aligned with dataset fields (for example, `{question}`).
  • Start with `optimize_prompts="system"` or `"user"` when the scope should be constrained.
  • Keep `model` names in `MetaPrompt`/`reasoning` calls provider-compatible for your account.
  • Validate multimodal input payloads by preserving only non-empty content segments.
  • For small datasets, use `n_samples` and `n_samples_strategy` carefully; if the requested sample count exceeds the dataset size, the run automatically falls back to the full set.
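The placeholder-alignment rule can be checked mechanically before a run. A sketch using the stdlib `string.Formatter` to diff prompt placeholders against a dataset item's fields (the helper name is an assumption for illustration, not part of `opik_optimizer`):

```python
from string import Formatter


def missing_placeholders(prompt_text: str, dataset_item: dict) -> set:
    """Return prompt placeholders that have no matching dataset field."""
    placeholders = {
        field for _, field, _, _ in Formatter().parse(prompt_text) if field
    }
    return placeholders - dataset_item.keys()


item = {"question": "What is the capital of France?", "answer": "Paris"}
print(missing_placeholders("{question}", item))            # set(): aligned
print(missing_placeholders("{question} {context}", item))  # {'context'}
```

Running this check on every prompt before wiring it to a dataset catches the mismatched-placeholder failure mode listed under common mistakes.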

Tooling and segment-based control

  • Tools can be optimized via MCP/function schema fields, not only by changing prompt wording.
  • For fine-grained text updates, use `optimize_prompts` values and the helper functions from `prompt_segments`:
    • `extract_prompt_segments(ChatPrompt)` to inspect stable segment IDs.
    • `apply_segment_updates(ChatPrompt, updates)` for deterministic edits.
  • Tool optimization is distinct from prompt optimization.
Runnable examples live upstream in the Opik repo. If you need local runnable scripts, vendor the upstream examples into a `scripts/` folder and keep references one level deep.

Common mistakes to avoid

  • Passing an empty dataset or mismatched placeholder names.
  • Mixing the deprecated constructor arg `num_threads` with `n_threads`.
  • Assuming tool optimization is the same as agent function-calling optimization.
  • Calling `ParameterOptimizer.optimize_prompt` (it raises and should not be used).
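The `num_threads`/`n_threads` mix-up can be caught at the call site with a small guard. This is a hypothetical helper sketching how to normalize the deprecated name before constructing an optimizer, not opik's actual argument handling:

```python
import warnings


def normalize_threads(**kwargs) -> dict:
    """Map the deprecated num_threads arg onto n_threads; reject mixing both."""
    if "num_threads" in kwargs:
        if "n_threads" in kwargs:
            raise TypeError("pass either n_threads or num_threads, not both")
        warnings.warn("num_threads is deprecated; use n_threads", DeprecationWarning)
        kwargs["n_threads"] = kwargs.pop("num_threads")
    return kwargs


print(normalize_threads(num_threads=8))  # {'n_threads': 8}
```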

Next actions

  • For in-depth behavior and per-class parameter tables: references/algorithms.md
  • For exact `optimize_prompt` signatures, prompts, tool constraints, and result usage: references/prompt_agent_workflow.md
  • For pattern examples and source-backed workflows: references/example_patterns.md