shinka-run

Shinka Run CLI Skill

Run a batch of program mutations using ShinkaEvolve's CLI interface.

When to Use

Use this skill when:
  • evaluate.py and initial.<ext> already exist
  • The user wants to run code evolution using the ShinkaEvolve/Shinka library
  • You want configurable program evolution runs using explicit CLI args
Do not use this skill when:
  • You need to scaffold a new task from scratch (use shinka-setup)

What is ShinkaEvolve?

ShinkaEvolve is a framework developed by SakanaAI that combines LLMs with evolutionary algorithms to propose program mutations, which are then evaluated and archived. The goal is to optimize for performance and discover novel scientific insights.

Workflow

  1. Inspect task directory
bash
ls -la <task_dir>
Confirm evaluate.py and initial.<ext> exist.
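The check in this step can be wrapped in a small shell function. This is a minimal sketch, not part of the Shinka CLI: check_task_dir is a hypothetical helper name, and it assumes any regular file matching initial.* counts as the initial program.

```bash
# Hypothetical helper (not a Shinka command): return 0 when a task directory
# holds evaluate.py and at least one initial.<ext> file, non-zero otherwise.
check_task_dir() {
  task_dir=$1
  if [ ! -f "$task_dir/evaluate.py" ]; then
    echo "missing: $task_dir/evaluate.py" >&2
    return 1
  fi
  # Assumption: any extension is acceptable (initial.py, initial.cpp, ...).
  for candidate in "$task_dir"/initial.*; do
    if [ -f "$candidate" ]; then
      echo "ok: evaluate.py and ${candidate##*/} found"
      return 0
    fi
  done
  echo "missing: $task_dir/initial.<ext>" >&2
  return 1
}
```

Calling check_task_dir <task_dir> before any run stops early when either file is absent.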
  2. Inspect CLI reference quickly
bash
shinka_run --help
  3. Check model availability before proposing a run
bash
shinka_models
shinka_models --verbose
Validate the exact run config against the shinka_models output:
  • Mutation models: every entry in evo.llm_models must appear in the llm list.
  • Meta recommendation models: if evo.meta_rec_interval is set and evo.meta_llm_models is set, every meta model must appear in the llm list.
  • Prompt evolution models: if evo.evolve_prompts=true, use evo.prompt_llm_models when provided, otherwise evo.llm_models; every selected model must appear in the llm list.
  • Embedding model: if evo.embedding_model is set, it must appear in the embedding list.
  • Local OpenAI-compatible models are allowed for LLMs and embeddings via local/<model>@http(s)://host[:port]/v1, and these local models are not expected to appear in shinka_models.
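The membership check above could be scripted roughly as follows. This is a sketch under stated assumptions: check_models is a hypothetical helper, and it reads available model names from a text file with one name per line that you prepare by hand from the shinka_models listing. The exact output format of shinka_models is not assumed here.

```bash
# Hypothetical helper. $1 is a text file with one available model name per
# line (an assumed, hand-prepared format). Remaining arguments are the
# configured model names to verify.
check_models() {
  available_file=$1
  shift
  missing=0
  for model in "$@"; do
    case $model in
      local/*)
        # Local OpenAI-compatible endpoints are exempt from this check.
        echo "skip (local): $model"
        ;;
      *)
        # -F fixed string, -x whole-line match, -q quiet.
        if grep -Fqx -- "$model" "$available_file"; then
          echo "ok: $model"
        else
          echo "missing: $model"
          missing=1
        fi
        ;;
    esac
  done
  return "$missing"
}
```

A non-zero return status signals that at least one configured model is absent from the list, which is the point at which to stop and ask the user.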
Important runtime rules:
  • Do not assume meta recommendations fall back to evo.llm_models. In the current runner, meta recommendations are only enabled when evo.meta_llm_models is explicitly set.
  • Prompt evolution does fall back to evo.llm_models when evo.prompt_llm_models is unset.
  • Treat local/<model>@http(s)://host[:port]/v1 values as an explicit exception to the shinka_models membership check. Instead, confirm the local endpoint URL and serving status separately before running.
  • If any required model is missing from shinka_models, stop and ask the user to either change the config or set the missing credentials first.
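The fallback rules above can be expressed as small shell helpers that decide which model lists need checking. All three function names are hypothetical, and passing model lists as plain strings is an assumption made for illustration; this is not how the runner parses its config.

```bash
# Echo the model list prompt evolution would use: the prompt-specific list
# when set, otherwise the mutation list (the documented fallback).
effective_prompt_models() {
  prompt_llm_models=$1
  llm_models=$2
  echo "${prompt_llm_models:-$llm_models}"
}

# Echo the meta recommendation models to validate. There is no fallback:
# empty output means nothing to check because the feature stays disabled.
effective_meta_models() {
  meta_rec_interval=$1
  meta_llm_models=$2
  if [ -n "$meta_rec_interval" ] && [ -n "$meta_llm_models" ]; then
    echo "$meta_llm_models"
  fi
}

# Split local/<model>@<url> into "<model> <url>" so the endpoint can be
# confirmed separately before the run.
split_local_model() {
  spec=${1#local/}
  echo "${spec%%@*} ${spec#*@}"
}
```

The URL printed by split_local_model is the one to probe by hand before launching, since local models never appear in shinka_models.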
  4. Confirm first-batch configuration with the user
  • Minimum: budget scope, generation count, critical overrides.
  • Explicitly confirm the mutation LLMs, meta recommendation LLMs, prompt evolution LLMs, and embedding model after checking them against shinka_models.
  • If unclear, ask before running.
  • Do not override any argument the user has not confirmed.
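One way to make the confirmation step explicit in a script is a prompt that proceeds only on a literal "y". This is a minimal sketch; confirm_batch is a hypothetical helper, and the settings passed to it are whatever key=value strings you choose to display.

```bash
# Hypothetical helper: print the proposed settings, then succeed only when
# the user types exactly "y". Any other input, or end of input, declines.
confirm_batch() {
  echo "Proposed batch:"
  for setting in "$@"; do
    echo "  $setting"
  done
  printf 'Run this batch? [y/N] '
  read -r answer || answer=
  [ "$answer" = "y" ]
}
```

Defaulting to "no" matches the rule that unclear or unconfirmed settings should block the run rather than silently proceed.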
  5. Launch main run with explicit knobs
bash
# The three --max-* flags set concurrency for parallel sampling and evaluation.
# Keep comments outside the continued command: a comment between
# backslash-continued lines would cut the command short.
shinka_run \
  --task-dir <task_dir> \
  --results_dir <results_dir> \
  --num_generations 40 \
  --set db.num_islands=3 \
  --set job.time=00:10:00 \
  --set evo.task_sys_msg='<task-specific system message guiding search>' \
  --set evo.llm_models='["gpt-5-mini","gpt-5-nano"]' \
  --set evo.meta_llm_models='["gpt-5-mini"]' \
  --set evo.prompt_llm_models='["gpt-5-mini"]' \
  --set evo.embedding_model='text-embedding-3-small' \
  --max-evaluation-jobs 2 \
  --max-proposal-jobs 2 \
  --max-db-workers 2
  6. Verify outputs before handoff
bash
ls -la <results_dir>
Expect artifacts such as a run log, generation folders, and SQLite databases.
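A coarse scripted version of this check might only confirm that the results directory exists and is not empty. The sketch below makes no assumption about artifact file names, so the listing itself still needs a manual look; check_results_dir is a hypothetical helper.

```bash
# Hypothetical helper: fail when the results directory is missing or empty.
# Artifact names are not assumed; review the listing by hand afterwards.
check_results_dir() {
  results_dir=$1
  if [ ! -d "$results_dir" ]; then
    echo "missing: $results_dir" >&2
    return 1
  fi
  # Arithmetic expansion strips any padding that wc may add to its count.
  entries=$(($(ls -A "$results_dir" | wc -l)))
  if [ "$entries" -eq 0 ]; then
    echo "empty: $results_dir" >&2
    return 1
  fi
  echo "found $entries entries in $results_dir"
}
```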
  7. Between-batch handoff (unless explicitly autonomous)
  • Summarize outcomes from the finished batch.
  • Ask the user for the next batch config before running again.
  • Explicitly ask: "What new directions should we push next batch? Please include algorithm ideas, constraints, and failure modes to avoid."
  • Turn user feedback into a revised system prompt and pass it via --set evo.task_sys_msg=... in the next shinka_run call.
  • If the prompt is long/multiline, put it in a config file and use --config-fname instead of shell-escaping.
  • Unless the user explicitly wants a fresh run/fork, keep the same --results_dir for follow-up batches.
Example next-batch command with feedback-driven prompt:
bash
shinka_run \
  --task-dir <task_dir> \
  --results_dir <results_dir> \
  --num_generations 20 \
  --set evo.task_sys_msg='<new system prompt derived from user feedback>' \
  --set db.num_islands=3

Batch Control Policy (Required)

Treat one shinka_run invocation as one batch of program evaluations/generations.
  • Default mode: human-in-the-loop between batches.
  • Before the first batch and after each completed batch, always ask the user what configuration to run next (budget, --num_generations, model/settings overrides, concurrency, islands, output path).
  • Do not start the next batch until the user confirms the next config.
  • Keep --results_dir fixed across continuation batches so Shinka can reload prior results.
  • Exception: if the user explicitly asks for fully autonomous execution, you may continue across batches without re-asking between runs.
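The policy above amounts to a loop that pauses before every batch unless autonomous mode was requested. The following is an illustrative sketch only: run_batches and launch_batch are hypothetical names, and launch_batch is a stub that prints a message where a real shinka_run call with the confirmed config would go.

```bash
# Stub standing in for the real launcher call (assumption: replace the echo
# with a shinka_run invocation that reuses the same --results_dir).
launch_batch() {
  echo "running batch $1"
}

# $1: "interactive" (default policy) or "autonomous". $2: number of batches.
run_batches() {
  mode=${1:-interactive}
  total=$2
  batch=1
  while [ "$batch" -le "$total" ]; do
    if [ "$mode" != "autonomous" ]; then
      # Prompt on stderr so stdout carries only batch output.
      printf 'Start batch %s? [y/N] ' "$batch" >&2
      read -r answer || answer=
      if [ "$answer" != "y" ]; then
        echo "stopped before batch $batch"
        return 0
      fi
    fi
    launch_batch "$batch"
    batch=$((batch + 1))
  done
}
```

In interactive mode the prompt appears before the first batch as well, which matches the requirement to confirm the configuration before any run starts.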