LLM Tuning Patterns
Evidence-based patterns for configuring LLM parameters, drawn from APOLLO and Godel-Prover research.
Pattern
Different tasks require different LLM configurations. Use these evidence-based settings.
Theorem Proving / Formal Reasoning
Based on APOLLO parity analysis:
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 4096 | Proofs need space for chain-of-thought |
| temperature | 0.6 | Higher creativity for tactic exploration |
| top_p | 0.95 | Allow diverse proof paths |
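As a concrete sketch, the theorem-proving table maps onto a request-parameter dict like the one below. The constant name is illustrative; only the values come from the table.

```python
# Request parameters for theorem proving / formal reasoning,
# taken from the APOLLO parity settings above.
THEOREM_PROVING_CONFIG = {
    "max_tokens": 4096,   # proofs need space for chain-of-thought
    "temperature": 0.6,   # higher creativity for tactic exploration
    "top_p": 0.95,        # allow diverse proof paths
}
```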
Proof Plan Prompt
Always request a proof plan before tactics:
Given the theorem to prove:
[theorem statement]
First, write a high-level proof plan explaining your approach.
Then, suggest Lean 4 tactics to implement each step.

The proof plan (chain-of-thought) significantly improves tactic quality.
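The prompt template above can be assembled programmatically; a minimal Python sketch with a hypothetical `build_proof_prompt` helper (the wording mirrors the template, the function name and signature are illustrative):

```python
def build_proof_prompt(theorem_statement: str) -> str:
    """Assemble the proof-plan prompt: a high-level plan first,
    then Lean 4 tactics for each step."""
    return (
        "Given the theorem to prove:\n"
        f"{theorem_statement}\n"
        "First, write a high-level proof plan explaining your approach.\n"
        "Then, suggest Lean 4 tactics to implement each step."
    )
```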
Parallel Sampling
For hard proofs, use parallel sampling:
- Generate N=8-32 candidate proof attempts
- Use best-of-N selection
- Each sample at temperature 0.6-0.8
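A minimal best-of-N sketch of the steps above, assuming hypothetical `generate` and `score` callables supplied by the caller (a model call and a proof checker or heuristic scorer):

```python
import random

def best_of_n(generate, score, n=8, temperature_range=(0.6, 0.8)):
    """Generate n candidate attempts, each sampled at a temperature
    drawn from temperature_range, and return the highest-scoring one."""
    candidates = []
    for _ in range(n):
        temp = random.uniform(*temperature_range)
        candidates.append(generate(temperature=temp))
    return max(candidates, key=score)
```

In practice the candidates would be generated concurrently; the loop here keeps the selection logic readable.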
Code Generation
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 2048 | Sufficient for most functions |
| temperature | 0.2-0.4 | Prefer deterministic output |
Creative / Exploration Tasks
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 4096 | Space for exploration |
| temperature | 0.8-1.0 | Maximum creativity |
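The two task tables above can be collected into task-keyed presets. A sketch with illustrative names; each temperature is the midpoint of the table's range:

```python
# Presets from the code-generation and creative/exploration tables.
TASK_PRESETS = {
    "code_generation": {"max_tokens": 2048, "temperature": 0.3},  # midpoint of 0.2-0.4
    "creative":        {"max_tokens": 4096, "temperature": 0.9},  # midpoint of 0.8-1.0
}

def preset_for(task: str) -> dict:
    """Look up request parameters for a task type."""
    return TASK_PRESETS[task]
```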
Anti-Patterns
- Token limit too low for proofs: 512 tokens truncates the chain-of-thought
- Temperature too low for proofs: 0.2 misses creative tactic paths
- No proof plan: jumping straight to tactics without planning reduces the success rate
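These anti-patterns can be caught mechanically before a run; a heuristic sketch using the thresholds from this document (function name illustrative):

```python
def check_proof_config(config: dict) -> list:
    """Return warnings for theorem-proving configs that hit the
    anti-patterns above (limits below the recommended settings)."""
    warnings = []
    if config.get("max_tokens", 0) < 4096:
        warnings.append("max_tokens below 4096 may truncate chain-of-thought")
    if config.get("temperature", 0) < 0.6:
        warnings.append("temperature below 0.6 may miss creative tactic paths")
    return warnings
```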
Source Sessions
- This session: APOLLO parity - increased max_tokens 512 → 4096, temperature 0.2 → 0.6
- This session: added a proof plan prompt for chain-of-thought before tactics