LLM Tuning Patterns
Evidence-based patterns for configuring LLM parameters, drawn from APOLLO and Godel-Prover research.
Pattern
Different tasks require different LLM configurations. Use these evidence-based settings.
Theorem Proving / Formal Reasoning
Based on APOLLO parity analysis:
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 4096 | Proofs need space for chain-of-thought |
| temperature | 0.6 | Higher creativity for tactic exploration |
| top_p | 0.95 | Allow diverse proof paths |
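As a concrete sketch, the theorem-proving table maps onto a request-parameter dict like the one below. The constant name is illustrative; only the values come from the table.

```python
# Request parameters for theorem proving / formal reasoning,
# taken from the APOLLO parity settings above.
THEOREM_PROVING_CONFIG = {
    "max_tokens": 4096,   # proofs need space for chain-of-thought
    "temperature": 0.6,   # higher creativity for tactic exploration
    "top_p": 0.95,        # allow diverse proof paths
}
```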
Proof Plan Prompt
Always request a proof plan before tactics:
Given the theorem to prove:
[theorem statement]
First, write a high-level proof plan explaining your approach.
Then, suggest Lean 4 tactics to implement each step.

The proof plan (chain-of-thought) significantly improves tactic quality.
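The prompt template above can be assembled programmatically; a minimal Python sketch with a hypothetical `build_proof_prompt` helper (the wording mirrors the template, the function name and signature are illustrative):

```python
def build_proof_prompt(theorem_statement: str) -> str:
    """Assemble the proof-plan prompt: a high-level plan first,
    then Lean 4 tactics for each step."""
    return (
        "Given the theorem to prove:\n"
        f"{theorem_statement}\n"
        "First, write a high-level proof plan explaining your approach.\n"
        "Then, suggest Lean 4 tactics to implement each step."
    )
```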
Parallel Sampling
For hard proofs, use parallel sampling:
- Generate N=8-32 candidate proof attempts
- Use best-of-N selection
- Each sample at temperature 0.6-0.8
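A minimal best-of-N sketch of the steps above, assuming hypothetical `generate` and `score` callables supplied by the caller (a model call and a proof checker or heuristic scorer):

```python
import random

def best_of_n(generate, score, n=8, temperature_range=(0.6, 0.8)):
    """Generate n candidate attempts, each sampled at a temperature
    drawn from temperature_range, and return the highest-scoring one."""
    candidates = []
    for _ in range(n):
        temp = random.uniform(*temperature_range)
        candidates.append(generate(temperature=temp))
    return max(candidates, key=score)
```

In practice the candidates would be generated concurrently; the loop here keeps the selection logic readable.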
Code Generation
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 2048 | Sufficient for most functions |
| temperature | 0.2-0.4 | Prefer deterministic output |
Creative / Exploration Tasks
| Parameter | Value | Rationale |
|---|---|---|
| max_tokens | 4096 | Space for exploration |
| temperature | 0.8-1.0 | Maximum creativity |
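The two task tables above can be collected into task-keyed presets. A sketch with illustrative names; each temperature is the midpoint of the table's range:

```python
# Presets from the code-generation and creative/exploration tables.
TASK_PRESETS = {
    "code_generation": {"max_tokens": 2048, "temperature": 0.3},  # midpoint of 0.2-0.4
    "creative":        {"max_tokens": 4096, "temperature": 0.9},  # midpoint of 0.8-1.0
}

def preset_for(task: str) -> dict:
    """Look up request parameters for a task type."""
    return TASK_PRESETS[task]
```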
Anti-Patterns
- Token limit too low for proofs: 512 tokens truncates the chain-of-thought
- Temperature too low for proofs: 0.2 misses creative tactic paths
- No proof plan: jumping straight to tactics without planning reduces the success rate
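These anti-patterns can be caught mechanically before a run; a heuristic sketch using the thresholds from this document (function name illustrative):

```python
def check_proof_config(config: dict) -> list:
    """Return warnings for theorem-proving configs that hit the
    anti-patterns above (limits below the recommended settings)."""
    warnings = []
    if config.get("max_tokens", 0) < 4096:
        warnings.append("max_tokens below 4096 may truncate chain-of-thought")
    if config.get("temperature", 0) < 0.6:
        warnings.append("temperature below 0.6 may miss creative tactic paths")
    return warnings
```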
Source Sessions
- This session: APOLLO parity - increased max_tokens 512 → 4096, temperature 0.2 → 0.6
- This session: added a proof plan prompt for chain-of-thought before tactics