Tinker API - LLM Fine-Tuning
Overview
Tinker is Thinking Machines Lab's training API for large language models. It provides:
- Supervised Fine-Tuning (SFT): Train models on instruction/completion pairs
- Reinforcement Learning (RL): PPO and policy-gradient losses; cookbook patterns include GRPO-like group rollouts and advantage centering
- Vision-Language Models: VLM support via Qwen3-VL
- LoRA Training: Parameter-efficient fine-tuning via low-rank adapters
Two abstraction levels:
- Tinker Cookbook: High-level patterns with automatic training loops
- Low-Level API: Manual control for custom training logic
Quick Reference
| Topic | Reference |
|---|---|
| Setup & Core Concepts | Getting Started |
| API Classes & Types | API Reference |
| Supervised Learning | Supervised Learning |
| RL Training | Reinforcement Learning |
| Loss Functions | Loss Functions |
| Chat Templates | Rendering |
| Models & LoRA | Models & LoRA |
| Example Scripts | Recipes |
Installation
```bash
pip install tinker tinker-cookbook
export TINKER_API_KEY=your_api_key_here
```
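To confirm the key is picked up, you can instantiate a client and list the available base models. A minimal sketch; `get_server_capabilities()` and the `supported_models` field follow the Getting Started docs, but verify the names against your installed version:

```python
import tinker

# ServiceClient reads TINKER_API_KEY from the environment.
service_client = tinker.ServiceClient()

# Assumption: get_server_capabilities() exposes the supported base models;
# check the Getting Started docs if your version differs.
capabilities = service_client.get_server_capabilities()
print([m.model_name for m in capabilities.supported_models])
```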
Minimal Example
```python
import numpy as np
import tinker
from tinker import types

# Create clients
service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-30B-A3B", rank=32
)
tokenizer = training_client.get_tokenizer()

# Prepare data: weight 0 on prompt tokens, weight 1 on completion tokens,
# so the loss is computed only on the completion
prompt = "English: hello\nPig Latin:"
completion = " ello-hay\n"
prompt_tokens = tokenizer.encode(prompt, add_special_tokens=True)
completion_tokens = tokenizer.encode(completion, add_special_tokens=False)
tokens = prompt_tokens + completion_tokens
weights = np.array(([0] * len(prompt_tokens)) + ([1] * len(completion_tokens)), dtype=np.float32)
target_tokens = np.array(tokens[1:], dtype=np.int64)
datum = types.Datum(
    model_input=types.ModelInput.from_ints(tokens=tokens[:-1]),
    loss_fn_inputs={
        "target_tokens": target_tokens,  # next-token targets, shifted by one
        "weights": weights[1:],          # aligned with the shifted targets
    },
)

# Train: one forward/backward pass plus an Adam update
fwdbwd = training_client.forward_backward([datum], "cross_entropy")
optim = training_client.optim_step(types.AdamParams(learning_rate=1e-4))
fwdbwd.result()
optim.result()

# Sample from the updated weights
sampling_client = training_client.save_weights_and_get_sampling_client(name="v1")
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(
        tokens=tokenizer.encode("English: world\nPig Latin:", add_special_tokens=True)
    ),
    sampling_params=types.SamplingParams(max_tokens=20),
    num_samples=1,
).result()
print(tokenizer.decode(result.sequences[0].tokens))
```
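The single update above extends directly to a batched loop. A sketch that reuses only the clients and calls from the example; the pair list is toy data and `make_datum` is a hypothetical helper wrapping the data preparation shown above:

```python
def make_datum(prompt: str, completion: str) -> types.Datum:
    # Same construction as the example above: loss weights are 0 on the
    # prompt and 1 on the completion, shifted to align with next-token targets.
    prompt_tokens = tokenizer.encode(prompt, add_special_tokens=True)
    completion_tokens = tokenizer.encode(completion, add_special_tokens=False)
    tokens = prompt_tokens + completion_tokens
    weights = np.array([0] * len(prompt_tokens) + [1] * len(completion_tokens), dtype=np.float32)
    return types.Datum(
        model_input=types.ModelInput.from_ints(tokens=tokens[:-1]),
        loss_fn_inputs={
            "target_tokens": np.array(tokens[1:], dtype=np.int64),
            "weights": weights[1:],
        },
    )

pairs = [
    ("English: hello\nPig Latin:", " ello-hay\n"),
    ("English: world\nPig Latin:", " orld-way\n"),
]

for step in range(10):
    batch = [make_datum(p, c) for p, c in pairs]
    # Both calls return futures; .result() blocks until the remote step finishes.
    fwdbwd = training_client.forward_backward(batch, "cross_entropy")
    optim = training_client.optim_step(types.AdamParams(learning_rate=1e-4))
    fwdbwd.result()
    optim.result()
```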
Common Imports
Low-level API

```python
import tinker
from tinker import types
from tinker.types import Datum, ModelInput, TensorData, AdamParams, SamplingParams
```
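`TensorData` is the explicit wrapper type for tensors passed in `loss_fn_inputs` (the minimal example passed numpy arrays directly). A hedged sketch, assuming a `from_numpy` constructor; verify the exact constructor in the API Reference:

```python
import numpy as np
from tinker.types import TensorData

weights = np.array([0.0, 1.0, 1.0], dtype=np.float32)
# Assumption: TensorData.from_numpy wraps an ndarray for use in
# loss_fn_inputs; check the API Reference for your installed version.
loss_fn_inputs = {"weights": TensorData.from_numpy(weights)}
```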
Cookbook (high-level)

```python
import chz
import asyncio
from tinker_cookbook.supervised import train
from tinker_cookbook.supervised.types import ChatDatasetBuilder, ChatDatasetBuilderCommonConfig
from tinker_cookbook.supervised.data import (
    SupervisedDatasetFromHFDataset,
    StreamingSupervisedDatasetFromHFDataset,
    FromConversationFileBuilder,
    conversation_to_datum,
)
from tinker_cookbook.renderers import get_renderer, TrainOnWhat
from tinker_cookbook.model_info import get_recommended_renderer_name
from tinker_cookbook.tokenizer_utils import get_tokenizer
```
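The renderer utilities turn chat messages into token sequences using the model's chat template. A hedged sketch of the lookup flow; the exact signatures of `get_tokenizer`/`get_renderer` and the `build_generation_prompt` method are assumptions based on the Rendering docs, so verify them against your cookbook version:

```python
from tinker_cookbook.model_info import get_recommended_renderer_name
from tinker_cookbook.renderers import get_renderer
from tinker_cookbook.tokenizer_utils import get_tokenizer

model_name = "Qwen/Qwen3-30B-A3B"

# Assumption: get_tokenizer takes a base-model name
tokenizer = get_tokenizer(model_name)

# Look up the chat template recommended for this model family...
renderer_name = get_recommended_renderer_name(model_name)
# ...and build a renderer bound to the tokenizer (signature is an assumption)
renderer = get_renderer(renderer_name, tokenizer)

# Render a conversation into a sampling prompt (method name per the
# Rendering reference; the message format may be a typed object instead)
messages = [{"role": "user", "content": "Translate 'hello' to Pig Latin."}]
prompt = renderer.build_generation_prompt(messages)
```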
When to Use What
| Scenario | Approach |
|---|---|
| Standard SFT with HF/JSONL data | Cookbook |
| Custom preprocessing | Custom `ChatDatasetBuilder` |
| Large datasets (>1M examples) | `StreamingSupervisedDatasetFromHFDataset` |
| RL / GRPO | Cookbook RL patterns (see sketch below) |
| Research / custom loops | Low-level API |
| Vision-language | Qwen3-VL |
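For the GRPO-like pattern referenced above, advantages come from centering rewards within a group of rollouts sampled from the same prompt. The centering itself is plain arithmetic and shown below; the commented-out `Datum` wiring uses assumed loss-function input names, so consult the Loss Functions reference before relying on them:

```python
import numpy as np

# Rewards for one group of rollouts sampled from the same prompt (toy values)
group_rewards = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0], dtype=np.float32)

# GRPO-style centering: each rollout's advantage is its reward minus the
# group mean, so above-average samples are reinforced and below-average
# samples are suppressed
advantages = group_rewards - group_rewards.mean()
# Optionally normalize by the group standard deviation:
# advantages /= group_rewards.std() + 1e-8

# Sketch of wiring one rollout into a datum (input names are assumptions):
# datum = types.Datum(
#     model_input=types.ModelInput.from_ints(tokens=seq_tokens[:-1]),
#     loss_fn_inputs={
#         "target_tokens": np.array(seq_tokens[1:], dtype=np.int64),
#         "logprobs": sampling_logprobs,  # logprobs from the sampling result
#         "advantages": np.full(len(seq_tokens) - 1, advantages[0], np.float32),
#     },
# )
# training_client.forward_backward([datum], "importance_sampling")
```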
External Resources
- Documentation: https://tinker-docs.thinkingmachines.ai/
- Cookbook Repo: https://github.com/thinking-machines-lab/tinker-cookbook
- Console: https://tinker-console.thinkingmachines.ai