Tinker API - LLM Fine-Tuning

Overview

Tinker is a training API for large language models from Thinking Machines Lab. It provides:
  • Supervised Fine-Tuning (SFT): Train models on instruction/completion pairs
  • Reinforcement Learning (RL): PPO and policy gradient losses; cookbook patterns include GRPO-like group rollouts/advantage centering
  • Vision-Language Models: VLM support via Qwen3-VL
  • LoRA Training: Parameter-efficient fine-tuning via low-rank adapters
Two abstraction levels:
  • Tinker Cookbook: High-level patterns with automatic training loops
  • Low-Level API: Manual control for custom training logic

Quick Reference

| Topic | Reference |
| --- | --- |
| Setup & Core Concepts | Getting Started |
| API Classes & Types | API Reference |
| Supervised Learning | Supervised Learning |
| RL Training | Reinforcement Learning |
| Loss Functions | Loss Functions |
| Chat Templates | Rendering |
| Models & LoRA | Models & LoRA |
| Example Scripts | Recipes |

Installation

```bash
pip install tinker tinker-cookbook
export TINKER_API_KEY=your_api_key_here
```

Minimal Example

```python
import numpy as np
import tinker
from tinker import types
```

Create clients

```python
service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-30B-A3B",
    rank=32,
)
tokenizer = training_client.get_tokenizer()
```

Prepare data

```python
prompt = "English: hello\nPig Latin:"
completion = " ello-hay\n"
prompt_tokens = tokenizer.encode(prompt, add_special_tokens=True)
completion_tokens = tokenizer.encode(completion, add_special_tokens=False)
tokens = prompt_tokens + completion_tokens
weights = np.array(([0] * len(prompt_tokens)) + ([1] * len(completion_tokens)), dtype=np.float32)
target_tokens = np.array(tokens[1:], dtype=np.int64)

datum = types.Datum(
    model_input=types.ModelInput.from_ints(tokens=tokens[:-1]),
    loss_fn_inputs={
        "target_tokens": target_tokens,
        "weights": weights[1:],
    },
)
```
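The shift-by-one bookkeeping here is easy to get wrong, so it is worth checking in isolation: inputs drop the last token, targets and weights drop the first, and loss lands only on completion tokens. A standalone sketch with toy token IDs in place of a real tokenizer:

```python
import numpy as np

# Toy stand-ins for tokenizer output: 3 prompt tokens, 2 completion tokens.
prompt_tokens = [101, 7, 8]
completion_tokens = [9, 102]
tokens = prompt_tokens + completion_tokens

# Weight 0 on prompt positions, 1 on completion positions.
weights = np.array([0] * len(prompt_tokens) + [1] * len(completion_tokens), dtype=np.float32)

# Next-token prediction: position i predicts token i+1, so the input drops
# the last token while targets and weights drop the first.
inputs = tokens[:-1]                             # [101, 7, 8, 9]
targets = np.array(tokens[1:], dtype=np.int64)   # [7, 8, 9, 102]
shifted_weights = weights[1:]                    # [0., 0., 1., 1.]

assert len(inputs) == len(targets) == len(shifted_weights)
# Only positions whose *target* is a completion token contribute to the loss.
print(targets[shifted_weights > 0])
```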

Train

```python
fwdbwd = training_client.forward_backward([datum], "cross_entropy")
optim = training_client.optim_step(types.AdamParams(learning_rate=1e-4))
fwdbwd.result()
optim.result()
```
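Tinker evaluates the `"cross_entropy"` loss server-side, so nothing needs to be computed locally. Still, a toy NumPy version of a *weighted* token-level cross-entropy (an illustration of the idea, not Tinker's actual implementation) makes the role of `weights` concrete:

```python
import numpy as np

def weighted_cross_entropy(logits, target_tokens, weights):
    """Summed per-token negative log-likelihood, zeroed on weight-0 positions.

    logits: (seq_len, vocab) unnormalized scores
    target_tokens: (seq_len,) integer targets
    weights: (seq_len,) mask (0 for prompt, 1 for completion positions)
    """
    # Numerically stable log-softmax over the vocab dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    nll = -log_probs[np.arange(len(target_tokens)), target_tokens]
    return (nll * weights).sum()

# Toy check: vocab of 4, two positions, only the second position counts.
logits = np.array([[0.0, 0.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0, 0.0]])
targets = np.array([1, 0])
weights = np.array([0.0, 1.0])
loss = weighted_cross_entropy(logits, targets, weights)
print(loss)  # near 0: the one weighted position confidently predicts its target
```

Masking the prompt this way means the model is only penalized for its predictions on completion tokens, which is exactly what the `weights` array in the `Datum` above encodes.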

Sample

```python
sampling_client = training_client.save_weights_and_get_sampling_client(name="v1")
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(
        tokens=tokenizer.encode("English: world\nPig Latin:", add_special_tokens=True)
    ),
    sampling_params=types.SamplingParams(max_tokens=20),
    num_samples=1,
).result()
print(tokenizer.decode(result.sequences[0].tokens))
```

Common Imports


Low-level API

```python
import tinker
from tinker import types
from tinker.types import Datum, ModelInput, TensorData, AdamParams, SamplingParams
```

Cookbook (high-level)

```python
import asyncio

import chz
from tinker_cookbook.supervised import train
from tinker_cookbook.supervised.types import ChatDatasetBuilder, ChatDatasetBuilderCommonConfig
from tinker_cookbook.supervised.data import (
    SupervisedDatasetFromHFDataset,
    StreamingSupervisedDatasetFromHFDataset,
    FromConversationFileBuilder,
    conversation_to_datum,
)
from tinker_cookbook.renderers import get_renderer, TrainOnWhat
from tinker_cookbook.model_info import get_recommended_renderer_name
from tinker_cookbook.tokenizer_utils import get_tokenizer
```

When to Use What

| Scenario | Approach |
| --- | --- |
| Standard SFT with HF/JSONL data | Cookbook `ChatDatasetBuilder` + `tinker_cookbook.supervised.train.main()` |
| Custom preprocessing | Custom `SupervisedDataset` class |
| Large datasets (>1M examples) | `StreamingSupervisedDatasetFromHFDataset` |
| RL / GRPO | Cookbook RL patterns |
| Research / custom loops | Low-level `forward_backward()` + `optim_step()` |
| Vision-language | Qwen3-VL + `ImageChunk` |

External Resources
