# Tinker Training Cost Calculator
Calculate training costs for Tinker fine-tuning jobs by tokenizing your dataset with the correct model tokenizer and applying current pricing.
## Quick Start
Use the bundled script to calculate training costs:

```bash
# List available models and pricing
python scripts/calculate_cost.py --list-models

# Calculate cost for a JSONL dataset
python scripts/calculate_cost.py training_data.jsonl --model Qwen3-8B --epochs 3

# Output as JSON
python scripts/calculate_cost.py training_data.jsonl --model Llama-3.1-70B --json
```
The script:
1. Loads the correct tokenizer for the selected model
2. Counts tokens in your JSONL file (supports chat, text, and instruction formats)
3. Calculates the estimated training cost
## Cost Formula
```
Training Cost = (total_tokens × epochs × train_price_per_million) / 1_000_000
```

Where:
- `total_tokens` = tokens in your training dataset (from tokenization)
- `epochs` = number of training passes (default: 3)
- `train_price_per_million` = model-specific training rate from the pricing table
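The formula is a one-liner in code. A minimal sketch, using the Qwen3-8B train rate from the pricing tables below:

```python
def training_cost(total_tokens: int, epochs: int, train_price_per_million: float) -> float:
    """Estimated Tinker training cost in USD."""
    return total_tokens * epochs * train_price_per_million / 1_000_000

# Qwen3-8B trains at $0.40 per million tokens
print(f"${training_cost(1_000_000, 3, 0.40):.2f}")  # $1.20
```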
## Tinker Pricing
All prices are as of January 5, 2026. Source: https://thinkingmachines.ai/tinker/
All prices are in USD per million tokens.
| Category | Description |
|---|---|
| Prefill | Processing input context (inference) |
| Sample | Generating output tokens (inference) |
| Train | Training/fine-tuning tokens |
### Qwen Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Qwen3-4B-Instruct-2507 | $0.07 | $0.22 | $0.22 |
| Qwen3-8B | $0.13 | $0.40 | $0.40 |
| Qwen3-30B-A3B | $0.12 | $0.30 | $0.36 |
| Qwen3-VL-30B-A3B-Instruct | $0.18 | $0.44 | $0.53 |
| Qwen3-32B | $0.49 | $1.47 | $1.47 |
| Qwen3-235B-Instruct-2507 | $0.68 | $1.70 | $2.04 |
| Qwen3-VL-235B-A22B-Instruct | $1.02 | $2.56 | $3.07 |
### Llama Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Llama-3.2-1B | $0.03 | $0.09 | $0.09 |
| Llama-3.2-3B | $0.06 | $0.18 | $0.18 |
| Llama-3.1-8B | $0.13 | $0.40 | $0.40 |
| Llama-3.1-70B | $1.05 | $3.16 | $3.16 |
### DeepSeek Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| DeepSeek-V3.1 | $1.13 | $2.81 | $3.38 |
### GPT-OSS Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| GPT-OSS-120B | $0.18 | $0.44 | $0.52 |
| GPT-OSS-20B | $0.12 | $0.30 | $0.36 |
### Moonshot Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Kimi-K2-Thinking | $0.98 | $2.44 | $2.93 |
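For use in code, the Train column of the tables above can be captured as a small mapping. A sketch with the values copied from the tables (extend or trim as needed):

```python
# Train price in USD per million tokens, from the pricing tables above
TRAIN_PRICE_PER_MILLION = {
    "Qwen3-4B-Instruct-2507": 0.22,
    "Qwen3-8B": 0.40,
    "Qwen3-30B-A3B": 0.36,
    "Qwen3-VL-30B-A3B-Instruct": 0.53,
    "Qwen3-32B": 1.47,
    "Qwen3-235B-Instruct-2507": 2.04,
    "Qwen3-VL-235B-A22B-Instruct": 3.07,
    "Llama-3.2-1B": 0.09,
    "Llama-3.2-3B": 0.18,
    "Llama-3.1-8B": 0.40,
    "Llama-3.1-70B": 3.16,
    "DeepSeek-V3.1": 3.38,
    "GPT-OSS-120B": 0.52,
    "GPT-OSS-20B": 0.36,
    "Kimi-K2-Thinking": 2.93,
}

def train_price(model: str) -> float:
    """Train rate in USD per million tokens; raises KeyError for unknown models."""
    return TRAIN_PRICE_PER_MILLION[model]
```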
## Model-to-Tokenizer Mapping
Use the correct HuggingFace tokenizer for accurate token counting:
| Model | HuggingFace Tokenizer |
|---|---|
| Qwen3-4B-Instruct-2507 | |
| Qwen3-8B | |
| Qwen3-30B-A3B | |
| Qwen3-32B | |
| Qwen3-235B-Instruct-2507 | |
| Qwen3-VL-* | |
| Llama-3.2-1B | |
| Llama-3.2-3B | |
| Llama-3.1-8B | |
| Llama-3.1-70B | |
| DeepSeek-V3.1 | |
| GPT-OSS-* | |
| Kimi-K2-Thinking | |
## Tokenization
The bundled `scripts/calculate_cost.py` handles tokenization automatically. For custom use:

```python
from transformers import AutoTokenizer

# Load the correct tokenizer for your model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B", trust_remote_code=True)

# Count tokens
token_count = len(tokenizer.encode("Your training text here"))
```

## Supported JSONL Formats
The script handles these training data formats:

Chat format (recommended):

```json
{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
```

Text format:

```json
{"text": "Your training text here"}
```

Instruction format (Alpaca-style):

```json
{"instruction": "...", "input": "...", "output": "..."}
```
## Quick Cost Examples
### Example 1: Qwen3-8B on 1M tokens, 3 epochs

- Dataset tokens: 1,000,000
- Training tokens: 1,000,000 × 3 = 3,000,000
- Cost: 3.0M × $0.40/M = $1.20

### Example 2: Llama-3.1-70B on 5M tokens, 2 epochs

- Dataset tokens: 5,000,000
- Training tokens: 5,000,000 × 2 = 10,000,000
- Cost: 10.0M × $3.16/M = $31.60

### Example 3: Qwen3-235B-Instruct-2507 on 2M tokens, 4 epochs

- Dataset tokens: 2,000,000
- Training tokens: 2,000,000 × 4 = 8,000,000
- Cost: 8.0M × $2.04/M = $16.32
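Each example reduces to one line of arithmetic, so they are easy to sanity-check:

```python
# (dataset_tokens, epochs, train_price_per_million) from the examples above
examples = [
    (1_000_000, 3, 0.40),  # Qwen3-8B
    (5_000_000, 2, 3.16),  # Llama-3.1-70B
    (2_000_000, 4, 2.04),  # Qwen3-235B-Instruct-2507
]
for tokens, epochs, price in examples:
    print(f"${tokens * epochs * price / 1_000_000:.2f}")  # $1.20, $31.60, $16.32
```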
## Important Notes
- **LoRA Fine-Tuning**: Tinker uses Low-Rank Adaptation (LoRA), not full fine-tuning
- **Token Counting**: Always use the model's native tokenizer for accurate counts; different tokenizers produce different token counts for the same text
- **Vision Models**: VL models have higher costs due to image-processing overhead
- **`trust_remote_code`**: Required for some tokenizers (Qwen, DeepSeek)