tinker-training-cost

Tinker Training Cost Calculator

Calculate training costs for Tinker fine-tuning jobs by tokenizing your dataset with the correct model tokenizer and applying current pricing.

Quick Start

Use the bundled script to calculate training costs:

```bash
# List available models and pricing
python scripts/calculate_cost.py --list-models

# Calculate cost for a JSONL dataset
python scripts/calculate_cost.py training_data.jsonl --model Qwen3-8B --epochs 3

# Output as JSON
python scripts/calculate_cost.py training_data.jsonl --model Llama-3.1-70B --json
```

The script:
1. Loads the correct tokenizer for the selected model
2. Counts tokens in your JSONL file (supports chat, text, and instruction formats)
3. Calculates the estimated training cost
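The three steps above can be sketched in a few lines. This is illustrative only: `estimate_cost` and its `tokenize` parameter are hypothetical stand-ins for the script's internals, and the field-joining scheme is an assumption.

```python
import json

def estimate_cost(jsonl_path, tokenize, epochs=3, train_price_per_million=0.40):
    """Sketch of the script's flow: count tokens in a JSONL file, then price them.

    `tokenize` is any callable mapping text -> list of tokens; in practice this
    would be the encode method of the model's HuggingFace tokenizer.
    """
    total_tokens = 0
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            if "messages" in record:   # chat format
                text = " ".join(m["content"] for m in record["messages"])
            elif "text" in record:     # plain text format
                text = record["text"]
            else:                      # instruction (Alpaca-style) format
                text = " ".join(record.get(k, "") for k in ("instruction", "input", "output"))
            total_tokens += len(tokenize(text))
    training_tokens = total_tokens * epochs
    return training_tokens * train_price_per_million / 1_000_000
```

Passing the tokenizer in as a callable keeps the sketch testable without downloading model weights; with `transformers` installed you would pass `tokenizer.encode`.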

Cost Formula

成本计算公式

Training Cost = (total_tokens × epochs × train_price_per_million) / 1_000_000

Where:
  • total_tokens = tokens in your training dataset (from tokenization)
  • epochs = number of training passes (default: 3)
  • train_price_per_million = model-specific training rate from the pricing table
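In code the formula is a one-liner (a minimal sketch; the function name is illustrative):

```python
def training_cost(total_tokens: int, epochs: int, train_price_per_million: float) -> float:
    """Apply the cost formula: tokens x epochs x per-million training rate."""
    return total_tokens * epochs * train_price_per_million / 1_000_000

# e.g. Qwen3-8B ($0.40/M train) on 1M tokens for 3 epochs:
# training_cost(1_000_000, 3, 0.40) -> 1.20
```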

Tinker Pricing

Tinker定价

All prices as of January 5, 2026. Source: https://thinkingmachines.ai/tinker/

All prices are in USD per million tokens.

| Category | Description |
|----------|-------------|
| Prefill  | Processing input context (inference) |
| Sample   | Generating output tokens (inference) |
| Train    | Training/fine-tuning tokens |

Qwen Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Qwen3-4B-Instruct-2507 | $0.07 | $0.22 | $0.22 |
| Qwen3-8B | $0.13 | $0.40 | $0.40 |
| Qwen3-30B-A3B | $0.12 | $0.30 | $0.36 |
| Qwen3-VL-30B-A3B-Instruct | $0.18 | $0.44 | $0.53 |
| Qwen3-32B | $0.49 | $1.47 | $1.47 |
| Qwen3-235B-Instruct-2507 | $0.68 | $1.70 | $2.04 |
| Qwen3-VL-235B-A22B-Instruct | $1.02 | $2.56 | $3.07 |

Llama Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Llama-3.2-1B | $0.03 | $0.09 | $0.09 |
| Llama-3.2-3B | $0.06 | $0.18 | $0.18 |
| Llama-3.1-8B | $0.13 | $0.40 | $0.40 |
| Llama-3.1-70B | $1.05 | $3.16 | $3.16 |

DeepSeek Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| DeepSeek-V3.1 | $1.13 | $2.81 | $3.38 |

GPT-OSS Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| GPT-OSS-120B | $0.18 | $0.44 | $0.52 |
| GPT-OSS-20B | $0.12 | $0.30 | $0.36 |

Moonshot Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Kimi-K2-Thinking | $0.98 | $2.44 | $2.93 |

Model-to-Tokenizer Mapping

Use the correct HuggingFace tokenizer for accurate token counting:

| Model | HuggingFace Tokenizer |
|-------|-----------------------|
| Qwen3-4B-Instruct-2507 | Qwen/Qwen3-4B |
| Qwen3-8B | Qwen/Qwen3-8B |
| Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B |
| Qwen3-32B | Qwen/Qwen3-32B |
| Qwen3-235B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct |
| Qwen3-VL-* | Qwen/Qwen2.5-VL-7B-Instruct (shared VL tokenizer) |
| Llama-3.2-1B | meta-llama/Llama-3.2-1B-Instruct |
| Llama-3.2-3B | meta-llama/Llama-3.2-3B-Instruct |
| Llama-3.1-8B | meta-llama/Llama-3.1-8B-Instruct |
| Llama-3.1-70B | meta-llama/Llama-3.1-70B-Instruct |
| DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3 |
| GPT-OSS-* | Qwen/Qwen3-8B (compatible tokenizer) |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Instruct |
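The mapping above can be encoded as a lookup table, with the wildcard rows (`Qwen3-VL-*`, `GPT-OSS-*`) handled by pattern matching. This is a sketch: `resolve_tokenizer` is a hypothetical helper, not part of the bundled script.

```python
from fnmatch import fnmatch

# Model name -> HuggingFace tokenizer repo, wildcards as in the table above
TOKENIZER_MAP = {
    "Qwen3-4B-Instruct-2507": "Qwen/Qwen3-4B",
    "Qwen3-8B": "Qwen/Qwen3-8B",
    "Qwen3-30B-A3B": "Qwen/Qwen3-30B-A3B",
    "Qwen3-32B": "Qwen/Qwen3-32B",
    "Qwen3-235B-Instruct-2507": "Qwen/Qwen3-235B-A22B-Instruct",
    "Qwen3-VL-*": "Qwen/Qwen2.5-VL-7B-Instruct",  # shared VL tokenizer
    "Llama-3.2-1B": "meta-llama/Llama-3.2-1B-Instruct",
    "Llama-3.2-3B": "meta-llama/Llama-3.2-3B-Instruct",
    "Llama-3.1-8B": "meta-llama/Llama-3.1-8B-Instruct",
    "Llama-3.1-70B": "meta-llama/Llama-3.1-70B-Instruct",
    "DeepSeek-V3.1": "deepseek-ai/DeepSeek-V3",
    "GPT-OSS-*": "Qwen/Qwen3-8B",                 # compatible tokenizer
    "Kimi-K2-Thinking": "moonshotai/Kimi-K2-Instruct",
}

def resolve_tokenizer(model: str) -> str:
    """Return the tokenizer repo for a model, trying exact names before wildcards."""
    if model in TOKENIZER_MAP:
        return TOKENIZER_MAP[model]
    for pattern, repo in TOKENIZER_MAP.items():
        if "*" in pattern and fnmatch(model, pattern):
            return repo
    raise KeyError(f"Unknown model: {model}")
```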

Tokenization

The bundled `scripts/calculate_cost.py` handles tokenization automatically. For custom use:

```python
from transformers import AutoTokenizer

# Load the correct tokenizer for your model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B", trust_remote_code=True)

# Count tokens
token_count = len(tokenizer.encode("Your training text here"))
```

Supported JSONL Formats

The script handles these training data formats:

Chat format (recommended):

```json
{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
```

Text format:

```json
{"text": "Your training text here"}
```

Instruction format (Alpaca-style):

```json
{"instruction": "...", "input": "...", "output": "..."}
```

Quick Cost Examples

Example 1: Qwen3-8B on 1M tokens, 3 epochs

Dataset tokens: 1,000,000
Training tokens: 1,000,000 × 3 = 3,000,000
Cost: 3.0M × $0.40/M = $1.20

Example 2: Llama-3.1-70B on 5M tokens, 2 epochs

Dataset tokens: 5,000,000
Training tokens: 5,000,000 × 2 = 10,000,000
Cost: 10.0M × $3.16/M = $31.60

Example 3: Qwen3-235B on 2M tokens, 4 epochs

Dataset tokens: 2,000,000
Training tokens: 2,000,000 × 4 = 8,000,000
Cost: 8.0M × $2.04/M = $16.32


Important Notes

  1. LoRA Fine-Tuning: Tinker uses Low-Rank Adaptation (LoRA), not full fine-tuning
  2. Token Counting: Always use the model's native tokenizer for accurate counts; different tokenizers produce different token counts for the same text
  3. Vision Models: VL models have higher costs due to image processing overhead
  4. trust_remote_code: Required for some tokenizers (Qwen, DeepSeek)