tinker-training-cost

Tinker Training Cost Calculator

Calculate training costs for Tinker fine-tuning jobs by tokenizing your dataset with the correct model tokenizer and applying current pricing.

Quick Start

Use the bundled script to calculate training costs:

```bash
# List available models and pricing
python scripts/calculate_cost.py --list-models

# Calculate cost for a JSONL dataset
python scripts/calculate_cost.py training_data.jsonl --model Qwen3-8B --epochs 3

# Output as JSON
python scripts/calculate_cost.py training_data.jsonl --model Llama-3.1-70B --json
```

The script:
1. Loads the correct tokenizer for the selected model
2. Counts tokens in your JSONL file (supports chat, text, and instruction formats)
3. Calculates the estimated training cost
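The three steps above can be sketched in a few lines. This is illustrative only: `estimate_cost` and its `tokenize` parameter are hypothetical stand-ins for the script's internals, and the field-joining scheme is an assumption.

```python
import json

def estimate_cost(jsonl_path, tokenize, epochs=3, train_price_per_million=0.40):
    """Sketch of the script's flow: count tokens in a JSONL file, then price them.

    `tokenize` is any callable mapping text -> list of tokens; in practice this
    would be the encode method of the model's HuggingFace tokenizer.
    """
    total_tokens = 0
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            if "messages" in record:   # chat format
                text = " ".join(m["content"] for m in record["messages"])
            elif "text" in record:     # plain text format
                text = record["text"]
            else:                      # instruction (Alpaca-style) format
                text = " ".join(record.get(k, "") for k in ("instruction", "input", "output"))
            total_tokens += len(tokenize(text))
    training_tokens = total_tokens * epochs
    return training_tokens * train_price_per_million / 1_000_000
```

Passing the tokenizer in as a callable keeps the sketch testable without downloading model weights; with `transformers` installed you would pass `tokenizer.encode`.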

Cost Formula

成本计算公式

Training Cost = (total_tokens × epochs × train_price_per_million) / 1_000_000

Where:
  • total_tokens = tokens in your training dataset (from tokenization)
  • epochs = number of training passes (default: 3)
  • train_price_per_million = model-specific training rate from the pricing table
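In code the formula is a one-liner (a minimal sketch; the function name is illustrative):

```python
def training_cost(total_tokens: int, epochs: int, train_price_per_million: float) -> float:
    """Apply the cost formula: tokens x epochs x per-million training rate."""
    return total_tokens * epochs * train_price_per_million / 1_000_000

# e.g. Qwen3-8B ($0.40/M train) on 1M tokens for 3 epochs:
# training_cost(1_000_000, 3, 0.40) -> 1.20
```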

Tinker Pricing

Tinker定价

All prices as of January 5, 2026. Source: https://thinkingmachines.ai/tinker/

All prices are in USD per million tokens.

| Category | Description |
|----------|-------------|
| Prefill  | Processing input context (inference) |
| Sample   | Generating output tokens (inference) |
| Train    | Training/fine-tuning tokens |

Qwen Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Qwen3-4B-Instruct-2507 | $0.07 | $0.22 | $0.22 |
| Qwen3-8B | $0.13 | $0.40 | $0.40 |
| Qwen3-30B-A3B | $0.12 | $0.30 | $0.36 |
| Qwen3-VL-30B-A3B-Instruct | $0.18 | $0.44 | $0.53 |
| Qwen3-32B | $0.49 | $1.47 | $1.47 |
| Qwen3-235B-Instruct-2507 | $0.68 | $1.70 | $2.04 |
| Qwen3-VL-235B-A22B-Instruct | $1.02 | $2.56 | $3.07 |

Llama Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Llama-3.2-1B | $0.03 | $0.09 | $0.09 |
| Llama-3.2-3B | $0.06 | $0.18 | $0.18 |
| Llama-3.1-8B | $0.13 | $0.40 | $0.40 |
| Llama-3.1-70B | $1.05 | $3.16 | $3.16 |

DeepSeek Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| DeepSeek-V3.1 | $1.13 | $2.81 | $3.38 |

GPT-OSS Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| GPT-OSS-120B | $0.18 | $0.44 | $0.52 |
| GPT-OSS-20B | $0.12 | $0.30 | $0.36 |

Moonshot Models

| Model | Prefill | Sample | Train |
|-------|---------|--------|-------|
| Kimi-K2-Thinking | $0.98 | $2.44 | $2.93 |

Model-to-Tokenizer Mapping

Use the correct HuggingFace tokenizer for accurate token counting:

| Model | HuggingFace Tokenizer |
|-------|-----------------------|
| Qwen3-4B-Instruct-2507 | Qwen/Qwen3-4B |
| Qwen3-8B | Qwen/Qwen3-8B |
| Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B |
| Qwen3-32B | Qwen/Qwen3-32B |
| Qwen3-235B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct |
| Qwen3-VL-* | Qwen/Qwen2.5-VL-7B-Instruct (shared VL tokenizer) |
| Llama-3.2-1B | meta-llama/Llama-3.2-1B-Instruct |
| Llama-3.2-3B | meta-llama/Llama-3.2-3B-Instruct |
| Llama-3.1-8B | meta-llama/Llama-3.1-8B-Instruct |
| Llama-3.1-70B | meta-llama/Llama-3.1-70B-Instruct |
| DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3 |
| GPT-OSS-* | Qwen/Qwen3-8B (compatible tokenizer) |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Instruct |
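The mapping above can be encoded as a lookup table, with the wildcard rows (`Qwen3-VL-*`, `GPT-OSS-*`) handled by pattern matching. This is a sketch: `resolve_tokenizer` is a hypothetical helper, not part of the bundled script.

```python
from fnmatch import fnmatch

# Model name -> HuggingFace tokenizer repo, wildcards as in the table above
TOKENIZER_MAP = {
    "Qwen3-4B-Instruct-2507": "Qwen/Qwen3-4B",
    "Qwen3-8B": "Qwen/Qwen3-8B",
    "Qwen3-30B-A3B": "Qwen/Qwen3-30B-A3B",
    "Qwen3-32B": "Qwen/Qwen3-32B",
    "Qwen3-235B-Instruct-2507": "Qwen/Qwen3-235B-A22B-Instruct",
    "Qwen3-VL-*": "Qwen/Qwen2.5-VL-7B-Instruct",  # shared VL tokenizer
    "Llama-3.2-1B": "meta-llama/Llama-3.2-1B-Instruct",
    "Llama-3.2-3B": "meta-llama/Llama-3.2-3B-Instruct",
    "Llama-3.1-8B": "meta-llama/Llama-3.1-8B-Instruct",
    "Llama-3.1-70B": "meta-llama/Llama-3.1-70B-Instruct",
    "DeepSeek-V3.1": "deepseek-ai/DeepSeek-V3",
    "GPT-OSS-*": "Qwen/Qwen3-8B",                 # compatible tokenizer
    "Kimi-K2-Thinking": "moonshotai/Kimi-K2-Instruct",
}

def resolve_tokenizer(model: str) -> str:
    """Return the tokenizer repo for a model, trying exact names before wildcards."""
    if model in TOKENIZER_MAP:
        return TOKENIZER_MAP[model]
    for pattern, repo in TOKENIZER_MAP.items():
        if "*" in pattern and fnmatch(model, pattern):
            return repo
    raise KeyError(f"Unknown model: {model}")
```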

Tokenization

The bundled `scripts/calculate_cost.py` handles tokenization automatically. For custom use:

```python
from transformers import AutoTokenizer

# Load the correct tokenizer for your model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B", trust_remote_code=True)

# Count tokens
token_count = len(tokenizer.encode("Your training text here"))
```

Supported JSONL Formats

The script handles these training data formats:

Chat format (recommended):

```json
{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
```

Text format:

```json
{"text": "Your training text here"}
```

Instruction format (Alpaca-style):

```json
{"instruction": "...", "input": "...", "output": "..."}
```

Quick Cost Examples

Example 1: Qwen3-8B on 1M tokens, 3 epochs

Dataset tokens: 1,000,000
Training tokens: 1,000,000 × 3 = 3,000,000
Cost: 3.0M × $0.40/M = $1.20

Example 2: Llama-3.1-70B on 5M tokens, 2 epochs

Dataset tokens: 5,000,000
Training tokens: 5,000,000 × 2 = 10,000,000
Cost: 10.0M × $3.16/M = $31.60

Example 3: Qwen3-235B on 2M tokens, 4 epochs

Dataset tokens: 2,000,000
Training tokens: 2,000,000 × 4 = 8,000,000
Cost: 8.0M × $2.04/M = $16.32


Important Notes

  1. LoRA Fine-Tuning: Tinker uses Low-Rank Adaptation (LoRA), not full fine-tuning
  2. Token Counting: Always use the model's native tokenizer for accurate counts; different tokenizers produce different token counts for the same text
  3. Vision Models: VL models have higher costs due to image processing overhead
  4. trust_remote_code: Required for some tokenizers (Qwen, DeepSeek)