toon-format

Token-Oriented Object Notation (TOON)

Skill by ara.so — Daily 2026 Skills collection.

TOON is a compact, human-readable encoding of the JSON data model that minimizes tokens for LLM input. It combines YAML-style indentation for nested objects with CSV-style tabular layout for uniform arrays, achieving ~40% token reduction while maintaining or improving LLM comprehension accuracy.

Installation

bash

# npm
npm install @toon-format/toon

# pnpm
pnpm add @toon-format/toon

# yarn
yarn add @toon-format/toon

CLI

bash

# Install globally
npm install -g @toon-format/toon

# Convert JSON file to TOON
toon encode input.json
toon encode input.json -o output.toon

# Convert TOON back to JSON
toon decode input.toon
toon decode input.toon -o output.json

# Pipe support
cat data.json | toon encode
cat data.toon | toon decode

# Pretty-print JSON output
toon decode input.toon --pretty

# Show token count comparison
toon encode input.json --stats

Core API

encode / stringify

typescript

import { encode, decode } from '@toon-format/toon';

// Basic encoding (JSON → TOON string)
const data = {
  context: {
    task: 'Our favorite hikes together',
    location: 'Boulder',
    season: 'spring_2025',
  },
  friends: ['ana', 'luis', 'sam'],
  hikes: [
    { id: 1, name: 'Blue Lake Trail', distanceKm: 7.5, elevationGain: 320, companion: 'ana', wasSunny: true },
    { id: 2, name: 'Ridge Overlook', distanceKm: 9.2, elevationGain: 540, companion: 'luis', wasSunny: false },
    { id: 3, name: 'Wildflower Loop', distanceKm: 5.1, elevationGain: 180, companion: 'sam', wasSunny: true },
  ],
};

const toon = encode(data);
console.log(toon);
// context:
//   task: Our favorite hikes together
//   location: Boulder
//   season: spring_2025
// friends[3]: ana,luis,sam
// hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
//   1,Blue Lake Trail,7.5,320,ana,true
//   2,Ridge Overlook,9.2,540,luis,false
//   3,Wildflower Loop,5.1,180,sam,true

decode / parse

typescript

import { decode } from '@toon-format/toon';

const toonString = `
context:
  task: Our favorite hikes together
  location: Boulder
friends[2]: ana,luis
hikes[2]{id,name,distanceKm}:
  1,Blue Lake Trail,7.5
  2,Ridge Overlook,9.2
`;

const parsed = decode(toonString);
// Returns the original JavaScript object
console.log(parsed.hikes[0].name); // 'Blue Lake Trail'

Encoding options

typescript

import { encode } from '@toon-format/toon';

const toon = encode(data, {
  // Force all arrays to tabular format (default: auto-detect uniform arrays)
  tabular: 'always',

  // Never use tabular format
  // tabular: 'never',

  // Indent size for nested objects (default: 2)
  indent: 2,

  // Quote strings that contain special characters (default: auto)
  quoting: 'auto',
});

Format Overview

Primitive scalars

TOON encodes scalars the same way as YAML — unquoted when unambiguous:

name: Alice
age: 30
active: true
score: 98.6
nothing: null

Nested objects (YAML-style indentation)

user:
  name: Alice
  address:
    city: Boulder
    zip: 80301
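
To make the indentation rule concrete, here is a minimal sketch of how nested objects could be rendered by hand. `encodeNested`, `Leaf`, and `Tree` are hypothetical names written for this illustration; they are not part of `@toon-format/toon`, and the sketch handles only objects with scalar leaves (no arrays, no quoting).

```typescript
// Illustrative sketch only; not the @toon-format/toon implementation.
type Leaf = string | number | boolean | null;
interface Tree {
  [key: string]: Leaf | Tree;
}

// Emits one line per key, indenting children by `indentSize` spaces per level.
function encodeNested(obj: Tree, depth = 0, indentSize = 2): string[] {
  const pad = ' '.repeat(depth * indentSize);
  const lines: string[] = [];
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object') {
      // A nested object: write the bare key, then recurse one level deeper.
      lines.push(`${pad}${key}:`);
      lines.push(...encodeNested(value, depth + 1, indentSize));
    } else {
      lines.push(`${pad}${key}: ${String(value)}`);
    }
  }
  return lines;
}

const nestedExample = encodeNested({
  user: { name: 'Alice', address: { city: 'Boulder', zip: 80301 } },
}).join('\n');
console.log(nestedExample);
```

Running it on the `user` object reproduces the block shown above, which is a quick way to check your mental model of the format before relying on `encode()`.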

Flat arrays (scalar items)

Square brackets declare the array length, values are comma-separated:

tags[3]: typescript,llm,serialization
scores[4]: 10,20,30,40
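
The layout above can be sketched in a few lines. `encodeFlatArray` is a hypothetical helper for illustration, not a library function, and it assumes none of the values need quoting.

```typescript
// Hypothetical helper for illustration only; the library does this inside encode().
type Scalar = string | number | boolean | null;

function encodeFlatArray(key: string, values: Scalar[]): string {
  // The declared length lets a reader (or an LLM) verify that no item was dropped.
  return `${key}[${values.length}]: ${values.map(String).join(',')}`;
}

const tagsLine = encodeFlatArray('tags', ['typescript', 'llm', 'serialization']);
const scoresLine = encodeFlatArray('scores', [10, 20, 30, 40]);
console.log(tagsLine);
console.log(scoresLine);
```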

Uniform object arrays (tabular format)

Curly braces declare the field headers; each subsequent indented line is a row:

employees[3]{id,name,department,salary}:
  1,Alice,Engineering,95000
  2,Bob,Marketing,72000
  3,Carol,Engineering,102000
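
As a rough sketch of why uniform arrays compress so well, the code below writes the field names once in the header and then emits one comma-separated row per object. `sharedFields` and `encodeTable` are hypothetical names for this illustration, and "uniform" is simplified here to mean the same keys in the same order; the library's own detection may differ.

```typescript
// Illustrative sketch only; quoting and nested values are deliberately ignored.
type Cell = string | number | boolean | null;
type Row = Record<string, Cell>;

// Returns the shared field list, or null when the rows do not all have
// the same keys in the same order (a simplification of "uniform").
function sharedFields(rows: Row[]): string[] | null {
  const first = rows[0];
  if (first === undefined) return null;
  const fields = Object.keys(first);
  const shape = JSON.stringify(fields);
  return rows.every((r) => JSON.stringify(Object.keys(r)) === shape) ? fields : null;
}

function encodeTable(key: string, rows: Row[], indent = '  '): string {
  const fields = sharedFields(rows);
  if (fields === null) throw new Error(`${key}: rows are not uniform`);
  // Header carries the row count and the field names exactly once.
  const header = `${key}[${rows.length}]{${fields.join(',')}}:`;
  const body = rows.map((r) => indent + fields.map((f) => String(r[f])).join(','));
  return [header, ...body].join('\n');
}

const employeeTable = encodeTable('employees', [
  { id: 1, name: 'Alice', department: 'Engineering', salary: 95000 },
  { id: 2, name: 'Bob', department: 'Marketing', salary: 72000 },
  { id: 3, name: 'Carol', department: 'Engineering', salary: 102000 },
]);
console.log(employeeTable);
```

In JSON each of the three objects would repeat all four keys; here the keys appear once, which is where most of the token savings on tabular data come from.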

Quoting rules

Values containing commas, colons, or newlines are quoted:

notes[2]: "hello, world","line1\nline2"
messages[1]{from,text}:
  alice,"See you at 3:00, okay?"
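
The rule above can be approximated with a small helper. `quoteIfNeeded` is a hypothetical function covering only the three cases named here (commas, colons, newlines); the encoder in `@toon-format/toon` may quote in additional situations, so treat this as a sketch rather than a specification.

```typescript
// Illustrative only: handles commas, colons, and newlines, nothing else.
function quoteIfNeeded(value: string): string {
  if (!/[,:\n]/.test(value)) return value;
  const escaped = value
    .replace(/\\/g, '\\\\') // escape backslashes first so later escapes stay intact
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
  return `"${escaped}"`;
}

const quotedCells = ['alice', 'hello, world', 'line1\nline2', 'See you at 3:00, okay?'].map(
  quoteIfNeeded,
);
console.log(quotedCells.join(' | '));
```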

Mixed nesting

company:
  name: Acme Corp
  founded: 1987
  offices[2]: NYC,SF
  teams[2]{name,headcount}:
    Engineering,45
    Marketing,20

Using TOON with LLMs

Direct prompt injection

typescript

import { encode } from '@toon-format/toon';
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function queryWithToon(data: unknown, question: string) {
  const toon = encode(data);

  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: [
          'You are a data analyst. The user will provide data in TOON format.',
          'TOON is a compact encoding of JSON: indentation = nesting,',
          'key[N]: v1,v2 = array of N scalars,',
          'key[N]{f1,f2}: rows = array of N objects with fields f1, f2.',
        ].join(' '),
      },
      {
        role: 'user',
        content: `Data:\n\`\`\`\n${toon}\n\`\`\`\n\nQuestion: ${question}`,
      },
    ],
  });

  return response.choices[0].message.content;
}

// Usage
const employees = [
  { id: 1, name: 'Alice', dept: 'Eng', salary: 95000 },
  { id: 2, name: 'Bob', dept: 'Marketing', salary: 72000 },
];

const answer = await queryWithToon(
  { employees },
  'Who has the highest salary?'
);

Anthropic / Claude

typescript

import { encode } from '@toon-format/toon';
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function analyzeWithClaude(data: unknown, prompt: string) {
  const toon = encode(data);

  const message = await client.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 1024,
    system:
      'Data is in TOON format: indented = nested objects, key[N]: vals = scalar array, key[N]{fields}: rows = object array.',
    messages: [
      {
        role: 'user',
        content: `\`\`\`toon\n${toon}\n\`\`\`\n\n${prompt}`,
      },
    ],
  });

  return message.content[0].type === 'text' ? message.content[0].text : null;
}

Token count comparison utility

typescript

import { encode } from '@toon-format/toon';
import { encode as gptEncode } from 'gpt-tokenizer';

function compareTokens(data: unknown) {
  const jsonStr = JSON.stringify(data);
  const toonStr = encode(data);

  const jsonTokens = gptEncode(jsonStr).length;
  const toonTokens = gptEncode(toonStr).length;
  const savings = (((jsonTokens - toonTokens) / jsonTokens) * 100).toFixed(1);

  console.log(`JSON:  ${jsonTokens} tokens`);
  console.log(`TOON:  ${toonTokens} tokens`);
  console.log(`Saved: ${savings}%`);

  return { jsonTokens, toonTokens, savings: parseFloat(savings) };
}

Common Patterns

Batch API calls with TOON

typescript

import { encode } from '@toon-format/toon';

// Encode each record separately for independent LLM calls
function encodeRecords<T>(records: T[]): string[] {
  return records.map((r) => encode(r));
}

// Encode all records as one TOON document (most efficient for bulk)
function encodeAll<T>(records: T[], key = 'records'): string {
  return encode({ [key]: records });
}

RAG / retrieval context injection

typescript

import { encode } from '@toon-format/toon';

interface SearchResult {
  id: string;
  title: string;
  snippet: string;
  score: number;
  url: string;
}

function buildRagContext(results: SearchResult[]): string {
  // TOON is ideal here — uniform objects collapse into a compact table
  return encode({ results });
}

// Output:
// results[5]{id,title,snippet,score,url}:
//   doc1,Introduction to TOON,...,0.95,https://...
//   doc2,TOON vs JSON,...,0.87,https://...

Streaming encode for large datasets

typescript

import { encode } from '@toon-format/toon';
import { promises as fs } from 'fs';

// For large JSON files: read → parse → encode → write.
// Note: the whole file is held in memory; this is a one-shot conversion, not incremental streaming.
async function convertFile(inputPath: string, outputPath: string) {
  const raw = await fs.readFile(inputPath, 'utf-8');
  const data = JSON.parse(raw);
  const toon = encode(data);
  await fs.writeFile(outputPath, toon, 'utf-8');

  const jsonBytes = Buffer.byteLength(raw);
  const toonBytes = Buffer.byteLength(toon);
  console.log(`Reduced size by ${(((jsonBytes - toonBytes) / jsonBytes) * 100).toFixed(1)}%`);
}

Schema-aware encoding (TypeScript)

typescript

import { encode, decode } from '@toon-format/toon';

interface Employee {
  id: number;
  name: string;
  department: string;
  salary: number;
  active: boolean;
}

interface EmployeeReport {
  generatedAt: string;
  employees: Employee[];
}

// Encode is generic-friendly — pass any serializable object
const report: EmployeeReport = {
  generatedAt: new Date().toISOString(),
  employees: [
    { id: 1, name: 'Alice', department: 'Engineering', salary: 95000, active: true },
    { id: 2, name: 'Bob', department: 'Marketing', salary: 72000, active: true },
  ],
};

const toon = encode(report);

// Decode back with type assertion
const recovered = decode(toon) as EmployeeReport;
console.log(recovered.employees[0].name); // 'Alice'

Express middleware for TOON content-type

typescript

import express from 'express';
import { encode, decode } from '@toon-format/toon';

const app = express();

// Parse incoming TOON bodies
app.use((req, res, next) => {
  if (req.headers['content-type']?.startsWith('text/toon')) {
    let body = '';
    req.on('data', (chunk) => (body += chunk));
    req.on('end', () => {
      try {
        (req as any).toonBody = decode(body);
        next();
      } catch (e) {
        res.status(400).json({ error: 'Invalid TOON body' });
      }
    });
  } else {
    next();
  }
});

// Respond with TOON when client requests it
app.get('/api/employees', (req, res) => {
  const employees = [
    { id: 1, name: 'Alice', dept: 'Eng' },
    { id: 2, name: 'Bob', dept: 'Marketing' },
  ];

  if (req.headers.accept?.includes('text/toon')) {
    res.setHeader('Content-Type', 'text/toon; charset=utf-8');
    res.send(encode({ employees }));
  } else {
    res.json({ employees });
  }
});

When to Use TOON vs JSON

Scenario	Recommendation
Uniform arrays of objects	✅ TOON (biggest savings)
Deeply nested / non-uniform	⚠️ Benchmark both; JSON-compact may win
Pure flat tabular data	Consider CSV (smaller) or TOON (structured)
Latency-critical (local models)	Benchmark TTFT + tokens/sec
Programmatic API calls	Keep JSON; encode to TOON only for LLM input
Semi-uniform (~40–60% tabular)	Benchmark; savings diminish
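
One way to judge where a payload falls in this table is to estimate how much of it would end up in tabular form. The sketch below is a hypothetical heuristic, not part of `@toon-format/toon`: `tabularShare` returns the fraction of scalar values that sit inside arrays of same-shaped, scalar-only objects. It is a starting point for deciding what to benchmark, not a substitute for measuring tokens.

```typescript
// Hypothetical heuristic for illustration; names and logic are assumptions.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

const isObject = (v: Json): v is JsonObject =>
  v !== null && typeof v === 'object' && !Array.isArray(v);

const isScalar = (v: Json): boolean => v === null || typeof v !== 'object';

// True when every item is an object with the same keys and only scalar values.
function isTabular(items: Json[]): boolean {
  const first = items[0];
  if (first === undefined || !isObject(first)) return false;
  const shape = JSON.stringify(Object.keys(first).sort());
  return items.every(
    (item) =>
      isObject(item) &&
      JSON.stringify(Object.keys(item).sort()) === shape &&
      Object.values(item).every(isScalar),
  );
}

// Counts scalar leaves as [leaves inside tabular arrays, all leaves].
function countLeaves(value: Json): [number, number] {
  if (isScalar(value)) return [0, 1];
  const children: Json[] = Array.isArray(value) ? value : Object.values(value as JsonObject);
  let tabular = 0;
  let total = 0;
  for (const child of children) {
    const [t, n] = countLeaves(child);
    tabular += t;
    total += n;
  }
  // Every leaf under a tabular array counts as tabular.
  if (Array.isArray(value) && isTabular(value)) tabular = total;
  return [tabular, total];
}

function tabularShare(value: Json): number {
  const [tabular, total] = countLeaves(value);
  return total === 0 ? 0 : tabular / total;
}

const samplePayload: Json = {
  meta: { source: 'hr', year: 2025 },
  employees: [
    { id: 1, name: 'Alice' },
    { id: 2, name: 'Bob' },
    { id: 3, name: 'Carol' },
  ],
};
console.log(tabularShare(samplePayload)); // 6 of 8 scalar values are in the table
```

A share close to 1 corresponds to the "uniform arrays of objects" row; a share near the middle of the range corresponds to the semi-uniform row, where benchmarking both formats is the safer choice.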

Troubleshooting

Values with commas parse incorrectly

Wrap them in double quotes in your TOON string, or rely on `encode()`, which quotes them automatically:

typescript

// encode() automatically quotes values containing commas
const data = { tags: ['hello, world', 'foo,bar'] };
encode(data);
// tags[2]: "hello, world","foo,bar"

Round-trip type loss (numbers vs strings)

TOON uses unquoted values for numbers and booleans. Ensure your data uses proper JS types before encoding: don't pass "95000" (string) when you mean 95000 (number):

typescript

// ✅ Correct
{ salary: 95000, active: true }

// ❌ Will decode as string "95000" and string "true"
{ salary: '95000', active: 'true' }
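
If the data arrives with stringly-typed values (for example from a CSV import), a normalization pass before encoding can help. The sketch below is a hypothetical helper, `coerceScalars`, and not a library feature. It converts only `'true'`, `'false'`, and plain decimal numbers, and it leaves strings such as `'007'` alone. It would still turn a ZIP code like `'80301'` into a number, so in practice it should be applied only to fields known to be numeric.

```typescript
// Hypothetical pre-encoding pass: turn numeric and boolean strings into real JS types.
function coerceScalars(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(coerceScalars);
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = coerceScalars(v);
    }
    return out;
  }
  if (typeof value !== 'string') return value;
  if (value === 'true') return true;
  if (value === 'false') return false;
  // Plain decimals only: no leading zeros, no exponent notation.
  return /^-?(0|[1-9]\d*)(\.\d+)?$/.test(value) ? Number(value) : value;
}

const cleaned = coerceScalars({ salary: '95000', active: 'true', badge: '007' });
console.log(JSON.stringify(cleaned));
```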

LLM misreads tabular rows

Add a brief TOON format explanation to your system prompt:

TOON format rules:
- Indentation = nested object
- key[N]: v1,v2,v3 = array of N scalar values
- key[N]{field1,field2}: followed by N indented rows = array of objects
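
Keeping the rules in one constant makes it easy to reuse them across providers. The following is a minimal sketch; `TOON_FORMAT_RULES`, `ChatMessage`, and `buildToonMessages` are names invented for this example, and the message shape simply mirrors the system/user roles used in the earlier examples.

```typescript
// The rule text from above, kept in one place so every prompt uses the same wording.
const TOON_FORMAT_RULES = [
  'TOON format rules:',
  '- Indentation = nested object',
  '- key[N]: v1,v2,v3 = array of N scalar values',
  '- key[N]{field1,field2}: followed by N indented rows = array of objects',
].join('\n');

interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

// Builds a system + user message pair; `persona` is an optional assumption of this sketch.
function buildToonMessages(
  toon: string,
  question: string,
  persona = 'You are a data analyst.',
): ChatMessage[] {
  return [
    { role: 'system', content: `${persona}\n\n${TOON_FORMAT_RULES}` },
    { role: 'user', content: `Data:\n${toon}\n\nQuestion: ${question}` },
  ];
}

const promptMessages = buildToonMessages('tags[2]: a,b', 'How many tags are there?');
console.log(promptMessages.map((m) => `${m.role}: ${m.content}`).join('\n---\n'));
```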

CLI not found after global install

bash

# Verify that the global bin directory is on your PATH
npm prefix -g   # executables live in <prefix>/bin on Unix-like systems; npm 8 and earlier also offer: npm bin -g

# Alternatively use npx
npx @toon-format/toon encode input.json

Decoding fails on hand-written TOON

Common mistakes in hand-written TOON:

- Missing length declaration: `items{id,name}:` must be `items[2]{id,name}:`
- Inconsistent indentation (mix of tabs/spaces)
- Unquoted values containing `:` as first character
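
These three mistakes are easy to screen for before calling `decode()`. The sketch below is a hypothetical pre-check, `lintToon`, built on simple regular expressions. It is not a parser and not part of `@toon-format/toon`; it only looks at key lines (not tabular cells) and assumes keys consist of letters, digits, underscores, and dots.

```typescript
interface LintIssue {
  line: number; // 1-based line number
  message: string;
}

// Hypothetical pre-check for the common hand-writing mistakes listed above.
function lintToon(source: string): LintIssue[] {
  const issues: LintIssue[] = [];
  source.split('\n').forEach((text, index) => {
    const line = index + 1;
    // A field list directly after the key means the [N] length is missing.
    if (/^\s*[A-Za-z_][\w.]*\{[^}]*\}:/.test(text)) {
      issues.push({ line, message: 'missing [N] length before {fields}' });
    }
    // A tab in the leading whitespace suggests mixed tab/space indentation.
    if (/^ *\t/.test(text)) {
      issues.push({ line, message: 'tab character in indentation' });
    }
    // A key whose unquoted value begins with ":".
    if (/^\s*[A-Za-z_][\w.]*:\s*:/.test(text)) {
      issues.push({ line, message: 'value starting with ":" should be quoted' });
    }
  });
  return issues;
}

const lintReport = lintToon('items{id,name}:\n\t1,Widget\nnote: :draft');
for (const issue of lintReport) console.log(`line ${issue.line}: ${issue.message}`);
```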
