benchmarking

Writing Benchmarks

When to Use This Skill

  • Comparing implementations: Measuring old vs new approach after an optimization
  • Regression testing: Verifying a refactor doesn't degrade performance
  • Comparing with published version: Benchmarking workspace code against the latest published npm package

Do NOT Use This Skill For

  • General app-level performance optimization (use jazz-performance)
  • Profiling or debugging slow user-facing behavior

Directory Structure

All benchmarks live in the bench/ directory at the repository root:
bench/
├── package.json              # Dependencies: cronometro, cojson, jazz-tools, vitest
├── jazz-tools/               # jazz-tools benchmarks
│   └── *.bench.ts

File Naming

Benchmark files follow the pattern:
<subject>.<operation>.bench.ts
Each file should focus on a single benchmark comparing multiple implementations (e.g., @latest vs @workspace).
Examples:
  • comap.create.jazz-tools.bench.ts: benchmarks CoMap creation
  • filestream.getChunks.bench.ts: benchmarks FileStream.getChunks()
  • filestream.asBase64.bench.ts: benchmarks FileStream.asBase64()
  • binaryCoStream.write.bench.ts: benchmarks binary stream writes
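The naming pattern above can be checked mechanically. The following is a minimal sketch, not part of the repository: parseBenchFileName and the BenchFileName shape are hypothetical, and the optional extra segments are an assumption made to accommodate names such as comap.create.jazz-tools.bench.ts.

```typescript
// Hypothetical helper: splits a benchmark file name into its parts.
// Assumes the shape <subject>.<operation>[.<extra>...].bench.ts.
interface BenchFileName {
  subject: string;
  operation: string;
  extra: string[];
}

function parseBenchFileName(fileName: string): BenchFileName | null {
  const suffix = ".bench.ts";
  if (!fileName.endsWith(suffix)) return null;

  // Drop the suffix, then split the remainder on dots.
  const parts = fileName.slice(0, -suffix.length).split(".");

  // Require at least a subject and an operation, with no empty segments.
  if (parts.length < 2 || parts.some((p) => p.length === 0)) return null;

  const [subject, operation, ...extra] = parts;
  return { subject, operation, extra };
}
```

A name that lacks either the subject or the operation segment, or that does not end in .bench.ts, yields null.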

Benchmark Library: cronometro

Benchmarks use cronometro, which runs each test in an isolated worker thread for accurate measurement.

Basic Template

ts
import cronometro from "cronometro";

const TOTAL_BYTES = 5 * 1024 * 1024;
let data: SomeType;

await cronometro(
  {
    "operation - @latest": {
      async before() {
        // Setup — runs once before the test iterations
        data = prepareTestData(TOTAL_BYTES);
      },
      test() {
        // The code being benchmarked — runs many times
        latestImplementation(data);
      },
      async after() {
        // Cleanup — runs once after all iterations
        cleanup();
      },
    },
    "operation - @workspace": {
      async before() {
        data = prepareTestData(TOTAL_BYTES);
      },
      test() {
        workspaceImplementation(data);
      },
      async after() {
        cleanup();
      },
    },
  },
  {
    iterations: 50,
    warmup: true,
    print: {
      colors: true,
      compare: true,
    },
    onTestError: (testName: string, error: unknown) => {
      console.error(`\nError in test "${testName}":`);
      console.error(error);
    },
  },
);

Single Cronometro Instance Per Benchmark

Each benchmark file should have a single cronometro() call that compares multiple implementations of the same operation. This makes results easier to read and compare:
ts
import cronometro from "cronometro";

const TOTAL_BYTES = 5 * 1024 * 1024;
let data: InputType;

await cronometro(
  {
    "operationName - @latest": {
      async before() {
        data = generateInput(TOTAL_BYTES);
      },
      test() {
        latestImplementation(data);
      },
      async after() {
        cleanup();
      },
    },
    "operationName - @workspace": {
      async before() {
        data = generateInput(TOTAL_BYTES);
      },
      test() {
        workspaceImplementation(data);
      },
      async after() {
        cleanup();
      },
    },
  },
  {
    iterations: 50,
    warmup: true,
    print: { colors: true, compare: true },
    onTestError: (testName: string, error: unknown) => {
      console.error(`\nError in test "${testName}":`);
      console.error(error);
    },
  },
);
Key principles:
  • One file = one benchmark (e.g., getChunks, asBase64, write)
  • One cronometro call comparing @latest vs @workspace (or old vs new)
  • Fixed data size at the top of the file (e.g., const TOTAL_BYTES = 5 * 1024 * 1024)
  • Descriptive test names with the format "operation - @implementation"

Comparing workspace vs published package

To compare current workspace code against the latest published version:
1. Add npm aliases to bench/package.json:
json
{
  "dependencies": {
    "cojson": "workspace:*",
    "cojson-latest": "npm:cojson@0.20.7",
    "jazz-tools": "workspace:*",
    "jazz-tools-latest": "npm:jazz-tools@0.20.7"
  }
}
Then run pnpm install in bench/.
2. Import both versions:
ts
import * as localTools from "jazz-tools";
import * as latestPublishedTools from "jazz-tools-latest";
import { WasmCrypto as LocalWasmCrypto } from "cojson/crypto/WasmCrypto";
import { WasmCrypto as LatestPublishedWasmCrypto } from "cojson-latest/crypto/WasmCrypto";
3. Use @ts-expect-error when passing the published package, since its types won't match the workspace version:
ts
ctx = await createContext(
  // @ts-expect-error version mismatch
  latestPublishedTools,
  LatestPublishedWasmCrypto,
);
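Before running a comparison, it can help to confirm that the aliases from step 1 are actually present. The sketch below is illustrative only: checkLatestAliases is a hypothetical helper, the "-latest" suffix follows the snippet in step 1, and the check that every alias pins the same version is an assumption based on that snippet pinning both packages to one release.

```typescript
// Hypothetical helper: reports problems with the "<name>-latest" npm aliases
// in a dependencies map. It does not read package.json itself.
type Dependencies = Record<string, string>;

function checkLatestAliases(deps: Dependencies, packages: string[]): string[] {
  const problems: string[] = [];
  const versions = new Set<string>();

  for (const name of packages) {
    const alias = `${name}-latest`;
    const prefix = `npm:${name}@`;
    const spec = deps[alias];

    if (spec === undefined) {
      problems.push(`missing alias "${alias}"`);
      continue;
    }
    if (!spec.startsWith(prefix)) {
      problems.push(`alias "${alias}" should start with "${prefix}"`);
      continue;
    }
    versions.add(spec.slice(prefix.length));
  }

  // Assumption: all aliased packages should come from the same release.
  if (versions.size > 1) {
    problems.push(`aliases pin different versions: ${Array.from(versions).join(", ")}`);
  }
  return problems;
}
```

An empty result means every listed package has a well-formed alias and all aliases agree on the version.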

Benchmarking with a Jazz context

When benchmarking CoValues (not standalone functions), create a full Jazz context. Use this helper pattern:
ts
async function createContext(tools: typeof localTools, wasmCrypto: typeof LocalWasmCrypto) {
  const ctx = await tools.createJazzContextForNewAccount({
    creationProps: { name: "Bench Account" },
    peers: [],
    crypto: await wasmCrypto.create(),
    sessionProvider: new tools.MockSessionProvider(),
  });
  return { account: ctx.account, node: ctx.node };
}
Key points:
  • Pass peers: [] because benchmarks don't need network sync
  • Use MockSessionProvider to avoid real session persistence
  • Call (ctx.node as any).gracefulShutdown() in after() to clean up

Test data strategy

Define a fixed data size constant at the top of the file, then generate test data inside the before hook:
ts
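const TOTAL_BYTES = 5 * 1024 * 1024; // 5 MB

// CHUNK_SIZE, makeChunks, doWork, and options are placeholders defined elsewhere in the file.
let chunks: Uint8Array[];

await cronometro({
  "operationName - @workspace": {
    async before() {
      // Fixture is built inside the hook, so it is created in the worker that runs test()
      chunks = makeChunks(TOTAL_BYTES, CHUNK_SIZE);
    },
    test() {
      doWork(chunks);
    },
  },
}, options);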
Choose a size large enough to measure meaningfully. Small data (e.g., 100 KB) may complete so quickly that measurement noise dominates. 5 MB is typically a good default for file/stream operations.
All fixture generation must be done inside the before hook, not at module level. This ensures data is created in the same worker thread that runs the test.
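Because each test's before hook builds its own fixture, randomly generated data will differ between the @latest and @workspace entries. If identical input matters for the comparison, a seeded generator is one option. The sketch below is an assumption rather than something the source prescribes: makeSeededChunks, the mulberry32-style generator, and the CHUNK_SIZE value are all illustrative.

```typescript
const TOTAL_BYTES = 5 * 1024 * 1024;
const CHUNK_SIZE = 64 * 1024; // illustrative value only, not taken from the source

// Small seeded PRNG in the style of mulberry32; returns numbers in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds chunks whose bytes depend only on the seed, so every call with the
// same arguments yields the same data. chunkSize must be greater than zero.
function makeSeededChunks(totalBytes: number, chunkSize: number, seed: number): Uint8Array[] {
  const next = mulberry32(seed);
  const chunks: Uint8Array[] = [];
  let remaining = totalBytes;
  while (remaining > 0) {
    const size = Math.min(chunkSize, remaining);
    const chunk = new Uint8Array(size);
    for (let i = 0; i < size; i++) {
      chunk[i] = Math.floor(next() * 256);
    }
    chunks.push(chunk);
    remaining -= size;
  }
  return chunks;
}
```

Inside a before hook this would be called as makeSeededChunks(TOTAL_BYTES, CHUNK_SIZE, 1) in both test entries, using the same seed for each.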

Running Benchmarks

Add a script entry to bench/package.json:
json
{
  "scripts": {
    "bench:mytest": "node --experimental-strip-types --no-warnings ./jazz-tools/mytest.jazz-tools.bench.ts"
  }
}
Then run from the bench/ directory:
sh
cd bench
pnpm run bench:mytest

Critical Gotchas

1. Use node --experimental-strip-types, NOT tsx

Cronometro spawns worker threads that re-import the benchmark file. Workers don't inherit tsx's custom ESM loader, so the TypeScript import fails silently and the benchmark hangs forever.
Use node --experimental-strip-types --no-warnings instead:
json
"bench:foo": "node --experimental-strip-types --no-warnings ./jazz-tools/foo.bench.ts"

2. before/after hooks MUST be async or accept a callback

Cronometro's lifecycle hooks expect either:
  • An async function (returns a Promise)
  • A function that accepts and calls a callback parameter
A plain synchronous function that does neither will silently prevent the test from ever starting, causing the benchmark to hang indefinitely:
ts
// BAD — test never starts, benchmark hangs
{
  before() {
    data = generateInput();  // sync, no callback, no promise
  },
  test() { ... },
}

// GOOD — async function returns a Promise
{
  async before() {
    data = generateInput();
  },
  test() { ... },
}

// ALSO GOOD — callback style
{
  before(cb: () => void) {
    data = generateInput();
    cb();
  },
  test() { ... },
}
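The hang described above can be reproduced without cronometro. The sketch below is a simplified model of a runner that waits for a hook to signal completion; it is not cronometro's actual source. runHook is a hypothetical helper, and the timeout exists only so the demonstration terminates instead of hanging.

```typescript
type Done = () => void;
type Hook = (cb: Done) => unknown;
type HookResult = "completed" | "timed out";

// Waits until the hook either calls its callback or resolves a returned Promise.
// A plain synchronous hook does neither, so only the timeout ends the wait.
function runHook(hook: Hook, timeoutMs: number): Promise<HookResult> {
  return new Promise<HookResult>((resolve) => {
    const timer = setTimeout(() => resolve("timed out"), timeoutMs);
    const done: Done = () => {
      clearTimeout(timer);
      resolve("completed");
    };

    const returned = hook(done);
    // Thenable check: async functions return a Promise that signals completion.
    if (returned && typeof (returned as PromiseLike<unknown>).then === "function") {
      (returned as PromiseLike<unknown>).then(done);
    }
  });
}

// The three hook styles from the examples above.
const asyncHook: Hook = async () => {};
const callbackHook: Hook = (cb) => cb();
const plainSyncHook: Hook = () => {};
```

In this model, asyncHook and callbackHook both complete, while plainSyncHook is only released by the timeout. A runner with no timeout would wait forever, which matches the hang described above.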

3. test() can be sync or async

Unlike before/after, the test function works correctly as a plain synchronous function. Make it async only if the code under test is genuinely asynchronous.

4. TypeScript constraints under --experimental-strip-types

Node's type stripping handles type annotations, "as" casts, and "!" (non-null) assertions, but it does not support:
  • enum declarations (use const objects instead)
  • namespace declarations
  • Parameter properties in constructors (constructor(private x: number))
  • Legacy import = / export = syntax
Keep benchmark files to simple TypeScript that only uses type annotations, interfaces, type aliases, and casts.
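Two of the unsupported constructs have direct replacements that use only erasable syntax. The sketch below shows one way to write them; Mode and SizedFixture are hypothetical names used purely for illustration.

```typescript
// Instead of `enum Mode { Read = "read", Write = "write" }`:
// a const object plus a type alias derived from its values.
const Mode = { Read: "read", Write: "write" } as const;
type Mode = (typeof Mode)[keyof typeof Mode];

// Instead of `constructor(private size: number)`:
// declare the field explicitly and assign it in the constructor body.
class SizedFixture {
  private size: number;

  constructor(size: number) {
    this.size = size;
  }

  describe(mode: Mode): string {
    return `${mode}:${this.size}`;
  }
}
```

Once the type alias, the annotations, and the "as const" assertion are removed, what remains is plain JavaScript, which is what type stripping requires.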

Example: Full Benchmark

This example shows a benchmark comparing getChunks() between the published package and workspace code:
ts
import cronometro from "cronometro";
import * as localTools from "jazz-tools";
import * as latestPublishedTools from "jazz-tools-latest";
import { WasmCrypto as LocalWasmCrypto } from "cojson/crypto/WasmCrypto";
import { cojsonInternals } from "cojson";
import { WasmCrypto as LatestPublishedWasmCrypto } from "cojson-latest/crypto/WasmCrypto";

const CHUNK_SIZE = cojsonInternals.TRANSACTION_CONFIG.MAX_RECOMMENDED_TX_SIZE;
const TOTAL_BYTES = 5 * 1024 * 1024;

function makeChunks(totalBytes: number, chunkSize: number): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  let remaining = totalBytes;
  while (remaining > 0) {
    const size = Math.min(chunkSize, remaining);
    const chunk = new Uint8Array(size);
    for (let i = 0; i < size; i++) {
      chunk[i] = Math.floor(Math.random() * 256);
    }
    chunks.push(chunk);
    remaining -= size;
  }
  return chunks;
}

type Tools = typeof localTools;

async function createContext(tools: Tools, wasmCrypto: typeof LocalWasmCrypto) {
  const ctx = await tools.createJazzContextForNewAccount({
    creationProps: { name: "Bench Account" },
    peers: [],
    crypto: await wasmCrypto.create(),
    sessionProvider: new tools.MockSessionProvider(),
  });
  return { account: ctx.account, node: ctx.node, FileStream: tools.FileStream };
}

function populateStream(ctx: Awaited<ReturnType<typeof createContext>>, chunks: Uint8Array[]) {
  let totalBytes = 0;
  for (const c of chunks) totalBytes += c.length;
  const stream = ctx.FileStream.create({ owner: ctx.account });
  stream.start({ mimeType: "application/octet-stream", totalSizeBytes: totalBytes });
  for (const chunk of chunks) stream.push(chunk);
  stream.end();
  return stream;
}

const benchOptions = {
  iterations: 50,
  warmup: true,
  print: { colors: true, compare: true },
  onTestError: (testName: string, error: unknown) => {
    console.error(`\nError in test "${testName}":`);
    console.error(error);
  },
};

let readCtx: Awaited<ReturnType<typeof createContext>>;
let readStream: ReturnType<typeof populateStream>;

await cronometro(
  {
    "getChunks - @latest": {
      async before() {
        readCtx = await createContext(
          // @ts-expect-error version mismatch
          latestPublishedTools,
          LatestPublishedWasmCrypto,
        );
        readStream = populateStream(readCtx, makeChunks(TOTAL_BYTES, CHUNK_SIZE));
      },
      test() {
        readStream.getChunks();
      },
      async after() {
        (readCtx.node as any).gracefulShutdown();
      },
    },
    "getChunks - @workspace": {
      async before() {
        readCtx = await createContext(localTools, LocalWasmCrypto);
        readStream = populateStream(readCtx, makeChunks(TOTAL_BYTES, CHUNK_SIZE));
      },
      test() {
        readStream.getChunks();
      },
      async after() {
        (readCtx.node as any).gracefulShutdown();
      },
    },
  },
  benchOptions,
);

Checklist

  • One benchmark file per operation (e.g., filestream.getChunks.bench.ts)
  • Single cronometro() call comparing @latest vs @workspace
  • Fixed data size constant at top of file (e.g., const TOTAL_BYTES = 5 * 1024 * 1024)
  • Benchmark file placed in bench/jazz-tools/ with *.bench.ts naming
  • Script added to bench/package.json using node --experimental-strip-types --no-warnings
  • before/after hooks are async (not plain sync)
  • iterations set to at least 50 for stable results
  • warmup: true enabled
  • onTestError handler included to surface worker failures
  • Test names follow the format "operation - @implementation" (e.g., "getChunks - @workspace")
  • When comparing vs published: npm aliases added to bench/package.json and pnpm install run
  • When using Jazz context: gracefulShutdown() called in after() hook
  • Test data generated inside before() hooks (not at module level or inside test())
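A few of the checklist items concern the options object and the test names, so they can be checked in code. The following is a minimal sketch under stated assumptions: checkBenchOptions, checkTestNames, and BenchOptionsLike are hypothetical and cover only the options that appear in this document, not cronometro's full option set.

```typescript
// Subset of the benchmark options used in this document.
interface BenchOptionsLike {
  iterations?: number;
  warmup?: boolean;
  onTestError?: (testName: string, error: unknown) => void;
}

// Returns a list of checklist violations; an empty list means all checks passed.
function checkBenchOptions(options: BenchOptionsLike): string[] {
  const problems: string[] = [];
  if ((options.iterations ?? 0) < 50) problems.push("iterations should be at least 50");
  if (options.warmup !== true) problems.push("warmup should be true");
  if (typeof options.onTestError !== "function") problems.push("onTestError handler is missing");
  return problems;
}

// Returns the test names that do not follow "operation - @implementation".
function checkTestNames(names: string[]): string[] {
  const pattern = /^\S.* - @\S+$/;
  return names.filter((name) => !pattern.test(name));
}
```

The remaining items, such as file placement and generating data inside before(), concern file layout and code structure and are not covered by this sketch.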