Transformers.js - Machine Learning for JavaScript

Transformers.js enables running state-of-the-art machine learning models directly in JavaScript, both in browsers and Node.js environments, with no server required.

When to Use This Skill

Use this skill when you need to:
  • Run ML models for text analysis, generation, or translation in JavaScript
  • Perform image classification, object detection, or segmentation
  • Implement speech recognition or audio processing
  • Build multimodal AI applications (text-to-image, image-to-text, etc.)
  • Run models client-side in the browser without a backend

Installation

NPM Installation

bash
npm install @huggingface/transformers

Browser Usage (CDN)

javascript
<script type="module">
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
</script>

Core Concepts

1. Pipeline API

The pipeline API is the easiest way to use models. It groups together preprocessing, model inference, and postprocessing:
javascript
import { pipeline } from '@huggingface/transformers';

// Create a pipeline for a specific task
const pipe = await pipeline('sentiment-analysis');

// Use the pipeline
const result = await pipe('I love transformers!');
// Output: [{ label: 'POSITIVE', score: 0.999817686 }]

// IMPORTANT: Always dispose when done to free memory
await pipe.dispose();
⚠️ Memory Management: All pipelines must be disposed with pipe.dispose() when finished to prevent memory leaks. See Code Examples for cleanup patterns across different environments.

2. Model Selection

You can specify a custom model as the second argument:
javascript
const pipe = await pipeline(
  'sentiment-analysis',
  'Xenova/bert-base-multilingual-uncased-sentiment'
);
Finding Models: Browse available Transformers.js models on the Hugging Face Hub at https://huggingface.co/models?library=transformers.js&sort=trending
Tip: Filter by task type, sort by trending/downloads, and check model cards for performance metrics and usage examples.

3. Device Selection

Choose where to run the model:
javascript
// Run on CPU via WASM (default)
const cpuPipe = await pipeline('sentiment-analysis', 'model-id');

// Run on GPU (WebGPU - experimental)
const gpuPipe = await pipeline('sentiment-analysis', 'model-id', {
  device: 'webgpu',
});

4. Quantization Options

Control model precision vs. performance:
javascript
// Use quantized model (faster, smaller)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q4',  // Options: 'fp32', 'fp16', 'q8', 'q4'
});
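The dtype tradeoff above can be approximated with simple arithmetic. The sketch below is a rough estimate only (real ONNX exports add graph, metadata, and tokenizer overhead, and 4-bit packing schemes vary between models):

```javascript
// Approximate model download size from parameter count and precision.
// NOTE: ballpark figures only - actual ONNX files are larger.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, q8: 1, q4: 0.5 };

function estimateSizeMB(numParams, dtype) {
  return (numParams * BYTES_PER_PARAM[dtype]) / (1024 * 1024);
}

// A 270M-parameter model at each precision level:
for (const dtype of ['fp32', 'fp16', 'q8', 'q4']) {
  console.log(`${dtype}: ~${Math.round(estimateSizeMB(270e6, dtype))} MB`);
}
```

This is why q4 is the usual starting point in browsers: roughly an eighth of the fp32 download.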

Supported Tasks

Note: All examples below show basic usage.

Natural Language Processing

Text Classification

javascript
const classifier = await pipeline('text-classification');
const result = await classifier('This movie was amazing!');

Named Entity Recognition (NER)

javascript
const ner = await pipeline('token-classification');
const entities = await ner('My name is John and I live in New York.');

Question Answering

javascript
const qa = await pipeline('question-answering');
const answer = await qa({
  question: 'What is the capital of France?',
  context: 'Paris is the capital and largest city of France.'
});

Text Generation

javascript
const generator = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX');
const text = await generator('Once upon a time', {
  max_new_tokens: 100,
  temperature: 0.7
});
For streaming and chat, see the Text Generation Guide for:
  • Streaming token-by-token output with TextStreamer
  • Chat/conversation format with system/user/assistant roles
  • Generation parameters (temperature, top_k, top_p)
  • Browser and Node.js examples
  • React components and API endpoints

Translation

javascript
const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
const output = await translator('Hello, how are you?', {
  src_lang: 'eng_Latn',
  tgt_lang: 'fra_Latn'
});

Summarization

javascript
const summarizer = await pipeline('summarization');
const summary = await summarizer(longText, {
  max_length: 100,
  min_length: 30
});

Zero-Shot Classification

javascript
const classifier = await pipeline('zero-shot-classification');
const result = await classifier('This is a story about sports.', ['politics', 'sports', 'technology']);

Computer Vision

Image Classification

javascript
const classifier = await pipeline('image-classification');
const result = await classifier('https://example.com/image.jpg');
// Or with a local file / object URL
const localResult = await classifier(imageUrl);

Object Detection

javascript
const detector = await pipeline('object-detection');
const objects = await detector('https://example.com/image.jpg');
// Returns: [{ label: 'person', score: 0.95, box: { xmin, ymin, xmax, ymax } }, ...]

Image Segmentation

javascript
const segmenter = await pipeline('image-segmentation');
const segments = await segmenter('https://example.com/image.jpg');

Depth Estimation

javascript
const depthEstimator = await pipeline('depth-estimation');
const depth = await depthEstimator('https://example.com/image.jpg');

Zero-Shot Image Classification

javascript
const classifier = await pipeline('zero-shot-image-classification');
const result = await classifier('image.jpg', ['cat', 'dog', 'bird']);

Audio Processing

Automatic Speech Recognition

javascript
const transcriber = await pipeline('automatic-speech-recognition');
const result = await transcriber('audio.wav');
// Returns: { text: 'transcribed text here' }

Audio Classification

javascript
const classifier = await pipeline('audio-classification');
const result = await classifier('audio.wav');

Text-to-Speech

javascript
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts');
const audio = await synthesizer('Hello, this is a test.', {
  speaker_embeddings: speakerEmbeddings
});

Multimodal

Image-to-Text (Image Captioning)

javascript
const captioner = await pipeline('image-to-text');
const caption = await captioner('image.jpg');

Document Question Answering

javascript
const docQA = await pipeline('document-question-answering');
const answer = await docQA('document-image.jpg', 'What is the total amount?');

Zero-Shot Object Detection

javascript
const detector = await pipeline('zero-shot-object-detection');
const objects = await detector('image.jpg', ['person', 'car', 'tree']);

Feature Extraction (Embeddings)

javascript
const extractor = await pipeline('feature-extraction');
const embeddings = await extractor('This is a sentence to embed.');
// Returns: tensor of shape [1, sequence_length, hidden_size]

// For sentence embeddings (mean pooling + normalization)
const sentenceExtractor = await pipeline('feature-extraction', 'onnx-community/all-MiniLM-L6-v2-ONNX');
const sentenceEmbedding = await sentenceExtractor('Text to embed', { pooling: 'mean', normalize: true });
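Sentence embeddings produced with pooling: 'mean' and normalize: true are typically compared via cosine similarity. A plain-JavaScript helper (real inputs would be the numeric arrays taken from the extractor's output tensor):

```javascript
// Cosine similarity between two equal-length numeric vectors.
// With { normalize: true } the embeddings are unit-length, so the plain
// dot product already equals cosine similarity; the norms below make the
// helper safe for un-normalized vectors too.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors for illustration (real inputs would be embedding arrays):
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```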

Finding and Choosing Models

Browsing the Hugging Face Hub

Discover compatible Transformers.js models on Hugging Face Hub:
Base URL (all models):
https://huggingface.co/models?library=transformers.js&sort=trending
Filter by task using the pipeline_tag parameter:
  • Text Generation: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
  • Text Classification: https://huggingface.co/models?pipeline_tag=text-classification&library=transformers.js&sort=trending
  • Translation: https://huggingface.co/models?pipeline_tag=translation&library=transformers.js&sort=trending
  • Summarization: https://huggingface.co/models?pipeline_tag=summarization&library=transformers.js&sort=trending
  • Question Answering: https://huggingface.co/models?pipeline_tag=question-answering&library=transformers.js&sort=trending
  • Image Classification: https://huggingface.co/models?pipeline_tag=image-classification&library=transformers.js&sort=trending
  • Object Detection: https://huggingface.co/models?pipeline_tag=object-detection&library=transformers.js&sort=trending
  • Image Segmentation: https://huggingface.co/models?pipeline_tag=image-segmentation&library=transformers.js&sort=trending
  • Speech Recognition: https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=transformers.js&sort=trending
  • Audio Classification: https://huggingface.co/models?pipeline_tag=audio-classification&library=transformers.js&sort=trending
  • Image-to-Text: https://huggingface.co/models?pipeline_tag=image-to-text&library=transformers.js&sort=trending
  • Feature Extraction: https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js&sort=trending
  • Zero-Shot Classification: https://huggingface.co/models?pipeline_tag=zero-shot-classification&library=transformers.js&sort=trending
Sort options:
  • &sort=trending - Most popular recently
  • &sort=downloads - Most downloaded overall
  • &sort=likes - Most liked by community
  • &sort=modified - Recently updated
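The filter URLs above all follow one pattern, so they can be built programmatically; a small helper for illustration (hubModelsUrl is a hypothetical name, not part of any library):

```javascript
// Build a Hugging Face Hub search URL for Transformers.js-compatible models.
// `task` is a pipeline_tag value such as 'text-generation'; `sort` is one of
// 'trending', 'downloads', 'likes', 'modified'.
function hubModelsUrl({ task, sort = 'trending' } = {}) {
  const params = new URLSearchParams();
  if (task) params.set('pipeline_tag', task);
  params.set('library', 'transformers.js');
  params.set('sort', sort);
  return `https://huggingface.co/models?${params.toString()}`;
}

console.log(hubModelsUrl({ task: 'text-generation' }));
// https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
```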

Choosing the Right Model

Consider these factors when selecting a model:
1. Model Size
  • Small (< 100MB): Fast, suitable for browsers, limited accuracy
  • Medium (100MB - 500MB): Balanced performance, good for most use cases
  • Large (> 500MB): High accuracy, slower, better for Node.js or powerful devices
2. Quantization: Models are often available in different quantization levels:
  • fp32 - Full precision (largest, most accurate)
  • fp16 - Half precision (smaller, still accurate)
  • q8 - 8-bit quantized (much smaller, slight accuracy loss)
  • q4 - 4-bit quantized (smallest, noticeable accuracy loss)
3. Task Compatibility: Check the model card for:
  • Supported tasks (some models support multiple tasks)
  • Input/output formats
  • Language support (multilingual vs. English-only)
  • License restrictions
4. Performance Metrics: Model cards typically show:
  • Accuracy scores
  • Benchmark results
  • Inference speed
  • Memory requirements

Example: Finding a Text Generation Model

javascript
// 1. Visit: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending

// 2. Browse and select a model (e.g., onnx-community/gemma-3-270m-it-ONNX)

// 3. Check model card for:
//    - Model size: ~270M parameters
//    - Quantization: q4 available
//    - Language: English
//    - Use case: Instruction-following chat

// 4. Use the model:
import { pipeline } from '@huggingface/transformers';

const generator = await pipeline(
  'text-generation',
  'onnx-community/gemma-3-270m-it-ONNX',
  { dtype: 'q4' } // Use quantized version for faster inference
);

const output = await generator('Explain quantum computing in simple terms.', {
  max_new_tokens: 100
});

await generator.dispose();

Tips for Model Selection

  1. Start Small: Test with a smaller model first, then upgrade if needed
  2. Check ONNX Support: Ensure the model has ONNX files (look for an onnx folder in the model repo)
  3. Read Model Cards: Model cards contain usage examples, limitations, and benchmarks
  4. Test Locally: Benchmark inference speed and memory usage in your environment
  5. Community Models: Look for models by Xenova (the Transformers.js maintainer) or onnx-community
  6. Version Pin: Use specific git commits in production for stability:
    javascript
    const pipe = await pipeline('task', 'model-id', { revision: 'abc123' });

Advanced Configuration

Environment Configuration (env)

The env object provides comprehensive control over Transformers.js execution, caching, and model loading.
Quick Overview:
javascript
import { env } from '@huggingface/transformers';

// View version
console.log(env.version); // e.g., '3.8.1'

// Common settings
env.allowRemoteModels = true;  // Load from Hugging Face Hub
env.allowLocalModels = false;  // Load from file system
env.localModelPath = '/models/'; // Local model directory
env.useFSCache = true;         // Cache models on disk (Node.js)
env.useBrowserCache = true;    // Cache models in browser
env.cacheDir = './.cache';     // Cache directory location
Configuration Patterns:
javascript
// Development: Fast iteration with remote models
env.allowRemoteModels = true;
env.useFSCache = true;

// Production: Local models only
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.localModelPath = '/app/models/';

// Custom CDN
env.remoteHost = 'https://cdn.example.com/models';

// Disable caching (testing)
env.useFSCache = false;
env.useBrowserCache = false;
For complete documentation on all configuration options, caching strategies, cache management, pre-downloading models, and more, see:
Configuration Reference

Working with Tensors

javascript
import { AutoTokenizer, AutoModel } from '@huggingface/transformers';

// Load tokenizer and model separately for more control
const tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
const model = await AutoModel.from_pretrained('bert-base-uncased');

// Tokenize input
const inputs = await tokenizer('Hello world!');

// Run model
const outputs = await model(inputs);

Batch Processing

javascript
const classifier = await pipeline('sentiment-analysis');

// Process multiple texts
const results = await classifier([
  'I love this!',
  'This is terrible.',
  'It was okay.'
]);
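For very large input arrays it can help to feed the pipeline fixed-size batches instead of one huge array; a plain chunking helper (the batch size in the usage comment is an arbitrary example):

```javascript
// Split an array into chunks of at most `size` elements.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

console.log(chunk(['a', 'b', 'c', 'd', 'e'], 2)); // three chunks: a/b, c/d, e

// Hypothetical usage with a classifier pipeline:
// for (const batch of chunk(texts, 16)) results.push(...await classifier(batch));
```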

Browser-Specific Considerations

WebGPU Usage

WebGPU provides GPU acceleration in browsers:
javascript
const pipe = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX', {
  device: 'webgpu',
  dtype: 'fp32'
});
Note: WebGPU is experimental. Check browser compatibility and file issues if problems occur.
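Because WebGPU support varies by browser, a common pattern is to feature-detect navigator.gpu and fall back to the default WASM backend; a minimal sketch:

```javascript
// Feature-detect WebGPU; navigator.gpu is the standard WebGPU entry point
// and is undefined in browsers without support (and in Node.js).
function pickDevice() {
  const hasWebGPU = typeof navigator !== 'undefined' && 'gpu' in navigator;
  return hasWebGPU ? 'webgpu' : 'wasm';
}

// Hypothetical usage:
// const pipe = await pipeline('text-generation', 'model-id', { device: pickDevice() });
console.log(pickDevice()); // 'wasm' wherever WebGPU is unavailable
```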

WASM Performance

Default browser execution uses WASM:
javascript
// Optimized for browsers with quantization
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q8'  // or 'q4' for even smaller size
});

Progress Tracking & Loading Indicators

Models can be large (ranging from a few MB to several GB) and consist of multiple files. Track download progress by passing a callback to the pipeline() function:
javascript
import { pipeline } from '@huggingface/transformers';

// Track progress for each file
const fileProgress = {};

function onProgress(info) {
  console.log(`${info.status}: ${info.file}`);
  
  if (info.status === 'progress') {
    fileProgress[info.file] = info.progress;
    console.log(`${info.file}: ${info.progress.toFixed(1)}%`);
  }
  
  if (info.status === 'done') {
    console.log(`${info.file} complete`);
  }
}

// Pass callback to pipeline
const classifier = await pipeline('sentiment-analysis', null, {
  progress_callback: onProgress
});
Progress Info Properties:
typescript
interface ProgressInfo {
  status: 'initiate' | 'download' | 'progress' | 'done' | 'ready';
  name: string;      // Model id or path
  file: string;      // File being processed
  progress?: number; // Percentage (0-100, only for 'progress' status)
  loaded?: number;   // Bytes downloaded (only for 'progress' status)
  total?: number;    // Total bytes (only for 'progress' status)
}
For complete examples including browser UIs, React components, CLI progress bars, and retry logic, see:
Pipeline Options - Progress Callback
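To drive a single loading bar, the per-file loaded/total byte counts can be aggregated into one percentage; a minimal sketch (the file names in the simulated events are illustrative):

```javascript
// Aggregate per-file { loaded, total } byte counts into one overall percentage.
const files = new Map();

function onProgress(info) {
  if (info.status === 'progress') {
    files.set(info.file, { loaded: info.loaded, total: info.total });
  }
}

function overallPercent() {
  let loaded = 0;
  let total = 0;
  for (const f of files.values()) {
    loaded += f.loaded;
    total += f.total;
  }
  return total === 0 ? 0 : (100 * loaded) / total;
}

// Simulated events (real ones arrive via progress_callback):
onProgress({ status: 'progress', file: 'model.onnx', loaded: 50, total: 100 });
onProgress({ status: 'progress', file: 'tokenizer.json', loaded: 10, total: 20 });
console.log(overallPercent()); // 50
```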

Error Handling

javascript
try {
  const pipe = await pipeline('sentiment-analysis', 'model-id');
  const result = await pipe('text to analyze');
} catch (error) {
  if (error.message.includes('fetch')) {
    console.error('Model download failed. Check internet connection.');
  } else if (error.message.includes('ONNX')) {
    console.error('Model execution failed. Check model compatibility.');
  } else {
    console.error('Unknown error:', error);
  }
}
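Transient download failures (the 'fetch' branch above) are often worth retrying. A generic retry helper with exponential backoff, shown here as an illustrative sketch:

```javascript
// Retry an async factory with simple exponential backoff.
// `attempts` and `baseDelayMs` are illustrative defaults, not library options.
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait 500ms, 1000ms, 2000ms, ... between attempts.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Hypothetical usage:
// const pipe = await withRetry(() => pipeline('sentiment-analysis', 'model-id'));
```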

Performance Tips

  1. Reuse Pipelines: Create a pipeline once, reuse it for multiple inferences
  2. Use Quantization: Start with q8 or q4 for faster inference
  3. Batch Processing: Process multiple inputs together when possible
  4. Cache Models: Models are cached automatically (see the Caching Reference for details on the browser Cache API, Node.js filesystem cache, and custom implementations)
  5. WebGPU for Large Models: Use WebGPU for models that benefit from GPU acceleration
  6. Prune Context: For text generation, limit max_new_tokens to avoid memory issues
  7. Clean Up Resources: Call pipe.dispose() when done to free memory
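Tip 1 (reusing pipelines) is often implemented as a lazily initialized cache keyed by task and model. A sketch with illustrative names; the stub factory at the end only demonstrates that creation happens once:

```javascript
// Create each pipeline at most once and share it across callers.
// The cache stores the creation promise, so concurrent callers awaiting
// the same key share a single model load.
const pipelineCache = new Map();

function getPipeline(task, model, factory) {
  const key = `${task}:${model ?? 'default'}`;
  if (!pipelineCache.has(key)) {
    pipelineCache.set(key, factory());
  }
  return pipelineCache.get(key);
}

// Hypothetical usage with Transformers.js:
// const pipe = await getPipeline('sentiment-analysis', null,
//   () => pipeline('sentiment-analysis'));

// Stub factory demonstrating that creation happens only once:
let calls = 0;
const fakeFactory = () => { calls += 1; return Promise.resolve({ ready: true }); };
getPipeline('demo', null, fakeFactory);
getPipeline('demo', null, fakeFactory);
console.log(calls); // 1
```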

Memory Management

IMPORTANT: Always call pipe.dispose() when finished to prevent memory leaks.
javascript
const pipe = await pipeline('sentiment-analysis');
const result = await pipe('Great product!');
await pipe.dispose();  // ✓ Free memory (100MB - several GB per model)
When to dispose:
  • Application shutdown or component unmount
  • Before loading a different model
  • After batch processing in long-running apps
Models consume significant memory and hold GPU/CPU resources. Disposal is critical for browser memory limits and server stability.
For detailed patterns (React cleanup, servers, browser), see Code Examples

Troubleshooting

Model Not Found

  • Verify the model exists on the Hugging Face Hub
  • Check the model name spelling
  • Ensure the model has ONNX files (look for an onnx folder in the model repo)

Memory Issues

  • Use smaller models or quantized versions (dtype: 'q4')
  • Reduce batch size
  • Limit sequence length with max_length

WebGPU Errors

  • Check browser compatibility (Chrome 113+, Edge 113+)
  • Try dtype: 'fp16' if fp32 fails
  • Fall back to WASM if WebGPU is unavailable

Reference Documentation

This Skill

  • Pipeline Options - Configure pipeline() with progress_callback, device, dtype, etc.
  • Configuration Reference - Global env configuration for caching and model loading
  • Caching Reference - Browser Cache API, Node.js filesystem cache, and custom cache implementations
  • Text Generation Guide - Streaming, chat format, and generation parameters
  • Model Architectures - Supported models and selection tips
  • Code Examples - Real-world implementations for different runtimes

Official Transformers.js

Best Practices

  1. Always Dispose Pipelines: Call pipe.dispose() when done - critical for preventing memory leaks
  2. Start with Pipelines: Use the pipeline API unless you need fine-grained control
  3. Test Locally First: Test models with small inputs before deploying
  4. Monitor Model Sizes: Be aware of model download sizes for web applications
  5. Handle Loading States: Show progress indicators for better UX
  6. Version Pin: Pin specific model versions for production stability
  7. Error Boundaries: Always wrap pipeline calls in try-catch blocks
  8. Progressive Enhancement: Provide fallbacks for unsupported browsers
  9. Reuse Models: Load once, use many times - don't recreate pipelines unnecessarily
  10. Graceful Shutdown: Dispose models on SIGTERM/SIGINT in servers
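Best practice 10 (graceful shutdown) can be sketched for a Node.js server as follows; loadedPipelines and disposeAll are illustrative names, and real entries would be pipelines created with pipeline():

```javascript
// Dispose all loaded pipelines before a Node.js server exits.
const loadedPipelines = [];

async function disposeAll() {
  await Promise.all(loadedPipelines.map((p) => p.dispose()));
  loadedPipelines.length = 0; // forget the disposed pipelines
}

// Register once at startup: dispose on SIGTERM/SIGINT, then exit cleanly.
for (const signal of ['SIGTERM', 'SIGINT']) {
  process.once(signal, async () => {
    await disposeAll();
    process.exit(0);
  });
}
```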

Quick Reference: Task IDs

  • Text classification: text-classification or sentiment-analysis
  • Token classification: token-classification or ner
  • Question answering: question-answering
  • Fill mask: fill-mask
  • Summarization: summarization
  • Translation: translation
  • Text generation: text-generation
  • Text-to-text generation: text2text-generation
  • Zero-shot classification: zero-shot-classification
  • Image classification: image-classification
  • Image segmentation: image-segmentation
  • Object detection: object-detection
  • Depth estimation: depth-estimation
  • Image-to-image: image-to-image
  • Zero-shot image classification: zero-shot-image-classification
  • Zero-shot object detection: zero-shot-object-detection
  • Automatic speech recognition: automatic-speech-recognition
  • Audio classification: audio-classification
  • Text-to-speech: text-to-speech or text-to-audio
  • Image-to-text: image-to-text
  • Document question answering: document-question-answering
  • Feature extraction: feature-extraction
  • Sentence similarity: sentence-similarity

This skill enables you to integrate state-of-the-art machine learning capabilities directly into JavaScript applications without requiring separate ML servers or Python environments.