Transformers.js - Machine Learning for JavaScript

Transformers.js enables running state-of-the-art machine learning models directly in JavaScript, both in browsers and Node.js environments, with no server required.

When to Use This Skill

Use this skill when you need to:
  • Run ML models for text analysis, generation, or translation in JavaScript
  • Perform image classification, object detection, or segmentation
  • Implement speech recognition or audio processing
  • Build multimodal AI applications (text-to-image, image-to-text, etc.)
  • Run models client-side in the browser without a backend

Installation

NPM Installation

bash
npm install @huggingface/transformers

Browser Usage (CDN)

javascript
<script type="module">
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
</script>

Core Concepts

1. Pipeline API

The pipeline API is the easiest way to use models. It groups together preprocessing, model inference, and postprocessing:
javascript
import { pipeline } from '@huggingface/transformers';

// Create a pipeline for a specific task
const pipe = await pipeline('sentiment-analysis');

// Use the pipeline
const result = await pipe('I love transformers!');
// Output: [{ label: 'POSITIVE', score: 0.999817686 }]

// IMPORTANT: Always dispose when done to free memory
await pipe.dispose();
⚠️ Memory Management: All pipelines must be disposed with pipe.dispose() when finished to prevent memory leaks. See Code Examples for cleanup patterns across different environments.

2. Model Selection

You can specify a custom model as the second argument:
javascript
const pipe = await pipeline(
  'sentiment-analysis',
  'Xenova/bert-base-multilingual-uncased-sentiment'
);
Finding Models: Browse available Transformers.js models on the Hugging Face Hub at https://huggingface.co/models?library=transformers.js&sort=trending
Tip: Filter by task type, sort by trending/downloads, and check model cards for performance metrics and usage examples.

3. Device Selection

Choose where to run the model:
javascript
// Run on CPU via WASM (default)
const cpuPipe = await pipeline('sentiment-analysis', 'model-id');

// Run on GPU (WebGPU - experimental)
const gpuPipe = await pipeline('sentiment-analysis', 'model-id', {
  device: 'webgpu',
});

4. Quantization Options

Control model precision vs. performance:
javascript
// Use quantized model (faster, smaller)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q4',  // Options: 'fp32', 'fp16', 'q8', 'q4'
});
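The dtype tradeoff above can be approximated with simple arithmetic. The sketch below is a rough estimate only (real ONNX exports add graph, metadata, and tokenizer overhead, and 4-bit packing schemes vary between models):

```javascript
// Approximate model download size from parameter count and precision.
// NOTE: ballpark figures only - actual ONNX files are larger.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, q8: 1, q4: 0.5 };

function estimateSizeMB(numParams, dtype) {
  return (numParams * BYTES_PER_PARAM[dtype]) / (1024 * 1024);
}

// A 270M-parameter model at each precision level:
for (const dtype of ['fp32', 'fp16', 'q8', 'q4']) {
  console.log(`${dtype}: ~${Math.round(estimateSizeMB(270e6, dtype))} MB`);
}
```

This is why q4 is the usual starting point in browsers: roughly an eighth of the fp32 download.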

Supported Tasks

Note: All examples below show basic usage.

Natural Language Processing

Text Classification

javascript
const classifier = await pipeline('text-classification');
const result = await classifier('This movie was amazing!');

Named Entity Recognition (NER)

javascript
const ner = await pipeline('token-classification');
const entities = await ner('My name is John and I live in New York.');

Question Answering

javascript
const qa = await pipeline('question-answering');
const answer = await qa({
  question: 'What is the capital of France?',
  context: 'Paris is the capital and largest city of France.'
});

Text Generation

javascript
const generator = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX');
const text = await generator('Once upon a time', {
  max_new_tokens: 100,
  temperature: 0.7
});
For streaming and chat, see the Text Generation Guide for:
  • Streaming token-by-token output with TextStreamer
  • Chat/conversation format with system/user/assistant roles
  • Generation parameters (temperature, top_k, top_p)
  • Browser and Node.js examples
  • React components and API endpoints

Translation

javascript
const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
const output = await translator('Hello, how are you?', {
  src_lang: 'eng_Latn',
  tgt_lang: 'fra_Latn'
});

Summarization

javascript
const summarizer = await pipeline('summarization');
const summary = await summarizer(longText, {
  max_length: 100,
  min_length: 30
});

Zero-Shot Classification

javascript
const classifier = await pipeline('zero-shot-classification');
const result = await classifier('This is a story about sports.', ['politics', 'sports', 'technology']);

Computer Vision

Image Classification

javascript
const classifier = await pipeline('image-classification');
const result = await classifier('https://example.com/image.jpg');
// Or with a local file / object URL
const localResult = await classifier(imageUrl);

Object Detection

javascript
const detector = await pipeline('object-detection');
const objects = await detector('https://example.com/image.jpg');
// Returns: [{ label: 'person', score: 0.95, box: { xmin, ymin, xmax, ymax } }, ...]

Image Segmentation

javascript
const segmenter = await pipeline('image-segmentation');
const segments = await segmenter('https://example.com/image.jpg');

Depth Estimation

javascript
const depthEstimator = await pipeline('depth-estimation');
const depth = await depthEstimator('https://example.com/image.jpg');

Zero-Shot Image Classification

javascript
const classifier = await pipeline('zero-shot-image-classification');
const result = await classifier('image.jpg', ['cat', 'dog', 'bird']);

Audio Processing

Automatic Speech Recognition

javascript
const transcriber = await pipeline('automatic-speech-recognition');
const result = await transcriber('audio.wav');
// Returns: { text: 'transcribed text here' }

Audio Classification

javascript
const classifier = await pipeline('audio-classification');
const result = await classifier('audio.wav');

Text-to-Speech

javascript
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts');
const audio = await synthesizer('Hello, this is a test.', {
  speaker_embeddings: speakerEmbeddings
});

Multimodal

Image-to-Text (Image Captioning)

javascript
const captioner = await pipeline('image-to-text');
const caption = await captioner('image.jpg');

Document Question Answering

javascript
const docQA = await pipeline('document-question-answering');
const answer = await docQA('document-image.jpg', 'What is the total amount?');

Zero-Shot Object Detection

javascript
const detector = await pipeline('zero-shot-object-detection');
const objects = await detector('image.jpg', ['person', 'car', 'tree']);

Feature Extraction (Embeddings)

javascript
const extractor = await pipeline('feature-extraction');
const embeddings = await extractor('This is a sentence to embed.');
// Returns: tensor of shape [1, sequence_length, hidden_size]

// For sentence embeddings (mean pooling + normalization)
const sentenceExtractor = await pipeline('feature-extraction', 'onnx-community/all-MiniLM-L6-v2-ONNX');
const sentenceEmbedding = await sentenceExtractor('Text to embed', { pooling: 'mean', normalize: true });
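Sentence embeddings produced with pooling: 'mean' and normalize: true are typically compared via cosine similarity. A plain-JavaScript helper (real inputs would be the numeric arrays taken from the extractor's output tensor):

```javascript
// Cosine similarity between two equal-length numeric vectors.
// With { normalize: true } the embeddings are unit-length, so the plain
// dot product already equals cosine similarity; the norms below make the
// helper safe for un-normalized vectors too.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors for illustration (real inputs would be embedding arrays):
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```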

Finding and Choosing Models

Browsing the Hugging Face Hub

Discover compatible Transformers.js models on Hugging Face Hub:
Base URL (all models):
https://huggingface.co/models?library=transformers.js&sort=trending
Filter by task using the pipeline_tag parameter:
  • Text Generation: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
  • Text Classification: https://huggingface.co/models?pipeline_tag=text-classification&library=transformers.js&sort=trending
  • Translation: https://huggingface.co/models?pipeline_tag=translation&library=transformers.js&sort=trending
  • Summarization: https://huggingface.co/models?pipeline_tag=summarization&library=transformers.js&sort=trending
  • Question Answering: https://huggingface.co/models?pipeline_tag=question-answering&library=transformers.js&sort=trending
  • Image Classification: https://huggingface.co/models?pipeline_tag=image-classification&library=transformers.js&sort=trending
  • Object Detection: https://huggingface.co/models?pipeline_tag=object-detection&library=transformers.js&sort=trending
  • Image Segmentation: https://huggingface.co/models?pipeline_tag=image-segmentation&library=transformers.js&sort=trending
  • Speech Recognition: https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=transformers.js&sort=trending
  • Audio Classification: https://huggingface.co/models?pipeline_tag=audio-classification&library=transformers.js&sort=trending
  • Image-to-Text: https://huggingface.co/models?pipeline_tag=image-to-text&library=transformers.js&sort=trending
  • Feature Extraction: https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js&sort=trending
  • Zero-Shot Classification: https://huggingface.co/models?pipeline_tag=zero-shot-classification&library=transformers.js&sort=trending
Sort options:
  • &sort=trending - Most popular recently
  • &sort=downloads - Most downloaded overall
  • &sort=likes - Most liked by community
  • &sort=modified - Recently updated
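The filter URLs above all follow one pattern, so they can be built programmatically; a small helper for illustration (hubModelsUrl is a hypothetical name, not part of any library):

```javascript
// Build a Hugging Face Hub search URL for Transformers.js-compatible models.
// `task` is a pipeline_tag value such as 'text-generation'; `sort` is one of
// 'trending', 'downloads', 'likes', 'modified'.
function hubModelsUrl({ task, sort = 'trending' } = {}) {
  const params = new URLSearchParams();
  if (task) params.set('pipeline_tag', task);
  params.set('library', 'transformers.js');
  params.set('sort', sort);
  return `https://huggingface.co/models?${params.toString()}`;
}

console.log(hubModelsUrl({ task: 'text-generation' }));
// https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
```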

Choosing the Right Model

Consider these factors when selecting a model:
1. Model Size
  • Small (< 100MB): Fast, suitable for browsers, limited accuracy
  • Medium (100MB - 500MB): Balanced performance, good for most use cases
  • Large (> 500MB): High accuracy, slower, better for Node.js or powerful devices
2. Quantization: Models are often available in different quantization levels:
  • fp32 - Full precision (largest, most accurate)
  • fp16 - Half precision (smaller, still accurate)
  • q8 - 8-bit quantized (much smaller, slight accuracy loss)
  • q4 - 4-bit quantized (smallest, noticeable accuracy loss)
3. Task Compatibility: Check the model card for:
  • Supported tasks (some models support multiple tasks)
  • Input/output formats
  • Language support (multilingual vs. English-only)
  • License restrictions
4. Performance Metrics: Model cards typically show:
  • Accuracy scores
  • Benchmark results
  • Inference speed
  • Memory requirements

Example: Finding a Text Generation Model

javascript
// 1. Visit: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending

// 2. Browse and select a model (e.g., onnx-community/gemma-3-270m-it-ONNX)

// 3. Check model card for:
//    - Model size: ~270M parameters
//    - Quantization: q4 available
//    - Language: English
//    - Use case: Instruction-following chat

// 4. Use the model:
import { pipeline } from '@huggingface/transformers';

const generator = await pipeline(
  'text-generation',
  'onnx-community/gemma-3-270m-it-ONNX',
  { dtype: 'q4' } // Use quantized version for faster inference
);

const output = await generator('Explain quantum computing in simple terms.', {
  max_new_tokens: 100
});

await generator.dispose();

Tips for Model Selection

  1. Start Small: Test with a smaller model first, then upgrade if needed
  2. Check ONNX Support: Ensure the model has ONNX files (look for an onnx folder in the model repo)
  3. Read Model Cards: Model cards contain usage examples, limitations, and benchmarks
  4. Test Locally: Benchmark inference speed and memory usage in your environment
  5. Community Models: Look for models by Xenova (the Transformers.js maintainer) or onnx-community
  6. Version Pin: Use specific git commits in production for stability:
    javascript
    const pipe = await pipeline('task', 'model-id', { revision: 'abc123' });

Advanced Configuration

Environment Configuration (env)

The env object provides comprehensive control over Transformers.js execution, caching, and model loading.
Quick Overview:
javascript
import { env } from '@huggingface/transformers';

// View version
console.log(env.version); // e.g., '3.8.1'

// Common settings
env.allowRemoteModels = true;  // Load from Hugging Face Hub
env.allowLocalModels = false;  // Load from file system
env.localModelPath = '/models/'; // Local model directory
env.useFSCache = true;         // Cache models on disk (Node.js)
env.useBrowserCache = true;    // Cache models in browser
env.cacheDir = './.cache';     // Cache directory location
Configuration Patterns:
javascript
// Development: Fast iteration with remote models
env.allowRemoteModels = true;
env.useFSCache = true;

// Production: Local models only
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.localModelPath = '/app/models/';

// Custom CDN
env.remoteHost = 'https://cdn.example.com/models';

// Disable caching (testing)
env.useFSCache = false;
env.useBrowserCache = false;
For complete documentation on all configuration options, caching strategies, cache management, pre-downloading models, and more, see:
Configuration Reference

Working with Tensors

javascript
import { AutoTokenizer, AutoModel } from '@huggingface/transformers';

// Load tokenizer and model separately for more control
const tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
const model = await AutoModel.from_pretrained('bert-base-uncased');

// Tokenize input
const inputs = await tokenizer('Hello world!');

// Run model
const outputs = await model(inputs);

Batch Processing

javascript
const classifier = await pipeline('sentiment-analysis');

// Process multiple texts
const results = await classifier([
  'I love this!',
  'This is terrible.',
  'It was okay.'
]);
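For very large input arrays it can help to feed the pipeline fixed-size batches instead of one huge array; a plain chunking helper (the batch size in the usage comment is an arbitrary example):

```javascript
// Split an array into chunks of at most `size` elements.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

console.log(chunk(['a', 'b', 'c', 'd', 'e'], 2)); // three chunks: a/b, c/d, e

// Hypothetical usage with a classifier pipeline:
// for (const batch of chunk(texts, 16)) results.push(...await classifier(batch));
```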

Browser-Specific Considerations

WebGPU Usage

WebGPU provides GPU acceleration in browsers:
javascript
const pipe = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX', {
  device: 'webgpu',
  dtype: 'fp32'
});
Note: WebGPU is experimental. Check browser compatibility and file issues if problems occur.
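Because WebGPU support varies by browser, a common pattern is to feature-detect navigator.gpu and fall back to the default WASM backend; a minimal sketch:

```javascript
// Feature-detect WebGPU; navigator.gpu is the standard WebGPU entry point
// and is undefined in browsers without support (and in Node.js).
function pickDevice() {
  const hasWebGPU = typeof navigator !== 'undefined' && 'gpu' in navigator;
  return hasWebGPU ? 'webgpu' : 'wasm';
}

// Hypothetical usage:
// const pipe = await pipeline('text-generation', 'model-id', { device: pickDevice() });
console.log(pickDevice()); // 'wasm' wherever WebGPU is unavailable
```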

WASM Performance

Default browser execution uses WASM:
javascript
// Optimized for browsers with quantization
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q8'  // or 'q4' for even smaller size
});

Progress Tracking & Loading Indicators

Models can be large (ranging from a few MB to several GB) and consist of multiple files. Track download progress by passing a callback to the pipeline() function:
javascript
import { pipeline } from '@huggingface/transformers';

// Track progress for each file
const fileProgress = {};

function onProgress(info) {
  console.log(`${info.status}: ${info.file}`);
  
  if (info.status === 'progress') {
    fileProgress[info.file] = info.progress;
    console.log(`${info.file}: ${info.progress.toFixed(1)}%`);
  }
  
  if (info.status === 'done') {
    console.log(`${info.file} complete`);
  }
}

// Pass callback to pipeline
const classifier = await pipeline('sentiment-analysis', null, {
  progress_callback: onProgress
});
Progress Info Properties:
typescript
interface ProgressInfo {
  status: 'initiate' | 'download' | 'progress' | 'done' | 'ready';
  name: string;      // Model id or path
  file: string;      // File being processed
  progress?: number; // Percentage (0-100, only for 'progress' status)
  loaded?: number;   // Bytes downloaded (only for 'progress' status)
  total?: number;    // Total bytes (only for 'progress' status)
}
For complete examples including browser UIs, React components, CLI progress bars, and retry logic, see:
Pipeline Options - Progress Callback
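To drive a single loading bar, the per-file loaded/total byte counts can be aggregated into one percentage; a minimal sketch (the file names in the simulated events are illustrative):

```javascript
// Aggregate per-file { loaded, total } byte counts into one overall percentage.
const files = new Map();

function onProgress(info) {
  if (info.status === 'progress') {
    files.set(info.file, { loaded: info.loaded, total: info.total });
  }
}

function overallPercent() {
  let loaded = 0;
  let total = 0;
  for (const f of files.values()) {
    loaded += f.loaded;
    total += f.total;
  }
  return total === 0 ? 0 : (100 * loaded) / total;
}

// Simulated events (real ones arrive via progress_callback):
onProgress({ status: 'progress', file: 'model.onnx', loaded: 50, total: 100 });
onProgress({ status: 'progress', file: 'tokenizer.json', loaded: 10, total: 20 });
console.log(overallPercent()); // 50
```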

Error Handling

javascript
try {
  const pipe = await pipeline('sentiment-analysis', 'model-id');
  const result = await pipe('text to analyze');
} catch (error) {
  if (error.message.includes('fetch')) {
    console.error('Model download failed. Check internet connection.');
  } else if (error.message.includes('ONNX')) {
    console.error('Model execution failed. Check model compatibility.');
  } else {
    console.error('Unknown error:', error);
  }
}
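Transient download failures (the 'fetch' branch above) are often worth retrying. A generic retry helper with exponential backoff, shown here as an illustrative sketch:

```javascript
// Retry an async factory with simple exponential backoff.
// `attempts` and `baseDelayMs` are illustrative defaults, not library options.
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait 500ms, 1000ms, 2000ms, ... between attempts.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Hypothetical usage:
// const pipe = await withRetry(() => pipeline('sentiment-analysis', 'model-id'));
```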

Performance Tips

  1. Reuse Pipelines: Create a pipeline once, reuse it for multiple inferences
  2. Use Quantization: Start with q8 or q4 for faster inference
  3. Batch Processing: Process multiple inputs together when possible
  4. Cache Models: Models are cached automatically (see the Caching Reference for details on the browser Cache API, Node.js filesystem cache, and custom implementations)
  5. WebGPU for Large Models: Use WebGPU for models that benefit from GPU acceleration
  6. Prune Context: For text generation, limit max_new_tokens to avoid memory issues
  7. Clean Up Resources: Call pipe.dispose() when done to free memory
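Tip 1 (reusing pipelines) is often implemented as a lazily initialized cache keyed by task and model. A sketch with illustrative names; the stub factory at the end only demonstrates that creation happens once:

```javascript
// Create each pipeline at most once and share it across callers.
// The cache stores the creation promise, so concurrent callers awaiting
// the same key share a single model load.
const pipelineCache = new Map();

function getPipeline(task, model, factory) {
  const key = `${task}:${model ?? 'default'}`;
  if (!pipelineCache.has(key)) {
    pipelineCache.set(key, factory());
  }
  return pipelineCache.get(key);
}

// Hypothetical usage with Transformers.js:
// const pipe = await getPipeline('sentiment-analysis', null,
//   () => pipeline('sentiment-analysis'));

// Stub factory demonstrating that creation happens only once:
let calls = 0;
const fakeFactory = () => { calls += 1; return Promise.resolve({ ready: true }); };
getPipeline('demo', null, fakeFactory);
getPipeline('demo', null, fakeFactory);
console.log(calls); // 1
```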

Memory Management

IMPORTANT: Always call pipe.dispose() when finished to prevent memory leaks.
javascript
const pipe = await pipeline('sentiment-analysis');
const result = await pipe('Great product!');
await pipe.dispose();  // ✓ Free memory (100MB - several GB per model)
When to dispose:
  • Application shutdown or component unmount
  • Before loading a different model
  • After batch processing in long-running apps
Models consume significant memory and hold GPU/CPU resources. Disposal is critical for browser memory limits and server stability.
For detailed patterns (React cleanup, servers, browser), see Code Examples

Troubleshooting

Model Not Found

  • Verify the model exists on the Hugging Face Hub
  • Check the model name spelling
  • Ensure the model has ONNX files (look for an onnx folder in the model repo)

Memory Issues

  • Use smaller models or quantized versions (dtype: 'q4')
  • Reduce batch size
  • Limit sequence length with max_length

WebGPU Errors

  • Check browser compatibility (Chrome 113+, Edge 113+)
  • Try dtype: 'fp16' if fp32 fails
  • Fall back to WASM if WebGPU is unavailable

Reference Documentation

This Skill

  • Pipeline Options - Configure pipeline() with progress_callback, device, dtype, etc.
  • Configuration Reference - Global env configuration for caching and model loading
  • Caching Reference - Browser Cache API, Node.js filesystem cache, and custom cache implementations
  • Text Generation Guide - Streaming, chat format, and generation parameters
  • Model Architectures - Supported models and selection tips
  • Code Examples - Real-world implementations for different runtimes

Official Transformers.js

Best Practices

  1. Always Dispose Pipelines: Call pipe.dispose() when done - critical for preventing memory leaks
  2. Start with Pipelines: Use the pipeline API unless you need fine-grained control
  3. Test Locally First: Test models with small inputs before deploying
  4. Monitor Model Sizes: Be aware of model download sizes for web applications
  5. Handle Loading States: Show progress indicators for better UX
  6. Version Pin: Pin specific model versions for production stability
  7. Error Boundaries: Always wrap pipeline calls in try-catch blocks
  8. Progressive Enhancement: Provide fallbacks for unsupported browsers
  9. Reuse Models: Load once, use many times - don't recreate pipelines unnecessarily
  10. Graceful Shutdown: Dispose models on SIGTERM/SIGINT in servers
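Best practice 10 (graceful shutdown) can be sketched for a Node.js server as follows; loadedPipelines and disposeAll are illustrative names, and real entries would be pipelines created with pipeline():

```javascript
// Dispose all loaded pipelines before a Node.js server exits.
const loadedPipelines = [];

async function disposeAll() {
  await Promise.all(loadedPipelines.map((p) => p.dispose()));
  loadedPipelines.length = 0; // forget the disposed pipelines
}

// Register once at startup: dispose on SIGTERM/SIGINT, then exit cleanly.
for (const signal of ['SIGTERM', 'SIGINT']) {
  process.once(signal, async () => {
    await disposeAll();
    process.exit(0);
  });
}
```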

Quick Reference: Task IDs

  • Text classification: text-classification or sentiment-analysis
  • Token classification: token-classification or ner
  • Question answering: question-answering
  • Fill mask: fill-mask
  • Summarization: summarization
  • Translation: translation
  • Text generation: text-generation
  • Text-to-text generation: text2text-generation
  • Zero-shot classification: zero-shot-classification
  • Image classification: image-classification
  • Image segmentation: image-segmentation
  • Object detection: object-detection
  • Depth estimation: depth-estimation
  • Image-to-image: image-to-image
  • Zero-shot image classification: zero-shot-image-classification
  • Zero-shot object detection: zero-shot-object-detection
  • Automatic speech recognition: automatic-speech-recognition
  • Audio classification: audio-classification
  • Text-to-speech: text-to-speech or text-to-audio
  • Image-to-text: image-to-text
  • Document question answering: document-question-answering
  • Feature extraction: feature-extraction
  • Sentence similarity: sentence-similarity

This skill enables you to integrate state-of-the-art machine learning capabilities directly into JavaScript applications without requiring separate ML servers or Python environments.