# Transformers.js - Machine Learning for JavaScript

Transformers.js enables running state-of-the-art machine learning models directly in JavaScript, both in browsers and Node.js environments, with no server required.
## When to Use This Skill

Use this skill when you need to:

- Run ML models for text analysis, generation, or translation in JavaScript
- Perform image classification, object detection, or segmentation
- Implement speech recognition or audio processing
- Build multimodal AI applications (text-to-image, image-to-text, etc.)
- Run models client-side in the browser without a backend
## Installation

### NPM Installation

```bash
npm install @huggingface/transformers
```

### Browser Usage (CDN)

```html
<script type="module">
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
</script>
```
## Core Concepts

### 1. Pipeline API

The pipeline API is the easiest way to use models. It groups together preprocessing, model inference, and postprocessing:

```javascript
import { pipeline } from '@huggingface/transformers';

// Create a pipeline for a specific task
const pipe = await pipeline('sentiment-analysis');

// Use the pipeline
const result = await pipe('I love transformers!');
// Output: [{ label: 'POSITIVE', score: 0.999817686 }]

// IMPORTANT: Always dispose when done to free memory
await pipe.dispose();
```

⚠️ Memory Management: All pipelines must be disposed of with `pipe.dispose()` when finished to prevent memory leaks. See Code Examples for cleanup patterns across different environments.
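A try/finally wrapper makes the dispose step hard to forget, even when inference throws. This is a minimal sketch; the `withPipeline` helper is ours, not part of the library:

```javascript
// Run work with a pipeline and guarantee disposal, on success or error.
// `factory` is any function resolving to an object with a dispose() method,
// e.g. () => pipeline('sentiment-analysis').
async function withPipeline(factory, work) {
  const pipe = await factory();
  try {
    return await work(pipe);
  } finally {
    await pipe.dispose(); // always runs, even if work() throws
  }
}

// Usage sketch (assuming `pipeline` is imported from '@huggingface/transformers'):
// const result = await withPipeline(
//   () => pipeline('sentiment-analysis'),
//   (pipe) => pipe('I love transformers!')
// );
```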
### 2. Model Selection

You can specify a custom model as the second argument:

```javascript
const pipe = await pipeline(
  'sentiment-analysis',
  'Xenova/bert-base-multilingual-uncased-sentiment'
);
```

Finding Models:

Browse available Transformers.js models on the Hugging Face Hub:

- All models: https://huggingface.co/models?library=transformers.js&sort=trending
- By task: add the `pipeline_tag` parameter
  - Text generation: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
  - Image classification: https://huggingface.co/models?pipeline_tag=image-classification&library=transformers.js&sort=trending
  - Speech recognition: https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=transformers.js&sort=trending

Tip: Filter by task type, sort by trending/downloads, and check model cards for performance metrics and usage examples.
### 3. Device Selection

Choose where to run the model:

```javascript
// Run on CPU (default for WASM)
const cpuPipe = await pipeline('sentiment-analysis', 'model-id');

// Run on GPU (WebGPU - experimental)
const gpuPipe = await pipeline('sentiment-analysis', 'model-id', {
  device: 'webgpu',
});
```
### 4. Quantization Options

Control model precision vs. performance:

```javascript
// Use a quantized model (faster, smaller)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q4', // Options: 'fp32', 'fp16', 'q8', 'q4'
});
```
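As a rough rule of thumb, download size scales with bytes per parameter: about 4 for fp32, 2 for fp16, 1 for q8, and 0.5 for q4. This back-of-the-envelope helper is ours, not a library API, and it ignores per-model overhead such as tokenizer and config files:

```javascript
// Approximate model download size in megabytes for a given dtype.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, q8: 1, q4: 0.5 };

function approxModelSizeMB(numParams, dtype) {
  const bytes = BYTES_PER_PARAM[dtype];
  if (bytes === undefined) throw new Error(`Unknown dtype: ${dtype}`);
  return (numParams * bytes) / (1024 * 1024);
}

// A ~270M-parameter model is roughly 129 MB at q4 vs ~1030 MB at fp32,
// which is why quantized weights matter so much for browser delivery.
```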
## Supported Tasks

Note: All examples below show basic usage.
### Natural Language Processing

#### Text Classification

```javascript
const classifier = await pipeline('text-classification');
const result = await classifier('This movie was amazing!');
```

#### Named Entity Recognition (NER)

```javascript
const ner = await pipeline('token-classification');
const entities = await ner('My name is John and I live in New York.');
```

#### Question Answering

```javascript
const qa = await pipeline('question-answering');
const answer = await qa({
  question: 'What is the capital of France?',
  context: 'Paris is the capital and largest city of France.'
});
```

#### Text Generation

```javascript
const generator = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX');
const text = await generator('Once upon a time', {
  max_new_tokens: 100,
  temperature: 0.7
});
```

For streaming and chat, see the Text Generation Guide for:

- Streaming token-by-token output with `TextStreamer`
- Chat/conversation format with system/user/assistant roles
- Generation parameters (temperature, top_k, top_p)
- Browser and Node.js examples
- React components and API endpoints
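`TextStreamer` delivers generated text to a callback chunk by chunk; a small collector shows the shape of that integration. The `makeCollector` helper is ours, and the commented pipeline usage is an illustrative sketch, not verbatim library documentation:

```javascript
// Collect streamed text chunks into a full string while also
// forwarding each chunk to a UI handler (e.g. appending to the DOM).
function makeCollector(onChunk) {
  const chunks = [];
  return {
    write(text) {
      chunks.push(text);
      if (onChunk) onChunk(text);
    },
    get text() {
      return chunks.join('');
    },
  };
}

// Illustrative use with TextStreamer (assumes `generator` from the example above):
// import { TextStreamer } from '@huggingface/transformers';
// const collector = makeCollector((t) => process.stdout.write(t));
// const streamer = new TextStreamer(generator.tokenizer, {
//   skip_prompt: true,
//   callback_function: (t) => collector.write(t),
// });
// await generator('Once upon a time', { max_new_tokens: 100, streamer });
```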
#### Translation

```javascript
const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
const output = await translator('Hello, how are you?', {
  src_lang: 'eng_Latn',
  tgt_lang: 'fra_Latn'
});
```

#### Summarization

```javascript
const summarizer = await pipeline('summarization');
const summary = await summarizer(longText, {
  max_length: 100,
  min_length: 30
});
```

#### Zero-Shot Classification

```javascript
const classifier = await pipeline('zero-shot-classification');
const result = await classifier('This is a story about sports.', ['politics', 'sports', 'technology']);
```
### Computer Vision

#### Image Classification

```javascript
const classifier = await pipeline('image-classification');
const result = await classifier('https://example.com/image.jpg');
// Or with a local file / image URL
const result2 = await classifier(imageUrl);
```

#### Object Detection

```javascript
const detector = await pipeline('object-detection');
const objects = await detector('https://example.com/image.jpg');
// Returns: [{ label: 'person', score: 0.95, box: { xmin, ymin, xmax, ymax } }, ...]
```

#### Image Segmentation

```javascript
const segmenter = await pipeline('image-segmentation');
const segments = await segmenter('https://example.com/image.jpg');
```

#### Depth Estimation

```javascript
const depthEstimator = await pipeline('depth-estimation');
const depth = await depthEstimator('https://example.com/image.jpg');
```

#### Zero-Shot Image Classification

```javascript
const classifier = await pipeline('zero-shot-image-classification');
const result = await classifier('image.jpg', ['cat', 'dog', 'bird']);
```
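Detection outputs like the ones above often need a confidence cutoff before rendering bounding boxes. A small filter sketch (the helper name is ours):

```javascript
// Keep only detections at or above a confidence threshold,
// sorted with the most confident first.
function filterDetections(objects, minScore = 0.9) {
  return objects
    .filter((o) => o.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

// Usage sketch with the output shape shown above:
// const objects = await detector('https://example.com/image.jpg');
// const confident = filterDetections(objects, 0.8);
```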
### Audio Processing

#### Automatic Speech Recognition

```javascript
const transcriber = await pipeline('automatic-speech-recognition');
const result = await transcriber('audio.wav');
// Returns: { text: 'transcribed text here' }
```

#### Audio Classification

```javascript
const classifier = await pipeline('audio-classification');
const result = await classifier('audio.wav');
```

#### Text-to-Speech

```javascript
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts');
const audio = await synthesizer('Hello, this is a test.', {
  speaker_embeddings: speakerEmbeddings
});
```
### Multimodal

#### Image-to-Text (Image Captioning)

```javascript
const captioner = await pipeline('image-to-text');
const caption = await captioner('image.jpg');
```

#### Document Question Answering

```javascript
const docQA = await pipeline('document-question-answering');
const answer = await docQA('document-image.jpg', 'What is the total amount?');
```

#### Zero-Shot Object Detection

```javascript
const detector = await pipeline('zero-shot-object-detection');
const objects = await detector('image.jpg', ['person', 'car', 'tree']);
```

#### Feature Extraction (Embeddings)

```javascript
const extractor = await pipeline('feature-extraction');
const embeddings = await extractor('This is a sentence to embed.');
// Returns: tensor of shape [1, sequence_length, hidden_size]

// For sentence embeddings (mean pooling)
const sentenceExtractor = await pipeline('feature-extraction', 'onnx-community/all-MiniLM-L6-v2-ONNX');
const sentenceEmbeddings = await sentenceExtractor('Text to embed', { pooling: 'mean', normalize: true });
```
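Sentence embeddings are typically compared with cosine similarity; with `normalize: true`, this reduces to a plain dot product. The helper below is ours, and the commented extractor usage is an illustrative sketch:

```javascript
// Cosine similarity between two equal-length numeric vectors.
// For normalized embeddings this equals the dot product.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Usage sketch with the mean-pooled extractor above:
// const [e1] = (await sentenceExtractor('cats', { pooling: 'mean', normalize: true })).tolist();
// const [e2] = (await sentenceExtractor('kittens', { pooling: 'mean', normalize: true })).tolist();
// const score = cosineSimilarity(e1, e2); // closer to 1 = more similar
```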
## Finding and Choosing Models

### Browsing the Hugging Face Hub

Discover compatible Transformers.js models on the Hugging Face Hub.

Base URL (all models):
https://huggingface.co/models?library=transformers.js&sort=trending

Filter by task using the `pipeline_tag` parameter.

Sort options:

- `&sort=trending` - Most popular recently
- `&sort=downloads` - Most downloaded overall
- `&sort=likes` - Most liked by community
- `&sort=modified` - Recently updated
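These query parameters compose mechanically, so search URLs can be built programmatically. A small sketch (the helper name is ours):

```javascript
// Build a Hugging Face Hub search URL for Transformers.js-compatible models.
function hubSearchUrl({ task, sort = 'trending' } = {}) {
  const params = new URLSearchParams();
  if (task) params.set('pipeline_tag', task); // task filter, e.g. 'text-generation'
  params.set('library', 'transformers.js');   // only models with ONNX/Transformers.js support
  params.set('sort', sort);                   // 'trending' | 'downloads' | 'likes' | 'modified'
  return `https://huggingface.co/models?${params.toString()}`;
}

// hubSearchUrl({ task: 'text-generation', sort: 'downloads' })
```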
### Choosing the Right Model

Consider these factors when selecting a model:

1. Model Size
   - Small (< 100MB): fast, suitable for browsers, limited accuracy
   - Medium (100MB - 500MB): balanced performance, good for most use cases
   - Large (> 500MB): high accuracy, slower, better for Node.js or powerful devices

2. Quantization

   Models are often available at different quantization levels:
   - `fp32` - Full precision (largest, most accurate)
   - `fp16` - Half precision (smaller, still accurate)
   - `q8` - 8-bit quantized (much smaller, slight accuracy loss)
   - `q4` - 4-bit quantized (smallest, noticeable accuracy loss)

3. Task Compatibility

   Check the model card for:
   - Supported tasks (some models support multiple tasks)
   - Input/output formats
   - Language support (multilingual vs. English-only)
   - License restrictions

4. Performance Metrics

   Model cards typically show:
   - Accuracy scores
   - Benchmark results
   - Inference speed
   - Memory requirements
### Example: Finding a Text Generation Model

```javascript
// 1. Visit: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
// 2. Browse and select a model (e.g., onnx-community/gemma-3-270m-it-ONNX)
// 3. Check the model card for:
//    - Model size: ~270M parameters
//    - Quantization: q4 available
//    - Language: English
//    - Use case: instruction-following chat
// 4. Use the model:
import { pipeline } from '@huggingface/transformers';

const generator = await pipeline(
  'text-generation',
  'onnx-community/gemma-3-270m-it-ONNX',
  { dtype: 'q4' } // Use the quantized version for faster inference
);

const output = await generator('Explain quantum computing in simple terms.', {
  max_new_tokens: 100
});

await generator.dispose();
```
### Tips for Model Selection

- Start Small: test with a smaller model first, then upgrade if needed
- Check ONNX Support: ensure the model has ONNX files (look for an `onnx` folder in the model repo)
- Read Model Cards: model cards contain usage examples, limitations, and benchmarks
- Test Locally: benchmark inference speed and memory usage in your environment
- Community Models: look for models by `Xenova` (Transformers.js maintainer) or `onnx-community`
- Version Pin: use specific git revisions in production for stability:

```javascript
const pipe = await pipeline('task', 'model-id', { revision: 'abc123' });
```
## Advanced Configuration

### Environment Configuration (`env`)

The `env` object provides comprehensive control over Transformers.js execution, caching, and model loading.

Quick Overview:

```javascript
import { env } from '@huggingface/transformers';

// View version
console.log(env.version); // e.g., '3.8.1'

// Common settings
env.allowRemoteModels = true;    // Load from Hugging Face Hub
env.allowLocalModels = false;    // Load from file system
env.localModelPath = '/models/'; // Local model directory
env.useFSCache = true;           // Cache models on disk (Node.js)
env.useBrowserCache = true;      // Cache models in browser
env.cacheDir = './.cache';       // Cache directory location
```

Configuration Patterns:

```javascript
// Development: fast iteration with remote models
env.allowRemoteModels = true;
env.useFSCache = true;

// Production: local models only
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.localModelPath = '/app/models/';

// Custom CDN
env.remoteHost = 'https://cdn.example.com/models';

// Disable caching (testing)
env.useFSCache = false;
env.useBrowserCache = false;
```

For complete documentation on all configuration options, caching strategies, cache management, pre-downloading models, and more, see:
→ Configuration Reference
### Working with Tensors

```javascript
import { AutoTokenizer, AutoModel } from '@huggingface/transformers';

// Load tokenizer and model separately for more control
const tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
const model = await AutoModel.from_pretrained('bert-base-uncased');

// Tokenize input
const inputs = await tokenizer('Hello world!');

// Run model
const outputs = await model(inputs);
```
### Batch Processing

```javascript
const classifier = await pipeline('sentiment-analysis');

// Process multiple texts in one call
const results = await classifier([
  'I love this!',
  'This is terrible.',
  'It was okay.'
]);
```
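For very large input lists, it can help to feed the pipeline in fixed-size chunks rather than all at once, keeping peak memory bounded. A sketch; the chunking helper is ours, and the right chunk size depends on the model and device:

```javascript
// Split inputs into fixed-size chunks and run them through an async
// batch function sequentially, concatenating the results in order.
async function processInChunks(inputs, batchFn, chunkSize = 32) {
  const results = [];
  for (let i = 0; i < inputs.length; i += chunkSize) {
    const chunk = inputs.slice(i, i + chunkSize);
    results.push(...(await batchFn(chunk)));
  }
  return results;
}

// Usage sketch with the classifier above:
// const results = await processInChunks(texts, (chunk) => classifier(chunk), 16);
```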
## Browser-Specific Considerations

### WebGPU Usage

WebGPU provides GPU acceleration in browsers:

```javascript
const pipe = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX', {
  device: 'webgpu',
  dtype: 'fp32'
});
```

Note: WebGPU is experimental. Check browser compatibility and file issues if problems occur.
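Because WebGPU support varies by browser, a common pattern is to feature-detect `navigator.gpu` and fall back to WASM. A sketch (the helper name is ours; note that the presence of `navigator.gpu` only signals the API exists, not that a usable adapter is available):

```javascript
// Pick a device option for pipeline() based on browser capability.
// `nav` is injectable for testing; defaults to the global navigator if present.
function pickDevice(nav = typeof navigator !== 'undefined' ? navigator : {}) {
  return nav.gpu ? 'webgpu' : 'wasm';
}

// Usage sketch:
// const pipe = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX', {
//   device: pickDevice(),
// });
```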
### WASM Performance

Default browser execution uses WASM:

```javascript
// Optimized for browsers with quantization
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q8' // or 'q4' for an even smaller size
});
```
## Progress Tracking & Loading Indicators

Models can be large (ranging from a few MB to several GB) and consist of multiple files. Track download progress by passing a `progress_callback` to the `pipeline()` function:

```javascript
import { pipeline } from '@huggingface/transformers';

// Track progress for each file
const fileProgress = {};

function onProgress(info) {
  console.log(`${info.status}: ${info.file}`);
  if (info.status === 'progress') {
    fileProgress[info.file] = info.progress;
    console.log(`${info.file}: ${info.progress.toFixed(1)}%`);
  }
  if (info.status === 'done') {
    console.log(`✓ ${info.file} complete`);
  }
}

// Pass the callback to the pipeline
const classifier = await pipeline('sentiment-analysis', null, {
  progress_callback: onProgress
});
```

Progress Info Properties:

```typescript
interface ProgressInfo {
  status: 'initiate' | 'download' | 'progress' | 'done' | 'ready';
  name: string;      // Model id or path
  file: string;      // File being processed
  progress?: number; // Percentage (0-100, only for 'progress' status)
  loaded?: number;   // Bytes downloaded (only for 'progress' status)
  total?: number;    // Total bytes (only for 'progress' status)
}
```

For complete examples including browser UIs, React components, CLI progress bars, and retry logic, see:
→ Pipeline Options - Progress Callback
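Because each file reports progress independently, a single loading bar has to aggregate the per-file `loaded`/`total` fields. A sketch of that aggregation (the helper is ours; files whose totals are not yet known are simply excluded until they report):

```javascript
// Aggregate per-file progress events into one overall percentage.
function createProgressAggregator() {
  const files = {}; // file -> { loaded, total }
  return function overallPercent(info) {
    if (info.status === 'progress' && info.total) {
      files[info.file] = { loaded: info.loaded, total: info.total };
    }
    let loaded = 0, total = 0;
    for (const f of Object.values(files)) {
      loaded += f.loaded;
      total += f.total;
    }
    return total === 0 ? 0 : (100 * loaded) / total;
  };
}

// Usage sketch:
// const overall = createProgressAggregator();
// const pipe = await pipeline('sentiment-analysis', null, {
//   progress_callback: (info) => console.log(`${overall(info).toFixed(0)}%`),
// });
```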
## Error Handling

```javascript
try {
  const pipe = await pipeline('sentiment-analysis', 'model-id');
  const result = await pipe('text to analyze');
} catch (error) {
  if (error.message.includes('fetch')) {
    console.error('Model download failed. Check internet connection.');
  } else if (error.message.includes('ONNX')) {
    console.error('Model execution failed. Check model compatibility.');
  } else {
    console.error('Unknown error:', error);
  }
}
```
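The string matching above can be factored into a small classifier so UI code switches on a category rather than raw messages. The categories and helper are ours, mirroring the checks in the try/catch:

```javascript
// Map an error to a coarse category based on its message,
// mirroring the fetch/ONNX checks shown above.
function classifyPipelineError(error) {
  const msg = error && error.message ? error.message : '';
  if (msg.includes('fetch')) return 'network'; // download / connectivity problem
  if (msg.includes('ONNX')) return 'model';    // model execution / compatibility problem
  return 'unknown';
}
```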
## Performance Tips

- Reuse Pipelines: create a pipeline once, reuse it for multiple inferences
- Use Quantization: start with `q8` or `q4` for faster inference
- Batch Processing: process multiple inputs together when possible
- Cache Models: models are cached automatically (see the Caching Reference for details on the browser Cache API, the Node.js filesystem cache, and custom implementations)
- WebGPU for Large Models: use WebGPU for models that benefit from GPU acceleration
- Prune Context: for text generation, limit `max_new_tokens` to avoid memory issues
- Clean Up Resources: call `pipe.dispose()` when done to free memory
## Memory Management

IMPORTANT: Always call `pipe.dispose()` when finished to prevent memory leaks.

```javascript
const pipe = await pipeline('sentiment-analysis');
const result = await pipe('Great product!');
await pipe.dispose(); // ✓ Free memory (100MB - several GB per model)
```

When to dispose:

- Application shutdown or component unmount
- Before loading a different model
- After batch processing in long-running apps

Models consume significant memory and hold GPU/CPU resources. Disposal is critical for staying within browser memory limits and for server stability.

For detailed patterns (React cleanup, servers, browser), see Code Examples.
## Troubleshooting

### Model Not Found

- Verify the model exists on the Hugging Face Hub
- Check the model name spelling
- Ensure the model has ONNX files (look for an `onnx` folder in the model repo)

### Memory Issues

- Use smaller models or quantized versions (`dtype: 'q4'`)
- Reduce batch size
- Limit sequence length with `max_length`

### WebGPU Errors

- Check browser compatibility (Chrome 113+, Edge 113+)
- Try `dtype: 'fp16'` if `fp32` fails
- Fall back to WASM if WebGPU is unavailable
## Reference Documentation

### This Skill

- Pipeline Options - configure `pipeline()` with `progress_callback`, `device`, `dtype`, etc.
- Configuration Reference - global `env` configuration for caching and model loading
- Caching Reference - browser Cache API, Node.js filesystem cache, and custom cache implementations
- Text Generation Guide - streaming, chat format, and generation parameters
- Model Architectures - supported models and selection tips
- Code Examples - real-world implementations for different runtimes

### Official Transformers.js

- Official docs: https://huggingface.co/docs/transformers.js
- API reference: https://huggingface.co/docs/transformers.js/api/pipelines
- Model hub: https://huggingface.co/models?library=transformers.js
- GitHub: https://github.com/huggingface/transformers.js
- Examples: https://github.com/huggingface/transformers.js/tree/main/examples
## Best Practices

- Always Dispose Pipelines: call `pipe.dispose()` when done - critical for preventing memory leaks
- Start with Pipelines: use the pipeline API unless you need fine-grained control
- Test Locally First: test models with small inputs before deploying
- Monitor Model Sizes: be aware of model download sizes for web applications
- Handle Loading States: show progress indicators for better UX
- Version Pin: pin specific model versions for production stability
- Error Boundaries: always wrap pipeline calls in try-catch blocks
- Progressive Enhancement: provide fallbacks for unsupported browsers
- Reuse Models: load once, use many times - don't recreate pipelines unnecessarily
- Graceful Shutdown: dispose models on SIGTERM/SIGINT in servers
## Quick Reference: Task IDs

| Task | Task ID |
|---|---|
| Text classification | `text-classification` |
| Token classification | `token-classification` |
| Question answering | `question-answering` |
| Fill mask | `fill-mask` |
| Summarization | `summarization` |
| Translation | `translation` |
| Text generation | `text-generation` |
| Text-to-text generation | `text2text-generation` |
| Zero-shot classification | `zero-shot-classification` |
| Image classification | `image-classification` |
| Image segmentation | `image-segmentation` |
| Object detection | `object-detection` |
| Depth estimation | `depth-estimation` |
| Image-to-image | `image-to-image` |
| Zero-shot image classification | `zero-shot-image-classification` |
| Zero-shot object detection | `zero-shot-object-detection` |
| Automatic speech recognition | `automatic-speech-recognition` |
| Audio classification | `audio-classification` |
| Text-to-speech | `text-to-speech` |
| Image-to-text | `image-to-text` |
| Document question answering | `document-question-answering` |
| Feature extraction | `feature-extraction` |
| Sentence similarity | `sentence-similarity` |

This skill enables you to integrate state-of-the-art machine learning capabilities directly into JavaScript applications without requiring separate ML servers or Python environments.