react-native-executorch
When to Use This Skill
Use this skill when you need to:
- Build AI features directly into mobile apps without cloud infrastructure
- Deploy LLMs locally for text generation, chat, or function calling
- Add computer vision (image classification, object detection, OCR)
- Process audio (speech-to-text, text-to-speech, voice activity detection)
- Implement semantic search with text embeddings
- Ensure privacy by keeping all AI processing on-device
- Reduce latency by eliminating cloud API calls
- Work offline once models are downloaded
Overview
React Native Executorch is a library that enables on-device AI model execution in React Native applications. It provides hooks and utilities for running machine learning models directly on mobile devices without requiring cloud infrastructure or internet connectivity (after initial model download).
Key Use Cases
Use Case 1: Mobile Chatbot/Assistant
Trigger: User asks to build a chat interface, create a conversational AI, or add an AI assistant to their app
Steps:
- Choose appropriate LLM based on device memory constraints
- Load model using ExecuTorch hooks
- Implement message handling and conversation history
- Optionally add system prompts, tool calling, or structured output
Result: Functional chat interface with on-device AI responding without cloud dependency
Reference: ./references/reference-llms.md
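The message-handling step above can be sketched without any library code. A minimal example of keeping conversation history within a budget; the character budget is a crude stand-in for a real token limit, and all names here are illustrative, not part of the library's API:

```typescript
// Minimal conversation-history helper. The Message shape mirrors the
// common { role, content } chat format; the character budget is a naive
// stand-in for a real token limit.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Keep the system prompt (if any) and as many recent messages as fit
// within maxChars, dropping the oldest turns first.
function trimHistory(history: Message[], maxChars: number): Message[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  const kept: Message[] = [];
  let used = system.reduce((n, m) => n + m.content.length, 0);
  // Walk from newest to oldest, keeping what fits.
  for (let i = rest.length - 1; i >= 0; i--) {
    if (used + rest[i].content.length > maxChars) break;
    used += rest[i].content.length;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

A real implementation would count tokens with the tokenizer rather than characters, but the trimming strategy is the same.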
Use Case 2: Image Recognition & Tagging
Trigger: User needs to classify images, detect objects, or recognize content in photos
Steps:
- Select vision model (classification, detection, or segmentation)
- Load model for image processing task
- Pass image URI and process results
- Display detections or classifications in app UI
Result: App that understands image content without sending data to servers
Reference: ./references/reference-cv.md
Use Case 3: Document/Receipt Scanning
Trigger: User wants to extract text from photos (receipts, documents, business cards)
Steps:
- Choose OCR model matching target language
- Load appropriate recognizer for alphabet/language
- Capture or load image
- Extract text regions with bounding boxes
- Post-process results for application
Result: OCR-enabled app that reads text directly from device camera
Reference: ./references/reference-ocr.md
Use Case 4: Voice Interface
Trigger: User wants to add voice commands, transcription, or voice output to app
Steps:
- For voice input: Capture audio at correct sample rate → transcribe with STT model
- For voice output: Generate speech from text → play through audio context
- Handle audio format/sample rate conversion
Result: App with hands-free voice interaction
Reference: ./references/reference-audio.md
Use Case 5: Semantic Search
Trigger: User needs intelligent search, similarity matching, or content recommendations
Steps:
- Load text or image embeddings model
- Generate embeddings for searchable content
- Compute similarity scores between queries and content
- Rank and return results
Result: Smart search that understands meaning, not just keywords
Reference: ./references/reference-nlp.md
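The similarity step above is usually cosine similarity between embedding vectors. A dependency-free sketch:

```typescript
// Cosine similarity between two embedding vectors: 1 means identical
// direction, 0 means orthogonal, -1 means opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```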
Core Capabilities by Category
Large Language Models (LLMs)
Run text generation, chat, function calling, and structured output generation locally on-device.
Supported features:
- Text generation and chat completions
- Function/tool calling
- Structured output with JSON schema validation
- Streaming responses
- Multiple model families (Llama 3.2, Qwen 3, Hammer 2.1, SmolLM2, Phi 4)
Reference: See ./references/reference-llms.md
Computer Vision
Perform image understanding and manipulation tasks entirely on-device.
Supported tasks:
- Image Classification - Categorize images into predefined classes
- Object Detection - Locate and identify objects with bounding boxes
- Image Segmentation - Pixel-level classification
- Style Transfer - Apply artistic styles to images
- Text-to-Image - Generate images from text descriptions
- Image Embeddings - Convert images to numerical vectors for similarity/search
Reference: See ./references/reference-cv.md and ./references/reference-cv-2.md
Optical Character Recognition (OCR)
Extract and recognize text from images with support for multiple languages and text orientations.
Supported features:
- Text detection in images
- Text recognition across different alphabets
- Horizontal text (standard documents, receipts)
- Vertical text support (experimental, for CJK languages)
- Multi-language support with language-specific recognizers
Reference: See ./references/reference-ocr.md
Audio Processing
Convert between speech and text, and detect speech activity in audio.
Supported tasks:
- Speech-to-Text - Transcribe audio to text (supports English and multilingual)
- Text-to-Speech - Generate natural-sounding speech from text
- Voice Activity Detection - Detect speech segments in audio
Reference: See ./references/reference-audio.md
Natural Language Processing
Convert text to numerical representations for semantic understanding and search.
Supported tasks:
- Text Embeddings - Convert text to vectors for similarity/search
- Tokenization - Convert text to tokens and vice versa
Reference: See ./references/reference-nlp.md
Getting Started by Use Case
I want to build a chatbot or AI assistant
Use the useLLM hook with one of the available language models.
What to do:
- Choose a model from available LLM options (consider device memory constraints)
- Use the useLLM hook to load the model
- Send messages and receive responses
- Optionally configure system prompts, generation parameters, and tools
Reference: ./references/reference-llms.md
Model options: ./references/reference-models.md - LLMs section
I want to enable function/tool calling in my LLM
Use useLLM with tool definitions to allow the model to call predefined functions.
What to do:
- Define tools with name, description, and parameter schema
- Configure the LLM with tool definitions
- Implement callbacks to execute tools when the model requests them
- Parse tool results and pass them back to the model
Reference: ./references/reference-llms.md - Tool Calling section
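The execute-on-request step above can be sketched as a plain registry lookup. The `{ name, arguments }` payload shape is an assumption modeled on common tool-calling conventions, not the library's exact format, and `get_weather` is a made-up example tool:

```typescript
// Hypothetical tool-call dispatch. The call shape and registry are
// illustrative; consult the reference for the library's actual payloads.
type ToolFn = (args: Record<string, unknown>) => string;

const toolRegistry: Record<string, ToolFn> = {
  // Example tool: in a real app this would call a weather API.
  get_weather: (args) => `Sunny in ${String(args.city)}`,
};

function dispatchToolCall(call: { name: string; arguments: Record<string, unknown> }): string {
  const tool = toolRegistry[call.name];
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  return tool(call.arguments);
}
```

The string returned by the tool is what you would pass back to the model as the tool result.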
I want structured data extraction from text
Use useLLM with structured output generation using JSON schema validation.
What to do:
- Define a schema (JSON Schema or Zod) for desired output format
- Configure the LLM with the schema
- Generate responses and validate against the schema
- Use the validated structured data in your app
Reference: ./references/reference-llms.md - Structured Output section
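The validation step can be sketched with a minimal structural check. A real app would use a full JSON Schema validator or Zod, as the steps above suggest; this sketch only checks required keys and primitive types, and all names are illustrative:

```typescript
// Minimal structural check: verifies that required keys exist with the
// expected primitive types. Only a sketch of schema validation, not a
// replacement for JSON Schema or Zod.
type FieldType = "string" | "number" | "boolean";

function matchesShape(
  value: unknown,
  shape: Record<string, FieldType>
): boolean {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  return Object.entries(shape).every(([key, t]) => typeof obj[key] === t);
}
```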
I want to classify or recognize objects in images
Use useClassification for simple categorization or useObjectDetection for locating specific objects.
What to do:
- Choose appropriate computer vision model based on task
- Load the model with the appropriate hook
- Pass image URI (local, remote, or base64)
- Process results (classifications, detections with bounding boxes)
Reference: ./references/reference-cv.md
Model options: ./references/reference-models.md - Classification and Object Detection sections
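The result-processing step might look like the sketch below: drop low-confidence detections and scale normalized boxes to view pixels. The `Detection` shape (label, score, normalized box) is an assumption, not the library's exact result type:

```typescript
// Detection post-processing sketch. The Detection shape is assumed;
// check the reference for the actual result format.
interface Detection {
  label: string;
  score: number; // confidence in 0..1
  box: { x: number; y: number; width: number; height: number }; // normalized 0..1
}

// Drop low-confidence detections and scale boxes to view pixels.
function toViewBoxes(
  detections: Detection[],
  viewWidth: number,
  viewHeight: number,
  minScore = 0.5
) {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => ({
      label: d.label,
      left: d.box.x * viewWidth,
      top: d.box.y * viewHeight,
      width: d.box.width * viewWidth,
      height: d.box.height * viewHeight,
    }));
}
```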
I want to extract text from images
Use useOCR for horizontal text or useVerticalOCR for vertical text (experimental).
What to do:
- Choose appropriate OCR model and recognizer matching your target language
- Load the model with the useOCR or useVerticalOCR hook
- Pass image URI
- Extract detected text regions with bounding boxes and confidence scores
- Process results based on your application needs
Reference: ./references/reference-ocr.md
Model options: ./references/reference-models.md - OCR section
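The post-processing step often starts with sorting regions into rough reading order (top-to-bottom, then left-to-right). A hedged sketch, assuming each region exposes its top-left corner; the `TextRegion` shape is illustrative, not the library's result type:

```typescript
// Sort detected text regions into approximate reading order.
// lineTolerance groups regions whose y-coordinates are close enough to
// count as the same line.
interface TextRegion {
  text: string;
  x: number; // top-left corner, pixels
  y: number;
}

function readingOrder(regions: TextRegion[], lineTolerance = 10): string[] {
  return [...regions]
    .sort((a, b) =>
      Math.abs(a.y - b.y) <= lineTolerance ? a.x - b.x : a.y - b.y
    )
    .map((r) => r.text);
}
```

This simple comparator works for cleanly separated lines; heavily skewed scans need a more robust grouping pass.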
I want to convert speech to text or text to speech
Use useSpeechToText for transcription or useTextToSpeech for voice synthesis.
What to do:
- For Speech-to-Text: Capture or load audio, ensure 16kHz sample rate, transcribe
- For Text-to-Speech: Prepare text, specify voice parameters, generate audio waveform, play using audio context
Reference: ./references/reference-audio.md
Model options: ./references/reference-models.md - Speech to Text and Text to Speech sections
I want to find similar images or texts
Use useImageEmbeddings for images or useTextEmbeddings for text.
What to do:
- Load appropriate embeddings model
- Generate embeddings for your content
- Compute similarity metrics (cosine similarity, dot product)
- Use similarity scores for search, clustering, or deduplication
Reference:
- Text: ./references/reference-nlp.md
- Images: ./references/reference-cv-2.md
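Steps 2-4 can be sketched as a small ranking function. It uses the dot product, which matches cosine similarity only when the embedding model outputs normalized vectors (a common but not guaranteed default; verify for your model):

```typescript
// Rank stored items against a query embedding and return the top-k.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

function topK(
  query: number[],
  items: { id: string; embedding: number[] }[],
  k: number
): { id: string; score: number }[] {
  return items
    .map((it) => ({ id: it.id, score: dot(query, it.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

For large collections, precompute and cache item embeddings; only the query embedding needs to be generated per search.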
I want to apply artistic filters to photos
Use useStyleTransfer to apply predefined artistic styles to images.
What to do:
- Choose from available artistic styles (Candy, Mosaic, Udnie, Rain Princess)
- Load the style transfer model
- Pass image URI
- Retrieve and use the stylized image
Reference: ./references/reference-cv-2.md
Model options: ./references/reference-models.md - Style Transfer section
I want to generate images from text
Use useTextToImage to create images based on text descriptions.
What to do:
- Load the text-to-image model
- Provide text description (prompt)
- Optionally specify image size and number of generation steps
- Receive generated image (may take 20-60 seconds depending on device)
Reference: ./references/reference-cv-2.md
Model options: ./references/reference-models.md - Text to Image section
Understanding Model Loading
Before using any AI model, you need to load it. Models can be loaded from three sources:
1. Bundled with app (assets folder)
- Best for small models (< 512MB)
- Available immediately without download
- Increases app installation size
2. Remote URL (downloaded on first use)
- Best for large models (> 512MB)
- Downloaded once and cached locally
- Keeps app size small
- Requires internet on first use
3. Local file system
- Maximum flexibility for user-managed models
- Requires custom download/file management UI
Model selection strategy:
- Small models (< 512MB) → Bundle with app or download from URL
- Large models (> 512MB) → Download from URL on first use with progress tracking
- Quantized models → Preferred for lower-end devices to save memory
Reference: ./references/reference-models.md - Loading Models section
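The strategy above reduces to a simple size check; the 512 MB cutoff and the progress-UI recommendation for large models are taken from this document, while the function and type names are illustrative:

```typescript
// Sketch of the model-delivery decision described above.
type Delivery = "bundle-with-app" | "download-on-first-use";

function chooseDelivery(modelSizeMB: number): { delivery: Delivery; showProgressUI: boolean } {
  const delivery: Delivery =
    modelSizeMB < 512 ? "bundle-with-app" : "download-on-first-use";
  // This document recommends a download-progress UI for models > 512 MB.
  return { delivery, showProgressUI: modelSizeMB > 512 };
}
```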
Device Constraints and Model Selection
Not all models work on all devices. Consider these constraints:
Memory limitations:
- Low-end devices: Use smaller models (135M-1.7B parameters) and quantized variants
- High-end devices: Can run larger models (3B-4B parameters)
Processing power:
- Lower-end devices: Expect longer inference times
- Audio processing requires specific sample rates (16kHz for STT, 24kHz for TTS output)
Storage:
- Large models require significant disk space
- Implement cleanup mechanisms to remove unused models
- Monitor total downloaded model size
Guidance:
- Always check model memory requirements before recommending models
- Prefer quantized model variants on lower-end devices
- Show download progress for models > 512MB
- Test on target devices before release
Reference: ./references/reference-models.md
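The guidance above can be sketched as a variant picker: take the largest variant that fits in available memory, preferring quantized builds on ties. The memory figures and names in the test are illustrative assumptions, not values from the library:

```typescript
// Heuristic model selection: largest variant that fits, preferring
// quantized variants when footprints tie. Thresholds are app-specific.
interface ModelVariant {
  name: string;
  parameters: string; // e.g. "1.7B"
  quantized: boolean;
  memoryMB: number; // approximate runtime footprint (assumed figure)
}

function pickVariant(
  variants: ModelVariant[],
  availableMemoryMB: number
): ModelVariant | undefined {
  return [...variants]
    .filter((v) => v.memoryMB <= availableMemoryMB)
    .sort(
      (a, b) =>
        b.memoryMB - a.memoryMB || Number(b.quantized) - Number(a.quantized)
    )[0];
}
```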
Important Technical Requirements
Audio Processing
Audio must be in correct sample rate for processing:
- Speech-to-Text input: 16kHz sample rate
- Text-to-Speech output: 24kHz sample rate
- Always decode/resample audio to correct rate before processing
Reference: ./references/reference-audio.md
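The resampling step can be sketched with linear interpolation, e.g. taking 44.1 kHz microphone input down to the 16 kHz expected by speech-to-text. Production code should use a proper resampler with low-pass filtering to avoid aliasing; this only shows the idea:

```typescript
// Naive linear-interpolation resampler. Sufficient as a sketch; real
// audio pipelines should low-pass filter before downsampling.
function resampleLinear(
  samples: Float32Array,
  srcRate: number,
  dstRate: number
): Float32Array {
  const outLength = Math.round((samples.length * dstRate) / srcRate);
  const out = new Float32Array(outLength);
  const step = srcRate / dstRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, samples.length - 1);
    const frac = pos - i0;
    // Interpolate between the two nearest source samples.
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}
```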
Image Processing
Images can be provided as:
- Remote URLs (http/https) - automatically cached
- Local file URIs (file://)
- Base64-encoded strings
Image preprocessing (resizing, normalization) is handled automatically by most hooks.
Reference: ./references/reference-cv.md and ./references/reference-cv-2.md
Text Tokens
Text embeddings and LLMs have maximum token limits. Text exceeding these limits will be truncated. Use useTokenizer to count tokens before processing.
Reference: ./references/reference-nlp.md
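A hedged sketch of a token-limit guard. The whitespace split below is a crude stand-in for a real tokenizer; actual counts from the library's tokenizer will differ, so treat this only as an illustration of the check:

```typescript
// Crude token-count approximation (whitespace split). A real app would
// use the library's tokenizer for accurate counts.
function countTokensApprox(text: string): number {
  return text.trim() === "" ? 0 : text.trim().split(/\s+/).length;
}

function fitsTokenLimit(text: string, maxTokens: number): boolean {
  return countTokensApprox(text) <= maxTokens;
}
```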
Core Utilities and Error Handling
The library provides core utilities for managing models and handling errors:
ResourceFetcher: Manage model downloads with pause/resume capabilities, storage cleanup, and progress tracking.
Error Handling: Use RnExecutorchError and error codes for robust error handling and user feedback.
useExecutorchModule: Low-level API for custom models not covered by dedicated hooks.
Reference: ./references/core-utilities.md
Common Troubleshooting
Model not loading: Check model source URL/path validity and sufficient device storage
Out of memory errors: Switch to smaller model or quantized variant
Poor LLM quality: Adjust temperature/top-p parameters or improve system prompt
Audio issues: Verify correct sample rate (16kHz for STT, 24kHz output for TTS)
Download failures: Implement retry logic and check network connectivity
Reference: ./references/core-utilities.md for error handling details, or specific reference file for your use case
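The retry advice above can be sketched as a small helper. A real download retry would be asynchronous and add backoff between attempts; this synchronous version, with illustrative names, only shows the control flow:

```typescript
// Retry a fallible operation up to `attempts` times, rethrowing the
// last error if every attempt fails.
function retry<T>(fn: () => T, attempts: number): T {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```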
Quick Reference by Hook
| Hook | Purpose | Reference |
|---|---|---|
| useLLM | Text generation, chat, function calling | reference-llms.md |
| useClassification | Image categorization | reference-cv.md |
| useObjectDetection | Object localization | reference-cv.md |
| useImageSegmentation | Pixel-level classification | reference-cv.md |
| useStyleTransfer | Artistic image filters | reference-cv-2.md |
| useTextToImage | Image generation | reference-cv-2.md |
| useImageEmbeddings | Image similarity/search | reference-cv-2.md |
| useOCR | Text recognition (horizontal) | reference-ocr.md |
| useVerticalOCR | Text recognition (vertical, experimental) | reference-ocr.md |
| useSpeechToText | Audio transcription | reference-audio.md |
| useTextToSpeech | Voice synthesis | reference-audio.md |
| | Voice activity detection | reference-audio.md |
| useTextEmbeddings | Text similarity/search | reference-nlp.md |
| useTokenizer | Text to tokens conversion | reference-nlp.md |
| useExecutorchModule | Custom model inference (advanced) | core-utilities.md |
Quick Checklist for Implementation
Use this when building AI features with ExecuTorch:
Planning Phase
- Identified what AI task you need (chat, vision, audio, search)
- Considered device memory constraints and target devices
- Chose appropriate model from available options
- Determined if cloud backup fallback is needed
Development Phase
- Selected correct hook for your task
- Configured model loading (bundled, remote URL, or local)
- Implemented proper error handling
- Added loading states for model operations
- Tested audio sample rates (if audio task)
- Set up resource management for large models
Testing Phase
- Tested on target minimum device
- Verified offline functionality works
- Checked memory usage doesn't exceed device limits
- Tested error handling (network, memory, invalid inputs)
- Measured inference time for acceptable UX
Deployment Phase
- Model bundling strategy decided (size/download tradeoff)
- Download progress UI implemented (if remote models)
- Version management plan for model updates
- User feedback mechanism for quality issues
Reference Files Overview
reference-llms.md
- Complete LLM hook documentation
- Functional vs Managed modes
- Tool calling implementation
- Structured output generation
reference-cv.md
- Image classification, detection, and segmentation
- Basic computer vision tasks
reference-cv-2.md
- Advanced vision tasks: style transfer, text-to-image, embeddings
- Image similarity and search
reference-ocr.md
- Horizontal and vertical text recognition
- Multi-language support
- OCR model selection
reference-audio.md
- Speech-to-text transcription
- Text-to-speech voice synthesis
- Voice activity detection
- Audio sample rate requirements
reference-nlp.md
- Text embeddings for semantic search
- Tokenization utilities
- Token limits and model compatibility
reference-models.md
- Complete list of available models
- Model loading strategies
- Model selection guidelines
- Device memory/performance considerations
core-utilities.md
- ResourceFetcher for download management
- Error handling with RnExecutorchError
- Low-level useExecutorchModule API
- Error codes reference
Troubleshooting Guide
Model not loading or crashing
- Check model source (URL valid, file exists)
- Verify device has sufficient free storage and memory
- Try bundling smaller models first
- Check error codes with RnExecutorchError
Out of memory errors
- Switch to quantized model variant (smaller file size)
- Use smaller parameter model (135M instead of 1.7B)
- Close other apps to free device memory
- Implement model unloading when not in use
Poor quality results from LLM
- Adjust generation parameters (temperature, top-p)
- Improve system prompt
- Try larger model if device supports it
- Check input preprocessing
Audio not processing
- Verify sample rate is 16kHz for STT, 24kHz output for TTS
- Check audio format compatibility
- Ensure audio buffer has data before processing
- Validate microphone permissions
Slow inference speed
- Expected on lower-end devices (especially larger models)
- Show loading indicator to user
- Consider preprocessing optimization
- Profile on actual target device
Best Practices
Model Selection
- Match model size to device capabilities
- Use quantized variants for memory-constrained devices
- Test on minimum target device before release
- Keep models updated via download mechanism
Error Handling
- Always wrap AI operations in try-catch
- Provide user-friendly error messages
- Implement fallback behavior (cloud API, simplified UX)
- Log errors for debugging
User Experience
- Show loading states during model operations
- Display download progress for large models
- Ensure app remains responsive during inference
- Consider offline-first design
Resource Management
- Unload unused models to free memory
- Implement cleanup for old cached models
- Show storage impact of AI features
- Monitor battery usage of continuous processing
Performance Optimization
- Batch requests when possible
- Preload models during idle time
- Profile actual device performance before launch
- Use appropriate model size for each task
External Resources
- Official Documentation: https://docs.swmansion.com/react-native-executorch
- HuggingFace Models: https://huggingface.co/software-mansion/collections
- GitHub Repository: https://github.com/software-mansion/react-native-executorch
- API Reference: https://docs.swmansion.com/react-native-executorch/docs/api-reference