axiom-ios-ml

iOS Machine Learning Router

You MUST use this skill for ANY on-device machine learning or speech-to-text work.

When to Use

Use this router when:
  • Converting PyTorch/TensorFlow models to CoreML
  • Deploying ML models on-device
  • Compressing models (quantization, palettization, pruning)
  • Working with large language models (LLMs)
  • Implementing KV-cache for transformers
  • Using MLTensor for model stitching
  • Building speech-to-text features
  • Transcribing audio (live or recorded)

Routing Logic

CoreML Work

Implementation patterns
/skill coreml
  • Model conversion workflow
  • MLTensor for model stitching
  • Stateful models with KV-cache
  • Multi-function models (adapters/LoRA)
  • Async prediction patterns
  • Compute unit selection
API reference
/skill coreml-ref
  • CoreML Tools Python API
  • MLModel lifecycle
  • MLTensor operations
  • MLComputeDevice availability
  • State management APIs
  • Performance reports
Diagnostics
/skill coreml-diag
  • Model won't load
  • Slow inference
  • Memory issues
  • Compression accuracy loss
  • Compute unit problems
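As background for the stateful KV-cache item above, here is a minimal, framework-free sketch of why the pattern matters for LLM decoding (a toy model in plain Python, not the CoreML API — all names are illustrative):

```python
# Illustrative sketch of why a KV-cache speeds up autoregressive decoding:
# without a cache, step t recomputes key/value projections for all t
# previous tokens; with a cache, each step computes one new projection and
# reuses the rest. (Toy model: "attention" is just a sum over cached values.)

class ToyDecoder:
    def __init__(self):
        self.kv_cache = []      # grows by one entry per generated token
        self.compute_steps = 0  # counts per-token key/value computations

    def project(self, token):
        # Stand-in for the key/value projection of a real transformer layer.
        self.compute_steps += 1
        return token * 2

    def step_with_cache(self, token):
        self.kv_cache.append(self.project(token))    # one projection per step
        return sum(self.kv_cache)

    def step_without_cache(self, tokens):
        return sum(self.project(t) for t in tokens)  # re-projects everything

cached = ToyDecoder()
uncached = ToyDecoder()
tokens = [1, 2, 3, 4]
for i, t in enumerate(tokens):
    out_cached = cached.step_with_cache(t)
    out_uncached = uncached.step_without_cache(tokens[: i + 1])

assert out_cached == out_uncached  # identical output, far less recomputation
print(cached.compute_steps, uncached.compute_steps)  # prints: 4 10
```

CoreML's stateful models keep this cache inside the model between predictions; the coreml skill covers the actual API.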

Speech Work

Implementation patterns
/skill speech
  • SpeechAnalyzer setup (iOS 26+)
  • SpeechTranscriber configuration
  • Live transcription
  • File transcription
  • Volatile vs finalized results
  • Model asset management
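The volatile-vs-finalized distinction above can be illustrated with a small, framework-free sketch (plain Python, not the Speech API — the class and its behavior here are illustrative): a live engine emits provisional results for an audio range, revises them, and eventually replaces them with a finalized result that should never be overwritten.

```python
# Toy transcript assembler: live engines emit "volatile" (provisional)
# results that may be revised, then a "finalized" result that replaces them.
# Segments are keyed by audio range; only the latest result per range is kept.

class TranscriptAssembler:
    def __init__(self):
        self.segments = {}  # (start, end) -> (text, is_final)

    def receive(self, start, end, text, is_final):
        current = self.segments.get((start, end))
        if current and current[1]:
            return  # never overwrite a finalized segment
        self.segments[(start, end)] = (text, is_final)

    def text(self):
        return " ".join(t for (_, _), (t, _) in sorted(self.segments.items()))

asm = TranscriptAssembler()
asm.receive(0.0, 1.0, "hello", is_final=False)         # volatile guess
asm.receive(0.0, 1.0, "hello world", is_final=False)   # revised guess
asm.receive(0.0, 1.0, "Hello, world.", is_final=True)  # finalized
asm.receive(0.0, 1.0, "late volatile", is_final=False) # ignored: already final
asm.receive(1.0, 2.0, "how are", is_final=False)       # next range, in flight
print(asm.text())  # prints: Hello, world. how are
```

A UI built on volatile results typically renders them in a dimmed style and swaps in the finalized text in place; the speech skill covers the actual SpeechTranscriber result handling.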

Decision Tree

  1. Implementing / converting ML models? → coreml
  2. CoreML API reference? → coreml-ref
  3. Debugging ML issues (load, inference, compression)? → coreml-diag
  4. Speech-to-text / transcription? → speech

Anti-Rationalization

Thought: "CoreML is just load and predict"
Reality: CoreML also covers compression, stateful models, compute unit selection, and async prediction. The coreml skill covers all of these.

Thought: "My model is small, no optimization needed"
Reality: Even small models benefit from compute unit selection and async prediction. coreml has the patterns.

Thought: "I'll just use SFSpeechRecognizer"
Reality: iOS 26 has SpeechAnalyzer with better accuracy and offline support. The speech skill covers the modern API.

Critical Patterns

coreml:
  • Model conversion (PyTorch → CoreML)
  • Compression (palettization, quantization, pruning)
  • Stateful KV-cache for LLMs
  • Multi-function models for adapters
  • MLTensor for pipeline stitching
  • Async concurrent prediction
coreml-diag:
  • Load failures and caching
  • Inference performance issues
  • Memory pressure from models
  • Accuracy degradation from compression
speech:
  • SpeechAnalyzer + SpeechTranscriber setup
  • AssetInventory model management
  • Live transcription with volatile results
  • Audio format conversion
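As a rough intuition for the palettization entry above (a toy sketch of the representation, not the coremltools algorithm — real palettization picks centroids with k-means and packs indices into a few bits): weights are mapped to a small lookup table, so each weight stores only a tiny table index instead of a full float.

```python
# Toy palettization: replace float weights with indices into a small palette.
# Here the palette is just the unique values, to show the representation;
# real compression would cluster many distinct floats down to 2^n centroids.

def palettize(weights):
    palette = sorted(set(weights))                 # lookup table of centroids
    indices = [palette.index(w) for w in weights]  # small per-weight index
    return palette, indices

def depalettize(palette, indices):
    return [palette[i] for i in indices]

weights = [0.5, -0.5, 0.5, 1.0, -0.5, 1.0, 0.5, 0.5]
palette, indices = palettize(weights)

# Eight floats become a 3-entry palette plus eight 2-bit indices.
print(palette)  # prints: [-0.5, 0.5, 1.0]
print(indices)  # prints: [1, 0, 1, 2, 0, 2, 1, 1]
assert depalettize(palette, indices) == weights
```

Lossless here because every weight is exactly a palette entry; with real k-means centroids the reconstruction is approximate, which is why coreml-diag covers compression accuracy loss.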

Example Invocations

User: "How do I convert a PyTorch model to CoreML?" → Invoke:
/skill coreml
User: "Compress my model to fit on iPhone" → Invoke:
/skill coreml
User: "Implement KV-cache for my language model" → Invoke:
/skill coreml
User: "Model loads slowly on first launch" → Invoke:
/skill coreml-diag
User: "My compressed model has bad accuracy" → Invoke:
/skill coreml-diag
User: "Add live transcription to my app" → Invoke:
/skill speech
User: "Transcribe audio files with SpeechAnalyzer" → Invoke:
/skill speech
User: "What's MLTensor and how do I use it?" → Invoke:
/skill coreml-ref