axiom-ios-ml
iOS Machine Learning Router
You MUST use this skill for ANY on-device machine learning or speech-to-text work.
When to Use
Use this router when:
- Converting PyTorch/TensorFlow models to CoreML
- Deploying ML models on-device
- Compressing models (quantization, palettization, pruning)
- Working with large language models (LLMs)
- Implementing KV-cache for transformers
- Using MLTensor for model stitching
- Building speech-to-text features
- Transcribing audio (live or recorded)
Routing Logic
CoreML Work
Implementation patterns → /skill coreml
- Model conversion workflow
- MLTensor for model stitching
- Stateful models with KV-cache
- Multi-function models (adapters/LoRA)
- Async prediction patterns
- Compute unit selection
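The async prediction pattern in the list above — overlapping independent inferences instead of serializing them — can be sketched framework-free. This is a minimal asyncio illustration, not CoreML API: `predict` and `predict_batch` are hypothetical names, and the sleep stands in for inference latency.

```python
import asyncio

async def predict(model_name: str, input_id: int) -> str:
    # Stand-in for an async prediction call; the 10 ms sleep simulates
    # inference latency on a background compute unit.
    await asyncio.sleep(0.01)
    return f"{model_name}:{input_id}"

async def predict_batch(inputs: list[int]) -> list[str]:
    # Issue all predictions concurrently rather than one at a time,
    # mirroring the shape of CoreML's async prediction, which permits
    # concurrent calls against the same loaded model.
    return await asyncio.gather(*(predict("classifier", i) for i in inputs))

results = asyncio.run(predict_batch(list(range(8))))
print(results)
```

With serial awaits the eight calls would take roughly 8× the per-call latency; gathered, they complete in roughly one.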
API reference → /skill coreml-ref
- CoreML Tools Python API
- MLModel lifecycle
- MLTensor operations
- MLComputeDevice availability
- State management APIs
- Performance reports
Diagnostics → /skill coreml-diag
- Model won't load
- Slow inference
- Memory issues
- Compression accuracy loss
- Compute unit problems
Speech Work
Implementation patterns → /skill speech
- SpeechAnalyzer setup (iOS 26+)
- SpeechTranscriber configuration
- Live transcription
- File transcription
- Volatile vs finalized results
- Model asset management
Decision Tree
- Implementing / converting ML models? → coreml
- CoreML API reference? → coreml-ref
- Debugging ML issues (load, inference, compression)? → coreml-diag
- Speech-to-text / transcription? → speech
Anti-Rationalization
| Thought | Reality |
|---|---|
| "CoreML is just load and predict" | CoreML has compression, stateful models, compute unit selection, and async prediction. coreml covers all. |
| "My model is small, no optimization needed" | Even small models benefit from compute unit selection and async prediction. coreml has the patterns. |
| "I'll just use SFSpeechRecognizer" | iOS 26 has SpeechAnalyzer with better accuracy and offline support. speech skill covers the modern API. |
Critical Patterns
coreml:
- Model conversion (PyTorch → CoreML)
- Compression (palettization, quantization, pruning)
- Stateful KV-cache for LLMs
- Multi-function models for adapters
- MLTensor for pipeline stitching
- Async concurrent prediction
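The stateful KV-cache pattern above comes down to one idea: keys and values computed for earlier tokens persist across decoding steps, so each step only processes the newest token instead of recomputing the whole prefix. A framework-free sketch of that idea (toy scalar "attention"; all names are illustrative, not CoreML's state API):

```python
class KVCache:
    # Persistent per-sequence state, analogous to what a stateful
    # CoreML model carries between predictions.
    def __init__(self) -> None:
        self.keys: list[float] = []
        self.values: list[float] = []

    def append(self, k: float, v: float) -> None:
        self.keys.append(k)
        self.values.append(v)

def attend(query: float, cache: KVCache) -> float:
    # Toy attention: weight each cached value by key/query similarity
    # (here a plain product), then take the weighted average.
    weights = [query * k for k in cache.keys]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * v for w, v in zip(weights, cache.values)) / total

cache = KVCache()
outputs = []
for token in [1.0, 2.0, 3.0]:
    cache.append(k=token, v=token * 10)  # state persists across steps
    outputs.append(attend(query=token, cache=cache))

print(outputs)
```

Each loop iteration appends one key/value pair and attends over everything cached so far — the same shape as decoding with a KV-cache, where per-step cost stays constant in new tokens.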
coreml-diag:
- Load failures and caching
- Inference performance issues
- Memory pressure from models
- Accuracy degradation from compression
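The "accuracy degradation from compression" failure mode has a simple root cause: quantization reconstructs weights on a coarse grid, and the reconstruction error grows as bit width shrinks. A minimal framework-free illustration of linear quantization (the actual CoreML Tools compression APIs differ; this function is illustrative):

```python
def quantize_linear(weights: list[float], bits: int = 8) -> list[float]:
    # Linear (affine) quantization: map floats onto 2**bits - 1 integer
    # steps between min and max, then reconstruct. The round-trip error
    # is what surfaces as accuracy loss in a compressed model.
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return [lo + qi * scale for qi in q]

weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
for bits in (8, 4, 2):
    restored = quantize_linear(weights, bits)
    max_err = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{bits}-bit max reconstruction error: {max_err:.4f}")
```

Per-tensor error is bounded by half the quantization step, so dropping from 8-bit to 2-bit widens the step (and the worst-case error) by a factor of 85 here — which is why aggressive compression usually needs per-channel scales or palettization to stay accurate.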
speech:
- SpeechAnalyzer + SpeechTranscriber setup
- AssetInventory model management
- Live transcription with volatile results
- Audio format conversion
Example Invocations
User: "How do I convert a PyTorch model to CoreML?"
→ Invoke: /skill coreml

User: "Compress my model to fit on iPhone"
→ Invoke: /skill coreml

User: "Implement KV-cache for my language model"
→ Invoke: /skill coreml

User: "Model loads slowly on first launch"
→ Invoke: /skill coreml-diag

User: "My compressed model has bad accuracy"
→ Invoke: /skill coreml-diag

User: "Add live transcription to my app"
→ Invoke: /skill speech

User: "Transcribe audio files with SpeechAnalyzer"
→ Invoke: /skill speech

User: "What's MLTensor and how do I use it?"
→ Invoke: /skill coreml-ref