Found 5 Skills
Use when deploying ANY machine learning model on-device, converting models to CoreML, compressing models, or implementing speech-to-text. Covers CoreML conversion, MLTensor, model compression (quantization/palettization/pruning), stateful models, KV-cache, multi-function models, async prediction, SpeechAnalyzer, SpeechTranscriber.
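A minimal conversion sketch with coremltools (the model architecture, input shape, and file name below are illustrative assumptions, not part of this skill's workflow):

```python
import torch
import coremltools as ct

# Illustrative model: any traceable torch.nn.Module converts the same way.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
).eval()
example = torch.rand(1, 128)
traced = torch.jit.trace(model, example)

# Convert the traced graph to a Core ML package for on-device inference.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="features", shape=example.shape)],
    minimum_deployment_target=ct.target.iOS17,
)
mlmodel.save("Classifier.mlpackage")
```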
Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.
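As a concrete baseline for the one-shot methods listed above, here is a sketch of plain magnitude pruning at 50% unstructured sparsity in PyTorch. Wanda and SparseGPT use activation-aware criteria; this shows only the simpler weight-magnitude variant, applied without retraining:

```python
import torch
import torch.nn.utils.prune as prune

# One-shot L1 (magnitude) pruning of all Linear layers to the given sparsity.
def prune_linear_layers(model: torch.nn.Module, amount: float = 0.5) -> torch.nn.Module:
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # bake the mask into the weights
    return model

model = prune_linear_layers(torch.nn.Sequential(torch.nn.Linear(512, 512)))
sparsity = (model[0].weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.2f}")  # ~0.50
```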
Guidance for training FastText text classification models under model-size and accuracy constraints. Use when training FastText models, tuning hyperparameters, or balancing the trade-off between model size and classification accuracy.
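A minimal sketch of the train-then-quantize loop with the fastText Python API (hyperparameter values and file names are illustrative assumptions, not tuned recommendations):

```python
import fasttext

# train.txt / valid.txt: one example per line, "__label__<class> <text>"
model = fasttext.train_supervised(
    input="train.txt",
    lr=0.5,
    epoch=25,
    wordNgrams=2,
    dim=100,  # smaller embedding dimension -> smaller model
)
print(model.test("valid.txt"))  # (n_examples, precision@1, recall@1)

# Product quantization trades a little accuracy for a much smaller .ftz model.
model.quantize(input="train.txt", qnorm=True, retrain=True, cutoff=100000)
model.save_model("classifier.ftz")
```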
ML inference latency optimization, model compression, distillation, caching strategies, and edge deployment patterns. Use when optimizing inference performance, reducing model size, or deploying ML at the edge.
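Among these patterns, caching repeated predictions is often the cheapest latency win. A small sketch, assuming a hypothetical run_model call standing in for any expensive inference step:

```python
from functools import lru_cache

# Hypothetical stand-in for an expensive model call.
def run_model(text: str) -> dict:
    return {"label": "positive", "score": 0.97}

@lru_cache(maxsize=4096)
def _cached_predict(key: str) -> dict:
    return run_model(key)

def predict(text: str) -> dict:
    # Normalize before building the cache key so near-duplicate requests hit the cache.
    return _cached_predict(text.strip().lower())

print(predict("Great product!"))
print(predict("  GREAT PRODUCT!  "))  # identical normalized key -> served from cache
```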
Efficient AI techniques including model compression, quantization, pruning, knowledge distillation, and hardware-aware optimization for production systems.
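Of these techniques, post-training dynamic quantization is often the first to try because it needs no retraining. A sketch using PyTorch's built-in API (the model architecture here is an illustrative assumption):

```python
import io
import torch

# Weights are stored as int8; activations are quantized on the fly at inference time.
model = torch.nn.Sequential(
    torch.nn.Linear(768, 768),
    torch.nn.ReLU(),
    torch.nn.Linear(768, 10),
).eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

def serialized_mb(m: torch.nn.Module) -> float:
    # Compare on-disk size of the serialized state dicts.
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32: {serialized_mb(model):.2f} MB, int8: {serialized_mb(quantized):.2f} MB")
```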