Search Results: kernel-selection

Found 1 Skill

AI & Machine Learning

pepperu96/hyper-mla

mla-analysis

MLA (Multi-head Latent Attention) cost models, regime analysis, and kernel selection guide. Use when: (1) deciding which kernel approach to use for a given regime, (2) comparing cost model tradeoffs between FlashMLA, FlashAttention, and MLAvar6+, (3) analyzing roofline behavior across the decode, speculative, and prefill regimes, (4) setting optimization targets, (5) understanding the MLA math and the absorption trick.
