Search Results: sgl-kernel

Found 2 Skills

add-sgl-kernel

Step-by-step tutorial for adding a heavyweight AOT CUDA/C++ kernel to sgl-kernel (including tests & benchmarks)

AI & Machine Learningslowlyc/agent-gpu-skills

sglang-skill

Develop, debug, and optimize SGLang LLM serving engine. Use when the user mentions SGLang, sglang, srt, sgl-kernel, LLM serving, model inference, KV cache, attention backend, FlashInfer, MLA, MoE routing, speculative decoding, disaggregated serving, TP/PP/EP, radix cache, continuous batching, chunked prefill, CUDA graph, model loading, quantization FP8/GPTQ/AWQ, JIT kernel, triton kernel SGLang, or asks about serving LLMs with SGLang.

🇺🇸|EnglishTranslated

1 scripts/Attention