Use when an SGLang, vLLM, or TensorRT-LLM serving/model optimization task needs prior model-family PR evidence. Query and read the PR-driven history docs under model-pr-optimization-history before choosing source paths, fast paths, kernel/fusion ideas, regression risks, or validation lanes.
npx skill4agent add bbuf/sglang-auto-driven-skills model-pr-history-knowledge
python3 scripts/query.py --list
python3 scripts/query.py --framework sglang --model qwen3-core --paths-only
python3 scripts/query.py --framework sglang --model qwen3-core "fused qk norm rope"
python3 scripts/query.py --framework vllm "DeepSeek-V4 fused norm router" --limit 5
--framework sglang|vllm
--model <slug>
--lang en|zh|both
--paths-only
--limit N
scripts/query.py "<model name>"
history/model-pr-history-notes.md
sglang
vllm
deepseek-ocr, deepseek-ocr-2, deepseek-v3-r1, deepseek-v31, deepseek-v32,
deepseek-v4, ernie45, gemma4, glm-vlm-ocr, glm45, glm46-glm47, glm5-glm51,
gpt-oss, intern-s1, internvl35, jina-reranker-m0, kimi, ling25, llada21,
llama31, llama33-70b, llama4, mimo-v2-flash, minimax, mistral-small-4,
mixtral-quark-int4fp8-moe, nemotron-super, qwen-vlm-omni-asr, qwen3-coder,
qwen3-core, qwen3-next, qwen35, ring25, step35
sglang-sota-humanize-loop
analysis/root-cause.md
history/model-pr-history-notes.md
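The query invocations above can be assembled programmatically when an agent needs to issue several lookups. The following is a minimal sketch, not part of the skill itself: the helper name build_query_command is hypothetical, and it only arranges the flags shown above (--list, --framework, --model, --lang, --paths-only, --limit) into an argument list. It assumes scripts/query.py accepts those flags in this order and that --list is used on its own; it does not run the script.

```python
import shlex

# Allowed values taken from the option list above.
FRAMEWORKS = {"sglang", "vllm"}
LANGS = {"en", "zh", "both"}


def build_query_command(text=None, framework=None, model=None, lang=None,
                        paths_only=False, limit=None, list_all=False):
    """Hypothetical helper: build an argv list for scripts/query.py.

    Only flags that appear in the skill description are emitted.
    """
    cmd = ["python3", "scripts/query.py"]
    if list_all:
        # Assumption: --list is used alone, as in the first example command.
        cmd.append("--list")
        return cmd
    if framework is not None:
        if framework not in FRAMEWORKS:
            raise ValueError(f"unknown framework: {framework!r}")
        cmd += ["--framework", framework]
    if model is not None:
        cmd += ["--model", model]
    if lang is not None:
        if lang not in LANGS:
            raise ValueError(f"unknown lang: {lang!r}")
        cmd += ["--lang", lang]
    if text:
        # Free-text query is passed as one argument, so no manual quoting.
        cmd.append(text)
    if paths_only:
        cmd.append("--paths-only")
    if limit is not None:
        cmd += ["--limit", str(int(limit))]
    return cmd


if __name__ == "__main__":
    argv = build_query_command("fused qk norm rope",
                               framework="sglang", model="qwen3-core")
    # shlex.join renders a shell-safe string for logging or copy-paste.
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run from the skill's root directory. Keeping the command as a list rather than a string avoids quoting problems with multi-word queries such as "DeepSeek-V4 fused norm router".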