sglang-humanize-review
SGLang Humanize Review
Overview
Use this skill when the user asks for a human-style SGLang code review or wants
review feedback that resembles SGLang maintainers instead of generic linting.
The bundled corpus was collected from sgl-project/sglang PRs created in 2024
and 2025, excluding PRs authored by bots or obvious coding-agent accounts. It
contains 10,959 inline review threads and 18,266 human reviewer comments. Each
thread preserves:
- PR metadata
- file path and code language
- GitHub diff_hunk code context
- original reviewer comment text
- replies grouped into the same review discussion
- original comment language, including non-English and CJK text
Read references/corpus-summary.md first for
coverage, counts, top paths, and category distribution. Do not load the gzip
corpus directly into context; query it with the helper script.
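As a rough illustration of why the corpus is queried rather than loaded whole, the sketch below streams a gzip-compressed JSONL file one record at a time and stops as soon as enough threads match. The field name path is an assumption made for illustration; the real schema is whatever the helper scripts emit, and the helper script remains the intended way to query the corpus.

```python
import gzip
import json
from typing import Iterator


def iter_threads(corpus_path: str) -> Iterator[dict]:
    """Yield one review thread per JSONL line without reading the whole file."""
    with gzip.open(corpus_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def first_matches(corpus_path: str, path_prefix: str, limit: int = 5) -> list[dict]:
    """Return at most `limit` threads whose file path starts with `path_prefix`."""
    matches: list[dict] = []
    for thread in iter_threads(corpus_path):
        # "path" is a hypothetical field name, not confirmed by the corpus docs.
        if str(thread.get("path", "")).startswith(path_prefix):
            matches.append(thread)
            if len(matches) >= limit:
                break  # stop early so only a handful of records reach the caller
    return matches
```

Because the reader is a generator and the search breaks at the limit, only the matching records are ever held in memory, which is the same property the helper script's --limit flag provides.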
Corpus Tools
Search the corpus by topic, path, category, or reviewer:
bash
python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
--query cuda --limit 5
python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
--path python/sglang/srt --category correctness --limit 8
python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
--query server_args --format jsonl --limit 3
The full corpus is:
text
references/sglang-review-corpus-2024-2025.jsonl.gz
Regenerate it only when the user asks to refresh the evidence:
bash
python3 skills/sglang-humanize-review/scripts/collect_sglang_review_corpus.py \
--repo sgl-project/sglang \
--start-year 2024 \
--end-year 2025 \
--out-dir skills/sglang-humanize-review/references
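When several queries are issued in a row, it can help to assemble the command line programmatically. The following is a minimal sketch that uses only the flags shown in the commands above (--query, --path, --category, --format, --limit); the function name build_query_command and its defaults are illustrative assumptions, not part of the skill.

```python
import shlex
from typing import Optional

QUERY_SCRIPT = "skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py"


def build_query_command(
    query: Optional[str] = None,
    path: Optional[str] = None,
    category: Optional[str] = None,
    fmt: Optional[str] = None,
    limit: int = 5,
) -> list[str]:
    """Build an argv list for the query helper; unset filters are omitted."""
    cmd = ["python3", QUERY_SCRIPT]
    if query:
        cmd += ["--query", query]
    if path:
        cmd += ["--path", path]
    if category:
        cmd += ["--category", category]
    if fmt:
        cmd += ["--format", fmt]
    cmd += ["--limit", str(limit)]
    return cmd


# Print the shell form of one of the documented queries.
print(shlex.join(build_query_command(path="python/sglang/srt", category="correctness", limit=8)))
```

The returned list can be handed to subprocess.run(cmd, check=True) from the repository root; passing a list rather than a shell string avoids quoting problems with multi-word queries such as kv cache.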
Review Workflow
- Inspect the actual diff first.
  - Use git diff, gh pr diff, or the patch supplied by the user.
  - Identify changed SGLang subsystems: server args, scheduler, memory/cache, model runner, attention backend, quantization, kernels, OpenAI API, metrics, docs, or tests.
- Use the corpus summary.
  - Read references/corpus-summary.md.
  - Note top review surfaces and categories that overlap with the diff.
  - Preserve the original language of any relevant corpus examples; do not translate user-facing comments unless the user asks.
- Query similar review threads.
  - Search by path first for touched SGLang modules.
  - Search by risk keyword next, for example cuda, kv cache, server_args, openai, logprob, tp, dp, eagle, fp8, benchmark, or pytest.
  - Prefer evidence from the same subsystem over broad keyword matches.
- Produce a code-review response.
  - Lead with concrete findings ordered by severity.
  - Include file and line references from the reviewed diff.
  - Explain the failure mode, not just the preferred style.
  - Suggest a fix or validation step when the issue is actionable.
  - Keep nits separate from correctness, performance, or compatibility risks.
- If no issue is found, say so clearly.
  - Mention the main residual risk and the test or benchmark coverage that would increase confidence.
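The "search by path first" step can be sketched as follows: pull the post-change file paths out of a git-style unified diff, then reduce them to directory prefixes suitable for a path query. This is a simplified illustration under stated assumptions; the function names are hypothetical, and the parser only recognizes the standard "--- " / "+++ " header pair.

```python
import posixpath


def touched_paths(diff_text: str) -> list[str]:
    """Collect post-change file paths from the '+++ b/<path>' headers of a unified diff."""
    paths: list[str] = []
    previous = ""
    for line in diff_text.splitlines():
        # A file header is a '--- ' line immediately followed by a '+++ ' line.
        if line.startswith("+++ ") and previous.startswith("--- "):
            target = line[4:].split("\t")[0].strip()
            if target != "/dev/null":  # deleted files have no post-change path
                if target.startswith("b/"):
                    target = target[2:]
                if target not in paths:
                    paths.append(target)
        previous = line
    return paths


def path_prefixes(paths: list[str]) -> list[str]:
    """Reduce file paths to unique directory prefixes, keeping first-seen order."""
    prefixes: list[str] = []
    for path in paths:
        prefix = posixpath.dirname(path) or path  # top-level files keep their own name
        if prefix not in prefixes:
            prefixes.append(prefix)
    return prefixes
```

Each prefix can then be passed to the query helper's --path flag before falling back to broader keyword searches.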
SGLang Review Heuristics From The Corpus
Prioritize these risks because they recur heavily in the 2024-2025 human review
threads:
- Model and quantization behavior: model config drift, tokenizer assumptions, FP8/INT4 quantization paths, MoE routing, speculative decoding, and attention backend compatibility.
- Correctness before style: edge cases, failed assertions, unexpected error codes, shape/dtype mismatches, state cleanup, and silent behavior changes.
- GPU and kernel paths: CUDA graph capture, Triton/CUDA kernels, FlashInfer and FlashAttention behavior, launch conditions, SM compatibility, and fallback behavior.
- Server API compatibility: OpenAI-compatible request/response shapes, server_args, CLI defaults, endpoint behavior, streaming, and backward compatibility.
- Memory and cache lifecycle: KV cache accounting, radix cache resets, memory pool ownership, eviction, fragmentation, and OOM behavior.
- Distributed runtime: TP/DP/PP/EP rank assumptions, NCCL paths, synchronization, worker state, race conditions, and hang risk.
- Tests and benchmarks: ask for targeted tests when behavior changes, and ask for benchmark evidence when a change claims performance or touches a hot path.
- Docs and examples: keep docs aligned with CLI defaults, endpoint names, model support, install steps, and version-specific behavior.
- Observability: review metrics, logs, warning levels, traceability, and error messages when operational behavior changes.
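One way to turn the risk list above into a first-pass triage is a keyword scan over the diff text. The sketch below is illustrative only: the area names, the grouping, and the choice of terms are assumptions drawn from the bullets above, not something the corpus or the helper scripts define, and a hit is a prompt to look closer rather than a finding.

```python
import re

# Hand-picked terms from the heuristics list; grouping and spelling are assumptions.
RISK_TERMS = {
    "gpu_kernel": ["cuda graph", "triton", "flashinfer", "flashattention"],
    "memory_cache": ["kv cache", "radix cache", "eviction", "oom"],
    "distributed": ["nccl", "rank"],
    "quantization": ["fp8", "int4", "moe"],
}


def risk_areas(diff_text: str) -> dict[str, list[str]]:
    """Map each risk area to the terms that appear as whole words in the text."""
    # Lowercase and turn every non-alphanumeric run into a space, so that
    # identifiers like kv_cache or cuda-graph match "kv cache" / "cuda graph".
    normalized = " " + re.sub(r"[^a-z0-9]+", " ", diff_text.lower()) + " "
    hits: dict[str, list[str]] = {}
    for area, terms in RISK_TERMS.items():
        found = [term for term in terms if f" {term} " in normalized]
        if found:
            hits[area] = found
    return hits
```

The matched terms double as candidates for the --query flag of the query helper, so the scan feeds directly into the "search by risk keyword" step of the workflow.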
Review Style
Mirror human SGLang review habits:
- Be terse but specific.
- Prefer a question when intent is ambiguous.
- Call out production-facing behavior changes explicitly.
- Do not invent a corpus precedent; query the corpus when using it as evidence.
- Keep multilingual comments intact. If a relevant thread is Chinese or another language, use it as-is for evidence and answer in the user's language unless the user asks otherwise.
- Avoid cargo-culting old comments. Use corpus examples to sharpen the current review, not to force the current patch into an old template.
Output Contract
For a normal review, return:
- Findings first, ordered by severity, with file/line references.
- Open questions or assumptions.
- Test or benchmark gaps.
- A short summary only after findings.
For a review-prep pass before the user opens a PR, return:
- likely reviewer concerns
- missing tests or benchmark evidence
- suggested patch cleanup
- corpus queries used
For a corpus-backed explanation, include the query terms and summarize the
matched review behavior without dumping long comment bodies.
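The normal-review contract above can be sketched as a small renderer: findings sorted by severity with file and line references, then open questions, then test gaps, and the summary last. The severity labels, the Finding type, and the exact line format are assumptions chosen for illustration; the contract itself only fixes the ordering.

```python
from dataclasses import dataclass

# Assumed severity labels; lower rank sorts first, so nits land at the end.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2, "nit": 3}


@dataclass
class Finding:
    severity: str
    file: str
    line: int
    message: str


def render_review(findings, open_questions, test_gaps, summary):
    """Render a review as text: findings, open questions, test gaps, then summary."""
    out = ["Findings:"]
    if findings:
        # sorted() is stable, so findings of equal severity keep their input order.
        ordered = sorted(findings, key=lambda f: SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)))
        out += [f"- [{f.severity}] {f.file}:{f.line} {f.message}" for f in ordered]
    else:
        out.append("- No issues found.")
    if open_questions:
        out.append("Open questions or assumptions:")
        out += [f"- {q}" for q in open_questions]
    if test_gaps:
        out.append("Test or benchmark gaps:")
        out += [f"- {g}" for g in test_gaps]
    out.append(f"Summary: {summary}")
    return "\n".join(out)
```

Sorting on a rank table rather than on the label text keeps nits after correctness, performance, and compatibility findings, and an unknown label sorts last instead of raising.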