sglang-humanize-review

SGLang Humanize Review

Overview

Use this skill when the user asks for a human-style SGLang code review or wants review feedback that resembles SGLang maintainers instead of generic linting.

The bundled corpus was collected from `sgl-project/sglang` PRs created in 2024 and 2025, excluding PRs authored by bots or obvious coding-agent accounts. It contains 10,959 inline review threads and 18,266 human reviewer comments. Each thread preserves:

- PR metadata
- file path and code language
- GitHub `diff_hunk` code context
- original reviewer comment text
- replies grouped into the same review discussion
- original comment language, including non-English and CJK text

Read references/corpus-summary.md first for coverage, counts, top paths, and category distribution. Do not load the gzip corpus directly into context; query it with the helper script.
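The helper script remains the intended way to query the corpus. To illustrate why streaming is preferable to loading the whole archive, here is a minimal sketch of reading a gzip-compressed JSONL file one record at a time. The function names and the `path` field are assumptions made for illustration; they are not taken from the corpus schema.

```python
import gzip
import json
from itertools import islice
from typing import Iterator


def iter_threads(corpus_path: str) -> Iterator[dict]:
    """Yield one JSON object per non-empty line without reading the whole file."""
    with gzip.open(corpus_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def first_matches(corpus_path: str, path_prefix: str, limit: int = 5) -> list[dict]:
    """Return at most `limit` records whose (hypothetical) `path` field starts with the prefix."""
    matches = (
        thread
        for thread in iter_threads(corpus_path)
        if str(thread.get("path", "")).startswith(path_prefix)
    )
    # islice stops the generator early, so the rest of the file is never decoded.
    return list(islice(matches, limit))
```

Because the generator stops at `limit`, only a handful of records are ever held in memory, which is the same reason to keep the full corpus out of context.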

Corpus Tools

Search the corpus by topic, path, category, or reviewer:

bash

python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
  --query cuda --limit 5

python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
  --path python/sglang/srt --category correctness --limit 8

python3 skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py \
  --query server_args --format jsonl --limit 3

The full corpus is:

text

references/sglang-review-corpus-2024-2025.jsonl.gz

Regenerate it only when the user asks to refresh the evidence:

bash

python3 skills/sglang-humanize-review/scripts/collect_sglang_review_corpus.py \
  --repo sgl-project/sglang \
  --start-year 2024 \
  --end-year 2025 \
  --out-dir skills/sglang-humanize-review/references
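When the query script is called from another tool, the command line can be assembled programmatically. The sketch below uses only the script path and the flags shown in the commands above (`--query`, `--path`, `--category`, `--format`, `--limit`); the wrapper names `build_query_command` and `run_query` are hypothetical, and the script's output format is not assumed beyond being text on stdout.

```python
import subprocess
import sys
from typing import Optional

QUERY_SCRIPT = "skills/sglang-humanize-review/scripts/query_sglang_review_corpus.py"


def build_query_command(
    query: Optional[str] = None,
    path: Optional[str] = None,
    category: Optional[str] = None,
    fmt: Optional[str] = None,
    limit: int = 5,
) -> list[str]:
    """Build an argv list, passing only the flags that were actually provided."""
    cmd = [sys.executable, QUERY_SCRIPT]
    for flag, value in (
        ("--query", query),
        ("--path", path),
        ("--category", category),
        ("--format", fmt),
    ):
        if value:
            cmd += [flag, value]
    cmd += ["--limit", str(limit)]
    return cmd


def run_query(**kwargs) -> str:
    """Run the helper script and return its stdout; raises if the script exits non-zero."""
    result = subprocess.run(
        build_query_command(**kwargs), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `build_query_command(path="python/sglang/srt", category="correctness", limit=8)` reproduces the second command above as an argument list, which avoids shell quoting issues.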

Review Workflow

1. Inspect the actual diff first.
   - Use `git diff`, `gh pr diff`, or the patch supplied by the user.
   - Identify changed SGLang subsystems: server args, scheduler, memory/cache, model runner, attention backend, quantization, kernels, OpenAI API, metrics, docs, or tests.
2. Read `references/corpus-summary.md`.
   - Note top review surfaces and categories that overlap with the diff.
   - Preserve the original language of any relevant corpus examples; do not translate user-facing comments unless the user asks.
3. Query similar review threads.
   - Search by path first for touched SGLang modules.
   - Search by risk keyword next, for example `cuda`, `kv cache`, `server_args`, `openai`, `logprob`, `tp`, `dp`, `eagle`, `fp8`, `benchmark`, or `pytest`.
   - Prefer evidence from the same subsystem over broad keyword matches.
4. Produce a code-review response.
   - Lead with concrete findings ordered by severity.
   - Include file and line references from the reviewed diff.
   - Explain the failure mode, not just the preferred style.
   - Suggest a fix or validation step when the issue is actionable.
   - Keep nits separate from correctness, performance, or compatibility risks.
5. If no issue is found, say so clearly.
   - Mention the main residual risk and the test or benchmark coverage that would increase confidence.
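The "search by path first" step can be sketched as follows: pull the changed file paths out of a unified diff, then collapse them into directory prefixes suitable for a path query. This is a minimal illustration that assumes Git-style `diff --git a/... b/...` headers and paths without spaces; the helper names and the default prefix depth of three components are assumptions, not part of the skill.

```python
import re

# Matches Git's per-file header, e.g. "diff --git a/foo/bar.py b/foo/bar.py".
_DIFF_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$", re.MULTILINE)


def changed_paths(diff_text: str) -> list[str]:
    """Return the post-change path of every file in the diff, in order, without repeats."""
    seen: list[str] = []
    for match in _DIFF_HEADER.finditer(diff_text):
        new_path = match.group(2)
        if new_path not in seen:
            seen.append(new_path)
    return seen


def path_prefixes(paths: list[str], depth: int = 3) -> list[str]:
    """Collapse file paths to their first `depth` directory components.

    Files with no parent directory (e.g. a top-level README) are kept as-is.
    """
    prefixes = set()
    for path in paths:
        directories = path.split("/")[:-1]
        prefixes.add("/".join(directories[:depth]) or path)
    return sorted(prefixes)
```

Each resulting prefix, such as `python/sglang/srt`, can then be passed to the query script's `--path` option before falling back to broader keyword searches.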

SGLang Review Heuristics From The Corpus

Prioritize these risks because they recur heavily in the 2024-2025 human review threads:

Model and quantization behavior: model config drift, tokenizer assumptions, FP8/INT4 quantization paths, MoE routing, speculative decoding, and attention backend compatibility.
Correctness before style: edge cases, failed assertions, unexpected error codes, shape/dtype mismatches, state cleanup, and silent behavior changes.
GPU and kernel paths: CUDA graph capture, Triton/CUDA kernels, FlashInfer and FlashAttention behavior, launch conditions, SM compatibility, and fallback behavior.
Server API compatibility: OpenAI-compatible request/response shapes, `server_args`, CLI defaults, endpoint behavior, streaming, and backward compatibility.
Memory and cache lifecycle: KV cache accounting, radix cache resets, memory pool ownership, eviction, fragmentation, and OOM behavior.
Distributed runtime: TP/DP/PP/EP rank assumptions, NCCL paths, synchronization, worker state, race conditions, and hang risk.
Tests and benchmarks: ask for targeted tests when behavior changes, and ask for benchmark evidence when a change claims a performance gain or touches a hot path.
Docs and examples: keep docs aligned with CLI defaults, endpoint names, model support, install steps, and version-specific behavior.
Observability: review metrics, logs, warning levels, traceability, and error messages when operational behavior changes.
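As a rough aid for deciding which of these risk areas a change touches, the sketch below tags a snippet of diff text by keyword. The keyword table is a hypothetical, deliberately small sample drawn from the terms listed above, not something derived from the corpus, and a keyword hit is only a prompt to look closer, never a finding in itself.

```python
import re

# Hypothetical keyword sample; extend or replace it to fit the diff under review.
RISK_KEYWORDS = {
    "gpu_kernel": {"cuda", "triton", "flashinfer", "kernel"},
    "memory_cache": {"kv", "radix", "evict", "oom"},
    "distributed": {"tp", "dp", "pp", "ep", "nccl", "rank"},
    "server_api": {"server_args", "openai", "streaming", "endpoint"},
    "quantization": {"fp8", "int4", "quantization", "moe"},
}


def tag_risks(text: str) -> list[str]:
    """Return the sorted risk areas whose keywords appear as whole tokens in the text."""
    lowered = text.lower()
    # Tokenize twice: once splitting on underscores (so "tp_size" yields "tp"),
    # once keeping them (so "server_args" survives as a single token).
    tokens = set(re.findall(r"[a-z0-9]+", lowered)) | set(re.findall(r"[a-z0-9_]+", lowered))
    return sorted(area for area, words in RISK_KEYWORDS.items() if tokens & words)
```

Matching whole tokens rather than substrings keeps short keywords such as `tp` from firing on unrelated words like `http` or `output`.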

Review Style

Mirror human SGLang review habits:

Be terse but specific.
Prefer a question when intent is ambiguous.
Call out production-facing behavior changes explicitly.
Do not invent a corpus precedent; query the corpus when using it as evidence.
Keep multilingual comments intact. If a relevant thread is Chinese or another language, use it as-is for evidence and answer in the user's language unless the user asks otherwise.
Avoid cargo-culting old comments. Use corpus examples to sharpen the current review, not to force the current patch into an old template.

Output Contract

For a normal review, return:

Findings first, ordered by severity, with file/line references.
Open questions or assumptions.
Test or benchmark gaps.
A short summary only after findings.

For a review-prep pass before the user opens a PR, return:

likely reviewer concerns
missing tests or benchmark evidence
suggested patch cleanup
corpus queries used

For a corpus-backed explanation, include the query terms and summarize the matched review behavior without dumping long comment bodies.
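The normal-review layout can be sketched as a small renderer: findings sorted by severity with file/line references, nits kept apart, then open questions, test gaps, and the summary last. The severity labels and the `Finding` and `render_review` names are assumptions for illustration; the contract above does not prescribe specific labels.

```python
from dataclasses import dataclass

# Hypothetical severity scale; lower number means more severe.
SEVERITY_ORDER = {"blocker": 0, "major": 1, "minor": 2, "nit": 3}


@dataclass(frozen=True)
class Finding:
    severity: str
    file: str
    line: int
    message: str


def render_review(findings, open_questions, test_gaps, summary) -> str:
    """Render findings first (most severe first), nits separately, summary last."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    main = [f for f in ordered if f.severity != "nit"]
    nits = [f for f in ordered if f.severity == "nit"]

    lines = ["Findings:"]
    if not ordered:
        lines.append("- No issues found.")
    lines += [f"- [{f.severity}] {f.file}:{f.line} {f.message}" for f in main]
    if nits:
        lines.append("Nits:")
        lines += [f"- {f.file}:{f.line} {f.message}" for f in nits]
    lines.append("Open questions:")
    lines += [f"- {q}" for q in open_questions]
    lines.append("Test or benchmark gaps:")
    lines += [f"- {g}" for g in test_gaps]
    lines.append(f"Summary: {summary}")
    return "\n".join(lines)
```

Keeping the summary as the final line enforces the "short summary only after findings" rule by construction.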
