devtu-docs-quality
Documentation Quality Assurance
Systematic documentation quality system combining automated validation scripts with ToolUniverse-specific structural audits.
When to Use
- Pre-release documentation review
- After major refactoring (commands, APIs, tool counts changed)
- User reports confusing or outdated documentation
- Circular navigation or structural problems suspected
- Want to establish automated validation pipeline
Approach: Two-Phase Strategy
Phase A: Automated Validation (15-20 min)
- Create validation scripts for systematic detection
- Test commands, links, terminology consistency
- Priority-based fixes (blockers → polish)
Phase B: ToolUniverse-Specific Audit (20-25 min)
- Circular navigation checks
- MCP configuration duplication
- Tool count consistency
- Auto-generated file conflicts
Phase A: Automated Validation
A1. Build Validation Script
Create `scripts/validate_documentation.py`:
```python
#!/usr/bin/env python3
"""Documentation validator for ToolUniverse"""
import glob
import re
import sys
from pathlib import Path

DOCS_ROOT = Path("docs")

# ToolUniverse-specific patterns: (deprecated regex, suggested replacement)
DEPRECATED_PATTERNS = [
    (r"python -m tooluniverse\.server", "tooluniverse-server"),
    (r"600\+?\s+tools", "1000+ tools"),
    (r"750\+?\s+tools", "1000+ tools"),
]


def is_false_positive(match, content):
    """Smart context checking to avoid false positives"""
    start = max(0, match.start() - 100)
    end = min(len(content), match.end() + 100)
    context = content[start:end].lower()
    # Skip if discussing deprecation itself
    if any(kw in context for kw in ['deprecated', 'old version', 'migration']):
        return True
    # Skip technical values (ports, dimensions, etc.)
    # Note: plain substring match, so 'port' also matches words such as 'support'
    if any(kw in context for kw in ['width', 'height', 'port', '":"']):
        return True
    return False


def validate_file(filepath):
    """Check one file for issues"""
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()
    issues = []
    # Check deprecated patterns
    for old_pattern, new_text in DEPRECATED_PATTERNS:
        for match in re.finditer(old_pattern, content):
            if is_false_positive(match, content):
                continue
            # Convert the character offset into a 1-based line number
            line_num = content[:match.start()].count('\n') + 1
            issues.append({
                'file': filepath,
                'line': line_num,
                'severity': 'HIGH',
                'found': match.group(),
                'suggestion': new_text
            })
    return issues


# Scan all docs
all_issues = []
for doc_file in glob.glob(str(DOCS_ROOT / "**/*.md"), recursive=True):
    all_issues.extend(validate_file(doc_file))
for doc_file in glob.glob(str(DOCS_ROOT / "**/*.rst"), recursive=True):
    all_issues.extend(validate_file(doc_file))

# Report
if all_issues:
    print(f"❌ Found {len(all_issues)} issues\n")
    for issue in all_issues:
        print(f"{issue['file']}:{issue['line']} [{issue['severity']}]")
        print(f" Found: {issue['found']}")
        print(f" Should be: {issue['suggestion']}\n")
    sys.exit(1)
else:
    print("✅ Documentation validation passed")
    sys.exit(0)
```
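To see how the context window suppresses false positives, the same idea can be tried on in-memory strings. This is a minimal sketch rather than part of the validator: `flagged` and its `window` parameter are hypothetical names, and only the `600+ tools` pattern and the deprecation keywords are covered.

```python
import re

# One of the deprecated patterns from the validator, with the "+" escaped
PATTERN = re.compile(r"600\+?\s+tools")
SKIP_KEYWORDS = ("deprecated", "old version", "migration")


def flagged(text, window=100):
    """Return matches that are not surrounded by deprecation-related wording."""
    hits = []
    for m in PATTERN.finditer(text):
        # Look at `window` characters on each side of the match
        context = text[max(0, m.start() - window):m.end() + window].lower()
        if any(kw in context for kw in SKIP_KEYWORDS):
            continue
        hits.append(m.group())
    return hits
```

A sentence such as "ToolUniverse ships 600+ tools." is reported, while "Migration note: docs used to say 600+ tools." is skipped because the word "migration" falls inside the window.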
A2. Command Accuracy Check
Test that commands in docs actually work:
```bash
# Extract and test commands (lines that start with a "$ " prompt)
grep -r '^\s*\$\s' docs/ | while read -r line; do
    # Drop everything up to the prompt, then keep the first word
    cmd=$(echo "$line" | sed 's/.*\$ //' | cut -d' ' -f1)
    if ! command -v "$cmd" &> /dev/null; then
        echo "❌ Command not found: $cmd in $line"
    fi
done
```
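The same check can be folded into the Python validator. The following is a rough sketch under the assumption that documented commands appear on lines beginning with a `$ ` prompt; `find_missing_commands` is a hypothetical helper, and it relies only on the standard library's `shutil.which`.

```python
import re
import shutil

# Matches documentation lines of the form "$ command args..."
PROMPT_LINE = re.compile(r"^\s*\$\s+(\S+)")


def find_missing_commands(text):
    """Return commands from "$ ..." lines that are not found on PATH."""
    missing = []
    for line in text.splitlines():
        m = PROMPT_LINE.match(line)
        if m and shutil.which(m.group(1)) is None:
            missing.append(m.group(1))
    return missing
```

Like the shell version, this only confirms that the executable exists in the current environment; it does not run the command or validate its arguments.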
A3. Link Integrity Check
For RST docs:
```python
import glob
import re
from pathlib import Path


def check_rst_links(docs_root):
    """Validate :doc: references"""
    pattern = r':doc:`([^`]+)`'
    for rst_file in glob.glob(f"{docs_root}/**/*.rst", recursive=True):
        with open(rst_file, encoding='utf-8') as f:
            content = f.read()
        for match in re.finditer(pattern, content):
            ref = match.group(1)
            # Handle the ":doc:`Title <target>`" form
            titled = re.search(r'<([^>]+)>\s*$', ref)
            if titled:
                ref = titled.group(1)
            # A leading "/" is relative to the docs root; otherwise Sphinx
            # resolves the target relative to the referencing file
            base = Path(docs_root) if ref.startswith('/') else Path(rst_file).parent
            ref = ref.lstrip('/')
            # Check if target exists
            possible = [f"{ref}.rst", f"{ref}.md", f"{ref}/index.rst"]
            if not any((base / p).exists() for p in possible):
                print(f"❌ Broken link in {rst_file}: {ref}")
```
A4. Terminology Consistency
Track variations and standardize:
```python
# Define standard terms
TERMINOLOGY = {
    'api_endpoint': ['endpoint', 'url', 'route', 'path'],
    'tool_count': ['tools', 'resources', 'integrations'],
}


def check_terminology(content):
    """Find inconsistent terminology"""
    for standard, variations in TERMINOLOGY.items():
        counts = {v: content.lower().count(v) for v in variations}
        # Flag a concept when more than two of its variations appear
        if len([c for c in counts.values() if c > 0]) > 2:
            return f"Inconsistent terminology: {counts}"
    return None
```
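Counting with `str.count` matches substrings, so `route` is also counted inside `reroute`. If that turns out to be noisy, a whole-word count is one possible refinement. This is only a sketch: `count_terms` is a hypothetical helper and it treats a word as a run of ASCII letters and underscores.

```python
import re
from collections import Counter


def count_terms(content, variations):
    """Count whole-word, case-insensitive occurrences of each variation."""
    words = Counter(re.findall(r"[a-z_]+", content.lower()))
    return {v: words[v] for v in variations}
```

For the text "The URL route; reroute the url." this gives two hits for `url` and one for `route`.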
Phase B: ToolUniverse-Specific Audit
B1. Circular Navigation Check
Issue: Documentation pages that reference each other in loops.
Check manually:
```bash
# Find cross-references (single quotes keep the backtick literal)
grep -r ':doc:`' docs/*.rst | grep -E "(quickstart|getting_started|installation)"
```
**Checklist**:
- [ ] Is there a clear "Start Here" on `docs/index.rst`?
- [ ] Does navigation follow linear path: index → quickstart → getting_started → guides?
- [ ] No "you should have completed X first" statements that create dependency loops?
**Common patterns to fix**:
- `quickstart.rst` → "See getting_started"
- `getting_started.rst` → "Complete quickstart first"
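The manual trace can be backed by a small graph search. The sketch below assumes pages are identified by flat document names (no relative paths) and are passed in as a dictionary of name to RST text; `build_graph` and `find_cycle` are hypothetical helpers, not existing ToolUniverse code.

```python
import re
from collections import defaultdict

# Captures the target of :doc:`target` and :doc:`Title <target>`
DOC_REF = re.compile(r":doc:`(?:[^`<]*<)?/?([^`>]+)>?`")


def build_graph(pages):
    """Map each page name to the set of known pages it references."""
    graph = defaultdict(set)
    for name, text in pages.items():
        for target in DOC_REF.findall(text):
            if target in pages and target != name:
                graph[name].add(target)
    return graph


def find_cycle(graph):
    """Return one reference loop as a list of page names, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a page that is already on the current path: a loop
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None
```

A reported loop still needs human review, since some back-references are harmless "see also" links rather than prerequisites.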
B2. Duplicate Content Check
Common duplicates in ToolUniverse:
- Multiple FAQs: `docs/faq.rst` and `docs/help/faq.rst`
- Getting started: `docs/installation.rst`, `docs/quickstart.rst`, `docs/getting_started.rst`
- MCP configuration: All files in `docs/guide/building_ai_scientists/`
Detection:
```bash
# Find MCP config duplication
rg "MCP.*configuration" docs/ -l | wc -l
rg "pip install tooluniverse" docs/ -l | wc -l
```
**Action**: Consolidate or clearly differentiate
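When the file names matter more than the count, a Python equivalent of `rg -l` can list them. A minimal sketch, assuming the pages have already been read into a dictionary of path to text; `files_mentioning` is a hypothetical name.

```python
import re


def files_mentioning(pages, pattern):
    """Return sorted names of pages whose text matches the regex pattern."""
    rx = re.compile(pattern)
    return sorted(name for name, text in pages.items() if rx.search(text))
```

A result with more than one or two entries for the same phrase is a hint that the content should be consolidated or clearly differentiated.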
B3. Tool Count Consistency
Standard: Use "1000+ tools" consistently.
Detection:
```bash
# Find all tool count mentions
rg "[0-9]+\+?\s+(tools|resources|integrations)" docs/ --no-filename | sort -u
```
**Check**:
- [ ] Are different numbers used (600, 750, 1195)?
- [ ] Is "1000+ tools" used consistently?
- [ ] Exact counts avoided in favor of "1000+"?
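To list the distinct counts rather than whole matching lines, the same pattern can be applied in Python. This is a sketch only: `tool_count_variants` is a hypothetical helper, and the regex additionally tolerates thousands separators, which is an assumption about how counts may be written.

```python
import re

# A number (optionally with commas and a trailing "+") followed by a counted noun
COUNT = re.compile(r"(\d[\d,]*\+?)\s+(tools|resources|integrations)")


def tool_count_variants(text):
    """Return the distinct count strings found in the text, sorted."""
    return sorted({m.group(1) for m in COUNT.finditer(text)})
```

More than one entry in the result means the documentation is quoting inconsistent numbers.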
B4. Auto-Generated File Headers
Auto-generated directories:
- `docs/tools/*_tools.rst` (from `generate_config_index.py`)
- `docs/api/*.rst` (from `sphinx-apidoc`)
Required header:
```rst
.. AUTO-GENERATED - DO NOT EDIT MANUALLY
.. Generated by: docs/generate_config_index.py
.. Last updated: 2024-02-05
..
.. To modify, edit source files and regenerate.
```
Check:
```bash
head -5 docs/tools/*_tools.rst | grep "AUTO-GENERATED"
```
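The `grep` above shows which files contain the marker; finding the files that lack it is easier in Python. A minimal sketch, with `has_generated_header` as a hypothetical helper that receives the file text:

```python
def has_generated_header(text, marker="AUTO-GENERATED", lines=5):
    """True if the marker appears within the first `lines` lines of the text."""
    return any(marker in line for line in text.splitlines()[:lines])
```

It could be applied to each path from `Path("docs/tools").glob("*_tools.rst")`, reporting every file for which it returns `False`.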
B5. CLI Tools Documentation
Check pyproject.toml for all CLIs:
```bash
grep -A 20 "\[project.scripts\]" pyproject.toml
```
Common undocumented:
- `tooluniverse-expert-feedback`
- `tooluniverse-expert-feedback-web`
- `generate-mcp-tools`
Action: Ensure all are documented in `docs/reference/cli_tools.rst`
B6. Environment Variables
Discovery:
```bash
# Find all env vars in code
rg "os.getenv|os.environ" src/tooluniverse/ -o | sort -u
rg "TOOLUNIVERSE_[A-Z_]+" src/tooluniverse/ -o | sort -u
```
**Categories to document**:
- Cache: `TOOLUNIVERSE_CACHE_*`
- Logging: `TOOLUNIVERSE_LOG_*`
- LLM: `TOOLUNIVERSE_LLM_*`
- API keys: `*_API_KEY`
**Check**:
- [ ] Does `docs/reference/environment_variables.rst` exist?
- [ ] Are variables categorized?
- [ ] Each has: default, description, example?
- [ ] Is there `.env.template` at project root?
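The discovery step and the documentation check can be combined into a set difference. A rough sketch, covering only the `TOOLUNIVERSE_` prefix; `undocumented_env_vars` is a hypothetical helper, and the variable names used to exercise it would be illustrative rather than taken from the codebase.

```python
import re

# Variables following the TOOLUNIVERSE_ naming convention
ENV_VAR = re.compile(r"\bTOOLUNIVERSE_[A-Z0-9_]+\b")


def undocumented_env_vars(source_text, docs_text):
    """Return variables used in the source but absent from the docs."""
    used = set(ENV_VAR.findall(source_text))
    documented = set(ENV_VAR.findall(docs_text))
    return sorted(used - documented)
```

API keys that follow the `*_API_KEY` pattern would need a second expression, since they do not share the prefix.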
B7. ToolUniverse-Specific Jargon
Terms to define on first use:
- Tool Specification
- EFO ID
- MCP, SMCP
- Compact Mode
- Tool Finder
- AI Scientist
Check:
- Is there `docs/glossary.rst`?
- Terms defined inline with `:term:` references?
- Glossary linked from main index?
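Glossary coverage for the terms listed above can be checked mechanically. This sketch assumes each glossary entry puts the term alone on its own line, as in a Sphinx `glossary` directive; `missing_glossary_terms` is a hypothetical helper.

```python
JARGON = [
    "Tool Specification", "EFO ID", "MCP", "SMCP",
    "Compact Mode", "Tool Finder", "AI Scientist",
]


def missing_glossary_terms(glossary_text, terms=JARGON):
    """Return the terms that do not appear as a standalone glossary line."""
    lines = {line.strip().lower() for line in glossary_text.splitlines()}
    return [t for t in terms if t.lower() not in lines]
```

An empty result means every listed term has an entry; it says nothing about whether the definition itself is adequate.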
B8. CI/CD Documentation Regeneration
Required in `.github/workflows/deploy-docs.yml`:
```yaml
- name: Regenerate tool documentation
  run: |
    cd docs
    python generate_config_index.py
    python generate_remote_tools_docs.py
    python generate_tool_reference.py
```
Check:
- CI/CD regenerates docs before build?
- Regeneration happens BEFORE Sphinx build?
- `docs/api/` excluded from cache?
Priority Framework
Issue Severity
| Severity | Definition | Examples | Timeline |
|---|---|---|---|
| CRITICAL | Blocks release | Broken builds, dangerous instructions | Immediate |
| HIGH | Blocks users | Wrong commands, broken setup | Same day |
| MEDIUM | Causes confusion | Inconsistent terminology, unclear examples | Same week |
| LOW | Reduces quality | Long files, minor formatting | Future task |
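The severity table maps directly onto a sort key, which keeps a mixed issue list in fix order. A minimal sketch, assuming issues are dictionaries with a `severity` field like those produced by the validation script; `prioritize` is a hypothetical helper.

```python
# Lower rank means the issue is handled earlier
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def prioritize(issues):
    """Return issues sorted from most to least severe (stable within a level)."""
    return sorted(issues, key=lambda issue: SEVERITY_ORDER[issue["severity"]])
```

Because Python's sort is stable, issues of equal severity keep the order in which they were found.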
Fix Order
- Run automated validation → Fix HIGH issues
- Check circular navigation → Fix CRITICAL loops
- Verify tool counts → Standardize to "1000+"
- Check auto-generated headers → Add missing
- Validate CLI docs → Document all from pyproject.toml
- Check env vars → Create reference page
- Review jargon → Create/update glossary
- Verify CI/CD → Add regeneration steps
Validation Checklist
Before considering docs "done":
Accuracy
- Automated validation passes
- All commands tested
- Version numbers current
- Counts match reality
Structure (ToolUniverse-specific)
- No circular navigation
- Clear "Start Here" entry point
- Linear learning path
- Max 2-3 level hierarchy
Consistency
- "1000+ tools" everywhere
- Same terminology throughout
- Auto-generated files have headers
- All CLIs documented
Completeness
- All features documented
- All CLIs in pyproject.toml covered
- All env vars documented
- Glossary includes all jargon
Output: Audit Report
```markdown
# Documentation Quality Report

Date: [date]
Scope: Automated validation + ToolUniverse audit

## Executive Summary
- Files scanned: X
- Issues found: Y (Critical: A, High: B, Medium: C, Low: D)

## Critical Issues
- [Issue] - Location: file:line
  - Problem: [description]
  - Fix: [action]
  - Effort: [time]

## Automated Validation Results
- Deprecated commands: X instances
- Inconsistent counts: Y instances
- Broken links: Z instances

## ToolUniverse-Specific Findings
- Circular navigation: [yes/no]
- Tool count variations: [list]
- Missing CLI docs: [list]
- Auto-generated headers: X missing

## Recommendations
- Immediate (today): [list]
- This week: [list]
- Next sprint: [list]

## Validation Command
Run to verify fixes:

    python scripts/validate_documentation.py
```
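The Executive Summary lines of the report can be filled in from the collected issues. This is a sketch under the assumption that each issue is a dictionary with a `severity` field; `executive_summary` is a hypothetical helper.

```python
from collections import Counter


def executive_summary(files_scanned, issues):
    """Render the two Executive Summary bullet lines of the audit report."""
    counts = Counter(issue["severity"] for issue in issues)
    return (
        f"- Files scanned: {files_scanned}\n"
        f"- Issues found: {len(issues)} "
        f"(Critical: {counts['CRITICAL']}, High: {counts['HIGH']}, "
        f"Medium: {counts['MEDIUM']}, Low: {counts['LOW']})"
    )
```

Severities that never occur are reported as zero, because a `Counter` returns 0 for missing keys.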
CI/CD Integration
Add to `.github/workflows/validate-docs.yml`:
```yaml
name: Validate Documentation
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run validation
        run: python scripts/validate_documentation.py
      - name: Check auto-generated headers
        run: |
          for f in docs/tools/*_tools.rst; do
            if ! head -1 "$f" | grep -q "AUTO-GENERATED"; then
              echo "Missing header: $f"
              exit 1
            fi
          done
```
Common Issues Quick Reference
| Issue | Detection | Fix |
|---|---|---|
| Deprecated command | Validation script (A1) | Replace with `tooluniverse-server` |
| Wrong tool count | Tool count search (B3) | Change to "1000+ tools" |
| Circular nav | Manual trace | Remove back-references |
| Missing header | Header check (B4) | Add AUTO-GENERATED header |
| Undocumented CLI | Check pyproject.toml | Add to cli_tools.rst |
| Missing env var | Env var search (B6) | Add to env vars reference |
Best Practices
- Automate first - Build validation before manual audit
- Context matters - Smart pattern matching avoids false positives
- Fix systematically - Batch similar issues together
- Validate continuously - Add to CI/CD pipeline
- ToolUniverse-specific last - Automated checks catch most issues
Success Criteria
Documentation quality achieved when:
- ✅ Automated validation reports 0 HIGH issues
- ✅ No circular navigation
- ✅ "1000+ tools" used consistently
- ✅ All auto-generated files have headers
- ✅ All CLIs from pyproject.toml documented
- ✅ All env vars have reference page
- ✅ Glossary covers all technical terms
- ✅ CI/CD validates on every PR