devtu-docs-quality

Documentation Quality Assurance

A systematic documentation quality process that combines automated validation scripts with ToolUniverse-specific structural audits.

When to Use

Pre-release documentation review
After major refactoring (commands, APIs, tool counts changed)
Users report confusing or outdated documentation
Circular navigation or structural problems are suspected
You want to establish an automated validation pipeline

Approach: Two-Phase Strategy

Phase A: Automated Validation (15-20 min)

Create validation scripts for systematic detection
Test commands, links, terminology consistency
Priority-based fixes (blockers → polish)

Phase B: ToolUniverse-Specific Audit (20-25 min)

Circular navigation checks
MCP configuration duplication
Tool count consistency
Auto-generated file conflicts

Phase A: Automated Validation

A1. Build Validation Script

Create `scripts/validate_documentation.py`:

python

#!/usr/bin/env python3
"""Documentation validator for ToolUniverse"""

import re
import glob
from pathlib import Path

DOCS_ROOT = Path("docs")

# ToolUniverse-specific patterns: (deprecated regex, suggested replacement)
DEPRECATED_PATTERNS = [
    (r"python -m tooluniverse\.server", "tooluniverse-server"),
    (r"600\+?\s+tools", "1000+ tools"),
    (r"750\+?\s+tools", "1000+ tools"),
]

def is_false_positive(match, content):
    """Smart context checking to avoid false positives"""
    start = max(0, match.start() - 100)
    end = min(len(content), match.end() + 100)
    context = content[start:end].lower()

    # Skip if discussing deprecation itself
    if any(kw in context for kw in ['deprecated', 'old version', 'migration']):
        return True

    # Skip technical values (ports, dimensions, etc.)
    if any(kw in context for kw in ['width', 'height', 'port', '":"']):
        return True

    return False

def validate_file(filepath):
    """Check one file for issues"""
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    issues = []

    # Check deprecated patterns
    for old_pattern, new_text in DEPRECATED_PATTERNS:
        matches = re.finditer(old_pattern, content)
        for match in matches:
            if is_false_positive(match, content):
                continue

            line_num = content[:match.start()].count('\n') + 1
            issues.append({
                'file': filepath,
                'line': line_num,
                'severity': 'HIGH',
                'found': match.group(),
                'suggestion': new_text
            })

    return issues

# Scan all docs
all_issues = []
for doc_file in glob.glob(str(DOCS_ROOT / "**/*.md"), recursive=True):
    all_issues.extend(validate_file(doc_file))

for doc_file in glob.glob(str(DOCS_ROOT / "**/*.rst"), recursive=True):
    all_issues.extend(validate_file(doc_file))

# Report
if all_issues:
    print(f"❌ Found {len(all_issues)} issues\n")
    for issue in all_issues:
        print(f"{issue['file']}:{issue['line']} [{issue['severity']}]")
        print(f"  Found: {issue['found']}")
        print(f"  Should be: {issue['suggestion']}\n")
    raise SystemExit(1)
else:
    print("✅ Documentation validation passed")
    raise SystemExit(0)
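
What the tool-count patterns catch depends on whether the `+` is escaped: unescaped, it is a quantifier on the preceding `0`; escaped, it is a literal plus sign. A minimal sketch of the difference, using made-up sample strings:

```python
import re

# Unescaped "+" is a quantifier: "60" followed by one or more "0".
loose = re.compile(r"600+?\s+tools")
# Escaped "\+" is a literal plus sign, made optional by "?".
strict = re.compile(r"600\+?\s+tools")

samples = ["600 tools", "600+ tools", "6000 tools"]
loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]

print(loose_hits)   # ['600 tools', '6000 tools']
print(strict_hits)  # ['600 tools', '600+ tools']
```

The unescaped form misses "600+ tools" entirely and flags "6000 tools" instead, so the escaped form is the one that matches the stated intent.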

A2. Command Accuracy Check

Test that commands in docs actually work:

bash

# Extract and test commands
grep -r '^\s*\$\s*' docs/ | while read -r line; do
    cmd=$(echo "$line" | sed 's/.*\$ //' | cut -d' ' -f1)
    if ! command -v "$cmd" &> /dev/null; then
        echo "❌ Command not found: $cmd in $line"
    fi
done
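
The same check can be sketched in Python, which makes it easier to test without touching the real `PATH`. This is an illustration only: the helper names are hypothetical, and it assumes documented commands appear on lines of the form `$ command args`.

```python
import re
import shutil

# Assumption: a documented command line looks like "$ command args...".
PROMPT_RE = re.compile(r"^\s*\$\s+(\S+)")

def extract_commands(text):
    """Return the first word of each '$ ...' line in the text."""
    commands = []
    for line in text.splitlines():
        match = PROMPT_RE.match(line)
        if match:
            commands.append(match.group(1))
    return commands

def missing_commands(text, which=shutil.which):
    """Return commands that `which` cannot locate, sorted and de-duplicated."""
    return sorted({cmd for cmd in extract_commands(text) if which(cmd) is None})

sample = "Install:\n\n    $ pip install tooluniverse\n    $ tooluniverse-server --help\n"
print(extract_commands(sample))  # ['pip', 'tooluniverse-server']
```

Passing a custom `which` callable lets the lookup be stubbed in tests; by default it uses `shutil.which`, so results depend on what is installed on the machine running the check.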

A3. Link Integrity Check

For RST docs:

python

def check_rst_links(docs_root):
    """Validate :doc: references"""
    pattern = r':doc:`([^`]+)`'

    for rst_file in glob.glob(f"{docs_root}/**/*.rst", recursive=True):
        with open(rst_file, encoding='utf-8') as f:
            content = f.read()

        matches = re.finditer(pattern, content)
        for match in matches:
            ref = match.group(1)

            # Sphinx resolves a leading "/" from the docs root and
            # anything else relative to the referencing file
            base = Path(docs_root) if ref.startswith('/') else Path(rst_file).parent
            target = ref.lstrip('/')

            # Check if target exists
            possible = [f"{target}.rst", f"{target}.md", f"{target}/index.rst"]
            if not any((base / p).exists() for p in possible):
                print(f"❌ Broken link in {rst_file}: {ref}")
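
The `:doc:` pattern captures the whole role body, so a reference written in the explicit-title form (`Title <target>`) needs its target extracted before any existence check. A minimal sketch of that extraction, with a hypothetical helper name:

```python
import re

DOC_ROLE = re.compile(r":doc:`([^`]+)`")

def doc_targets(text):
    """Return the target of every :doc: role, handling `Title <target>`."""
    targets = []
    for raw in DOC_ROLE.findall(text):
        raw = raw.strip()
        # Explicit-title form: keep only what is inside the angle brackets.
        if raw.endswith(">") and "<" in raw:
            raw = raw[raw.rindex("<") + 1:-1].strip()
        targets.append(raw)
    return targets

sample = "See :doc:`quickstart` and :doc:`the FAQ <help/faq>`."
print(doc_targets(sample))  # ['quickstart', 'help/faq']
```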

A4. Terminology Consistency

Track variations and standardize:

python

# Define standard terms
TERMINOLOGY = {
    'api_endpoint': ['endpoint', 'url', 'route', 'path'],
    'tool_count': ['tools', 'resources', 'integrations'],
}

def check_terminology(content):
    """Find inconsistent terminology"""
    for standard, variations in TERMINOLOGY.items():
        counts = {v: content.lower().count(v) for v in variations}
        if len([c for c in counts.values() if c > 0]) > 2:
            return f"Inconsistent terminology: {counts}"
    return None
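
Plain substring counting also counts a term inside longer words (`path` inside `pathlib`, for example), which can inflate the numbers. One possible refinement, shown here as a sketch with a hypothetical helper, is to count whole words only:

```python
import re

def count_term(content, term):
    """Count whole-word, case-insensitive occurrences of a term."""
    pattern = rf"\b{re.escape(term)}\b"
    return len(re.findall(pattern, content, flags=re.IGNORECASE))

text = "Use pathlib to build the path. The URL path is /tools."
substring_count = text.lower().count("path")
whole_word_count = count_term(text, "path")

print(substring_count)   # 3
print(whole_word_count)  # 2
```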

Phase B: ToolUniverse-Specific Audit

B1. Circular Navigation Check

Issue: Documentation pages that reference each other in loops.

Check manually:

bash

# Find cross-references
grep -r ":doc:`" docs/*.rst | grep -E "(quickstart|getting_started|installation)"


**Checklist**:
- [ ] Is there a clear "Start Here" on `docs/index.rst`?
- [ ] Does navigation follow linear path: index → quickstart → getting_started → guides?
- [ ] No "you should have completed X first" statements that create dependency loops?

**Common patterns to fix**:
- `quickstart.rst` → "See getting_started"
- `getting_started.rst` → "Complete quickstart first"
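
Once the cross-references have been collected, loops like the one above can also be found mechanically. The following is a sketch, not part of the original workflow: it assumes the references are already available as a mapping from page name to referenced pages, and uses a depth-first search to return one cycle.

```python
def find_cycle(links):
    """Return one cycle of page names as a list, or None if there is none.

    `links` maps each page to the pages it references.
    """
    visiting, done = [], set()

    def visit(page):
        if page in visiting:
            # Back-edge found: slice out the loop and close it.
            return visiting[visiting.index(page):] + [page]
        if page in done:
            return None
        visiting.append(page)
        for target in links.get(page, []):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(page)
        return None

    for page in links:
        cycle = visit(page)
        if cycle:
            return cycle
    return None

links = {
    "index": ["quickstart"],
    "quickstart": ["getting_started"],
    "getting_started": ["quickstart"],
}
print(find_cycle(links))  # ['quickstart', 'getting_started', 'quickstart']
```

Not every cycle is a problem (a "see also" link back to the index is harmless), so the output is a list of candidates for the manual checklist rather than a verdict.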

B2. Duplicate Content Check

Common duplicates in ToolUniverse:

Multiple FAQs: `docs/faq.rst` and `docs/help/faq.rst`

Getting started: `docs/installation.rst`, `docs/quickstart.rst`, `docs/getting_started.rst`

MCP configuration: All files in `docs/guide/building_ai_scientists/`

Detection:

bash

# Find MCP config duplication
rg "MCP.*configuration" docs/ -l | wc -l
rg "pip install tooluniverse" docs/ -l | wc -l


**Action**: Consolidate or clearly differentiate
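
File counts show how widely a topic is spread, but not whether two pages actually repeat each other. A rough way to rank candidates for consolidation is a text-similarity score; the sketch below uses the standard library's `difflib`, and the 0.8 threshold is an arbitrary assumption to tune.

```python
from difflib import SequenceMatcher

def similarity(text_a, text_b):
    """Return a 0..1 similarity ratio between two documents' text."""
    return SequenceMatcher(None, text_a, text_b).ratio()

def likely_duplicates(pages, threshold=0.8):
    """Return (name_a, name_b) pairs whose similarity meets the threshold."""
    names = sorted(pages)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if similarity(pages[a], pages[b]) >= threshold
    ]

# Illustrative page contents, not taken from the real docs.
pages = {
    "faq.rst": "How do I install? Run pip install tooluniverse.",
    "help/faq.rst": "How do I install? Run pip install tooluniverse.",
    "index.rst": "Welcome",
}
print(likely_duplicates(pages))  # [('faq.rst', 'help/faq.rst')]
```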

B3. Tool Count Consistency

Standard: Use "1000+ tools" consistently.

Detection:

bash

# Find all tool count mentions
rg "[0-9]+\+?\s+(tools|resources|integrations)" docs/ --no-filename | sort -u


**Check**:
- [ ] Are different numbers used (600, 750, 1195)?
- [ ] Is "1000+ tools" used consistently?
- [ ] Exact counts avoided in favor of "1000+"?
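
To see at a glance which counts are in use and how often, the mentions can be tallied. A small sketch, assuming the doc text has already been read into a string (the sample sentence is invented):

```python
import re
from collections import Counter

COUNT_RE = re.compile(r"\b(\d+\+?)\s+(tools|resources|integrations)\b", re.IGNORECASE)

def tool_count_mentions(text):
    """Count each distinct '<number> tools'-style phrase in the text."""
    return Counter(f"{num} {noun.lower()}" for num, noun in COUNT_RE.findall(text))

sample = "Access 600+ tools. The registry lists 1000+ tools; all 1000+ tools are searchable."
mentions = tool_count_mentions(sample)
print(mentions.most_common())  # [('1000+ tools', 2), ('600+ tools', 1)]
```

Any key other than `1000+ tools` in the result is a candidate for standardization.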

B4. Auto-Generated File Headers

Auto-generated files:

`docs/tools/*_tools.rst` (from `generate_config_index.py`)
`docs/api/*.rst` (from `sphinx-apidoc`)

Required header:

rst

.. AUTO-GENERATED - DO NOT EDIT MANUALLY
.. Generated by: docs/generate_config_index.py
.. Last updated: 2024-02-05
.. 
.. To modify, edit source files and regenerate.

Check:

bash

head -5 docs/tools/*_tools.rst | grep "AUTO-GENERATED"
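
A per-file version of this check is easy to express in Python, which also makes it clear which files fail. This is a sketch with a hypothetical helper; it assumes the marker sits in an RST comment line (starting with `..`) within the first five lines.

```python
def has_autogen_header(text, max_lines=5):
    """Return True if an AUTO-GENERATED comment appears in the first lines."""
    head = text.splitlines()[:max_lines]
    return any(line.startswith("..") and "AUTO-GENERATED" in line for line in head)

generated = (
    ".. AUTO-GENERATED - DO NOT EDIT MANUALLY\n"
    ".. Generated by: docs/generate_config_index.py\n"
    "\n"
    "Tools\n"
    "=====\n"
)
hand_written = "Tools\n=====\n"

print(has_autogen_header(generated))     # True
print(has_autogen_header(hand_written))  # False
```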

B5. CLI Tools Documentation

Check pyproject.toml for all CLIs:

bash

grep -A 20 "\[project.scripts\]" pyproject.toml

Commonly undocumented:

`tooluniverse-expert-feedback`
`tooluniverse-expert-feedback-web`
`generate-mcp-tools`

Action: Ensure every CLI is listed in `docs/reference/cli_tools.rst`.

B6. Environment Variables

Discovery:

bash

# Find all env vars in code
rg "os\.getenv|os\.environ" src/tooluniverse/ -o | sort -u
rg "TOOLUNIVERSE_[A-Z_]+" src/tooluniverse/ -o | sort -u


**Categories to document**:
- Cache: `TOOLUNIVERSE_CACHE_*`
- Logging: `TOOLUNIVERSE_LOG_*`
- LLM: `TOOLUNIVERSE_LLM_*`
- API keys: `*_API_KEY`

**Check**:
- [ ] Does `docs/reference/environment_variables.rst` exist?
- [ ] Are variables categorized?
- [ ] Each has: default, description, example?
- [ ] Is there `.env.template` at project root?
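
Comparing the variables used in code against those in the reference page is a set difference. A minimal sketch, using two made-up variable names that fit the categories above:

```python
import re

ENV_RE = re.compile(r"\bTOOLUNIVERSE_[A-Z0-9_]+\b")

def undocumented_env_vars(source_text, docs_text):
    """Return TOOLUNIVERSE_* names used in source but absent from the docs."""
    used = set(ENV_RE.findall(source_text))
    documented = set(ENV_RE.findall(docs_text))
    return sorted(used - documented)

# Hypothetical variable names, for illustration only.
source = 'os.getenv("TOOLUNIVERSE_CACHE_DIR")\nos.environ["TOOLUNIVERSE_LOG_LEVEL"]'
docs = "TOOLUNIVERSE_CACHE_DIR: where cached results are stored."

print(undocumented_env_vars(source, docs))  # ['TOOLUNIVERSE_LOG_LEVEL']
```

This only covers the `TOOLUNIVERSE_` prefix; third-party `*_API_KEY` names would need their own pattern.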

B7. ToolUniverse-Specific Jargon

Terms to define on first use:

Tool Specification
EFO ID
MCP, SMCP
Compact Mode
Tool Finder
AI Scientist

Check:

Is there `docs/glossary.rst`?
Terms defined inline with `:term:` references?
Glossary linked from main index?
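
One way to back this checklist with a script is to list `:term:` references that have no glossary entry. The sketch below assumes the glossary terms have already been collected into a list, handles only the plain `` :term:`name` `` form, and compares case-insensitively as a simplification.

```python
import re

TERM_ROLE = re.compile(r":term:`([^`]+)`")

def unknown_term_refs(page_text, defined_terms):
    """Return :term: references whose term is missing from the glossary list."""
    known = {term.lower() for term in defined_terms}
    refs = {ref.strip() for ref in TERM_ROLE.findall(page_text)}
    return sorted(ref for ref in refs if ref.lower() not in known)

page = "Enable :term:`Compact Mode` and use the :term:`Tool Finder`."
defined = ["Compact Mode", "MCP"]

print(unknown_term_refs(page, defined))  # ['Tool Finder']
```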

B8. CI/CD Documentation Regeneration

Required in `.github/workflows/deploy-docs.yml`:

yaml

- name: Regenerate tool documentation
  run: |
    cd docs
    python generate_config_index.py
    python generate_remote_tools_docs.py
    python generate_tool_reference.py

Check:

CI/CD regenerates docs before build?
Regeneration happens BEFORE Sphinx build?
`docs/api/` excluded from cache?
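
The ordering question can be checked with a simple line scan of the workflow file. This sketch makes two assumptions: regeneration steps call `python generate_*.py`, and the build step contains the string `sphinx-build` (adjust `build_marker` if the workflow uses something else, such as `make html`).

```python
def regenerates_before_build(workflow_text, build_marker="sphinx-build"):
    """Return True if every generate_*.py call precedes the build command."""
    lines = workflow_text.splitlines()
    regen = [i for i, line in enumerate(lines) if "python generate_" in line]
    build = [i for i, line in enumerate(lines) if build_marker in line]
    return bool(regen) and bool(build) and max(regen) < min(build)

# Illustrative workflow fragment; the build step is an assumption.
workflow = """
- name: Regenerate tool documentation
  run: |
    cd docs
    python generate_config_index.py
- name: Build
  run: sphinx-build -b html docs docs/_build
"""

print(regenerates_before_build(workflow))  # True
```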

Priority Framework

Issue Severity

Severity	Definition	Examples	Timeline
CRITICAL	Blocks release	Broken builds, dangerous instructions	Immediate
HIGH	Blocks users	Wrong commands, broken setup	Same day
MEDIUM	Causes confusion	Inconsistent terminology, unclear examples	Same week
LOW	Reduces quality	Long files, minor formatting	Future task
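
If issues are collected as dictionaries with a `severity` key (as in the validation script), the table translates directly into a sort order. A minimal sketch:

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def sort_by_severity(issues):
    """Order issue dicts from most to least severe, then by file and line."""
    return sorted(
        issues,
        key=lambda issue: (SEVERITY_ORDER[issue["severity"]], issue["file"], issue["line"]),
    )

# Sample issues for illustration.
issues = [
    {"file": "docs/faq.rst", "line": 12, "severity": "LOW"},
    {"file": "docs/index.rst", "line": 3, "severity": "CRITICAL"},
    {"file": "docs/quickstart.rst", "line": 40, "severity": "HIGH"},
]
ordered = sort_by_severity(issues)
print([issue["severity"] for issue in ordered])  # ['CRITICAL', 'HIGH', 'LOW']
```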

Fix Order

Run automated validation → Fix HIGH issues
Check circular navigation → Fix CRITICAL loops
Verify tool counts → Standardize to "1000+"
Check auto-generated headers → Add missing
Validate CLI docs → Document all from pyproject.toml
Check env vars → Create reference page
Review jargon → Create/update glossary
Verify CI/CD → Add regeneration steps

Validation Checklist

Before considering docs "done":

Accuracy

Structure (ToolUniverse-specific)

Consistency

Completeness

All features documented
All CLIs in pyproject.toml covered
All env vars documented
Glossary includes all jargon

Output: Audit Report

markdown

# Documentation Quality Report

Date: [date]
Scope: Automated validation + ToolUniverse audit

## Executive Summary

- Files scanned: X
- Issues found: Y (Critical: A, High: B, Medium: C, Low: D)

## Critical Issues

1. [Issue] - Location: file:line
   - Problem: [description]
   - Fix: [action]
   - Effort: [time]

## Automated Validation Results

- Deprecated commands: X instances
- Inconsistent counts: Y instances
- Broken links: Z instances

## ToolUniverse-Specific Findings

- Circular navigation: [yes/no]
- Tool count variations: [list]
- Missing CLI docs: [list]
- Auto-generated headers: X missing

## Recommendations

1. Immediate (today): [list]
2. This week: [list]
3. Next sprint: [list]

## Validation Command

Run `python scripts/validate_documentation.py` to verify fixes
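
The "Issues found" line of the report can be filled in from the collected issues rather than by hand. A sketch, assuming each issue is a dictionary with a `severity` key:

```python
from collections import Counter

def summary_line(issues):
    """Format the 'Issues found' line of the audit report."""
    counts = Counter(issue["severity"] for issue in issues)
    return (
        f"Issues found: {len(issues)} "
        f"(Critical: {counts['CRITICAL']}, High: {counts['HIGH']}, "
        f"Medium: {counts['MEDIUM']}, Low: {counts['LOW']})"
    )

found = [{"severity": "HIGH"}, {"severity": "HIGH"}, {"severity": "LOW"}]
print(summary_line(found))
# Issues found: 3 (Critical: 0, High: 2, Medium: 0, Low: 1)
```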

CI/CD Integration

Add to `.github/workflows/validate-docs.yml`:

yaml

name: Validate Documentation
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run validation
        run: python scripts/validate_documentation.py
      - name: Check auto-generated headers
        run: |
          for f in docs/tools/*_tools.rst; do
            if ! head -1 "$f" | grep -q "AUTO-GENERATED"; then
              echo "Missing header: $f"
              exit 1
            fi
          done

Common Issues Quick Reference

Issue	Detection	Fix
Deprecated command	`rg "old-cmd" docs/`	Replace with `new-cmd`
Wrong tool count	`rg "[0-9]+ tools" docs/`	Change to "1000+ tools"
Circular nav	Manual trace	Remove back-references
Missing header	`head -1 file.rst`	Add AUTO-GENERATED header
Undocumented CLI	Check pyproject.toml	Add to cli_tools.rst
Missing env var	`rg "os.getenv" src/`	Add to env vars reference

Best Practices

Automate first - Build validation before manual audit
Context matters - Smart pattern matching avoids false positives
Fix systematically - Batch similar issues together
Validate continuously - Add to CI/CD pipeline
ToolUniverse-specific last - Automated checks catch most issues

Success Criteria

Documentation quality is achieved when:

✅ Automated validation reports 0 HIGH issues
✅ No circular navigation
✅ "1000+ tools" used consistently
✅ All auto-generated files have headers
✅ All CLIs from pyproject.toml documented
✅ All env vars have reference page
✅ Glossary covers all technical terms
✅ CI/CD validates on every PR
