devtu-optimize-descriptions

ToolUniverse Tool Description Optimization

Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.

When to Apply This Skill

Use when:
  • Reviewing newly created tool descriptions
  • User asks "are these tools easy to understand?"
  • Improving existing tool documentation
  • Adding new tools to ToolUniverse
  • User mentions tool usability, clarity, or documentation

Quick Optimization Checklist

```markdown
Tool Description Review:
- [ ] Prerequisites stated (packages, API keys, accounts)
- [ ] Critical abbreviations expanded on first use
- [ ] Required vs optional parameters clear
- [ ] Mutually exclusive options numbered/labeled
- [ ] Parameter guidance includes trade-offs
- [ ] Filter syntax shows available fields
- [ ] File size warnings where relevant
- [ ] Examples show realistic usage
```

Critical Improvements (Fix Immediately)

1. Clarify Required Input Requirements

Problem: Users don't know if they need ONE input or ALL inputs.
Fix: Use "Required: Provide ONE input type" for mutually exclusive options.
```json
// Before
"description": "Process BED regions, motifs, or gene lists..."

// After
"description": "Process genomic data. **Required: Provide ONE input type** - (1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."
```
Number the options and use bold for "Required".
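The numbered "Required" clause can be generated rather than typed by hand, which keeps the wording consistent across tools. The following is a minimal sketch; the helper name `required_one_of` is hypothetical and not part of ToolUniverse.

```python
def required_one_of(options):
    """Build a 'Provide ONE input type' clause with numbered options.

    Hypothetical helper for illustration; not a ToolUniverse API.
    """
    if len(options) < 2:
        raise ValueError("need at least two mutually exclusive options")
    numbered = [f"({i}) {name}" for i, name in enumerate(options, start=1)]
    # Join as "(1) a or (2) b" for two options, "(1) a, (2) b, or (3) c" otherwise.
    if len(numbered) == 2:
        listing = " or ".join(numbered)
    else:
        listing = ", ".join(numbered[:-1]) + ", or " + numbered[-1]
    return f"**Required: Provide ONE input type** - {listing}."


clause = required_one_of(["BED regions", "DNA motif", "gene list"])
print(clause)
```

The returned string can be pasted into the `description` field between the opening sentence and the rest of the text.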

2. Add Prerequisites to First Tool

Problem: Users don't know what to install/configure before use.
Fix: Add prerequisites note to first tool in each family.
```json
"description": "Query single-cell data. Prerequisites: Requires 'package-name' (install: pip install tooluniverse[extra]). Returns..."
```
Include:
  • Package installation command
  • API key requirements
  • Account creation instructions

3. Expand Critical Abbreviations

Problem: New users don't understand technical terms.
Fix: Expand on first use with format: "Abbreviation (Full Name)".
Common abbreviations to expand:
  • H5AD → HDF5-based AnnData
  • RPM → Reads Per Million
  • TSS → Transcription Start Site
  • TAD → Topologically Associating Domain
  • DRS → Data Repository Service
  • Tool and standard names (MACS2, IUPAC, etc.)
```json
// Before
"description": "Download H5AD files..."

// After
"description": "Download H5AD (HDF5-based AnnData) files..."
```
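Expansion on first use can be applied mechanically once a table of abbreviations exists. This is a minimal sketch using the abbreviations listed above; the function name `expand_first_use` is hypothetical, and the approach is a simple whole-word match that skips any abbreviation whose expansion already appears somewhere in the text.

```python
import re

# Abbreviation table taken from the list above.
ABBREVIATIONS = {
    "H5AD": "HDF5-based AnnData",
    "RPM": "Reads Per Million",
    "TSS": "Transcription Start Site",
    "TAD": "Topologically Associating Domain",
    "DRS": "Data Repository Service",
}


def expand_first_use(text, table=ABBREVIATIONS):
    """Expand the first whole-word occurrence of each abbreviation."""
    for abbr, full in table.items():
        expanded = f"{abbr} ({full})"
        if expanded in text:
            continue  # an expansion is already present; leave the text alone
        pattern = re.compile(rf"\b{re.escape(abbr)}\b")
        # A function replacement avoids any escape handling in the new text.
        text = pattern.sub(lambda match: expanded, text, count=1)
    return text


print(expand_first_use("Download H5AD files. Each H5AD file holds one dataset."))
```

Only the first mention is expanded, so later mentions stay short.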

High-Priority Improvements

4. Enhance Filter Parameter Descriptions

Problem: Users don't know what fields are available or what syntax to use.
Fix: List operators, common fields, and provide multiple examples.
```json
"parameter_name": {
  "type": "string",
  "description": "Filter using SQL-like syntax. Format: 'field == \"value\"'. Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. Common fields: tissue, cell_type, disease, assay, sex, ethnicity. Examples: 'tissue == \"lung\"', 'disease == \"COVID-19\" and tissue == \"lung\"', 'cell_type in [\"T cell\", \"B cell\"]'."
}
```
Include:
  • Syntax format
  • Available operators
  • List of 5-10 common fields
  • 2-3 diverse examples
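To keep the examples in a filter description consistent with the stated syntax, they can be generated from structured conditions. The sketch below assumes the syntax shown in the example description above (operators `==`, `!=`, `in`, `<`, `>`, `<=`, `>=`, joined with `and`/`or`); `build_filter` is a hypothetical helper, and whether a given backend accepts exactly this form is an assumption.

```python
import json

ALLOWED_OPERATORS = {"==", "!=", "in", "<", ">", "<=", ">="}


def build_filter(conditions, joiner="and"):
    """Render (field, operator, value) tuples as one filter string.

    Hypothetical helper for illustration; not a ToolUniverse API.
    """
    if joiner not in ("and", "or"):
        raise ValueError(f"unsupported joiner: {joiner}")
    parts = []
    for field, operator, value in conditions:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        # json.dumps double-quotes strings and renders lists as ["a", "b"].
        parts.append(f"{field} {operator} {json.dumps(value)}")
    return f" {joiner} ".join(parts)


print(build_filter([("tissue", "==", "lung")]))
print(build_filter([("disease", "==", "COVID-19"), ("tissue", "==", "lung")]))
print(build_filter([("cell_type", "in", ["T cell", "B cell"])]))
```

The three calls reproduce the three example filters from the description text.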

5. Improve Parameter Guidance

Problem: Users don't know which value to choose or what trade-offs exist.
Fix: Explain what each value means and provide recommendations.
```json
// Before
"threshold": "Q-value threshold (05=1e-5, 10=1e-10, 20=1e-20)"

// After
"threshold": "Peak calling stringency. '05'=1e-5 (permissive, more peaks, broad features), '10'=1e-10 (moderate, balanced), '20'=1e-20 (strict, high confidence, narrow peaks). Default '05' suitable for most analyses. Higher values = fewer but more confident peaks."
```
For each parameter option, explain:
  • What it means practically
  • When to use it
  • Trade-offs involved
  • Recommended default

6. Number Mutually Exclusive Options

Problem: Users provide multiple options when only one is allowed.
Fix: Label options as "Option 1", "Option 2", etc.
```json
"bed_data": {
  "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). Example: 'chr1\\t1000\\t2000'."
},
"motif": {
  "description": "**Option 2**: DNA sequence motif in IUPAC notation. Use: A/T/G/C, W=A|T, S=G|C. Example: 'CANNTG'."
},
"gene_list": {
  "description": "**Option 3**: Gene symbols as array. Example: ['TP53', 'MDM2']."
}
```
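Labels reduce mistakes but do not prevent them, so a tool can also reject calls that supply zero or several of the exclusive inputs. The following is a minimal sketch of such a check using the three parameter names from the example above; the function `pick_exclusive_input` is hypothetical and not part of ToolUniverse.

```python
EXCLUSIVE_INPUTS = ("bed_data", "motif", "gene_list")


def pick_exclusive_input(arguments, names=EXCLUSIVE_INPUTS):
    """Return the single provided input name, or raise if it is not exactly one.

    Hypothetical helper for illustration; empty strings, empty lists,
    and None are treated as "not provided".
    """
    provided = [name for name in names if arguments.get(name) not in (None, "", [])]
    if len(provided) != 1:
        found = ", ".join(provided) if provided else "none"
        raise ValueError(
            f"Provide exactly ONE of {', '.join(names)} (received: {found})"
        )
    return provided[0]


print(pick_exclusive_input({"motif": "CANNTG"}))
```

An error message that repeats the option names mirrors the numbered wording in the description, so users see the same rule in both places.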

Medium-Priority Improvements

7. Add File Size Warnings

For tools that download or return large files:
```json
"description": "Download contact matrices. Note: Files can be large (GBs), check file_size in metadata before downloading. Returns..."
```

8. Clarify Web Form vs API Results

When a tool returns a submission URL instead of direct results:
```json
"description": "Perform enrichment analysis. Note: Returns submission URL (web form-based analysis). Analyzes..."
```

9. Explain File Type Differences

For tools with multiple format options:
```json
"file_type": "File format. Common types: 'cooler' (single-resolution contact matrix), 'pairs' (aligned read pairs), 'hic' (Juicer format), 'mcool' (multi-resolution cooler)."
```

Description Structure Template

```json
{
  "name": "Tool_operation_name",
  "type": "ToolClassName",
  "description": "[Action verb] to [purpose]. [Prerequisites if first tool]. [Key data/features]. [Required inputs if mutually exclusive]. [Note about limitations/requirements]. Use for: [use case 1], [use case 2], [use case 3].",
  "parameter": {
    "properties": {
      "param_name": {
        "type": "string",
        "description": "[What it does]. [Format/syntax if applicable]. [Options with trade-offs]. [Examples]. [Recommendation if applicable]."
      }
    }
  }
}
```

Description Quality Checklist

Clarity Checks

  • Purpose clear in first sentence
  • Technical terms expanded
  • Prerequisites stated upfront
  • Examples show realistic usage
  • "Use for:" section lists 3-5 concrete use cases

Completeness Checks

  • Required inputs clearly marked
  • Parameter choices explained
  • Limitations noted (file size, web form, etc.)
  • Available fields listed for filters
  • Default values recommended

Usability Checks

  • New users can understand without external docs
  • Users know what to provide
  • Users can make informed parameter choices
  • Error prevention (mutually exclusive options labeled)

Testing Description Quality

To verify description quality, ask:
  1. Can a new user understand what the tool does?
    • Read only the description (no docs)
    • Should be clear within 30 seconds
  2. Can a user provide correct inputs on first try?
    • Required inputs obvious
    • Format/syntax clear
    • Mutually exclusive options labeled
  3. Can a user choose appropriate parameters?
    • Trade-offs explained
    • Recommendations provided
    • Defaults justified
  4. Are prerequisites obvious?
    • Installation instructions
    • API keys/accounts
    • File size warnings
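Most of these questions need a human reader, but a few can be approximated with a quick heuristic pass before review. The sketch below is one possible linter; the function name `lint_description` and the exact string checks are assumptions chosen to match the wording conventions used in this guide, not an existing ToolUniverse feature.

```python
import re


def lint_description(description, first_in_family=False):
    """Return heuristic warnings for a tool description string."""
    warnings = []
    # Prerequisites are only expected on the first tool of a family.
    if first_in_family and "prerequisites" not in description.lower():
        warnings.append("no prerequisites note")
    if "Use for:" not in description:
        warnings.append("no 'Use for:' use cases")
    # If the text announces mutually exclusive inputs, expect numbered options.
    if "Provide ONE" in description and not re.search(r"\(1\).*\(2\)", description):
        warnings.append("mutually exclusive options are not numbered")
    return warnings


print(lint_description("Process BED regions, motifs, or gene lists...", first_in_family=True))
```

An empty list means only that these surface checks passed; it does not replace reading the description as a new user would.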

Common Patterns by Tool Type

API Query Tools

```json
"description": "Query [data type] from [source]. [Prerequisites]. Filter by [criteria]. Returns [output]. [Data scale]. Use for: [discovery], [analysis], [specific research tasks]."
```
Key elements:
  • What you're querying
  • How to filter
  • What you get back
  • Scale of data
  • Prerequisites

Data Download Tools

```json
"description": "Download [file types] from [source]. [Format details]. [File size warning]. [Authentication requirement]. Use for: [offline analysis], [custom processing], [integration]."
```
Key elements:
  • File formats available
  • Size warning
  • Authentication needs
  • What's in the files

Enrichment/Analysis Tools

```json
"description": "Analyze [input type] to find [results]. **Required: Provide ONE input type** - (1) [option], (2) [option], (3) [option]. Compares against [database/background]. [Result format]. Use for: [identifying], [discovering], [predicting]."
```
Key elements:
  • Input requirements clear
  • Options numbered
  • What gets compared
  • What you learn

Validation Commands

After updating descriptions, validate JSON syntax:
```bash
# Validate a single tool JSON file
python3 -m json.tool src/tooluniverse/data/your_tools.json > /dev/null && echo "✓ Valid"

# Check all tools in category
for f in src/tooluniverse/data/*_tools.json; do
  python3 -m json.tool "$f" > /dev/null && echo "✓ $f valid" || echo "✗ $f invalid"
done
```
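When a file fails, it helps to see where the syntax error is. The same check can be written in Python so that the line and column of the first error are reported. This is a minimal sketch; the function name `validate_tool_files` is hypothetical, and the `*_tools.json` pattern is taken from the shell loop shown for this step.

```python
import json
from pathlib import Path


def validate_tool_files(directory, pattern="*_tools.json"):
    """Map each matching file name to None if valid, else an error message."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        try:
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
            results[path.name] = None
        except json.JSONDecodeError as exc:
            # lineno and colno locate the first syntax error in the file.
            results[path.name] = f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return results


# Example call, assuming the repository layout used in this guide:
# for name, error in validate_tool_files("src/tooluniverse/data").items():
#     print(f"✓ {name} valid" if error is None else f"✗ {name} invalid ({error})")
```

A common cause of failures after editing descriptions is an unescaped double quote inside a `description` string.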

Example: Before and After

Before (Unclear):
```json
{
  "name": "Tool_enrichment",
  "description": "Perform enrichment with tool to find factors.",
  "parameter": {
    "properties": {
      "bed": {"description": "BED data"},
      "motif": {"description": "Motif"},
      "genes": {"description": "Genes"},
      "threshold": {"description": "Threshold value"}
    }
  }
}
```
After (Clear):
```json
{
  "name": "Tool_enrichment_analysis",
  "description": "Identify transcription factors enriched in your data. **Required: Provide ONE input type** - (1) BED genomic regions, (2) DNA sequence motif (IUPAC notation), or (3) gene symbol list. Compares against 400,000+ ChIP-seq experiments. Returns ranked proteins with enrichment scores. Note: Returns submission URL (web-based analysis). Use for: identifying regulators of regions, finding proteins bound to motifs, discovering transcription factors regulating genes.",
  "parameter": {
    "properties": {
      "bed_data": {
        "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). For finding proteins bound to genomic regions. Example: 'chr1\\t1000\\t2000'."
      },
      "motif": {
        "description": "**Option 2**: DNA motif in IUPAC notation (A/T/G/C, W=A|T, S=G|C, M=A|C, K=G|T, R=A|G, Y=C|T). Example: 'CANNTG' (E-box)."
      },
      "gene_list": {
        "description": "**Option 3**: Gene symbols as array or single gene. Example: ['TP53', 'MDM2', 'CDKN1A']."
      },
      "threshold": {
        "description": "Peak stringency. '05'=1e-5 (permissive, more peaks), '10'=1e-10 (moderate), '20'=1e-20 (strict, high confidence). Default '05' suitable for most analyses."
      }
    }
  }
}
```

Summary

Priority order for optimization:
  1. Critical (fix immediately):
    • Clarify required inputs
    • Add prerequisites
    • Expand abbreviations
  2. High (fix soon):
    • Enhance filter descriptions
    • Improve parameter guidance
    • Number mutually exclusive options
  3. Medium (nice to have):
    • Add file size warnings
    • Clarify web form vs API
    • Explain file type differences
Expected impact: 50-75% reduction in user errors, 50-67% faster time to first successful use.