tooluniverse
ToolUniverse General Usage Strategies
Master strategies for using ToolUniverse's 10000+ scientific tools effectively. These principles apply regardless of how you access ToolUniverse (MCP server, SDK, or direct tool calls).
Core Philosophy
ToolUniverse has MANY tools - the challenge is discovering and using them effectively:
- Search widely - Don't assume you know all relevant tools; always search for more
- Query multiple databases - Cross-reference data across sources
- Multi-hop persistence - Many answers require 3-5 tool calls in sequence
- Never give up easily - If one tool fails, try alternatives
- Comprehensive reports - Use all available data; detail is valuable
- English-first queries - Always use English terms in tool calls, even if the user writes in another language
Step 0: Clarify the Request Before Researching
CRITICAL: Before starting any research, ensure you understand what the user actually needs. Wasted tool calls on the wrong entity or scope are expensive.
When to Ask Clarifying Questions
| Signal | Example | What to Clarify |
|---|---|---|
| Vague entity | "Research cancer" | Which cancer type? Which aspect (treatment, genetics, epidemiology)? |
| Ambiguous name | "Tell me about JAK" | JAK1/2/3? The gene family? A specific inhibitor? |
| Unclear scope | "Look into metformin" | Drug profile? Repurposing? Safety? Mechanism? |
| Missing context | "What targets this?" | Which compound/disease/pathway? |
| Multiple interpretations | "ACE" | ACE the gene? ACE inhibitors? ACE2? |
When NOT to Ask
Proceed directly when the request is specific enough:
- "What is the structure of EGFR kinase domain?" - Clear entity + clear data type
- "Find all drugs targeting BRAF V600E" - Specific variant + clear task
- "Research Alzheimer's disease comprehensively" - Broad but unambiguous
Clarification Checklist
Before starting research, confirm you know:
- Entity - Exactly which gene/protein/drug/disease?
- Species - Human unless stated otherwise
- Scope - Comprehensive profile or specific aspect?
- Output - Report, data table, quick answer, or comparison?
If any of these are unclear, ask the user one concise question covering all ambiguities rather than asking multiple rounds of questions.
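The checklist above can be sketched as a small data structure that yields a single combined question. This is only an illustration: `ResearchRequest` and `clarifying_question` are hypothetical names, not part of ToolUniverse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResearchRequest:
    """Hypothetical container for the four checklist items."""
    entity: Optional[str] = None   # exact gene/protein/drug/disease
    species: str = "human"         # checklist default: human unless stated otherwise
    scope: Optional[str] = None    # comprehensive profile or a specific aspect
    output: Optional[str] = None   # report, data table, quick answer, comparison


def clarifying_question(req: ResearchRequest) -> Optional[str]:
    """Return ONE question covering every unclear item, or None if ready."""
    prompts = {
        "entity": "which exact gene, protein, drug, or disease you mean",
        "scope": "whether you want a comprehensive profile or one specific aspect",
        "output": "which output you prefer (report, data table, quick answer, or comparison)",
    }
    missing = [text for name, text in prompts.items() if not getattr(req, name)]
    if not missing:
        return None
    return "Before I start, could you tell me " + "; and ".join(missing) + "?"
```

Species is never asked about here because the checklist already defaults it to human.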
Strategy 1: Exhaustive Tool Discovery
CRITICAL: ToolUniverse has 10000+ tools. Before any research task, search for ALL relevant tools.
Tool Discovery Methods
Use the tool finder tools to discover what's available:
| Method | Tool | Best For |
|---|---|---|
| Keyword | | Fast search by terms |
| LLM-based | | Intelligent matching by description |
| Embedding | | Semantic similarity search |
Discovery Best Practices
| Practice | Why | Example |
|---|---|---|
| Search with multiple terms | Same data from different angles | "protein expression", "gene expression", "tissue expression" |
| Search by database name | Find all tools for a source | "UniProt", "ChEMBL", "OpenTargets" |
| Search by data type | Comprehensive coverage | "variant", "mutation", "SNP", "polymorphism" |
| Search by use case | Task-oriented discovery | "druggability", "target validation" |
Minimum Discovery Queries
Before any research task, run at least these searches:
- Main topic query: [your topic]
- Related terms: [synonym1], [synonym2]
- Database-specific: [relevant database names]
- Data type specific: [required data types]
Example for target research:
- "protein information"
- "gene expression"
- "drug target"
- "UniProt", "OpenTargets"
- "protein interaction"
- "variant mutation"
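As a rough sketch, the minimum query set can be assembled programmatically before being passed to whichever tool finder is available. The helper name `discovery_queries` is an assumption made for illustration; the tool finder calls themselves are not shown because their names are not given here.

```python
from typing import Iterable, List


def discovery_queries(topic: str,
                      synonyms: Iterable[str] = (),
                      databases: Iterable[str] = (),
                      data_types: Iterable[str] = ()) -> List[str]:
    """Build the minimum discovery query list, de-duplicated in order."""
    ordered = [topic, *synonyms, *databases, *data_types]
    seen = set()
    queries = []
    for q in ordered:
        key = q.strip().lower()
        if key and key not in seen:   # skip blanks and case-insensitive repeats
            seen.add(key)
            queries.append(q.strip())
    return queries


# The target-research example from the list above.
queries = discovery_queries(
    "drug target",
    synonyms=["protein information", "gene expression"],
    databases=["UniProt", "OpenTargets"],
    data_types=["protein interaction", "variant mutation"],
)
```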
Strategy 2: Multi-Hop Tool Chains
CRITICAL: Most scientific questions require multiple tool calls. A single tool rarely gives the complete answer.
Why Multi-Hop Matters
| Question Type | Single Tool Answer | Multi-Hop Answer |
|---|---|---|
| "Tell me about EGFR" | Basic protein info | Full profile with structure, expression, drugs, variants, literature |
| "What drugs target TP53?" | List of drug names | Drug details, mechanisms, clinical trials, bioactivity data |
| "Research Alzheimer's" | Disease definition | Genes, pathways, drugs, trials, phenotypes, GWAS, literature |
Common Multi-Hop Patterns
Pattern A: ID Resolution Chain
Name → ID → Data → Related Data
Example: Gene name to complete profile
1. gene_name → Ensembl ID
2. Ensembl ID → UniProt accession
3. UniProt accession → Protein entry
4. UniProt accession → Domains
5. UniProt accession → Structure
Pattern B: Cross-Database Enrichment
Primary Data → Cross-reference → Enriched View
Example: Drug compound enrichment
1. drug_name → PubChem CID
2. drug_name → ChEMBL ID
3. CID → properties
4. ChEMBL ID → bioactivity
5. ChEMBL ID → targets
6. SMILES → ADMET predictions
Pattern C: Network Expansion
Seed Entity → Connected Entities → Entity Details
Example: Target interaction network
1. gene → protein interactions
2. For each interactor → gene info
3. Interactor → disease associations
Pattern D: Literature + Data Integration
Database Annotations → Literature Evidence → Synthesis
Example: Disease mechanism research
1. disease → associated genes
2. disease → phenotypes
3. disease → drugs
4. disease → literature
5. key papers → citations
Multi-Hop Persistence Rules
- Don't stop at first result - One tool gives partial data; keep going
- Follow cross-references - Use IDs from one tool to query others
- Chain until complete - 5-10 tool calls for comprehensive answers is normal
- Track all IDs - Save every identifier for potential future use
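One way to picture these rules is a small chain runner that feeds every identifier collected so far into the next hop and never discards one. This is a minimal sketch: `run_chain` and the stub hops are hypothetical stand-ins for real tool calls, and the EGFR identifiers are the examples listed under ID Types by Entity.

```python
from typing import Callable, Dict, List, Tuple

# Each hop reads the IDs collected so far and returns new key/value pairs.
Hop = Tuple[str, Callable[[Dict[str, str]], Dict[str, str]]]


def run_chain(seed: Dict[str, str], hops: List[Hop]) -> Dict[str, str]:
    """Run hops in order, keeping every identifier any hop returns."""
    ids = dict(seed)
    for name, hop in hops:
        try:
            ids.update(hop(ids))
        except Exception as exc:
            # Record the failed hop and keep going rather than aborting the chain.
            ids[f"{name}_error"] = str(exc)
    return ids


# Stub hops (assumptions) mirroring Pattern A: name -> Ensembl -> UniProt -> entry.
hops: List[Hop] = [
    ("ensembl", lambda ids: {"ensembl_id": "ENSG00000146648"}),
    ("uniprot", lambda ids: {"uniprot_acc": "P00533"} if "ensembl_id" in ids else {}),
    ("entry", lambda ids: {"protein_entry": f"entry for {ids['uniprot_acc']}"}),
]
profile = run_chain({"gene_name": "EGFR"}, hops)
```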
Strategy 3: Query Multiple Databases for Same Data
CRITICAL: Different databases have different coverage. Query ALL relevant sources.
Database Redundancy Principle
For any data type, query multiple sources:
| Data Type | Primary | Secondary | Tertiary |
|---|---|---|---|
| Protein info | UniProt | Proteins API | NCBI Protein |
| Gene expression | GTEx | Human Protein Atlas | ArrayExpress |
| Drug targets | ChEMBL | DGIdb | OpenTargets |
| Variants | gnomAD | ClinVar | OpenTargets |
| Literature | PubMed | Europe PMC | OpenAlex |
| Pathways | Reactome | KEGG | WikiPathways |
| Structures | RCSB PDB | PDBe | AlphaFold |
| Disease associations | OpenTargets | ClinVar | GWAS Catalog |
Merge Results Strategy
When querying multiple databases:
- Collect all results - Don't stop at first success
- Note data source - Track where each datum came from
- Handle conflicts - Document when sources disagree
- Prefer curated - Weight RefSeq over GenBank, UniProt over predictions
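The merge steps above could be sketched as follows, assuming each tool result has already been reduced to `(source, field, value)` triples. The function name and the sample records are made up for illustration and are not real database output.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (source, field, value) triples as they might come back from several tools.
Record = Tuple[str, str, str]


def merge_with_provenance(records: List[Record]) -> Dict[str, dict]:
    """Group values by field, keep their sources, and flag disagreements."""
    by_field: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for source, field, value in records:
        by_field[field][value].append(source)
    merged = {}
    for field, values in by_field.items():
        merged[field] = {
            "values": {v: sorted(srcs) for v, srcs in values.items()},
            "conflict": len(values) > 1,   # more than one distinct value reported
        }
    return merged


# Placeholder records (not real data).
merged = merge_with_provenance([
    ("UniProt", "length", "100"),
    ("NCBI Protein", "length", "100"),
    ("UniProt", "organism", "A"),
    ("PredictionTool", "organism", "B"),
])
```

Conflicts are only flagged here, not resolved; choosing the curated source remains a judgment made when writing the report.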
Strategy 3.1: Abstract Search vs Full-Text Search (Literature)
CRITICAL: Many biomedical “needle” terms (rsIDs like rs58542926, reagent catalog numbers, supplementary-table IDs) never appear in titles/abstracts. If you only search abstracts, you will miss papers even when they are open access.
Quick rule
- If your keywords look like body-only terms (rsIDs, figure/table references, “Supplementary Table”), use full-text-aware tools first.
Tools that can match full text (indexed or retrieved)
| Goal | Tools | Notes |
|---|---|---|
| Indexed full-text search (biomed OA) | | NCBI “pmc” database indexes full text; good for rsIDs. |
| Indexed full-text search (Europe PMC subset) | | Uses Europe PMC |
| Best-effort full-text retrieval + keyword snippets | | Fetches full text (XML → HTML fallbacks) and returns bounded snippets with |
| OA aggregation + (sometimes) full-text search | | Coverage varies; a paper may not exist in CORE even if OA elsewhere. |
| Download-and-scan fallback | | Local PDF scan for body-only terms when index-based search misses; can fail if the “PDF” URL returns HTML/403 (check trace/content-type). |
| Partial full-text indexing (not guaranteed) | | Only matches works where OpenAlex has indexed full text; can miss PMC-hosted full text. Use as a secondary signal. |
Recommended flow for body-only keywords
- Try PMC_search_papers and EuropePMC_search_articles (with require_has_ft + fulltext_terms).
- If you have a PMCID/PMID, use EuropePMC_get_fulltext_snippets to confirm the term is in the paper.
- If you only have a PDF URL, use CORE_get_fulltext_snippets as a last resort, and treat HTTP 200 as “request succeeded”, not “PDF succeeded” (validate content_type).
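Two checks from this flow lend themselves to a short sketch: deciding whether a keyword looks body-only, and confirming that a 200 response really carried a PDF. The regular expressions and helper names below are illustrative assumptions, not part of any ToolUniverse tool.

```python
import re

# Heuristic patterns (assumptions, not exhaustive) for terms that tend to
# appear only in the paper body rather than in titles or abstracts.
_BODY_ONLY_PATTERNS = [
    re.compile(r"^rs\d+$", re.IGNORECASE),                           # dbSNP rsIDs
    re.compile(r"supplementary\s+table", re.IGNORECASE),             # supplement references
    re.compile(r"^(fig(ure)?|table)\s*\.?\s*S?\d+", re.IGNORECASE),  # figure/table references
]


def looks_body_only(term: str) -> bool:
    """True if the term matches any body-only heuristic."""
    term = term.strip()
    return any(p.search(term) for p in _BODY_ONLY_PATTERNS)


def pdf_fetch_succeeded(status_code: int, content_type: str) -> bool:
    """HTTP 200 alone is not enough; the payload must actually be a PDF."""
    media_type = content_type.split(";")[0].strip().lower()
    return status_code == 200 and media_type == "application/pdf"
```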
Strategy 4: Disambiguation First
CRITICAL: Before any research, resolve entity identity to avoid wrong data and missed results.
Why Disambiguation Matters
| Problem | Example | Consequence |
|---|---|---|
| Naming collision | "JAK" = Janus kinase OR "just another kinase" | Wrong papers retrieved |
| Multiple IDs | Gene has symbol, Ensembl, Entrez, UniProt IDs | Miss data in some databases |
| Salt forms | Metformin vs metformin HCl (different CIDs) | Incomplete compound data |
| Species ambiguity | BRCA1 in human vs mouse | Wrong expression/function data |
Disambiguation Workflow
Step 1: Establish Canonical IDs
gene_name → UniProt, Ensembl, NCBI Gene, ChEMBL target
compound_name → PubChem CID, ChEMBL ID, SMILES
disease_name → EFO ID, ICD-10, UMLS CUI
Step 2: Gather Synonyms
All aliases, alternative names, historical names
Step 3: Detect Naming Collisions
Search "[TERM]"[Title] → check if results are on-topic
Build negative filters: NOT [collision_term]
Step 4: Species Confirmation
Verify organism is correct (default: Homo sapiens)
ID Types by Entity
Genes/Proteins:
- Gene Symbol (EGFR, TP53)
- UniProt accession (P00533)
- Ensembl ID (ENSG00000146648)
- NCBI Gene ID (1956)
- ChEMBL Target ID (CHEMBL203)
Compounds:
- PubChem CID (2244)
- ChEMBL ID (CHEMBL25)
- SMILES string
- InChI/InChIKey
Diseases:
- EFO ID (EFO_0000249)
- ICD-10 code (G30)
- UMLS CUI (C0002395)
- SNOMED CT
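The disambiguation workflow can be captured in one record per entity, holding the canonical IDs, synonyms, and collision terms. The sketch below is hypothetical: `EntityRecord` is an invented name, and the query string follows the "[TERM]"[Title] ... NOT ... shape from Step 3, which assumes a PubMed-style search syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EntityRecord:
    """Canonical IDs and synonyms resolved before research starts."""
    name: str
    species: str = "Homo sapiens"   # workflow default
    ids: Dict[str, str] = field(default_factory=dict)
    synonyms: List[str] = field(default_factory=list)
    collision_terms: List[str] = field(default_factory=list)

    def title_query(self) -> str:
        """Title-restricted query with NOT filters for known collisions."""
        query = f'"{self.name}"[Title]'
        for term in self.collision_terms:
            query += f' NOT "{term}"'
        return query


# EGFR identifiers are the examples listed above.
egfr = EntityRecord(
    name="EGFR",
    ids={"uniprot": "P00533", "ensembl": "ENSG00000146648",
         "ncbi_gene": "1956", "chembl_target": "CHEMBL203"},
    synonyms=["epidermal growth factor receptor"],
)
jak = EntityRecord(name="JAK", collision_terms=["just another kinase"])
```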
Strategy 5: Never Give Up on Search
CRITICAL: When a tool fails or returns empty, don't give up. Try alternatives.
Failure Handling Protocol
Attempt 1: Primary tool
↓ fails
Wait briefly, then retry
↓ fails
Try fallback tool #1
↓ fails
Try fallback tool #2
↓ fails
Document as "unavailable" with reason
Common Fallback Chains
| Primary Tool | Fallback Options |
|---|---|
| PubMed citations | EuropePMC citations → OpenAlex citations |
| GTEx expression | Human Protein Atlas expression |
| PubChem compound lookup | ChEMBL search → SMILES-based lookup |
| ChEMBL bioactivity | PubChem bioactivity summary |
| DailyMed drug labels | PubChem drug label info |
| UniProt protein entry | Proteins API |
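The failure handling protocol and the fallback chains can be combined into one wrapper. This is a sketch under assumptions: `call_with_fallbacks` is a hypothetical helper, each tool is represented by a zero-argument callable, and an empty result is treated as a failure.

```python
import time
from typing import Callable, List, Tuple

ToolCall = Tuple[str, Callable[[], object]]


def call_with_fallbacks(primary: ToolCall, fallbacks: List[ToolCall],
                        retry_delay: float = 1.0) -> dict:
    """Primary -> wait and retry -> fallback #1 -> fallback #2 -> 'unavailable'."""
    errors = []
    attempts = [primary, primary, *fallbacks]  # primary listed twice: call + retry
    for index, (name, call) in enumerate(attempts):
        if index == 1:
            time.sleep(retry_delay)  # brief wait before retrying the primary tool
        try:
            result = call()
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return {"status": "ok", "source": name, "data": result}
        errors.append(f"{name}: empty result")
    # Nothing worked: document the gap together with the reasons.
    return {"status": "unavailable", "reason": "; ".join(errors)}
```

Returning the collected error messages keeps the "unavailable" entry in the report traceable to specific failed tools.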
Alternative Search Strategies
If keyword search fails:
- Try synonyms and aliases
- Use broader/narrower terms
- Try different databases
If database is empty:
- Query related databases
- Use literature to find mentions
- Check if entity exists under different name
If API rate-limited:
- Wait and retry
- Try same query on different database
- Use cached results if available
Strategy 6: Generate Comprehensive Reports
CRITICAL: With access to many tools, reports should be detailed and thorough.
Report-First Approach
- Create report structure FIRST - Define all sections before gathering data
- Progressively update - Fill sections as data is gathered
- Show findings, not process - Report results, not search methodology
Citation Requirements
Every fact must have a source:
Protein Function
EGFR is a receptor tyrosine kinase that regulates cell growth.
Source: UniProt (P00533)
Expression Profile
| Tissue | TPM | Source |
|---|---|---|
| Skin | 156.3 | GTEx |
| Lung | 98.4 | GTEx |
Evidence Grading
Grade claims by evidence strength:
| Tier | Symbol | Description | Example |
|---|---|---|---|
| T1 | ★★★ | Mechanistic with direct evidence | CRISPR KO study |
| T2 | ★★☆ | Functional study | siRNA knockdown |
| T3 | ★☆☆ | Association/screen hit | GWAS, high-throughput screen |
| T4 | ☆☆☆ | Review mention, text-mined | Review article |
In report:
ATP6V1A drives lysosomal acidification [★★★: PMID:12345678].
It has been implicated in cancer metabolism [★☆☆: TCGA data].
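The tier table maps directly onto a lookup, so graded claims can be formatted consistently. A minimal sketch, with `graded_claim` as a hypothetical helper name:

```python
EVIDENCE_TIERS = {
    "T1": "★★★",  # mechanistic with direct evidence
    "T2": "★★☆",  # functional study
    "T3": "★☆☆",  # association / screen hit
    "T4": "☆☆☆",  # review mention, text-mined
}


def graded_claim(claim: str, tier: str, source: str) -> str:
    """Append the tier symbol and source to a claim, as in the report example."""
    if tier not in EVIDENCE_TIERS:
        raise ValueError(f"unknown evidence tier: {tier}")
    return f"{claim.rstrip('.')} [{EVIDENCE_TIERS[tier]}: {source}]."
```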
Mandatory Completeness
All sections must exist, even if "data unavailable":
Pathogen Involvement
No pathogen interactions identified in literature or databases.
Source: Literature search, UniProt annotations
Report Quality Metrics
| Quality | Description | Tool Calls | Sections |
|---|---|---|---|
| Excellent | Multi-database, evidence-graded | 30+ | All mandatory, detailed |
| Good | Cross-referenced, sourced | 15-30 | All mandatory, adequate |
| Adequate | Single-database focus | 5-15 | Core sections only |
| Poor | Single tool, no sources | <5 | Incomplete |
Strategy 7: Use Specialized Skills for Specific Tasks
CRITICAL: For specific research tasks, use specialized skills (not this general skill).
Task-Specific Skill Selection
| Task | Recommended Skill |
|---|---|
| Data Retrieval | |
| Chemical compounds | |
| Expression data | |
| Protein structure | |
| Sequence retrieval | |
| Research & Profiling | |
| Disease research | |
| Drug profiling | |
| Literature review | |
| Target analysis | |
| Clinical Decision Support | |
| Drug safety analysis | |
| Precision oncology treatment | |
| Rare disease diagnosis | |
| Variant interpretation | |
| Discovery & Design | |
| Small molecule binder discovery | |
| Drug repurposing | |
| Protein therapeutic design | |
| Outbreak Response | |
| Infectious disease analysis | |
| Infrastructure & Development | |
| ToolUniverse installation/setup | |
| Python SDK for AI scientist systems | |
When to Use This General Skill
Use this skill when:
- You need general guidance on ToolUniverse usage
- The task doesn't fit a specialized skill
- You need to combine multiple specialized workflows
- You are exploring what's possible with ToolUniverse
- You are learning ToolUniverse best practices
Strategy 8: Parallel Execution for Speed
CRITICAL: Run independent queries simultaneously for faster research.
When to Parallelize
| Parallel | Sequential |
|---|---|
| Different databases for same entity | Tool B needs output from Tool A |
| Multiple entities, same data type | Building an ID → using the ID |
| Independent research paths | Iterating through a list of results |
Parallel Research Paths Example
For target research, run these 8 paths simultaneously:
- Identity - Names, IDs, sequence
- Structure - 3D structure, domains
- Function - GO terms, pathways
- Interactions - PPI network
- Expression - Tissue expression
- Variants - Genetic variation
- Drugs - Known drugs, druggability
- Literature - Publications, trends
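Independent paths like these can be dispatched with Python's standard thread pool. The sketch below assumes each path is wrapped in a zero-argument callable; the path functions shown are stubs, not real tool calls, and `run_paths_in_parallel` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_paths_in_parallel(paths: Dict[str, Callable[[], object]],
                          max_workers: int = 8) -> Dict[str, object]:
    """Run independent research paths concurrently; record failures per path."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in paths.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failed path must not discard the others.
                results[name] = f"unavailable: {exc}"
    return results


def _failing_path():
    raise RuntimeError("rate limited")


# Stub callables standing in for real tool calls.
paths = {
    "identity": lambda: "names, IDs, sequence",
    "structure": lambda: "3D structure, domains",
    "expression": _failing_path,
}
results = run_paths_in_parallel(paths)
```

Threads suit this case because tool calls are typically I/O-bound; chains where one tool needs another's output should stay sequential inside a single path.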
Strategy 9: Iterative Completeness Check
CRITICAL: After gathering data, always ask "What else is missing?" to ensure comprehensive coverage.
The Completeness Loop
Gather initial data
↓
Review what you have
↓
Ask: "What aspects are still missing?"
↓
Identify gaps
↓
Search for tools to fill gaps
↓
Gather additional data
↓
Repeat until comprehensive
Universal Completeness Questions
After each research phase, ask:
- Identity: Do I have all relevant identifiers and names?
- Core data: Do I have the fundamental information for this entity type?
- Context: Do I have surrounding/related information?
- Relationships: Do I know what this connects to?
- Variations: Do I know about variants, forms, or subtypes?
- Evidence: Do I have supporting data from multiple sources?
- Literature: Do I have recent publications on this topic?
- Gaps: Have I documented what's unavailable?
Gap-Filling Strategies
| Gap Identified | Strategy |
|---|---|
| Missing data type | Search for tools with that data type |
| Single source only | Query additional databases |
| Outdated information | Check literature for recent updates |
| No experimental data | Look for predictions/computational data |
| Conflicting data | Find authoritative/curated sources |
| Shallow coverage | Dive deeper with specialized tools |
When to Stop
Stop the completeness loop when:
- All relevant aspects have been addressed (even if "not found")
- Multiple sources queried for key data
- Gaps are documented, not ignored
- No obvious missing pieces remain
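The stop conditions can be expressed as a simple check over the report's aspects. This is a sketch with assumed names (`ASPECTS`, `find_gaps`, `should_stop`); an aspect documented as "not found" counts as addressed, while a missing or empty entry counts as a gap.

```python
from typing import Dict, List, Optional

# Aspect names adapted from the completeness questions (labels are assumptions).
ASPECTS = ["identity", "core_data", "context", "relationships",
           "variations", "evidence", "literature"]


def find_gaps(report: Dict[str, Optional[str]]) -> List[str]:
    """Aspects never addressed; 'not found' counts as addressed, None does not."""
    return [a for a in ASPECTS if not report.get(a)]


def should_stop(report: Dict[str, Optional[str]],
                sources_per_key_field: Dict[str, int]) -> bool:
    """Stop when no aspect is unaddressed and key data has two or more sources."""
    multi_source = all(n >= 2 for n in sources_per_key_field.values())
    return not find_gaps(report) and multi_source
```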
Self-Review Questions
Before finalizing any research:
- Have I searched for ALL relevant tools?
- Have I queried multiple databases?
- Have I followed cross-references?
- Have I checked recent literature?
- Have I documented what's unavailable?
- Is there any obvious gap I haven't addressed?
- Would someone reading this ask "but what about X?"
Quick Reference: Tool Categories
Protein & Gene Tools
UniProt, Proteins API, MyGene, Ensembl tools
Structure Tools
RCSB PDB, PDBe, AlphaFold, InterPro tools
Drug & Compound Tools
ChEMBL, PubChem, DGIdb, ADMET-AI, DrugBank tools
Disease & Phenotype Tools
OpenTargets, ClinVar, GWAS, HPO tools
Expression Tools
GTEx, Human Protein Atlas, CELLxGENE tools
Variant Tools
gnomAD, ClinVar, dbSNP tools
Pathway Tools
Reactome, KEGG, WikiPathways, GO tools
Literature Tools
PubMed, EuropePMC, OpenAlex, SemanticScholar tools
Clinical Tools
ClinicalTrials.gov, FAERS, PharmGKB, DailyMed tools
Troubleshooting Common Issues
"Tool not found"
- Search for similar tools using Tool_Finder
- Check spelling of tool name
- Try alternative tools for same data type
"Empty results"
- Check spelling of query terms
- Try synonyms/aliases
- Try alternative databases
- Verify IDs are correct type
"Conflicting data"
- Note all sources
- Prefer curated databases
- Document the conflict in report
- Use evidence grading
"Incomplete picture"
- Search for more tools
- Query additional databases
- Follow cross-references
- Expand via literature
Strategy 10: English-First Tool Queries
CRITICAL: Most ToolUniverse tools only accept English terms. Always translate queries to English before calling tools, regardless of the user's language.
Language Handling Rules
- Default to English - All tool calls must use English search terms, entity names, and parameters
- Translate non-English input - If the user's question is in Chinese, Japanese, Korean, or any other language, translate the relevant scientific terms to English before making tool calls
- Respond in the user's language - While tools must be queried in English, deliver the final report/answer in the user's original language
- Fall back to the original language - Only if an English search returns no results, retry with the original-language terms
- Check tool descriptions - A few tools may explicitly document multi-language support; use the original language only when the tool description says so
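The default-then-fallback rule can be sketched as a wrapper around any search callable. `english_first_search` is a hypothetical helper; translation itself is assumed to have happened upstream, so the function only receives the already-translated English terms and the original-language terms.

```python
from typing import Callable, List, Tuple


def english_first_search(search: Callable[[str], list],
                         english_terms: List[str],
                         original_terms: List[str]) -> Tuple[str, list]:
    """Query with English terms; retry original-language terms only if all are empty."""
    for term in english_terms:
        hits = search(term)
        if hits:
            return term, hits
    # Fallback: every English query came back empty.
    for term in original_terms:
        hits = search(term)
        if hits:
            return term, hits
    return "", []
```

The returned term records which query actually produced the hits, which helps when documenting sources in the final report.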
Examples
User (Chinese): "研究EGFR靶点"
→ Tool calls: use "EGFR", "epidermal growth factor receptor" (English)
→ Report: deliver in Chinese
User (Japanese): "メトホルミンの安全性プロファイル"
→ Tool calls: use "metformin", "safety profile" (English)
→ Report: deliver in Japanese
User (Korean): "알츠하이머병 관련 유전자"
→ Tool calls: use "Alzheimer's disease", "associated genes" (English)
→ Report: deliver in Korean
Why This Matters
| Scenario | Wrong Approach | Correct Approach |
|---|---|---|
| User asks in Chinese about "二甲双胍" | Pass "二甲双胍" to PubChem search | Translate to "metformin", search in English |
| User asks in Japanese about a disease | Pass Japanese disease name to OpenTargets | Translate to English disease name first |
| User asks in Spanish about a gene | Pass Spanish description to tool | Use standard gene symbol (e.g., TP53) |
Summary: The ToolUniverse Mindset
| Principle | Action |
|---|---|
| Clarify first | Confirm entity, scope, species, and output before researching |
| Search widely | 10000+ tools; always discover more |
| Multi-hop persistence | 5-10 tool calls per question is normal |
| Cross-reference | Query multiple databases for same data |
| Disambiguate first | Resolve IDs before research |
| Never give up | Fallbacks for every failure |
| Report comprehensively | Detail with sources and evidence grades |
| Use specialized skills | Apply domain-specific skills for focused tasks |
| Execute in parallel | Speed through concurrent execution |
| Check completeness | Ask "what's missing?" and fill gaps iteratively |
| English-first queries | Translate to English for tool calls; respond in user's language |
The goal: Transform 10000+ tools into comprehensive, reliable scientific intelligence.