tooluniverse-sdk
ToolUniverse Python SDK
ToolUniverse provides programmatic access to 1000+ scientific tools through a unified interface. It implements the AI-Tool Interaction Protocol for building AI scientist systems that integrate ML models, databases, APIs, and scientific packages.
IMPORTANT - Language Handling: Most tools accept English terms only. When building workflows, always translate non-English input to English before passing to tool parameters. Only try original-language terms as a fallback if English returns no results.
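The English-first rule above can be sketched as a small wrapper. This is only an illustration: `search_with_fallback`, `translate`, and `search` are hypothetical names, not part of ToolUniverse. `search` stands in for whatever tool call a workflow makes and `translate` for whatever translation step is available; the glossary and index below are toy data.

```python
from typing import Callable, Dict, List


def search_with_fallback(
    term: str,
    translate: Callable[[str], str],
    search: Callable[[str], List[dict]],
) -> List[dict]:
    """Query with the English form of `term`; retry with the original only if that is empty."""
    english = translate(term)
    results = search(english)
    if results or english == term:
        return results
    # English returned nothing: fall back to the original-language term.
    return search(term)


# Toy stand-ins so the sketch runs without any external service (hypothetical data).
_GLOSSARY: Dict[str, str] = {"阿司匹林": "aspirin", "灵芝": "lingzhi mushroom"}
_INDEX: Dict[str, List[dict]] = {
    "aspirin": [{"name": "aspirin"}],
    "灵芝": [{"name": "Ganoderma"}],
}


def toy_translate(term: str) -> str:
    return _GLOSSARY.get(term, term)


def toy_search(term: str) -> List[dict]:
    return _INDEX.get(term, [])


print(search_with_fallback("阿司匹林", toy_translate, toy_search))  # English hit
print(search_with_fallback("灵芝", toy_translate, toy_search))      # fallback hit
```

The wrapper never tries the original term when the English query succeeds, which matches the "fallback only" wording above.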
Installation
```bash
# Standard installation
pip install tooluniverse

# With optional features (quoted so the brackets survive shells such as zsh)
pip install "tooluniverse[embedding]"  # Embedding search (GPU)
pip install "tooluniverse[ml]"         # ML model tools
pip install "tooluniverse[all]"        # All features
```
Environment Setup
```bash
# Required for LLM-based tool search and hooks
export OPENAI_API_KEY="sk-..."

# Optional for higher rate limits
export NCBI_API_KEY="..."
```
Or use a `.env` file:
```python
from dotenv import load_dotenv
load_dotenv()
```
Quick Start
```python
from tooluniverse import ToolUniverse

# 1. Initialize and load tools
tu = ToolUniverse()
tu.load_tools()  # Loads 1000+ tools (~5-10 seconds first time)

# 2. Find tools (three methods)
# Method A: Keyword (fast, no API key)
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure", "limit": 10}
})

# Method B: LLM (intelligent, requires OPENAI_API_KEY)
tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {"description": "predict drug toxicity", "limit": 5}
})

# Method C: Embedding (semantic, requires GPU)
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {"description": "protein interactions", "limit": 10}
})

# 3. Execute tools (two ways)
# Dictionary API
result = tu.run({
    "name": "UniProt_get_entry_by_accession",
    "arguments": {"accession": "P05067"}
})

# Function API (recommended)
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
Core Patterns
Pattern 1: Discovery → Execute
```python
# Find tools
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "ADMET prediction", "limit": 3}
})

# Check results structure
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(f"{tool['name']}: {tool['description']}")

# Execute tool
result = tu.tools.ADMETAI_predict_admet(
    smiles="CC(C)Cc1ccc(cc1)C(C)C(O)=O"
)
```
Pattern 2: Batch Execution
```python
# Define calls
calls = [
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P05067"}},
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P12345"}},
    {"name": "RCSB_PDB_get_structure_by_id", "arguments": {"pdb_id": "1ABC"}}
]

# Execute in parallel
results = tu.run_batch(calls)
```
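When the call list is built from data rather than typed by hand, a small helper keeps the payloads uniform. A minimal sketch: `make_call` and `build_accession_calls` are hypothetical helpers, not part of the SDK. They only produce the `{"name": ..., "arguments": ...}` dictionaries used above and do not contact any service.

```python
from typing import Any, Dict, Iterable, List


def make_call(name: str, **arguments: Any) -> Dict[str, Any]:
    """Build one call dictionary in the name/arguments layout."""
    return {"name": name, "arguments": dict(arguments)}


def build_accession_calls(accessions: Iterable[str]) -> List[Dict[str, Any]]:
    """One UniProt call per distinct accession, preserving first-seen order."""
    seen = set()
    calls = []
    for accession in accessions:
        accession = accession.strip()
        if accession and accession not in seen:
            seen.add(accession)
            calls.append(
                make_call("UniProt_get_entry_by_accession", accession=accession)
            )
    return calls


calls = build_accession_calls(["P05067", " P12345", "P05067", ""])
print(calls)
```

The resulting list has the same shape as the hand-written `calls` above, so it could be passed to `tu.run_batch(calls)`; dropping duplicates first avoids issuing the same request twice.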
Pattern 3: Scientific Workflow
```python
from tooluniverse import ToolUniverse


def drug_discovery_pipeline(disease_id):
    tu = ToolUniverse(use_cache=True)
    tu.load_tools()
    try:
        # Get targets
        targets = tu.tools.OpenTargets_get_associated_targets_by_disease_efoId(
            efoId=disease_id
        )
        # Get compounds (batch)
        compound_calls = [
            {"name": "ChEMBL_search_molecule_by_target",
             "arguments": {"target_id": t['id'], "limit": 10}}
            for t in targets['data'][:5]
        ]
        compounds = tu.run_batch(compound_calls)
        # Predict ADMET
        admet_results = []
        for comp_list in compounds:
            if comp_list and 'molecules' in comp_list:
                for mol in comp_list['molecules'][:3]:
                    admet = tu.tools.ADMETAI_predict_admet(
                        smiles=mol['smiles'],
                        use_cache=True
                    )
                    admet_results.append(admet)
        return {"targets": targets, "compounds": compounds, "admet": admet_results}
    finally:
        tu.close()
```
Configuration
Caching
```python
# Enable globally
tu = ToolUniverse(use_cache=True)
tu.load_tools()

# Or per-call
result = tu.tools.ADMETAI_predict_admet(
    smiles="...",
    use_cache=True  # Cache expensive predictions
)

# Manage cache
stats = tu.get_cache_stats()
tu.clear_cache()
```
Hooks (Auto-summarization)
```python
# Enable hooks for large outputs
tu = ToolUniverse(hooks_enabled=True)
tu.load_tools()
result = tu.tools.OpenTargets_get_target_gene_ontology_by_ensemblID(
    ensemblId="ENSG00000012048"
)

# Check if summarized
if isinstance(result, dict) and "summary" in result:
    print(f"Summarized: {result['summary']}")
```
Load Specific Categories
```python
# Faster loading
tu = ToolUniverse()
tu.load_tools(categories=["proteins", "drugs"])
```
Critical Things to Know
⚠️ Always Call load_tools()
```python
# ❌ Wrong - will fail
tu = ToolUniverse()
result = tu.tools.some_tool()  # Error!

# ✅ Correct
tu = ToolUniverse()
tu.load_tools()
result = tu.tools.some_tool()
```
⚠️ Tool Finder Returns Nested Structure
```python
# ❌ Wrong
tools = tu.run({"name": "Tool_Finder_Keyword", "arguments": {"description": "protein"}})
for tool in tools:  # Error: tools is a dict, so this iterates over its keys
    print(tool['name'])

# ✅ Correct
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(tool['name'])
```
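The same check can be wrapped once and reused wherever finder results are consumed. A minimal sketch, assuming only the `{"tools": [...]}` layout shown above: `extract_tools` is a hypothetical helper, not an SDK function, and the sample result is hand-written rather than real finder output.

```python
from typing import Any, Dict, List


def extract_tools(result: Any) -> List[Dict[str, Any]]:
    """Return the dict entries under result['tools'], or [] if the shape is unexpected."""
    if isinstance(result, dict):
        tools = result.get("tools")
        if isinstance(tools, list):
            return [tool for tool in tools if isinstance(tool, dict)]
    return []


# Hand-written sample in the nested layout (not real finder output).
sample = {"tools": [{"name": "UniProt_get_entry_by_accession"}]}

for tool in extract_tools(sample):
    print(tool["name"])

print(extract_tools("unexpected"))  # a non-dict result yields an empty list
```

Returning an empty list for unexpected shapes lets callers loop unconditionally instead of repeating the `isinstance` check.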
⚠️ Check Required Parameters
```python
# Check tool schema first
tool_info = tu.all_tool_dict["UniProt_get_entry_by_accession"]
required = tool_info['parameter'].get('required', [])
print(f"Required: {required}")

# Then call
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
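The schema lookup above can be turned into a pre-flight check that reports which required arguments are missing before a call is made. A sketch under stated assumptions: `missing_required` is a hypothetical helper, and `example_info` is a hand-written dictionary that mirrors the `parameter` / `required` layout used above rather than a schema fetched from ToolUniverse.

```python
from typing import Any, Dict, List


def missing_required(tool_info: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """List required parameter names that are absent from `arguments`."""
    required = tool_info.get("parameter", {}).get("required", [])
    return [name for name in required if name not in arguments]


# Hand-written schema in the same layout (assumption, not fetched from the SDK).
example_info = {
    "parameter": {
        "properties": {"accession": {"type": "string"}},
        "required": ["accession"],
    }
}

print(missing_required(example_info, {}))
print(missing_required(example_info, {"accession": "P05067"}))
```

An empty list means every required name is present; it says nothing about whether the values themselves are valid.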
⚠️ Cache Strategy
```python
# ✅ Cache: ML predictions, database queries (deterministic)
result = tu.tools.ADMETAI_predict_admet(smiles="...", use_cache=True)

# ❌ Don't cache: real-time data, time-sensitive results
result = tu.tools.get_latest_publications()  # No cache
```
⚠️ Error Handling
```python
from tooluniverse.exceptions import ToolError, ToolUnavailableError

try:
    result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
except ToolUnavailableError as e:
    print(f"Tool unavailable: {e}")
except ToolError as e:
    print(f"Execution failed: {e}")
```
⚠️ Tool Names Are Case-Sensitive
```python
# ❌ Wrong
result = tu.tools.uniprot_get_entry_by_accession(accession="P05067")

# ✅ Correct
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
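If tool names arrive from user input or another system, one way to guard against casing mistakes is to look up the registered spelling before calling. A minimal sketch: `resolve_tool_name` is a hypothetical helper, not an SDK function, and `known` is a hand-written list standing in for the registered tool names.

```python
from typing import Iterable, Optional


def resolve_tool_name(name: str, known_names: Iterable[str]) -> Optional[str]:
    """Return the registered spelling of `name`, ignoring case; None if absent or ambiguous."""
    wanted = name.casefold()
    matches = [known for known in known_names if known.casefold() == wanted]
    return matches[0] if len(matches) == 1 else None


# Stand-in for the registered names (assumption for illustration).
known = ["UniProt_get_entry_by_accession", "ADMETAI_predict_admet"]

print(resolve_tool_name("uniprot_get_entry_by_accession", known))
print(resolve_tool_name("no_such_tool", known))
```

In a real session the keys of `tu.all_tool_dict` could plausibly serve as `known_names`, since that dictionary is indexed by tool name elsewhere in this guide.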
Execution Options
```python
result = tu.tools.tool_name(
    param="value",
    use_cache=True,       # Cache this call
    validate=True,        # Validate parameters (default)
    stream_callback=None  # Streaming output
)
```
Performance Tips
```python
# 1. Load specific categories
tu.load_tools(categories=["proteins"])

# 2. Use batch execution
results = tu.run_batch(calls)

# 3. Enable caching
tu = ToolUniverse(use_cache=True)

# 4. Disable validation (after testing)
result = tu.tools.tool_name(param="value", validate=False)
```
Troubleshooting
Tool Not Found
```python
# Search for tool
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "partial_name", "limit": 10}
})

# Check if exists
if "Tool_Name" in tu.all_tool_dict:
    print("Found!")
```
API Key Issues
```python
import os

if not os.environ.get("OPENAI_API_KEY"):
    print("⚠️ OPENAI_API_KEY not set")
    print("Set: export OPENAI_API_KEY='sk-...'")
```
Validation Errors
```python
from tooluniverse.exceptions import ToolValidationError

try:
    result = tu.tools.some_tool(param="value")
except ToolValidationError as e:
    # Check schema
    tool_info = tu.all_tool_dict["some_tool"]
    print(f"Validation failed: {e}")
    print(f"Required: {tool_info['parameter'].get('required', [])}")
    print(f"Properties: {list(tool_info['parameter']['properties'])}")
```
Enable Debug Logging
```python
from tooluniverse.logging_config import set_log_level

set_log_level("DEBUG")
```
Tool Categories
| Category | Tools | Use Cases |
|---|---|---|
| Proteins | UniProt, RCSB PDB, AlphaFold | Protein analysis, structure |
| Drugs | DrugBank, ChEMBL, PubChem | Drug discovery, compounds |
| Genomics | Ensembl, NCBI Gene, gnomAD | Gene analysis, variants |
| Diseases | OpenTargets, ClinVar | Disease-target associations |
| Literature | PubMed, Europe PMC | Literature search |
| ML Models | ADMET-AI, AlphaFold | Predictions, modeling |
| Pathways | KEGG, Reactome | Pathway analysis |
Resources
- Documentation: https://zitniklab.hms.harvard.edu/ToolUniverse/
- Tool List: https://zitniklab.hms.harvard.edu/ToolUniverse/tools/tools_config_index.html
- GitHub: https://github.com/mims-harvard/ToolUniverse
- Examples: See the `examples/` directory in the repository
- Slack: https://join.slack.com/t/tooluniversehq/shared_invite/zt-3dic3eoio-5xxoJch7TLNibNQn5_AREQ
For detailed guides, see REFERENCE.md.