ToolUniverse Python SDK

ToolUniverse provides programmatic access to 1000+ scientific tools through a unified interface. It implements the AI-Tool Interaction Protocol for building AI scientist systems that integrate ML models, databases, APIs, and scientific packages.

IMPORTANT - Language Handling: Most tools accept English terms only. When building workflows, always translate non-English input to English before passing to tool parameters. Only try original-language terms as a fallback if English returns no results.
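
The English-first rule above can be sketched as a small wrapper. This is a minimal illustration, not part of the SDK: `search_with_fallback` and `fake_search` are hypothetical names, translation is assumed to happen before the call, and an empty result is assumed to mean "no hits".

```python
from typing import Callable, Sequence


def search_with_fallback(
    search: Callable[[str], Sequence],
    english_term: str,
    original_term: str,
) -> Sequence:
    """Query with the English term first; retry with the original term only if nothing came back."""
    results = search(english_term)
    if results:
        return results
    return search(original_term)


# Stand-in for a real tool call: a tiny in-memory index keyed by search term.
_fake_index = {"aspirin": ["hit-1"], "阿司匹林": []}


def fake_search(term: str) -> list:
    return _fake_index.get(term, [])


print(search_with_fallback(fake_search, "aspirin", "阿司匹林"))
```

In a real workflow, `search` would wrap whichever tool call is being made, for example a lambda around `tu.run(...)`.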

Installation

```bash
# Standard installation
pip install tooluniverse

# With optional features
pip install tooluniverse[embedding]  # Embedding search (GPU)
pip install tooluniverse[ml]         # ML model tools
pip install tooluniverse[all]        # All features
```

Environment Setup

```bash
# Required for LLM-based tool search and hooks
export OPENAI_API_KEY="sk-..."

# Optional for higher rate limits
export NCBI_API_KEY="..."
```

Or use a `.env` file:

```python
from dotenv import load_dotenv
load_dotenv()
```
Quick Start

```python
from tooluniverse import ToolUniverse

# 1. Initialize and load tools
tu = ToolUniverse()
tu.load_tools()  # Loads 1000+ tools (~5-10 seconds first time)

# 2. Find tools (three methods)
# Method A: Keyword (fast, no API key)
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure", "limit": 10}
})

# Method B: LLM (intelligent, requires OPENAI_API_KEY)
tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {"description": "predict drug toxicity", "limit": 5}
})

# Method C: Embedding (semantic, requires GPU)
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {"description": "protein interactions", "limit": 10}
})

# 3. Execute tools (two ways)
# Dictionary API
result = tu.run({
    "name": "UniProt_get_entry_by_accession",
    "arguments": {"accession": "P05067"}
})

# Function API (recommended)
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
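
The dictionary API always takes the same `{"name": ..., "arguments": {...}}` shape, so a tiny builder can cut down on typos. This is a sketch only: `make_call` is a hypothetical convenience helper, not something the SDK provides.

```python
def make_call(name: str, **arguments) -> dict:
    """Build the {"name": ..., "arguments": {...}} payload used by the dictionary API."""
    return {"name": name, "arguments": dict(arguments)}


call = make_call("UniProt_get_entry_by_accession", accession="P05067")
print(call)
```

The resulting dictionary can be passed to `tu.run(call)` or collected into a list for `tu.run_batch(...)`.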

Core Patterns

Pattern 1: Discovery → Execute

```python
# Find tools
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "ADMET prediction", "limit": 3}
})

# Check results structure
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(f"{tool['name']}: {tool['description']}")

# Execute tool
result = tu.tools.ADMETAI_predict_admet(
    smiles="CC(C)Cc1ccc(cc1)C(C)C(O)=O"
)
```

Pattern 2: Batch Execution

```python
# Define calls
calls = [
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P05067"}},
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P12345"}},
    {"name": "RCSB_PDB_get_structure_by_id", "arguments": {"pdb_id": "1ABC"}}
]

# Execute in parallel
results = tu.run_batch(calls)
```
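
For long call lists it can help to submit work in smaller groups, for example to stay under an upstream rate limit. The helper below is a sketch under that assumption; `chunk_calls` is a hypothetical name and the chunk size is arbitrary.

```python
from typing import Iterator


def chunk_calls(calls: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `calls`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(calls), size):
        yield calls[start:start + size]


calls = [
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P05067"}},
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P12345"}},
    {"name": "RCSB_PDB_get_structure_by_id", "arguments": {"pdb_id": "1ABC"}},
]

batches = list(chunk_calls(calls, 2))
print([len(batch) for batch in batches])
```

Each batch could then be passed to `tu.run_batch(batch)` in turn, with the results appended to one list.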

Pattern 3: Scientific Workflow

```python
from tooluniverse import ToolUniverse

def drug_discovery_pipeline(disease_id):
    tu = ToolUniverse(use_cache=True)
    tu.load_tools()

    try:
        # Get targets
        targets = tu.tools.OpenTargets_get_associated_targets_by_disease_efoId(
            efoId=disease_id
        )

        # Get compounds (batch)
        compound_calls = [
            {"name": "ChEMBL_search_molecule_by_target",
             "arguments": {"target_id": t['id'], "limit": 10}}
            for t in targets['data'][:5]
        ]
        compounds = tu.run_batch(compound_calls)

        # Predict ADMET
        admet_results = []
        for comp_list in compounds:
            if comp_list and 'molecules' in comp_list:
                for mol in comp_list['molecules'][:3]:
                    admet = tu.tools.ADMETAI_predict_admet(
                        smiles=mol['smiles'],
                        use_cache=True
                    )
                    admet_results.append(admet)

        return {"targets": targets, "compounds": compounds, "admet": admet_results}
    finally:
        tu.close()
```
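
The nested loop in the workflow assumes every batch result is a dict with a `molecules` list whose items carry a `smiles` key. A defensive extraction step, sketched below, skips anything that does not match that shape. `collect_smiles` is a hypothetical helper, and the sample data is invented purely to exercise it.

```python
def collect_smiles(compound_results: list, per_target: int = 3) -> list:
    """Gather SMILES strings from batch results, ignoring failed or malformed entries."""
    smiles = []
    for comp_list in compound_results:
        if not isinstance(comp_list, dict):
            continue  # e.g. None for a failed call
        molecules = comp_list.get("molecules") or []
        for mol in molecules[:per_target]:
            value = mol.get("smiles") if isinstance(mol, dict) else None
            if value:
                smiles.append(value)
    return smiles


sample = [
    {"molecules": [{"smiles": "CCO"}, {"name": "no structure"}]},
    None,
    {"error": "timeout"},
]
print(collect_smiles(sample))
```

Separating extraction from prediction keeps the ADMET loop simple and makes the shape assumptions easy to test without network access.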

Configuration

Caching

```python
# Enable globally
tu = ToolUniverse(use_cache=True)
tu.load_tools()

# Or per-call
result = tu.tools.ADMETAI_predict_admet(
    smiles="...",
    use_cache=True  # Cache expensive predictions
)

# Manage cache
stats = tu.get_cache_stats()
tu.clear_cache()
```
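
To make the idea behind call caching concrete, here is a toy in-memory cache keyed by tool name and arguments. It illustrates the general technique only and makes no claim about how ToolUniverse implements its cache; `CallCache` and `fake_run` are hypothetical.

```python
import json
from typing import Any, Callable


class CallCache:
    """Toy cache: identical (name, arguments) pairs reuse the stored result."""

    def __init__(self) -> None:
        self._store: dict = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(name: str, arguments: dict) -> str:
        # sort_keys makes the key independent of argument order
        return json.dumps({"name": name, "arguments": arguments}, sort_keys=True)

    def get_or_run(self, name: str, arguments: dict, run: Callable[[str, dict], Any]) -> Any:
        k = self.key(name, arguments)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        self.misses += 1
        self._store[k] = run(name, arguments)
        return self._store[k]


def fake_run(name: str, arguments: dict) -> dict:
    # Stand-in for an expensive tool call
    return {"echo": arguments}


cache = CallCache()
cache.get_or_run("demo_tool", {"a": 1, "b": 2}, fake_run)
cache.get_or_run("demo_tool", {"b": 2, "a": 1}, fake_run)  # same call, different key order
print(cache.hits, cache.misses)
```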

Hooks (Auto-summarization)

```python
# Enable hooks for large outputs
tu = ToolUniverse(hooks_enabled=True)
tu.load_tools()

result = tu.tools.OpenTargets_get_target_gene_ontology_by_ensemblID(
    ensemblId="ENSG00000012048"
)

# Check if summarized
if isinstance(result, dict) and "summary" in result:
    print(f"Summarized: {result['summary']}")
```
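
Downstream code often needs to treat summarized and raw outputs uniformly. The sketch below wraps the `"summary"` check shown above in a helper; `split_summary` is a hypothetical name, and it assumes the summary key is the only signal that a hook ran.

```python
from typing import Any, Tuple


def split_summary(result: Any) -> Tuple[bool, Any]:
    """Return (was_summarized, payload): the summary text if present, else the raw result."""
    if isinstance(result, dict) and "summary" in result:
        return True, result["summary"]
    return False, result


print(split_summary({"summary": "short text"}))
print(split_summary({"data": [1, 2, 3]}))
```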

Load Specific Categories

```python
# Faster loading
tu = ToolUniverse()
tu.load_tools(categories=["proteins", "drugs"])
```

Critical Things to Know

⚠️ Always Call load_tools()

```python
# ❌ Wrong - will fail
tu = ToolUniverse()
result = tu.tools.some_tool()  # Error!

# ✅ Correct
tu = ToolUniverse()
tu.load_tools()
result = tu.tools.some_tool()
```

⚠️ Tool Finder Returns Nested Structure

```python
# ❌ Wrong
tools = tu.run({"name": "Tool_Finder_Keyword", "arguments": {"description": "protein"}})
for tool in tools:  # Error: tools is a dict
    print(tool['name'])

# ✅ Correct
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(tool['name'])
```
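
If the finder result is consumed in several places, the shape check can live in one helper. This is a minimal sketch: `extract_tools` is a hypothetical name, and accepting a bare list as well as the `{"tools": [...]}` dict is an added tolerance, not documented behavior.

```python
from typing import Any


def extract_tools(result: Any) -> list:
    """Return the list of tool dicts from a finder result, or [] if the shape is unexpected."""
    if isinstance(result, dict):
        tools = result.get("tools", [])
    elif isinstance(result, list):
        tools = result
    else:
        tools = []
    if not isinstance(tools, list):
        return []
    return [t for t in tools if isinstance(t, dict) and "name" in t]


sample = {"tools": [{"name": "UniProt_get_entry_by_accession", "description": "..."}, "junk"]}
for tool in extract_tools(sample):
    print(tool["name"])
```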

⚠️ Check Required Parameters

```python
# Check tool schema first
tool_info = tu.all_tool_dict["UniProt_get_entry_by_accession"]
required = tool_info['parameter'].get('required', [])
print(f"Required: {required}")

# Then call
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
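
The schema lookup above can be turned into a pre-flight check that reports which required parameters are absent before a call is made. This is a sketch: `missing_required` is a hypothetical helper, and `sample_info` is a hand-written stand-in that mirrors the `parameter` / `required` layout shown above rather than a real schema.

```python
def missing_required(tool_info: dict, arguments: dict) -> list:
    """List required parameter names that are not present in `arguments`."""
    required = tool_info.get("parameter", {}).get("required", [])
    return [name for name in required if name not in arguments]


# Hand-written stand-in for an entry of tu.all_tool_dict
sample_info = {
    "parameter": {
        "required": ["accession"],
        "properties": {"accession": {"type": "string"}},
    }
}

print(missing_required(sample_info, {}))
print(missing_required(sample_info, {"accession": "P05067"}))
```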

⚠️ Cache Strategy

```python
# ✅ Cache: ML predictions, database queries (deterministic)
result = tu.tools.ADMETAI_predict_admet(smiles="...", use_cache=True)

# ❌ Don't cache: real-time data, time-sensitive results
result = tu.tools.get_latest_publications()  # No cache
```

⚠️ Error Handling

```python
from tooluniverse.exceptions import ToolError, ToolUnavailableError

try:
    result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
except ToolUnavailableError as e:
    print(f"Tool unavailable: {e}")
except ToolError as e:
    print(f"Execution failed: {e}")
```
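
When a tool is only temporarily unavailable, retrying with a growing delay is a common follow-up to catching the error. The sketch below shows that pattern in isolation; `call_with_retry` and `TransientError` are hypothetical stand-ins, and the retry policy (three attempts, doubling delay) is an arbitrary choice rather than SDK behavior.

```python
import time
from typing import Any, Callable, Tuple, Type


class TransientError(Exception):
    """Stand-in for an 'unavailable' style exception."""


def call_with_retry(
    func: Callable[[], Any],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TransientError,),
) -> Any:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


calls = {"count": 0}


def flaky() -> str:
    # Fails twice, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("service busy")
    return "ok"


result = call_with_retry(flaky, attempts=3, base_delay=0)
print(result, calls["count"])
```

With the SDK, `retry_on=(ToolUnavailableError,)` and a lambda around the tool call would be the natural substitution.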

⚠️ Tool Names Are Case-Sensitive

```python
# ❌ Wrong
result = tu.tools.uniprot_get_entry_by_accession(accession="P05067")

# ✅ Correct
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```
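
When tool names come from user input or an LLM, a case-insensitive lookup can recover the canonical spelling before the call. This is a sketch: `resolve_tool_name` is a hypothetical helper, and the known names would typically be the keys of `tu.all_tool_dict`.

```python
from typing import Iterable, Optional


def resolve_tool_name(name: str, known_names: Iterable[str]) -> Optional[str]:
    """Return the canonical spelling of `name`, matched case-insensitively, or None."""
    lookup = {known.lower(): known for known in known_names}
    return lookup.get(name.lower())


known = ["UniProt_get_entry_by_accession", "ADMETAI_predict_admet"]
print(resolve_tool_name("uniprot_get_entry_by_accession", known))
```

The resolved name can then be used with the dictionary API, e.g. `tu.run({"name": resolved, "arguments": {...}})`.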

Execution Options

```python
result = tu.tools.tool_name(
    param="value",
    use_cache=True,       # Cache this call
    validate=True,        # Validate parameters (default)
    stream_callback=None  # Streaming output
)
```

Performance Tips

```python
# 1. Load specific categories
tu.load_tools(categories=["proteins"])

# 2. Use batch execution
results = tu.run_batch(calls)

# 3. Enable caching
tu = ToolUniverse(use_cache=True)

# 4. Disable validation (after testing)
result = tu.tools.tool_name(param="value", validate=False)
```

Troubleshooting

Tool Not Found

```python
# Search for tool
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "partial_name", "limit": 10}
})

# Check if exists
if "Tool_Name" in tu.all_tool_dict:
    print("Found!")
```
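
When a name is close but not exact, fuzzy matching against the known tool names can suggest candidates locally, without a finder call. The sketch below uses the standard library's `difflib`; `suggest_tool_names` is a hypothetical helper, and the 0.6 cutoff is simply the `difflib` default.

```python
import difflib
from typing import Iterable


def suggest_tool_names(name: str, known_names: Iterable[str], limit: int = 3) -> list:
    """Return up to `limit` known names that most closely resemble `name`."""
    return difflib.get_close_matches(name, list(known_names), n=limit, cutoff=0.6)


known = [
    "UniProt_get_entry_by_accession",
    "RCSB_PDB_get_structure_by_id",
    "ADMETAI_predict_admet",
]
suggestions = suggest_tool_names("UniProt_get_entry", known)
print(suggestions)
```

In practice `known_names` would be `tu.all_tool_dict.keys()`.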

API Key Issues

```python
import os
if not os.environ.get("OPENAI_API_KEY"):
    print("⚠️ OPENAI_API_KEY not set")
    print("Set: export OPENAI_API_KEY='sk-...'")
```
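
The same check generalizes to several variables at once, which is handy at the top of a script. This is a sketch: `missing_env_vars` is a hypothetical helper, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Iterable, Mapping, Optional


def missing_env_vars(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> list:
    """Return the variable names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


fake_env = {"OPENAI_API_KEY": "sk-...", "NCBI_API_KEY": ""}
print(missing_env_vars(["OPENAI_API_KEY", "NCBI_API_KEY"], env=fake_env))
```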

Validation Errors

```python
from tooluniverse.exceptions import ToolValidationError

try:
    result = tu.tools.some_tool(param="value")
except ToolValidationError as e:
    # Check schema
    tool_info = tu.all_tool_dict["some_tool"]
    print(f"Required: {tool_info['parameter'].get('required', [])}")
    print(f"Properties: {list(tool_info['parameter']['properties'].keys())}")
```
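
A frequent cause of validation errors is a misspelled or unsupported argument name. Comparing the supplied arguments against the schema's `properties` can pinpoint it. This is a sketch: `unexpected_arguments` is a hypothetical helper, and `sample_info` is a hand-written stand-in for a schema entry.

```python
def unexpected_arguments(tool_info: dict, arguments: dict) -> list:
    """List argument names that do not appear in the tool schema's properties."""
    properties = tool_info.get("parameter", {}).get("properties", {})
    return sorted(name for name in arguments if name not in properties)


# Hand-written stand-in for an entry of tu.all_tool_dict
sample_info = {
    "parameter": {
        "required": ["accession"],
        "properties": {"accession": {"type": "string"}},
    }
}

print(unexpected_arguments(sample_info, {"accesion": "P05067"}))
```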

Enable Debug Logging

```python
from tooluniverse.logging_config import set_log_level
set_log_level("DEBUG")
```

Tool Categories

| Category | Tools | Use Cases |
|----------|-------|-----------|
| Proteins | UniProt, RCSB PDB, AlphaFold | Protein analysis, structure |
| Drugs | DrugBank, ChEMBL, PubChem | Drug discovery, compounds |
| Genomics | Ensembl, NCBI Gene, gnomAD | Gene analysis, variants |
| Diseases | OpenTargets, ClinVar | Disease-target associations |
| Literature | PubMed, Europe PMC | Literature search |
| ML Models | ADMET-AI, AlphaFold | Predictions, modeling |
| Pathways | KEGG, Reactome | Pathway analysis |

Resources

Documentation: https://zitniklab.hms.harvard.edu/ToolUniverse/
Tool List: https://zitniklab.hms.harvard.edu/ToolUniverse/tools/tools_config_index.html
GitHub: https://github.com/mims-harvard/ToolUniverse
Examples: See the `examples/` directory in the repository
Slack: https://join.slack.com/t/tooluniversehq/shared_invite/zt-3dic3eoio-5xxoJch7TLNibNQn5_AREQ

For detailed guides, see REFERENCE.md.
