Biomni

Overview


Biomni is an open-source biomedical AI agent framework from Stanford's SNAP lab that autonomously executes complex research tasks across biomedical domains. Use this skill when working on multi-step biological reasoning tasks, analyzing biomedical data, or conducting research spanning genomics, drug discovery, molecular biology, and clinical analysis.


Core Capabilities


Biomni excels at:

- **Multi-step biological reasoning** - Autonomous task decomposition and planning for complex biomedical queries
- **Code generation and execution** - Dynamic analysis pipeline creation for data processing
- **Knowledge retrieval** - Access to ~11GB of integrated biomedical databases and literature
- **Cross-domain problem solving** - Unified interface for genomics, proteomics, drug discovery, and clinical tasks

When to Use This Skill


Use biomni for:

- **CRISPR screening** - Design screens, prioritize genes, analyze knockout effects
- **Single-cell RNA-seq** - Cell type annotation, differential expression, trajectory analysis
- **Drug discovery** - ADMET prediction, target identification, compound optimization
- **GWAS analysis** - Variant interpretation, causal gene identification, pathway enrichment
- **Clinical genomics** - Rare disease diagnosis, variant pathogenicity, phenotype-genotype mapping
- **Lab protocols** - Protocol optimization, literature synthesis, experimental design

Quick Start

Installation and Setup

Install Biomni and configure API keys for LLM providers:

```bash
uv pip install biomni --upgrade
```

Configure API keys (store them in a `.env` file or as environment variables):

```bash
export ANTHROPIC_API_KEY="your-key-here"
# Optional: OpenAI, Azure, Google, Groq, AWS Bedrock keys
```

Use `scripts/setup_environment.py` for interactive setup assistance.
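Before starting an agent, it can help to confirm that the expected keys are actually visible to the Python process. The following is a minimal sketch using only the standard library. `ANTHROPIC_API_KEY` is the variable named in the setup step; the helper name `missing_keys` is hypothetical and not part of Biomni.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # ANTHROPIC_API_KEY is the variable exported in the setup step.
    absent = missing_keys(["ANTHROPIC_API_KEY"])
    if absent:
        print("Missing API keys:", ", ".join(absent))
    else:
        print("All required API keys are set.")
```

Note that this only inspects environment variables; keys stored solely in a `.env` file will not appear until something loads that file into the environment.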

Basic Usage Pattern

```python
from biomni.agent import A1

# Initialize agent with data path and LLM choice
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Execute biomedical task autonomously
agent.go("Your biomedical research question or task")

# Save conversation history and results
agent.save_conversation_history("report.pdf")
```

Working with Biomni

1. Agent Initialization

The A1 class is the primary interface for biomni:

```python
from biomni.agent import A1
from biomni.config import default_config

# Basic initialization
agent = A1(
    path='./data',  # Path to data lake (~11GB downloaded on first use)
    llm='claude-sonnet-4-20250514'  # LLM model selection
)

# Advanced configuration
default_config.llm = "gpt-4"
default_config.timeout_seconds = 1200
default_config.max_iterations = 50
```

**Supported LLM Providers:**
- Anthropic Claude (recommended): `claude-sonnet-4-20250514`, `claude-opus-4-20250514`
- OpenAI: `gpt-4`, `gpt-4-turbo`
- Azure OpenAI: via Azure configuration
- Google Gemini: `gemini-2.0-flash-exp`
- Groq: `llama-3.3-70b-versatile`
- AWS Bedrock: Various models via Bedrock API

See `references/llm_providers.md` for detailed LLM configuration instructions.

2. Task Execution Workflow

Biomni follows an autonomous agent workflow:

```python
from biomni.agent import A1

# Step 1: Initialize agent
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Step 2: Execute task with natural language query
result = agent.go("""
Design a CRISPR screen to identify genes regulating autophagy in HEK293 cells.
Prioritize genes based on essentiality and pathway relevance.
""")

# Step 3: Review generated code and analysis
# Agent autonomously:
# - Decomposes task into sub-steps
# - Retrieves relevant biological knowledge
# - Generates and executes analysis code
# - Interprets results and provides insights

# Step 4: Save results
agent.save_conversation_history("autophagy_screen_report.pdf")
```

3. Common Task Patterns

CRISPR Screening Design

```python
agent.go("""
Design a genome-wide CRISPR knockout screen for identifying genes
affecting [phenotype] in [cell type]. Include:
1. sgRNA library design
2. Gene prioritization criteria
3. Expected hit genes based on pathway analysis
""")
```
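The bracketed markers in these task patterns are placeholders to be filled in before the prompt is sent. One way to do that consistently is sketched below; `fill_template` is a hypothetical helper (not a Biomni function) that substitutes each `[placeholder]` and fails loudly if a value is missing, so an unfilled template never reaches the agent. The example values are borrowed from the autophagy workflow in this document.

```python
import re


def fill_template(template, **values):
    """Replace [placeholder] markers; spaces in a marker map to underscores in the key."""
    def substitute(match):
        key = match.group(1).replace(" ", "_")
        if key not in values:
            raise KeyError(f"no value supplied for placeholder [{match.group(1)}]")
        return str(values[key])

    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


CRISPR_TEMPLATE = """
Design a genome-wide CRISPR knockout screen for identifying genes
affecting [phenotype] in [cell type]. Include:
1. sgRNA library design
2. Gene prioritization criteria
3. Expected hit genes based on pathway analysis
"""

prompt = fill_template(CRISPR_TEMPLATE, phenotype="autophagy", cell_type="HEK293 cells")
# With an initialized agent, the filled prompt would be passed as agent.go(prompt).
```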

Single-Cell RNA-seq Analysis

```python
agent.go("""
Analyze this single-cell RNA-seq dataset:
- Perform quality control and filtering
- Identify cell populations via clustering
- Annotate cell types using marker genes
- Conduct differential expression between conditions
File path: [path/to/data.h5ad]
""")
```

Drug ADMET Prediction

```python
agent.go("""
Predict ADMET properties for these drug candidates:
[SMILES strings or compound IDs]
Focus on:
- Absorption (Caco-2 permeability, HIA)
- Distribution (plasma protein binding, BBB penetration)
- Metabolism (CYP450 interaction)
- Excretion (clearance)
- Toxicity (hERG liability, hepatotoxicity)
""")
```

GWAS Variant Interpretation

```python
agent.go("""
Interpret GWAS results for [trait/disease]:
- Identify genome-wide significant variants
- Map variants to causal genes
- Perform pathway enrichment analysis
- Predict functional consequences
Summary statistics file: [path/to/gwas_summary.txt]
""")
```

See `references/use_cases.md` for comprehensive task examples across all biomedical domains.
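For orientation, "genome-wide significant" in the GWAS pattern conventionally means p < 5e-8. The sketch below shows that filter on its own, assuming a tab-delimited summary file with hypothetical column names `SNP` and `P`; real summary statistics vary in layout. This is not Biomni's implementation, since the agent writes its own analysis code. It is only a way to sanity-check what the agent reports.

```python
import csv
import io

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold


def significant_variants(handle, p_column="P", threshold=GENOME_WIDE_P):
    """Yield rows of a tab-delimited summary file whose p-value is below `threshold`."""
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            p_value = float(row[p_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or malformed p-value
        if p_value < threshold:
            yield row


# Tiny in-memory example with made-up identifiers and values, for illustration only.
example = io.StringIO("SNP\tP\nrs_example_1\t3e-9\nrs_example_2\t0.04\n")
hits = [row["SNP"] for row in significant_variants(example)]
```

Because the function is a generator over an open file handle, it reads one row at a time and also works on summary files too large to load at once.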

4. Data Integration

Biomni integrates ~11GB of biomedical knowledge sources:

- **Gene databases** - Ensembl, NCBI Gene, UniProt
- **Protein structures** - PDB, AlphaFold
- **Clinical datasets** - ClinVar, OMIM, HPO
- **Literature indices** - PubMed abstracts, biomedical ontologies
- **Pathway databases** - KEGG, Reactome, GO

Data is automatically downloaded to the specified `path` on first use.

5. MCP Server Integration

Extend biomni with external tools via Model Context Protocol.

MCP servers can provide:
- FDA drug databases
- Web search for literature
- Custom biomedical APIs
- Laboratory equipment interfaces

Configure MCP servers in `.biomni/mcp_config.json`.

6. Evaluation Framework

Benchmark agent performance on biomedical tasks:

```python
from biomni.eval import BiomniEval1

evaluator = BiomniEval1()

# Evaluate on specific task types
# (agent_output is the answer the agent produced for this instance)
score = evaluator.evaluate(
    task_type='crispr_design',
    instance_id='test_001',
    answer=agent_output
)

# Access evaluation dataset
dataset = evaluator.load_dataset()
```

Best Practices

Task Formulation

- **Be specific** - Include biological context, organism, cell type, conditions
- **Specify outputs** - Clearly state desired analysis outputs and formats
- **Provide data paths** - Include file paths for datasets to analyze
- **Set constraints** - Mention time/computational limits if relevant
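The four points above can be turned into a small prompt builder so that every task carries the same elements. This is a sketch only: `build_task` and its parameter names are hypothetical, and the example values (gene table, volcano plot, 30-minute limit) are invented for illustration.

```python
def build_task(goal, context=None, outputs=None, data_paths=None, constraints=None):
    """Assemble a task prompt from a goal plus optional context, outputs, data, and limits."""
    sections = [goal.strip()]
    if context:
        sections.append("Biological context: " + context.strip())
    if outputs:
        sections.append("Desired outputs:\n" + "\n".join(f"- {item}" for item in outputs))
    if data_paths:
        sections.append("Data files:\n" + "\n".join(f"- {path}" for path in data_paths))
    if constraints:
        sections.append("Constraints: " + constraints.strip())
    return "\n\n".join(sections)


task = build_task(
    goal="Identify differentially expressed genes between conditions.",
    context="Human, HEK293 cells",
    outputs=["Ranked gene table (CSV)", "Volcano plot (PNG)"],
    data_paths=["[path/to/data.h5ad]"],  # placeholder path, as in the patterns above
    constraints="Keep runtime under 30 minutes.",
)
# With an initialized agent, the assembled text would be passed as agent.go(task).
```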

Security Considerations

⚠️ Important: Biomni executes LLM-generated code with full system privileges. For production use:

- Run in isolated environments (Docker, VMs)
- Avoid exposing sensitive credentials
- Review generated code before execution in sensitive contexts
- Use sandboxed execution environments when possible
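As one illustration of limiting credential exposure, the sketch below re-runs a reviewed snippet in a child Python process whose environment has credential-like variables removed. It is not a sandbox and does not replace Docker or a VM, and it is not how Biomni itself executes code; `scrubbed_env`, `run_untrusted`, and the marker list are hypothetical names chosen for this example.

```python
import os
import subprocess
import sys

# Substrings that suggest a variable holds a credential (an assumption; extend as needed).
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def scrubbed_env(env=None, markers=SENSITIVE_MARKERS):
    """Return a copy of `env` without variables whose names look like credentials."""
    env = os.environ if env is None else env
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in markers)
    }


def run_untrusted(code, timeout=60):
    """Run a code string in a child Python process with a scrubbed environment."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

The child process can still read and write files the parent user can access, which is why the isolation measures listed above remain the primary defense.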

Performance Optimization

- **Choose appropriate LLMs** - Claude Sonnet 4 recommended for balance of speed/quality
- **Set reasonable timeouts** - Adjust `default_config.timeout_seconds` for complex tasks
- **Monitor iterations** - Track `max_iterations` to prevent runaway loops
- **Cache data** - Reuse downloaded data lake across sessions
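To reuse the data lake, point every session at the same `path`. A quick way to see whether that directory is already populated (the full download is described as roughly 11GB) is to total its file sizes. The helper below is a standard-library sketch with a hypothetical name, not a Biomni utility.

```python
from pathlib import Path


def directory_size_gb(path):
    """Total size of regular files under `path`, in gigabytes (10**9 bytes)."""
    root = Path(path)
    if not root.is_dir():
        return 0.0
    total = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return total / 1e9


if __name__ == "__main__":
    # './data' matches the path used in the examples in this document.
    size = directory_size_gb("./data")
    print(f"./data currently holds {size:.2f} GB")
```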

Result Documentation

```python
# Always save conversation history for reproducibility
agent.save_conversation_history("results/project_name_YYYYMMDD.pdf")

# Include in reports:
# - Original task description
# - Generated analysis code
# - Results and interpretations
# - Data sources used
```
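The `project_name_YYYYMMDD` naming convention can be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper `report_path`; it only builds the path and does not create the `results/` directory.

```python
from datetime import date
from pathlib import Path


def report_path(project_name, results_dir="results", day=None):
    """Build <results_dir>/<project_name>_<YYYYMMDD>.pdf for the given (or current) date."""
    day = date.today() if day is None else day
    return Path(results_dir) / f"{project_name}_{day:%Y%m%d}.pdf"


path = report_path("autophagy_screen")
# With an initialized agent: agent.save_conversation_history(str(path))
```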

Resources

References

Detailed documentation is available in the `references/` directory:

- `api_reference.md` - Complete API documentation for A1 class, configuration, and evaluation
- `llm_providers.md` - LLM provider setup (Anthropic, OpenAI, Azure, Google, Groq, AWS)
- `use_cases.md` - Comprehensive task examples for all biomedical domains

Scripts

Helper scripts in the `scripts/` directory:

- `setup_environment.py` - Interactive environment and API key configuration
- `generate_report.py` - Enhanced PDF report generation with custom formatting

External Resources

- GitHub: https://github.com/snap-stanford/biomni
- Web Platform: https://biomni.stanford.edu
- Paper: https://www.biorxiv.org/content/10.1101/2025.05.30.656746v1
- Model: https://huggingface.co/biomni/Biomni-R0-32B-Preview
- Evaluation Dataset: https://huggingface.co/datasets/biomni/Eval1

Troubleshooting

Common Issues

**Data download fails**
```python
from biomni.agent import A1

# Manually trigger data lake download
agent = A1(path='./data', llm='your-llm')
# First .go() call will download data
```

**API key errors**
```bash
# Verify environment variables
echo $ANTHROPIC_API_KEY
# Or check .env file in working directory
```

**Timeout on complex tasks**
```python
from biomni.config import default_config
default_config.timeout_seconds = 3600  # 1 hour
```

**Memory issues with large datasets**
- Use streaming for large files
- Process data in chunks
- Increase system memory allocation
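The "stream" and "process in chunks" advice can be sketched in plain Python as follows. Both function names are hypothetical; the point is that only one chunk of lines is held in memory at a time, whatever the file size.

```python
from itertools import islice


def iter_chunks(items, chunk_size):
    """Lazily yield lists of up to `chunk_size` items from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_lines(path, chunk_size=100_000):
    """Count lines in a text file while holding only one chunk in memory."""
    total = 0
    with open(path, "r", encoding="utf-8") as handle:
        for chunk in iter_chunks(handle, chunk_size):
            total += len(chunk)
    return total
```

In a real analysis, the body of the loop in `count_lines` would be replaced with the per-chunk computation (filtering, aggregation, and so on).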

Getting Help

For issues or questions:

- GitHub Issues: https://github.com/snap-stanford/biomni/issues
- Documentation: Check `references/` files for detailed guidance
- Community: Stanford SNAP lab and biomni contributors