biomni

Biomni

Overview

Biomni is an open-source biomedical AI agent framework from Stanford's SNAP lab that autonomously executes complex research tasks across biomedical domains. Use this skill when working on multi-step biological reasoning tasks, analyzing biomedical data, or conducting research spanning genomics, drug discovery, molecular biology, and clinical analysis.

Core Capabilities

Biomni excels at:
  1. Multi-step biological reasoning - Autonomous task decomposition and planning for complex biomedical queries
  2. Code generation and execution - Dynamic analysis pipeline creation for data processing
  3. Knowledge retrieval - Access to ~11GB of integrated biomedical databases and literature
  4. Cross-domain problem solving - Unified interface for genomics, proteomics, drug discovery, and clinical tasks

When to Use This Skill

Use biomni for:
  • CRISPR screening - Design screens, prioritize genes, analyze knockout effects
  • Single-cell RNA-seq - Cell type annotation, differential expression, trajectory analysis
  • Drug discovery - ADMET prediction, target identification, compound optimization
  • GWAS analysis - Variant interpretation, causal gene identification, pathway enrichment
  • Clinical genomics - Rare disease diagnosis, variant pathogenicity, phenotype-genotype mapping
  • Lab protocols - Protocol optimization, literature synthesis, experimental design

Quick Start

Installation and Setup

Install Biomni and configure API keys for LLM providers:

```bash
uv pip install biomni --upgrade
```

Configure API keys (store in `.env` file or environment variables):

```bash
export ANTHROPIC_API_KEY="your-key-here"
# Optional: OpenAI, Azure, Google, Groq, AWS Bedrock keys
```

Use `scripts/setup_environment.py` for interactive setup assistance.
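
Before starting an agent, it can help to confirm that at least one provider key is visible to the Python process. The following is a minimal standard-library sketch, not part of Biomni: `ANTHROPIC_API_KEY` comes from the setup step above, while `OPENAI_API_KEY` and the helper name `find_configured_keys` are assumptions made for illustration. It only inspects environment variables and does not read a `.env` file.

```python
import os

# ANTHROPIC_API_KEY is the variable used in the setup step above.
# OPENAI_API_KEY is an assumed name for an optional provider; adjust as needed.
CANDIDATE_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def find_configured_keys(env=None, candidates=CANDIDATE_KEYS):
    """Return the candidate variable names that hold a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in candidates if env.get(name, "").strip()]


if __name__ == "__main__":
    configured = find_configured_keys()
    if configured:
        print("Configured provider keys:", ", ".join(configured))
    else:
        print("No provider key found; export one or add it to your .env file.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.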

Basic Usage Pattern

```python
from biomni.agent import A1

# Initialize agent with data path and LLM choice
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Execute biomedical task autonomously
agent.go("Your biomedical research question or task")

# Save conversation history and results
agent.save_conversation_history("report.pdf")
```

Working with Biomni

1. Agent Initialization

The A1 class is the primary interface for biomni:

```python
from biomni.agent import A1
from biomni.config import default_config

# Basic initialization
agent = A1(
    path='./data',  # Path to data lake (~11GB downloaded on first use)
    llm='claude-sonnet-4-20250514'  # LLM model selection
)

# Advanced configuration
default_config.llm = "gpt-4"
default_config.timeout_seconds = 1200
default_config.max_iterations = 50
```

**Supported LLM Providers:**
- Anthropic Claude (recommended): `claude-sonnet-4-20250514`, `claude-opus-4-20250514`
- OpenAI: `gpt-4`, `gpt-4-turbo`
- Azure OpenAI: via Azure configuration
- Google Gemini: `gemini-2.0-flash-exp`
- Groq: `llama-3.3-70b-versatile`
- AWS Bedrock: Various models via Bedrock API

See `references/llm_providers.md` for detailed LLM configuration instructions.

2. Task Execution Workflow

Biomni follows an autonomous agent workflow:

```python
from biomni.agent import A1

# Step 1: Initialize agent
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Step 2: Execute task with natural language query
result = agent.go("""
Design a CRISPR screen to identify genes regulating autophagy in HEK293 cells.
Prioritize genes based on essentiality and pathway relevance.
""")

# Step 3: Review generated code and analysis
# Agent autonomously:
# - Decomposes task into sub-steps
# - Retrieves relevant biological knowledge
# - Generates and executes analysis code
# - Interprets results and provides insights

# Step 4: Save results
agent.save_conversation_history("autophagy_screen_report.pdf")
```

3. Common Task Patterns

CRISPR Screening Design

```python
agent.go("""
Design a genome-wide CRISPR knockout screen for identifying genes
affecting [phenotype] in [cell type]. Include:
1. sgRNA library design
2. Gene prioritization criteria
3. Expected hit genes based on pathway analysis
""")
```

Single-Cell RNA-seq Analysis

```python
agent.go("""
Analyze this single-cell RNA-seq dataset:
- Perform quality control and filtering
- Identify cell populations via clustering
- Annotate cell types using marker genes
- Conduct differential expression between conditions
File path: [path/to/data.h5ad]
""")
```

Drug ADMET Prediction

```python
agent.go("""
Predict ADMET properties for these drug candidates:
[SMILES strings or compound IDs]
Focus on:
- Absorption (Caco-2 permeability, HIA)
- Distribution (plasma protein binding, BBB penetration)
- Metabolism (CYP450 interaction)
- Excretion (clearance)
- Toxicity (hERG liability, hepatotoxicity)
""")
```

GWAS Variant Interpretation

```python
agent.go("""
Interpret GWAS results for [trait/disease]:
- Identify genome-wide significant variants
- Map variants to causal genes
- Perform pathway enrichment analysis
- Predict functional consequences
Summary statistics file: [path/to/gwas_summary.txt]
""")
```

See `references/use_cases.md` for comprehensive task examples across all biomedical domains.
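
For the GWAS pattern, a quick sanity check of the summary statistics file before handing it to the agent can catch format problems early. The sketch below is independent of Biomni and uses only the standard library. It assumes a tab-delimited file with a p-value column named `P`; the column names, the function name `significant_variants`, and the sample identifiers are illustrative assumptions. The cutoff of 5e-8 is the conventional genome-wide significance threshold.

```python
import csv
import io

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold


def significant_variants(handle, p_column="P", delimiter="\t", threshold=GENOME_WIDE_P):
    """Yield rows whose p-value is below the threshold, skipping unparsable rows."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        try:
            p_value = float(row[p_column])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, short row, or non-numeric value such as "NA"
        if p_value < threshold:
            yield row


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real file; SNP and P are assumed column names.
    sample = io.StringIO("SNP\tP\nrs1\t1e-9\nrs2\t0.03\nrs3\tNA\n")
    for hit in significant_variants(sample):
        print(hit["SNP"], hit["P"])
```

With a real file, pass an open file handle instead of the `io.StringIO` object.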

4. Data Integration

Biomni integrates ~11GB of biomedical knowledge sources:
  • Gene databases - Ensembl, NCBI Gene, UniProt
  • Protein structures - PDB, AlphaFold
  • Clinical datasets - ClinVar, OMIM, HPO
  • Literature indices - PubMed abstracts, biomedical ontologies
  • Pathway databases - KEGG, Reactome, GO
Data is automatically downloaded to the specified `path` on first use.
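
Because the download is large, it may be useful to check how much data is already present under the chosen path, for example to confirm that a previous download completed. This is a minimal sketch using only `pathlib`; the helper name `directory_size_gb` is hypothetical and the code does not call Biomni.

```python
from pathlib import Path


def directory_size_gb(path):
    """Return the total size of regular files under path, in gigabytes (10**9 bytes)."""
    root = Path(path)
    if not root.is_dir():
        return 0.0
    total_bytes = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return total_bytes / 1e9


if __name__ == "__main__":
    size = directory_size_gb("./data")
    print(f"./data currently holds {size:.2f} GB")
```

A value far below the expected ~11GB suggests the data lake has not been fully downloaded yet.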

5. MCP Server Integration

Extend biomni with external tools via Model Context Protocol. MCP servers can provide:
- FDA drug databases
- Web search for literature
- Custom biomedical APIs
- Laboratory equipment interfaces

Configure MCP servers in `.biomni/mcp_config.json`.

6. Evaluation Framework

Benchmark agent performance on biomedical tasks:

```python
from biomni.eval import BiomniEval1

evaluator = BiomniEval1()

# Evaluate on specific task types
score = evaluator.evaluate(
    task_type='crispr_design',
    instance_id='test_001',
    answer=agent_output
)

# Access evaluation dataset
dataset = evaluator.load_dataset()
```

Best Practices

Task Formulation

  • Be specific - Include biological context, organism, cell type, conditions
  • Specify outputs - Clearly state desired analysis outputs and formats
  • Provide data paths - Include file paths for datasets to analyze
  • Set constraints - Mention time/computational limits if relevant
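
The checklist above can be turned into a small prompt-assembly helper so that every task states its context, outputs, data paths, and constraints in the same way. This is a sketch in plain Python; `build_task_prompt` and its parameters are hypothetical names introduced here, and the example values are made up.

```python
def build_task_prompt(goal, context=None, outputs=None, data_paths=None, constraints=None):
    """Assemble a plain-text task description from the four checklist items."""
    lines = [goal.strip()]
    if context:
        lines.append(f"Biological context: {context}")
    if outputs:
        lines.append("Desired outputs:")
        lines.extend(f"- {item}" for item in outputs)
    if data_paths:
        lines.append("Data files:")
        lines.extend(f"- {path}" for path in data_paths)
    if constraints:
        lines.append(f"Constraints: {constraints}")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_task_prompt(
        goal="Identify differentially expressed genes between treated and control cells.",
        context="human, HEK293 cells, 24 h treatment",
        outputs=["ranked gene table as CSV", "volcano plot as PNG"],
        data_paths=["path/to/data.h5ad"],
        constraints="finish within 30 minutes",
    )
    print(prompt)
    # The resulting string could then be passed to agent.go(prompt).
```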

Security Considerations

⚠️ Important: Biomni executes LLM-generated code with full system privileges. For production use:
  • Run in isolated environments (Docker, VMs)
  • Avoid exposing sensitive credentials
  • Review generated code before execution in sensitive contexts
  • Use sandboxed execution environments when possible
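
One way to limit credential exposure is to start the agent process, or any container it runs in, with a reduced environment that keeps only the key the agent needs. The sketch below is a heuristic, not a Biomni feature: `SECRET_PATTERN` and `scrubbed_environment` are hypothetical names, and matching on words such as `KEY` or `TOKEN` is an assumed naming convention that can both miss secrets and drop harmless variables.

```python
import os
import re

# Assumed naming convention: variables containing these words hold secrets.
SECRET_PATTERN = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


def scrubbed_environment(env=None, keep=()):
    """Return a copy of env without secret-looking variables, except those in keep."""
    env = os.environ if env is None else env
    return {
        name: value
        for name, value in env.items()
        if name in keep or not SECRET_PATTERN.search(name)
    }


if __name__ == "__main__":
    safe_env = scrubbed_environment(keep=("ANTHROPIC_API_KEY",))
    removed = sorted(set(os.environ) - set(safe_env))
    print(f"Removed {len(removed)} secret-looking variables")
    # safe_env could be passed as env=... to subprocess.run when launching the agent script.
```

This complements, rather than replaces, running in Docker or a VM.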

Performance Optimization

  • Choose appropriate LLMs - Claude Sonnet 4 recommended for balance of speed/quality
  • Set reasonable timeouts - Adjust `default_config.timeout_seconds` for complex tasks
  • Monitor iterations - Track `max_iterations` to prevent runaway loops
  • Cache data - Reuse downloaded data lake across sessions

Result Documentation

```python
# Always save conversation history for reproducibility
agent.save_conversation_history("results/project_name_YYYYMMDD.pdf")

# Include in reports:
# - Original task description
# - Generated analysis code
# - Results and interpretations
# - Data sources used
```
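
The `project_name_YYYYMMDD.pdf` naming pattern can be generated rather than typed by hand. A minimal sketch with the standard library follows; `report_path` is a hypothetical helper, and the project name in the demo is only an example.

```python
from datetime import date
from pathlib import Path


def report_path(project_name, results_dir="results", on=None):
    """Return results_dir/<project_name>_<YYYYMMDD>.pdf for the given (or current) date."""
    on = date.today() if on is None else on
    return Path(results_dir) / f"{project_name}_{on:%Y%m%d}.pdf"


if __name__ == "__main__":
    path = report_path("autophagy_screen")
    print(path)
    # Create path.parent if needed, then pass str(path) to
    # agent.save_conversation_history(...).
```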

Resources

References

Detailed documentation available in the `references/` directory:
  • `api_reference.md` - Complete API documentation for A1 class, configuration, and evaluation
  • `llm_providers.md` - LLM provider setup (Anthropic, OpenAI, Azure, Google, Groq, AWS)
  • `use_cases.md` - Comprehensive task examples for all biomedical domains

Scripts

Helper scripts in the `scripts/` directory:
  • `setup_environment.py` - Interactive environment and API key configuration
  • `generate_report.py` - Enhanced PDF report generation with custom formatting

External Resources

Troubleshooting

Common Issues

**Data download fails**
```python
from biomni.agent import A1

# Manually trigger data lake download
agent = A1(path='./data', llm='your-llm')
# First .go() call will download data
```

**API key errors**
```bash
# Verify environment variables
echo $ANTHROPIC_API_KEY
# Or check .env file in working directory
```

**Timeout on complex tasks**
```python
from biomni.config import default_config
default_config.timeout_seconds = 3600  # 1 hour
```

**Memory issues with large datasets**
  • Use streaming for large files
  • Process data in chunks
  • Increase system memory allocation
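
The "process data in chunks" advice can be sketched with a small generator that pulls a fixed number of rows at a time from any iterable, so that only one chunk is held in memory. This is generic Python rather than a Biomni API; `iter_chunks` is a hypothetical helper and the sample data is invented.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # In-memory stand-in for a large file; with real data, pass an open file handle.
    handle = io.StringIO("gene,count\nA,1\nB,2\nC,3\nD,4\nE,5\n")
    reader = csv.DictReader(handle)
    for chunk in iter_chunks(reader, chunk_size=2):
        total = sum(int(row["count"]) for row in chunk)
        print(f"{len(chunk)} rows, count sum {total}")
```

Because `csv.DictReader` reads lazily, combining it with `iter_chunks` streams the file instead of loading it whole.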

Getting Help

For issues or questions: