Biomni
Overview
Biomni is an open-source biomedical AI agent framework from Stanford's SNAP lab that autonomously executes complex research tasks across biomedical domains. Use this skill when working on multi-step biological reasoning tasks, analyzing biomedical data, or conducting research spanning genomics, drug discovery, molecular biology, and clinical analysis.
Core Capabilities
Biomni excels at:
- Multi-step biological reasoning - Autonomous task decomposition and planning for complex biomedical queries
- Code generation and execution - Dynamic analysis pipeline creation for data processing
- Knowledge retrieval - Access to ~11GB of integrated biomedical databases and literature
- Cross-domain problem solving - Unified interface for genomics, proteomics, drug discovery, and clinical tasks
When to Use This Skill
Use biomni for:
- CRISPR screening - Design screens, prioritize genes, analyze knockout effects
- Single-cell RNA-seq - Cell type annotation, differential expression, trajectory analysis
- Drug discovery - ADMET prediction, target identification, compound optimization
- GWAS analysis - Variant interpretation, causal gene identification, pathway enrichment
- Clinical genomics - Rare disease diagnosis, variant pathogenicity, phenotype-genotype mapping
- Lab protocols - Protocol optimization, literature synthesis, experimental design
Quick Start
Installation and Setup
Install Biomni and configure API keys for LLM providers:
```bash
uv pip install biomni --upgrade
```
Configure API keys (store them in a `.env` file or in environment variables):
```bash
export ANTHROPIC_API_KEY="your-key-here"
# Optional: OpenAI, Azure, Google, Groq, AWS Bedrock keys
```
Use `scripts/setup_environment.py` for interactive setup assistance.
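A quick pre-flight check can catch a missing key before the agent starts. The sketch below is a minimal illustration and not part of Biomni: `missing_keys` is a hypothetical helper, and `ANTHROPIC_API_KEY` is the only variable name taken from the setup step above.

```python
import os

# Only ANTHROPIC_API_KEY is named in the setup instructions; add other
# provider variables here once you have confirmed their names.
REQUIRED_KEYS = ["ANTHROPIC_API_KEY"]


def missing_keys(names, env=None):
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_keys(REQUIRED_KEYS, env={"ANTHROPIC_API_KEY": "your-key-here"}))  # prints []
```

Calling `missing_keys(REQUIRED_KEYS)` with no `env` argument checks the current process environment instead.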
Basic Usage Pattern
```python
from biomni.agent import A1

# Initialize agent with data path and LLM choice
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Execute biomedical task autonomously
agent.go("Your biomedical research question or task")

# Save conversation history and results
agent.save_conversation_history("report.pdf")
```
Working with Biomni
1. Agent Initialization
The A1 class is the primary interface for Biomni:
```python
from biomni.agent import A1
from biomni.config import default_config

# Basic initialization
agent = A1(
    path='./data',                  # Path to data lake (~11GB downloaded on first use)
    llm='claude-sonnet-4-20250514'  # LLM model selection
)

# Advanced configuration
default_config.llm = "gpt-4"
default_config.timeout_seconds = 1200
default_config.max_iterations = 50
```
**Supported LLM Providers:**
- Anthropic Claude (recommended): `claude-sonnet-4-20250514`, `claude-opus-4-20250514`
- OpenAI: `gpt-4`, `gpt-4-turbo`
- Azure OpenAI: via Azure configuration
- Google Gemini: `gemini-2.0-flash-exp`
- Groq: `llama-3.3-70b-versatile`
- AWS Bedrock: Various models via Bedrock API
See `references/llm_providers.md` for detailed LLM configuration instructions.
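When switching between providers, it can help to keep the model names in one place. The sketch below copies the model strings from the provider list above; the dictionary keys and the `pick_model` helper are illustrative assumptions, not part of the Biomni API.

```python
# Model names are taken from the provider list; the lookup keys are made up
# for this example.
DEFAULT_MODELS = {
    "anthropic": "claude-sonnet-4-20250514",
    "openai": "gpt-4",
    "google": "gemini-2.0-flash-exp",
    "groq": "llama-3.3-70b-versatile",
}


def pick_model(provider):
    """Return the default model string for a provider name (case-insensitive)."""
    key = provider.strip().lower()
    try:
        return DEFAULT_MODELS[key]
    except KeyError:
        known = ", ".join(sorted(DEFAULT_MODELS))
        raise ValueError(f"Unknown provider {provider!r}; expected one of: {known}") from None


print(pick_model("Anthropic"))  # prints claude-sonnet-4-20250514
```

The returned string can then be passed as the `llm` argument, for example `A1(path='./data', llm=pick_model("anthropic"))`.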
2. Task Execution Workflow
Biomni follows an autonomous agent workflow:
```python
from biomni.agent import A1

# Step 1: Initialize agent
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Step 2: Execute task with natural language query
result = agent.go("""
Design a CRISPR screen to identify genes regulating autophagy in
HEK293 cells. Prioritize genes based on essentiality and pathway
relevance.
""")

# Step 3: Review generated code and analysis
# Agent autonomously:
# - Decomposes task into sub-steps
# - Retrieves relevant biological knowledge
# - Generates and executes analysis code
# - Interprets results and provides insights

# Step 4: Save results
agent.save_conversation_history("autophagy_screen_report.pdf")
```
3. Common Task Patterns
CRISPR Screening Design
```python
agent.go("""
Design a genome-wide CRISPR knockout screen for identifying genes
affecting [phenotype] in [cell type]. Include:
1. sgRNA library design
2. Gene prioritization criteria
3. Expected hit genes based on pathway analysis
""")
```
Single-Cell RNA-seq Analysis
```python
agent.go("""
Analyze this single-cell RNA-seq dataset:
- Perform quality control and filtering
- Identify cell populations via clustering
- Annotate cell types using marker genes
- Conduct differential expression between conditions
File path: [path/to/data.h5ad]
""")
```
Drug ADMET Prediction
```python
agent.go("""
Predict ADMET properties for these drug candidates:
[SMILES strings or compound IDs]
Focus on:
- Absorption (Caco-2 permeability, HIA)
- Distribution (plasma protein binding, BBB penetration)
- Metabolism (CYP450 interaction)
- Excretion (clearance)
- Toxicity (hERG liability, hepatotoxicity)
""")
```
GWAS Variant Interpretation
```python
agent.go("""
Interpret GWAS results for [trait/disease]:
- Identify genome-wide significant variants
- Map variants to causal genes
- Perform pathway enrichment analysis
- Predict functional consequences
Summary statistics file: [path/to/gwas_summary.txt]
""")
```
See `references/use_cases.md` for comprehensive task examples across all biomedical domains.
4. Data Integration
Biomni integrates ~11GB of biomedical knowledge sources:
- Gene databases - Ensembl, NCBI Gene, UniProt
- Protein structures - PDB, AlphaFold
- Clinical datasets - ClinVar, OMIM, HPO
- Literature indices - PubMed abstracts, biomedical ontologies
- Pathway databases - KEGG, Reactome, GO
Data is automatically downloaded to the specified `path` on first use.
5. MCP Server Integration
Extend biomni with external tools via Model Context Protocol:
```python
# MCP servers can provide:
# - FDA drug databases
# - Web search for literature
# - Custom biomedical APIs
# - Laboratory equipment interfaces

# Configure MCP servers in .biomni/mcp_config.json
```
6. Evaluation Framework
Benchmark agent performance on biomedical tasks:
```python
from biomni.eval import BiomniEval1

evaluator = BiomniEval1()

# Evaluate on specific task types
# (agent_output is a placeholder for the answer the agent produced)
score = evaluator.evaluate(
    task_type='crispr_design',
    instance_id='test_001',
    answer=agent_output
)

# Access evaluation dataset
dataset = evaluator.load_dataset()
```
Best Practices
Task Formulation
- Be specific - Include biological context, organism, cell type, conditions
- Specify outputs - Clearly state desired analysis outputs and formats
- Provide data paths - Include file paths for datasets to analyze
- Set constraints - Mention time/computational limits if relevant
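The points above can be folded into a small prompt template. The sketch below is one possible way to do it; `build_task_prompt` and its parameters are hypothetical names for illustration, and the resulting string is simply what would be passed to `agent.go(...)`.

```python
def build_task_prompt(goal, context=None, outputs=None, data_paths=None, constraints=None):
    """Assemble a task description from a goal plus optional details."""
    lines = [goal.strip()]
    if context:
        # Biological context such as organism, cell type, or conditions.
        lines.append("Context: " + "; ".join(f"{k}: {v}" for k, v in context.items()))
    if outputs:
        lines.append("Desired outputs:")
        lines.extend(f"- {item}" for item in outputs)
    if data_paths:
        lines.append("Data files:")
        lines.extend(f"- {path}" for path in data_paths)
    if constraints:
        lines.append("Constraints: " + "; ".join(constraints))
    return "\n".join(lines)


prompt = build_task_prompt(
    goal="Identify differentially expressed genes between treated and control cells.",
    context={"organism": "human", "cell type": "HEK293"},
    outputs=["Ranked gene table (CSV)", "Volcano plot (PNG)"],
    data_paths=["path/to/data.h5ad"],
    constraints=["Finish within 30 minutes"],
)
print(prompt)
```

Keeping the pieces separate makes it easy to reuse the same context and constraints across several related questions.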
Security Considerations
⚠️ Important: Biomni executes LLM-generated code with full system privileges. For production use:
- Run in isolated environments (Docker, VMs)
- Avoid exposing sensitive credentials
- Review generated code before execution in sensitive contexts
- Use sandboxed execution environments when possible
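As one way to act on the isolation advice, the agent can be launched inside a container that sees only the data directory and the single credential it needs. The sketch below only builds the `docker run` argument list. The image name, the script path inside the image, and the resource limits are all hypothetical placeholders, and it assumes an image with Biomni installed already exists.

```python
import shlex


def docker_run_command(image, data_dir, script,
                       env_vars=("ANTHROPIC_API_KEY",), memory="8g", cpus="4"):
    """Build a docker run argument list with a mounted data dir and resource caps."""
    cmd = [
        "docker", "run", "--rm",
        "--memory", memory,          # cap container memory
        "--cpus", cpus,              # cap CPU usage
        "-v", f"{data_dir}:/data",   # expose only the data directory
    ]
    for name in env_vars:
        # "-e NAME" forwards the host value without writing it on the command line.
        cmd += ["-e", name]
    cmd += [image, "python", script]
    return cmd


# Hypothetical image and script names, for illustration only.
cmd = docker_run_command("biomni-sandbox:latest", "/srv/biomni/data", "/app/run_task.py")
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)`; passing variables by name keeps secret values out of shell history and process listings.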
Performance Optimization
- Choose appropriate LLMs - Claude Sonnet 4 recommended for balance of speed/quality
- Set reasonable timeouts - Adjust `default_config.timeout_seconds` for complex tasks
- Monitor iterations - Track `max_iterations` to prevent runaway loops
- Cache data - Reuse downloaded data lake across sessions
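To reuse the data lake across sessions, every session has to point at the same directory. The sketch below is a rough check under a simplifying assumption: a non-empty directory is treated as a sign that a previous download happened, and nothing is verified about its completeness or layout. `data_lake_is_populated` is a hypothetical helper, not a Biomni function.

```python
from pathlib import Path


def data_lake_is_populated(path):
    """Return True if `path` is an existing directory with at least one entry."""
    directory = Path(path).expanduser()
    return directory.is_dir() and any(directory.iterdir())


if data_lake_is_populated("./data"):
    print("Existing data directory found; reuse this path when creating the agent.")
else:
    print("No data found at ./data; expect a download on first use.")
```

Using one absolute path for `path` in every project avoids downloading the roughly 11GB data lake more than once.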
Result Documentation
```python
# Always save conversation history for reproducibility
agent.save_conversation_history("results/project_name_YYYYMMDD.pdf")

# Include in reports:
# - Original task description
# - Generated analysis code
# - Results and interpretations
# - Data sources used
```
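The `project_name_YYYYMMDD.pdf` naming pattern can be generated rather than typed by hand. The sketch below is a small illustration using only the standard library; `report_path` is a hypothetical helper name.

```python
from datetime import date
from pathlib import Path


def report_path(project_name, results_dir="results", day=None):
    """Return results_dir/project_name_YYYYMMDD.pdf for the given (or current) date."""
    day = date.today() if day is None else day
    return Path(results_dir) / f"{project_name}_{day:%Y%m%d}.pdf"


print(report_path("autophagy_screen", day=date(2025, 1, 31)).as_posix())
# prints results/autophagy_screen_20250131.pdf
```

The result can be passed on as `agent.save_conversation_history(str(report_path("project_name")))`, provided the `results` directory already exists.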
Resources
References
Detailed documentation is available in the `references/` directory:
- `api_reference.md` - Complete API documentation for A1 class, configuration, and evaluation
- `llm_providers.md` - LLM provider setup (Anthropic, OpenAI, Azure, Google, Groq, AWS)
- `use_cases.md` - Comprehensive task examples for all biomedical domains
Scripts
Helper scripts in the `scripts/` directory:
- `setup_environment.py` - Interactive environment and API key configuration
- `generate_report.py` - Enhanced PDF report generation with custom formatting
External Resources
- GitHub: https://github.com/snap-stanford/biomni
- Web Platform: https://biomni.stanford.edu
- Paper: https://www.biorxiv.org/content/10.1101/2025.05.30.656746v1
- Model: https://huggingface.co/biomni/Biomni-R0-32B-Preview
- Evaluation Dataset: https://huggingface.co/datasets/biomni/Eval1
Troubleshooting
Common Issues
**Data download fails**
```python
from biomni.agent import A1

# Manually trigger data lake download
agent = A1(path='./data', llm='your-llm')
# First .go() call will download data
```
**API key errors**
```bash
# Verify environment variables
echo $ANTHROPIC_API_KEY
# Or check .env file in working directory
```
**Timeout on complex tasks**
```python
from biomni.config import default_config
default_config.timeout_seconds = 3600  # 1 hour
```
**Memory issues with large datasets**
- Use streaming for large files
- Process data in chunks
- Increase system memory allocation
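Chunked processing means reading a bounded number of records at a time instead of loading the whole file. The sketch below shows the pattern with the standard library on a tiny in-memory CSV. The column names and the per-chunk work (a row count) are placeholders, and `iter_chunks` is a hypothetical helper.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_rows_in_chunks(handle, chunk_size=10_000):
    """Stream a CSV file handle and count its data rows chunk by chunk."""
    reader = csv.DictReader(handle)
    total = 0
    for chunk in iter_chunks(reader, chunk_size):
        total += len(chunk)  # replace with the real per-chunk analysis
    return total


sample = io.StringIO("gene,score\nTP53,0.9\nATG5,0.7\nBECN1,0.4\n")
print(count_rows_in_chunks(sample, chunk_size=2))  # prints 3
```

Only one chunk is held in memory at a time, so peak memory depends on `chunk_size` rather than on the file size.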
Getting Help
For issues or questions:
- GitHub Issues: https://github.com/snap-stanford/biomni/issues
- Documentation: Check `references/` files for detailed guidance
- Community: Stanford SNAP lab and Biomni contributors