tooluniverse-systems-biology

Systems Biology & Pathway Analysis

Comprehensive pathway and systems biology analysis that integrates multiple curated databases to provide a multi-dimensional view of biological systems, pathway enrichment, and protein-pathway relationships.

When to Use This Skill

Triggers:

"Analyze pathways for this gene list"
"What pathways is [protein] involved in?"
"Find pathways related to [keyword/process]"
"Perform pathway enrichment analysis"
"Map proteins to biological pathways"
"Find computational models for [process]"
"Systems biology analysis of [genes/proteins]"

Use Cases:

Gene Set Analysis: Identify enriched pathways from RNA-seq, proteomics, or screen results
Protein Function: Discover pathways and processes a protein participates in
Pathway Discovery: Find pathways related to diseases, processes, or phenotypes
Systems Integration: Connect genes → pathways → processes → diseases
Model Discovery: Find computational systems biology models (SBML)
Cross-Database Validation: Compare pathway annotations across multiple sources

Core Databases Integrated

| Database | Coverage | Strengths |
| --- | --- | --- |
| Reactome | Human-curated reactions & pathways | Detailed mechanistic pathways with reactions |
| KEGG | Reference pathways across organisms | Metabolic maps, disease pathways, drug targets |
| WikiPathways | Community-curated pathways | Emerging processes, collaborative updates |
| Pathway Commons | Integrated meta-database | Aggregates multiple sources (Reactome, KEGG, etc.) |
| BioModels | Computational SBML models | Mathematical/dynamic systems biology models |
| Enrichr | Statistical enrichment | Pathway over-representation analysis |

Workflow Overview

Input → Phase 1: Enrichment → Phase 2: Protein Mapping → Phase 3: Keyword Search → Phase 4: Top Pathways → Report

Phase 1: Pathway Enrichment Analysis

When: Gene list provided (from experiments, screens, differentially expressed genes)

Objective: Identify biological pathways that are statistically over-represented in the gene list

Tools Used

enrichr_gene_enrichment_analysis:

Input:
- `gene_list`: Array of gene symbols (e.g., ["TP53", "BRCA1", "EGFR"])
- `library`: Pathway database (e.g., "KEGG_2021_Human", "Reactome_2022")
Output: Array of enriched pathways with p-values, adjusted p-values, genes
Use: Statistical over-representation analysis

Workflow

Submit gene list to Enrichr
Query KEGG pathway library for human
Get enriched pathways sorted by significance
Extract:
- Pathway names and IDs
- P-values (raw and adjusted)
- Genes from input list in each pathway
- Enrichment scores

Decision Logic

Significance threshold: Adjusted p-value < 0.05 (default)
Minimum genes: At least 2 genes from input list in pathway
Report top pathways: Show 10-20 most significant
Empty results: If no enrichment → note "no significant pathways" (don't fail)
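
The decision rules above can be sketched as a small filter. This is only a minimal sketch: the record keys `pathway`, `adjusted_p_value`, and `genes` are hypothetical, and the actual field names returned by `enrichr_gene_enrichment_analysis` may differ.

```python
def select_significant_pathways(results, alpha=0.05, min_genes=2, top_n=20):
    """Apply the Phase 1 decision rules to a list of enrichment records.

    Each record is assumed (hypothetically) to look like:
    {"pathway": str, "adjusted_p_value": float, "genes": list[str]}
    """
    kept = [
        record for record in results
        if record["adjusted_p_value"] < alpha and len(record["genes"]) >= min_genes
    ]
    # Most significant first, then keep only the top entries for the report.
    kept.sort(key=lambda record: record["adjusted_p_value"])
    return kept[:top_n]


def summarize(results):
    """Return report text; an empty selection is a note, not an error."""
    selected = select_significant_pathways(results)
    if not selected:
        return "no significant pathways"
    return "\n".join(
        f'{record["pathway"]} (adjusted p = {record["adjusted_p_value"]:.2e})'
        for record in selected
    )
```

Keeping the thresholds as keyword arguments lets a caller relax `alpha` or `min_genes` for small gene lists without touching the filter itself.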

Phase 2: Protein-Pathway Mapping

When: Protein UniProt ID provided

Objective: Map protein to all known pathways it participates in

Tools Used

Reactome_map_uniprot_to_pathways:

Input:
- `id`: UniProt accession (e.g., "P53350")
Output: Array of Reactome pathways containing this protein
Note: Parameter is `id` (not `uniprot_id`)

Reactome_get_pathway_reactions:

Input:
- `stId`: Reactome pathway stable ID (e.g., "R-HSA-73817")
Output: Array of reactions and subpathways
Use: Get mechanistic details of pathways

Workflow

Map UniProt ID to Reactome pathways
Get all pathways this protein appears in
For top pathway (or user-specified):
- Retrieve detailed reactions and subpathways
- Extract event names, types (Reaction vs Pathway)
- Note disease associations if present

Decision Logic

Multiple pathways: Report all pathways, prioritize by hierarchical level
Top pathway details: Get detailed reactions for 1-3 most relevant
Versioned IDs: Reactome expects unversioned stable IDs; strip the version suffix if present
Empty results: Check if protein ID valid; suggest alternative databases if Reactome empty
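
The version-stripping rule can be sketched as below. This assumes a versioned stable ID carries a trailing `.<digits>` suffix (for example `R-HSA-73817.2`); that suffix format and the helper name `strip_stable_id_version` are assumptions for illustration, not something this document specifies.

```python
import re

# Assumption: the version is a trailing ".<digits>" suffix on the stable ID.
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_stable_id_version(st_id: str) -> str:
    """Return the stable ID without a trailing version suffix, if one is present."""
    return _VERSION_SUFFIX.sub("", st_id.strip())
```

An ID without a suffix passes through unchanged, so the helper can be applied unconditionally before calling `Reactome_get_pathway_reactions`.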

Phase 3: Keyword-Based Pathway Search

When: User provides keyword or biological process name

Objective: Search multiple pathway databases to find relevant pathways

Tools Used

KEGG Search

kegg_search_pathway:

Input: `keyword` (e.g., "diabetes", "apoptosis")
Output: Array of pathway IDs and descriptions
Coverage: Reference pathways, metabolism, diseases

kegg_get_pathway_info:

Input: `pathway_id` (e.g., "hsa04930")
Output: Pathway details, genes, compounds
Use: Get detailed information for specific pathway

WikiPathways Search

WikiPathways_search:

Input:
- `query`: Keyword or gene symbol
- `organism`: Species filter (e.g., "Homo sapiens")
Output: Array of pathway matches with IDs, names, URLs
Coverage: Community-curated, includes emerging pathways

Pathway Commons Search

pc_search_pathways:

Input:
- `action`: "search_pathways"
- `keyword`: Search term
- `datasource`: Optional filter (e.g., "reactome", "kegg")
- `limit`: Max results (default: 10)
Output: Total hits and array of pathways with source attribution
Coverage: Meta-database aggregating multiple sources

BioModels Search

biomodels_search:

Input:
- `query`: Keyword for computational models
- `limit`: Max results
Output: Array of SBML models with IDs, names, publications
Coverage: Mathematical/computational systems biology models

Workflow

Search KEGG pathways by keyword
Search WikiPathways with organism filter
Search Pathway Commons (aggregates multiple sources)
Search BioModels for computational models
Compile results from all sources
Note overlaps and source-specific pathways

Decision Logic

Parallel queries: Search all databases simultaneously (independent)
Empty from one source: Continue with other sources (common for specialized keywords)
Result consolidation: Group by pathway concept, note which databases contain each
Model availability: BioModels may be empty for many processes - this is normal
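
One way to realize the parallel, failure-tolerant search is sketched below. The search callables are supplied by the caller (for instance, thin wrappers around `kegg_search_pathway` or `WikiPathways_search`); the hit shape `{"id": ..., "name": ...}` is hypothetical, and grouping by lower-cased name is only a crude stand-in for "pathway concept".

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(keyword, searchers):
    """Run every searcher concurrently; a failing or empty source never blocks the rest.

    `searchers` maps a source name to a callable that takes the keyword and
    returns a list of {"id": ..., "name": ...} dicts (hypothetical shape).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=len(searchers) or 1) as pool:
        futures = {name: pool.submit(fn, keyword) for name, fn in searchers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result() or []
            except Exception:
                # An unavailable source is recorded as empty so the others still report.
                results[name] = []
    return results


def group_by_concept(results):
    """Group hits by lower-cased pathway name and list the sources containing each."""
    grouped = {}
    for source, hits in results.items():
        for hit in hits:
            key = hit["name"].strip().lower()
            grouped.setdefault(key, set()).add(source)
    return {name: sorted(sources) for name, sources in grouped.items()}
```

Threads suit this case because each query is an independent, I/O-bound request; sources that return nothing simply appear with an empty list in the result.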

Phase 4: Top-Level Pathway Catalog

When: Always included to provide context

Objective: Show major biological systems/pathways for organism

Tools Used

Reactome_list_top_pathways:

Input: `species` (e.g., "Homo sapiens")
Output: Array of top-level pathway categories
Use: Provides hierarchical pathway organization

Workflow

Retrieve top-level pathways for specified organism
Display pathway categories (metabolism, signaling, disease, etc.)
Serve as reference for pathway hierarchy

Decision Logic

Always show: Provides context even if other phases empty
Organism-specific: Filter by species of interest
Hierarchical view: These are parent pathways with many subpathways

Output Structure

Report Format

Progressive Markdown Report:

Create report file first
Add sections progressively
Each section self-contained (handles empty gracefully)

Required Sections:

Header: Analysis parameters (genes, protein, keyword, organism)
Phase 1 Results: Pathway enrichment (if gene list)
Phase 2 Results: Protein-pathway mapping (if protein ID)
Phase 3 Results: Keyword search across databases (if keyword)
Phase 4 Results: Top-level pathway catalog (always)

Per-Database Subsections:

Database name and result count
Table of pathways with key metadata
Note if database returns no results
Links or IDs for follow-up
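
The progressive report described above might be written along these lines. This is a sketch, not the skill's actual implementation: the function names, the report title, and the "No results returned." wording are all illustrative choices.

```python
from pathlib import Path


def start_report(path, params):
    """Create the report file first, with a header listing the analysis parameters."""
    lines = ["# Systems Biology & Pathway Analysis Report", ""]
    lines += [f"- {key}: {value}" for key, value in params.items()]
    Path(path).write_text("\n".join(lines) + "\n\n", encoding="utf-8")


def append_section(path, title, headers, rows):
    """Append one self-contained section; an empty result still produces a section."""
    out = [f"## {title}", "", f"Results: {len(rows)}", ""]
    if rows:
        out.append("| " + " | ".join(headers) + " |")
        out.append("| " + " | ".join("---" for _ in headers) + " |")
        out += ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
    else:
        out.append("No results returned.")
    # Append mode keeps earlier sections intact if a later phase fails.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write("\n".join(out) + "\n\n")
```

Because each section is appended as soon as its phase finishes, a failure in a later phase still leaves a readable partial report on disk.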

Data Tables

Enrichment Results:

| Pathway | P-value | Adjusted P-value | Genes |
| --- | --- | --- | --- |
| ... | ... | ... | ... |

Protein Pathways:

| Pathway Name | Pathway ID | Species |
| --- | --- | --- |
| ... | ... | ... |

Keyword Search:

| Pathway/Model ID | Name | Source/Database |
| --- | --- | --- |
| ... | ... | ... |

Tool Parameter Reference

Critical Parameter Notes (from testing):

| Tool | Parameter | CORRECT Name | Common Mistake |
| --- | --- | --- | --- |
| Reactome_map_uniprot_to_pathways | `id` | ✅ `id` | ❌ `uniprot_id` |
| kegg_search_pathway | `keyword` | ✅ `keyword` | - |
| WikiPathways_search | `query` | ✅ `query` | - |
| pc_search_pathways | `action` + `keyword` | ✅ Both required | ❌ `action` optional |
| enrichr_gene_enrichment_analysis | `gene_list` | ✅ `gene_list` | - |

Response Format Notes:

Reactome: Returns list directly (not wrapped in `{status, data}`)
Pathway Commons: Returns dict directly with `total_hits` and `pathways`
Others: Standard `{status: "success", data: [...]}` format
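
The three response shapes noted above can be folded into one list with a small normalizer. This is a sketch under the shapes described in these notes; the function name `normalize_response` is hypothetical, and treating anything unrecognized as empty is a design choice rather than documented behavior.

```python
def normalize_response(response):
    """Return a plain list of records regardless of which response shape a tool used."""
    if isinstance(response, list):
        # Reactome tools: the list is returned directly.
        return response
    if isinstance(response, dict):
        if "pathways" in response:
            # Pathway Commons: dict with `total_hits` and `pathways`.
            return response["pathways"] or []
        if response.get("status") == "success":
            # Other tools: {status: "success", data: [...]}.
            return response.get("data") or []
    # Errors, None, and unexpected shapes are treated as "no results".
    return []
```

Normalizing at the boundary means the report-writing code only ever handles lists, so an empty or failed source needs no special casing downstream.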

Fallback Strategies

Enrichment Analysis

Primary: Enrichr with KEGG library
Fallback: Try alternative libraries (Reactome, GO Biological Process)
If all fail: Note "enrichment analysis unavailable" and continue
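
The library fallback could be sketched as follows. `run_enrichment` is a caller-supplied callable (for example, a wrapper around `enrichr_gene_enrichment_analysis`); the GO library name `GO_Biological_Process_2023` is an assumption, and the sketch treats only a raised exception as failure, since an empty result is already handled in Phase 1 as "no significant pathways".

```python
# Assumption: the third library name is illustrative; check the names Enrichr currently offers.
DEFAULT_LIBRARIES = ("KEGG_2021_Human", "Reactome_2022", "GO_Biological_Process_2023")


def enrich_with_fallback(gene_list, run_enrichment, libraries=DEFAULT_LIBRARIES):
    """Try each library in order and return (library, results) for the first call that succeeds.

    Returns (None, []) if every library fails, so the caller can note
    "enrichment analysis unavailable" and continue with the other phases.
    """
    for library in libraries:
        try:
            return library, run_enrichment(gene_list, library)
        except Exception:
            # This library is unavailable; move on to the next one.
            continue
    return None, []
```

Returning the library name alongside the results lets the report state which source the enrichment actually came from.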

Protein Mapping

Primary: Reactome protein-pathway mapping
Fallback: Use keyword search with protein name
If empty: Check if protein ID valid; suggest checking gene symbol

Keyword Search

Primary: Search all databases (KEGG, WikiPathways, Pathway Commons, BioModels)
Fallback: If all empty, broaden keyword (e.g., "diabetes" → "glucose")
If still empty: Note "no pathways found for [keyword]"

Common Use Patterns

Pattern 1: Differential Expression Analysis

Input: Gene list from RNA-seq (upregulated genes)
Workflow: Phase 1 (Enrichment) → Phase 4 (Context)
Output: Enriched pathways explaining expression changes

Pattern 2: Protein Function Investigation

Input: UniProt ID of protein of interest
Workflow: Phase 2 (Protein mapping) → Phase 3 (Keyword with protein name)
Output: All pathways involving protein + related pathways

Pattern 3: Disease Pathway Exploration

Input: Disease name or process keyword
Workflow: Phase 3 (Keyword search) → Phase 4 (Context)
Output: Pathways from multiple databases related to disease

Pattern 4: Comprehensive Multi-Input

Input: Gene list + protein ID + keyword
Workflow: All phases
Output: Complete systems view with enrichment, specific mappings, and context
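
The four patterns reduce to a simple rule: each phase runs when its input is present, and Phase 4 always runs. A minimal sketch of that dispatch, with the hypothetical function name `select_phases`, might look like this. It does not derive a keyword from the protein name as Pattern 2 does; the caller would pass that keyword explicitly.

```python
def select_phases(gene_list=None, protein_id=None, keyword=None):
    """Return the phases to run, in workflow order, for the inputs provided."""
    phases = []
    if gene_list:
        phases.append("Phase 1: Enrichment")
    if protein_id:
        phases.append("Phase 2: Protein Mapping")
    if keyword:
        phases.append("Phase 3: Keyword Search")
    # Phase 4 is always included to provide context.
    phases.append("Phase 4: Top Pathways")
    return phases
```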

Quality Checks

Data Completeness

Biological Validity

Enrichment p-values are reported with the significance threshold applied
Protein mappings consistent with known function
Keyword results relevant to query
Cross-database results show expected overlaps

Report Quality

All sections present even if "no data"
Tables formatted consistently
Source databases clearly attributed
Follow-up recommendations if data sparse

Limitations & Known Issues

Database-Specific

Reactome: Strong human coverage; limited for non-model organisms
KEGG: Requires keyword match; may miss synonyms
WikiPathways: Variable curation quality; check pathway version dates
Pathway Commons: Aggregation can have duplicates; check source
BioModels: Sparse for many processes; often returns no results
Enrichr: Requires gene symbols (not IDs); case-sensitive

Technical

Response formats: Different databases use different response structures (handled in implementation)
Rate limits: Some databases have rate limits for heavy usage
Version differences: Pathway databases updated at different rates

Analysis

Enrichment bias: Pathway enrichment depends on pathway size and annotation completeness
Organism specificity: Not all databases cover all organisms equally
Pathway definitions: Same biological process may be modeled differently across databases

Summary

The Systems Biology & Pathway Analysis skill provides comprehensive pathway analysis by integrating:

✅ Statistical pathway enrichment (Enrichr)
✅ Protein-pathway mapping (Reactome)
✅ Multi-database keyword search (KEGG, WikiPathways, Pathway Commons, BioModels)
✅ Hierarchical pathway context (Reactome top-level)

Outputs: Markdown report with pathway tables, enrichment statistics, and cross-database comparisons

Best for: Gene set analysis, protein function investigation, pathway discovery, systems-level biology
