human-protein-atlas-database
Human Protein Atlas (HPA) Database Integration
This skill provides semi-quantitative protein expression and spatial
localization data from the Human Protein Atlas (HPA). While RNA-seq (e.g., GTEx)
tells us whether a gene is being transcribed, HPA confirms whether the protein
product actually exists, where it is located within the cell (e.g., nucleus vs.
cytoplasm), and its concentration in systemic blood circulation. The tissue
expression data is based on immunohistochemistry (IHC) across normal human
tissues and cancer types.
Prerequisites
- `uv`: Read the `uv` skill and follow its Setup instructions to ensure `uv` is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://www.proteinatlas.org/about/licence, then (2) create the file recording the notification text and timestamp.
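
The User Notification step can be sketched in Python as below. This is only a minimal illustration: the function name `ensure_licence_notification`, the wording of the notice, and the layout of the marker file (timestamp on the first line, notice on the second) are assumptions, not something the skill prescribes.

```python
from datetime import datetime, timezone
from pathlib import Path

LICENCE_URL = "https://www.proteinatlas.org/about/licence"
MARKER_NAME = "LICENSE_NOTIFICATION.txt"


def ensure_licence_notification(skill_dir: Path) -> bool:
    """Notify once per skill directory; return True if a notice was issued."""
    marker = skill_dir / MARKER_NAME
    if marker.exists():
        return False  # the user was already notified on an earlier run

    # Step 1: notify the user prominently (wording is an assumption).
    notice = f"Please check the Human Protein Atlas licence terms at {LICENCE_URL}"
    print(f"*** NOTICE: {notice} ***")

    # Step 2: record the notification text and a UTC timestamp.
    timestamp = datetime.now(timezone.utc).isoformat()
    marker.write_text(f"{timestamp}\n{notice}\n", encoding="utf-8")
    return True
```

Calling the function a second time on the same directory returns `False` and prints nothing, which matches the "does not already exist" condition.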
When to Use
Use this skill when you need to:
- Map a gene symbol to its Ensembl ID for HPA queries.
- Retrieve the semi-quantitative protein abundance in normal human tissues and cancer types based on IHC staining (High, Medium, Low, or Not Detected).
- Find the specific organelles or subcellular structures where a protein has been localized (e.g., nucleoplasm, mitochondria).
- Check the consistency/agreement between RNA-seq consensus and protein expression levels.
- Search for genes based on specific protein expression criteria (e.g., "elevated in amygdala" or "secreted proteins").
Do NOT use when you need to:
- Query eQTLs, pQTLs, or any variant-level associations. HPA provides wild-type expression data and knows nothing about QTLs.
- Query gene expression in non-human species. HPA is strictly for human proteins.
- Retrieve purely quantitative RNA expression without interest in the protein product (consider using the GTEx skill instead).
Command Selection Guide
Pick the right command on the first try. Match the user's input to the
correct subcommand below.
- Map a gene symbol to Ensembl ID: `resolve-ensembl-id`
- Get tissue protein expression levels: `get-tissue-expression`
- Get subcellular location of a protein: `get-subcellular-location`
- Get the full HPA metadata entry for a gene: `get-atlas-entry`
- Search HPA for genes matching specific criteria: `search-hpa`
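
One way to keep this mapping explicit in a calling script is a small lookup table that builds the argument list for the wrapper. The sketch below is hypothetical: the intent labels and the `build_command` helper are invented for illustration, while the subcommand names and the `uv run scripts/hpa_cli.py` prefix are taken from this document.

```python
from typing import Optional

# Intent labels are hypothetical; subcommand names come from the guide above.
SUBCOMMAND_FOR_INTENT = {
    "symbol_to_ensembl": "resolve-ensembl-id",
    "tissue_expression": "get-tissue-expression",
    "subcellular_location": "get-subcellular-location",
    "full_entry": "get-atlas-entry",
    "search": "search-hpa",
}


def build_command(intent: str, *args: str, output: Optional[str] = None) -> list[str]:
    """Return the argv list for one hpa_cli.py call."""
    subcommand = SUBCOMMAND_FOR_INTENT[intent]  # KeyError on an unknown intent
    argv = ["uv", "run", "scripts/hpa_cli.py", subcommand, *args]
    if output is not None:
        argv += ["--output", output]
    return argv
```

The returned list can be passed to `subprocess.run(argv, check=True)`; building a list rather than a shell string avoids quoting problems with values such as `"duodenum,thyroid gland"`.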
Quick Start
```bash
# Map the ERBB2 gene symbol to its Ensembl ID
uv run scripts/hpa_cli.py resolve-ensembl-id ERBB2 --output /tmp/erbb2_id.json

# Get subcellular location by Ensembl ID
uv run scripts/hpa_cli.py get-subcellular-location ENSG00000141736 --output /tmp/erbb2_location.json
```

All subcommands write JSON to disk. Always save output in the `/tmp/` directory.
The default output file is `/tmp/hpa_output.json` if `--output` is not
specified.
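
A calling script could check the output path before invoking the CLI. The following is a sketch only: `resolve_output_path` is a hypothetical helper, not part of `hpa_cli.py`, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import Optional

DEFAULT_OUTPUT = "/tmp/hpa_output.json"  # default stated in the text above


def resolve_output_path(output: Optional[str] = None) -> str:
    """Return the path to pass to --output, rejecting anything outside /tmp/."""
    path = PurePosixPath(output or DEFAULT_OUTPUT)
    # Refuse relative paths, other roots, and ".." components.
    if ".." in path.parts or path.parts[:2] != ("/", "tmp"):
        raise ValueError(f"output must live under /tmp/: {output!r}")
    if len(path.parts) < 3:
        raise ValueError("output must name a file, not /tmp itself")
    return str(path)
```

With no argument the helper falls back to the documented default, so the same call works whether or not the caller chose a file name.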
Commands
1. `resolve-ensembl-id`: Gene Symbol → Ensembl ID

Maps a common gene symbol (e.g., "TP53", "ERBB2") to its Ensembl gene ID. HPA
endpoints are strictly Ensembl-based.

```bash
uv run scripts/hpa_cli.py resolve-ensembl-id TP53 --output /tmp/tp53_id.json
```

Arguments:
- `gene_symbol` (positional): The standard gene symbol (e.g., "TP53").
- `--output`: Output file path (default: `/tmp/hpa_output.json`).
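
Because the other subcommands expect an Ensembl ID rather than a symbol, a quick format check can tell a caller whether this resolution step is still needed. The sketch below relies on the standard shape of human Ensembl gene IDs (`ENSG` followed by 11 digits); the helper name `is_ensembl_gene_id` is hypothetical, and versioned IDs with a `.N` suffix are deliberately rejected.

```python
import re

# Human Ensembl gene IDs: "ENSG" followed by 11 digits, e.g. ENSG00000141736.
ENSEMBL_GENE_ID = re.compile(r"ENSG[0-9]{11}")


def is_ensembl_gene_id(value: str) -> bool:
    """Return True if `value` looks like an unversioned human Ensembl gene ID."""
    return ENSEMBL_GENE_ID.fullmatch(value.strip()) is not None
```

A value such as "TP53" fails the check, which signals that `resolve-ensembl-id` should be run first.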
2. `get-tissue-expression`: Get Tissue Protein Levels

Returns a list of tissues and their corresponding protein expression levels
(High, Medium, Low, or Not Detected) based on IHC staining.

```bash
uv run scripts/hpa_cli.py get-tissue-expression ENSG00000130234 \
  --tissues "duodenum,thyroid gland" --output /tmp/tissue_expr.json
```

Arguments:
- `ensembl_id` (positional): The Ensembl Gene ID.
- `--tissues`: Comma-separated list of tissues to filter by (optional, defaults to all available tissues).
- `--output`: Output file path (default: `/tmp/hpa_output.json`).
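
The four levels are ordinal, so they can be ranked for filtering and sorting even though they are not numeric measurements. The sketch below assumes each record is a dict with `tissue` and `level` keys; that layout is a guess for illustration and may not match the JSON that `hpa_cli.py` actually writes, so the key names may need adjusting. The sample records are made up and are not HPA results.

```python
# Ordinal rank for the four IHC levels named in the text above.
LEVEL_RANK = {"not detected": 0, "low": 1, "medium": 2, "high": 3}


def tissues_at_or_above(records: list[dict], minimum: str = "Medium") -> list[str]:
    """Return tissue names whose level is at least `minimum`, highest first."""
    threshold = LEVEL_RANK[minimum.lower()]
    ranked = [
        (LEVEL_RANK[rec["level"].lower()], rec["tissue"])
        for rec in records
        if LEVEL_RANK[rec["level"].lower()] >= threshold
    ]
    # Sort by descending level, then alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tissue for _, tissue in ranked]


# Made-up records for illustration only; not real HPA results.
example = [
    {"tissue": "tissue_a", "level": "High"},
    {"tissue": "tissue_b", "level": "Not detected"},
    {"tissue": "tissue_c", "level": "Medium"},
]
print(tissues_at_or_above(example))
```

Level names are compared case-insensitively so that "Not Detected" and "Not detected" are treated alike.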
3. `get-subcellular-location`: Get Subcellular Location

Retrieves the specific organelles or cellular structures where the protein has
been localized.

```bash
uv run scripts/hpa_cli.py get-subcellular-location ENSG00000141736 \
  --output /tmp/subcellular.json
```

Arguments:
- `ensembl_id` (positional): The Ensembl Gene ID.
- `--output`: Output file path.
4. `get-atlas-entry`: Get Full HPA Entry

Fetches the full metadata for a gene, including IHC scores, RNA-seq consensus,
and subcellular location.

```bash
uv run scripts/hpa_cli.py get-atlas-entry ENSG00000254647 \
  --output /tmp/ins_entry.json
```

Arguments:
- `ensembl_id` (positional): The Ensembl Gene ID.
- `--format`: Format of the returned entry, e.g., json (default: `json`).
- `--output`: Output file path.
5. `search-hpa`: Search by Attribute

Allows filtering for genes based on specific criteria (e.g., "elevated in
amygdala").

```bash
uv run scripts/hpa_cli.py search-hpa \
  --query "brain_category_rna:amygdala" \
  --output /tmp/search_results.json
```

Arguments:
- `--query`: The search query string. Refer to references/search-api.md for details.
- `--output`: Output file path.
Core Rules
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce fair use and implement retry logic.
- Notification: If this skill is used, ensure this is mentioned in the output.
Common Errors
- If no results are returned, confirm the query is detailed enough, starting with the API reference in references/search-api.md.
- If you still cannot find results, search the web for example HPA queries and use them to construct a better query.
- The output is usually large. Use `jq` or a short Python parsing script to process the search results. Never print the full output to stdout or `cat` the output file.
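
A short Python helper can report the shape of an output file without printing its contents. The sketch below makes no assumption about the HPA schema beyond the file being valid JSON; the function name `summarize_json` and the `max_keys` parameter are hypothetical.

```python
import json
from pathlib import Path


def summarize_json(path: str, max_keys: int = 10) -> str:
    """Describe the top-level shape of a JSON file without dumping its content."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        first = data[0] if data else None
        # Show only the key names of the first item, never the values.
        keys = sorted(first)[:max_keys] if isinstance(first, dict) else []
        return f"list of {len(data)} items; first-item keys: {keys}"
    if isinstance(data, dict):
        return f"object with {len(data)} keys: {sorted(data)[:max_keys]}"
    return f"scalar of type {type(data).__name__}"
```

Once the key names are known, a targeted `jq` filter or a few lines of Python can extract just the fields of interest from, say, `/tmp/search_results.json`.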