gtex-database
GTEx Database Integration
This skill retrieves transcriptomics data (RNA expression baselines) and
expression Quantitative Trait Loci (eQTLs) from the GTEx Portal API V2. It
provides access to median TPM (Transcripts Per Million) values for genes and
significant eQTLs for variants across 54 human tissue sites.
Prerequisites
- `uv`: Read the `uv` skill and follow its Setup instructions to ensure `uv` is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://gtexportal.org/home/license and https://gtexportal.org/home/documentationPage#gtexApi, and (2) create the file, recording the notification text and a timestamp.
When to Use
Use this skill when you need to:
- Map a gene symbol to its Versioned GENCODE ID.
- Retrieve the baseline median expression level (in TPM) of a gene across various tissues.
- Find the top tissues where a particular gene is most highly expressed.
- Fetch significant single-tissue eQTLs for a variant or within a chromosomal window.
- Get all significant eQTLs associated with a specific gene.
- Contextualise a variant within GWAS loci using eQTL data.
Do NOT use when you need to:
- Query for protein-level expression or post-translational modifications (PTMs). GTEx only measures mRNA abundance.
- Query gene expression in diseased tissues (e.g., tumor samples, cirrhosis). GTEx is a baseline atlas of normal, non-diseased tissues.
- Query embryonic or fetal gene expression. GTEx donors are adults only.
Core Rules
CRITICAL: You MUST respect GTEx Portal API Terms of Use.
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts enforce the required rate limit automatically.
- Limit requests to a maximum of 250 items per page where applicable.
- Notification: If this skill is used, ensure this is mentioned in the output.
Command Selection Guide
Pick the right command on the first try. Match the user's input to the
correct subcommand below.
- Map a gene symbol to GENCODE ID: `resolve-gencode-id`
- Get median expression (TPM) for a gene: `get-median-expression`
- Find tissues with highest expression for a gene: `get-top-expressed-tissues`
- Get all eQTLs for a specific gene: `get-gene-eqtls`
- Find eQTLs within a chromosomal region: `get-eqtls-in-region`
Quick Start

```bash
# Map the TNF gene symbol to its GENCODE ID
uv run scripts/gtex_cli.py resolve-gencode-id TNF --output /tmp/tnf_id.json

# Get median expression of a gene by GENCODE ID
uv run scripts/gtex_cli.py get-median-expression ENSG00000232810.2 --output /tmp/tnf_expr.json
```

All subcommands write JSON to disk. Always save output in the `/tmp/` directory. The default output file is `/tmp/gtex_output.json` if `--output` is not specified.
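
The output convention above can be wrapped in a small Python helper. This is a minimal sketch rather than part of the skill: `build_command` and `run_and_load` are hypothetical names, the script path is copied from the examples, and nothing is assumed about the JSON schema beyond the file being valid JSON.

```python
import json
import subprocess
from pathlib import PurePosixPath

CLI = "scripts/gtex_cli.py"  # path copied from the Quick Start examples


def build_command(subcommand, *args, output="/tmp/gtex_output.json"):
    """Return the argv list for one wrapper call, refusing outputs outside /tmp."""
    out = PurePosixPath(output)
    if out.parent != PurePosixPath("/tmp"):
        raise ValueError(f"output must be saved directly under /tmp/: {output}")
    return ["uv", "run", CLI, subcommand, *map(str, args), "--output", str(out)]


def run_and_load(subcommand, *args, output="/tmp/gtex_output.json"):
    """Run the wrapper, then parse whatever JSON it wrote (schema not assumed)."""
    subprocess.run(build_command(subcommand, *args, output=output), check=True)
    with open(output, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `run_and_load("resolve-gencode-id", "TNF", output="/tmp/tnf_id.json")` issues the first Quick Start command and returns the parsed result. Rejecting subdirectories of `/tmp/` is a deliberately strict reading of the rule and can be relaxed.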
Commands
1. `resolve-gencode-id`: Gene Symbol → GENCODE ID

Maps a standard gene symbol (e.g., "JUN", "TNF") to its versioned GENCODE ID. This ID is required by every gene-based command (`get-median-expression`, `get-top-expressed-tissues`, `get-gene-eqtls`).

```bash
uv run scripts/gtex_cli.py resolve-gencode-id TNF --output /tmp/tnf_id.json
```

Arguments:
- `gene_symbol` (positional): The standard gene symbol (e.g., "TNF").
- `--output`: Output file path (default: `/tmp/gtex_output.json`).
2. `get-median-expression`: Get Median Expression (TPM)

Retrieves the median TPM for a gene across all 54 GTEx tissue sites or across specified tissues.

```bash
uv run scripts/gtex_cli.py get-median-expression ENSG00000232810.2 \
  --tissues "Whole Blood,Spleen" --output /tmp/expr.json
```

Arguments:
- `gencode_id` (positional): The versioned GENCODE ID.
- `--tissues`: Comma-separated list of tissue IDs (optional, defaults to all 54 tissues).
- `--output`: Output file path (default: `/tmp/gtex_output.json`).
3. `get-top-expressed-tissues`: Get Top Expressed Tissues

Returns the `n` tissues with the highest median expression for the target gene.

```bash
uv run scripts/gtex_cli.py get-top-expressed-tissues ENSG00000232810.2 \
  --n 5 --output /tmp/top_tissues.json
```

Arguments:
- `gencode_id` (positional): The versioned GENCODE ID.
- `--n`: Number of top tissues to return (default: 5).
- `--output`: Output file path.
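
Conceptually, the selection amounts to ranking tissues by median TPM and keeping the first `n`. The sketch below illustrates that idea on a plain tissue-to-TPM mapping. It is an assumption about the operation, not the wrapper's actual code, and the tissue names and values are placeholders rather than GTEx measurements.

```python
import heapq


def top_expressed(median_tpm, n=5):
    """Return the n (tissue, TPM) pairs with the highest median TPM, highest first."""
    return heapq.nlargest(n, median_tpm.items(), key=lambda item: item[1])


# Placeholder values for illustration only, not GTEx measurements.
example = {"tissue_a": 1.5, "tissue_b": 40.0, "tissue_c": 7.25}
print(top_expressed(example, n=2))  # [('tissue_b', 40.0), ('tissue_c', 7.25)]
```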
4. `get-gene-eqtls`: Get All eQTLs for a Gene

Returns every significant eQTL associated with the gene across the specified tissues.

```bash
uv run scripts/gtex_cli.py get-gene-eqtls ENSG00000232810.2 \
  --tissues "Whole Blood" --output /tmp/eqtls.json
```

Arguments:
- `gencode_id` (positional): The versioned GENCODE ID.
- `--tissues`: Comma-separated list of tissue IDs (optional, defaults to all).
- `--output`: Output file path.
5. `get-eqtls-in-region`: Get eQTLs in Chromosomal Region

Returns all significant single-tissue eQTLs within a chromosomal window (up to 8 Mb).

```bash
uv run scripts/gtex_cli.py get-eqtls-in-region chr17 7000000 7100000 "Esophagus - Muscularis" \
  --output /tmp/region_eqtls.json
```

Arguments:
- `chromosome` (positional): Chromosome name (e.g., `chr17`).
- `start` (positional): Start position.
- `end` (positional): End position (max 8 Mb from start).
- `tissue_id` (positional): The target tissue ID.
- `--output`: Output file path.
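
A region wider than the 8 Mb limit has to be queried as several windows. The following is a minimal sketch of one way to split it, assuming "8 Mb" means 8,000,000 bp. `split_region` is a hypothetical helper, not part of the skill. Adjacent windows share a boundary position, so results merged across windows may need de-duplication.

```python
MAX_WINDOW_BP = 8_000_000  # assumption: "8 Mb" read as 8,000,000 base pairs


def split_region(start, end, max_window=MAX_WINDOW_BP):
    """Split [start, end] into consecutive windows with end - start <= max_window."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    windows = []
    cursor = start
    while cursor < end:
        stop = min(cursor + max_window, end)
        windows.append((cursor, stop))
        cursor = stop  # next window begins at the shared boundary
    return windows


# The example region above fits in a single window.
print(split_region(7_000_000, 7_100_000))  # [(7000000, 7100000)]
```

Each `(start, end)` pair can then be passed to a separate `get-eqtls-in-region` call with its own `--output` file.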
Typical Workflows
Identify highest expressing tissues for a gene

```bash
# Step 1: Map symbol to GENCODE ID
uv run scripts/gtex_cli.py resolve-gencode-id GATA4 --output /tmp/gata4_id.json

# Step 2: Query for top tissues using the resolved ID
uv run scripts/gtex_cli.py get-top-expressed-tissues <gencode_id> --n 5 \
  --output /tmp/gata4_top.json
```
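
Step 2 needs the ID that step 1 wrote to disk. Because the layout of that JSON file is not documented here, the sketch below avoids assuming any key names and instead searches the parsed file for the first string shaped like a versioned GENCODE gene ID (`ENSG` digits, a dot, then a version). Both function names are hypothetical.

```python
import json
import re

# Versioned gene ID shape, e.g. ENSG00000232810.2 (pattern is an assumption).
GENCODE_ID = re.compile(r"ENSG\d+\.\d+")


def find_gencode_id(node):
    """Depth-first search of parsed JSON for the first versioned GENCODE ID string."""
    if isinstance(node, str):
        return node if GENCODE_ID.fullmatch(node) else None
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_gencode_id(child)
        if found:
            return found
    return None


def load_gencode_id(path):
    """Read a wrapper output file and return the GENCODE ID it contains."""
    with open(path, encoding="utf-8") as fh:
        found = find_gencode_id(json.load(fh))
    if found is None:
        raise ValueError(f"no versioned GENCODE ID found in {path}")
    return found
```

The value returned by `load_gencode_id("/tmp/gata4_id.json")` would take the place of `<gencode_id>` in step 2. If the file lists several IDs, only the first one encountered is returned, so inspect the file when a symbol is ambiguous.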