jaspar-database

JASPAR Skill

JASPAR is the definitive open-access database for Transcription Factor (TF) binding profiles, stored as Position Frequency Matrices (PFMs).
Use this skill to map abstract sequence motifs or genomic regions to specific biological regulators (e.g., "what TFs bind here?" or "what is the motif for CTCF?").

Prerequisites

  1. uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
  2. User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory then (1) prominently notify the user to check the terms at https://jaspar.elixir.no/ and https://jaspar.elixir.no/api/, then (2) create the file recording the notification text and timestamp.

Core Rules

CRITICAL: You MUST respect the JASPAR API Terms of Use by adhering to the following:
  • Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce the required rate limit gracefully.
  • Maximum API Window Size: The genomic window for a single API query MUST NOT exceed 100,000 bp (100 kb). The jaspar_api.py script automatically splits larger regions into chunks that each stay within this limit.
  • Valid Matrix IDs: get_tf_motif, get_tf_metadata, and get_tf_pwm require a stable JASPAR Matrix ID (e.g., MA0488.2). If a user provides a gene symbol (e.g., JUN), you must resolve it first using resolve_tf_id.
  • Taxonomy Required: Resolving IDs requires a tax_id to ensure targeted searches. Common IDs: Human=9606, Mouse=10090.
  • Notification: Whenever this skill is used, state that in the output.
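The window-size rule above can be illustrated with a minimal sketch of how a large region might be split into windows of at most 100,000 bp. This is not the code in jaspar_api.py; the function name chunk_region and the half-open coordinate convention are assumptions made for illustration only.

```python
from typing import Iterator, Tuple

MAX_WINDOW_BP = 100_000  # limit stated in the Core Rules


def chunk_region(start: int, end: int,
                 max_bp: int = MAX_WINDOW_BP) -> Iterator[Tuple[int, int]]:
    """Yield consecutive half-open [start, end) windows of at most max_bp.

    Hypothetical helper: the bundled script may use different
    coordinates or overlapping windows.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    pos = start
    while pos < end:
        nxt = min(pos + max_bp, end)  # never run past the region end
        yield pos, nxt
        pos = nxt


# A 250 kb region becomes three windows: 100 kb, 100 kb, 50 kb.
chunks = list(chunk_region(1_000_000, 1_250_000))
for s, e in chunks:
    print(s, e, e - s)
```

Non-overlapping windows like these may miss a binding site that spans a window boundary; how the bundled script handles that case is not described here.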

Utility Scripts

Run all commands using the bundled Python script:

1. Resolve TF to Matrix ID

Maps a transcription factor name to a stable Matrix ID. Required step before fetching motifs if only a gene name is provided.
bash
uv run scripts/jaspar_api.py resolve_tf_id --name "JUN" --tax-id 9606

2. Get TF Motif (PFM)

Retrieves the raw Position Frequency Matrix for a specific TF. Supports the --format flag.
bash
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2" --format meme

3. Get TF Metadata

Retrieves TF class, family, and links to external databases (e.g., UniProt). Supports the --format flag.
bash
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2" --format yaml

4. Compute PWM (Position Weight Matrix)

Fetches the PFM for a matrix and converts it to log-odds scores (PWM).
bash
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2" --pseudocount 0.1
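The PFM to log-odds conversion described above can be sketched as follows. This is a minimal illustration, not the script's implementation: the function name pfm_to_pwm, the per-cell pseudocount convention, the uniform 0.25 background, and the toy counts are all assumptions.

```python
import math
from typing import Dict, List, Optional

BASES = "ACGT"


def pfm_to_pwm(
    pfm: Dict[str, List[float]],
    pseudocount: float = 0.1,  # assumed default, mirrors the example flag value
    background: Optional[Dict[str, float]] = None,
) -> Dict[str, List[float]]:
    """Convert per-base counts to log2-odds scores against a background.

    Assumption: the pseudocount is added to every cell of a column.
    With pseudocount=0, a zero count raises ValueError (log of zero).
    """
    bg = background or {b: 0.25 for b in BASES}
    width = len(pfm["A"])
    pwm: Dict[str, List[float]] = {b: [] for b in BASES}
    for i in range(width):
        # Column total after adding the pseudocount to each of the four bases.
        col_total = sum(pfm[b][i] for b in BASES) + pseudocount * len(BASES)
        for b in BASES:
            prob = (pfm[b][i] + pseudocount) / col_total
            pwm[b].append(math.log2(prob / bg[b]))
    return pwm


# Toy two-column matrix (made-up counts, not a real JASPAR profile).
toy_pfm = {"A": [10, 5], "C": [0, 5], "G": [0, 5], "T": [0, 5]}
toy_pwm = pfm_to_pwm(toy_pfm, pseudocount=0.1)
for base in BASES:
    print(base, [round(score, 3) for score in toy_pwm[base]])
```

In the toy matrix, the first column strongly favors A, so A scores positive and the other bases negative. The second column is uniform, so every score there is close to zero.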

5. Infer Matrix from Protein Sequence

Infers potential JASPAR matrix profiles from a raw transcription factor protein sequence.
bash
uv run scripts/jaspar_api.py infer_from_sequence --sequence "QAQLLPSHHVG"

6. Get TF Flexible Model (TFFM)

Retrieves metadata for a JASPAR TF Flexible Model. (Note: The JASPAR TFFM endpoints occasionally return HTTP 500 Internal Server Error responses.)
bash
uv run scripts/jaspar_api.py get_tffm --tffm-id "TFFM0001.1"

Output Formats

The get_tf_motif and get_tf_metadata commands accept an optional --format flag. Supported formats: json (default), jsonp, jaspar, meme, transfac, pfm, yaml.

Anti-Patterns

  • DON'T pass gene symbols (e.g., JUN) to get_tf_motif. You must pass the MA... Matrix ID.
  • DON'T forget the --tax-id when resolving a TF name.
  • DON'T use this skill for determining tissue-specific epigenetic availability (JASPAR shows potential binding, not actual tissue expression context).
  • DON'T use this skill to model how a specific protein mutation affects binding.
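The first anti-pattern can be guarded against with a small pre-check before calling get_tf_motif. The sketch below is illustrative only: the pattern is inferred from the example ID MA0488.2 (prefix MA, four digits, a dot, a version number), and the helper name is hypothetical. It is not an official JASPAR grammar, and identifiers of other shapes would be rejected.

```python
import re

# Assumed shape, inferred from the example "MA0488.2" in this document.
MATRIX_ID_RE = re.compile(r"MA\d{4}\.\d+")


def looks_like_matrix_id(value: str) -> bool:
    """Return True if value has the MA####.# shape used in the examples."""
    return MATRIX_ID_RE.fullmatch(value.strip()) is not None


for candidate in ["MA0488.2", "JUN", "MA0488"]:
    if looks_like_matrix_id(candidate):
        print(f"{candidate}: pass to get_tf_motif")
    else:
        print(f"{candidate}: run resolve_tf_id with --name and --tax-id first")
```

A versionless value such as MA0488 is treated as unresolved here. Whether the bundled script accepts such values is not stated in this document.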