jaspar-database
JASPAR Skill
JASPAR is the definitive open-access database for Transcription Factor (TF)
binding profiles, stored as Position Frequency Matrices (PFMs).
Use this skill to map abstract sequence motifs or genomic regions to specific
biological regulators (e.g., "what TFs bind here?" or "what is the motif for
CTCF?").
Prerequisites
- `uv`: Read the `uv` skill and follow its Setup instructions to ensure `uv` is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://jaspar.elixir.no/ and https://jaspar.elixir.no/api/, and (2) create the file, recording the notification text and timestamp.
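The notification step can be sketched as a small idempotent check. This is a minimal illustration only: the function name `ensure_license_notification`, the wording of `NOTICE`, and the file layout (timestamp on the first line, text after it) are assumptions, not something this skill prescribes.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical notice wording; only the two URLs come from the skill text.
NOTICE = (
    "This skill queries JASPAR. Please check the terms at "
    "https://jaspar.elixir.no/ and https://jaspar.elixir.no/api/ before use."
)


def ensure_license_notification(skill_dir: Path) -> bool:
    """Notify once per skill directory.

    Returns True if the user was notified now, False if a record already existed.
    """
    record = skill_dir / "LICENSE_NOTIFICATION.txt"
    if record.exists():
        return False
    print(NOTICE)  # (1) show the notice to the user
    stamp = datetime.now(timezone.utc).isoformat()
    # (2) record the notification text together with a UTC timestamp
    record.write_text(f"{stamp}\n{NOTICE}\n", encoding="utf-8")
    return True
```

Calling the function a second time for the same directory does nothing, which matches the "if the file does not already exist" condition above.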
Core Rules
CRITICAL: You MUST respect the JASPAR API Terms of Use by adhering to the
following:
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce the required rate limit gracefully.
- Maximum API Window Size: The genomic window for a single API query MUST NOT exceed 100,000 bp (100 kb). The `jaspar_api.py` script automatically chunks larger requests for you, so larger regions can still be queried within this limit.
- Valid Matrix IDs: `get_tf_motif`, `get_tf_metadata`, and `get_tf_pwm` require a stable JASPAR Matrix ID (e.g., `MA0488.2`). If a user provides a gene symbol (e.g., `JUN`), you must resolve it first using `resolve_tf_id`.
- Taxonomy Required: Resolving IDs requires a `tax_id` to ensure targeted searches. Common IDs: Human=9606, Mouse=10090.
- Notification: If this skill is used, ensure this is mentioned in the output.
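To illustrate the window-size rule, here is a minimal sketch of how a large region could be split into windows of at most 100,000 bp. It is not the wrapper's actual implementation: the function name `chunk_region` and the half-open `[start, end)` coordinate convention are assumptions made for this example.

```python
from typing import Iterator, Tuple

MAX_WINDOW_BP = 100_000  # limit stated in the Core Rules


def chunk_region(
    start: int, end: int, max_bp: int = MAX_WINDOW_BP
) -> Iterator[Tuple[int, int]]:
    """Yield consecutive half-open [start, end) windows no longer than max_bp."""
    if max_bp <= 0:
        raise ValueError("max_bp must be positive")
    if end < start:
        raise ValueError("end must not be smaller than start")
    pos = start
    while pos < end:
        stop = min(pos + max_bp, end)  # the last window may be shorter
        yield pos, stop
        pos = stop
```

For example, `list(chunk_region(0, 250_000))` gives `[(0, 100000), (100000, 200000), (200000, 250000)]`, so each query stays within the limit.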
Utility Scripts
Run all commands using the bundled Python script:
1. Resolve TF to Matrix ID
Maps a transcription factor name to a stable Matrix ID. Required step before
fetching motifs if only a gene name is provided.
```bash
uv run scripts/jaspar_api.py resolve_tf_id --name "JUN" --tax-id 9606
```
2. Get TF Motif (PFM)
Retrieves the raw Position Frequency Matrix for a specific TF. Supports the `--format` flag.
```bash
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2" --format meme
```
3. Get TF Metadata
Retrieves TF class, family, and links to external databases (e.g., UniProt). Supports the `--format` flag.
```bash
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2" --format yaml
```
4. Compute PWM (Position Weight Matrix)
Fetches the PFM for a matrix and converts it to log-odds scores (PWM).
```bash
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2" --pseudocount 0.1
```
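For intuition, a PFM-to-PWM conversion can be sketched as follows. This is one common convention, not necessarily the formula used by `get_tf_pwm`: it assumes the pseudocount is added to every base count, a uniform background of 0.25 per base, and log base 2. The function name `pfm_to_pwm` is hypothetical.

```python
import math
from typing import Dict, List

BASES = "ACGT"


def pfm_to_pwm(
    pfm: Dict[str, List[float]],
    pseudocount: float = 0.1,
    background: float = 0.25,
) -> Dict[str, List[float]]:
    """Convert per-position base counts to log2-odds scores.

    With pseudocount=0, any zero count makes math.log2 raise ValueError,
    which is why a small positive pseudocount is the default here.
    """
    length = len(pfm["A"])
    if any(len(pfm[b]) != length for b in BASES):
        raise ValueError("all four rows must have the same length")
    pwm: Dict[str, List[float]] = {b: [] for b in BASES}
    for i in range(length):
        # Column total after adding the pseudocount to each of the 4 bases
        total = sum(pfm[b][i] for b in BASES) + 4 * pseudocount
        for b in BASES:
            prob = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(prob / background))
    return pwm
```

Under these assumptions, a position with equal counts for all four bases scores approximately 0 for each base, while a strongly preferred base scores positive and the disfavored bases score negative.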
5. Infer Matrix from Protein Sequence
Infers potential JASPAR matrix profiles from a raw transcription factor protein
sequence.
```bash
uv run scripts/jaspar_api.py infer_from_sequence --sequence "QAQLLPSHHVG"
```
6. Get TF Flexible Model (TFFM)
Retrieves metadata for a JASPAR TF Flexible Model. (Note: The JASPAR TFFM
endpoints occasionally return HTTP 500 Internal Server Error responses.)
```bash
uv run scripts/jaspar_api.py get_tffm --tffm-id "TFFM0001.1"
```
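Because the TFFM endpoints can fail intermittently, a caller may want a small, bounded retry with backoff around the lookup. The sketch below is an illustration under assumptions: the helper name `retry_on_server_error` is hypothetical, and it assumes the wrapped call raises `RuntimeError` on a server error. How the wrapper script actually reports a 500 is not specified here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_server_error(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on RuntimeError wait base_delay * 2**n seconds and retry.

    The last failure is re-raised once the attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise AssertionError("unreachable")
```

Keeping `attempts` small avoids adding load to a service that is already returning errors.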
Output Formats
The `get_tf_motif` and `get_tf_metadata` commands accept an optional `--format` flag. Supported formats: `json` (default), `jsonp`, `jaspar`, `meme`, `transfac`, `pfm`, `yaml`.
Anti-Patterns
- DON'T pass gene symbols (e.g., `JUN`) to `get_tf_motif`. You must pass the `MA...` Matrix ID.
- DON'T forget the `--tax-id` when resolving a TF name.
- DON'T use this skill for determining tissue-specific epigenetic availability (JASPAR shows potential binding, not actual tissue expression context).
- DON'T use this skill to model how a specific protein mutation affects binding.
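The first anti-pattern can be guarded against with a quick shape check before calling `get_tf_motif`. This is a sketch under assumptions: the pattern only covers IDs shaped like the `MA0488.2` example used in this document (prefix `MA`, four digits, a dot, a version number), and the helper name `looks_like_matrix_id` is hypothetical. Other JASPAR ID shapes are not handled.

```python
import re

# Assumed shape, based on the MA0488.2 example: "MA" + 4 digits + "." + version
MATRIX_ID_RE = re.compile(r"MA\d{4}\.\d+")


def looks_like_matrix_id(value: str) -> bool:
    """Return True if value is shaped like a versioned Matrix ID such as MA0488.2."""
    return MATRIX_ID_RE.fullmatch(value.strip()) is not None
```

If the check fails, for example for a gene symbol such as `JUN`, the value should go through `resolve_tf_id` with a `--tax-id` first.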