JASPAR Skill
JASPAR is the definitive open-access database for Transcription Factor (TF)
binding profiles, stored as Position Frequency Matrices (PFMs).
Use this skill to map abstract sequence motifs or genomic regions to specific
biological regulators (e.g., "what TFs bind here?" or "what is the motif for
CTCF?").
Prerequisites
- : Read the skill and follow its Setup instructions to ensure
is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in
this skill directory then (1) prominently notify the user to check the terms
at https://jaspar.elixir.no/ and https://jaspar.elixir.no/api/, then (2)
create the file recording the notification text and timestamp.
Core Rules
CRITICAL: You MUST respect the JASPAR API Terms of Use by adhering to the
following:
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the
database rather than accessing the database directly. The scripts
automatically enforce the required rate limit gracefully.
- Maximum API Window Size: The genomic window for a single API query MUST
NOT exceed 100,000 bp (100kb). The script automatically
chunks larger requests for you to bypass this limitation when querying
larger regions.
- Valid Matrix IDs: , , and
require a stable JASPAR Matrix ID (e.g., ). If a user provides a
gene symbol (e.g., ), you must resolve it first using .
- Taxonomy Required: Resolving IDs requires a to ensure targeted
searches. Common IDs: Human=9606, Mouse=10090.
- Notification: If this skill is used, ensure this is mentioned in the
output.
Utility Scripts
Run all commands using the bundled Python script:
1. Resolve TF to Matrix ID
Maps a transcription factor name to a stable Matrix ID. Required step before
fetching motifs if only a gene name is provided.
bash
uv run scripts/jaspar_api.py resolve_tf_id --name "JUN" --tax-id 9606
2. Get TF Motif (PFM)
Retrieves the raw Position Frequency Matrix for a specific TF. Supports
flag.
bash
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_motif --matrix-id "MA0488.2" --format meme
3. Get TF Metadata
Retrieves TF class, family, and links to external databases (e.g., UniProt).
Supports
flag.
bash
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_metadata --matrix-id "MA0488.2" --format yaml
4. Compute PWM (Position Weight Matrix)
Fetches the PFM for a matrix and converts it to log-odds scores (PWM).
bash
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2"
uv run scripts/jaspar_api.py get_tf_pwm --matrix-id "MA0488.2" --pseudocount 0.1
5. Infer Matrix from Protein Sequence
Infers potential JASPAR matrix profiles from a raw transcription factor protein
sequence.
bash
uv run scripts/jaspar_api.py infer_from_sequence --sequence "QAQLLPSHHVG"
6. Get TF Flexible Model (TFFM)
Retrieves metadata for a JASPAR TF Flexible Model. (Note: The JASPAR TFFM
endpoints occasionally experience 500 Internal Server errors).
bash
uv run scripts/jaspar_api.py get_tffm --tffm-id "TFFM0001.1"
Output Formats
The
and
commands accept an optional
flag. Supported formats:
(default),
,
,
,
,
,
.
Anti-Patterns
- DON'T pass gene symbols (e.g., ) to . You must pass
the Matrix ID.
- DON'T forget the when resolving a TF name.
- DON'T use this skill for determining tissue-specific epigenetic
availability (JASPAR shows potential binding, not actual tissue
expression context).
- DON'T use this skill to model how a specific protein mutation affects
binding.