unibind-database

UniBind Database Skill

UniBind is a database of direct TF–DNA interactions across 9 species, integrating ChIP-seq peaks with JASPAR TF binding profiles via the DAMO framework.

Prerequisites

  1. uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
  2. User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory then (1) prominently notify the user to check the terms at https://unibind.uio.no/ and https://unibind.uio.no/api/overview, then (2) create the file recording the notification text and timestamp.
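The notification step can be sketched in Python as below. This is a minimal illustration, not part of the skill's scripts: the function name `ensure_license_notification`, the wording of `NOTICE`, and the layout of the marker file are all assumptions. Only the file name and the two URLs come from the instructions above.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical notice wording; only the two URLs are taken from the skill text.
NOTICE = (
    "Please check the UniBind terms at https://unibind.uio.no/ "
    "and https://unibind.uio.no/api/overview before using this data."
)


def ensure_license_notification(skill_dir: Path) -> bool:
    """Notify once per skill directory.

    Returns True if the notice was shown now, False if the marker file
    already existed and nothing was done.
    """
    marker = skill_dir / "LICENSE_NOTIFICATION.txt"
    if marker.exists():
        return False
    # (1) Show the notice to the user.
    print(NOTICE)
    # (2) Record the notification text and a UTC timestamp.
    timestamp = datetime.now(timezone.utc).isoformat()
    marker.write_text(f"{timestamp}\n{NOTICE}\n", encoding="utf-8")
    return True
```

Calling the function a second time on the same directory returns False and leaves the existing file untouched.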

Quick Start

Query commands print JSON to stdout by default. Most outputs are small enough to read directly. For large outputs (list_cell_lines, list_tfs), pipe through jq to extract only the fields you need.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_species
The download_tfbs command writes BED/FASTA files to --output-dir instead. You may optionally use --output <path> on any query command to save results to a file if needed.

Core Rules

  • Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce the required rate limit gracefully.
  • Output: Query commands print JSON to stdout. Most responses are compact and can be read directly.
  • Large Results: list_cell_lines and list_tfs produce large output. Pipe these through jq to extract specific fields rather than reading the full output into context.
  • Saving to File: Use --output <path> when you need to reference the data later or when processing very large results with jq.
  • Pagination: Use --page and --page-size (max 1000) to chunk large result sets.
  • Ordering: Use --order field_name (prefix with - for descending) on any list command.
  • Notification: Whenever this skill is used, state that in the output.
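The pagination and ordering rules can be combined into a small command builder. The sketch below only assembles the argument list and does not run anything. The helper name `build_list_command`, the default page size of 100, and the assumption that pages are numbered from 1 are illustrative. The ordering value is passed as `--order=<value>` because a leading `-` could otherwise be read as a separate flag; that the wrapper accepts the `=` form is also an assumption.

```python
from typing import List, Optional

MAX_PAGE_SIZE = 1000  # upper bound stated in the pagination rule


def build_list_command(
    skill_dir: str,
    command: str,
    page: int = 1,  # assumption: pages are numbered from 1
    page_size: int = 100,  # assumption: illustrative default
    order: Optional[str] = None,
    descending: bool = False,
) -> List[str]:
    """Assemble an argv list for a paginated, optionally ordered list command."""
    if page < 1:
        raise ValueError("page must be a positive integer")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    args = [
        "uv", "run", f"{skill_dir}/scripts/unibind_api.py", command,
        "--page", str(page),
        "--page-size", str(page_size),
    ]
    if order:
        # Prefix with "-" for descending order, as described above.
        field = f"-{order}" if descending else order
        args.append(f"--order={field}")
    return args
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems when the skill directory contains spaces.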

Utility Scripts

Replace <SKILL DIR> with the absolute path to this skill's directory.

1. List Species

bash
uv run <SKILL DIR>/scripts/unibind_api.py list_species

2. List Collections

bash
uv run <SKILL DIR>/scripts/unibind_api.py list_collections

3. List Cell Lines & TFs (large output: use jp)

These commands return large datasets. Use uvx --from jmespath jp to extract only the fields you need.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_cell_lines | uvx --from jmespath jp "results[].name"
uv run <SKILL DIR>/scripts/unibind_api.py list_tfs | uvx --from jmespath jp "results[].tf_name"
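Where a JMESPath tool is not at hand, the same field extraction can be done with a few lines of Python. This sketch assumes the response is an object with a top-level `results` list, which is what the `results[].name` expressions above imply. The helper name `extract_field` and the sample payload are made up for illustration.

```python
import json
from typing import Any, List


def extract_field(payload: str, field: str) -> List[Any]:
    """Return `field` from every record under the top-level "results" list."""
    data = json.loads(payload)
    # Records that lack the field are skipped rather than raising KeyError.
    return [record[field] for record in data.get("results", []) if field in record]


# Made-up sample payload, shaped like the structure assumed above.
sample = '{"results": [{"name": "mESC", "id": 1}, {"name": "example-cell-line", "id": 2}]}'
print(extract_field(sample, "name"))
```

To use it on real output, pipe the wrapper's stdout into a script that calls `extract_field(sys.stdin.read(), "name")`.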

4. List and Filter Datasets (and Profile-Specific Datasets)

Filter datasets using the following arguments:
  • --species (e.g., "Homo sapiens")
  • --tf-name (e.g., "CTCF")
  • --cell-line (e.g., "mESC")
  • --collection (e.g., Permissive, Robust)
  • --search (a search term)
  • --biological-condition (biological condition or source)
  • --data-source (source of data, e.g., "ENCODE")
  • --has-pvalue ("true" or "false")
  • --identifier (e.g., "GSE60130")
  • --jaspar-id (JASPAR database profile matrix ID)
  • --model (prediction model)
  • --summary (summary filter)
  • --threshold-pvalue (p-value threshold)
Use list_datasets for standard datasets, or list_specific_datasets for profile-specific queries.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_datasets --species "Homo sapiens" --tf-name "CTCF" --data-source "ENCODE"
uv run <SKILL DIR>/scripts/unibind_api.py list_specific_datasets --species "Mus musculus" --cell-line "mESC"
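When filters are chosen programmatically, a small helper can turn keyword arguments into the flags listed above and reject anything not on that list. The helper name `build_dataset_args` is hypothetical, and the mapping of Python booleans to "true"/"false" is an assumption based on the --has-pvalue description.

```python
from typing import Any, List

# Flag names from the filter list above, written with underscores.
ALLOWED_FILTERS = {
    "species", "tf_name", "cell_line", "collection", "search",
    "biological_condition", "data_source", "has_pvalue", "identifier",
    "jaspar_id", "model", "summary", "threshold_pvalue",
}
DATASET_COMMANDS = {"list_datasets", "list_specific_datasets"}


def build_dataset_args(command: str = "list_datasets", **filters: Any) -> List[str]:
    """Translate keyword filters into wrapper arguments, e.g. tf_name -> --tf-name."""
    if command not in DATASET_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    args = [command]
    for key, value in filters.items():
        if isinstance(value, bool):
            # Assumption: boolean filters are passed as "true" / "false".
            value = "true" if value else "false"
        args += [f"--{key.replace('_', '-')}", str(value)]
    return args


print(build_dataset_args(species="Homo sapiens", tf_name="CTCF", data_source="ENCODE"))
```

Prepending `["uv", "run", "<SKILL DIR>/scripts/unibind_api.py"]` to the result gives a complete argument list equivalent to the first command above.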

5. Get Dataset Details

bash
uv run <SKILL DIR>/scripts/unibind_api.py get_dataset "EXP047889.HMLE-Twist-ER_breast_cancer.SMAD3"

6. Download TFBS Files (BED / FASTA)

Downloads all TFBS files for a dataset to a local directory. Use --format bed (default) or --format fasta.
bash
uv run <SKILL DIR>/scripts/unibind_api.py download_tfbs "EXP047889.HMLE-Twist-ER_breast_cancer.SMAD3" --output-dir /tmp/tfbs --format bed
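Once BED files are on disk, they can be read without loading them into context. The sketch below assumes the downloaded files follow the standard BED layout, with chromosome, 0-based start, and exclusive end in the first three whitespace-separated columns. Any further columns are ignored because their meaning in UniBind files is not described here. The names `Interval` and `parse_bed` are illustrative.

```python
from typing import Iterable, Iterator, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (standard BED convention)
    end: int    # exclusive

    @property
    def length(self) -> int:
        return self.end - self.start


def parse_bed(lines: Iterable[str]) -> Iterator[Interval]:
    """Yield the first three BED columns, skipping blank and header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split(maxsplit=3)
        if len(fields) < 3:
            raise ValueError(f"malformed BED line: {line!r}")
        yield Interval(fields[0], int(fields[1]), int(fields[2]))
```

Because `parse_bed` is a generator, passing it an open file handle processes one line at a time, so large tracks can be counted or filtered with modest memory use.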

Anti-Patterns

  • DON'T attempt to use the UniBind API to query specific genomic intervals, locations, or genes.
  • DON'T guess or hallucinate genome coordinates. Always use ensembl-database as an external check if you're pulling local BED tracks for offline bedtools intersection.
  • DON'T use for motif models (PFMs). Use the jaspar-database skill instead.
  • DON'T use for gene expression data. UniBind only stores binding events.
  • DON'T assume tissue-specific expression from dataset lists alone.
  • DON'T use cat to read large JSON output files into context. The output is too large. Use jq or write your own code to parse the output files.
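As one way to follow the last rule, a saved output file can be summarized instead of printed in full. The sketch below reports only the shape of the JSON: top-level keys, the size of any nested list or object, and the type of scalar values. The helper name `summarize_json` is hypothetical, and nothing is assumed about the file beyond it being valid JSON.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def summarize_json(path: Union[str, Path]) -> Dict[str, Any]:
    """Describe the shape of a JSON file without returning its contents."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return {"type": "list", "length": len(data)}
    if isinstance(data, dict):
        return {
            "type": "dict",
            # Containers are reported by size, scalars by type name.
            "keys": {
                key: len(value) if isinstance(value, (list, dict)) else type(value).__name__
                for key, value in data.items()
            },
        }
    return {"type": type(data).__name__}
```

The summary shows which keys hold the bulk of the data, so a follow-up query can target just those fields.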