unibind-database
UniBind Database Skill
UniBind is a database of direct TF–DNA interactions across 9 species,
integrating ChIP-seq peaks with JASPAR TF binding profiles via the DAMO
framework.
Prerequisites
- uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://unibind.uio.no/ and https://unibind.uio.no/api/overview, and (2) create the file, recording the notification text and a timestamp.
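The notification step could be scripted roughly as follows. This is a minimal sketch, not part of the skill: the function name `notify_license_once`, the wording of the notice, and the timestamp format are assumptions. Only the file name LICENSE_NOTIFICATION.txt and the two URLs come from the text above.

```bash
# Hypothetical helper: notify once per skill directory, then record the notice.
notify_license_once() {
  skill_dir="$1"
  marker="$skill_dir/LICENSE_NOTIFICATION.txt"

  # Nothing to do if the notification was already recorded.
  if [ -e "$marker" ]; then
    return 0
  fi

  # Assumed wording; adapt as needed.
  notice="NOTICE: Before using UniBind data, check the terms at https://unibind.uio.no/ and https://unibind.uio.no/api/overview"

  # (1) Show the notice to the user.
  printf '%s\n' "$notice"

  # (2) Record a UTC timestamp and the notice text.
  printf '%s\n%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$notice" > "$marker"
}
```

Calling the function with the skill directory path prints the notice the first time and stays silent on later calls, because the marker file already exists.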
Quick Start
Query commands print JSON to stdout by default. Most outputs are small enough to read directly. For large outputs (list_cell_lines, list_tfs), pipe through jq to extract only the fields you need.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_species
The download_tfbs command writes BED/FASTA files to --output-dir instead. You may optionally use --output <path> on any query command to save results to a file.
Core Rules
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts enforce the required rate limit automatically.
- Output: Query commands print JSON to stdout. Most responses are compact and can be read directly.
- Large Results: list_cell_lines and list_tfs produce large output. Pipe these through jq to extract specific fields rather than reading the full output into context.
- Saving to File: Use --output <path> when you need to reference the data later or when processing very large results with jq.
- Pagination: Use --page and --page-size (max 1000) to chunk large result sets.
- Ordering: Use --order field_name (prefix with - for descending) on any list command.
- Notification: If this skill is used, ensure this is mentioned in the output.
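The pagination and saving flags can be combined in a loop. The following is a rough sketch under stated assumptions: the `unibind` wrapper function, the `SKILL_DIR` variable, and the output file naming are illustrative and not part of the skill. The rules above do not say how a response signals the last page, so the caller supplies the number of pages to fetch.

```bash
# Hypothetical wrapper around the helper script; SKILL_DIR must hold the
# absolute path to this skill's directory.
unibind() {
  uv run "$SKILL_DIR/scripts/unibind_api.py" "$@"
}

# Fetch the first N pages of a list command, one JSON file per page.
# Usage: fetch_pages <command> <pages> <out_dir> [extra flags...]
fetch_pages() {
  cmd="$1"; pages="$2"; out_dir="$3"
  shift 3
  mkdir -p "$out_dir"
  page=1
  while [ "$page" -le "$pages" ]; do
    # --page-size 1000 is the documented maximum; remaining arguments
    # (filters, --order) are passed through unchanged.
    unibind "$cmd" --page "$page" --page-size 1000 \
      --output "$out_dir/${cmd}_page_${page}.json" "$@" || return 1
    page=$((page + 1))
  done
}
```

For example, `fetch_pages list_datasets 3 /tmp/unibind --species "Homo sapiens"` would request three pages and write one file per page, which keeps each response out of the context until a specific field is needed.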
Utility Scripts
Replace <SKILL DIR> with the absolute path to this skill's directory.
1. List Species
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_species
2. List Collections
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_collections
3. List Cell Lines & TFs (large output; use jp)
These commands return large datasets. Use uvx --from jmespath jp to extract only the fields you need.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_cell_lines | uvx --from jmespath jp "results[].name"
uv run <SKILL DIR>/scripts/unibind_api.py list_tfs | uvx --from jmespath jp "results[].tf_name"
4. List and Filter Datasets (and Profile-Specific Datasets)
Filter datasets using the following arguments:
- --species (e.g., "Homo sapiens")
- --tf-name (e.g., "CTCF")
- --cell-line (e.g., "mESC")
- --collection (e.g., Permissive, Robust)
- --search (a search term)
- --biological-condition (biological condition or source)
- --data-source (source of data, e.g., "ENCODE")
- --has-pvalue ("true" or "false")
- --identifier (e.g., "GSE60130")
- --jaspar-id (JASPAR database profile matrix ID)
- --model (prediction model)
- --summary (summary filter)
- --threshold-pvalue (p-value threshold)
Use list_datasets for standard datasets, or list_specific_datasets for profile-specific queries.
bash
uv run <SKILL DIR>/scripts/unibind_api.py list_datasets --species "Homo sapiens" --tf-name "CTCF" --data-source "ENCODE"
uv run <SKILL DIR>/scripts/unibind_api.py list_specific_datasets --species "Mus musculus" --cell-line "mESC"
5. Get Dataset Details
bash
uv run <SKILL DIR>/scripts/unibind_api.py get_dataset "EXP047889.HMLE-Twist-ER_breast_cancer.SMAD3"
6. Download TFBS Files (BED / FASTA)
Downloads all TFBS files for a dataset to a local directory. Use --format bed (default) or --format fasta.
bash
uv run <SKILL DIR>/scripts/unibind_api.py download_tfbs "EXP047889.HMLE-Twist-ER_breast_cancer.SMAD3" --output-dir /tmp/tfbs --format bed
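Once files are downloaded, a short summary is usually enough to confirm what arrived without reading the files in full. The following is a minimal sketch, assuming the downloaded files sit directly in the output directory, carry a `.bed` extension, have no header lines, and follow the standard tab-separated BED layout with the chromosome in column 1. The `summarize_bed_dir` name is hypothetical.

```bash
# Hypothetical helper: print a compact summary for each BED file in a directory.
summarize_bed_dir() {
  dir="$1"
  for f in "$dir"/*.bed; do
    # Skip the unexpanded pattern when no .bed files exist.
    [ -e "$f" ] || continue
    printf '%s\n' "== $f"
    # One line per record, so the line count is the number of binding sites.
    printf 'records: %s\n' "$(wc -l < "$f" | tr -d ' ')"
    # Records per chromosome (column 1), most frequent first, top 5 only.
    cut -f1 "$f" | sort | uniq -c | sort -rn | head -n 5
  done
}
```

With the download command above, `summarize_bed_dir /tmp/tfbs` would list each file with its record count and its five most populated chromosomes.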
Anti-Patterns
- DON'T attempt to use the UniBind API to query specific genomic intervals, locations, or genes.
- DON'T guess or hallucinate genome coordinates. Always use ensembl-database as an external check if you're pulling local BED tracks for offline bedtools intersection.
- DON'T use this skill for motif models (PFMs). Use the jaspar-database skill instead.
- DON'T use this skill for gene expression data. UniBind only stores binding events.
- DON'T assume tissue-specific expression from dataset lists alone.
- DON'T use cat to read large JSON output files into context. The output is too large. Use jq or write your own code to parse the output files.
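To follow the last rule, a saved JSON file can be checked for size and previewed with a hard cap before deciding how to parse it. The following is a small sketch using only standard tools. The `peek_json` name and the 500-byte default are illustrative choices, not part of the skill.

```bash
# Hypothetical helper: report a file's size and print only its first bytes.
peek_json() {
  file="$1"
  limit="${2:-500}"
  # Report the size in bytes so the caller can decide how to proceed.
  printf 'size_bytes: %s\n' "$(wc -c < "$file" | tr -d ' ')"
  # Print at most $limit bytes instead of the whole file.
  head -c "$limit" "$file"
  printf '\n'
}
```

The preview shows the top-level keys of most responses, which is typically enough to write a targeted jq filter for the fields that matter.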