deeptools

deepTools: NGS Data Analysis Toolkit

Overview

deepTools is a comprehensive suite of Python command-line tools designed for processing and analyzing high-throughput sequencing data. Use deepTools to perform quality control, normalize data, compare samples, and generate publication-quality visualizations for ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, and other NGS experiments.

Core capabilities:

Convert BAM alignments to normalized coverage tracks (bigWig/bedGraph)
Quality control assessment (fingerprint, correlation, coverage)
Sample comparison and correlation analysis
Heatmap and profile plot generation around genomic features
Enrichment analysis and peak region visualization

When to Use This Skill

Use this skill when a request matches one of the following categories:

File conversion: "Convert BAM to bigWig", "generate coverage tracks", "normalize ChIP-seq data"
Quality control: "check ChIP quality", "compare replicates", "assess sequencing depth", "QC analysis"
Visualization: "create heatmap around TSS", "plot ChIP signal", "visualize enrichment", "generate profile plot"
Sample comparison: "compare treatment vs control", "correlate samples", "PCA analysis"
Analysis workflows: "analyze ChIP-seq data", "RNA-seq coverage", "ATAC-seq analysis", "complete workflow"
Working with specific file types: BAM, bigWig, and BED region files in a genomics context

Quick Start

For users new to deepTools, start with file validation and common workflows:

1. Validate Input Files

Before running any analysis, validate BAM, bigWig, and BED files using the validation script:

bash

python scripts/validate_files.py --bam sample1.bam sample2.bam --bed regions.bed

This checks file existence, BAM indices, and format correctness.
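
The internals of the validation script are not shown here. As a rough illustration only, the existence and index checks could look like the sketch below. The function name `find_bam_problems` is hypothetical, and the sketch assumes the index sits next to the BAM file as either `sample.bam.bai` or `sample.bai`.

```python
from pathlib import Path


def find_bam_problems(bam_paths):
    """Return a list of human-readable problems for the given BAM paths.

    Hypothetical helper: it only checks that each file exists and that an
    index file is present next to it. It does not inspect file contents.
    """
    problems = []
    for raw in bam_paths:
        bam = Path(raw)
        if not bam.is_file():
            problems.append(f"{bam}: file not found")
            continue
        # Accept either sample.bam.bai or sample.bai as the index.
        candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
        if not any(c.is_file() for c in candidates):
            problems.append(f"{bam}: missing index (run 'samtools index')")
    return problems
```

An empty list means every BAM file was found together with an index; anything else can be shown to the user before analysis starts.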

2. Generate Workflow Template

For standard analyses, use the workflow generator to create customized scripts:

bash

# List available workflows
python scripts/workflow_generator.py --list

# Generate ChIP-seq QC workflow
python scripts/workflow_generator.py chipseq_qc -o qc_workflow.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398

# Make executable and run
chmod +x qc_workflow.sh
./qc_workflow.sh

3. Most Common Operations

See `assets/quick_reference.md` for frequently used commands and parameters.

Installation

bash

uv pip install deeptools

Core Workflows

deepTools workflows typically follow this pattern: QC → Normalization → Comparison/Visualization

ChIP-seq Quality Control Workflow

When users request ChIP-seq QC or quality assessment:

Generate a workflow script using `scripts/workflow_generator.py chipseq_qc`.

Key QC steps:
- Sample correlation (multiBamSummary + plotCorrelation)
- PCA analysis (plotPCA)
- Coverage assessment (plotCoverage)
- Fragment size validation (bamPEFragmentSize)
- ChIP enrichment strength (plotFingerprint)

Interpreting results:

Correlation: Replicates should cluster together with high correlation (>0.9)
Fingerprint: A strong ChIP shows a steep rise in the curve; a curve that stays close to the diagonal indicates poor enrichment
Coverage: Assess whether sequencing depth is adequate for analysis

Full workflow details are in `references/workflows.md` → "ChIP-seq Quality Control Workflow"
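
The >0.9 rule of thumb for replicates can be illustrated with a small, self-contained sketch. It computes a Pearson correlation between two vectors of per-bin read counts. The counts and the function names are invented for illustration; in a real analysis plotCorrelation derives the coefficients from the multiBamSummary output.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def replicates_agree(counts_a, counts_b, threshold=0.9):
    """True when the Pearson correlation exceeds the threshold."""
    return pearson(counts_a, counts_b) > threshold


# Hypothetical per-bin counts: two replicates and one unrelated sample.
rep1 = [10, 12, 50, 48, 9, 11]
rep2 = [11, 13, 47, 52, 8, 12]
other = [30, 5, 12, 9, 40, 33]

print(replicates_agree(rep1, rep2))
print(replicates_agree(rep1, other))
```

The threshold is a parameter because 0.9 is a guideline rather than a hard limit; broad histone marks and sharp transcription-factor peaks may warrant different cut-offs.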

ChIP-seq Complete Analysis Workflow

For full ChIP-seq analysis from BAM to visualizations:

Generate coverage tracks with normalization (bamCoverage)
Create comparison tracks (bamCompare for log2 ratio)
Compute signal matrices around features (computeMatrix)
Generate visualizations (plotHeatmap, plotProfile)
Enrichment analysis at peaks (plotEnrichment)

Use `scripts/workflow_generator.py chipseq_analysis` to generate a template.

Complete command sequences are in `references/workflows.md` → "ChIP-seq Analysis Workflow"

RNA-seq Coverage Workflow

For strand-specific RNA-seq coverage tracks:

Use bamCoverage with `--filterRNAstrand` to separate forward and reverse strands.

Important: NEVER use `--extendReads` for RNA-seq (it would extend reads over splice junctions).

Use normalization: CPM for fixed bins, RPKM for gene-level analysis.

Template available: `scripts/workflow_generator.py rnaseq_coverage`

Details in `references/workflows.md` → "RNA-seq Coverage Workflow"

ATAC-seq Analysis Workflow

ATAC-seq requires Tn5 offset correction:

Shift reads using alignmentSieve with `--ATACshift`
Generate coverage with bamCoverage
Analyze fragment sizes (expect nucleosome ladder pattern)
Visualize at peaks if available

Template: `scripts/workflow_generator.py atacseq`

Full workflow in `references/workflows.md` → "ATAC-seq Workflow"
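
To make the idea of a Tn5 offset concrete, here is a minimal sketch of the commonly cited convention: forward-strand reads are moved 4 bp downstream and reverse-strand reads 5 bp upstream. The offsets, the coordinate convention, and the function name `tn5_shift` are assumptions made for illustration. This is not how alignmentSieve is implemented; it only shows the arithmetic behind the correction.

```python
def tn5_shift(start, end, strand, plus_offset=4, minus_offset=5):
    """Shift the 5' end of a read interval toward the assumed Tn5 cut site.

    Coordinates are 0-based, half-open. Forward-strand reads have their
    start moved downstream by plus_offset; reverse-strand reads have their
    end moved upstream by minus_offset.
    """
    if strand == "+":
        start += plus_offset
    elif strand == "-":
        end -= minus_offset
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    if start >= end:
        raise ValueError("read is too short to shift")
    return start, end


print(tn5_shift(100, 150, "+"))
print(tn5_shift(100, 150, "-"))
```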

Tool Categories and Common Tasks

BAM/bigWig Processing

Convert BAM to normalized coverage:

bash

bamCoverage --bam input.bam --outFileName output.bw \
    --normalizeUsing RPGC --effectiveGenomeSize 2913022398 \
    --binSize 10 --numberOfProcessors 8

Compare two samples (log2 ratio):

bash

bamCompare -b1 treatment.bam -b2 control.bam -o ratio.bw \
    --operation log2 --scaleFactorsMethod readCount

Key tools: bamCoverage, bamCompare, multiBamSummary, multiBigwigSummary, correctGCBias, alignmentSieve

Complete reference: `references/tools_reference.md` → "BAM and bigWig File Processing Tools"
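
When these commands are launched from a script rather than typed by hand, building the argument list in one place keeps parameters consistent across samples. The sketch below assembles the same bamCoverage call shown above; the helper name `bam_coverage_command` and its defaults are illustrative, and only flags that already appear in this section are used.

```python
import shlex


def bam_coverage_command(bam, out_bw, genome_size, bin_size=10, processors=8):
    """Build the bamCoverage argument list for RPGC-normalized coverage."""
    return [
        "bamCoverage",
        "--bam", bam,
        "--outFileName", out_bw,
        "--normalizeUsing", "RPGC",
        "--effectiveGenomeSize", str(genome_size),
        "--binSize", str(bin_size),
        "--numberOfProcessors", str(processors),
    ]


cmd = bam_coverage_command("input.bam", "output.bw", 2913022398)
# Print the command for logging; to execute it, pass the list to
# subprocess.run(cmd, check=True) on a machine with deepTools installed.
print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.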

Quality Control

Check ChIP enrichment:

bash

plotFingerprint -b input.bam chip.bam -o fingerprint.png \
    --extendReads 200 --ignoreDuplicates

Sample correlation:

bash

multiBamSummary bins --bamfiles *.bam -o counts.npz
plotCorrelation -in counts.npz --corMethod pearson \
    --whatToPlot heatmap -o correlation.png

Key tools: plotFingerprint, plotCoverage, plotCorrelation, plotPCA, bamPEFragmentSize

Complete reference: `references/tools_reference.md` → "Quality Control Tools"

Visualization

Create heatmap around TSS:

bash

# Compute matrix
computeMatrix reference-point -S signal.bw -R genes.bed \
    -b 3000 -a 3000 --referencePoint TSS -o matrix.gz

# Generate heatmap
plotHeatmap -m matrix.gz -o heatmap.png \
    --colorMap RdBu --kmeans 3

Create profile plot:

bash

plotProfile -m matrix.gz -o profile.png \
    --plotType lines --colors blue red

Key tools: computeMatrix, plotHeatmap, plotProfile, plotEnrichment

Complete reference: `references/tools_reference.md` → "Visualization Tools"

Normalization Methods

Choosing the correct normalization is critical for valid comparisons. Consult `references/normalization_methods.md` for comprehensive guidance.

Quick selection guide:

ChIP-seq coverage: Use RPGC or CPM
ChIP-seq comparison: Use bamCompare with log2 and readCount
RNA-seq bins: Use CPM
RNA-seq genes: Use RPKM (accounts for gene length)
ATAC-seq: Use RPGC or CPM

Normalization methods:

RPGC: Reads per genomic content, scaled to 1× genome coverage (requires --effectiveGenomeSize)
CPM: Counts per million mapped reads
RPKM: Reads per kb per million mapped reads (accounts for region length)
BPM: Bins per million mapped reads
None: Raw counts (not recommended for comparisons)

Full explanation: `references/normalization_methods.md`
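
The arithmetic behind three of these methods can be sketched in a few lines. The formulas follow the usual definitions of CPM, RPKM, and 1× coverage scaling; the function names and the example numbers are illustrative, and the referenced normalization guide remains the authority on how deepTools applies them.

```python
def cpm(reads_in_bin, total_mapped_reads):
    """Counts per million mapped reads."""
    return reads_in_bin / (total_mapped_reads / 1e6)


def rpkm(reads_in_bin, total_mapped_reads, bin_length_bp):
    """Reads per kilobase of bin length per million mapped reads."""
    return reads_in_bin / ((total_mapped_reads / 1e6) * (bin_length_bp / 1e3))


def rpgc_scale_factor(total_mapped_reads, fragment_length, effective_genome_size):
    """Factor that rescales coverage to an average of 1x over the genome."""
    average_coverage = total_mapped_reads * fragment_length / effective_genome_size
    return 1.0 / average_coverage


# Hypothetical sample: 20 million mapped reads, one 50 bp bin with 50 reads.
print(cpm(50, 20_000_000))
print(rpkm(50, 20_000_000, 50))
print(rpgc_scale_factor(20_000_000, 200, 2913022398))
```

CPM and RPKM depend only on library size and bin length, while the RPGC factor also needs the fragment length and the effective genome size, which is why `--effectiveGenomeSize` is required for it.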

Effective Genome Sizes

RPGC normalization requires effective genome size. Common values:

| Organism | Assembly | Size | Usage |
|---|---|---|---|
| Human | GRCh38/hg38 | 2,913,022,398 | `--effectiveGenomeSize 2913022398` |
| Mouse | GRCm38/mm10 | 2,652,783,500 | `--effectiveGenomeSize 2652783500` |
| Zebrafish | GRCz11 | 1,368,780,147 | `--effectiveGenomeSize 1368780147` |
| Drosophila | dm6 | 142,573,017 | `--effectiveGenomeSize 142573017` |
| C. elegans | ce10/ce11 | 100,286,401 | `--effectiveGenomeSize 100286401` |

Complete table with read-length-specific values: `references/effective_genome_sizes.md`
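
In scripted pipelines the table above is easy to encode as a lookup so the value is never retyped. The sketch below copies the five values from the table; the dictionary names, the alias mapping, and the helper `effective_genome_size_args` are hypothetical conveniences, not part of deepTools.

```python
# Values copied from the table above, keyed by assembly name.
EFFECTIVE_GENOME_SIZES = {
    "GRCh38": 2913022398,
    "GRCm38": 2652783500,
    "GRCz11": 1368780147,
    "dm6": 142573017,
    "ce10": 100286401,
}

# Alternative names that the table lists for the same assemblies.
ALIASES = {"hg38": "GRCh38", "mm10": "GRCm38", "ce11": "ce10"}


def effective_genome_size_args(assembly):
    """Return the --effectiveGenomeSize arguments for a known assembly."""
    key = ALIASES.get(assembly, assembly)
    if key not in EFFECTIVE_GENOME_SIZES:
        raise KeyError(f"no effective genome size recorded for {assembly!r}")
    return ["--effectiveGenomeSize", str(EFFECTIVE_GENOME_SIZES[key])]


print(effective_genome_size_args("hg38"))
```

Raising an error for unknown assemblies is deliberate: silently falling back to a default genome size would produce plausible-looking but wrongly scaled tracks.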

Common Parameters Across Tools

Many deepTools commands share these options:

Performance:

`--numberOfProcessors`, `-p`: Enable parallel processing (always use available cores)
`--region`: Process specific regions for testing (e.g., `chr1:1-1000000`)

Read Filtering:

`--ignoreDuplicates`: Remove PCR duplicates (recommended for most analyses)
`--minMappingQuality`: Filter by alignment quality (e.g., `--minMappingQuality 10`)
`--minFragmentLength` / `--maxFragmentLength`: Fragment length bounds
`--samFlagInclude` / `--samFlagExclude`: SAM flag filtering

Read Processing:

`--extendReads`: Extend to fragment length (ChIP-seq: YES, RNA-seq: NO)
`--centerReads`: Center at fragment midpoint for sharper signals

Best Practices

File Validation

Always validate files first using `scripts/validate_files.py` to check:

File existence and readability
BAM indices present (.bai files)
BED format correctness
File sizes reasonable

Analysis Strategy

Start with QC: Run correlation, coverage, and fingerprint analysis before proceeding
Test on small regions: Use `--region chr1:1-10000000` for parameter testing
Document commands: Save full command lines for reproducibility
Use consistent normalization: Apply same method across samples in comparisons
Verify genome assembly: Ensure BAM and BED files use matching genome builds

ChIP-seq Specific

Always extend reads for ChIP-seq: `--extendReads 200`
Remove duplicates: Use `--ignoreDuplicates` in most cases
Check enrichment first: Run plotFingerprint before detailed analysis
GC correction: Only apply if significant bias detected; never use `--ignoreDuplicates` after GC correction

RNA-seq Specific

Never extend reads for RNA-seq (would span splice junctions)
Strand-specific: Use `--filterRNAstrand forward` or `--filterRNAstrand reverse` for stranded libraries
Normalization: CPM for bins, RPKM for genes

ATAC-seq Specific

Apply Tn5 correction: Use alignmentSieve with `--ATACshift`
Fragment filtering: Set appropriate min/max fragment lengths
Check nucleosome pattern: Fragment size plot should show ladder pattern

Performance Optimization

Use multiple processors: `--numberOfProcessors 8` (or available cores)
Increase bin size for faster processing and smaller files
Process chromosomes separately for memory-limited systems
Pre-filter BAM files using alignmentSieve to create reusable filtered files
Use bigWig over bedGraph: Compressed and faster to process

Troubleshooting

Common Issues

BAM index missing:

bash

samtools index input.bam

Out of memory: Process chromosomes individually using `--region`

bash

bamCoverage --bam input.bam -o chr1.bw --region chr1

Slow processing: Increase `--numberOfProcessors` and/or increase `--binSize`

bigWig files too large: Increase bin size: `--binSize 50` or larger
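
The out-of-memory and file-size advice can be combined in a small sketch that emits one bamCoverage command per chromosome with a larger bin size. The chromosome names, the output naming scheme, and the helper name `per_chromosome_commands` are assumptions for illustration; only flags shown in this section are used.

```python
def per_chromosome_commands(bam, chromosomes, bin_size=50):
    """Yield one bamCoverage argument list per chromosome."""
    for chrom in chromosomes:
        yield [
            "bamCoverage",
            "--bam", bam,
            "-o", f"{chrom}.bw",
            "--region", chrom,
            "--binSize", str(bin_size),
        ]


# Hypothetical chromosome list; take the real names from the BAM header.
for cmd in per_chromosome_commands("input.bam", ["chr1", "chr2", "chrX"]):
    print(" ".join(cmd))
```

Each printed line can be run on its own, so memory use is bounded by the largest chromosome rather than the whole genome.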

Validation Errors

Run validation script to identify issues:

bash

python scripts/validate_files.py --bam *.bam --bed regions.bed

Common errors and their solutions are explained in the script output.

Reference Documentation

This skill includes comprehensive reference documentation:

references/tools_reference.md

Complete documentation of all deepTools commands organized by category:

BAM and bigWig processing tools (9 tools)
Quality control tools (6 tools)
Visualization tools (3 tools)
Miscellaneous tools (2 tools)

Each tool includes:

Purpose and overview
Key parameters with explanations
Usage examples
Important notes and best practices

Use this reference when: Users ask about specific tools, parameters, or detailed usage.

references/workflows.md

Complete workflow examples for common analyses:

ChIP-seq quality control workflow
ChIP-seq complete analysis workflow
RNA-seq coverage workflow
ATAC-seq analysis workflow
Multi-sample comparison workflow
Peak region analysis workflow
Troubleshooting and performance tips

Use this reference when: Users need complete analysis pipelines or workflow examples.

references/normalization_methods.md

Comprehensive guide to normalization methods:

Detailed explanation of each method (RPGC, CPM, RPKM, BPM, etc.)
When to use each method
Formulas and interpretation
Selection guide by experiment type
Common pitfalls and solutions
Quick reference table

Use this reference when: Users ask about normalization, comparing samples, or which method to use.

references/effective_genome_sizes.md

Effective genome size values and usage:

Common organism values (human, mouse, fly, worm, zebrafish)
Read-length-specific values
Calculation methods
When and how to use in commands
Custom genome calculation instructions

Use this reference when: Users need genome size for RPGC normalization or GC bias correction.

Helper Scripts

scripts/validate_files.py

Validates BAM, bigWig, and BED files for deepTools analysis. Checks file existence, indices, and format.

Usage:

bash

python scripts/validate_files.py --bam sample1.bam sample2.bam \
    --bed peaks.bed --bigwig signal.bw

When to use: Before starting any analysis, or when troubleshooting errors.

scripts/workflow_generator.py

Generates customizable bash script templates for common deepTools workflows.

Available workflows:

`chipseq_qc`: ChIP-seq quality control
`chipseq_analysis`: Complete ChIP-seq analysis
`rnaseq_coverage`: Strand-specific RNA-seq coverage
`atacseq`: ATAC-seq with Tn5 correction

Usage:

bash

# List workflows
python scripts/workflow_generator.py --list

# Generate workflow
python scripts/workflow_generator.py chipseq_qc -o qc.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398 --threads 8

# Run generated workflow
chmod +x qc.sh
./qc.sh

When to use: Users request standard workflows or need template scripts to customize.

Assets

assets/quick_reference.md

Quick reference card with most common commands, effective genome sizes, and typical workflow pattern.

When to use: Users need quick command examples without detailed documentation.

Handling User Requests

For New Users

Start with installation verification
Validate input files using `scripts/validate_files.py`
Recommend appropriate workflow based on experiment type
Generate workflow template using `scripts/workflow_generator.py`
Guide through customization and execution

For Experienced Users

Provide specific tool commands for requested operations
Reference appropriate sections in `references/tools_reference.md`
Suggest optimizations and best practices
Offer troubleshooting for issues

For Specific Tasks

"Convert BAM to bigWig":

Use bamCoverage with appropriate normalization
Recommend RPGC or CPM based on use case
Provide effective genome size for organism
Suggest relevant parameters (extendReads, ignoreDuplicates, binSize)

"Check ChIP quality":

Run full QC workflow or use plotFingerprint specifically
Explain interpretation of results
Suggest follow-up actions based on results

"Create heatmap":

Guide through two-step process: computeMatrix → plotHeatmap
Help choose appropriate matrix mode (reference-point vs scale-regions)
Suggest visualization parameters and clustering options

"Compare samples":

Recommend bamCompare for two-sample comparison
Suggest multiBamSummary + plotCorrelation for multiple samples
Guide normalization method selection

Referencing Documentation

When users need detailed information:

Tool details: Direct to specific sections in `references/tools_reference.md`
Workflows: Use `references/workflows.md` for complete analysis pipelines
Normalization: Consult `references/normalization_methods.md` for method selection
Genome sizes: Reference `references/effective_genome_sizes.md`

Search references using grep patterns:

bash

# Find tool documentation
grep -A 20 "^### toolname" references/tools_reference.md

# Find workflow
grep -A 50 "^## Workflow Name" references/workflows.md

# Find normalization method
grep -A 15 "^### Method Name" references/normalization_methods.md

Example Interactions

User: "I need to analyze my ChIP-seq data"

Response approach:

Ask about files available (BAM files, peaks, genes)
Validate files using validation script
Generate chipseq_analysis workflow template
Customize for their specific files and organism
Explain each step as the script runs

User: "Which normalization should I use?"

Response approach:

Ask about experiment type (ChIP-seq, RNA-seq, etc.)
Ask about comparison goal (within-sample or between-sample)
Consult the `references/normalization_methods.md` selection guide
Recommend appropriate method with justification
Provide command example with parameters

User: "Create a heatmap around TSS"

Response approach:

Verify bigWig and gene BED files available
Use computeMatrix with reference-point mode at TSS
Generate plotHeatmap with appropriate visualization parameters
Suggest clustering if dataset is large
Offer profile plot as complement

Key Reminders

File validation first: Always validate input files before analysis
Normalization matters: Choose appropriate method for comparison type
Extend reads carefully: YES for ChIP-seq, NO for RNA-seq
Use all cores: Set `--numberOfProcessors` to available cores
Test on regions: Use `--region` for parameter testing
Check QC first: Run quality control before detailed analysis
Document everything: Save commands for reproducibility
Reference documentation: Use comprehensive references for detailed guidance
