Epigenomics & Gene Regulation Analysis
Comprehensive analysis of the regulatory genome integrating functional genomics experiments, transcription factor binding data, cis-regulatory element catalogs, chromatin conformation, and variant regulatory scoring. Generates structured regulatory landscape reports with evidence grading.
When to Use This Skill
Triggers:
- "What regulates [gene]?" / "Show the regulatory landscape of [gene]"
- "What transcription factors bind to [gene/region]?"
- "Find enhancers near [gene]"
- "What is the regulatory impact of variant [rsID]?"
- "Find ENCODE experiments for [histone mark/TF] in [cell type]"
- "What is the chromatin structure around [gene/region]?"
- "Analyze the epigenetic regulation of [gene]"
- "Find transcription factor binding motifs for [TF]"
- "Regulatory element analysis for [genomic region]"
Use Cases:
- Gene Regulatory Landscape: Comprehensive view of all regulatory elements, TF binding, and chromatin around a gene
- Transcription Factor Profiling: TF binding motifs (JASPAR), binding sites (ReMap), and target gene identification
- Regulatory Variant Interpretation: Assess non-coding variant impact using RegulomeDB, SCREEN, and ENCODE
- Functional Genomics Data Discovery: Find ChIP-seq, ATAC-seq, Hi-C experiments from ENCODE and 4DN
- Enhancer/Promoter Cataloging: Identify and characterize cis-regulatory elements using SCREEN
- Chromatin Conformation: 3D genome organization from 4D Nucleome Hi-C data
- Epigenetic Profiling: Histone modification patterns, DNA methylation, chromatin accessibility
KEY PRINCIPLES
- Report-first approach - Create report file FIRST, then populate progressively
- Tool parameter verification - Verify parameter names and types before calling unfamiliar tools
- Evidence grading - Grade all regulatory findings by evidence strength (T1-T4)
- Citation requirements - Every finding must have inline source attribution (database, experiment ID)
- Mandatory completeness - All sections must exist with data minimums or explicit "No data" notes
- Gene disambiguation first - Resolve gene symbol/coordinates before analysis
- Cell-type context - Always note cell type specificity of regulatory data
- Negative results documented - "No enhancers found in region" is data; empty sections are failures
- English-first queries - Always use English gene names and standard nomenclature in tool calls
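The completeness and negative-results principles above can be checked mechanically before a report is finalized. The following is a minimal sketch, assuming the report is held in memory as a dict of section name to markdown text; the `fill_missing_sections` helper and the exact section names are hypothetical and only adapted from the output templates in this document.

```python
# Hypothetical section names, adapted from the output templates in this
# document; adjust to the headings the report actually uses.
REQUIRED_SECTIONS = [
    "Gene Identity",
    "Cis-Regulatory Elements",
    "Transcription Factor Binding",
    "Regulatory Variant Impact",
    "ENCODE Functional Genomics",
    "Chromatin Conformation",
    "Ensembl Regulatory Build",
    "Integrated Regulatory Model",
]


def fill_missing_sections(report):
    """Return a copy of `report` where every required section has content.

    `report` maps section name -> markdown text. Missing or blank sections
    get an explicit "No data" note instead of being left empty.
    """
    completed = dict(report)
    for name in REQUIRED_SECTIONS:
        if not completed.get(name, "").strip():
            completed[name] = "No data returned for this section."
    return completed
```

Calling `fill_missing_sections` on a partially populated report leaves existing text untouched and only fills the gaps, so it can be run repeatedly as phases complete.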
Evidence Grading System (MANDATORY)
Grade every regulatory finding by evidence strength:
| Tier | Symbol | Criteria | Examples |
|---|---|---|---|
| T1 | [T1] | Direct experimental validation, functional assay | CRISPR-validated enhancer, reporter assay, luciferase |
| T2 | [T2] | High-quality experimental data, curated | ENCODE ChIP-seq peak, SCREEN cCRE, ReMap binding site |
| T3 | [T3] | Computational prediction, motif match | JASPAR motif score, RegulomeDB score, Ensembl regulatory prediction |
| T4 | [T4] | Association, text-mined, low confidence | Literature mention, low-score motif match, inferred regulation |
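One way to keep grading and citation consistent is to attach the tier and the source in a single step. This is a sketch only: the `tag_finding` helper is hypothetical, the default tiers are the ones this document assigns to each database, and falling back to T4 for an unlisted source is an assumption. RegulomeDB is left out because its tier depends on the score.

```python
# Default tier per data source, following the grading used in this document.
SOURCE_TIER = {
    "SCREEN": "T2",
    "ReMap": "T2",
    "ENCODE": "T2",
    "4DN": "T2",
    "JASPAR": "T3",
    "Ensembl Regulatory Build": "T3",
}


def tag_finding(text, source, experiment_id=None):
    """Append an evidence tier and inline source attribution to a finding."""
    # Assumption: sources not listed above are treated as low confidence (T4).
    tier = SOURCE_TIER.get(source, "T4")
    citation = f"{source}: {experiment_id}" if experiment_id else source
    return f"{text} [{tier}] ({citation})"
```

For example, `tag_finding("H3K27ac peak at promoter", "ENCODE", "ENCSR000AKP")` yields a line carrying both the `[T2]` grade and the experiment accession.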
Core Strategy: 7 Research Dimensions
Gene / Region / Variant Query
|
+-- PHASE 0: Gene/Region Resolution (ALWAYS FIRST)
| +-- Resolve gene symbol -> Ensembl ID, coordinates, aliases
| +-- Define genomic region of interest (+/- 500kb flanking)
|
+-- PHASE 1: Cis-Regulatory Elements (SCREEN)
| +-- Candidate enhancers, promoters, insulators
| +-- cCRE activity by cell type
| +-- CTCF binding sites
|
+-- PHASE 2: Transcription Factor Binding
| +-- JASPAR: TF binding motifs and PWMs
| +-- ReMap: ChIP-seq validated TF binding sites
| +-- ENCODE: TF ChIP-seq experiments
|
+-- PHASE 3: Regulatory Variant Scoring
| +-- RegulomeDB: Variant regulatory evidence score
| +-- Functional annotations from multiple data types
|
+-- PHASE 4: ENCODE Functional Genomics
| +-- Histone modification ChIP-seq
| +-- ATAC-seq / DNase-seq accessibility
| +-- RNA-seq expression context
| +-- Available experiments and datasets
|
+-- PHASE 5: Chromatin Conformation (4D Nucleome)
| +-- Hi-C contact maps
| +-- TAD boundaries
| +-- Chromatin loops and compartments
|
+-- PHASE 6: Ensembl Regulatory Annotation
| +-- Regulatory build features
| +-- Promoter/enhancer/CTCF site annotations
| +-- Activity states across cell types
|
+-- SYNTHESIS: Integrated Regulatory Model
+-- Aggregate regulatory evidence
+-- Build gene regulation model
+-- Identify key regulatory elements and TFs
+-- Data gaps and experimental recommendations
Phase 0: Gene/Region Resolution (ALWAYS FIRST)
CRITICAL: Resolve gene identity and genomic coordinates before any analysis.
Input Types Handled
| Input Format | Resolution Strategy |
|---|---|
| Gene symbol (e.g., "BRCA1") | Ensembl lookup -> coordinates, Ensembl ID |
| Genomic region (e.g., "chr17:43044295-43170245") | Use directly; identify overlapping genes |
| Ensembl ID (e.g., "ENSG00000012048") | Ensembl lookup -> symbol, coordinates |
| rsID (e.g., "rs12345") | RegulomeDB/Ensembl -> coordinates, nearby genes |
Resolution Tools
| Tool | Purpose | Parameters |
|---|---|---|
|  | Gene symbol to Ensembl ID + coordinates | : str, : str |
|  | Official gene symbol, aliases | : str |
|  | Cross-references to external databases | : str |
Disambiguation Output
markdown
## Gene Identity
| Property | Value |
|----------|-------|
| **Gene Symbol** | TP53 |
| **Ensembl ID** | ENSG00000141510 |
| **Chromosome** | 17 |
| **Start** | 7661779 |
| **End** | 7687550 |
| **Strand** | - |
| **Region of Interest** | 17:7161779-8187550 (+/- 500kb) |
| **Aliases** | p53, TRP53, LFS1 |
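The region of interest in the table above is the gene interval padded by 500 kb on each side. A minimal sketch of that arithmetic follows; the `region_of_interest` helper is hypothetical, and it clamps only at position 1 because chromosome lengths are not assumed to be available.

```python
def region_of_interest(chrom, start, end, flank=500_000):
    """Pad a gene interval by `flank` bp on each side.

    The start is clamped at 1; the end is not clamped because this sketch
    does not know the chromosome length.
    """
    roi_start = max(1, start - flank)
    roi_end = end + flank
    return f"{chrom}:{roi_start}-{roi_end}"


# TP53 coordinates from the example table above.
print(region_of_interest("17", 7661779, 7687550))
```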
Phase 1: Cis-Regulatory Elements (SCREEN)
When: Gene name or genomic region available
Objective: Catalog candidate cis-regulatory elements (cCREs) from the ENCODE SCREEN database
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| SCREEN_get_regulatory_elements | Get cCREs for a gene | gene_name: str, element_type: str, limit: int |
Workflow
- Query enhancers:
SCREEN_get_regulatory_elements(gene_name=gene, element_type="enhancer", limit=20)
- Query promoters:
SCREEN_get_regulatory_elements(gene_name=gene, element_type="promoter", limit=20)
- Query insulators:
SCREEN_get_regulatory_elements(gene_name=gene, element_type="insulator", limit=10)
- For each element: extract coordinates, activity scores, cell type specificity
Decision Logic
- Multiple element types: Always query enhancers AND promoters (insulators optional)
- Empty results: Some genes have fewer regulatory elements; note counts
- Cell type specificity: SCREEN data is cell-type annotated; report top active cell types
- All findings graded [T2]: SCREEN cCREs are experimentally derived from ENCODE data
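The query plan above (enhancers and promoters always, insulators optionally) can be sketched as a small loop. The tool is passed in as a callable because how `SCREEN_get_regulatory_elements` is exposed depends on the runtime; `collect_ccres` is a hypothetical helper, and the responses are returned unparsed since their structure is not assumed here.

```python
def collect_ccres(screen_tool, gene, include_insulators=False):
    """Query SCREEN once per element type and return raw responses by type.

    `screen_tool` is any callable accepting gene_name, element_type and limit
    keyword arguments, e.g. a wrapper around SCREEN_get_regulatory_elements.
    """
    plan = [("enhancer", 20), ("promoter", 20)]
    if include_insulators:
        plan.append(("insulator", 10))

    results = {}
    for element_type, limit in plan:
        results[element_type] = screen_tool(
            gene_name=gene, element_type=element_type, limit=limit
        )
    return results
```

Keeping empty responses in the returned dict, rather than dropping them, makes it straightforward to report "0 insulators found" as a documented negative result.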
Output Format
markdown
### Cis-Regulatory Elements (SCREEN) [T2]
#### Enhancers (15 found)
| Accession | Coordinates | Activity Score | Top Cell Types |
|-----------|-------------|---------------|----------------|
| EH38E1234567 | chr17:7650000-7651000 | 0.95 | HepG2, K562 |
| ... | ... | ... | ... |
#### Promoters (3 found)
| Accession | Coordinates | Activity Score | Top Cell Types |
|-----------|-------------|---------------|----------------|
| EH38E9876543 | chr17:7687000-7688000 | 0.99 | Ubiquitous |
| ... | ... | ... | ... |
#### Insulators (2 found)
| Accession | Coordinates | CTCF Bound |
|-----------|-------------|-------------|
| EH38E5555555 | chr17:7700000-7701000 | Yes |
Phase 2: Transcription Factor Binding
When: Gene symbol available
Objective: Identify transcription factors that regulate the gene through motif analysis and ChIP-seq binding data
Tools Used
JASPAR - TF Binding Motifs
| Tool | Function | Parameters |
|---|---|---|
| jaspar_search_matrices | Search TF binding motifs | : str, : str, : str, : str |
|  | Get PWM for specific TF | : str |
| JASPAR_get_transcription_factors | List TFs in collection | : str, : int, : int |
ReMap - Validated TF Binding Sites
| Tool | Function | Parameters |
|---|---|---|
| ReMap_get_transcription_factor_binding | Get TF binding sites near gene | gene_name: str, cell_type: str, limit: int |
ENCODE - ChIP-seq Experiments
| Tool | Function | Parameters |
|---|---|---|
| ENCODE_search_experiments | Search TF ChIP-seq experiments | assay_title: str, target: str, organism: str, limit: int |
Workflow
- JASPAR motif search: Search for known TF binding motifs
jaspar_search_matrices(search=gene_symbol, collection="CORE", species="9606")
- If gene IS a TF: get its PWM binding motif
- If gene is NOT a TF: identify TFs known to bind its promoter
- ReMap binding data: Get experimentally validated TF binding sites
ReMap_get_transcription_factor_binding(gene_name=gene, cell_type="HepG2", limit=20)
- Try multiple cell types: "HepG2", "K562", "MCF-7", "GM12878"
- ENCODE ChIP-seq: Find available ChIP-seq experiments for key TFs
ENCODE_search_experiments(assay_title="ChIP-seq", target=top_tf, organism="Homo sapiens", limit=5)
Decision Logic
- Gene is a TF: Show its binding motif (JASPAR PWM) + target genes + ENCODE ChIP-seq experiments
- Gene is NOT a TF: Show TFs that bind its promoter/enhancers (ReMap) + relevant motifs
- Multiple cell types for ReMap: Query at least 2-3 common cell types
- JASPAR grades [T3]: Motif predictions are computational
- ReMap grades [T2]: Based on experimental ChIP-seq data
- ENCODE grades [T2]: Direct experimental data
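Because ReMap coverage differs by cell type, the multi-cell-type rule above amounts to repeating the same query per cell type and keeping the results separate. A sketch, with the tool injected as a callable and `remap_by_cell_type` as a hypothetical helper name; the response structure is not assumed, so raw responses are stored as returned.

```python
# Cell types suggested in the workflow above.
DEFAULT_CELL_TYPES = ["HepG2", "K562", "MCF-7", "GM12878"]


def remap_by_cell_type(remap_tool, gene, cell_types=None, limit=20):
    """Run the ReMap query once per cell type and key results by cell type.

    `remap_tool` is any callable accepting gene_name, cell_type and limit
    keyword arguments, e.g. a wrapper around
    ReMap_get_transcription_factor_binding.
    """
    by_cell_type = {}
    for cell_type in cell_types or DEFAULT_CELL_TYPES:
        by_cell_type[cell_type] = remap_tool(
            gene_name=gene, cell_type=cell_type, limit=limit
        )
    return by_cell_type
```

Keying by cell type keeps the cell-type context attached to every binding record when the report table is written.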
Output Format
markdown
### Transcription Factor Binding
#### JASPAR Binding Motifs [T3]
| Matrix ID | TF Name | Score | Consensus Motif |
|-----------|---------|-------|---------------|
| MA0106.3 | TP53 | 0.92 | RRRCWWGYYY |
| ... | ... | ... | ... |
#### ReMap ChIP-seq Validated Binding [T2]
| Transcription Factor | Cell Type | Score | Coordinates |
|---------------------|-----------|--------------|-------------|
| SP1 | HepG2 | 850 | chr17:7687200-7687500 |
| CTCF | K562 | 920 | chr17:7700100-7700400 |
| ... | ... | ... | ... |
#### ENCODE ChIP-seq Experiments Available [T2]
| Accession | Target | Cell Type | Files | Status |
|-----------|--------|-----------|-------|--------|
| ENCSR000BNT | TP53 | HepG2 | 12 | released |
| ... | ... | ... | ... | ... |
Phase 3: Regulatory Variant Scoring
When: rsID or variant provided, OR gene has known regulatory variants
Objective: Assess the regulatory impact of genetic variants in the region
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| RegulomeDB_query_variant | Get regulatory evidence score for variant | rsid: str |
Workflow
- If rsID provided: Query RegulomeDB directly
RegulomeDB_query_variant(rsid=rsid)
- Parse RegulomeDB score (1a-7): lower = more regulatory evidence
- Extract supporting evidence types (eQTL, TF binding, chromatin state, etc.)
- Cross-reference with SCREEN and ENCODE data from other phases
RegulomeDB Score Interpretation
| Score | Meaning | Evidence Level |
|---|---|---|
| 1a | eQTL + TF binding + DNase + motif | Very likely regulatory [T2] |
| 1b | eQTL + TF binding + DNase | Likely regulatory [T2] |
| 1c | eQTL + TF binding + motif | Likely regulatory [T2] |
| 1d | eQTL + TF binding | Likely regulatory [T2] |
| 1e | eQTL + DNase | Likely regulatory [T3] |
| 1f | eQTL only | Possible regulatory [T3] |
| 2a-2c | TF binding + DNase/motif | Likely affects TF binding [T3] |
| 3a-3b | DNase or ChIP-seq evidence | Some evidence [T3] |
| 4-7 | Minimal or no evidence | Limited evidence [T4] |
Decision Logic
- Score 1a-1d: Flag as likely functional regulatory variant; high confidence
- Score 2a-3b: Moderate evidence; recommend experimental validation
- Score 4-7: Little regulatory evidence; a regulatory effect is unlikely on current data but not excluded
- No rsID provided: Skip this phase gracefully; note "no variant specified"
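The score interpretation table can be encoded as a lookup so that every variant receives the same label and tier. This sketch transcribes the table above directly; `interpret_regulomedb` is a hypothetical helper and assumes the score has already been extracted from the RegulomeDB response as a string.

```python
# (interpretation, evidence tier) per RegulomeDB score, copied from the table.
_SCORE_TABLE = {
    "1a": ("Very likely regulatory", "T2"),
    "1b": ("Likely regulatory", "T2"),
    "1c": ("Likely regulatory", "T2"),
    "1d": ("Likely regulatory", "T2"),
    "1e": ("Likely regulatory", "T3"),
    "1f": ("Possible regulatory", "T3"),
    "2a": ("Likely affects TF binding", "T3"),
    "2b": ("Likely affects TF binding", "T3"),
    "2c": ("Likely affects TF binding", "T3"),
    "3a": ("Some evidence", "T3"),
    "3b": ("Some evidence", "T3"),
    "4": ("Limited evidence", "T4"),
    "5": ("Limited evidence", "T4"),
    "6": ("Limited evidence", "T4"),
    "7": ("Limited evidence", "T4"),
}


def interpret_regulomedb(score):
    """Map a RegulomeDB score string to (interpretation, tier)."""
    key = str(score).strip().lower()
    if key not in _SCORE_TABLE:
        raise ValueError(f"Unrecognized RegulomeDB score: {score!r}")
    return _SCORE_TABLE[key]
```

Raising on an unrecognized score, rather than defaulting to T4, keeps a parsing problem from being reported as a low-evidence variant.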
Output Format
markdown
### Regulatory Variant Impact [T2/T3]
| Variant | RegulomeDB Score | Interpretation | Evidence Types |
|---------|-----------------|---------------|----------------|
| rs12345 | 1b | Likely regulatory | eQTL, TF binding, DNase |
| rs67890 | 3a | Some evidence | DNase peak |
Phase 4: ENCODE Functional Genomics
When: Gene or region available
Objective: Discover functional genomics experiments and datasets from ENCODE
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| ENCODE_search_experiments | Search experiments by assay/target | assay_title, target, organism, status, limit |
|  | Get detailed experiment metadata | : str |
|  | List available data files | , , |
|  | Search available cell types | , , , |
Workflow
- Histone marks: Search for H3K4me3 (promoter), H3K27ac (enhancer), H3K4me1 (enhancer), H3K27me3 (repressive)
ENCODE_search_experiments(assay_title="ChIP-seq", target="H3K27ac", organism="Homo sapiens", limit=5)
- Chromatin accessibility: Search ATAC-seq and DNase-seq
ENCODE_search_experiments(assay_title="ATAC-seq", organism="Homo sapiens", limit=5)
- If gene is a TF: Search for ChIP-seq of that TF
ENCODE_search_experiments(assay_title="ChIP-seq", target=gene, organism="Homo sapiens", limit=5)
- RNA-seq context: Search for expression experiments
ENCODE_search_experiments(assay_title="RNA-seq", organism="Homo sapiens", limit=5)
Decision Logic
- Prioritize by relevance: Histone marks and accessibility most informative for regulatory analysis
- Cell type matching: When possible, focus on cell types relevant to user's question
- Experiment quality: Prefer "released" status and recent experiments
- Data volume: ENCODE has thousands of experiments; limit results and highlight most relevant
- All ENCODE data graded [T2]: High-quality experimental data
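The Phase 4 searches follow a fixed pattern, so the keyword arguments can be generated up front and then passed to `ENCODE_search_experiments` one by one. The sketch below builds only the argument dicts; `encode_query_plan` is a hypothetical helper, and the parameter names are those used in the workflow calls above.

```python
# Histone marks and the element class each one indicates (from the workflow).
HISTONE_MARKS = {
    "H3K4me3": "promoter",
    "H3K27ac": "enhancer",
    "H3K4me1": "enhancer",
    "H3K27me3": "repressive",
}


def encode_query_plan(gene=None, gene_is_tf=False, limit=5):
    """Return a list of keyword-argument dicts, one per ENCODE search."""
    base = {"organism": "Homo sapiens", "limit": limit}

    # Histone modification ChIP-seq, one search per mark.
    plan = [
        {"assay_title": "ChIP-seq", "target": mark, **base}
        for mark in HISTONE_MARKS
    ]
    # Chromatin accessibility assays.
    plan += [{"assay_title": assay, **base} for assay in ("ATAC-seq", "DNase-seq")]
    # TF ChIP-seq only applies when the query gene is itself a TF.
    if gene and gene_is_tf:
        plan.append({"assay_title": "ChIP-seq", "target": gene, **base})
    # Expression context.
    plan.append({"assay_title": "RNA-seq", **base})
    return plan
```

Each dict can be expanded into a call, for example `ENCODE_search_experiments(**query)`, assuming the tool is callable that way in the runtime.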
Output Format
markdown
### ENCODE Functional Genomics [T2]
#### Histone Modification Experiments
| Accession | Mark | Cell Type | Status | Files |
|-----------|------|-----------|--------|-------|
| ENCSR000AKP | H3K27ac | HepG2 | released | 8 |
| ENCSR000ALA | H3K4me3 | K562 | released | 6 |
#### Chromatin Accessibility
| Accession | Assay | Cell Type | Status |
|-----------|-------|-----------|--------|
| ENCSR889WQX | ATAC-seq | GM12878 | released |
#### TF ChIP-seq (for [gene] if TF)
| Accession | Target | Cell Type | Status |
|-----------|--------|-----------|--------|
| ENCSR000BNT | TP53 | HepG2 | released |
Phase 5: Chromatin Conformation (4D Nucleome)
When: Gene or region available
Objective: Explore 3D genome organization data from the 4D Nucleome project
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| FourDN_search_data | Search Hi-C data | operation: "search_data", , , |
| FourDN_get_experiment_metadata | Get experiment details | operation: "get_experiment_metadata", : str |
Workflow
- Search Hi-C experiments:
FourDN_search_data(operation="search_data", assay_title="Hi-C", limit=10)
- Search Micro-C data:
FourDN_search_data(operation="search_data", assay_title="Micro-C", limit=5)
- For relevant experiments: get metadata for top results
- Note available cell types and data types
Decision Logic
- IMPORTANT: 4DN tools require the operation parameter on every call - these are SOAP-style tools
- Hi-C vs Micro-C: Micro-C has higher resolution for local interactions
- Cell type matching: Note which cell types have chromatin data
- Data availability: 4DN may not cover all cell types of interest
- Grade [T2]: High-quality experimental chromatin conformation data
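Since a missing operation value is the most likely way for a 4DN call to fail, it can help to build the arguments through a small guard. This is a sketch under the assumption that each tool accepts exactly the operation string listed for it in this document; `fourdn_kwargs` is a hypothetical helper.

```python
# Required operation value for each 4DN tool named in this document.
FOURDN_OPERATIONS = {
    "FourDN_search_data": "search_data",
    "FourDN_get_experiment_metadata": "get_experiment_metadata",
    "FourDN_get_file_metadata": "get_file_metadata",
    "FourDN_get_download_url": "get_download_url",
}


def fourdn_kwargs(tool_name, **params):
    """Return keyword arguments for a 4DN tool with `operation` guaranteed."""
    if tool_name not in FOURDN_OPERATIONS:
        raise ValueError(f"Unknown 4DN tool: {tool_name}")
    expected = FOURDN_OPERATIONS[tool_name]

    # Reject a caller-supplied operation that does not match the tool.
    supplied = params.get("operation", expected)
    if supplied != expected:
        raise ValueError(
            f"{tool_name} requires operation={expected!r}, got {supplied!r}"
        )

    rest = {k: v for k, v in params.items() if k != "operation"}
    return {"operation": expected, **rest}
```

For instance, `fourdn_kwargs("FourDN_search_data", assay_title="Hi-C", limit=10)` returns the same arguments as the Hi-C search in the workflow above, with `operation` filled in.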
Output Format
markdown
### Chromatin Conformation (4D Nucleome) [T2]
#### Available Hi-C Datasets
| Accession | Cell Type | Assay | Resolution | Status |
|-----------|-----------|-------|-----------|--------|
| 4DNESXXXXXXX | H1-hESC | Hi-C | 10kb | released |
| 4DNESYYYYYYY | GM12878 | Micro-C | 1kb | released |
#### Chromatin Organization Context
- **TAD**: Gene located within TAD spanning chr17:7.1-8.2Mb
- **Compartment**: A compartment (active)
- **Nearby CTCF sites**: 3 CTCF sites within 100kb (from SCREEN Phase 1)
Phase 6: Ensembl Regulatory Annotation
When: Genomic region coordinates available
Objective: Get regulatory feature annotations from the Ensembl Regulatory Build
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| ensembl_get_regulatory_features | Get regulatory features in region | region: str (chr:start-end), feature: str, species: str |
Workflow
- Get regulatory features:
ensembl_get_regulatory_features(region="17:7661779-7687550", feature="regulatory", species="human")
- Parse feature types: promoter, enhancer, CTCF_binding_site, TF_binding_site, open_chromatin_region
- Note activity states across cell types when available
Decision Logic
- Region format: Use chromosome:start-end without "chr" prefix
- Feature parameter: Must be "regulatory" for this endpoint
- Cross-reference with SCREEN: Compare Ensembl regulatory build with SCREEN cCREs
- Grade [T3]: Ensembl regulatory build is computationally derived
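Coordinates often arrive with a "chr" prefix (for example from SCREEN output or user input), so it is worth normalizing them before the Ensembl call. A minimal sketch; `to_ensembl_region` is a hypothetical helper, and stripping thousands separators is a convenience beyond what this document requires.

```python
import re

# Optional "chr" prefix, chromosome name, then start-end with optional commas.
_REGION = re.compile(
    r"^(?:chr)?([0-9A-Za-z_.]+):(\d[\d,]*)-(\d[\d,]*)$", re.IGNORECASE
)


def to_ensembl_region(region):
    """Convert 'chr17:1-100' or '17:1-100' to the Ensembl form '17:1-100'."""
    match = _REGION.match(region.strip())
    if match is None:
        raise ValueError(f"Cannot parse region: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"Region start exceeds end: {region!r}")
    return f"{chrom}:{start}-{end}"
```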
Output Format
markdown
### Ensembl Regulatory Build [T3]
| Feature ID | Type | Coordinates | Activity State |
|-----------|------|-------------|----------------|
| ENSR00000123456 | Promoter | 17:7687200-7688000 | Active (most cell types) |
| ENSR00000789012 | Enhancer | 17:7650000-7651500 | Active (liver, lung) |
| ENSR00000345678 | CTCF_binding_site | 17:7700000-7700500 | Active |
Synthesis: Integrated Regulatory Model (MANDATORY)
Always the final section. Integrates all evidence into a coherent regulatory model.
Synthesis Template
markdown
## Integrated Regulatory Model
### Regulatory Architecture Summary
**Gene**: [GENE] ([Ensembl ID])
**Region analyzed**: [coordinates] ([size]kb)
### Key Regulatory Elements
1. **Proximal promoter** [T2/T3]: Located at [coords], active in [cell types]
- TFs binding: SP1, CTCF, [others from ReMap]
- Histone marks: H3K4me3 (ENCODE), H3K27ac (ENCODE)
- SCREEN cCRE: [element ID]
2. **Distal enhancer 1** [T2]: Located at [coords], [distance] from TSS
- Active in [cell types] (SCREEN)
- TF binding: [TFs from ReMap/ENCODE]
- Hi-C contact with promoter: [Yes/No/Unknown]
3. **CTCF insulator** [T2]: Located at [coords]
- Defines TAD boundary
- CTCF motif score: [from JASPAR]
### Transcription Factor Regulatory Network
| TF | Binding Evidence | Motif Score | Cell Types | Role |
|----|-----------------|-------------|-----------|------|
| SP1 | ReMap ChIP-seq [T2] | JASPAR 0.92 [T3] | HepG2, K562 | Activator |
| CTCF | ENCODE ChIP-seq [T2] | JASPAR 0.98 [T3] | Ubiquitous | Insulator |
### Regulatory Variants (if applicable)
| Variant | RegulomeDB Score | Predicted Effect | Affected Element |
|---------|-----------------|-------------------|-----------------|
| rs12345 | 1b | Disrupts SP1 binding | Proximal promoter |
### Evidence Quality Assessment
| Dimension | Data Found | Evidence Tier | Confidence |
|-----------|---------------|--------------|------------|
| cCREs (SCREEN) | 15 enhancers, 3 promoters | [T2] | High |
| TF Binding (ReMap) | 8 TFs validated | [T2] | High |
| Motifs (JASPAR) | 12 motif matches | [T3] | Medium |
| ENCODE experiments | 25 relevant datasets | [T2] | High |
| Chromatin (4DN) | Hi-C in 3 cell types | [T2] | Medium |
| Regulatory Build | 5 features annotated | [T3] | Medium |
### Data Gaps
- [ ] No single-cell ATAC-seq data available for this region
- [ ] Chromatin conformation data limited to 3 cell types
- [ ] No CRISPR-validated enhancers (would be needed for [T1])
- [ ] Regulatory variant impact is predictive (needs experimental validation)
### Experimental Recommendations
1. **Validate key enhancers**: CRISPR deletion or reporter assays for top 3 enhancers
2. **Confirm TF binding**: ChIP-qPCR for SP1, CTCF at predicted sites
3. **Test regulatory variants**: Allele-specific reporter assays for rs12345
Mandatory Completeness Checklist
Before finalizing any report, verify:
Tool Parameter Reference
Critical Parameter Notes (verified from source code):
| Tool | Parameter Name | Type | Notes |
|---|---|---|---|
| SCREEN_get_regulatory_elements | gene_name, element_type, limit | str, str, int | element_type: "enhancer", "promoter", "insulator" |
| ReMap_get_transcription_factor_binding | gene_name, cell_type, limit | str, str, int | cell_type default: "HepG2" |
| RegulomeDB_query_variant | rsid | str | rsID format (e.g., "rs12345") |
| jaspar_search_matrices | , , , , | str (all optional) | species="9606" for human |
|  |  | str | JASPAR matrix ID (e.g., "MA0106.3") |
| JASPAR_get_transcription_factors | , , | str, int, int | collection="CORE" default |
| ENCODE_search_experiments | assay_title, target, organism, status, limit | str (all optional) | status="released" default |
|  |  | str | ENCODE accession (e.g., "ENCSR000BNT") |
|  | , , | str, str, int | All optional |
|  | , , , | str (all optional) | |
| FourDN_search_data | operation, , , , , | operation REQUIRED | operation="search_data" |
| FourDN_get_experiment_metadata | operation, | operation REQUIRED | operation="get_experiment_metadata" |
| ensembl_get_regulatory_features | region, feature, species | str, str, str | feature="regulatory", region="17:start-end" |
CRITICAL: SOAP-style Tools
The following tools require an operation parameter:
- FourDN_search_data: operation="search_data"
- FourDN_get_experiment_metadata: operation="get_experiment_metadata"
- FourDN_get_file_metadata: operation="get_file_metadata"
- FourDN_get_download_url: operation="get_download_url"
Response Format Notes (verified from testing)
- SCREEN: Returns dict with , , , , keys (JSON-LD format)
- ReMap: Returns dict with TF binding records
- RegulomeDB: Returns with regulatory score and evidence in
- JASPAR search: Returns {count, next, previous, results} with matrix objects in results
- JASPAR get_matrix: Returns dict with matrix details (name, PFM, sequence logo)
- ENCODE: Returns dict with experiment/file objects (structure varies by endpoint)
- 4DN: Returns dict with search results
- Ensembl: Returns {status, data, url, content_type} with regulatory features in data
Fallback Strategies
Regulatory Elements
- Primary: SCREEN cCREs by gene name
- Fallback: Ensembl Regulatory Build by coordinates
- If both empty: Note "limited regulatory annotation in this region"
TF Binding
- Primary: ReMap binding sites + JASPAR motifs
- Fallback: ENCODE ChIP-seq experiments
- If all empty: Gene may have limited TF binding data; note and continue
Chromatin Data
- Primary: 4DN Hi-C experiments
- Fallback: ENCODE Hi-C experiments
- If empty: Note "no chromatin conformation data available for this region"
Variant Scoring
- Primary: RegulomeDB for rsID
- Fallback: SCREEN + ENCODE overlap analysis at variant position
- If no variant: Skip gracefully
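Each fallback chain above has the same shape: try the primary source, fall through to the next one on an empty result, and record an explicit note if nothing returns data. A sketch of that pattern follows; `first_non_empty` is a hypothetical helper, and treating any falsy result (empty list, empty dict, None) as "empty" is an assumption about how the tools signal no data.

```python
def first_non_empty(sources, empty_note):
    """Try each (label, fetch) pair in order and return the first with data.

    `sources` is a list of (label, zero-argument callable). Returns
    (label, result) for the first non-empty result, or (None, empty_note)
    when every source comes back empty.
    """
    for label, fetch in sources:
        result = fetch()
        if result:
            return label, result
    return None, empty_note
```

A usage sketch for the regulatory elements chain, with placeholder lambdas standing in for the real tool calls:

```python
label, data = first_non_empty(
    [
        ("SCREEN", lambda: []),                    # primary returned nothing
        ("Ensembl Regulatory Build", lambda: ["ENSR00000123456"]),
    ],
    empty_note="limited regulatory annotation in this region",
)
```

Returning the label alongside the data keeps the source attribution correct when a fallback was used.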
Common Use Patterns
Pattern 1: Gene-Centric Regulatory Landscape
Input: Gene symbol (e.g., "TP53")
Workflow: All phases (0-6 + Synthesis)
Output: Complete regulatory atlas for the gene locus
Pattern 2: Transcription Factor Target Analysis
Input: TF name (e.g., "CTCF")
Workflow: Phase 0 -> Phase 2 (JASPAR motif + ENCODE ChIP-seq) -> Phase 1 (target gene cCREs)
Output: TF binding motif, genome-wide binding data, target gene catalog
Pattern 3: Non-Coding Variant Interpretation
Input: rsID (e.g., "rs6983267")
Workflow: Phase 0 -> Phase 3 (RegulomeDB) -> Phase 1 (nearby cCREs) -> Phase 2 (TF binding) -> Synthesis
Output: Regulatory impact assessment with functional context
Pattern 4: Cell-Type Specific Regulation
Input: Gene + cell type (e.g., "MYC in HepG2")
Workflow: Phase 0 -> Phase 1 (SCREEN) -> Phase 2 (ReMap in HepG2) -> Phase 4 (ENCODE in HepG2)
Output: Cell-type specific regulatory landscape
Pattern 5: Epigenetic Data Discovery
Input: Histone mark or assay type (e.g., "H3K27ac ChIP-seq in liver")
Workflow: Phase 4 (ENCODE search) -> Phase 5 (4DN chromatin) -> Summary
Output: Available datasets and download information
Limitations & Known Issues
Database-Specific
- SCREEN: Limited to ENCODE-defined cCREs; may miss tissue-specific regulatory elements
- JASPAR: Motif predictions have a false-positive rate; a motif match or predicted binding does not imply function
- ReMap: Coverage varies by TF and cell type; ~1000 TFs covered
- RegulomeDB: Scoring based on available data; novel variants may lack evidence
- ENCODE: Primarily human and mouse; limited other organisms
- 4DN: Focused on chromatin conformation; limited cell type coverage
- Ensembl: Regulatory build is computationally predicted; may miss novel elements
Analysis
- Cell-type specificity: Regulatory elements are highly cell-type specific; data from one cell type may not generalize
- Functional validation gap: Most findings are [T2]-[T3]; [T1] validation requires experimental follow-up
- Non-coding complexity: Regulatory mechanisms are complex; catalog does not capture all interactions
- 3D genome: TAD and loop data available for limited cell types
Technical
- 4DN operation parameter: Must include operation for all 4DN tools (SOAP-style)
- Region format: Ensembl uses "17:start-end" (no "chr" prefix); SCREEN/ENCODE may use "chr17:start-end"
- Large gene loci: Genes spanning >1Mb may require multiple queries
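For large loci, one option is to split the region into consecutive windows and query each window separately. The sketch below uses a 1 Mb window to mirror the threshold mentioned above; that window size and the `split_region` helper are assumptions, not a documented limit of any tool.

```python
def split_region(chrom, start, end, max_span=1_000_000):
    """Split an interval into consecutive, non-overlapping windows.

    Coordinates are 1-based and inclusive; every window spans at most
    `max_span` bp and the windows together cover start..end exactly.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(chunk_start + max_span - 1, end)
        chunks.append(f"{chrom}:{chunk_start}-{chunk_end}")
        chunk_start = chunk_end + 1
    return chunks
```

Features that straddle a window boundary may be returned by two adjacent queries, so results should be de-duplicated by feature ID after merging.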
Summary
The Epigenomics & Gene Regulation skill provides comprehensive regulatory landscape analysis by integrating:
- Cis-regulatory elements (SCREEN) - Enhancers, promoters, insulators from ENCODE cCRE catalog
- Transcription factor binding (JASPAR + ReMap + ENCODE) - Motifs, validated binding sites, ChIP-seq data
- Regulatory variant scoring (RegulomeDB) - Evidence-based variant regulatory impact
- Functional genomics (ENCODE) - Histone marks, chromatin accessibility, expression
- Chromatin conformation (4D Nucleome) - Hi-C, TADs, chromatin loops
- Regulatory annotation (Ensembl) - Computational regulatory build features
Outputs: Structured markdown report with regulatory element catalog, TF network, variant scoring, and integrated regulatory model
Best for: Gene regulation analysis, non-coding variant interpretation, enhancer/promoter identification, TF binding profiling, epigenetic data discovery
Total tools integrated: 21 tools across 7 databases