depmap

DepMap — Cancer Dependency Map

Overview

The Cancer Dependency Map (DepMap) project, run by the Broad Institute, systematically characterizes genetic dependencies across hundreds of cancer cell lines using genome-wide CRISPR knockout screens (DepMap CRISPR), RNA interference (RNAi), and compound sensitivity assays (PRISM). DepMap data is essential for:
  • Identifying which genes are essential for specific cancer types
  • Finding cancer-selective dependencies (therapeutic targets)
  • Validating oncology drug targets
  • Discovering synthetic lethal interactions
Key resources:

When to Use This Skill

Use DepMap when:
  • Target validation: Is a gene essential for survival in cancer cell lines with a specific mutation (e.g., KRAS-mutant)?
  • Biomarker discovery: What genomic features predict sensitivity to knockout of a gene?
  • Synthetic lethality: Find genes that are selectively essential when another gene is mutated/deleted
  • Drug sensitivity: What cell line features predict response to a compound?
  • Pan-cancer essentiality: Is a gene broadly essential across all cancer types (bad target) or selectively essential?
  • Correlation analysis: Which pairs of genes have correlated dependency profiles (co-essentiality)?

Core Concepts

Dependency Scores

| Score | Range | Meaning |
|---|---|---|
| Chronos (CRISPR) | ~ −3 to 0+ | More negative = more essential. Common essential threshold: −1. Pan-essential genes ~ −1 to −2 |
| RNAi DEMETER2 | ~ −3 to 0+ | Similar scale to Chronos |
| Gene Effect | normalized | Normalized Chronos; −1 = median effect of common essential genes |

Key thresholds:
  • Chronos ≤ −0.5: likely dependent
  • Chronos ≤ −1: strongly dependent (common essential range)
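
The thresholds above can be applied directly to a column of gene effect scores. The following is a minimal sketch, not part of any DepMap tooling: the function name `classify_dependency`, the label strings, and the cell line IDs in the example are made up for illustration.

```python
import pandas as pd

def classify_dependency(scores, likely=-0.5, strong=-1.0):
    """Label each score using the two thresholds listed above."""
    scores = pd.Series(scores, dtype=float)
    labels = pd.Series("not dependent", index=scores.index, dtype=object)
    labels[scores <= likely] = "likely dependent"
    labels[scores <= strong] = "strongly dependent"
    # Missing scores mean the gene was not measured in that line
    labels[scores.isna()] = "no data"
    return labels

# Hypothetical scores for one gene across four placeholder cell lines
example = pd.Series({"LINE_A": -1.4, "LINE_B": -0.7, "LINE_C": 0.1, "LINE_D": float("nan")})
labels = classify_dependency(example)
print(labels)
```

The cutoffs are parameters because they are rules of thumb; a borderline score such as −0.48 is better inspected than binned.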

Cell Line Annotations

Each cell line has:
  • DepMap_ID: unique identifier (e.g., ACH-000001)
  • cell_line_name: human-readable name
  • primary_disease: cancer type
  • lineage: broad tissue lineage
  • lineage_subtype: specific subtype
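
As an illustration of how these annotation fields are typically used, the sketch below selects the DepMap_IDs that belong to one lineage. The column names are taken from the list above and should be checked against the metadata file of the release you download; the helper name `lines_in_lineage` and every value in the demo table are placeholders.

```python
import pandas as pd

# Placeholder metadata with the fields described above (values are invented)
cell_info_demo = pd.DataFrame({
    "DepMap_ID": ["DEMO-1", "DEMO-2", "DEMO-3"],
    "cell_line_name": ["LINE_A", "LINE_B", "LINE_C"],
    "primary_disease": ["Lung Cancer", "Skin Cancer", "Lung Cancer"],
    "lineage": ["lung", "skin", "lung"],
    "lineage_subtype": ["NSCLC", "melanoma", "SCLC"],
})

def lines_in_lineage(cell_info, lineage):
    """Return the DepMap_IDs annotated with the given lineage (case-insensitive)."""
    mask = cell_info["lineage"].str.lower() == lineage.lower()
    return cell_info.loc[mask, "DepMap_ID"].tolist()

lung_ids = lines_in_lineage(cell_info_demo, "Lung")
print(lung_ids)
```

The returned IDs can be used to index the rows of the gene effect matrix, which is keyed by DepMap_ID.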

Core Capabilities

1. DepMap API

python
import requests
import pandas as pd

BASE_URL = "https://depmap.org/portal/api"

def depmap_get(endpoint, params=None):
    url = f"{BASE_URL}/{endpoint}"
    response = requests.get(url, params=params)
    response.raise_for_status()
    return response.json()

2. Gene Dependency Scores

python
def get_gene_dependency(gene_symbol, dataset="Chronos_Combined"):
    """Get CRISPR dependency scores for a gene across all cell lines."""
    url = f"{BASE_URL}/gene"
    params = {
        "gene_id": gene_symbol,
        "dataset": dataset
    }
    response = requests.get(url, params=params)
    return response.json()

Alternatively, use the /data endpoint:
python
def get_dependencies_slice(gene_symbol, dataset_name="CRISPRGeneEffect"):
    """Get a gene's dependency slice from a dataset."""
    url = f"{BASE_URL}/data/gene_dependency"
    params = {"gene_name": gene_symbol, "dataset_name": dataset_name}
    response = requests.get(url, params=params)
    data = response.json()
    return data

3. Download-Based Analysis (Recommended for Large Queries)

For large-scale analysis, download DepMap data files and analyze locally:
python
import pandas as pd
import requests, os

def download_depmap_data(url, output_path):
    """Download a DepMap data file."""
    response = requests.get(url, stream=True)
    with open(output_path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

python
# DepMap 24Q4 data files (update version as needed)
FILES = {
    "crispr_gene_effect": "https://figshare.com/ndownloader/files/...",
    # OR download from: https://depmap.org/portal/download/all/
    # Files available:
    # CRISPRGeneEffect.csv - Chronos gene effect scores
    # OmicsExpressionProteinCodingGenesTPMLogp1.csv - mRNA expression
    # OmicsSomaticMutationsMatrixDamaging.csv - mutation binary matrix
    # OmicsCNGene.csv - copy number
    # sample_info.csv - cell line metadata
}

def load_depmap_gene_effect(filepath="CRISPRGeneEffect.csv"):
    """
    Load DepMap CRISPR gene effect matrix.
    Rows = cell lines (DepMap_ID), Columns = genes (Symbol (EntrezID))
    """
    df = pd.read_csv(filepath, index_col=0)
    # Rename columns to gene symbols only
    df.columns = [col.split(" ")[0] for col in df.columns]
    return df

def load_cell_line_info(filepath="sample_info.csv"):
    """Load cell line metadata."""
    return pd.read_csv(filepath)

4. Identifying Selective Dependencies

python
import numpy as np
import pandas as pd

def find_selective_dependencies(gene_effect_df, cell_line_info, target_gene,
                                 cancer_type=None, threshold=-0.5):
    """Find cell lines selectively dependent on a gene."""

    # Get scores for target gene
    if target_gene not in gene_effect_df.columns:
        return None

    scores = gene_effect_df[target_gene].dropna()
    dependent = scores[scores <= threshold]

    # Add cell line info
    result = pd.DataFrame({
        "DepMap_ID": dependent.index,
        "gene_effect": dependent.values
    }).merge(cell_line_info[["DepMap_ID", "cell_line_name", "primary_disease", "lineage"]])

    if cancer_type:
        result = result[result["primary_disease"].str.contains(cancer_type, case=False, na=False)]

    return result.sort_values("gene_effect")

python
# Example usage (after loading data)
df_effect = load_depmap_gene_effect("CRISPRGeneEffect.csv")
cell_info = load_cell_line_info("sample_info.csv")
deps = find_selective_dependencies(df_effect, cell_info, "KRAS", cancer_type="Lung")

5. Biomarker Analysis (Gene Effect vs. Mutation)

python
import pandas as pd
from scipy import stats

def biomarker_analysis(gene_effect_df, mutation_df, target_gene, biomarker_gene):
    """
    Test if mutation in biomarker_gene predicts dependency on target_gene.

    Args:
        gene_effect_df: CRISPR gene effect DataFrame
        mutation_df: Binary mutation DataFrame (1 = mutated)
        target_gene: Gene to assess dependency of
        biomarker_gene: Gene whose mutation may predict dependency
    """
    if target_gene not in gene_effect_df.columns or biomarker_gene not in mutation_df.columns:
        return None

    # Align cell lines
    common_lines = gene_effect_df.index.intersection(mutation_df.index)
    scores = gene_effect_df.loc[common_lines, target_gene].dropna()
    mutations = mutation_df.loc[scores.index, biomarker_gene]

    mutated = scores[mutations == 1]
    wt = scores[mutations == 0]

    stat, pval = stats.mannwhitneyu(mutated, wt, alternative='less')

    return {
        "target_gene": target_gene,
        "biomarker_gene": biomarker_gene,
        "n_mutated": len(mutated),
        "n_wt": len(wt),
        "mean_effect_mutated": mutated.mean(),
        "mean_effect_wt": wt.mean(),
        "pval": pval,
        "significant": pval < 0.05
    }

6. Co-Essentiality Analysis

python
import pandas as pd

def co_essentiality(gene_effect_df, target_gene, top_n=20):
    """Find genes with most correlated dependency profiles (co-essential partners)."""
    if target_gene not in gene_effect_df.columns:
        return None

    target_scores = gene_effect_df[target_gene].dropna()

    correlations = {}
    for gene in gene_effect_df.columns:
        if gene == target_gene:
            continue
        other_scores = gene_effect_df[gene].dropna()
        common = target_scores.index.intersection(other_scores.index)
        if len(common) < 50:
            continue
        r = target_scores[common].corr(other_scores[common])
        if not pd.isna(r):
            correlations[gene] = r

    corr_series = pd.Series(correlations).sort_values(ascending=False)
    return corr_series.head(top_n)

Co-essential genes often share biological complexes or pathways.

Query Workflows

Workflow 1: Target Validation for a Cancer Type

  1. Download CRISPRGeneEffect.csv and sample_info.csv
  2. Filter cell lines by cancer type
  3. Compute mean gene effect for target gene in cancer vs. all others
  4. Calculate selectivity: how specific is the dependency to your cancer type?
  5. Cross-reference with mutation, expression, or CNA data as biomarkers
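
Steps 2 to 4 of this workflow can be sketched as below. This is one simple way to express selectivity (the difference between the mean gene effect inside and outside the cancer type), not a DepMap-defined metric. The function name `disease_selectivity`, the gene name, and the toy tables are assumptions made for the example.

```python
import pandas as pd

def disease_selectivity(gene_effect_df, cell_info, target_gene, cancer_type):
    """Compare mean gene effect in one cancer type against all other lines."""
    scores = gene_effect_df[target_gene].dropna()
    disease = cell_info.set_index("DepMap_ID")["primary_disease"].reindex(scores.index)
    # Substring match on the disease label; lines without metadata count as "other"
    in_group = disease.str.contains(cancer_type, case=False, na=False, regex=False)
    inside, outside = scores[in_group], scores[~in_group]
    return {
        "n_in": int(in_group.sum()),
        "n_out": int((~in_group).sum()),
        "mean_in": inside.mean(),
        "mean_out": outside.mean(),
        # Negative values mean the gene is more essential inside the cancer type
        "selectivity": inside.mean() - outside.mean(),
    }

# Toy data: rows are placeholder cell lines, one hypothetical gene column
toy_effect = pd.DataFrame(
    {"GENE_X": [-1.2, -0.9, -1.0, 0.0, -0.1, 0.1]},
    index=[f"LINE_{i}" for i in range(6)],
)
toy_info = pd.DataFrame({
    "DepMap_ID": toy_effect.index,
    "primary_disease": ["Lung Cancer"] * 3 + ["Skin Cancer"] * 3,
})
summary = disease_selectivity(toy_effect, toy_info, "GENE_X", "lung")
print(summary)
```

A difference of means ignores group size and spread, so for real data it is worth pairing it with a rank-based test and the group counts reported here.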

Workflow 2: Synthetic Lethality Screen

  1. Identify cell lines with mutation/deletion in gene of interest (e.g., BRCA1-mutant)
  2. Compute gene effect scores for all genes in mutant vs. WT lines
  3. Identify genes significantly more essential in mutant lines (synthetic lethal partners)
  4. Filter by selectivity and effect size
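
A minimal sketch of steps 2 and 3, assuming the gene effect matrix has cell lines as rows and that the mutant line IDs are already known. It runs a one-sided Mann-Whitney U test per gene and reports the mean difference as an effect size. The function name `synthetic_lethal_scan`, the `min_group` cutoff, and the simulated data are illustrative assumptions; p-values from a genome-wide scan still need multiple-testing correction.

```python
import numpy as np
import pandas as pd
from scipy import stats

def synthetic_lethal_scan(gene_effect_df, mutant_ids, min_group=3):
    """Rank genes by how much more essential they are in mutant vs. WT lines."""
    is_mut = gene_effect_df.index.isin(mutant_ids)
    rows = []
    for gene in gene_effect_df.columns:
        mut = gene_effect_df.loc[is_mut, gene].dropna()
        wt = gene_effect_df.loc[~is_mut, gene].dropna()
        if len(mut) < min_group or len(wt) < min_group:
            continue  # too few lines for a meaningful comparison
        # alternative="less": scores in mutant lines tend to be lower (more essential)
        _, pval = stats.mannwhitneyu(mut, wt, alternative="less")
        rows.append({"gene": gene, "delta": mut.mean() - wt.mean(), "pval": pval})
    out = pd.DataFrame(rows, columns=["gene", "delta", "pval"])
    return out.sort_values("pval").reset_index(drop=True)

# Simulated matrix: 20 placeholder lines, 3 hypothetical genes
rng = np.random.default_rng(0)
ids = [f"LINE_{i}" for i in range(20)]
toy = pd.DataFrame(rng.normal(0, 0.1, size=(20, 3)),
                   index=ids, columns=["GENE_A", "GENE_B", "GENE_C"])
mutants = ids[:10]
toy.loc[mutants, "GENE_A"] -= 1.0  # plant a dependency specific to mutant lines
hits = synthetic_lethal_scan(toy, mutants)
print(hits)
```

Step 4 then amounts to filtering `hits` on both columns, for example keeping genes with a corrected p-value below a chosen level and `delta` below a chosen effect size.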

Workflow 3: Compound Sensitivity Analysis

  1. Download PRISM compound sensitivity data (primary-screen-replicate-treatment-info.csv)
  2. Correlate compound AUC/log2(fold-change) with genomic features
  3. Identify predictive biomarkers for compound sensitivity
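
Step 2 can be sketched as a per-feature Pearson correlation, assuming you have already extracted one compound's sensitivity values as a Series indexed by cell line and a feature matrix (for example expression) with the same index. The function name `rank_features_by_correlation`, the `min_lines` cutoff, and the simulated data are assumptions for illustration, not part of PRISM or DepMap.

```python
import numpy as np
import pandas as pd

def rank_features_by_correlation(sensitivity, features, min_lines=10):
    """Correlate one compound's sensitivity with every feature column."""
    common = sensitivity.dropna().index.intersection(features.index)
    if len(common) < min_lines:
        return pd.Series(dtype=float)  # not enough shared lines to say anything
    corr = features.loc[common].corrwith(sensitivity.loc[common])
    # Most negative first: if lower values mean more killing (log2 fold-change),
    # these are features associated with greater sensitivity
    return corr.dropna().sort_values()

# Simulated data: 30 placeholder lines, 3 hypothetical features
rng = np.random.default_rng(1)
lines = [f"LINE_{i}" for i in range(30)]
feats = pd.DataFrame(rng.normal(size=(30, 3)), index=lines,
                     columns=["FEATURE_X", "FEATURE_Y", "FEATURE_Z"])
# Sensitivity driven by FEATURE_X plus a little noise
sens = -2.0 * feats["FEATURE_X"] + rng.normal(0, 0.1, size=30)
ranked = rank_features_by_correlation(sens, feats)
print(ranked)
```

Whether a negative or positive correlation indicates sensitivity depends on the measure used (AUC or log2 fold-change), so confirm the sign convention of the file before interpreting the ranking.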

DepMap Data Files Reference

| File | Description |
|---|---|
| CRISPRGeneEffect.csv | CRISPR Chronos gene effect (primary dependency data) |
| CRISPRGeneEffectUnscaled.csv | Unscaled CRISPR scores |
| RNAi_merged.csv | DEMETER2 RNAi dependency |
| sample_info.csv | Cell line metadata (lineage, disease, etc.) |
| OmicsExpressionProteinCodingGenesTPMLogp1.csv | mRNA expression |
| OmicsSomaticMutationsMatrixDamaging.csv | Damaging somatic mutations (binary) |
| OmicsCNGene.csv | Copy number per gene |
| PRISM_Repurposing_Primary_Screens_Data.csv | Drug sensitivity (repurposing library) |

Download all files from: https://depmap.org/portal/download/all/

Best Practices

  • Use Chronos scores for current CRISPR analyses: they are better controlled for cutting efficiency. DEMETER2 scores come from RNAi screens, so treat them as a separate data type rather than an alternative CRISPR score
  • Distinguish pan-essential from cancer-selective genes: genes that are essential in all lines (strongly negative scores with low variance across lines) are poor drug targets
  • Validate with expression data: a gene not expressed in a cell line will score as non-essential regardless of actual function
  • Use DepMap_ID for cell line identification: cell_line_name can be ambiguous
  • Account for copy number: genes in amplified regions may appear essential because Cas9 cutting at many copies triggers a DNA-damage response that reduces viability regardless of the gene's function
  • Multiple testing correction: when computing biomarker associations genome-wide, apply FDR correction
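
For the last point, the sketch below implements the Benjamini-Hochberg procedure, one common FDR method, in plain NumPy. The source does not say which FDR method to use, so this choice is an assumption, as are the function name `benjamini_hochberg` and the example p-values. It does not handle NaN inputs.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q

# Made-up p-values, e.g. from four biomarker tests
pvals = [0.001, 0.04, 0.03, 0.2]
qvals = benjamini_hochberg(pvals)
print(qvals)
```

In this example the second and third tests have raw p-values below 0.05 but adjusted values slightly above it, which is the kind of borderline association the correction is meant to flag.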

Additional Resources
