Analyzes genetic variant effects on gene expression (RNA-seq), chromatin accessibility (DNASE), histone marks (ChIP), and transcription factors using the AlphaGenome API. Use when the user asks about non-coding variant effects, pathogenicity, clinical significance, disease associations, functional effects, gene expression changes, splicing disruption, or regulatory effects in promoters and enhancers. Also use for resolving biological terms to tissue/cell-type ontologies (UBERON/CL) or analyzing variants in chr:pos:ref>alt format.
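The `chr:pos:ref>alt` variant format mentioned above can be parsed with a few lines of standard-library Python. This is a minimal sketch, not part of the AlphaGenome API: the helper `parse_variant` and the `ParsedVariant` tuple are hypothetical names, and the validation rules (a `chr` prefix, a positive position, alleles drawn from A/C/G/T/N) are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class ParsedVariant(NamedTuple):
    """Hypothetical container for the four fields of a variant string."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Assumed grammar: chr<name>:<position>:<REF>><ALT>, e.g. chr2:1234:A>C
_VARIANT_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+):([ACGTN]+)>([ACGTN]+)$")


def parse_variant(text: str) -> ParsedVariant:
    """Split a chr:pos:ref>alt string into typed fields, or raise ValueError."""
    match = _VARIANT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"Not in chr:pos:ref>alt format: {text!r}")
    chrom, pos_str, ref, alt = match.groups()
    pos = int(pos_str)
    if pos < 1:
        raise ValueError(f"Position must be positive, got {pos}")
    return ParsedVariant(chrom, pos, ref, alt)


if __name__ == "__main__":
    print(parse_variant("chr2:1234:A>C"))
```

Validating the string up front gives a clear error for malformed input before any API call is made.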
`npx skill4agent add google-deepmind/science-skills alphagenome-single-variant-analysis`

`uv`, `.env`, `ALPHAGENOME_API_KEY`, `ENV_FILE`

`printf "Enter AlphaGenome API key (typing hidden): " && read -s key && echo && echo "ALPHAGENOME_API_KEY=$key" >> "ENV_FILE" && echo "Saved."`

`dotenv`, `.env`, `cat`, `grep`, `echo`, `printenv`, `os.environ.get`, `dotenv.load_dotenv()`, `python3`, `python3 -c`, `uv run`, `pip install`, `uv`, `lookup_gene_info.py`, `ALPHAGENOME_API_KEY`, `docs/report-templates.md`

`uv run <script_name> [args...]`

`uv run --project $SKILL_DIR /tmp/my_analysis.py --arg1 val1`

> [!NOTE]
> The first invocation resolves and installs dependencies (~10s). Subsequent runs use the cached environment and start instantly. The cache lives in `~/.cache/uv/`.
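The setup material above refers to an `ALPHAGENOME_API_KEY` stored in a `.env` file and read through `os.environ.get`. The sketch below shows one way to load such a file and confirm the key is present without ever printing its value. It is an assumption-laden illustration: `load_env_file` and `has_api_key` are hypothetical helpers, and the hand-written parser is a stdlib-only stand-in for `dotenv.load_dotenv()`, not the skill's own code.

```python
import os
from pathlib import Path


def load_env_file(path: str) -> dict:
    """Read KEY=VALUE lines from a file into os.environ (hypothetical helper).

    Variables already present in the environment are left unchanged.
    Returns the pairs found in the file; returns {} if the file is missing.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


def has_api_key() -> bool:
    """Report whether the key is set, without revealing it."""
    return bool(os.environ.get("ALPHAGENOME_API_KEY"))


if __name__ == "__main__":
    load_env_file(".env")
    print("ALPHAGENOME_API_KEY set:", has_api_key())
```

Reporting only a boolean keeps the secret out of terminal output and logs.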
`tidy_scores`, `gene_name`, `gene_symbol`, `output_type`, `modality`, `df.columns`, `USH2A`, `whole_gene`, `--view detail`, `plot_components.Sashimi`, `strand`, `ontology_curie`, `track.metadata.columns`

`exec: "python": executable file not found`, `uv run`, `python`, `python3`

`.iloc`, `np.flatnonzero(mask)`, `Feature`, `Start`, `End`, `Strand`, `df.columns`, `score_variant`, `ontology_terms`, `adata.var`, `predict_variant`, `ontology_terms`, `Junction`, `prediction`, `junction_data.get_junctions_to_plot(predictions=..., name=...)`, `.k`

`exec: uv: not found`, `uv`, `UV_INDEX_URL=https://pypi.org/simple`

`scripts/visualize_variant_effects.py`

- `examples/splicing/`
- `examples/model_limitation_RNU4ATAC/`
- `examples/polyadenylation_HBA2/`
- `examples/regulatory/`
- `examples/negative_result_GATA4/`
- `examples/negative_result_TGFB3/`

`scripts/lookup_gene_info.py`, `scripts/resolve_ontology_terms.py`, `score_variant`

from alphagenome.models import dna_client
from alphagenome.models import variant_scorers
from alphagenome.data import genome
import os
import pandas as pd
# Setup API Key and Client
dna_model = dna_client.create(api_key=os.environ.get('ALPHAGENOME_API_KEY'),
                              address='dns:///gdmscience.googleapis.com:443')
# Define Variant (example)
variant_str = "chr2:1234:A>C"
chrom, pos_str, ref_alt = variant_str.split(':')
ref, alt = ref_alt.split('>')
pos = int(pos_str)
# Use supported sequence length (e.g., 2**20 for optimal performance)
SEQ_LENGTH = 2**20
interval = genome.Interval(chrom, pos - SEQ_LENGTH // 2, pos + SEQ_LENGTH // 2)
variant = genome.Variant(chrom, pos, ref, alt)
scorers = [
    variant_scorers.RECOMMENDED_VARIANT_SCORERS[m]
    for m in variant_scorers.RECOMMENDED_VARIANT_SCORERS
    if "ACTIVE" not in m and "CAGE" not in m and "PROCAP" not in m
]
print(f"Scoring variant {variant_str}...")
scores_list = dna_model.score_variant(interval=interval, variant=variant, variant_scorers=scorers)
# Process and Display Results
all_dfs = []
for score_adata in scores_list:
    df = variant_scorers.tidy_scores([score_adata], match_gene_strand=True)
    if df is not None:
        all_dfs.append(df)
if all_dfs:
    df = pd.concat(all_dfs)
    significant = df[df['quantile_score'].abs() > 0.995]
    ranked = significant.sort_values('raw_score', key=abs, ascending=False)
    print("Top Significant Hits:")
    print(ranked[['biosample_name', 'gene_name', 'output_type', 'quantile_score', 'raw_score']])

# Define keywords based on disease context
disease_keywords = ["liver", "hepatocyte"]
# Filter for any match
mask = df['biosample_name'].str.contains('|'.join(disease_keywords), case=False, na=False)
relevant_hits = df[mask].sort_values('raw_score', key=abs, ascending=False)
print(f"\n--- Extended Analysis (Keywords: {disease_keywords}) ---")
print(relevant_hits.head(20)[['biosample_name', 'output_type', 'raw_score', 'quantile_score']])

Variant Analysis Progress:
- [ ] Step 0: Review Golden Examples (MANDATORY)
- [ ] Step 1: Create Output Folder and Setup
- [ ] Step 2: Parse User Query & Research
- [ ] Step 3: Resolve Tissues & Modalities
- [ ] Step 4: Visualize & Save Plots
- [ ] Step 5: Analyze Predictions (view plots, no code). MANDATORY: Read [interpretation-guide.md](docs/interpretation-guide.md) before interpreting results.
- [ ] Step 6: Write Report, save it as `report.md` (MANDATORY)
- [ ] Step 7: Self-Critique (view `report.md` to verify links & claims)
- [ ] Step 8: Make artifact out of `report.md`

| Script | Purpose |
|---|---|
| `scripts/lookup_gene_info.py` | Comprehensive gene and transcript lookup using GTF data |
| `scripts/resolve_ontology_terms.py` | Biological terms → UBERON/CL/EFO IDs |
| `scripts/visualize_variant_effects.py` | REF/ALT visualization (expression, regulatory, splicing) |
| | In-Silico Mutagenesis SeqLogo generation |
| | Quantitative splicing analysis (delta scores, junctions) |
| | Genomic track visualization for a region |
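The filtering logic in the scoring example (keep rows whose absolute `quantile_score` exceeds 0.995, rank by absolute `raw_score`, and separately select biosamples matching disease keywords) can be tried on toy data without calling the API. In the sketch below, plain dictionaries stand in for DataFrame rows, the helper names are hypothetical, and every score is a made-up value for illustration only.

```python
def rank_by_effect(rows):
    """Sort rows by absolute raw_score, largest effect first."""
    return sorted(rows, key=lambda r: abs(r["raw_score"]), reverse=True)


def significant_hits(rows, threshold=0.995):
    """Keep rows whose |quantile_score| exceeds the threshold, then rank."""
    kept = [r for r in rows if abs(r["quantile_score"]) > threshold]
    return rank_by_effect(kept)


def keyword_hits(rows, keywords):
    """Keep rows whose biosample name contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    kept = [
        r for r in rows
        if any(k in r["biosample_name"].lower() for k in lowered)
    ]
    return rank_by_effect(kept)


# Toy records with invented scores; not real model output.
TOY_ROWS = [
    {"biosample_name": "liver", "quantile_score": 0.999, "raw_score": -0.8},
    {"biosample_name": "lung", "quantile_score": 0.9991, "raw_score": 1.2},
    {"biosample_name": "Hepatocyte", "quantile_score": 0.5, "raw_score": 0.1},
    {"biosample_name": "heart", "quantile_score": -0.9999, "raw_score": -0.3},
]

if __name__ == "__main__":
    for row in significant_hits(TOY_ROWS):
        print("significant:", row["biosample_name"], row["raw_score"])
    for row in keyword_hits(TOY_ROWS, ["liver", "hepatocyte"]):
        print("tissue match:", row["biosample_name"], row["raw_score"])
```

As in the example code, the keyword filter is applied to all rows rather than only to the significant ones, so tissue-relevant hits below the quantile threshold remain visible.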