Total 50,402 skills; Data Processing has 2,557 skills
Complete guide to Apache Airflow orchestration, covering DAGs, operators, sensors, XComs, task dependencies, dynamic workflows, and production deployment.
Create data-driven charts with Vega-Lite (simple) and Vega (advanced). Best for bar, line, scatter, heatmap, and area charts, and for multi-series analytics. Use when you have numeric data arrays that need statistical visualization. Use Vega for radar charts and word clouds. NOT for process diagrams (use mermaid) or quick KPI cards (use infographic).
Apache Airflow workflow orchestration. Use for data pipelines.
Strategic guidance for designing modern data platforms, covering storage paradigms (data lake, warehouse, lakehouse), modeling approaches (dimensional, normalized, data vault, wide tables), data mesh principles, and medallion architecture patterns. Use when architecting data platforms, choosing between centralized vs decentralized patterns, selecting table formats (Iceberg, Delta Lake), or designing data governance frameworks.
Comprehensive HPK (proprietary healthcare message format) parser and explainer. Supports 100+ message types across patient administration (ID, MV, CV), supply chain (PR, FO, MA, CO, LI, RO, FA), inventory (SO, IM), organizational structure (ST, UT), and financial operations (RD, DD). Uses @erp-pas/hpk-dictionary as source of truth. Validates structure, extracts fields, explains business context, maps to HL7 v2.5/IHE PAM, and troubleshoots integration issues.
Read/write FASTA, GenBank, FASTQ files. Sequence manipulation (complement, translate). Indexed random access via faidx. For NGS pipelines (SAM/BAM/VCF), use pysam. For BLAST, use gget or blat-integration.
Optimize provider selection, routing, and credit usage across 150+ enrichment sources for company/contact intelligence.
Side-by-side stat comparisons with context. Adjust for era, pace of play, league differences. Advanced metrics explained in plain English.
R programming for data analysis, visualization, and statistical workflows. Use when working with R scripts (.R), Quarto documents (.qmd), RMarkdown (.Rmd), or R projects. Covers tidyverse workflows, ggplot2 visualizations, statistical analysis, epidemiological methods, and reproducible research practices.
Use when the request mentions "improving image quality", "enhancing screenshots", "upscaling images", or "sharpening photos", or asks about "image optimization", "screenshot quality", or "resolution improvement".
Synthesize multiple media analyses into cross-source patterns and insights. Use when you need to cross-reference analyses, find patterns across sources, or perform meta-analysis of media content.
Production ETL patterns orchestrator. Routes to core reliability patterns and incremental load strategies.