Total 30,743 skills, Data Processing has 1471 skills
Showing 12 of 1471 skills
Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.
Comprehensive Azure Data Factory validation rules, activity nesting limitations, linked service requirements, and edge-case handling guidance
Flink Job Creator - Auto-activating skill for Data Pipelines. Triggers on: flink job creator, flink job creator Part of the Data Pipelines skill category.
Expert guidance for Dagster data orchestration including assets, resources, schedules, sensors, partitions, testing, and ETL patterns. Use when building or extending Dagster projects, writing assets, configuring automation, or integrating with dbt/dlt/Sling.
Use when implementing data governance frameworks, building data catalogs, establishing data lineage, defining data quality rules, or setting up data stewardship programs - covers metadata management, data quality, and complianceUse when ", " mentioned.
Post-mortem analysis for any Intelligems A/B test. Extracts learnings from funnel data, segment patterns, and customer behavior — then suggests what to test next based on findings.
Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.
Statistical modeling toolkit. OLS, GLM, logistic, ARIMA, time series, hypothesis tests, diagnostics, AIC/BIC, for rigorous statistical inference and econometric analysis.
PostgreSQL-based semantic and hybrid search with pgvector and ParadeDB. Use when implementing vector search, semantic search, hybrid search, or full-text search in PostgreSQL. Covers pgvector setup, indexing (HNSW, IVFFlat), hybrid search (FTS + BM25 + RRF), ParadeDB as Elasticsearch alternative, and re-ranking with Cohere/cross-encoders. Supports vector(1536) and halfvec(3072) types for OpenAI embeddings. Triggers: pgvector, vector search, semantic search, hybrid search, embedding search, PostgreSQL RAG, BM25, RRF, HNSW index, similarity search, ParadeDB, pg_search, reranking, Cohere rerank, pg_trgm, trigram, fuzzy search, LIKE, ILIKE, autocomplete, typo tolerance, fuzzystrmatch
Use when asked to parse, normalize, standardize, or convert dates from various formats to consistent ISO 8601 or custom formats.
Convert between physical units (length, mass, temperature, time, etc.). Use for scientific calculations, data transformation, or unit standardization.
Use when asked to visualize sales territories, coverage areas, service regions, or geographic boundaries on interactive maps.