Total: 30,734 skills. Data Processing has 1,471 skills.
Showing 12 of 1,471 skills
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Guides Python SDK development in Apache Beam, including environment setup, testing, building, and running pipelines. Use when working with Python code in sdks/python/.
Consult this skill when designing data pipelines or transformation workflows. Use when data flows through a fixed sequence of transformations, stages can be independently developed and tested, and parallel processing of stages is beneficial. Do not use when selecting from multiple paradigms (use architecture-paradigms first), when data flow is not sequential or predictable, or when complex branching/merging logic dominates.
Asset allocation frameworks: strategic (SAA), tactical (TAA), mean-variance optimization, Black-Litterman, risk parity, glide paths.
Run the TikSpyder tool to collect TikTok data — search by keyword, username, or hashtag, download videos, extract keyframes, and export structured data. Use this skill whenever the user wants to collect TikTok data, search TikTok profiles or hashtags, search TikTok videos, download TikTok content, run tikspyder, or launch the TikSpyder Streamlit interface. Also trigger when the user mentions data collection from TikTok, even if they don't say "tikspyder" by name.
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: schema validator. Part of the Data Pipelines skill category.
Write, debug, and explain ShopifyQL queries and Shopify Segment Query Language expressions. Use this skill whenever the user wants to query Shopify analytics data, build customer segments, write ShopifyQL for reports, explore sales/orders/products data via the Shopify Admin API, debug a ShopifyQL error, understand available tables/dimensions/metrics, or translate a business question into a Shopify query. Also triggers for: "ShopifyQL", "Shopify analytics query", "customer segment filter", "Shopify segment", "SHOW FROM sales", "GROUP BY in Shopify", "Shopify report query", or any mention of ShopifyQL tables like `sales`, `sessions`, `orders`.
Scan stocks for bullish trends using technical indicators (SMA, RSI, MACD, ADX). Use when the user asks to scan for bullish stocks, find trending stocks, or rank symbols by momentum.
Perform statistical hypothesis testing, regression analysis, ANOVA, and t-tests with plain-English interpretations and visualizations.
Extract structured data from Office documents (DOCX, PPTX, XLSX, HWP, HWPX) using the Polaris AI DataInsight Doc Extract API. Use when the user wants to parse, analyze, or extract text, tables, charts, images, or shapes from document files. Invoke this skill whenever the user mentions extracting content from Word, PowerPoint, Excel, HWP, or HWPX files, wants to parse document structure, needs to convert document data for RAG pipelines, or asks about reading tables, charts, or text from Office-format documents — even if they don't explicitly mention "DataInsight" or "Polaris".
Generate realistic dummy datasets for testing with customizable columns, constraints, and output formats (CSV, JSON, SQL, Python script). Use when creating test data, building mock datasets, or generating sample data for development and demos.
Perform cohort analysis on user engagement data — retention curves, feature adoption trends, and segment-level insights. Use when analyzing user retention by cohort, studying feature adoption over time, investigating churn patterns, or identifying engagement trends.