Total 30,804 skills, Data Processing has 1471 skills
Showing 12 of 1471 skills
Use when "Polars", "fast dataframe", "lazy evaluation", "Arrow backend", or asking about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", "expression API"
Databricks CLI operations: auth, profiles, Unity Catalog, data exploration, jobs, pipelines, clusters, model serving, bundles and more. Contains up-to-date guidelines for all Databricks CLI tasks, useful for all Databricks-related tasks.
SEC EDGAR extraction pipeline: setup, filing discovery by CIK, recipe-driven extraction, and report generation.
Build apps on Databricks Apps platform. Use when asked to create dashboards, data apps, analytics tools, or visualizations. Invoke BEFORE starting implementation.
读取、写入和操作Excel文件(.xlsx、.xls)。创建电子表格、读取数据并导出为各种格式。
R programming for data analysis, visualization, and statistical workflows. Use when working with R scripts (.R), Quarto documents (.qmd), RMarkdown (.Rmd), or R projects. Covers tidyverse workflows, ggplot2 visualizations, statistical analysis, epidemiological methods, and reproducible research practices.
Expert-level Databricks platform, Apache Spark, Delta Lake, MLflow, notebooks, and cluster management
Use when "NetworkX", "graph analysis", "network analysis", "graph algorithms", "shortest path", "centrality", "PageRank", "community detection", "social network", "knowledge graph"
Headless spreadsheet engine for financial modeling, data analysis, and scenario comparison. Use when: building financial models with ratios and what-if scenarios, computing derived values from tabular data with formulas, producing .xlsx files with live formulas (not static values) for human review, any task where the agent would otherwise write imperative code to manipulate numbers that a spreadsheet does naturally. Triggers: financial model, scenario analysis, ratio computation, balance sheet, P&L, what-if, sensitivity analysis, banking ratios, spreadsheet model, build a model, projection, forecast. Do NOT use for: simple CSV/Excel read/write (use the xlsx skill), chart-only tasks, or data volumes exceeding ~5000 rows.
Use when "data visualization", "plotting", "charts", "matplotlib", "plotly", "seaborn", "graphs", "figures", "heatmap", "scatter plot", "bar chart", "interactive plots"
Use when "PyMC", "Bayesian", "MCMC", "probabilistic programming", or asking about "Bayesian regression", "hierarchical model", "NUTS sampler", "posterior distribution", "prior predictive", "credible intervals", "uncertainty quantification"
Use when "data pipelines", "ETL", "data warehousing", "data lakes", or asking about "Airflow", "Spark", "dbt", "Snowflake", "BigQuery", "data modeling"