Total: 50,523 skills. Data Processing has 2,561 skills.
Trace the citation neighborhood around one focal paper, sorting it into foundations, descendants, bridges, weak edges, and optional second-hop links.
Finto.io integration. Manage data and records, and automate workflows. Use when the user wants to interact with Finto.io data.
Diffbot integration. Manage Articles, Products, Images, Discussions, Videos. Use when the user wants to interact with Diffbot data.
Screen and deeply analyze potential NSE multibagger stocks using a Peter Lynch-style quality, growth, valuation, and technical framework. Use when the user asks for multibagger stocks, undervalued high-growth NSE stocks, Peter Lynch-style analysis, top NSE multibagger candidates, smallcap/midcap compounders, value traps, or worst-case scenarios for high-potential stocks.
Develops and executes Spark code on Dataproc Clusters and Serverless. Reads and writes data using BigLake Iceberg catalogs, BigQuery, and Spanner. Debugs execution failures. Use when writing Spark ETL pipelines on GCP; training or running inference with ML models with Spark on GCP; or managing Spark clusters, jobs, batches, and interactive sessions. Don't use when writing generic Python scripts that don't use Spark, or performing simple SQL queries that can be done directly in BigQuery.
Discovers and inspects BigQuery Data Transfer Service (DTS) configurations. Use this to identify existing ingestion pipelines and extract datasource or transfer config metadata for data pipelines. Use when a user asks for ingestion scenarios while building or managing data pipelines or when a user asks to "ingest" or "add" data that may already be managed by a DTS transfer.
Handles advanced data intelligence and predictive tasks by providing automated insight generation and time-series forecasting. Use when a user asks "why" data changed or needs future projections.
Review Kafka schema changes (Avro, Protobuf, JSON Schema) for compatibility and evolution best practices using the Lenses MCP server. Detects breaking changes, missing defaults, schema drift, and naming issues. Use when the user says "review schema changes", "check schema compatibility", or "will this schema break consumers", or asks about schema evolution. Do NOT use for creating new schemas from scratch or registering them in the cluster.
Builds data infrastructure — ETL/ELT pipelines, data warehousing, stream processing, data quality, orchestration (Airflow/Dagster), and analytics engineering (dbt). Use when the user asks to build data pipelines, set up ETL/ELT workflows, design a data warehouse, configure stream processing, or implement analytics engineering with dbt, Airflow, or Dagster.
Audit the health of a PostHog project's data warehouse — find every broken or degraded pipeline item across sources, sync schemas, materialized views, batch exports, and transformations. Use when the user asks "what's broken in my warehouse?", "give me a health check", "audit my data pipeline", "why are some dashboards stale?", or wants a one-shot triage summary before deciding where to spend time. Produces a prioritized report of issues grouped by severity and type, with recommended next steps.
Deep dive on a PostHog user by email address. Analyze what they do, where they spend time, and what products they use.
Guide the user through connecting a new data warehouse source: Postgres, MySQL, Stripe, HubSpot, MongoDB, Salesforce, BigQuery, Snowflake, and so on. Use when the user wants to "connect Stripe", "import data from Postgres", "add a new data source", "sync my warehouse tables", or wants to pick sync methods for each table. Walks through source-type discovery, credential validation, table discovery, per-table sync_type selection, and the final create call. Also covers picking a good prefix and what to do right after creation.