Search Results: data-warehouse

Found 41 Skills

setting-up-a-data-warehouse-source

Guide the user through connecting a new data warehouse source: Postgres, MySQL, Stripe, HubSpot, MongoDB, Salesforce, BigQuery, Snowflake, and so on. Use when the user wants to "connect Stripe", "import data from Postgres", "add a new data source", "sync my warehouse tables", or wants to pick sync methods for each table. Walks through source-type discovery, credential validation, table discovery, per-table sync_type selection, and the final create call. Also covers picking a good prefix and what to do right after creation.

🇺🇸 | English | Translated

Product & Design | daemon-blockint-tech/agen...

product-management-human-data-platform

Guides product management for human data platforms—annotation and labeling products, workforce workflows, task design, quality systems (gold sets, adjudication, inter-annotator agreement), customer ML-team project delivery, contributor experience, and privacy-safe handling of human-generated training data. Use when prioritizing roadmap for labeling/RLHF/eval data platforms, writing PRDs for annotation or QA features, defining success metrics for throughput and quality, scoping enterprise customer workflows, or balancing cost-quality-speed tradeoffs—not for hands-on model training (data-scientist), warehouse/analytics pipelines (data-warehouse-engineer), generic BRD workshops without product lens (business-analyst), AI solution architecture for copilots (applied-ai-architect-commercial-enterprise), or control implementation for audits (compliance-engineer). UX flows: product-designer. Eval harnesses: prompt-engineer-agent-prompts-evals. Pricing/packaging for platform: product-management-monetization.

🇺🇸 | English | Translated

Data Processing | daemon-blockint-tech/agen...

data-scrubbing

Guides cleaning and standardizing tabular datasets before analysis, modeling, or reporting—profiling, quality rules, missing values, duplicates, outliers, type coercion, encoding fixes, record linkage, deduplication, high-level PII handling (not legal advice), actuarial/insurance field scrubbing, reproducible scrub pipelines, validation checks, and sign-off. Distinct from warehouse ETL or statistical modeling. Use when the user asks for "data scrubbing", "clean this dataset", "scrub the data", "data cleaning", "dedupe records", "handle missing values", "outlier treatment", "standardize columns", "data quality rules", "profile this table", or "prepare data for modeling". Not warehouse pipelines (data-warehouse-engineer), ML modeling (data-scientist, actuary), privacy programs (compliance-engineer), FinOps only (finops-analyst), or assumption governance (assumption-setting).

🇺🇸 | English | Translated

Data Processing | github/awesome-copilot

snowflake-semanticview

Create, alter, and validate Snowflake semantic views using Snowflake CLI (snow). Use when asked to build or troubleshoot semantic views/semantic layer definitions with CREATE/ALTER SEMANTIC VIEW, to validate semantic-view DDL against Snowflake via CLI, or to guide Snowflake CLI installation and connection setup.

🇺🇸 | English | Translated

Data Processing | anthropics/knowledge-work...

write-query

Write optimized SQL for your dialect with best practices. Use when translating a natural-language data need into SQL, building a multi-CTE query with joins and aggregations, optimizing a query against a large partitioned table, or getting dialect-specific syntax for Snowflake, BigQuery, Postgres, etc.

🇺🇸 | English | Translated

Data Processing | membranedev/application-s...

snowflake

Snowflake integration. Manage data, records, and automate workflows. Use when the user wants to interact with Snowflake data.

🇺🇸 | English | Translated

Data Processing | altimateai/data-engineeri...

finding-expensive-queries

Finds and ranks expensive Snowflake queries by cost, time, or data scanned. Use when: (1) the user asks to find slow, expensive, or problematic queries; (2) the task mentions "query history", "top queries", "most expensive", or "slowest queries"; (3) analyzing warehouse costs or identifying optimization candidates; (4) finding queries that scan the most data or have the most spillage. Returns a ranked list of queries with metrics and optimization recommendations.

🇺🇸 | English | Translated

Data Processing | asgard-ai-platform/skills

tech-data-pipeline

Design data pipelines covering ETL vs ELT architectures, data source integration, scheduling, quality checks, and warehouse design. Use this skill when the user needs to move data between systems, build a data warehouse, automate data processing, or improve data reliability, even if they say "move data from X to Y", "build an ETL pipeline", "our data is a mess", or "set up a data warehouse".

🇺🇸 | English | Translated

AI & Machine Learning | aradotso/ai-agent-skills

ktx-context-layer-data-agents

Teach AI agents how to query data warehouses accurately using ktx, an executable context layer with skills, memory, and a semantic layer.

🇺🇸 | English | Translated

Data Processing | astronomer/agents

analyzing-data

Queries the data warehouse and answers business questions about data. Handles questions that require database or warehouse queries, including "who uses X", "how many Y", "show me Z", "find customers", "what is the count", data lookups, metrics, trends, or SQL analysis.

🇺🇸 | English | Translated

29 scripts | Attention

Data Processing | bytedance/agentkit-sample...

byted-bytehouse-slow-query

ByteHouse slow query analysis and performance optimization tool. Identifies and analyzes slow queries, provides query performance optimization suggestions, shows query execution plans, and analyzes historical query trends. Use this skill when you need to identify and analyze slow queries in a ByteHouse database, get query performance optimization suggestions, view query execution plans, or analyze historical query trends.

🇨🇳 | Chinese | Translated

1 script | Checked

Data Processing | motherduckdb/agent-skills

motherduck-model-data

Design and build database schemas and data models in MotherDuck. Produces a file-based project scaffold. Use when creating tables, choosing data types, defining relationships, or restructuring data for analytics workloads.

🇺🇸 | English | Translated