Search Results: dataset

Found 288 Skills

Data Processing | davila7/claude-code-templ...

vaex

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that don't fit in memory.

Data Processing | astronomer/agents

profiling-tables

Deep-dive data profiling for a specific table. Use when the user asks to profile a table, wants statistics about a dataset, asks about data quality, or needs to understand a table's structure and content. Requires a table name.
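
As a rough illustration of what profiling a single table can involve, the sketch below computes a row count plus per-column null and distinct counts. It is not the skill's own implementation: it assumes an in-memory SQLite database, and the `orders` table and the `profile_table` helper are hypothetical names made up for this example.

```python
import sqlite3


def profile_table(conn, table):
    """Return row count plus per-column null and distinct counts for `table`."""

    def quote(name):
        # Identifiers cannot be bound as SQL parameters, so quote them by hand.
        return '"' + name.replace('"', '""') + '"'

    cur = conn.cursor()
    total = cur.execute(f"SELECT COUNT(*) FROM {quote(table)}").fetchone()[0]
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in cur.execute(f"PRAGMA table_info({quote(table)})")]

    profile = {"table": table, "rows": total, "columns": {}}
    for col in columns:
        nulls, distinct = cur.execute(
            f"SELECT SUM({quote(col)} IS NULL), COUNT(DISTINCT {quote(col)}) "
            f"FROM {quote(table)}"
        ).fetchone()
        # SUM over an empty table yields NULL, so fall back to 0.
        profile["columns"][col] = {"nulls": nulls or 0, "distinct": distinct}
    return profile


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "eu", 10.0), (2, "us", None), (3, "eu", 7.5), (4, None, 7.5)],
)
report = profile_table(conn, "orders")
print(report)
```

A fuller profile would typically add min/max values, value distributions, and type checks; this sketch only covers the structural basics.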

AI & Machine Learning | ovachiever/droid-tings

tooluniverse

Use this skill when working with scientific research tools and workflows across bioinformatics, cheminformatics, genomics, structural biology, proteomics, and drug discovery. This skill provides access to 600+ scientific tools including machine learning models, datasets, APIs, and analysis packages. Use when searching for scientific tools, executing computational biology workflows, composing multi-step research pipelines, accessing databases like OpenTargets/PubChem/UniProt/PDB/ChEMBL, performing tool discovery for research tasks, or integrating scientific computational resources into LLM workflows.

2 scripts/Checked

Data Processing | mims-harvard/tooluniverse

tooluniverse-expression-data-retrieval

Retrieves gene expression and omics datasets from ArrayExpress and BioStudies with gene disambiguation, experiment quality assessment, and structured reports. Creates comprehensive dataset profiles with metadata, sample information, and download links. Use when users need expression data, omics datasets, or mention ArrayExpress (E-MTAB, E-GEOD) or BioStudies (S-BSST) accessions.

AI & Machine Learning | kimasplund/claude_cogniti...

chromadb-integration-skills

Universal ChromaDB integration patterns for semantic search, persistent storage, and pattern matching across all agent types. Use when agents need to store/search large datasets, build knowledge bases, perform semantic analysis, or maintain persistent memory across sessions.

AI & Machine Learning | lubu-labs/langchain-agent...

langgraph-testing-evaluation

Use this skill when you need to test or evaluate LangGraph/LangChain agents: writing unit or integration tests, generating test scaffolds, mocking LLM/tool behavior, running trajectory evaluation (match or LLM-as-judge), running LangSmith dataset evaluations, and comparing two agent versions with A/B-style offline analysis. Use it for Python and JavaScript/TypeScript workflows, evaluator design, experiment setup, regression gates, and debugging flaky/incorrect evaluation results.

11 scripts/Attention
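
The "match" style of trajectory evaluation mentioned above can be pictured as comparing the sequence of tool calls an agent made against a reference sequence. The following is a minimal plain-Python sketch of that idea, not the LangSmith or LangGraph API: the `trajectory_match` function, its mode names, and the tool names are all hypothetical.

```python
from collections import Counter


def trajectory_match(actual, expected, mode="strict"):
    """Compare two lists of tool-call names.

    strict      -- same calls in the same order
    unordered   -- same calls with the same counts, in any order
    subsequence -- expected calls appear in order within actual; extras allowed
    """
    if mode == "strict":
        return actual == expected
    if mode == "unordered":
        return Counter(actual) == Counter(expected)
    if mode == "subsequence":
        remaining = iter(actual)
        # `step in remaining` advances the iterator, which enforces ordering.
        return all(step in remaining for step in expected)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical reference and observed trajectories.
expected = ["search", "fetch", "summarize"]
actual = ["search", "fetch", "rerank", "summarize"]

strict_ok = trajectory_match(actual, expected, "strict")
unordered_ok = trajectory_match(actual, expected, "unordered")
subsequence_ok = trajectory_match(actual, expected, "subsequence")
print(strict_ok, unordered_ok, subsequence_ok)
```

Stricter modes make better regression gates, while the subsequence mode tolerates an agent taking extra but harmless steps.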

Automation | nstbrowser/nstbrowser-ai-...

nstbrowser-ai-agent

Browser automation CLI with Nstbrowser integration for AI agents. Use when the user needs advanced browser fingerprinting, profile management, proxy configuration, batch operations on multiple browser profiles, or cursor-based pagination for large datasets. Triggers include requests to "use NST profile", "configure proxy for profile", "manage browser profiles", "batch update profiles", "start multiple browsers", "list profiles with pagination", or any task requiring Nstbrowser's anti-detection features.

3 scripts/Attention
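
Cursor-based pagination, as referenced in the description above, generally means each response carries a cursor that the client passes back to fetch the next page. The sketch below shows the general pattern only. It does not use the Nstbrowser CLI or API: `fetch_page`, `iterate_all`, and the in-memory `PROFILES` list are hypothetical stand-ins.

```python
from typing import Iterator, Optional

# Hypothetical data source standing in for a remote list endpoint.
PROFILES = [{"id": i, "name": f"profile-{i}"} for i in range(1, 8)]


def fetch_page(cursor: Optional[int], limit: int = 3):
    """Return up to `limit` items with id greater than `cursor`, plus the next cursor.

    The next cursor is None once the last item has been returned.
    """
    start = cursor or 0
    items = [p for p in PROFILES if p["id"] > start][:limit]
    has_more = bool(items) and items[-1]["id"] < PROFILES[-1]["id"]
    next_cursor = items[-1]["id"] if has_more else None
    return items, next_cursor


def iterate_all(fetch, limit: int = 3) -> Iterator[dict]:
    """Follow cursors until the source reports no further pages."""
    cursor = None
    while True:
        items, cursor = fetch(cursor, limit)
        yield from items
        if cursor is None:
            break


names = [p["name"] for p in iterate_all(fetch_page)]
print(names)
```

Unlike offset-based paging, the cursor here is tied to the last item seen, so inserts or deletes earlier in the list do not shift later pages.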

Frontend Development | syncfusion/wpf-ui-compone...

syncfusion-wpf-datapager

Implements Syncfusion WPF DataPager (SfDataPager) for paginating large datasets in WPF applications. Use this when implementing pagination controls, page navigation, or splitting large data into manageable chunks. Supports configurable page sizes, navigation buttons, numeric page buttons, and works with DataGrid, ListBox, ListView, and ItemsControl.

AI & Machine Learning | akillness/oh-my-skills

langsmith

Instrument, trace, evaluate, and monitor LLM applications and AI agents with LangSmith. Use when setting up observability for LLM pipelines, running offline or online evaluations, managing prompts in the Prompt Hub, creating datasets for regression testing, or deploying agent servers. Triggers on: langsmith, langchain tracing, llm tracing, llm observability, llm evaluation, trace llm calls, @traceable, wrap_openai, langsmith evaluate, langsmith dataset, langsmith feedback, langsmith prompt hub, langsmith project, llm monitoring, llm debugging, llm quality, openevals, langsmith cli, langsmith experiment, annotate llm, llm judge.

2 scripts/Attention

Data Processing | duckdb/duckdb-skills

read-file

Read any data file (CSV, JSON, Parquet, Avro, Excel, spatial, SQLite) or remote URL (S3, HTTPS). Use when user references a data file, asks "what's in this file", or wants to preview/profile a dataset. Not for source code.

Data Processing | asgard-ai-platform/skills

stat-eda

Conduct Exploratory Data Analysis (EDA) using descriptive statistics, visualizations, and data quality checks. Use this skill when the user has a dataset and needs to understand its structure, find patterns, detect anomalies, or prepare data for further analysis — even if they say 'what does this data look like', 'find interesting patterns', 'clean this data', or 'summarize this dataset'.

Data Processing | clickhouse/agent-skills

chdb-datastore

Drop-in pandas replacement with ClickHouse performance. Use `import chdb.datastore as pd` (or `from datastore import DataStore`) and write standard pandas code — same API, 10-100x faster on large datasets. Supports 16+ data sources (MySQL, PostgreSQL, S3, MongoDB, ClickHouse, Iceberg, Delta Lake, etc.) and 10+ file formats (Parquet, CSV, JSON, Arrow, ORC, etc.) with cross-source joins. Use this skill when the user wants to analyze data with pandas-style syntax, speed up slow pandas code, query remote databases or cloud storage as DataFrames, or join data across different sources — even if they don't explicitly mention chdb or DataStore. Do NOT use for raw SQL queries, ClickHouse server administration, or non-Python languages.

1 scripts/Checked