Loading...
Loading...
Found 103 Skills
Framework for state-of-the-art sentence, text, and image embeddings. Provides 5000+ pre-trained models for semantic similarity, clustering, and retrieval. Supports multilingual, domain-specific, and multimodal models. Use for generating embeddings for RAG, semantic search, or similarity tasks. Best for production embedding generation.
Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.
Elite AI/ML Senior Engineer with 20+ years experience. Transforms Claude into a world-class AI researcher and engineer capable of building production-grade ML systems, LLMs, transformers, and computer vision solutions. Use when: (1) Building ML/DL models from scratch or fine-tuning, (2) Designing neural network architectures, (3) Implementing LLMs, transformers, attention mechanisms, (4) Computer vision tasks (object detection, segmentation, GANs), (5) NLP tasks (NER, sentiment, embeddings), (6) MLOps and production deployment, (7) Data preprocessing and feature engineering, (8) Model optimization and debugging, (9) Clean code review for ML projects, (10) Choosing optimal libraries and frameworks. Triggers: "ML", "AI", "deep learning", "neural network", "transformer", "LLM", "computer vision", "NLP", "TensorFlow", "PyTorch", "sklearn", "train model", "fine-tune", "embedding", "CNN", "RNN", "LSTM", "attention", "GPT", "BERT", "diffusion", "GAN", "object detection", "segmentation".
Vector embeddings with HNSW indexing, sql.js persistence, and hyperbolic support. 75x faster with agentic-flow integration. Use when: semantic search, pattern matching, similarity queries, knowledge retrieval. Skip when: exact text matching, simple lookups, no semantic understanding needed.
Build document Q&A with Gemini File Search - fully managed RAG with automatic chunking, embeddings, and citations. Upload 100+ file formats, query with natural language. Use when: document Q&A, searchable knowledge bases, semantic search. Troubleshoot: document immutability, storage quota (3x), chunking config, metadata limits (20 max), polling timeouts, displayName dropped (Blob uploads), grounding lost (JSON mode), tool conflicts (googleSearch + fileSearch).
Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models beyond original context limits, or implementing efficient positional encodings. Covers rotary embeddings, attention biases, interpolation methods, and extrapolation strategies for LLMs.
Configure LM Studio as embedding provider for GrepAI. Use this skill for local embeddings with a GUI interface.
Build Retrieval-Augmented Generation (RAG) applications that combine LLM capabilities with external knowledge sources. Covers vector databases, embeddings, retrieval strategies, and response generation. Use when building document Q&A systems, knowledge base applications, enterprise search, or combining LLMs with custom data.
Learn how to enhance your CMS like PocketBase with AI-powered content recommendations using text embeddings, SQLite, and k-nearest neighbor search for efficient and scalable related content suggestions.
AWS Bedrock foundation models for generative AI. Use when invoking foundation models, building AI applications, creating embeddings, configuring model access, or implementing RAG patterns.
PostgreSQL-based semantic and hybrid search with pgvector and ParadeDB. Use when implementing vector search, semantic search, hybrid search, or full-text search in PostgreSQL. Covers pgvector setup, indexing (HNSW, IVFFlat), hybrid search (FTS + BM25 + RRF), ParadeDB as Elasticsearch alternative, and re-ranking with Cohere/cross-encoders. Supports vector(1536) and halfvec(3072) types for OpenAI embeddings. Triggers: pgvector, vector search, semantic search, hybrid search, embedding search, PostgreSQL RAG, BM25, RRF, HNSW index, similarity search, ParadeDB, pg_search, reranking, Cohere rerank, pg_trgm, trigram, fuzzy search, LIKE, ILIKE, autocomplete, typo tolerance, fuzzystrmatch
Run LLMs and AI models on Cloudflare's GPU network with Workers AI. Includes Llama 4, Gemma 3, Mistral 3.1, Flux images, BGE embeddings, streaming, and AI Gateway. Handles 2025 breaking changes. Prevents 7 documented errors. Use when: implementing LLM inference, images, RAG, or troubleshooting AI_ERROR, rate limits, max_tokens, BGE pooling, context window, neuron billing, Miniflare AI binding, NSFW filter, num_steps.