Optimizes transformer attention with Flash Attention for a 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, the flash-attn library, H100 FP8, and sliding window attention.
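A minimal sketch of the PyTorch-native SDPA path, assuming PyTorch 2.3+ for the `sdpa_kernel` context manager; shapes, dtype, and device are illustrative:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative shapes: batch 2, 8 heads, 2048-token sequence, head dim 64.
q = torch.randn(2, 8, 2048, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# SDPA dispatches to a fused kernel instead of materializing the full
# attention matrix; pinning the backend raises an error rather than
# silently falling back if Flash Attention is unavailable here.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```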
PyTorch deep learning development with transformers, diffusion models, and GPU optimization.
Expert guidance for deep learning, transformers, diffusion models, and LLM development with PyTorch, Transformers, Diffusers, and Gradio.
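A minimal text-to-image sketch with Diffusers; the checkpoint id is an assumption taken from the library's docs and stands in for whatever model the task uses:

```python
import torch
from diffusers import DiffusionPipeline

# Checkpoint id follows the Diffusers docs; substitute your own model.
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```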
Expert guidance for natural language processing development using transformers, spaCy, NLTK, and modern NLP techniques.
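A short spaCy sketch, assuming the small English pipeline has been downloaded; the sample sentence is illustrative:

```python
import spacy

# Assumes the pipeline is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Hugging Face was founded in New York City.")

# Named entities and per-token linguistic annotations.
for ent in doc.ents:
    print(ent.text, ent.label_)
for token in doc:
    print(token.text, token.pos_, token.dep_)
```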
Expert guidance for working with Hugging Face Transformers library for NLP, computer vision, and multimodal AI tasks.
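A minimal sketch of the `pipeline()` entry point; the detection checkpoint named below is one example from the model hub, not a requirement of the skill:

```python
from transformers import pipeline

# With no model argument, pipeline() downloads a task-default checkpoint.
classifier = pipeline("sentiment-analysis")
print(classifier("This skill catalog is surprisingly useful."))

# The same one-liner API covers vision, audio, and multimodal tasks.
detector = pipeline("object-detection", model="facebook/detr-resnet-50")
```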
Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture from scratch. Train on Shakespeare (CPU) or OpenWebText (multi-GPU).
Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or are training from human feedback. Works with Hugging Face Transformers.
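A minimal SFT sketch, assuming a recent TRL release (current versions accept a model id string); the model and dataset ids are placeholders taken from TRL's documentation:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset and model ids are assumptions; swap in your own data.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",            # any causal LM checkpoint
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out"),
)
trainer.train()
```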
Guidance for text embedding retrieval tasks using sentence transformers or similar embedding models. This skill should be used when the task involves loading documents, encoding text with embedding models, computing similarity scores (cosine similarity), and retrieving/ranking documents based on semantic similarity to a query. Applies to MTEB benchmark tasks, document retrieval, semantic search, and text similarity ranking.
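A minimal retrieval sketch with sentence-transformers; `all-MiniLM-L6-v2` is a common default and the documents are toy examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Flash Attention reduces attention memory from quadratic to linear.",
    "spaCy provides industrial-strength NLP pipelines.",
    "TRL fine-tunes language models with reinforcement learning.",
]
query = "How do I cut the GPU memory used by attention?"

# Encode corpus and query, then rank documents by cosine similarity.
doc_emb = model.encode(docs, convert_to_tensor=True, normalize_embeddings=True)
q_emb = model.encode(query, convert_to_tensor=True, normalize_embeddings=True)

scores = util.cos_sim(q_emb, doc_emb)[0]
for idx in scores.argsort(descending=True):
    print(f"{scores[idx]:.3f}  {docs[idx]}")
```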
Pyspark Transformer - Auto-activating skill for Data Pipelines. Triggers on: "pyspark transformer". Part of the Data Pipelines skill category.
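A sketch of a custom `pyspark.ml` Transformer; the class, column names, and lower-casing logic are hypothetical illustrations, not part of the skill:

```python
from pyspark.ml import Transformer
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

class LowercaseTransformer(Transformer):
    """Hypothetical Transformer that lower-cases a text column."""

    def __init__(self, inputCol="text", outputCol="text_lc"):
        super().__init__()
        self.inputCol = inputCol
        self.outputCol = outputCol

    # Subclasses implement _transform; transform() delegates to it.
    def _transform(self, dataset: DataFrame) -> DataFrame:
        return dataset.withColumn(self.outputCol, F.lower(F.col(self.inputCol)))

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("Hello WORLD",)], ["text"])
LowercaseTransformer().transform(df).show()
```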
Use when "RAG", "retrieval augmented generation", "LangChain", "LlamaIndex", "sentence transformers", "embeddings", "document QA", "chatbot with documents", "semantic search"
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.
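A minimal fine-tuning sketch with the `Trainer` API; the checkpoint, dataset, and 1% training slice are illustrative stand-ins for your own data:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Tiny slice of a public dataset, just to keep the sketch fast.
dataset = load_dataset("imdb", split="train[:1%]")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()
```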
Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.
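A short TransformerLens sketch showing activation caching; GPT-2 small and the prompt are illustrative:

```python
from transformer_lens import HookedTransformer

# Loads GPT-2 small into TransformerLens's standardized architecture.
model = HookedTransformer.from_pretrained("gpt2")

tokens = model.to_tokens("The quick brown fox jumps over the lazy dog")
logits, cache = model.run_with_cache(tokens)

# Cache entries are keyed by HookPoint name; the (name, layer) shorthand
# below resolves to "blocks.0.attn.hook_pattern".
pattern = cache["pattern", 0]   # shape: [batch, head, query_pos, key_pos]
print(pattern.shape)
```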