Running and fine-tuning LLMs on Apple Silicon with MLX. Use when working with models locally on Mac, converting Hugging Face models to MLX format, fine-tuning with LoRA/QLoRA on Apple Silicon, or serving models via HTTP API.
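For orientation, a minimal sketch of local generation with the mlx-lm package; the model ID is illustrative, and keyword names can differ slightly across mlx-lm versions.

```python
# Minimal local generation on Apple Silicon with mlx-lm.
# (Hugging Face models can be converted to MLX format with:
#   python -m mlx_lm.convert --hf-path <hf-repo> -q)
from mlx_lm import load, generate

# Illustrative model ID; any MLX-format model from the mlx-community hub works.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
text = generate(model, tokenizer, prompt="Explain LoRA in one sentence.", max_tokens=128)
print(text)
```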
Route AI coding queries to local LLMs in air-gapped networks. Integrates Serena MCP for semantic code understanding. Use when working offline, with local models (Ollama, LM Studio, Jan, OpenWebUI), or in secure/closed environments. Triggers on local LLM, Ollama, LM Studio, Jan, air-gapped, offline AI, Serena, local inference, closed network, model routing, defense network, secure coding.
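A minimal sketch of routing a coding query to a local Ollama server (default port 11434); the model name is whatever has been pulled locally.

```python
# Send a chat request to a local Ollama instance; no external network needed.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen2.5-coder",  # assumes this model was fetched with `ollama pull`
        "messages": [{"role": "user", "content": "Refactor this function for clarity: ..."}],
        "stream": False,           # return a single JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```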
Use when working on vLLM Studio backend architecture (controller runtime, Pi-mono agent loop, OpenAI-compatible endpoints, LiteLLM gateway, inference process, and debugging commands).
Use when setting up, deploying, or operating vLLM Studio (env keys, controller/frontend startup, Docker services, branch workflow, and release checklists).
Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.
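A sketch using TensorRT-LLM's high-level LLM API; import paths and defaults vary by version, and the model ID is illustrative.

```python
# High-level TensorRT-LLM inference: the LLM class builds/loads an optimized
# engine for the given checkpoint and batches requests in flight.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # illustrative HF checkpoint
params = SamplingParams(max_tokens=64, temperature=0.8)

outputs = llm.generate(["What makes in-flight batching fast?"], params)
for out in outputs:
    print(out.outputs[0].text)
```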
Use for RAG/LLM grounding. Returns pre-extracted web content (text, tables, code) optimized for LLMs. Supports both GET and POST. Adjust max_tokens/count based on query complexity. Supports Goggles and local/POI results. For AI answers, use the answers endpoint. Recommended for anyone building AI/agentic applications.
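A hypothetical sketch of fetching pre-extracted content for grounding: the endpoint URL and auth header below are placeholders, not the provider's documented API; only the max_tokens and count parameters come from the description above.

```python
# Hypothetical GET request for LLM-ready web content; substitute the provider's
# real endpoint and auth scheme.
import requests

resp = requests.get(
    "https://api.example.com/grounding",   # placeholder endpoint
    headers={"X-API-Key": "YOUR_KEY"},     # placeholder auth header
    params={
        "q": "transformer KV cache",
        "count": 5,          # fewer results for simple queries
        "max_tokens": 2000,  # cap extracted text per result to fit the LLM context
    },
    timeout=30,
)
for item in resp.json().get("results", []):
    print(item)
```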
Vision, audio, and multimodal LLM integration patterns. Use when processing images, transcribing audio, generating speech, or building multimodal AI pipelines.
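A minimal image-understanding sketch using the OpenAI Python SDK's chat format; the model name is illustrative, and other providers accept a similar message shape.

```python
# Send an image plus a text question in one multimodal chat message.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("chart.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize this chart."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```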
A module that teaches the essence of LLMs (probabilistic token prediction), the structural causes of hallucination, and the meaning of temperature.
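A small worked example of what temperature means: it rescales logits before softmax, flattening (T > 1) or sharpening (T < 1) the next-token distribution.

```python
# Temperature-scaled softmax over three candidate tokens.
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # raw scores for three candidate tokens

def softmax_with_temperature(logits, T):
    z = logits / T
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

for T in (0.5, 1.0, 2.0):
    # Low T concentrates mass on the top token; high T spreads it out.
    print(T, softmax_with_temperature(logits, T).round(3))
```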
LLM Tuning Patterns
Evaluate LLM systems using automated metrics, LLM-as-judge, and benchmarks. Use when testing prompt quality, validating RAG pipelines, measuring safety (hallucinations, bias), or comparing models for production deployment.
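A minimal LLM-as-judge sketch: the judge model, prompt wording, and 1-5 scale are illustrative choices, not a fixed benchmark protocol.

```python
# Score an answer's faithfulness with a judge model.
from openai import OpenAI

client = OpenAI()

def judge(question: str, answer: str) -> int:
    """Ask a judge model to rate faithfulness from 1 (worst) to 5 (best)."""
    prompt = (
        f"Question: {question}\nAnswer: {answer}\n"
        "Rate the answer's factual faithfulness from 1 to 5. Reply with the number only."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative judge model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,        # deterministic scoring
    )
    return int(resp.choices[0].message.content.strip())

print(judge("What is the capital of France?", "Paris"))
```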
BullMQ queue system reference for Redis-backed job queues, workers, flows, and schedulers. Use when: (1) creating queues and workers with BullMQ, (2) adding jobs (delayed, prioritized, repeatable, deduplicated), (3) setting up FlowProducer parent-child job hierarchies, (4) configuring retry strategies, rate limiting, or concurrency, (5) implementing job schedulers with cron/interval patterns, (6) preparing BullMQ for production (graceful shutdown, Redis config, monitoring), or (7) debugging stalled jobs or connection issues
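A minimal sketch using the official BullMQ Python port (pip install bullmq); option names mirror the Node.js library, the Redis URL is assumed local, and exact constructor signatures may differ across versions.

```python
# Queue + worker with a delayed, retried job, via the BullMQ Python port.
import asyncio
from bullmq import Queue, Worker

async def process(job, job_token):
    # Job handler: runs once per job pulled from the queue.
    return f"painted {job.data['color']}"

async def main():
    queue = Queue("paint")  # assumes Redis on localhost:6379
    worker = Worker("paint", process, {"connection": "redis://localhost:6379"})

    # Delayed job with retries: waits 5s, retries up to 3 times with exponential backoff.
    await queue.add(
        "car",
        {"color": "blue"},
        {"delay": 5000, "attempts": 3, "backoff": {"type": "exponential", "delay": 1000}},
    )

    await asyncio.sleep(10)  # let the worker drain the job in this demo
    await worker.close()
    await queue.close()

asyncio.run(main())
```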
Step-by-step guide for adding support for a new LLM in Dust. Use when adding a new model or updating an existing one.