Found 118 Skills
Creates unique, production-grade frontend interfaces with exceptional design quality. Use when user asks to build web components, pages, materials, posters, or applications (e.g., websites, landing pages, dashboards, React components, HTML/CSS layouts, or styling/beautifying any web UI). Generates creative, polished code and UI designs that avoid mediocre AI aesthetics.
Z.AI CLI providing: Vision (image/video analysis, OCR, UI-to-code, error diagnosis with GLM-4.6V); Search (real-time web search with domain/recency filtering); Reader (web page to markdown extraction); Repo (GitHub code search and reading via ZRead); Tools (MCP tool discovery and raw calls); Code (TypeScript tool chaining). Use for visual content analysis, web search, page reading, or GitHub exploration. Requires Z_AI_API_KEY.
Process multimodal inputs (images, video, audio, PDFs) with Gemini 3 Pro. Covers image understanding, video analysis, audio processing, document extraction, media resolution control, OCR, and token optimization. Use when analyzing images, processing video, transcribing audio, extracting PDF content, or working with multimodal data.
Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, multi-image handling), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting content from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities, falling back to them only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
Applies Pathfinders Labs' official brand identity to artifacts including landing pages, presentations, social media content, and documents. Use when creating content that represents Pathfinders Labs' mission of web democratization, technical expertise, and nomadic lifestyle. Ensures visual consistency and authentic voice across all platforms.
Comprehensive patterns for AI-powered audio generation including text-to-music, voice synthesis, text-to-speech, sound effects, and audio manipulation using MusicGen, Bark, ElevenLabs, and more. Use when any of the following is mentioned: music generation, text to music, AI music, voice cloning, text to speech, TTS API, ElevenLabs, MusicGen, Bark, audio synthesis, sound effects generation, voice synthesis, AudioCraft.
Compose intellectually sophisticated persuasive essays using tripartite dialectical structure (establish-critique-synthesize), paradox accumulation, conversational register calibration, and strategic humility. Supports three atomic writing primitives (AGONAL α, MAIEUTIC β, APOPHATIC γ) with hypersoft plithogenic composition, plus legacy style modes and hybrid combinations. Triggers on requests for persuasive writing to mixed/skeptical audiences, defending counterintuitive claims, Socratic pedagogical dialogue, editorial first-person essays, or writing that must balance accessibility with depth. Implements recursive thematic anchoring, forced dilemma construction, and transformed return closure. Use when linear argumentation is insufficient and accumulated tension resolves through synthesis.
Use when creating or developing anything, before writing code or implementation plans. Refines rough ideas into fully formed designs through structured Socratic questioning, alternative exploration, and incremental validation.
Straightforward text extraction from document files (text-based PDF only for now, no OCR or docx). Use when you just need to read/extract text from binary documents.
Convert documents (PDF, Word, Excel, PowerPoint, images, HTML) to Markdown using microsoft/markitdown. Use for document analysis, content extraction, preprocessing for LLMs, or batch document conversion. Supports images with OCR/LLM descriptions, audio transcription, and ZIP archives.
Use when defending or maintaining social order, rule of law, and peaceful institutions. Applies when countering destabilization, upholding democratic norms, or reasoning through how stable civilizations resist and resolve chaos without violence.
Multi-perspective code analysis using three AI personas (RYAN, FLASH, SOCRATES) for comprehensive decision-making. Use when complex code decisions need analysis from multiple viewpoints, or when avoiding single-perspective blind spots is critical.