Loading...
Loading...
Found 316 Skills
GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table detection. Supports 90+ languages with 2x accuracy over Tesseract.
Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use only when explicitly invoked with "use browser agent" or "use agent browser".
Analyze images using Gemini's vision capabilities. Use for image analysis, text extraction from screenshots, and visual content understanding.
Browser automation for web tasks - scraping, form filling, testing, screenshots, data extraction. Use when tasks involve web pages, URLs, login flows, or web interaction.
Translate Figma nodes into production-ready code with 1:1 visual fidelity using the Figma MCP workflow (design context, screenshots, assets, and project-convention translation). Trigger when the user provides Figma URLs or node IDs, or asks to implement designs or components that must match Figma specs. Requires a working Figma MCP server connection.
Rigorous visual validation expert specializing in UI testing, design system compliance, and accessibility verification. Masters screenshot analysis, visual regression testing, and component validation. Use PROACTIVELY to verify UI modifications have achieved their intended goals through comprehensive visual analysis.
Use when the task requires automating a real browser from the terminal (navigation, form filling, snapshots, screenshots, data extraction, UI-flow debugging) via `playwright-cli` or the bundled wrapper script.
Use the Figma MCP server to fetch design context, screenshots, variables, and assets from Figma, and to translate Figma nodes into production code. Trigger when a task involves Figma URLs, node IDs, design-to-code implementation, or Figma MCP setup and troubleshooting.
Browser automation for AI agents. Use when the user needs to navigate websites, read page content, fill forms, click elements, take screenshots, or manage browser tabs.
Controls a cloud browser from a sandboxed remote machine. Use when the agent is running in a sandbox (no GUI) and needs to navigate websites, interact with web pages, fill forms, take screenshots, or expose local dev servers via tunnels.
Browser automation, debugging, and performance analysis using Puppeteer CLI scripts. Use for automating browsers, taking screenshots, analyzing performance, monitoring network traffic, web scraping, form automation, and JavaScript debugging.
Web scraping skill using Firecrawl API for deep content extraction, format conversion, and page interaction. Use when you need to scrape web pages, extract structured data, take screenshots, parse PDFs, or crawl entire websites. Triggers: firecrawl, scrape, extract content, screenshot, parse pdf, crawl website, 抓取网页, 提取内容, 网页截图