Found 437 Skills
Browser automation using the agent-browser CLI. Use when user asks to browse websites, open webpages, interact with page elements, take screenshots, fill forms, click buttons, scrape content, or automate browser tasks.
Use when the user needs Playwright-based web application testing — screenshots, browser log analysis, interaction verification, visual regression, accessibility, and network mocking. Triggers: E2E test setup, visual regression testing, accessibility audit, Playwright configuration, page object model creation, CI test pipeline.
Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications, OR when they provide screenshots/images/designs to replicate or draw inspiration from. For screenshot inputs, extracts design guidelines first using ai-multimodal analysis, then implements code following those guidelines. Generates creative, polished code that avoids generic AI aesthetics.
Vercel agent-browser — Rust CLI for AI-driven browser automation via CDP. Use when: "agent-browser", "browse website", "automate browser", "scrape with browser", "fill form", "click button", "take screenshot", "browser automation", "headless chrome", "web interaction", "accessibility snapshot", "browser refs". Deterministic ref-based selectors, JSON output, daemon architecture. Replaces Playwright/Puppeteer for agent workflows.
Use when running Playwright via terminal CLI — `npx playwright test` (test runner), `codegen` (interactive recording), `screenshot` / `pdf` (one-off captures), and CI sharding. NOT for agent-driven real-time browser control (use `claude-in-chrome` MCP tools for that).
Typography critique. Works on code, screenshots, or briefs.
Create or update GitHub issues from screenshots, emails, messages, or any visual/text input. Extracts structured data, redacts PII, detects issue templates, proposes issues for approval, then files them via gh CLI. Don't use for GitLab/Jira tickets, opening pull requests, or fixing the bug described in the issue.
Drive a real browser to QA a feature end-to-end as a user would. Loads the right mix of Playwright MCP, Claude-in-Chrome, and computer-use, plus the failure modes to avoid. Use whenever you need to verify a UI feature works in a browser, capture PR screenshots, repro a customer bug visually, or do end-of-task dogfooding before declaring something "done". This is the QA stage of orchestrate mode.
Optimize App Store product pages for search visibility and conversion. Covers App Store Optimization (ASO) strategy, keyword research and keyword field optimization, app title and subtitle keyword placement, App Store description writing for conversion, promotional text rotation strategy, screenshot caption writing and ordering, in-app review prompt timing with RequestReviewAction and AppStore.requestReview, Custom Product Pages for audience segments, in-app events for search indexing, product page A/B testing experiments, localized metadata optimization across markets, and ratings and review management. Use when improving App Store discoverability, optimizing keyword strategy, writing App Store descriptions or promotional text, planning screenshot captions, setting up Custom Product Pages, configuring in-app review prompts, creating in-app events, running product page optimization tests, or developing a ratings management strategy.
Extract frames from video files using ffmpeg for AI/LLM analysis. Use when (1) the user asks to analyze, describe, or summarize a video file, (2) the user wants to extract frames or screenshots from a video, (3) the user provides a video file (.mp4, .mov, .avi, .mkv, .webm, etc.) and asks questions about its visual content, (4) the user wants to identify scenes, objects, or events in a video, (5) the user wants timestamps overlaid on extracted frames for temporal reference. Converts video into JPEG frames that can be attached to LLM prompts as images. Requires ffmpeg on PATH. Supports scene-change detection, model-aware optimization (Claude/OpenAI/Gemini), quality presets (efficient/balanced/detailed/ocr), grayscale and high-contrast OCR mode, and automatic FPS calculation via --max-frames.
macOS screen capture, window recording, GIF conversion, and agent evidence bundles from the terminal. Built on ScreenCaptureKit for window-level targeting that ffmpeg cannot do. Use when the user wants a screenshot of a specific window or app, a screen recording, a GIF conversion, a before/after diff, an evidence bundle for a PR, OCR text from a window, a terminal VHS recording, a Remotion render, or wants to watch a UI for changes. Requires macOS Screen Recording permission on first run.
Use this skill when an AI agent needs to inspect, verify, debug, or profile a live Vite app by running temporary snippets inside the browser page and reading browser logs or captured artifacts. Use for client state after interactions, imported app modules, DOM state, human-like input, canvas/WebGL/Three.js state, screenshots, videos, CPU/network/performance/heap analysis, WebXR/Three.js XR with IWER, and runtime-only behavior without editing app files.