Loading...
Loading...
Found 437 Skills
This skill should be used when the user asks to "test on iOS simulator", "run app on iPhone", "take iOS screenshot", "tap button in simulator", "automate iOS UI", "install app on simulator", "boot simulator", or when working with iOS apps, Xcode, Simulator, simctl, idb, UI automation, or iOS testing. It automates iOS Simulator workflows including device lifecycle (create/boot/erase), app management (install/launch), push notifications, privacy grants, screenshots, and accessibility-based UI navigation.
Use Orca's computer-use CLI to inspect and control local desktop apps through accessibility trees, screenshots, and safe UI actions. Use when an agent needs to list desktop apps, get an app state, read visible UI, click, type, press keys, scroll, drag, set values, or perform app accessibility actions. Triggers include "computer use", "orca computer", "list apps", "get app state", "read Spotify", "read Slack", "click app", "type text", "press key", "set value", "scroll app", "drag app", and desktop app interaction tasks.
Route Unity and Unreal frame-time complaints into one bottleneck-first profiling brief. Use when the main job is interpreting profiler screenshots, `stat unit` / `stat gpu` output, benchmark-route complaints, or Steam Deck / target-device review packets; choosing the smallest useful next capture; naming one primary bottleneck family; and deciding whether to stay with quick packets, move to an engine-native profiler, or escalate further. Route generic app/service tuning to `performance-optimization`, build/editor/package failures to `game-build-log-triage`, broader game-production coordination to `bmad-gds`, and mixed demo/community feedback to `game-demo-feedback-triage`.
Build and maintain a persistent markdown wiki that an LLM updates on the user's behalf, usually inside an Obsidian vault or git-tracked notes repo. Use when raw sources such as web articles, papers, meeting notes, transcripts, screenshots, or past analyses need to be turned into an interlinked knowledge base with immutable source files, LLM-written wiki pages, `index.md`, `log.md`, schema rules in `AGENTS.md` or `CLAUDE.md`, source summaries, query notes, and recurring lint passes. Triggers on: llm-wiki, personal wiki, obsidian wiki, research vault, knowledge base, source ingest, persistent notes, wiki maintenance, source summaries, query filing.
Interact with an iOS simulator or Android emulator using argent MCP tools. Use when tapping UI elements, performing gestures, scrolling, typing text, pressing hardware buttons, launching apps, opening URLs, taking screenshots.
Autonomously test an app UI (iOS or Android) by running interact-screenshot-verify loops using argent MCP tools. Use when testing a UI flow, verifying login works, testing navigation, or running an end-to-end UI test scenario.
Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Supports remote Browserbase sessions with Browserbase Identity, Verified browsers, automatic CAPTCHA solving, and residential proxies — ideal for protected websites and JavaScript-heavy pages.
Capture a full DevTools-protocol trace of any browser automation — CDP firehose, screenshots, and DOM dumps — then bisect the stream into per-page searchable buckets. Use when the user wants to debug a failed run, audit network/console/DOM activity, attach a trace to an in-progress session, or feed structured per-page summaries back into an agent loop so its next iteration learns from the last one.
Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images with ai-multimodal, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Includes workflows for capturing and analyzing inspiration screenshots with chrome-devtools and ai-multimodal, iterative design image generation until aesthetic standards are met, and comprehensive design system guidance covering BEAUTIFUL (aesthetic principles), RIGHT (functionality/accessibility), SATISFYING (micro-interactions), and PEAK (storytelling) stages. Integrates with chrome-devtools, ai-multimodal, media-processing, ui-styling, and web-frameworks skills.
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
Generate social media preview images (Open Graph) and configure meta tags. Creates a screenshot-optimized page using the project's existing design system, captures it at 1200x630, and sets up all social sharing meta tags.
Implement integration-ready UI code from a Figma selection or a provided nodeId using TemPad Dev MCP as the only source of design evidence (code snapshot, structure, screenshot, assets, tokens, codegen config). Detect the target repo stack and conventions first, then translate TemPad Dev’s Tailwind-like JSX/Vue IR into project-native code without adding new dependencies. Never guess key styles or measurements; avoid screenshot tuning loops. If required evidence is missing/contradictory or assets cannot be handled under repo policy, stop or ship a safe base with explicit warnings and omissions.