Loading...
Loading...
Found 7 Skills
Desktop automation via native OS accessibility trees using the agent-desktop CLI. Use when an AI agent needs to observe, interact with, or automate desktop applications (click buttons, fill forms, navigate menus, read UI state, toggle checkboxes, scroll, drag, type text, take screenshots, manage windows, use clipboard). Covers 50 commands across observation, interaction, keyboard/mouse, app lifecycle, clipboard, and wait. Triggers on: "click button", "fill form", "open app", "read UI", "automate desktop", "accessibility tree", "snapshot app", "type into field", "navigate menu", "toggle checkbox", "take screenshot", "desktop automation", "agent-desktop", or any desktop GUI interaction task. Supports macOS (Phase 1), with Windows and Linux planned.
macOS native app automation CLI for AI agents. Use when the user needs to interact with macOS desktop applications, including opening apps, clicking buttons, toggling settings, filling forms, reading UI state, automating System Settings, controlling Finder, Safari, or any native app.
Browser automation CLI using DOMShell MCP server. Maps Chrome's Accessibility Tree to a virtual filesystem for agent-native navigation.
Use Orca's computer-use CLI to inspect and control local desktop apps through accessibility trees, screenshots, and safe UI actions. Use when an agent needs to list desktop apps, get an app state, read visible UI, click, type, press keys, scroll, drag, set values, or perform app accessibility actions. Triggers include "computer use", "orca computer", "list apps", "get app state", "read Spotify", "read Slack", "click app", "type text", "press key", "set value", "scroll app", "drag app", and desktop app interaction tasks.
AI-powered browser automation toolset, including agent-browser (accessibility tree extraction), actionbook (50+ website automation recipes), and browser-use (Python automation library). Use cases: (1) Scrape web content that requires JS rendering (2) Fetch data from platforms like X/Twitter, GitHub, Reddit, etc. (3) Take web page screenshots (4) Automate browser operations (5) Retrieve the accessibility tree structure of web pages. Use this skill when you need to access dynamic web pages, bypass anti-scraping measures, or perform browser automation.
Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection
Browser automation MCP server using Playwright's accessibility tree for LLM-friendly web interaction