agent-browser
A comprehensive skill for using agent-browser, a CLI tool for browser automation designed for AI agents, developed by Vercel Labs. This skill covers installation, core commands, selectors (refs, CSS, XPath, semantic locators), agent mode, sessions, options, and best practices. Use this skill whenever the user needs to automate browser interactions via CLI commands, especially for AI agents that need to interact with web pages.
NPX Install
npx skill4agent add teachingai/full-stack-skills agent-browser
SKILL.md Content
When to use this skill
Use this skill whenever the user wants to:
- Automate browser interactions via CLI commands
- Use browser automation for AI agents
- Navigate websites and interact with pages using command-line tools
- Use refs-based element selection for deterministic automation
- Integrate browser automation into AI agent workflows
- Capture snapshots of web pages with accessibility trees
- Fill forms, click elements, and extract content via CLI
- Use semantic locators for more reliable element selection
- Work with browser automation in agent mode with JSON output
- Manage multiple browser sessions
- Debug browser automation with headed mode
- Use authenticated sessions with custom headers
- Connect to existing browsers via CDP
- Stream browser viewport for live preview
How to use this skill
This skill is organized to match the agent-browser official documentation structure (https://github.com/vercel-labs/agent-browser/blob/main/README.md). When working with agent-browser:
- Install agent-browser:
  - Load examples/getting-started/installation.md for installation instructions
- Quick Start:
  - Load examples/quick-start/quick-start.md for basic workflow examples
- Learn core commands:
  - Load examples/commands/basic-commands.md for basic commands (open, click, fill, etc.)
  - Load examples/commands/advanced-commands.md for advanced commands (snapshot, eval, etc.)
  - Load examples/commands/get-info/ for information retrieval commands
  - Load examples/commands/check-state/ for state checking commands
  - Load examples/commands/find-elements/ for semantic locator commands
  - Load examples/commands/wait/ for wait commands
  - Load examples/commands/mouse-control/ for mouse control commands
  - Load examples/commands/browser-settings/ for browser configuration
  - Load examples/commands/cookies-storage/ for cookies and storage management
  - Load examples/commands/network/ for network interception
  - Load examples/commands/tabs-windows/ for tab and window management
  - Load examples/commands/frames/ for iframe handling
  - Load examples/commands/dialogs/ for dialog handling
  - Load examples/commands/debug/ for debugging commands
  - Load examples/commands/navigation/ for navigation commands
  - Load examples/commands/setup/ for setup commands
- Understand selectors:
  - Load examples/selectors/refs.md for refs-based selection (@e1, @e2, etc.)
  - Load examples/selectors/traditional-selectors.md for CSS, XPath, and semantic locators
- Use agent mode:
  - Load examples/agent-mode/introduction.md for agent mode overview
  - Load examples/agent-mode/optimal-workflow.md for optimal AI workflow
  - Load examples/agent-mode/integration.md for integrating with AI agents
- Advanced features:
  - Load examples/advanced/sessions.md for session management
  - Load examples/advanced/headed-mode.md for debugging with visible browser
  - Load examples/advanced/authenticated-sessions.md for authentication via headers
  - Load examples/advanced/custom-executable.md for custom browser executable
  - Load examples/advanced/cdp-mode.md for Chrome DevTools Protocol integration
  - Load examples/advanced/streaming.md for browser viewport streaming
  - Load examples/advanced/architecture.md for architecture overview
  - Load examples/advanced/platforms.md for platform support
  - Load examples/advanced/usage-with-agents.md for AI agent integration patterns
- Configure options:
  - Load examples/options/global-options.md for global CLI options
  - Load examples/options/snapshot-options.md for snapshot-specific options
  - Load examples/options/session-options.md for session management options
- Reference API documentation when needed:
  - api/commands.md - Complete command reference
  - api/selectors.md - Selector reference
  - api/options.md - Options reference
- Use templates for quick start:
  - templates/basic-automation.md - Basic automation workflow
  - templates/ai-agent-workflow.md - AI agent workflow template
Doc mapping (one-to-one with official documentation)
- See examples and API files → https://github.com/vercel-labs/agent-browser
Examples and Templates
This skill includes detailed examples organized to match the official documentation structure. All examples are in the examples/ directory (see mapping above).
To use examples:
- Identify the topic from the user's request
- Load the appropriate example file from the mapping above
- Follow the instructions, syntax, and best practices in that file
- Adapt the code examples to your specific use case
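The lookup step above can be sketched as a small shell helper. This is only an illustration: the function name example_for_topic and the topic keywords are hypothetical, and only the file paths come from the list in "How to use this skill". The fallback to api/commands.md is likewise an assumption, not something the skill prescribes.

```shell
# Hypothetical helper: map a topic keyword to one of the example files listed in this skill.
# The keywords are illustrative; only the paths are taken from the skill's own list.
example_for_topic() {
  case "$1" in
    install)    printf '%s\n' "examples/getting-started/installation.md" ;;
    quickstart) printf '%s\n' "examples/quick-start/quick-start.md" ;;
    refs)       printf '%s\n' "examples/selectors/refs.md" ;;
    sessions)   printf '%s\n' "examples/advanced/sessions.md" ;;
    cdp)        printf '%s\n' "examples/advanced/cdp-mode.md" ;;
    *)          printf '%s\n' "api/commands.md" ;;  # assumed fallback: the full command reference
  esac
}
```

For instance, example_for_topic refs prints the path of the refs example, which can then be opened or passed to whatever loads the file.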
To use templates:
- Reference templates in the templates/ directory for common scaffolding
- Adapt templates to your specific needs and coding style
API Reference
- Commands API: api/commands.md - Complete command reference with syntax and examples
- Selectors API: api/selectors.md - Selector types and usage reference
- Options API: api/options.md - All options reference
Best Practices
- Use Refs: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation
- Snapshot First: Always snapshot before interacting with elements to get refs
- Agent Mode: Use the --json flag for machine-readable output in agent mode
- Session Management: Use --session to maintain state across commands
- Interactive Snapshot: Use the -i flag to limit the snapshot to interactive elements
- Semantic Locators: Use semantic locators (role/name) when refs are not available
- Error Handling: Check command exit codes and error messages
- Wait for Navigation: Commands automatically wait for navigation to complete
- Headed Mode: Use --headed for debugging, headless for production
- CDP Integration: Use --cdp for Chrome DevTools Protocol integration
- Streaming: Use AGENT_BROWSER_STREAM_PORT for live browser preview
- Authenticated Sessions: Use --headers for authentication without login flows
- Custom Executable: Use --executable-path for serverless deployments or custom browsers
- Snapshot Options: Combine the -i, -c, -d, and -s options to optimize snapshot output
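Several of these practices (snapshot first, refs, --json, --session, exit-code checks) can be combined in a minimal shell sketch. It uses only commands and flags named in this skill (open, snapshot, fill, click, --session, -i, --json). The helper names ab, snapshot_page, and fill_and_click, the AGENT_BROWSER override variable, the session name, and the refs are all hypothetical. Flag placement is an assumption that should be checked against the official README.

```shell
# Run the CLI. AGENT_BROWSER is a hypothetical override (not part of agent-browser)
# that allows dry runs or a custom install path.
ab() {
  "${AGENT_BROWSER:-agent-browser}" "$@"
}

# Open a URL in a named session, then print an interactive-only snapshot as JSON.
# $1 = session name, $2 = URL
snapshot_page() {
  ab --session "$1" open "$2" || { echo "open failed for $2" >&2; return 1; }
  ab --session "$1" snapshot -i --json
}

# Fill a field and click a button by ref. Refs such as @e2 must come from a prior snapshot.
# $1 = session name, $2 = input ref, $3 = text, $4 = button ref
fill_and_click() {
  ab --session "$1" fill "$2" "$3" && ab --session "$1" click "$4"
}

# Example (requires agent-browser to be installed; the refs are placeholders):
#   snapshot_page demo https://example.com
#   fill_and_click demo @e2 "user@example.com" @e3
```

Each helper returns a non-zero status as soon as a command fails, so a calling script or agent can stop and take a fresh snapshot instead of acting on stale refs.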
Resources
- GitHub Repository: https://github.com/vercel-labs/agent-browser
- Official README: https://github.com/vercel-labs/agent-browser/blob/main/README.md
- Agent Mode Documentation: https://agent-browser.dev/agent-mode
- Issues: https://github.com/vercel-labs/agent-browser/issues
Keywords
agent-browser, CLI browser automation, AI agents, browser automation CLI, refs, snapshot, agent mode, semantic locators, browser automation tool, command-line browser, AI agent browser, deterministic selectors, accessibility tree, browser commands, web automation CLI, sessions, headed mode, authenticated sessions, CDP mode, streaming, Chrome DevTools Protocol, Playwright, browser automation for AI