web-browser

browser-automation

Overview

Browser automation skill with two approaches:

- agent-browser - Snapshot-based interaction model optimized for AI agents
  - Compact element refs (`@e1`, `@e2`) reduce token usage dramatically
  - Workflow: `open` → `snapshot -i` → interact with refs → re-snapshot
  - Best for: dynamic exploration, form filling, scraping with unknown structure
- playwright - Direct Playwright CLI and Node.js scripts
  - Full Playwright API access via scripts
  - Codegen for recording interactions
  - Best for: scripted automation, testing, batch operations, complex workflows

Sub-skills

CRITICAL: You MUST load the appropriate sub-skill from the `sub-skills/` directory based on user intent.

When to use each

| Sub-skill | When to use | Triggers |
| --- | --- | --- |
| `agent-browser.md` | Interactive exploration, AI-driven navigation, unknown page structure | "navigate to", "fill this form", "click the button", "scrape this page", "explore the site" |
| `playwright.md` | Scripted automation, testing, batch screenshots, codegen | "write a script", "generate test", "batch screenshot", "record my actions", "create automation script" |

Default behavior

- If user intent is unclear, prefer agent-browser for interactive tasks
- If user asks for "a script" or "automation code", use playwright
- If user mentions "codegen" or "record", use playwright
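
The routing rules above can be sketched as a small bash function. This is only an illustration: `choose_subskill` is a hypothetical name, not part of the skill, and the naive substring matching covers just the trigger phrases listed in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): map a user request
# to the sub-skill file that should be loaded from sub-skills/.
choose_subskill() {
  local request
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    # Script, test, batch, and recording requests go to playwright.
    *"a script"*|*"automation script"*|*"automation code"*|*codegen*|*record*|*"generate test"*|*"batch screenshot"*)
      echo "playwright.md" ;;
    # Everything else, including unclear intent, defaults to agent-browser.
    *)
      echo "agent-browser.md" ;;
  esac
}
```

For example, `choose_subskill "record my actions"` prints `playwright.md`, while `choose_subskill "fill this form"` prints `agent-browser.md`. Substring matching is deliberately simple here; a real agent would judge intent from the whole request.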

Process

1. Determine user intent from their request
2. Load the appropriate sub-skill from `sub-skills/`
3. Execute the sub-skill process
4. Verify expected outcome was achieved

Resources

- `sub-skills/`: Approach-specific instructions
  - `agent-browser.md`: Snapshot/refs workflow with `npx agent-browser`
  - `playwright.md`: Playwright CLI and Node.js scripts
- `references/agent-browser/`: Deep-dive documentation for agent-browser
- `templates/agent-browser/`: Ready-to-use shell scripts for agent-browser

Quick reference

agent-browser (default for interactive tasks)

```bash
# Session isolation (generate random slug like bright-falcon)
npx agent-browser --session <slug> open https://example.com
npx agent-browser --session <slug> snapshot -i
npx agent-browser --session <slug> click @e1
npx agent-browser --session <slug> fill @e2 "text"
```
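
One way to produce the random session slug mentioned above is sketched below. The function names `new_session_slug` and `ab`, the `SESSION_SLUG` variable, and the word lists are assumptions made for illustration; only the `--session`, `open`, and `snapshot -i` usage comes from the quick reference.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a random adjective-noun slug such as "bright-falcon".
new_session_slug() {
  local adjectives=(bright calm swift quiet bold)
  local nouns=(falcon otter maple comet harbor)
  # RANDOM is a bash built-in; the modulo keeps each index inside its array.
  echo "${adjectives[RANDOM % ${#adjectives[@]}]}-${nouns[RANDOM % ${#nouns[@]}]}"
}

# Hypothetical wrapper: every call reuses the slug stored in SESSION_SLUG,
# so all commands in one task share the same isolated session.
ab() {
  npx agent-browser --session "$SESSION_SLUG" "$@"
}

# Example usage (not executed here):
#   SESSION_SLUG=$(new_session_slug)
#   ab open https://example.com
#   ab snapshot -i
```

Generating the slug once and passing it through a wrapper avoids retyping it and reduces the chance of mixing sessions between tasks.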

playwright (for scripts and codegen)

```bash
# Quick screenshot
npx playwright screenshot https://example.com output.png

# Record interactions as code
npx playwright codegen https://example.com

# PDF generation
npx playwright pdf https://example.com output.pdf
```
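
For the "batch screenshot" case, the screenshot command above could be wrapped in a loop, roughly as follows. Both `url_to_filename` and `batch_screenshot` are hypothetical helpers, and the file-naming scheme is an assumption rather than anything the skill prescribes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive an output file name from a URL.
url_to_filename() {
  local name=${1#*://}             # drop the scheme, e.g. "https://"
  name=${name%/}                   # drop one trailing slash, if present
  name=${name//[^A-Za-z0-9.-]/_}   # replace any other character with "_"
  echo "$name.png"
}

# Hypothetical helper: take one screenshot per URL argument,
# using the same CLI call shown in the quick reference.
batch_screenshot() {
  local url
  for url in "$@"; do
    npx playwright screenshot "$url" "$(url_to_filename "$url")"
  done
}

# Example usage (not executed here):
#   batch_screenshot https://example.com https://example.com/docs/intro/
```

With this naming scheme, `https://example.com/docs/intro/` maps to `example.com_docs_intro.png`; for anything beyond simple captures, a Node.js script using the Playwright API is likely the better fit.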