agentation-self-driving

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Agentation Self-Driving Mode

Agentation自动驾驶模式

Autonomously critique a web page by adding design annotations via the Agentation toolbar — in a visible headed browser so the user can watch the agent work in real time, like watching a self-driving car navigate.
通过Agentation工具栏添加设计标注,自主评审网页——在可见的有头浏览器中进行,用户可以实时观察Agent的工作过程,就像观看自动驾驶汽车行驶一样。

Launch — Always Headed

启动——始终使用有头模式

The browser MUST be visible. Never run headless. The user watches you scan, hover, click, and annotate.
Preflight: Verify
agent-browser
is available before anything else:
bash
command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }
Launch: Try opening directly first. Only close an existing session if the open command fails with a stale session error — this avoids killing a browser someone else is using:
bash
undefined
浏览器必须可见,绝不能以无头模式运行。用户会观察你扫描、悬停、点击和标注的过程。
预检:首先确认
agent-browser
可用:
bash
command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }
启动:先尝试直接打开。只有当打开命令因会话过期错误失败时,才关闭现有会话——这样可以避免关闭其他人正在使用的浏览器:
bash
undefined

Try to open. If it fails (stale session), close first then retry.

尝试打开。如果失败(会话过期),先关闭再重试。

agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }

Then verify the Agentation toolbar is present and expand it:

```bash
agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }

然后确认Agentation工具栏已存在并展开它:

```bash

1. Check toolbar exists on the page (data-feedback-toolbar is the root marker)

1. 检查页面上是否存在工具栏(data-feedback-toolbar是根标记)

agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"

If "NOT FOUND": Agentation is not installed on this page — stop and tell the user

如果显示“NOT FOUND”:此页面未安装Agentation——停止操作并告知用户

2. Expand ONLY if collapsed (clicking when already expanded collapses it)

2. 仅在工具栏折叠时展开(如果已经展开,点击会折叠它)

agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"

3. Verify: take a snapshot and look for toolbar controls

3. 验证:拍摄快照并查看工具栏控件

agent-browser snapshot -i
agent-browser snapshot -i

If expanded: you'll see "Block page interactions" checkbox, color buttons (Purple, Blue, etc.)

如果已展开:你会看到“Block page interactions”复选框、颜色按钮(Purple、Blue等)

If collapsed: you'll only see the small toggle button — retry step 2

如果仍折叠:你只会看到小的切换按钮——重试步骤2


"Block page interactions" must be checked (default: on).

> **eval quoting rule**: Always use `[class*=toggleContent]` (no quotes around the attribute value) in eval strings. Do not use double-bang in eval because bash treats it as history expansion. Do not use backslash-escaped inner quotes either, as they break unpredictably across shells.

必须勾选“Block page interactions”(默认已开启)。

> **eval引用规则**:在eval字符串中始终使用`[class*=toggleContent]`(属性值不要加引号)。不要在eval中使用双感叹号,因为bash会将其视为历史扩展。也不要使用反斜杠转义的内部引号,因为它们在不同shell中会不可预测地失效。

Critical: How to Create Annotations

关键:如何创建标注

Standard element clicks (
click @ref
) do NOT trigger annotation dialogs.
The Agentation overlay intercepts pointer events at the coordinate level. Use coordinate-based mouse events — this also makes the interaction visible in the browser as the cursor moves across the page.
@ref
compatibility
: Only
click
,
fill
,
type
,
hover
,
focus
,
check
,
select
,
drag
support
@ref
syntax. The commands
scrollintoview
,
get box
, and
eval
do NOT — they expect CSS selectors. Use
eval
with
querySelector
for scrolling and position lookup.
bash
undefined
标准元素点击(
click @ref
)不会触发标注对话框
。Agentation覆盖层会在坐标级别拦截指针事件。请使用基于坐标的鼠标事件——这也会让用户在浏览器中看到光标在页面上移动的交互过程。
@ref
兼容性
:只有
click
fill
type
hover
focus
check
select
drag
支持
@ref
语法。
scrollintoview
get box
eval
命令不支持——它们需要CSS选择器。使用带
querySelector
eval
进行滚动和位置查找。
bash
undefined

1. Take interactive snapshot — identify target element and build a CSS selector

1. 拍摄交互式快照——确定目标元素并构建CSS选择器

agent-browser snapshot -i
agent-browser snapshot -i

Example: snapshot shows heading "Point at bugs." [ref=e10]

示例:快照显示标题“Point at bugs.” [ref=e10]

Derive a CSS selector: 'h1', or more specific: 'h1:first-of-type'

推导CSS选择器:'h1',或更具体的:'h1:first-of-type'

2. Scroll the element into view via eval (NOT scrollintoview @ref — that breaks)

2. 通过eval将元素滚动到视图中(不要使用scrollintoview @ref——这会失效)

agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"

3. Get its bounding box via eval (NOT get box @ref — that also breaks)

3. 通过eval获取其边界框(不要使用get box @ref——这也会失效)

agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"

Returns: "383,245,200,40" (parse these as x,y,width,height)

返回:"383,245,200,40" (解析为x,y,width,height)

4. Move cursor to element center, then click

4. 将光标移动到元素中心,然后点击

centerX = x + width/2, centerY = y + height/2

centerX = x + width/2, centerY = y + height/2

agent-browser mouse move <centerX> <centerY> agent-browser mouse down left agent-browser mouse up left
agent-browser mouse move <centerX> <centerY> agent-browser mouse down left agent-browser mouse up left

5. Get the annotation dialog refs — read the FULL snapshot output

5. 获取标注对话框的引用——阅读完整的快照输出

Dialog refs appear at the BOTTOM of the list, don't truncate with head/tail

对话框引用出现在列表底部,不要用head/tail截断

agent-browser snapshot -i
agent-browser snapshot -i

Look for: textbox "What should change?" and "Cancel" / "Add" buttons

查找:文本框“What should change?”以及“Cancel”/“Add”按钮

6. Type critique — fill and click DO support @ref

6. 输入评审内容——fill和click支持@ref

agent-browser fill @<textboxRef> "Your critique here"
agent-browser fill @<textboxRef> "你的评审内容"

7. Submit (Add button enables after text is filled)

7. 提交(填写文本后Add按钮会启用)

agent-browser click @<addRef>

If no dialog appears after clicking, the toolbar may have collapsed. Re-expand (only if collapsed) and retry:

```bash
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"
agent-browser click @<addRef>

如果点击后没有出现对话框,工具栏可能已折叠。重新展开(仅在折叠时)并重试:

```bash
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"

Building CSS selectors from snapshots

从快照构建CSS选择器

The snapshot shows element roles, names, and refs. Map them to CSS selectors:
Snapshot lineCSS selector
heading "Point at bugs." [ref=e10]
h1
or
h1:first-of-type
button "npm install agentation Copy" [ref=e15]
button:has(code)
or by text content via eval
link "Star on GitHub" [ref=e28]
a[href*=github]
paragraph (long text...) [ref=e20]
Target by section:
section:nth-of-type(2) p
When in doubt, use a broader selector and verify with eval:
bash
agent-browser eval "document.querySelector('h2').textContent"
快照会显示元素的角色、名称和引用。将它们映射到CSS选择器:
快照行CSS选择器
heading "Point at bugs." [ref=e10]
h1
h1:first-of-type
button "npm install agentation Copy" [ref=e15]
button:has(code)
或通过eval按文本内容选择
link "Star on GitHub" [ref=e28]
a[href*=github]
paragraph (long text...) [ref=e20]
按区域定位:
section:nth-of-type(2) p
不确定时,使用更宽泛的选择器并通过eval验证:
bash
agent-browser eval "document.querySelector('h2').textContent"

The Loop

循环流程

Work top-to-bottom through the page. For each annotation:
  1. Scroll to the target area via eval (
    scrollIntoView
    )
  2. Pick a specific element — heading, paragraph, button, section container
  3. Get its bounding box via eval (
    getBoundingClientRect
    )
  4. Execute the coordinate-click sequence (
    mouse move
    mouse down
    mouse up
    )
  5. Read the full snapshot output to find dialog refs at the bottom
  6. Write the critique (
    fill @ref
    ) and submit (
    click @ref
    )
  7. Verify the annotation was added (see below)
  8. Move to the next area
从上到下遍历页面。对于每个标注:
  1. 通过eval将目标区域滚动到视图中(
    scrollIntoView
  2. 选择一个特定元素——标题、段落、按钮、区域容器
  3. 通过eval获取其边界框(
    getBoundingClientRect
  4. 执行基于坐标的点击序列(
    mouse move
    mouse down
    mouse up
  5. 阅读完整的快照输出,找到底部的对话框引用
  6. 撰写评审内容(
    fill @ref
    )并提交(
    click @ref
  7. 验证标注已添加(见下文)
  8. 移动到下一个区域

Verifying annotations

验证标注

After submitting each annotation, confirm the count increased:
bash
agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"
提交每个标注后,确认数量已增加:
bash
agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"

Should return the expected count (1 after first, 2 after second, etc.)

应返回预期数量(第一次后为1,第二次后为2,依此类推)


If the count didn't increase, the submission failed silently — re-snapshot and check if the dialog is still open.

Aim for 5-8 annotations per page unless told otherwise.

如果数量没有增加,说明提交已静默失败——重新拍摄快照并检查对话框是否仍打开。

除非另有说明,每页目标添加5-8个标注。

What to Critique

评审内容

AreaWhat to look for
Hero / above the foldHeadline hierarchy, CTA placement, visual grouping
NavigationLabel styling, category grouping, visual weight
Demo / illustrationsClarity, depth, animation readability
Content sectionsSpacing rhythm, callout treatments, typography hierarchy
Key taglinesWhether resonant lines get enough visual emphasis
CTAs and footerConversion weight, visual separation, final actions
区域检查要点
Hero / 首屏区域标题层级、CTA位置、视觉分组
导航栏标签样式、类别分组、视觉权重
演示/插图清晰度、层次感、动画可读性
内容区域间距节奏、强调内容处理、排版层级
关键标语有共鸣的标语是否获得足够的视觉强调
CTA与页脚转化权重、视觉区分度、最终操作选项

Critique Style

评审风格

2-3 sentences max per annotation:
  • Specific and actionable: "Stack the install command below the subheading at 16px" not "fix the layout"
  • 1-2 concrete alternatives: Reference CSS values, layout patterns, or design systems
  • Name the principle: Visual hierarchy, Gestalt grouping, whitespace, emphasis, conversion design
  • Reference comparable products: "Like how Stripe/Linear/Vercel handles this"
Bad: "This section needs work" Good: "This bullet list reads like docs, not a showcase. Use a 3-column card grid with icons — similar to Stripe's guidelines pattern. Creates visual rhythm and scannability."
每个标注最多2-3句话:
  • 具体且可执行:“将安装命令堆叠在副标题下方,间距16px”而非“修复布局”
  • 1-2个具体替代方案:参考CSS值、布局模式或设计系统
  • 点明设计原则:视觉层级、格式塔分组、留白、强调、转化设计
  • 参考同类产品:“就像Stripe/Linear/Vercel的处理方式”
反面示例:“此区域需要改进” 正面示例:“此项目符号列表读起来像文档,而非展示内容。使用带图标3列卡片网格——类似Stripe的指南模式。可创建视觉节奏和易读性。”

Install

安装

The skill must be symlinked into
~/.claude/skills/
for Claude Code to discover it:
bash
ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-driving
Restart Claude Code after installing. Verify with
/agentation-self-driving
— if it loads the skill instructions, the symlink is working.
该技能必须链接到
~/.claude/skills/
目录,才能被Claude Code发现:
bash
ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-driving
安装后重启Claude Code。通过
/agentation-self-driving
验证——如果能加载技能说明,说明链接成功。

Troubleshooting

故障排除

  • "Browser not launched. Call launch first.": Stale session from a previous run — run
    agent-browser close 2>/dev/null
    then retry the
    --headed open
    command
  • Toolbar not found on page: Agentation isn't installed — run
    /agentation
    to set it up first
  • No dialog after clicking: Toolbar collapsed — re-expand with the state-aware eval (check
    [class*=expanded]
    first), retry
  • Wrong element targeted: Click Cancel, scroll to intended element, retry with correct coordinates
  • Add button stays disabled: Text wasn't filled — re-snapshot and fill the textbox
  • Page navigated: "Block page interactions" is off — enable via toolbar settings
  • Annotation count didn't increase: Submission failed — dialog may still be open, re-snapshot and check
  • Interrupted mid-run (Ctrl+C): The browser stays open with whatever state it was in. Run
    agent-browser close
    to clean up before starting a new session
  • “Browser not launched. Call launch first.”:之前的运行留下了过期会话——执行
    agent-browser close 2>/dev/null
    ,然后重试
    --headed open
    命令
  • Toolbar not found on page:未安装Agentation——先执行
    /agentation
    进行设置
  • No dialog after clicking:工具栏已折叠——使用状态感知的eval重新展开(先检查
    [class*=expanded]
    ),然后重试
  • Wrong element targeted:点击Cancel,滚动到目标元素,使用正确坐标重试
  • Add button stays disabled:未填写文本——重新拍摄快照并填写文本框
  • Page navigated:“Block page interactions”已关闭——通过工具栏设置启用
  • Annotation count didn't increase:提交失败——对话框可能仍打开,重新拍摄快照检查
  • Interrupted mid-run (Ctrl+C):浏览器会保持当前状态打开。启动新会话前执行
    agent-browser close
    清理

agent-browser Pitfalls

agent-browser 常见陷阱

These will silently break the workflow if you're not aware of them:
PitfallWhat happensFix
scrollintoview @ref
Crashes: "Unsupported token @ref while parsing css selector"Use
eval "document.querySelector('sel').scrollIntoView({block:'center'})"
get box @ref
Same crash —
get box
parses refs as CSS selectors
Use
eval "((r)=>r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('sel').getBoundingClientRect())"
eval
with double-bang
Bash expands double-bang as history substitution before the command runsUse
expr !== null
or
expr ? true : false
instead
eval
with backslash-escaped quotes
Escaped inner quotes break across shellsDrop the quotes:
[class*=toggleContent]
works for simple values without spaces
snapshot -i | head -50
Annotation dialog refs (
textbox "What should change?"
,
Add
,
Cancel
) appear at the BOTTOM of the snapshot
Always read the full snapshot output — never truncate
click @ref
on overlay elements
The click goes through to the real DOM, bypassing the Agentation overlayUse
mouse move
mouse down left
mouse up left
for coordinate-based clicks that the overlay intercepts
--headed open
fails with "Browser not launched"
Stale sessions from previous runs block new launchesRun
agent-browser close 2>/dev/null
then retry the open command
Rule of thumb:
@ref
works for interaction commands (
click
,
fill
,
type
,
hover
). For everything else (
eval
,
get
,
scrollintoview
), use CSS selectors via
querySelector
in an eval.
如果不注意,这些问题会静默中断工作流程:
陷阱后果修复方案
scrollintoview @ref
崩溃:“Unsupported token @ref while parsing css selector”使用
eval "document.querySelector('sel').scrollIntoView({block:'center'})"
get box @ref
同样崩溃——
get box
会将引用解析为CSS选择器
使用
eval "((r)=>r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('sel').getBoundingClientRect())"
eval
中使用双感叹号
Bash会在命令运行前将双感叹号解析为历史替换使用
expr !== null
expr ? true : false
替代
eval
中使用反斜杠转义引号
转义的内部引号在不同shell中会失效去掉引号:对于无空格的简单值,
[class*=toggleContent]
即可
snapshot -i | head -50
标注对话框引用(
textbox "What should change?"
Add
Cancel
)出现在快照底部
始终阅读完整的快照输出——绝不截断
在覆盖层元素上使用
click @ref
点击会穿透到真实DOM,绕过Agentation覆盖层使用
mouse move
mouse down left
mouse up left
进行基于坐标的点击,这样覆盖层会拦截
--headed open
失败并显示“Browser not launched”
之前的运行留下的过期会话会阻止新启动执行
agent-browser close 2>/dev/null
后重试打开命令
经验法则
@ref
适用于交互命令(
click
fill
type
hover
)。对于其他命令(
eval
get
scrollintoview
),在eval中通过
querySelector
使用CSS选择器。

Two-Session Workflow (Full Self-Driving)

双会话工作流(完全自动驾驶)

With MCP connected (toolbar shows "MCP Connected"), annotations auto-send to any listening agent. This enables:
  • Session 1 (this skill): Watches the page, adds critique annotations in the visible browser
  • Session 2: Runs
    agentation_watch_annotations
    in a loop, receives annotations, edits code to address each one
The user watches Session 1 drive through the page in the browser while Session 2 fixes issues in the codebase — fully autonomous design review and implementation.
连接MCP后(工具栏显示“MCP Connected”),标注会自动发送给所有监听的Agent。这支持:
  • 会话1(本技能):观察页面,在可见浏览器中添加评审标注
  • 会话2:循环运行
    agentation_watch_annotations
    ,接收标注,编辑代码解决每个问题
用户会在浏览器中观看会话1遍历页面,同时会话2在代码库中修复问题——实现完全自主的设计评审与落地。