agentation-self-driving
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseAgentation Self-Driving Mode
Agentation自动驾驶模式
Autonomously critique a web page by adding design annotations via the Agentation toolbar — in a visible headed browser so the user can watch the agent work in real time, like watching a self-driving car navigate.
通过Agentation工具栏添加设计标注,自主评审网页——在可见的有头浏览器中进行,用户可以实时观察Agent的工作过程,就像观看自动驾驶汽车行驶一样。
Launch — Always Headed
启动——始终使用有头模式
The browser MUST be visible. Never run headless. The user watches you scan, hover, click, and annotate.
Preflight: Verify is available before anything else:
agent-browserbash
command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }Launch: Try opening directly first. Only close an existing session if the open command fails with a stale session error — this avoids killing a browser someone else is using:
bash
undefined浏览器必须可见,绝不能以无头模式运行。用户会观察你扫描、悬停、点击和标注的过程。
预检:首先确认可用:
agent-browserbash
command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }启动:先尝试直接打开。只有当打开命令因会话过期错误失败时,才关闭现有会话——这样可以避免关闭其他人正在使用的浏览器:
bash
undefinedTry to open. If it fails (stale session), close first then retry.
尝试打开。如果失败(会话过期),先关闭再重试。
agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }
Then verify the Agentation toolbar is present and expand it:
```bashagent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }
然后确认Agentation工具栏已存在并展开它:
```bash1. Check toolbar exists on the page (data-feedback-toolbar is the root marker)
1. 检查页面上是否存在工具栏(data-feedback-toolbar是根标记)
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
If "NOT FOUND": Agentation is not installed on this page — stop and tell the user
如果显示“NOT FOUND”:此页面未安装Agentation——停止操作并告知用户
2. Expand ONLY if collapsed (clicking when already expanded collapses it)
2. 仅在工具栏折叠时展开(如果已经展开,点击会折叠它)
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"
3. Verify: take a snapshot and look for toolbar controls
3. 验证:拍摄快照并查看工具栏控件
agent-browser snapshot -i
agent-browser snapshot -i
If expanded: you'll see "Block page interactions" checkbox, color buttons (Purple, Blue, etc.)
如果已展开:你会看到“Block page interactions”复选框、颜色按钮(Purple、Blue等)
If collapsed: you'll only see the small toggle button — retry step 2
如果仍折叠:你只会看到小的切换按钮——重试步骤2
"Block page interactions" must be checked (default: on).
> **eval quoting rule**: Always use `[class*=toggleContent]` (no quotes around the attribute value) in eval strings. Do not use double-bang in eval because bash treats it as history expansion. Do not use backslash-escaped inner quotes either, as they break unpredictably across shells.
必须勾选“Block page interactions”(默认已开启)。
> **eval引用规则**:在eval字符串中始终使用`[class*=toggleContent]`(属性值不要加引号)。不要在eval中使用双感叹号,因为bash会将其视为历史扩展。也不要使用反斜杠转义的内部引号,因为它们在不同shell中会不可预测地失效。Critical: How to Create Annotations
关键:如何创建标注
Standard element clicks () do NOT trigger annotation dialogs. The Agentation overlay intercepts pointer events at the coordinate level. Use coordinate-based mouse events — this also makes the interaction visible in the browser as the cursor moves across the page.
click @refcompatibility: Only@ref,click,fill,type,hover,focus,check,selectsupportdragsyntax. The commands@ref,scrollintoview, andget boxdo NOT — they expect CSS selectors. Useevalwithevalfor scrolling and position lookup.querySelector
bash
undefined标准元素点击()不会触发标注对话框。Agentation覆盖层会在坐标级别拦截指针事件。请使用基于坐标的鼠标事件——这也会让用户在浏览器中看到光标在页面上移动的交互过程。
click @ref兼容性:只有@ref、click、fill、type、hover、focus、check、select支持drag语法。@ref、scrollintoview和get box命令不支持——它们需要CSS选择器。使用带eval的querySelector进行滚动和位置查找。eval
bash
undefined1. Take interactive snapshot — identify target element and build a CSS selector
1. 拍摄交互式快照——确定目标元素并构建CSS选择器
agent-browser snapshot -i
agent-browser snapshot -i
Example: snapshot shows heading "Point at bugs." [ref=e10]
示例:快照显示标题“Point at bugs.” [ref=e10]
Derive a CSS selector: 'h1', or more specific: 'h1:first-of-type'
推导CSS选择器:'h1',或更具体的:'h1:first-of-type'
2. Scroll the element into view via eval (NOT scrollintoview @ref — that breaks)
2. 通过eval将元素滚动到视图中(不要使用scrollintoview @ref——这会失效)
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"
3. Get its bounding box via eval (NOT get box @ref — that also breaks)
3. 通过eval获取其边界框(不要使用get box @ref——这也会失效)
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
Returns: "383,245,200,40" (parse these as x,y,width,height)
返回:"383,245,200,40" (解析为x,y,width,height)
4. Move cursor to element center, then click
4. 将光标移动到元素中心,然后点击
centerX = x + width/2, centerY = y + height/2
centerX = x + width/2, centerY = y + height/2
agent-browser mouse move <centerX> <centerY>
agent-browser mouse down left
agent-browser mouse up left
agent-browser mouse move <centerX> <centerY>
agent-browser mouse down left
agent-browser mouse up left
5. Get the annotation dialog refs — read the FULL snapshot output
5. 获取标注对话框的引用——阅读完整的快照输出
Dialog refs appear at the BOTTOM of the list, don't truncate with head/tail
对话框引用出现在列表底部,不要用head/tail截断
agent-browser snapshot -i
agent-browser snapshot -i
Look for: textbox "What should change?" and "Cancel" / "Add" buttons
查找:文本框“What should change?”以及“Cancel”/“Add”按钮
6. Type critique — fill and click DO support @ref
6. 输入评审内容——fill和click支持@ref
agent-browser fill @<textboxRef> "Your critique here"
agent-browser fill @<textboxRef> "你的评审内容"
7. Submit (Add button enables after text is filled)
7. 提交(填写文本后Add按钮会启用)
agent-browser click @<addRef>
If no dialog appears after clicking, the toolbar may have collapsed. Re-expand (only if collapsed) and retry:
```bash
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"agent-browser click @<addRef>
如果点击后没有出现对话框,工具栏可能已折叠。重新展开(仅在折叠时)并重试:
```bash
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"Building CSS selectors from snapshots
从快照构建CSS选择器
The snapshot shows element roles, names, and refs. Map them to CSS selectors:
| Snapshot line | CSS selector |
|---|---|
| |
| |
| |
| Target by section: |
When in doubt, use a broader selector and verify with eval:
bash
agent-browser eval "document.querySelector('h2').textContent"快照会显示元素的角色、名称和引用。将它们映射到CSS选择器:
| 快照行 | CSS选择器 |
|---|---|
| |
| |
| |
| 按区域定位: |
不确定时,使用更宽泛的选择器并通过eval验证:
bash
agent-browser eval "document.querySelector('h2').textContent"The Loop
循环流程
Work top-to-bottom through the page. For each annotation:
- Scroll to the target area via eval ()
scrollIntoView - Pick a specific element — heading, paragraph, button, section container
- Get its bounding box via eval ()
getBoundingClientRect - Execute the coordinate-click sequence (→
mouse move→mouse down)mouse up - Read the full snapshot output to find dialog refs at the bottom
- Write the critique () and submit (
fill @ref)click @ref - Verify the annotation was added (see below)
- Move to the next area
从上到下遍历页面。对于每个标注:
- 通过eval将目标区域滚动到视图中()
scrollIntoView - 选择一个特定元素——标题、段落、按钮、区域容器
- 通过eval获取其边界框()
getBoundingClientRect - 执行基于坐标的点击序列(→
mouse move→mouse down)mouse up - 阅读完整的快照输出,找到底部的对话框引用
- 撰写评审内容()并提交(
fill @ref)click @ref - 验证标注已添加(见下文)
- 移动到下一个区域
Verifying annotations
验证标注
After submitting each annotation, confirm the count increased:
bash
agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"提交每个标注后,确认数量已增加:
bash
agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"Should return the expected count (1 after first, 2 after second, etc.)
应返回预期数量(第一次后为1,第二次后为2,依此类推)
If the count didn't increase, the submission failed silently — re-snapshot and check if the dialog is still open.
Aim for 5-8 annotations per page unless told otherwise.
如果数量没有增加,说明提交已静默失败——重新拍摄快照并检查对话框是否仍打开。
除非另有说明,每页目标添加5-8个标注。What to Critique
评审内容
| Area | What to look for |
|---|---|
| Hero / above the fold | Headline hierarchy, CTA placement, visual grouping |
| Navigation | Label styling, category grouping, visual weight |
| Demo / illustrations | Clarity, depth, animation readability |
| Content sections | Spacing rhythm, callout treatments, typography hierarchy |
| Key taglines | Whether resonant lines get enough visual emphasis |
| CTAs and footer | Conversion weight, visual separation, final actions |
| 区域 | 检查要点 |
|---|---|
| Hero / 首屏区域 | 标题层级、CTA位置、视觉分组 |
| 导航栏 | 标签样式、类别分组、视觉权重 |
| 演示/插图 | 清晰度、层次感、动画可读性 |
| 内容区域 | 间距节奏、强调内容处理、排版层级 |
| 关键标语 | 有共鸣的标语是否获得足够的视觉强调 |
| CTA与页脚 | 转化权重、视觉区分度、最终操作选项 |
Critique Style
评审风格
2-3 sentences max per annotation:
- Specific and actionable: "Stack the install command below the subheading at 16px" not "fix the layout"
- 1-2 concrete alternatives: Reference CSS values, layout patterns, or design systems
- Name the principle: Visual hierarchy, Gestalt grouping, whitespace, emphasis, conversion design
- Reference comparable products: "Like how Stripe/Linear/Vercel handles this"
Bad: "This section needs work"
Good: "This bullet list reads like docs, not a showcase. Use a 3-column card grid with icons — similar to Stripe's guidelines pattern. Creates visual rhythm and scannability."
每个标注最多2-3句话:
- 具体且可执行:“将安装命令堆叠在副标题下方,间距16px”而非“修复布局”
- 1-2个具体替代方案:参考CSS值、布局模式或设计系统
- 点明设计原则:视觉层级、格式塔分组、留白、强调、转化设计
- 参考同类产品:“就像Stripe/Linear/Vercel的处理方式”
反面示例:“此区域需要改进”
正面示例:“此项目符号列表读起来像文档,而非展示内容。使用带图标3列卡片网格——类似Stripe的指南模式。可创建视觉节奏和易读性。”
Install
安装
The skill must be symlinked into for Claude Code to discover it:
~/.claude/skills/bash
ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-drivingRestart Claude Code after installing. Verify with — if it loads the skill instructions, the symlink is working.
/agentation-self-driving该技能必须链接到目录,才能被Claude Code发现:
~/.claude/skills/bash
ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-driving安装后重启Claude Code。通过验证——如果能加载技能说明,说明链接成功。
/agentation-self-drivingTroubleshooting
故障排除
- "Browser not launched. Call launch first.": Stale session from a previous run — run then retry the
agent-browser close 2>/dev/nullcommand--headed open - Toolbar not found on page: Agentation isn't installed — run to set it up first
/agentation - No dialog after clicking: Toolbar collapsed — re-expand with the state-aware eval (check first), retry
[class*=expanded] - Wrong element targeted: Click Cancel, scroll to intended element, retry with correct coordinates
- Add button stays disabled: Text wasn't filled — re-snapshot and fill the textbox
- Page navigated: "Block page interactions" is off — enable via toolbar settings
- Annotation count didn't increase: Submission failed — dialog may still be open, re-snapshot and check
- Interrupted mid-run (Ctrl+C): The browser stays open with whatever state it was in. Run to clean up before starting a new session
agent-browser close
- “Browser not launched. Call launch first.”:之前的运行留下了过期会话——执行,然后重试
agent-browser close 2>/dev/null命令--headed open - Toolbar not found on page:未安装Agentation——先执行进行设置
/agentation - No dialog after clicking:工具栏已折叠——使用状态感知的eval重新展开(先检查),然后重试
[class*=expanded] - Wrong element targeted:点击Cancel,滚动到目标元素,使用正确坐标重试
- Add button stays disabled:未填写文本——重新拍摄快照并填写文本框
- Page navigated:“Block page interactions”已关闭——通过工具栏设置启用
- Annotation count didn't increase:提交失败——对话框可能仍打开,重新拍摄快照检查
- Interrupted mid-run (Ctrl+C):浏览器会保持当前状态打开。启动新会话前执行清理
agent-browser close
agent-browser Pitfalls
agent-browser 常见陷阱
These will silently break the workflow if you're not aware of them:
| Pitfall | What happens | Fix |
|---|---|---|
| Crashes: "Unsupported token @ref while parsing css selector" | Use |
| Same crash — | Use |
| Bash expands double-bang as history substitution before the command runs | Use |
| Escaped inner quotes break across shells | Drop the quotes: |
| Annotation dialog refs ( | Always read the full snapshot output — never truncate |
| The click goes through to the real DOM, bypassing the Agentation overlay | Use |
| Stale sessions from previous runs block new launches | Run |
Rule of thumb: works for interaction commands (, , , ). For everything else (, , ), use CSS selectors via in an eval.
@refclickfilltypehoverevalgetscrollintoviewquerySelector如果不注意,这些问题会静默中断工作流程:
| 陷阱 | 后果 | 修复方案 |
|---|---|---|
| 崩溃:“Unsupported token @ref while parsing css selector” | 使用 |
| 同样崩溃—— | 使用 |
| Bash会在命令运行前将双感叹号解析为历史替换 | 使用 |
| 转义的内部引号在不同shell中会失效 | 去掉引号:对于无空格的简单值, |
| 标注对话框引用( | 始终阅读完整的快照输出——绝不截断 |
在覆盖层元素上使用 | 点击会穿透到真实DOM,绕过Agentation覆盖层 | 使用 |
| 之前的运行留下的过期会话会阻止新启动 | 执行 |
经验法则:适用于交互命令(、、、)。对于其他命令(、、),在eval中通过使用CSS选择器。
@refclickfilltypehoverevalgetscrollintoviewquerySelectorTwo-Session Workflow (Full Self-Driving)
双会话工作流(完全自动驾驶)
With MCP connected (toolbar shows "MCP Connected"), annotations auto-send to any listening agent. This enables:
- Session 1 (this skill): Watches the page, adds critique annotations in the visible browser
- Session 2: Runs in a loop, receives annotations, edits code to address each one
agentation_watch_annotations
The user watches Session 1 drive through the page in the browser while Session 2 fixes issues in the codebase — fully autonomous design review and implementation.
连接MCP后(工具栏显示“MCP Connected”),标注会自动发送给所有监听的Agent。这支持:
- 会话1(本技能):观察页面,在可见浏览器中添加评审标注
- 会话2:循环运行,接收标注,编辑代码解决每个问题
agentation_watch_annotations
用户会在浏览器中观看会话1遍历页面,同时会话2在代码库中修复问题——实现完全自主的设计评审与落地。