VS Code Automation
Automate VS Code (Code OSS) using agent-browser. VS Code is built on Electron/Chromium and exposes a Chrome DevTools Protocol (CDP) port that agent-browser can connect to, enabling the same snapshot-interact workflow used for web pages.
Prerequisites
- `agent-browser` must be installed. It's listed in devDependencies; run `npm install` in the repo root. Use `npx agent-browser` if it's not on your PATH, or install globally with `npm install -g agent-browser`.
- For Code OSS (VS Code dev build): The repo must be built before launching. `./scripts/code.sh` runs the build automatically if needed, or set `VSCODE_SKIP_PRELAUNCH=1` to skip the compile step if you've already built.
- CSS selectors are internal implementation details. Selectors like `.interactive-input-part`, `.interactive-input-editor`, and `.part.auxiliarybar` used in `eval` commands are VS Code internals that may change across versions. If they stop working, use `agent-browser snapshot -i` to re-discover the current DOM structure.
Core Workflow
- Launch Code OSS with remote debugging enabled
- Connect agent-browser to the CDP port
- Snapshot to discover interactive elements
- Interact using element refs
- Re-snapshot after navigation or state changes
```bash
# Launch Code OSS with remote debugging
./scripts/code.sh --remote-debugging-port=9224

# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done

# Discover UI elements
agent-browser snapshot -i

# Focus the chat input (macOS)
agent-browser press Control+Meta+i
```
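The retry loop in the workflow can also be wrapped in a small helper. The sketch below is one possible shape, not part of agent-browser: `retry` is a hypothetical function name, and its argument order (attempts, delay in seconds, then the command) is an assumption chosen for illustration. Unlike the one-liner, it reports failure through its exit status, so a script can stop when the connection never succeeds.

```bash
# retry <attempts> <delay-seconds> <command...>
# Hypothetical helper: runs the command until it succeeds or attempts run out.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while [ "$retry_n" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$retry_n" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_n=$((retry_n + 1))
  done
  return 1
}

# Usage (assumes agent-browser is on PATH and Code OSS listens on 9224):
#   retry 5 3 agent-browser connect 9224 2>/dev/null || echo "could not connect" >&2
```

With five attempts and a three-second delay this matches the timing of the loop above, apart from skipping the final sleep.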
Connecting
```bash
# Connect to a specific port
agent-browser connect 9222

# Or use --cdp on each command
agent-browser --cdp 9222 snapshot -i

# Auto-discover a running Chromium-based app
agent-browser --auto-connect snapshot -i
```

After `connect`, all subsequent commands target the connected app without needing `--cdp`.
Tab Management
Electron apps often have multiple windows or webviews. Use tab commands to list and switch between them:
```bash
# List all available targets (windows, webviews, etc.)
agent-browser tab

# Switch to a specific tab by index
agent-browser tab 2

# Switch by URL pattern
agent-browser tab --url "settings"
```
Launching Code OSS (VS Code Dev Build)
The VS Code repository includes `scripts/code.sh`, which launches Code OSS from source. It passes all arguments through to the Electron binary, so `--remote-debugging-port` works directly:

```bash
cd <repo-root>  # the root of your VS Code checkout
./scripts/code.sh --remote-debugging-port=9224
```

Wait for the window to fully initialize, then connect:

```bash
# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
agent-browser snapshot -i
```
**Tips:**
- Set `VSCODE_SKIP_PRELAUNCH=1` to skip the compile step if you've already built: `VSCODE_SKIP_PRELAUNCH=1 ./scripts/code.sh --remote-debugging-port=9224` (from the repo root)
- Code OSS uses the default user data directory. Unlike VS Code Insiders, you don't typically need `--user-data-dir` since there's usually only one Code OSS instance running.
- If you see "Sent env to running instance. Terminating..." it means Code OSS is already running and forwarded your args to the existing instance. Quit Code OSS and relaunch with the flag, or use `--user-data-dir=/tmp/code-oss-debug` to force a new instance.
Launching VS Code Extensions for Debugging
To debug a VS Code extension via agent-browser, launch VS Code Insiders with `--extensionDevelopmentPath` and `--remote-debugging-port`. Use `--user-data-dir` to avoid conflicting with an already-running instance.

```bash
# Build the extension first
cd <extension-repo-root>  # e.g., the root of your extension checkout
npm run compile

# Launch VS Code Insiders with the extension and CDP
code-insiders \
  --extensionDevelopmentPath="<extension-repo-root>" \
  --remote-debugging-port=9223 \
  --user-data-dir=/tmp/vscode-ext-debug

# Wait for VS Code to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9223 2>/dev/null && break || sleep 3; done
agent-browser snapshot -i
```
**Key flags:**
- `--extensionDevelopmentPath=<path>` — loads your extension from source (must be compiled first)
- `--remote-debugging-port=9223` — enables CDP (use 9223 to avoid conflicts with other apps on 9222)
- `--user-data-dir=<path>` — uses a separate profile so it starts a new process instead of sending to an existing VS Code instance
**Without `--user-data-dir`**, VS Code detects the running instance, forwards the args to it, and exits immediately; you'll see "Sent env to running instance. Terminating..." and CDP never starts.
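Because instance detection is tied to the profile directory, one way to make sure each debug session starts its own process is to create a fresh directory per run instead of reusing `/tmp/vscode-ext-debug`. The sketch below assumes `mktemp -d` is available (it is on typical macOS and Linux systems); `new_profile_dir` is a hypothetical helper name, not a VS Code or agent-browser command.

```bash
# Hypothetical helper: create a unique, empty profile directory and print its path.
# Assumes mktemp supports -d with a template ending in XXXXXX.
new_profile_dir() {
  mktemp -d "${TMPDIR:-/tmp}/vscode-ext-debug.XXXXXX"
}

# Usage (placeholders as in the launch example above):
#   profile=$(new_profile_dir)
#   code-insiders \
#     --extensionDevelopmentPath="<extension-repo-root>" \
#     --remote-debugging-port=9223 \
#     --user-data-dir="$profile"
#   # ... when finished, quit VS Code and remove the profile:
#   rm -rf "$profile"
```

A fresh profile also starts without previously saved settings or window state, which may or may not be what you want when reproducing a bug.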
Interacting with Monaco Editor (Chat Input, Code Editors)
VS Code uses Monaco Editor for all text inputs, including the Copilot Chat input. Monaco editors require specific agent-browser techniques: standard `click`, `fill`, and `keyboard type` commands may not work depending on the VS Code build.

The Universal Pattern: Focus via Keyboard Shortcut + `press`

This works on all VS Code builds (Code OSS, Insiders, stable):
```bash
# 1. Open and focus the chat input with the keyboard shortcut
# macOS:
agent-browser press Control+Meta+i
# Linux / Windows:
agent-browser press Control+Alt+i

# 2. Type using individual press commands
agent-browser press H
agent-browser press e
agent-browser press l
agent-browser press l
agent-browser press o
agent-browser press Space  # Use "Space" for spaces
agent-browser press w
agent-browser press o
agent-browser press r
agent-browser press l
agent-browser press d

# Verify text appeared (optional)
agent-browser eval '
(() => {
  const sidebar = document.querySelector(".part.auxiliarybar");
  const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
  return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'

# 3. Send the message (same on all platforms)
agent-browser press Enter
```
**Chat focus shortcut by platform:**
- **macOS:** `Ctrl+Cmd+I` → `agent-browser press Control+Meta+i`
- **Linux:** `Ctrl+Alt+I` → `agent-browser press Control+Alt+i`
- **Windows:** `Ctrl+Alt+I` → `agent-browser press Control+Alt+i`
This shortcut focuses the chat input and sets `document.activeElement` to a `DIV` with class `native-edit-context`, VS Code's native text editing surface that correctly processes key events from `agent-browser press`.
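Typing one key per command gets tedious for longer messages, so the per-key pattern can be wrapped in a loop. The sketch below is an illustration under stated assumptions, not an agent-browser feature: `keys_for` and `press_text` are hypothetical helper names, the input is assumed to be a single line of ASCII text, and only the letter and `Space` key names are confirmed by the examples above. Other characters are passed through unchanged, which may not match what `press` expects.

```bash
# Hypothetical helper: print one key name per line for each character of $1.
# Spaces become "Space" (as in the example above); everything else is passed as-is.
keys_for() {
  rest=$1
  while [ -n "$rest" ]; do
    next=${rest#?}          # the string without its first character
    ch=${rest%"$next"}      # the first character
    if [ "$ch" = " " ]; then
      printf '%s\n' Space
    else
      printf '%s\n' "$ch"
    fi
    rest=$next
  done
}

# Hypothetical helper: send each key with `agent-browser press`.
# stdin is redirected so the command cannot consume the key list.
press_text() {
  keys_for "$1" | while IFS= read -r key; do
    agent-browser press "$key" </dev/null
  done
}

# Usage (after focusing the chat input with the shortcut):
#   press_text "Hello world"
#   agent-browser press Enter
```

Running `keys_for "Hello world"` on its own prints the key sequence without touching VS Code, which is a cheap way to check the mapping before sending anything.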
`type @ref`: Works on Some Builds

On VS Code Insiders (extension debug mode), `type @ref` handles focus and input in one step:

```bash
agent-browser snapshot -i
# Look for: textbox "The editor is not accessible..." [ref=e62]
agent-browser type @e62 "Hello from George!"
```

However, **`type @ref` silently fails on Code OSS**: the command completes without error but no text appears. This also applies to `keyboard type` and `keyboard inserttext`. Always verify text appeared after typing, and fall back to the keyboard shortcut + `press` pattern if it didn't. The `press`-per-key approach works universally across all builds.
Compatibility Matrix
| Method | VS Code Insiders | Code OSS |
|---|---|---|
| Focus shortcut + per-key `press` | ✅ Works | ✅ Works |
| `type @ref` | ✅ Works | ❌ Silent fail |
| `keyboard type` | ✅ Works | ❌ Silent fail |
| `keyboard inserttext` | ✅ Works | ❌ Silent fail |
| `click` | ❌ Blocked by overlay | ❌ Blocked by overlay |
| `fill` | ❌ Element not visible | ❌ Element not visible |
Fallback: Focus via JavaScript Mouse Events
If the keyboard shortcut doesn't work (e.g., chat panel isn't configured), you can focus the editor via JavaScript:
```bash
agent-browser eval '
(() => {
  const inputPart = document.querySelector(".interactive-input-part");
  const editor = inputPart.querySelector(".monaco-editor");
  const rect = editor.getBoundingClientRect();
  const x = rect.x + rect.width / 2;
  const y = rect.y + rect.height / 2;
  editor.dispatchEvent(new MouseEvent("mousedown", { bubbles: true, clientX: x, clientY: y }));
  editor.dispatchEvent(new MouseEvent("mouseup", { bubbles: true, clientX: x, clientY: y }));
  editor.dispatchEvent(new MouseEvent("click", { bubbles: true, clientX: x, clientY: y }));
  return "activeElement: " + document.activeElement?.className;
})()'

# Then use press for each character
agent-browser press H
agent-browser press e
# ...
```
Verifying Text and Clearing
```bash
# Verify text in the chat input
agent-browser eval '
(() => {
  const sidebar = document.querySelector(".part.auxiliarybar");
  const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
  return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'

# Clear the input (Select All + Backspace)
# macOS:
agent-browser press Meta+a
# Linux / Windows:
agent-browser press Control+a
# Then delete:
agent-browser press Backspace
```
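Scripts that run on more than one platform need to pick the right key chord for both focusing and clearing the chat input. A minimal sketch follows, assuming `uname -s` reports `Darwin` on macOS and that the script runs on the same machine as VS Code; `chat_focus_key` and `select_all_key` are hypothetical helper names. Each accepts an optional OS name so the mapping can be checked without switching machines.

```bash
# Hypothetical helper: print the chat-focus chord for the given OS
# (defaults to the current one). macOS uses Control+Meta+i; Linux and
# Windows use Control+Alt+i, per the shortcut list in this guide.
chat_focus_key() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) printf '%s\n' 'Control+Meta+i' ;;
    *)      printf '%s\n' 'Control+Alt+i' ;;
  esac
}

# Hypothetical helper: print the Select All chord for the given OS.
select_all_key() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) printf '%s\n' 'Meta+a' ;;
    *)      printf '%s\n' 'Control+a' ;;
  esac
}

# Usage: focus the chat input, then clear whatever is in it.
#   agent-browser press "$(chat_focus_key)"
#   agent-browser press "$(select_all_key)"
#   agent-browser press Backspace
```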
Screenshot Tips for VS Code
On ultrawide monitors, the chat sidebar may be in the far-right corner of the CDP screenshot. Options:
- Use `agent-browser screenshot --full` to capture the entire window
- Use element screenshots: `agent-browser screenshot ".part.auxiliarybar" sidebar.png`
- Use `agent-browser screenshot --annotate` to see labeled element positions
- Maximize the sidebar first: click the "Maximize Secondary Side Bar" button

macOS: If `agent-browser screenshot` returns "Permission denied", your terminal needs Screen Recording permission. Grant it in System Settings → Privacy & Security → Screen Recording. As a fallback, use the `eval` verification snippet to confirm text was entered; this doesn't require screen permissions.
Troubleshooting
"Connection refused" or "Cannot connect"
- Make sure Code OSS was launched with `--remote-debugging-port=NNNN`
- If Code OSS was already running, quit and relaunch with the flag
- Check that the port isn't in use by another process:
  - macOS / Linux: `lsof -i :9224`
  - Windows: `netstat -ano | findstr 9224`
Elements not appearing in snapshot
- VS Code uses multiple webviews. Use `agent-browser tab` to list targets and switch to the right one
- Use `agent-browser snapshot -i -C` to include cursor-interactive elements (divs with onclick handlers)
Cannot type in Monaco Editor inputs
- Use `agent-browser press` for individual keystrokes after focusing the input. Focus the chat input with the keyboard shortcut (macOS: `Ctrl+Cmd+I`, Linux/Windows: `Ctrl+Alt+I`).
- `type @ref`, `keyboard type`, and `keyboard inserttext` work on VS Code Insiders but silently fail on Code OSS: they complete without error but no text appears. The `press`-per-key approach works universally.
- See the "Interacting with Monaco Editor" section above for the full compatibility matrix.
Cleanup / Disconnect
⚠️ IMPORTANT: Always quit Code OSS when you're done. Code OSS is a full Electron app that consumes significant memory (often 1–4 GB+). Leaving it running in the background will slow your machine considerably. Don't just disconnect agent-browser — kill the Code OSS process too.
```bash
# 1. Disconnect agent-browser
agent-browser close

# 2. QUIT Code OSS: do not leave it running!
# macOS: Cmd+Q in the app window, or:
# Find the process
lsof -i :9224 | grep LISTEN
# Kill it (replace <PID> with the actual PID)
kill <PID>

# Linux:
kill $(lsof -t -i :9224)

# Windows:
taskkill /F /PID <PID>
# Or use Task Manager to end "Code - OSS"
```

If you launched with `./scripts/code.sh`, the process name is `Electron` or `Code - OSS`. Verify it's gone:

```bash
# Confirm no process is listening on the debug port
lsof -i :9224  # should return nothing
```