Agentic Browser
Browser automation for AI agents via inference.sh.
Quick Start
```bash
curl -fsSL https://cli.inference.sh | sh && infsh login

# Open a page and get interactive elements
infsh app run agentic-browser --function open --input '{"url": "https://example.com"}' --session new
```
Core Workflow
Every browser automation follows this pattern:
- Open: Navigate to URL, get element refs
- Snapshot: Re-fetch elements after DOM changes
- Interact: Use `@e` refs to click, fill, etc.
- Re-snapshot: After navigation, get fresh refs
```bash
# Start session
RESULT=$(infsh app run agentic-browser --function open --session new --input '{
  "url": "https://example.com/login"
}')
SESSION_ID=$(echo "$RESULT" | jq -r '.session_id')

# Elements returned like: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"

# Fill form
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "fill", "ref": "@e1", "text": "user@example.com"
}'
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "fill", "ref": "@e2", "text": "password123"
}'

# Click submit
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "click", "ref": "@e3"
}'

# Close when done
infsh app run agentic-browser --function close --session "$SESSION_ID" --input '{}'
```
Functions
open
Navigate to URL and configure browser. Returns page snapshot with `@e` refs.
```bash
infsh app run agentic-browser --function open --session new --input '{
  "url": "https://example.com",
  "width": 1280,
  "height": 720,
  "user_agent": "Mozilla/5.0..."
}'
```
Returns:
- `url`: Current page URL
- `title`: Page title
- `elements`: List of interactive elements with `@e` refs
- `screenshot`: Page screenshot (for vision agents)
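
One way to read these fields, assuming `open` prints a JSON object with them as top-level keys (the `.session_id` lookups in this document suggest JSON output, but the exact shape is an assumption). The sample values below are made up so the snippet runs without a live session:

```bash
# Hypothetical sample standing in for the JSON printed by the open function.
RESULT='{"session_id": "sess-123", "url": "https://example.com", "title": "Example Domain"}'

# Extract individual fields with jq; -r prints raw strings without JSON quotes.
SESSION_ID=$(echo "$RESULT" | jq -r '.session_id')
PAGE_URL=$(echo "$RESULT" | jq -r '.url')
PAGE_TITLE=$(echo "$RESULT" | jq -r '.title')

echo "session=$SESSION_ID url=$PAGE_URL title=$PAGE_TITLE"
```

In a real run, `RESULT` would come from the `open` call itself rather than a literal.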
snapshot
Re-fetch page state after DOM changes. Always call after clicks that navigate.
```bash
infsh app run agentic-browser --function snapshot --session "$SESSION_ID" --input '{}'
```
interact
Interact with elements using `@e` refs from snapshot.
| Action | Description | Required Fields |
|---|---|---|
| `click` | Click element | `ref` |
| `fill` | Clear and type text | `ref`, `text` |
| | Type text (no clear) | |
| `press` | Press key | `text` |
| | Select dropdown | |
| | Hover over element | |
| `scroll` | Scroll page | `direction` |
| | Go back in history | - |
| `wait` | Wait milliseconds | `wait_ms` |
```bash
# Click
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "click", "ref": "@e5"
}'

# Fill input
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "fill", "ref": "@e1", "text": "hello@example.com"
}'

# Press Enter
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "press", "text": "Enter"
}'

# Scroll down
infsh app run agentic-browser --function interact --session "$SESSION_ID" --input '{
  "action": "scroll", "direction": "down"
}'
```
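
Hand-written `--input` JSON gets fragile once the text to type contains quotes or backslashes. A minimal sketch of building it with `jq` instead, assuming `jq` is installed (this document already uses it to read `session_id`); `fill_input` is a hypothetical helper, not part of the `infsh` CLI:

```bash
# fill_input (hypothetical helper): print the JSON input for a fill action.
# jq escapes the ref and text values, so embedded quotes stay valid JSON.
fill_input() {
  jq -cn --arg ref "$1" --arg text "$2" '{action: "fill", ref: $ref, text: $text}'
}

INPUT=$(fill_input "@e1" 'She said "hi"')
printf '%s\n' "$INPUT"
```

The resulting string can then be passed as `--input "$INPUT"` in place of a hand-written literal.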
screenshot
Take page screenshot.
```bash
infsh app run agentic-browser --function screenshot --session "$SESSION_ID" --input '{
  "full_page": true
}'
```
execute
Run JavaScript on the page.
```bash
infsh app run agentic-browser --function execute --session "$SESSION_ID" --input '{
  "code": "document.title"
}'
```
close
Close browser session.
```bash
infsh app run agentic-browser --function close --session "$SESSION_ID" --input '{}'
```
Element Refs
Elements are returned with `@e` refs like:
```
@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
```
Important: Refs are invalidated after navigation. Always re-snapshot after:
- Clicking links/buttons that navigate
- Form submissions
- Dynamic content loading
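
Because refs go stale, it can help to look a ref up by its label after each snapshot instead of hardcoding it. A small sketch, assuming the element list is available as text lines in the format shown above; `find_ref` and the `ELEMENTS` variable are hypothetical names for illustration:

```bash
# Sample element lines in the format shown above.
ELEMENTS='@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"'

# find_ref (hypothetical helper): print the ref of the first line
# containing the given text, matched literally.
find_ref() {
  printf '%s\n' "$ELEMENTS" | grep -F -- "$1" | head -n 1 | cut -d ' ' -f 1
}

find_ref '"Submit"'   # prints @e3
```

After a re-snapshot, refreshing `ELEMENTS` and calling `find_ref` again yields whichever ref the element currently has.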
Examples
Form Submission
```bash
SESSION=$(infsh app run agentic-browser --function open --session new --input '{
  "url": "https://example.com/contact"
}' | jq -r '.session_id')

# Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea] "Message", @e4 [button] "Send"

infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "click", "ref": "@e4"}'

# Check result
infsh app run agentic-browser --function snapshot --session "$SESSION" --input '{}'
infsh app run agentic-browser --function close --session "$SESSION" --input '{}'
```
Search and Extract
```bash
SESSION=$(infsh app run agentic-browser --function open --session new --input '{
  "url": "https://google.com"
}' | jq -r '.session_id')

# Fill search box and submit
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "press", "text": "Enter"}'
infsh app run agentic-browser --function interact --session "$SESSION" --input '{"action": "wait", "wait_ms": 2000}'

# Get results page
infsh app run agentic-browser --function snapshot --session "$SESSION" --input '{}'
infsh app run agentic-browser --function close --session "$SESSION" --input '{}'
```
Extract Data with JavaScript
```bash
infsh app run agentic-browser --function execute --session "$SESSION" --input '{
  "code": "Array.from(document.querySelectorAll(\"h2\")).map(h => h.textContent)"
}'
```
Sessions
Browser state persists within a session. Always:
- Start with `--session new` on first call
- Use returned `session_id` for subsequent calls
- Close session when done
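
The last step is easy to miss when a script stops early. One possible pattern is to register the close call with the shell's `trap` builtin so it runs when the script exits. This is a sketch, not an official recipe; `close_session` is a hypothetical helper built from the `close` call shown earlier:

```bash
SESSION_ID=""

# close_session (hypothetical helper): close the session if one is open.
# Clearing SESSION_ID afterwards makes a second call a no-op.
close_session() {
  if [ -n "$SESSION_ID" ]; then
    infsh app run agentic-browser --function close --session "$SESSION_ID" --input '{}'
    SESSION_ID=""
  fi
}

# Run close_session when the shell exits.
trap close_session EXIT

# Typical use (not executed here):
#   SESSION_ID=$(infsh app run agentic-browser --function open --session new \
#     --input '{"url": "https://example.com"}' | jq -r '.session_id')
#   infsh app run agentic-browser --function snapshot --session "$SESSION_ID" --input '{}'
```

With the trap in place, the script body only needs to set `SESSION_ID` after the first `open` call.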
Related Skills
```bash
# Web search (for research + browse)
npx skills add inference-sh/skills@web-search

# LLM models (analyze extracted content)
npx skills add inference-sh/skills@llm-models
```
Documentation
- inference.sh Sessions - Session management
- Multi-function Apps - How functions work