Loading...
Loading...
Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: web scraping, form filling, clicking, typing, drag-drop, file upload, JavaScript execution. Use for: web automation, data extraction, testing, agent browsing, research. Triggers: browser, web automation, scrape, navigate, click, fill form, screenshot, browse web, playwright, headless browser, web agent, surf internet, record video
npx skill4agent add inference-sh-6/skills agent-browser@e
# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login
# Open a page and get interactive elements
infsh app run agent-browser --function open --input '{"url": "https://example.com"}' --session newInstall note: The install script only detects your OS/architecture, downloads the matching binary from, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.dist.inference.sh
@e# 1. Start session
RESULT=$(infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com/login"
}')
SESSION_ID=$(echo $RESULT | jq -r '.session_id')
# Elements: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"
# 2. Fill and submit
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
"action": "fill", "ref": "@e1", "text": "user@example.com"
}'
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
"action": "fill", "ref": "@e2", "text": "password123"
}'
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
"action": "click", "ref": "@e3"
}'
# 3. Re-snapshot after navigation
infsh app run agent-browser --function snapshot --session $SESSION_ID --input '{}'
# 4. Close when done
infsh app run agent-browser --function close --session $SESSION_ID --input '{}'| Function | Description |
|---|---|
| Navigate to URL, configure browser (viewport, proxy, video recording) |
| Re-fetch page state with |
| Perform actions using |
| Take page screenshot (viewport or full page) |
| Run JavaScript code on the page |
| Close session, returns video if recording was enabled |
| Action | Description | Required Fields |
|---|---|---|
| Click element | |
| Double-click element | |
| Clear and type text | |
| Type text (no clear) | |
| Press key (Enter, Tab, etc.) | |
| Select dropdown option | |
| Hover over element | |
| Check checkbox | |
| Uncheck checkbox | |
| Drag and drop | |
| Upload file(s) | |
| Scroll page | |
| Go back in history | - |
| Wait milliseconds | |
| Navigate to URL | |
@e@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"# Start with recording enabled (optionally show cursor indicator)
SESSION=$(infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com",
"record_video": true,
"show_cursor": true
}' | jq -r '.session_id')
# ... perform actions ...
# Close to get the video file
infsh app run agent-browser --function close --session $SESSION --input '{}'
# Returns: {"success": true, "video": <File>}infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com",
"show_cursor": true,
"record_video": true
}'infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com",
"proxy_url": "http://proxy.example.com:8080",
"proxy_username": "user",
"proxy_password": "pass"
}'infsh app run agent-browser --function interact --session $SESSION --input '{
"action": "upload",
"ref": "@e5",
"file_paths": ["/path/to/file.pdf"]
}'infsh app run agent-browser --function interact --session $SESSION --input '{
"action": "drag",
"ref": "@e1",
"target_ref": "@e2"
}'infsh app run agent-browser --function execute --session $SESSION --input '{
"code": "document.querySelectorAll(\"h2\").length"
}'
# Returns: {"result": "5", "screenshot": <File>}| Reference | Description |
|---|---|
| references/commands.md | Full function reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Session persistence, parallel sessions |
| references/authentication.md | Login flows, OAuth, 2FA handling |
| references/video-recording.md | Recording workflows for debugging |
| references/proxy-support.md | Proxy configuration, geo-testing |
| Template | Description |
|---|---|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse session |
| templates/capture-workflow.sh | Content extraction with screenshots |
SESSION=$(infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com/contact"
}' | jq -r '.session_id')
# Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea], @e4 [button] "Send"
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "click", "ref": "@e4"}'
infsh app run agent-browser --function snapshot --session $SESSION --input '{}'
infsh app run agent-browser --function close --session $SESSION --input '{}'SESSION=$(infsh app run agent-browser --function open --session new --input '{
"url": "https://google.com"
}' | jq -r '.session_id')
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "press", "text": "Enter"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "wait", "wait_ms": 2000}'
infsh app run agent-browser --function snapshot --session $SESSION --input '{}'
infsh app run agent-browser --function close --session $SESSION --input '{}'SESSION=$(infsh app run agent-browser --function open --session new --input '{
"url": "https://example.com",
"record_video": true
}' | jq -r '.session_id')
# Take full page screenshot
infsh app run agent-browser --function screenshot --session $SESSION --input '{
"full_page": true
}'
# Close and get video
RESULT=$(infsh app run agent-browser --function close --session $SESSION --input '{}')
echo $RESULT | jq '.video'--session newsession_id# Web search (for research + browse)
npx skills add inference-sh/skills@web-search
# LLM models (analyze extracted content)
npx skills add inference-sh/skills@llm-models