pilotty
Terminal Automation with pilotty
CRITICAL: Argument Positioning

All flags (`--name`, `-s`, `--format`, etc.) MUST come BEFORE positional arguments:

```bash
# CORRECT - flags before command/arguments
pilotty spawn --name myapp vim file.txt
pilotty key -s myapp Enter
pilotty snapshot -s myapp --format text

# WRONG - flags after command (they get passed to the app, not pilotty!)
pilotty spawn vim file.txt --name myapp   # FAILS: --name goes to vim
pilotty key Enter -s myapp                # FAILS: -s goes nowhere useful
```

This is the #1 cause of agent failures. When in doubt: **flags first, then command/args**.
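One way to make the flags-first rule hard to violate in scripts is to route calls through small wrapper functions that always emit flags before positional arguments. The sketch below is an illustration only: `spawn_named` and `in_session` are hypothetical helper names, not part of pilotty.

```bash
# Hypothetical helpers (not part of pilotty): they always place flags
# before positional arguments, as the rule above requires.

# spawn_named <session> <command> [args...]
spawn_named() {
  name=$1
  shift
  pilotty spawn --name "$name" "$@"
}

# in_session <session> <subcommand> [args...]
# e.g. in_session myapp key Enter  ->  pilotty key -s myapp Enter
in_session() {
  session=$1
  subcommand=$2
  shift 2
  pilotty "$subcommand" -s "$session" "$@"
}
```

With these in place, `in_session myapp snapshot --format text` expands to `pilotty snapshot -s myapp --format text`, so the session flag can never end up after a positional argument.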
---

Quick start

```bash
pilotty spawn vim file.txt      # Start TUI app in managed session
pilotty wait-for "file.txt"     # Wait for app to be ready
pilotty snapshot                # Get screen state with UI elements
pilotty key i                   # Enter insert mode
pilotty type "Hello, World!"    # Type text
pilotty key Escape              # Exit insert mode
pilotty kill                    # End session
```
Core workflow

- Spawn: `pilotty spawn <command>` starts the app in a background PTY
- Wait: `pilotty wait-for <text>` ensures the app is ready
- Snapshot: `pilotty snapshot` returns screen state with detected UI elements
- Understand: Parse `elements[]` to identify buttons, inputs, toggles
- Interact: Use keyboard commands (`key`, `type`) to navigate and interact
- Re-snapshot: Check `content_hash` to detect screen changes
Commands

Session management

```bash
pilotty spawn <command>           # Start TUI app (e.g., pilotty spawn htop)
pilotty spawn --name myapp <cmd>  # Start with custom session name (--name before command)
pilotty kill                      # Kill default session
pilotty kill -s myapp             # Kill specific session
pilotty list-sessions             # List all active sessions
pilotty daemon                    # Manually start daemon (usually auto-starts)
pilotty shutdown                  # Stop daemon and all sessions
pilotty examples                  # Show end-to-end workflow example
```
Screen capture

```bash
pilotty snapshot                   # Full JSON with text content and elements
pilotty snapshot --format compact  # JSON without text field
pilotty snapshot --format text     # Plain text with cursor indicator
pilotty snapshot -s myapp          # Snapshot specific session

# Wait for screen to change (eliminates need for sleep!)
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH              # Block until screen changes
pilotty snapshot --await-change $HASH --settle 50  # Wait for 50ms stability
```
Input

```bash
pilotty type "hello"             # Type text at cursor
pilotty type -s myapp "text"     # Type in specific session

pilotty key Enter                # Press Enter
pilotty key Ctrl+C               # Send interrupt
pilotty key Escape               # Send Escape
pilotty key Tab                  # Send Tab
pilotty key F1                   # Function key
pilotty key Alt+F                # Alt combination
pilotty key Up                   # Arrow key
pilotty key -s myapp Ctrl+S      # Key in specific session

# Key sequences (space-separated, sent in order)
pilotty key "Ctrl+X m"                # Emacs chord: Ctrl+X then m
pilotty key "Escape : w q Enter"      # vim :wq sequence
pilotty key --delay 50 "a b c"        # Send a, b, c with 50ms delay (flag first)
pilotty key -s myapp "Tab Tab Enter"  # Sequence in specific session
```
Interaction

```bash
pilotty click 5 10               # Click at row 5, col 10
pilotty click -s myapp 10 20     # Click in specific session
pilotty scroll up                # Scroll up 1 line
pilotty scroll down 5            # Scroll down 5 lines
pilotty scroll -s myapp up 10    # Scroll in specific session (flag first)
```
Terminal control

```bash
pilotty resize 120 40            # Resize terminal to 120 cols x 40 rows
pilotty resize -s myapp 80 24    # Resize specific session

pilotty wait-for "Ready"         # Wait for text to appear (30s default)
pilotty wait-for -r "Error"      # Wait for regex pattern
pilotty wait-for -t 5000 "Done"  # Wait with 5s timeout
pilotty wait-for -s editor "~"   # Wait in specific session
```
Global options

| Option | Description |
|---|---|
| `-s` | Target specific session (default: "default") |
| `--format` | Snapshot format: full, compact, text |
| `-t, --timeout` | Timeout in ms for wait-for and await-change (default: 30000) |
| `-r` | Treat wait-for pattern as regex |
| `--name` | Session name for spawn command |
| `--delay` | Delay in ms between keys in a sequence (default: 0, max: 10000) |
| `--await-change` | Block snapshot until content_hash differs |
| `--settle` | Wait for screen to be stable for this many ms (default: 0) |
Environment variables

```bash
PILOTTY_SESSION="mysession"        # Default session name
PILOTTY_SOCKET_DIR="/tmp/pilotty"  # Override socket directory
RUST_LOG="debug"                   # Enable debug logging
```
Snapshot Output

The `snapshot` command returns structured JSON with detected UI elements:

```json
{
  "snapshot_id": 42,
  "size": { "cols": 80, "rows": 24 },
  "cursor": { "row": 5, "col": 10, "visible": true },
  "text": "Settings:\n  [x] Notifications [ ] Dark mode\n  [Save]  [Cancel]",
  "elements": [
    { "kind": "toggle", "row": 1, "col": 2, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
    { "kind": "toggle", "row": 1, "col": 20, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
    { "kind": "button", "row": 2, "col": 2, "width": 6, "text": "[Save]", "confidence": 0.8 },
    { "kind": "button", "row": 2, "col": 10, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
  ],
  "content_hash": 12345678901234567890
}
```

Use `--format text` for a plain text view with cursor indicator:

```
--- Terminal 80x24 | Cursor: (5, 10) ---
bash-3.2$ [_]
```

The `[_]` shows cursor position. Use the text content to understand screen state and navigate with keyboard commands.
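If a script needs the cursor position without parsing JSON, the header line of the text format can be picked apart with `sed`. This is only a sketch: it assumes the header always looks exactly like the example above (`--- Terminal COLSxROWS | Cursor: (ROW, COL) ---`) and that the pair is ordered row, then column, which the example suggests but does not state. `cursor_from_header` is a hypothetical helper name; the JSON `cursor` field is likely the more robust source.

```bash
# cursor_from_header: read a text-format snapshot on stdin and print
# "<row> <col>" taken from its first line.
# Assumes the header format shown in the example and a (row, col) ordering.
cursor_from_header() {
  sed -n '1s/^--- Terminal [0-9]*x[0-9]* | Cursor: (\([0-9]*\), \([0-9]*\)) ---$/\1 \2/p'
}

# Usage (requires a running session):
#   pilotty snapshot --format text | cursor_from_header
```

If the header does not match the assumed shape, the function prints nothing, so callers can treat empty output as "unknown".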
Element Detection

pilotty automatically detects interactive UI elements in terminal applications. Elements provide read-only context to help understand UI structure.
Element Kinds

| Kind | Detection Patterns | Confidence | Fields |
|---|---|---|---|
| toggle | `[x]`, `[ ]` | 1.0 | `checked` |
| button | Inverse video, bracket patterns such as `[Save]` | 1.0 / 0.8 | |
| input | Cursor position, underscore fields | 1.0 / 0.6 | |
Element Fields

| Field | Type | Description |
|---|---|---|
| `kind` | string | Element type: `toggle`, `button`, `input` |
| `row` | number | Row position (0-based from top) |
| `col` | number | Column position (0-based from left) |
| `width` | number | Width in terminal cells (CJK chars = 2) |
| `text` | string | Text content of the element |
| `confidence` | number | Detection confidence (0.0-1.0) |
| `focused` | bool | Whether element has focus (only present if true) |
| `checked` | bool | Toggle state (only present for toggles) |
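For mouse-enabled TUIs, the `row`, `col`, and `width` fields are enough to aim a `pilotty click` at the middle of an element. A minimal sketch, assuming `click` takes the same 0-based row and column as the element fields (the text here does not state this explicitly); `click_target` is a hypothetical helper, not a pilotty command.

```bash
# click_target <row> <col> <width>
# Prints "<row> <center_col>" for an element, using integer division.
click_target() {
  row=$1
  col=$2
  width=$3
  # A 6-cell element starting at col 2 spans cols 2-7 and centers on col 5
  echo "$row $((col + width / 2))"
}

# Usage with the [Save] button from the snapshot example
# (row 2, col 2, width 6); the unquoted expansion splits into two arguments:
#   pilotty click $(click_target 2 2 6)
```

Keyboard navigation remains the more reliable route; this is only useful where the application actually handles mouse events.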
Confidence Levels
| Confidence | Meaning |
|---|---|
| 1.0 | High confidence: Cursor position, inverse video, checkbox patterns |
| 0.8 | Medium confidence: Bracket patterns |
| 0.6 | Lower confidence: Underscore input fields |
Wait for Screen Changes (Recommended)

Stop guessing sleep durations! Use `--await-change` to wait for the screen to actually update:

```bash
# Capture baseline hash
HASH=$(pilotty snapshot | jq '.content_hash')

# Perform action
pilotty key Enter

# Wait for screen to change (blocks until hash differs)
pilotty snapshot --await-change $HASH

# Or wait for screen to stabilize (for apps that render progressively)
pilotty snapshot --await-change $HASH --settle 100
```

**Flags:**

| Flag | Description |
|------|-------------|
| `--await-change <HASH>` | Block until `content_hash` differs from this value |
| `--settle <MS>` | After change detected, wait for screen to be stable for MS |
| `-t, --timeout <MS>` | Maximum wait time (default: 30000) |

**Why this is better than sleep:**

- `sleep 1` is a guess - too short causes race conditions, too long slows automation
- `--await-change` waits exactly as long as needed - no more, no less
- `--settle` handles apps that render progressively (show partial, then complete)
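The capture, act, await sequence above can be wrapped in a single function so every action gets the same treatment. A minimal sketch, assuming `jq` is installed and the default session is in use; `act_and_wait` is a hypothetical helper name, not a pilotty command.

```bash
# act_and_wait <settle_ms> <subcommand> [args...]
# Hypothetical helper: runs one pilotty action against the default session
# and blocks until the screen has changed and settled.
act_and_wait() {
  settle_ms=$1
  shift
  # Baseline hash of the screen before the action
  baseline=$(pilotty snapshot | jq -r '.content_hash') || return 1
  # The action itself, e.g. "key Enter" or "type hello"
  pilotty "$@" || return 1
  # Prints the post-change snapshot JSON on stdout
  pilotty snapshot --await-change "$baseline" --settle "$settle_ms"
}

# Usage:
#   act_and_wait 50 key Enter | jq '.elements'
```

Because the function prints the post-change snapshot, its output can be piped straight into the next analysis step.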
Waiting for Streaming AI Responses

When interacting with AI-powered TUIs (like opencode, etc.) that stream responses, you need a longer `--settle` time since the screen keeps updating as tokens arrive:

```bash
# 1. Capture hash before sending prompt
HASH=$(pilotty snapshot -s myapp | jq -r '.content_hash')

# 2. Type prompt and submit
pilotty type -s myapp "write me a poem about ai agents"
pilotty key -s myapp Enter

# 3. Wait for streaming response to complete
#    - Use longer settle (2-3s) since AI apps pause between chunks
#    - Extend timeout for long responses (60s+)
pilotty snapshot -s myapp --await-change "$HASH" --settle 3000 -t 60000

# 4. Response may be scrolled - scroll up if needed to see full output
pilotty scroll -s myapp up 10
pilotty snapshot -s myapp --format text
```

**Key parameters for streaming:**

- `--settle` of 2000-3000: AI responses pause between chunks; 2-3 seconds without changes is a good sign that streaming is done
- `-t 60000`: Extend timeout beyond the 30s default for longer generations
- Long responses may scroll the terminal; use `scroll up` to see the beginning
- The settle timer resets on each screen change, so it naturally waits until streaming stops
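To see why a short settle value returns too early on a streaming response, it helps to model the timer. The function below is a toy simulation, not pilotty's implementation: it assumes the settle timer restarts on every screen change, as described above, and takes made-up change timestamps in milliseconds (ascending, at least one).

```bash
# settle_done_at <settle_ms> <change_ms>...
# Toy model: prints the time at which a settle wait would finish, given
# the times (ms, ascending) at which the screen changed.
settle_done_at() {
  settle=$1
  shift
  last=$1
  shift
  for t in "$@"; do
    # A quiet gap of at least $settle ms ends the wait before this change
    if [ $((t - last)) -ge "$settle" ]; then
      break
    fi
    last=$t   # change arrived in time: the timer restarts from here
  done
  echo $((last + settle))
}

# Chunks arrive at 0, 500, 1200 and (after a pause) 4000 ms:
#   settle_done_at 1000 0 500 1200 4000   # prints 2200: returns before the last chunk
#   settle_done_at 3000 0 500 1200 4000   # prints 7000: returns after the last chunk
```

In this model a 1-second settle mistakes the 2.8-second pause for completion, while a 3-second settle rides through it, which is the reasoning behind the 2-3 second recommendation.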
Manual Change Detection

For manual polling (not recommended), use `content_hash` directly:

```bash
# Get initial state
SNAP1=$(pilotty snapshot)
HASH1=$(echo "$SNAP1" | jq -r '.content_hash')

# Perform action
pilotty key Tab

# Check if screen changed
SNAP2=$(pilotty snapshot)
HASH2=$(echo "$SNAP2" | jq -r '.content_hash')

if [ "$HASH1" != "$HASH2" ]; then
  echo "Screen changed - re-analyze elements"
fi
```
Using Elements Effectively

Elements are read-only context for understanding the UI. Use keyboard navigation for reliable interaction:

```bash
# 1. Get snapshot to understand UI structure
pilotty snapshot | jq '.elements'
# Output shows toggles (checked/unchecked) and buttons with positions

# 2. Navigate and interact with keyboard (reliable approach)
pilotty key Tab          # Move to next element
pilotty key Space        # Toggle checkbox
pilotty key Enter        # Activate button

# 3. Verify state changed
pilotty snapshot | jq '.elements[] | select(.kind == "toggle")'
```

**Key insight**: Use elements to understand WHAT is on screen, use keyboard to interact with it.
---

Navigation Approach

pilotty uses keyboard-first navigation, just like a human would:

```bash
# 1. Take snapshot to see the screen
pilotty snapshot --format text

# 2. Navigate using keyboard
pilotty key Tab           # Move to next element
pilotty key Enter         # Activate/select
pilotty key Escape        # Cancel/back
pilotty key Up            # Move up in list/menu
pilotty key Space         # Toggle checkbox

# 3. Type text when needed
pilotty type "search term"
pilotty key Enter

# 4. Click at coordinates for mouse-enabled TUIs
pilotty click 5 10        # Click at row 5, col 10
```

**Key insight**: Parse the snapshot text and elements to understand what's on screen, then use keyboard commands to navigate. This works reliably across all TUI applications.
---

Example: Edit file with vim

```bash
# 1. Spawn vim
pilotty spawn --name editor vim /tmp/hello.txt

# 2. Wait for vim to load and capture baseline hash
pilotty wait-for -s editor "hello.txt"
HASH=$(pilotty snapshot -s editor | jq '.content_hash')

# 3. Enter insert mode
pilotty key -s editor i

# 4. Type content
pilotty type -s editor "Hello from pilotty!"

# 5. Wait for screen to update, then exit (no sleep needed!)
pilotty snapshot -s editor --await-change $HASH --settle 50
pilotty key -s editor "Escape : w q Enter"

# 6. Verify session ended
pilotty list-sessions
```

Alternative using individual keys:

```bash
pilotty key -s editor Escape
pilotty type -s editor ":wq"
pilotty key -s editor Enter
```
Example: Dialog checklist interaction

```bash
# 1. Spawn dialog checklist (--name before command)
pilotty spawn --name opts dialog --checklist "Select features:" 12 50 4 \
    "notifications" "Push notifications" on \
    "darkmode" "Dark mode theme" off \
    "autosave" "Auto-save documents" on \
    "telemetry" "Usage analytics" off

# 2. Wait for dialog to render (use --settle, not sleep!)
pilotty snapshot -s opts --settle 200  # Wait for initial render to stabilize

# 3. Get snapshot and examine elements, capture hash
SNAP=$(pilotty snapshot -s opts)
echo "$SNAP" | jq '.elements[] | select(.kind == "toggle")'
HASH=$(echo "$SNAP" | jq '.content_hash')

# 4. Navigate to "darkmode" and toggle it
pilotty key -s opts Down      # Move to second option
pilotty key -s opts Space     # Toggle it on

# 5. Wait for change and verify
pilotty snapshot -s opts --await-change $HASH | jq '.elements[] | select(.kind == "toggle") | {text, checked}'

# 6. Confirm selection
pilotty key -s opts Enter

# 7. Clean up
pilotty kill -s opts
```
Example: Form filling with elements

```bash
# 1. Spawn a form application
pilotty spawn --name form my-form-app

# 2. Get snapshot to understand form structure
pilotty snapshot -s form | jq '.elements'
# Shows inputs, toggles, and buttons with positions for click command

# 3. Type into first input (likely already focused)
pilotty type -s form "myusername"

# 4. Tab to password field
pilotty key -s form Tab
pilotty type -s form "mypassword"

# 5. Tab to remember me and toggle
pilotty key -s form Tab
pilotty key -s form Space

# 6. Tab to Login and activate
pilotty key -s form Tab
pilotty key -s form Enter

# 7. Check result
pilotty snapshot -s form --format text
```
Example: Monitor with htop

```bash
# 1. Spawn htop
pilotty spawn --name monitor htop

# 2. Wait for display
pilotty wait-for -s monitor "CPU"

# 3. Take snapshot to see current state
pilotty snapshot -s monitor --format text

# 4. Send commands
pilotty key -s monitor F9     # Kill menu
pilotty key -s monitor q      # Quit

# 5. Kill session
pilotty kill -s monitor
```
Example: Interact with AI TUI (opencode, etc.)

AI-powered TUIs stream responses, requiring special handling:

```bash
# 1. Spawn the AI app
pilotty spawn --name ai opencode

# 2. Wait for the prompt to be ready
pilotty wait-for -s ai -t 15000 "Ask anything"

# 3. Capture baseline hash
HASH=$(pilotty snapshot -s ai | jq -r '.content_hash')

# 4. Type prompt and submit
pilotty type -s ai "explain the architecture of this codebase"
pilotty key -s ai Enter

# 5. Wait for streaming response to complete
#    - settle=3000: Wait 3s of no changes to ensure streaming is done
#    - timeout=60000: Allow up to 60s for long responses
pilotty snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text

# 6. If response is long and scrolled, scroll up to see full output
pilotty scroll -s ai up 20
pilotty snapshot -s ai --format text

# 7. Clean up
pilotty kill -s ai
```

**Gotchas with AI apps:**

- Use `--settle` of 2000-3000 because AI responses pause between chunks
- Extend timeout with `-t 60000` for complex prompts
- Long responses may scroll the terminal; use `scroll up` to see the beginning
- The settle timer resets on each screen update, so it waits for true completion
---

Sessions

Each session is isolated with its own:

- PTY (pseudo-terminal)
- Screen buffer
- Child process

```bash
# Run multiple apps (--name must come before the command)
pilotty spawn --name monitoring htop
pilotty spawn --name editor vim file.txt

# Target specific session
pilotty snapshot -s monitoring
pilotty key -s editor Ctrl+S

# List all
pilotty list-sessions

# Kill specific
pilotty kill -s editor
```

The first session spawned without `--name` is automatically named `default`.

> **Important:** The `--name` flag must come **before** the command. Everything after the command is passed as arguments to that command.
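Because sessions are shared across CLI calls, a script that spawns several of them can leave them running if it exits early. One possible guard is a cleanup function run from an `EXIT` trap. `kill_sessions` is a hypothetical helper built only from the `kill -s` command shown above.

```bash
# kill_sessions <name>...
# Hypothetical helper: kills each named session, ignoring ones already gone.
kill_sessions() {
  for session_name in "$@"; do
    pilotty kill -s "$session_name" 2>/dev/null || true
  done
}

# Usage:
#   trap 'kill_sessions monitoring editor' EXIT
#   pilotty spawn --name monitoring htop
#   pilotty spawn --name editor vim file.txt
```

The `|| true` keeps the loop going when a session has already ended, so one missing session does not prevent the others from being cleaned up.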
Daemon Architecture

pilotty uses a background daemon for session management:

- Auto-start: Daemon starts on first command
- Auto-stop: Shuts down after 5 minutes with no sessions
- Session cleanup: Sessions removed when process exits (within 500ms)
- Shared state: Multiple CLI calls share sessions

You rarely need to manage the daemon manually.
Error Handling

Errors include actionable suggestions:

```json
{
  "code": "SESSION_NOT_FOUND",
  "message": "Session 'abc123' not found",
  "suggestion": "Run 'pilotty list-sessions' to see available sessions"
}
```

```json
{
  "code": "SPAWN_FAILED",
  "message": "Failed to spawn process: command not found",
  "suggestion": "Check that the command exists and is in PATH"
}
```
Common Patterns

Reliable action + wait (recommended)

```bash
# The pattern: capture hash, act, await change
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH --settle 50

# This replaces fragile patterns like:
pilotty key Enter && sleep 1 && pilotty snapshot  # BAD: guessing
```
Wait then act

```bash
pilotty spawn my-app
pilotty wait-for "Ready"    # Ensure app is ready
pilotty snapshot            # Then snapshot
```
Check state before action

```bash
# Check for errors; proceed only if none are on screen
if ! pilotty snapshot --format text | grep -q "Error"; then
  pilotty key Enter
fi
```
Check for specific element

```bash
# Check if the first toggle is checked (-c prints one object per line, so head -1 works)
pilotty snapshot | jq -c '.elements[] | select(.kind == "toggle") | {text, checked}' | head -1

# Find element at specific position
pilotty snapshot | jq '.elements[] | select(.row == 5 and .col == 10)'
```
Retry on timeout

```bash
pilotty wait-for -t 5000 "Ready" || {
  pilotty snapshot --format text   # Check what's on screen
  # Adjust approach based on actual state
}
```
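The same idea extends to a bounded retry loop. A sketch, with `wait_for_retry` as a hypothetical helper name; it relies on `wait-for` exiting non-zero on timeout, which the `||` pattern above implies, and prints the text snapshot to stderr after each failed attempt so the caller can see what is actually on screen.

```bash
# wait_for_retry <attempts> <timeout_ms> <text>
# Hypothetical helper: retries `pilotty wait-for` up to <attempts> times.
wait_for_retry() {
  attempts=$1
  timeout_ms=$2
  text=$3
  attempt=1
  while [ "$attempt" -le "$attempts" ]; do
    # Flags first, then the positional pattern
    if pilotty wait-for -t "$timeout_ms" "$text"; then
      return 0
    fi
    echo "attempt $attempt timed out; current screen:" >&2
    pilotty snapshot --format text >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage:
#   wait_for_retry 3 5000 "Ready" || pilotty kill
```

Keeping the attempt count small avoids hiding a genuinely stuck application behind a long chain of timeouts.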
Deep-dive Documentation

For detailed patterns and edge cases, see:

| Reference | Description |
|---|---|
| references/session-management.md | Multi-session patterns, isolation, cleanup |
| references/key-input.md | Complete key combinations reference |
| references/element-detection.md | Detection rules, confidence, patterns |
Ready-to-use Templates

Executable workflow scripts:

| Template | Description |
|---|---|
| templates/vim-workflow.sh | Edit file with vim, save, exit |
| templates/dialog-interaction.sh | Handle dialog/whiptail prompts |
| templates/multi-session.sh | Parallel TUI orchestration |
| templates/element-detection.sh | Element detection demo |

Usage:

```bash
./templates/vim-workflow.sh /tmp/myfile.txt "File content here"
./templates/dialog-interaction.sh
./templates/multi-session.sh
./templates/element-detection.sh
```