pilotty

Terminal Automation with pilotty

CRITICAL: Argument Positioning

All flags (`--name`, `-s`, `--format`, etc.) MUST come BEFORE positional arguments:
bash
# CORRECT - flags before command/arguments
pilotty spawn --name myapp vim file.txt
pilotty key -s myapp Enter
pilotty snapshot -s myapp --format text

# WRONG - flags after command (they get passed to the app, not pilotty!)
pilotty spawn vim file.txt --name myapp   # FAILS: --name goes to vim
pilotty key Enter -s myapp                # FAILS: -s goes nowhere useful

This is the #1 cause of agent failures. When in doubt: **flags first, then command/args**.
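
One way to make the flags-first rule hard to break is to route calls through small wrappers that place the session flag ahead of every positional argument. The sketch below is illustrative only: `pty_spawn` and `pty` are hypothetical helper names, not part of pilotty, and they rely solely on the `--name` and `-s` flags described in this section.

```bash
# Hypothetical helpers (not part of pilotty): they always put flags first.

# Usage: pty_spawn <session> <command> [args...]
pty_spawn() {
  local session="$1"
  shift
  pilotty spawn --name "$session" "$@"
}

# Usage: pty <subcommand> <session> [positional args...]
pty() {
  local subcommand="$1" session="$2"
  shift 2
  pilotty "$subcommand" -s "$session" "$@"
}
```

With these in place, `pty_spawn myapp vim file.txt` expands to `pilotty spawn --name myapp vim file.txt`, and `pty key myapp Enter` expands to `pilotty key -s myapp Enter`.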

---

Quick start

bash
pilotty spawn vim file.txt        # Start TUI app in managed session
pilotty wait-for "file.txt"       # Wait for app to be ready
pilotty snapshot                  # Get screen state with UI elements
pilotty key i                     # Enter insert mode
pilotty type "Hello, World!"      # Type text
pilotty key Escape                # Exit insert mode
pilotty kill                      # End session

Core workflow

  1. Spawn: `pilotty spawn <command>` starts the app in a background PTY
  2. Wait: `pilotty wait-for <text>` ensures the app is ready
  3. Snapshot: `pilotty snapshot` returns screen state with detected UI elements
  4. Understand: Parse `elements[]` to identify buttons, inputs, toggles
  5. Interact: Use keyboard commands (`key`, `type`) to navigate and interact
  6. Re-snapshot: Check `content_hash` to detect screen changes
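
The first three steps can be bundled into one function so that a failed spawn or a timed-out wait stops the run before any snapshot is taken. This is a minimal sketch: `start_and_snapshot` is a hypothetical name, and it assumes `pilotty spawn` and `pilotty wait-for` exit non-zero on failure.

```bash
# Hypothetical wrapper: spawn an app, wait for it, and print a first snapshot.
# Assumes spawn and wait-for return a non-zero status when they fail.
# Usage: start_and_snapshot <session> <ready-text> <command> [args...]
start_and_snapshot() {
  local session="$1" ready_text="$2"
  shift 2
  pilotty spawn --name "$session" "$@" || return 1
  pilotty wait-for -s "$session" "$ready_text" || return 1
  pilotty snapshot -s "$session" --format text
}
```

For example, `start_and_snapshot editor "hello.txt" vim /tmp/hello.txt` runs the spawn, wait, and snapshot steps against a session named `editor`.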

Commands

Session management

bash
pilotty spawn <command>           # Start TUI app (e.g., pilotty spawn htop)
pilotty spawn --name myapp <cmd>  # Start with custom session name (--name before command)
pilotty kill                      # Kill default session
pilotty kill -s myapp             # Kill specific session
pilotty list-sessions             # List all active sessions
pilotty daemon                    # Manually start daemon (usually auto-starts)
pilotty shutdown                  # Stop daemon and all sessions
pilotty examples                  # Show end-to-end workflow example

Screen capture

bash
pilotty snapshot                  # Full JSON with text content and elements
pilotty snapshot --format compact # JSON without text field
pilotty snapshot --format text    # Plain text with cursor indicator
pilotty snapshot -s myapp         # Snapshot specific session
# Wait for screen to change (eliminates need for sleep!)
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH              # Block until screen changes
pilotty snapshot --await-change $HASH --settle 50  # Wait for 50ms stability

Input

bash
pilotty type "hello"              # Type text at cursor
pilotty type -s myapp "text"      # Type in specific session

pilotty key Enter                 # Press Enter
pilotty key Ctrl+C                # Send interrupt
pilotty key Escape                # Send Escape
pilotty key Tab                   # Send Tab
pilotty key F1                    # Function key
pilotty key Alt+F                 # Alt combination
pilotty key Up                    # Arrow key
pilotty key -s myapp Ctrl+S       # Key in specific session
# Key sequences (space-separated, sent in order)
pilotty key "Ctrl+X m"                 # Emacs chord: Ctrl+X then m
pilotty key "Escape : w q Enter"       # vim :wq sequence
pilotty key --delay 50 "a b c"         # Send a, b, c with 50ms delay
pilotty key -s myapp "Tab Tab Enter"   # Sequence in specific session

Interaction

bash
pilotty click 5 10                # Click at row 5, col 10
pilotty click -s myapp 10 20      # Click in specific session
pilotty scroll up                 # Scroll up 1 line
pilotty scroll down 5             # Scroll down 5 lines
pilotty scroll -s myapp up 10     # Scroll in specific session

Terminal control

bash
pilotty resize 120 40             # Resize terminal to 120 cols x 40 rows
pilotty resize -s myapp 80 24     # Resize specific session

pilotty wait-for "Ready"          # Wait for text to appear (30s default)
pilotty wait-for -r "Error"       # Wait for regex pattern
pilotty wait-for -t 5000 "Done"   # Wait with 5s timeout
pilotty wait-for -s editor "~"    # Wait in specific session

Global options

| Option | Description |
|--------|-------------|
| `-s, --session <name>` | Target specific session (default: "default") |
| `--format <fmt>` | Snapshot format: full, compact, text |
| `-t, --timeout <ms>` | Timeout for wait-for and await-change (default: 30000) |
| `-r, --regex` | Treat wait-for pattern as regex |
| `--name <name>` | Session name for spawn command |
| `--delay <ms>` | Delay between keys in a sequence (default: 0, max: 10000) |
| `--await-change <hash>` | Block snapshot until content_hash differs |
| `--settle <ms>` | Wait for screen to be stable for this many ms (default: 0) |

Environment variables

bash
PILOTTY_SESSION="mysession"       # Default session name
PILOTTY_SOCKET_DIR="/tmp/pilotty" # Override socket directory
RUST_LOG="debug"                  # Enable debug logging

Snapshot Output

The `snapshot` command returns structured JSON with detected UI elements:
json
{
  "snapshot_id": 42,
  "size": { "cols": 80, "rows": 24 },
  "cursor": { "row": 5, "col": 10, "visible": true },
  "text": "Settings:\n  [x] Notifications  [ ] Dark mode\n  [Save]  [Cancel]",
  "elements": [
    { "kind": "toggle", "row": 1, "col": 2, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
    { "kind": "toggle", "row": 1, "col": 20, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
    { "kind": "button", "row": 2, "col": 2, "width": 6, "text": "[Save]", "confidence": 0.8 },
    { "kind": "button", "row": 2, "col": 10, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
  ],
  "content_hash": 12345678901234567890
}
Use `--format text` for a plain text view with cursor indicator:
--- Terminal 80x24 | Cursor: (5, 10) ---
bash-3.2$ [_]
The `[_]` shows cursor position. Use the text content to understand screen state and navigate with keyboard commands.

Element Detection

pilotty automatically detects interactive UI elements in terminal applications. Elements provide read-only context to help understand UI structure.

Element Kinds

| Kind | Detection Patterns | Confidence | Fields |
|------|--------------------|------------|--------|
| toggle | `[x]`, `[ ]`, `[*]` | 1.0 | `checked: bool` |
| button | Inverse video, `[OK]`, `<Cancel>`, `(Submit)` | 1.0 / 0.8 | `focused: bool` (if true) |
| input | Cursor position, `____` underscores | 1.0 / 0.6 | `focused: bool` (if true) |

Element Fields

| Field | Type | Description |
|-------|------|-------------|
| `kind` | string | Element type: `button`, `input`, or `toggle` |
| `row` | number | Row position (0-based from top) |
| `col` | number | Column position (0-based from left) |
| `width` | number | Width in terminal cells (CJK chars = 2) |
| `text` | string | Text content of the element |
| `confidence` | number | Detection confidence (0.0-1.0) |
| `focused` | bool | Whether element has focus (only present if true) |
| `checked` | bool | Toggle state (only present for toggles) |

Confidence Levels

| Confidence | Meaning |
|------------|---------|
| 1.0 | High confidence: Cursor position, inverse video, checkbox patterns |
| 0.8 | Medium confidence: Bracket patterns `[OK]`, `<Cancel>` |
| 0.6 | Lower confidence: Underscore input fields `____` |
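
Since bracket and underscore matches are heuristic, a script may want to ignore low-confidence detections before deciding what is on screen. A minimal sketch, assuming `jq` is installed; `confident_elements` is a hypothetical helper name, and the field names are the ones listed under Element Fields.

```bash
# Hypothetical helper: read snapshot JSON on stdin and keep only elements
# whose confidence is at least the given threshold (default 0.8).
# Usage: pilotty snapshot | confident_elements 1.0
confident_elements() {
  local min="${1:-0.8}"
  jq -c --argjson min "$min" \
    '[.elements[] | select(.confidence >= $min) | {kind, text, row, col}]'
}
```

A threshold of `1.0` keeps only cursor, inverse-video, and checkbox detections; `0.8` also admits bracket-style buttons.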

Wait for Screen Changes (Recommended)

Stop guessing sleep durations! Use `--await-change` to wait for the screen to actually update:
bash
# Capture baseline hash
HASH=$(pilotty snapshot | jq '.content_hash')

# Perform action
pilotty key Enter

# Wait for screen to change (blocks until hash differs)
pilotty snapshot --await-change $HASH

# Or wait for screen to stabilize (for apps that render progressively)
pilotty snapshot --await-change $HASH --settle 100

**Flags:**
| Flag | Description |
|------|-------------|
| `--await-change <HASH>` | Block until `content_hash` differs from this value |
| `--settle <MS>` | After change detected, wait for screen to be stable for MS |
| `-t, --timeout <MS>` | Maximum wait time (default: 30000) |

**Why this is better than sleep:**
- `sleep 1` is a guess - too short causes race conditions, too long slows automation
- `--await-change` waits exactly as long as needed - no more, no less
- `--settle` handles apps that render progressively (show partial, then complete)
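
The capture, act, await sequence is easy to wrap so that every keypress is followed by a real screen change. This is a sketch under stated assumptions: `key_and_wait` is a hypothetical helper, `jq` is installed, and the flags are the ones documented in the table above.

```bash
# Hypothetical helper: press a key, then block until the screen has changed
# and stayed stable for <settle-ms> (default 50).
# Usage: key_and_wait <session> <key> [settle-ms]
key_and_wait() {
  local session="$1" key="$2" settle="${3:-50}" baseline
  # Capture the baseline hash before acting.
  baseline=$(pilotty snapshot -s "$session" | jq -r '.content_hash') || return 1
  pilotty key -s "$session" "$key" || return 1
  # Prints the new snapshot once the hash differs from the baseline.
  pilotty snapshot -s "$session" --await-change "$baseline" --settle "$settle"
}
```

Calling `key_and_wait editor Enter 100` sends Enter to the `editor` session and returns the snapshot taken after the screen has settled for 100 ms.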

Waiting for Streaming AI Responses

When interacting with AI-powered TUIs (like opencode, etc.) that stream responses, you need a longer `--settle` time since the screen keeps updating as tokens arrive:
bash
# 1. Capture hash before sending prompt
HASH=$(pilotty snapshot -s myapp | jq -r '.content_hash')

# 2. Type prompt and submit
pilotty type -s myapp "write me a poem about ai agents"
pilotty key -s myapp Enter

# 3. Wait for streaming response to complete
#    - Use longer settle (2-3s) since AI apps pause between chunks
#    - Extend timeout for long responses (60s+)
pilotty snapshot -s myapp --await-change "$HASH" --settle 3000 -t 60000

# 4. Response may be scrolled - scroll up if needed to see full output
pilotty scroll -s myapp up 10
pilotty snapshot -s myapp --format text

**Key parameters for streaming:**
- `--settle 2000-3000`: AI responses have pauses between chunks; 2-3 seconds ensures streaming is truly done
- `-t 60000`: Extend timeout beyond the 30s default for longer generations
- The settle timer resets on each screen change, so it naturally waits until streaming stops

Manual Change Detection

For manual polling (not recommended), use `content_hash` directly:
bash
# Get initial state
SNAP1=$(pilotty snapshot)
HASH1=$(echo "$SNAP1" | jq -r '.content_hash')

# Perform action
pilotty key Tab

# Check if screen changed
SNAP2=$(pilotty snapshot)
HASH2=$(echo "$SNAP2" | jq -r '.content_hash')
if [ "$HASH1" != "$HASH2" ]; then
  echo "Screen changed - re-analyze elements"
fi

Using Elements Effectively

Elements are read-only context for understanding the UI. Use keyboard navigation for reliable interaction:
bash
# 1. Get snapshot to understand UI structure
pilotty snapshot | jq '.elements'
# Output shows toggles (checked/unchecked) and buttons with positions

# 2. Navigate and interact with keyboard (reliable approach)
pilotty key Tab     # Move to next element
pilotty key Space   # Toggle checkbox
pilotty key Enter   # Activate button

# 3. Verify state changed
pilotty snapshot | jq '.elements[] | select(.kind == "toggle")'

**Key insight**: Use elements to understand WHAT is on screen, use keyboard to interact with it.

---

Navigation Approach

pilotty uses keyboard-first navigation, just like a human would:
bash
# 1. Take snapshot to see the screen
pilotty snapshot --format text

# 2. Navigate using keyboard
pilotty key Tab      # Move to next element
pilotty key Enter    # Activate/select
pilotty key Escape   # Cancel/back
pilotty key Up       # Move up in list/menu
pilotty key Space    # Toggle checkbox

# 3. Type text when needed
pilotty type "search term"
pilotty key Enter

# 4. Click at coordinates for mouse-enabled TUIs
pilotty click 5 10   # Click at row 5, col 10

**Key insight**: Parse the snapshot text and elements to understand what's on screen, then use keyboard commands to navigate. This works reliably across all TUI applications.

---

Example: Edit file with vim

bash
# 1. Spawn vim
pilotty spawn --name editor vim /tmp/hello.txt

# 2. Wait for vim to load and capture baseline hash
pilotty wait-for -s editor "hello.txt"
HASH=$(pilotty snapshot -s editor | jq '.content_hash')

# 3. Enter insert mode
pilotty key -s editor i

# 4. Type content
pilotty type -s editor "Hello from pilotty!"

# 5. Wait for screen to update, then exit (no sleep needed!)
pilotty snapshot -s editor --await-change $HASH --settle 50
pilotty key -s editor "Escape : w q Enter"

# 6. Verify session ended
pilotty list-sessions

Alternative using individual keys:
```bash
pilotty key -s editor Escape
pilotty type -s editor ":wq"
pilotty key -s editor Enter
```

Example: Dialog checklist interaction

bash
# 1. Spawn dialog checklist (--name before command)
pilotty spawn --name opts dialog --checklist "Select features:" 12 50 4 \
  "notifications" "Push notifications" on \
  "darkmode" "Dark mode theme" off \
  "autosave" "Auto-save documents" on \
  "telemetry" "Usage analytics" off

# 2. Wait for dialog to render (use await-change, not sleep!)
pilotty snapshot -s opts --settle 200   # Wait for initial render to stabilize

# 3. Get snapshot and examine elements, capture hash
SNAP=$(pilotty snapshot -s opts)
echo "$SNAP" | jq '.elements[] | select(.kind == "toggle")'
HASH=$(echo "$SNAP" | jq '.content_hash')

# 4. Navigate to "darkmode" and toggle it
pilotty key -s opts Down    # Move to second option
pilotty key -s opts Space   # Toggle it on

# 5. Wait for change and verify
pilotty snapshot -s opts --await-change $HASH | jq '.elements[] | select(.kind == "toggle") | {text, checked}'

# 6. Confirm selection
pilotty key -s opts Enter

# 7. Clean up
pilotty kill -s opts

Example: Form filling with elements

bash
# 1. Spawn a form application
pilotty spawn --name form my-form-app

# 2. Get snapshot to understand form structure
pilotty snapshot -s form | jq '.elements'
# Shows inputs, toggles, and buttons with positions for click command

# 3. Tab to first input (likely already focused)
pilotty type -s form "myusername"

# 4. Tab to password field
pilotty key -s form Tab
pilotty type -s form "mypassword"

# 5. Tab to remember me and toggle
pilotty key -s form Tab
pilotty key -s form Space

# 6. Tab to Login and activate
pilotty key -s form Tab
pilotty key -s form Enter

# 7. Check result
pilotty snapshot -s form --format text

Example: Monitor with htop

bash
# 1. Spawn htop
pilotty spawn --name monitor htop

# 2. Wait for display
pilotty wait-for -s monitor "CPU"

# 3. Take snapshot to see current state
pilotty snapshot -s monitor --format text

# 4. Send commands
pilotty key -s monitor F9   # Kill menu
pilotty key -s monitor q    # Quit

# 5. Kill session
pilotty kill -s monitor

Example: Interact with AI TUI (opencode, etc.)

AI-powered TUIs stream responses, requiring special handling:
bash
# 1. Spawn the AI app
pilotty spawn --name ai opencode

# 2. Wait for the prompt to be ready
pilotty wait-for -s ai -t 15000 "Ask anything"

# 3. Capture baseline hash
HASH=$(pilotty snapshot -s ai | jq -r '.content_hash')

# 4. Type prompt and submit
pilotty type -s ai "explain the architecture of this codebase"
pilotty key -s ai Enter

# 5. Wait for streaming response to complete
#    - settle=3000: Wait 3s of no changes to ensure streaming is done
#    - timeout=60000: Allow up to 60s for long responses
pilotty snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text

# 6. If response is long and scrolled, scroll up to see full output
pilotty scroll -s ai up 20
pilotty snapshot -s ai --format text

# 7. Clean up
pilotty kill -s ai

**Gotchas with AI apps:**
- Use `--settle 2000-3000` because AI responses pause between chunks
- Extend timeout with `-t 60000` for complex prompts
- Long responses may scroll the terminal; use `scroll up` to see the beginning
- The settle timer resets on each screen update, so it waits for true completion

---

Sessions

Each session is isolated with its own:
  • PTY (pseudo-terminal)
  • Screen buffer
  • Child process
bash
# Run multiple apps (--name must come before the command)
pilotty spawn --name monitoring htop
pilotty spawn --name editor vim file.txt

# Target specific session
pilotty snapshot -s monitoring
pilotty key -s editor Ctrl+S

# List all
pilotty list-sessions

# Kill specific
pilotty kill -s editor

The first session spawned without `--name` is automatically named `default`.

> **Important:** The `--name` flag must come **before** the command. Everything after the command is passed as arguments to that command.
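
When a script runs several named sessions, it helps to tear them all down in one place, for instance from an exit trap. A minimal sketch; `kill_sessions` is a hypothetical helper built only on `pilotty kill -s`, and the session names in the comment are the ones used in this section.

```bash
# Hypothetical helper: kill every named session passed as an argument.
# Usage: kill_sessions monitoring editor
kill_sessions() {
  local name
  for name in "$@"; do
    pilotty kill -s "$name"
  done
}

# Example wiring (left commented out so defining the helper has no side effects):
# trap 'kill_sessions monitoring editor' EXIT
```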

Daemon Architecture

pilotty uses a background daemon for session management:
  • Auto-start: Daemon starts on first command
  • Auto-stop: Shuts down after 5 minutes with no sessions
  • Session cleanup: Sessions removed when process exits (within 500ms)
  • Shared state: Multiple CLI calls share sessions
You rarely need to manage the daemon manually.

Error Handling

Errors include actionable suggestions:
json
{
  "code": "SESSION_NOT_FOUND",
  "message": "Session 'abc123' not found",
  "suggestion": "Run 'pilotty list-sessions' to see available sessions"
}
json
{
  "code": "SPAWN_FAILED",
  "message": "Failed to spawn process: command not found",
  "suggestion": "Check that the command exists and is in PATH"
}
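
Because each error carries a machine-readable `code`, a script can branch on it instead of pattern-matching the message. The sketch below is hedged: `handle_pilotty_error` is a hypothetical helper, it covers only the two codes shown above, and how your script obtains the code (for example by reading the error JSON with `jq -r '.code'`) is an assumption, since this document does not say which output stream errors are written to.

```bash
# Hypothetical helper: react to a pilotty error code.
# Only the two codes documented above are handled explicitly.
# Usage: handle_pilotty_error <code>
handle_pilotty_error() {
  case "$1" in
    SESSION_NOT_FOUND)
      # Follow the error's own suggestion: show which sessions exist.
      pilotty list-sessions
      ;;
    SPAWN_FAILED)
      echo "Spawn failed: check that the command exists and is in PATH" >&2
      return 1
      ;;
    *)
      echo "Unhandled pilotty error code: $1" >&2
      return 1
      ;;
  esac
}
```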

Common Patterns

Reliable action + wait (recommended)

bash
# The pattern: capture hash, act, await change
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH --settle 50

# This replaces fragile patterns like:
pilotty key Enter && sleep 1 && pilotty snapshot   # BAD: guessing

Wait then act

bash
pilotty spawn my-app
pilotty wait-for "Ready"    # Ensure app is ready
pilotty snapshot            # Then snapshot

Check state before action

bash
pilotty snapshot --format text | grep "Error"  # Check for errors
pilotty key Enter                               # Then proceed
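
As written, the `grep` result above is only printed; a script that should stop on errors needs to branch on it. A minimal sketch, assuming the word "Error" on screen really does indicate a failure in the target app; `proceed_if_clean` is a hypothetical helper name.

```bash
# Hypothetical helper: press Enter only if the screen shows no "Error" text.
# Usage: proceed_if_clean <session>
proceed_if_clean() {
  local session="$1"
  if pilotty snapshot -s "$session" --format text | grep -q "Error"; then
    echo "Error visible in session '$session'; not pressing Enter" >&2
    return 1
  fi
  pilotty key -s "$session" Enter
}
```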

Check for specific element

bash
# Check if the first toggle is checked
pilotty snapshot | jq -c '[.elements[] | select(.kind == "toggle")][0] | {text, checked}'

# Find element at specific position
pilotty snapshot | jq '.elements[] | select(.row == 5 and .col == 10)'

Retry on timeout

bash
pilotty wait-for -t 5000 "Ready" || {
  pilotty snapshot --format text   # Check what's on screen
  # Adjust approach based on actual state
}
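
The `||` fallback above inspects the screen once; a bounded retry loop generalizes it. This is a sketch, assuming `wait-for` exits non-zero on timeout (as the `||` pattern implies); `wait_with_retry` is a hypothetical helper, and the 5000 ms per-attempt timeout is simply the value used above.

```bash
# Hypothetical helper: retry wait-for a few times, dumping the screen to
# stderr after each timeout so the caller can see what the app shows.
# Usage: wait_with_retry <session> <text> [attempts]
wait_with_retry() {
  local session="$1" text="$2" attempts="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    pilotty wait-for -s "$session" -t 5000 "$text" && return 0
    pilotty snapshot -s "$session" --format text >&2
    i=$((i + 1))
  done
  return 1
}
```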

Deep-dive Documentation

For detailed patterns and edge cases, see:

| Reference | Description |
|-----------|-------------|
| references/session-management.md | Multi-session patterns, isolation, cleanup |
| references/key-input.md | Complete key combinations reference |
| references/element-detection.md | Detection rules, confidence, patterns |

Ready-to-use Templates

Executable workflow scripts:

| Template | Description |
|----------|-------------|
| templates/vim-workflow.sh | Edit file with vim, save, exit |
| templates/dialog-interaction.sh | Handle dialog/whiptail prompts |
| templates/multi-session.sh | Parallel TUI orchestration |
| templates/element-detection.sh | Element detection demo |

Usage:
bash
./templates/vim-workflow.sh /tmp/myfile.txt "File content here"
./templates/dialog-interaction.sh
./templates/multi-session.sh
./templates/element-detection.sh