pilotty

Terminal Automation with pilotty


CRITICAL: Argument Positioning

All flags (`--name`, `-s`, `--format`, etc.) MUST come BEFORE positional arguments:

bash

# CORRECT - flags before command/arguments
pilotty spawn --name myapp vim file.txt
pilotty key -s myapp Enter
pilotty snapshot -s myapp --format text

# WRONG - flags after command (they get passed to the app, not pilotty!)
pilotty spawn vim file.txt --name myapp   # FAILS: --name goes to vim
pilotty key Enter -s myapp                # FAILS: -s goes nowhere useful

This is the #1 cause of agent failures. When in doubt: **flags first, then command/args**.
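
One way to make the flags-first rule hard to get wrong is to wrap the calls in small shell functions that always place the flag before the positional arguments. The sketch below is only an illustration: `spawn_named` and `key_in` are hypothetical helper names, not part of pilotty, and they use only the `spawn --name` and `key -s` forms shown above.

```shell
# Hypothetical helper (not part of pilotty): always emits --name before the command.
# Usage: spawn_named <session-name> <command> [args...]
spawn_named() {
  if [ "$#" -lt 2 ]; then
    echo "usage: spawn_named <session-name> <command> [args...]" >&2
    return 2
  fi
  name=$1
  shift
  # Everything left in "$@" is the app's command line and is passed through untouched.
  pilotty spawn --name "$name" "$@"
}

# Hypothetical helper: send a key (or key sequence) to a named session, flag first.
# Usage: key_in <session-name> <key-or-sequence>
key_in() {
  session=$1
  shift
  pilotty key -s "$session" "$@"
}
```

With these defined, `spawn_named myapp vim file.txt` expands to `pilotty spawn --name myapp vim file.txt`, so the session name can never end up among the app's own arguments.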

---

Quick start


bash

pilotty spawn vim file.txt        # Start TUI app in managed session
pilotty wait-for "file.txt"       # Wait for app to be ready
pilotty snapshot                  # Get screen state with UI elements
pilotty key i                     # Enter insert mode
pilotty type "Hello, World!"      # Type text
pilotty key Escape                # Exit insert mode
pilotty kill                      # End session


Core workflow

1. Spawn: `pilotty spawn <command>` starts the app in a background PTY
2. Wait: `pilotty wait-for <text>` ensures the app is ready
3. Snapshot: `pilotty snapshot` returns screen state with detected UI elements
4. Understand: Parse `elements[]` to identify buttons, inputs, toggles
5. Interact: Use keyboard commands (`key`, `type`) to navigate and interact
6. Re-snapshot: Check `content_hash` to detect screen changes

Commands

Session management

bash

pilotty spawn <command>           # Start TUI app (e.g., pilotty spawn htop)
pilotty spawn --name myapp <cmd>  # Start with custom session name (--name before command)
pilotty kill                      # Kill default session
pilotty kill -s myapp             # Kill specific session
pilotty list-sessions             # List all active sessions
pilotty daemon                    # Manually start daemon (usually auto-starts)
pilotty shutdown                  # Stop daemon and all sessions
pilotty examples                  # Show end-to-end workflow example


Screen capture


bash

pilotty snapshot                  # Full JSON with text content and elements
pilotty snapshot --format compact # JSON without text field
pilotty snapshot --format text    # Plain text with cursor indicator
pilotty snapshot -s myapp         # Snapshot specific session

# Wait for screen to change (eliminates need for sleep!)
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH              # Block until screen changes
pilotty snapshot --await-change $HASH --settle 50  # Wait for 50ms stability

Input


bash

pilotty type "hello"              # Type text at cursor
pilotty type -s myapp "text"      # Type in specific session

pilotty key Enter                 # Press Enter
pilotty key Ctrl+C                # Send interrupt
pilotty key Escape                # Send Escape
pilotty key Tab                   # Send Tab
pilotty key F1                    # Function key
pilotty key Alt+F                 # Alt combination
pilotty key Up                    # Arrow key
pilotty key -s myapp Ctrl+S       # Key in specific session

# Key sequences (space-separated, sent in order)
pilotty key "Ctrl+X m"                # Emacs chord: Ctrl+X then m
pilotty key "Escape : w q Enter"      # vim :wq sequence
pilotty key --delay 50 "a b c"        # Send a, b, c with 50ms delay (flag before the keys)
pilotty key -s myapp "Tab Tab Enter"  # Sequence in specific session

Interaction


bash

pilotty click 5 10                # Click at row 5, col 10
pilotty click -s myapp 10 20      # Click in specific session
pilotty scroll up                 # Scroll up 1 line
pilotty scroll down 5             # Scroll down 5 lines
pilotty scroll -s myapp up 10     # Scroll in specific session (flag first)

Terminal control


bash

pilotty resize 120 40             # Resize terminal to 120 cols x 40 rows
pilotty resize -s myapp 80 24     # Resize specific session

pilotty wait-for "Ready"          # Wait for text to appear (30s default)
pilotty wait-for -r "Error"       # Wait for regex pattern
pilotty wait-for -t 5000 "Done"   # Wait with 5s timeout
pilotty wait-for -s editor "~"    # Wait in specific session

Global options

| Option | Description |
|--------|-------------|
| `-s, --session <name>` | Target specific session (default: "default") |
| `--format <fmt>` | Snapshot format: full, compact, text |
| `-t, --timeout <ms>` | Timeout for wait-for and await-change (default: 30000) |
| `-r, --regex` | Treat wait-for pattern as regex |
| `--name <name>` | Session name for spawn command |
| `--delay <ms>` | Delay between keys in a sequence (default: 0, max: 10000) |
| `--await-change <hash>` | Block snapshot until content_hash differs |
| `--settle <ms>` | Wait for screen to be stable for this many ms (default: 0) |

Environment variables


bash

PILOTTY_SESSION="mysession"       # Default session name
PILOTTY_SOCKET_DIR="/tmp/pilotty" # Override socket directory
RUST_LOG="debug"                  # Enable debug logging
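
As an illustration of how `PILOTTY_SESSION` might be used, the sketch below scopes the variable to a single command so that it does not leak into the rest of a script. It assumes, based only on the "Default session name" description above, that pilotty falls back to this variable when `-s` is omitted; `with_session` is a hypothetical helper name.

```shell
# Hypothetical helper: run any command with PILOTTY_SESSION set for its duration.
# The subshell keeps the variable from leaking into the caller's environment.
# Usage: with_session <session-name> <command> [args...]
with_session() {
  (
    PILOTTY_SESSION=$1
    export PILOTTY_SESSION
    shift
    "$@"
  )
}

# Example (not run here):
#   with_session editor pilotty snapshot --format text
```

Passing `-s` explicitly remains the more visible option; a wrapper like this mainly helps when a long script targets one session throughout.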


Snapshot Output

The `snapshot` command returns structured JSON with detected UI elements:

json

{
  "snapshot_id": 42,
  "size": { "cols": 80, "rows": 24 },
  "cursor": { "row": 5, "col": 10, "visible": true },
  "text": "Settings:\n  [x] Notifications  [ ] Dark mode\n  [Save]  [Cancel]",
  "elements": [
    { "kind": "toggle", "row": 1, "col": 2, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
    { "kind": "toggle", "row": 1, "col": 20, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
    { "kind": "button", "row": 2, "col": 2, "width": 6, "text": "[Save]", "confidence": 0.8 },
    { "kind": "button", "row": 2, "col": 10, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
  ],
  "content_hash": 12345678901234567890
}

Use `--format text` for a plain text view with cursor indicator:

--- Terminal 80x24 | Cursor: (5, 10) ---
bash-3.2$ [_]

The `[_]` shows the cursor position. Use the text content to understand screen state and navigate with keyboard commands.

Element Detection


pilotty automatically detects interactive UI elements in terminal applications. Elements provide read-only context to help understand UI structure.

Element Kinds

| Kind | Detection Patterns | Confidence | Fields |
|------|--------------------|------------|--------|
| toggle | `[x]`, `[ ]`, `[*]`, `☑`, `☐` | 1.0 | `checked: bool` |
| button | Inverse video, `[OK]`, `<Cancel>`, `(Submit)` | 1.0 / 0.8 | `focused: bool` (only present if true) |
| input | Cursor position, `____` underscores | 1.0 / 0.6 | `focused: bool` (only present if true) |

Element Fields

| Field | Type | Description |
|-------|------|-------------|
| `kind` | string | Element type: `button`, `input`, or `toggle` |
| `row` | number | Row position (0-based from top) |
| `col` | number | Column position (0-based from left) |
| `width` | number | Width in terminal cells (CJK chars = 2) |
| `text` | string | Text content of the element |
| `confidence` | number | Detection confidence (0.0-1.0) |
| `focused` | bool | Whether element has focus (only present if true) |
| `checked` | bool | Toggle state (only present for toggles) |

Confidence Levels

| Confidence | Meaning |
|------------|---------|
| 1.0 | High confidence: Cursor position, inverse video, checkbox patterns |
| 0.8 | Medium confidence: Bracket patterns `[OK]`, `<Cancel>` |
| 0.6 | Lower confidence: Underscore input fields `____` |
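
The confidence values can be used to filter out weaker detections before reasoning about the screen. The sketch below assumes `jq` is installed and that the snapshot JSON has the `elements[].confidence` shape shown in the sample output; `elements_min_confidence` is a hypothetical helper name.

```shell
# Hypothetical helper: read snapshot JSON on stdin and print one compact JSON
# object per element whose confidence is at least the given threshold.
# Requires jq; --argjson passes the threshold as a number, not a string.
# Usage: elements_min_confidence <threshold>
elements_min_confidence() {
  jq -c --argjson min "$1" '.elements[] | select(.confidence >= $min)'
}

# Example (not run here): keep only the 1.0-confidence elements
#   pilotty snapshot | elements_min_confidence 1.0
```

A threshold of 0.8 would keep bracket-style buttons such as `[Save]` while dropping the 0.6 underscore-based input guesses.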

Wait for Screen Changes (Recommended)

Stop guessing sleep durations! Use `--await-change` to wait for the screen to actually update:

bash

# Capture baseline hash
HASH=$(pilotty snapshot | jq '.content_hash')

# Perform action
pilotty key Enter

# Wait for screen to change (blocks until hash differs)
pilotty snapshot --await-change $HASH

# Or wait for screen to stabilize (for apps that render progressively)
pilotty snapshot --await-change $HASH --settle 100


**Flags:**
| Flag | Description |
|------|-------------|
| `--await-change <HASH>` | Block until `content_hash` differs from this value |
| `--settle <MS>` | After change detected, wait for screen to be stable for MS |
| `-t, --timeout <MS>` | Maximum wait time (default: 30000) |

**Why this is better than sleep:**
- `sleep 1` is a guess - too short causes race conditions, too long slows automation
- `--await-change` waits exactly as long as needed - no more, no less
- `--settle` handles apps that render progressively (show partial, then complete)


Waiting for Streaming AI Responses

When interacting with AI-powered TUIs (such as opencode) that stream responses, you need a longer `--settle` time since the screen keeps updating as tokens arrive:

bash

# 1. Capture hash before sending prompt
HASH=$(pilotty snapshot -s myapp | jq -r '.content_hash')

# 2. Type prompt and submit
pilotty type -s myapp "write me a poem about ai agents"
pilotty key -s myapp Enter

# 3. Wait for streaming response to complete
#    - Use longer settle (2-3s) since AI apps pause between chunks
#    - Extend timeout for long responses (60s+)
pilotty snapshot -s myapp --await-change "$HASH" --settle 3000 -t 60000

# 4. Response may be scrolled - scroll up if needed to see full output
pilotty scroll -s myapp up 10
pilotty snapshot -s myapp --format text


**Key parameters for streaming:**
- `--settle 2000-3000`: AI responses have pauses between chunks; 2-3 seconds ensures streaming is truly done
- `-t 60000`: Extend timeout beyond the 30s default for longer generations
- The settle timer resets on each screen change, so it naturally waits until streaming stops
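
The streaming steps can be folded into one reusable function. The following is a minimal sketch under stated assumptions: `ask_and_wait` and `content_hash` are hypothetical helper names, the default settle and timeout values are simply the ones suggested above, and `sed` is swapped in for `jq` only to keep the sketch free of extra dependencies (it assumes `content_hash` is serialized as a bare run of digits, as in the sample snapshot).

```shell
# Pull the numeric content_hash out of snapshot JSON on stdin.
# Assumption: the field appears as "content_hash": <digits>, as in the sample output.
content_hash() {
  sed -n 's/.*"content_hash"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: ask_and_wait <session> <prompt> [settle_ms] [timeout_ms]
# Sends a prompt, then blocks until the screen has changed and stayed quiet.
ask_and_wait() {
  session=$1
  prompt=$2
  settle_ms=${3:-3000}
  timeout_ms=${4:-60000}

  # Baseline hash must be captured before the prompt is sent.
  hash=$(pilotty snapshot -s "$session" | content_hash) || return 1
  [ -n "$hash" ] || { echo "no content_hash in snapshot" >&2; return 1; }

  pilotty type -s "$session" "$prompt" || return 1
  pilotty key -s "$session" Enter || return 1

  # Prints the settled screen as plain text.
  pilotty snapshot -s "$session" --await-change "$hash" \
    --settle "$settle_ms" -t "$timeout_ms" --format text
}
```

A call such as `ask_and_wait myapp "write me a poem about ai agents"` would then print the settled screen; scrolling up for long responses is still a separate step.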


Manual Change Detection

For manual polling (not recommended), use `content_hash` directly:

bash

# Get initial state
SNAP1=$(pilotty snapshot)
HASH1=$(echo "$SNAP1" | jq -r '.content_hash')

# Perform action
pilotty key Tab

# Check if screen changed
SNAP2=$(pilotty snapshot)
HASH2=$(echo "$SNAP2" | jq -r '.content_hash')

if [ "$HASH1" != "$HASH2" ]; then
  echo "Screen changed - re-analyze elements"
fi
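
If polling really is unavoidable, bounding the loop keeps it from spinning forever. The sketch below is an illustration rather than a recommended pattern: `poll_for_change` and `content_hash` are hypothetical helper names, the attempt count and interval are arbitrary defaults, and `sed` stands in for `jq` purely to avoid a dependency (assuming `content_hash` is printed as bare digits).

```shell
# Pull the numeric content_hash out of snapshot JSON on stdin.
# Assumption: the field appears as "content_hash": <digits>.
content_hash() {
  sed -n 's/.*"content_hash"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: poll_for_change <baseline-hash> [attempts] [interval-seconds]
# Prints the new hash and returns 0 once it differs from the baseline;
# returns 1 if the hash is still unchanged after the last attempt.
poll_for_change() {
  baseline=$1
  attempts=${2:-10}
  interval=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$(pilotty snapshot | content_hash)
    if [ -n "$current" ] && [ "$current" != "$baseline" ]; then
      echo "$current"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

Compared with `--await-change`, this still guesses an interval, which is why the built-in flag is the preferred route.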

Using Elements Effectively

Elements are read-only context for understanding the UI. Use keyboard navigation for reliable interaction:

bash

# 1. Get snapshot to understand UI structure
pilotty snapshot | jq '.elements'
# Output shows toggles (checked/unchecked) and buttons with positions

# 2. Navigate and interact with keyboard (reliable approach)
pilotty key Tab      # Move to next element
pilotty key Space    # Toggle checkbox
pilotty key Enter    # Activate button

# 3. Verify state changed
pilotty snapshot | jq '.elements[] | select(.kind == "toggle")'

**Key insight**: Use elements to understand WHAT is on screen, use keyboard to interact with it.

---

Navigation Approach

pilotty uses keyboard-first navigation, just like a human would:

bash

# 1. Take snapshot to see the screen
pilotty snapshot --format text

# 2. Navigate using keyboard
pilotty key Tab      # Move to next element
pilotty key Enter    # Activate/select
pilotty key Escape   # Cancel/back
pilotty key Up       # Move up in list/menu
pilotty key Space    # Toggle checkbox

# 3. Type text when needed
pilotty type "search term"
pilotty key Enter

# 4. Click at coordinates for mouse-enabled TUIs
pilotty click 5 10   # Click at row 5, col 10

**Key insight**: Parse the snapshot text and elements to understand what's on screen, then use keyboard commands to navigate. This works reliably across all TUI applications.

---

Example: Edit file with vim

bash

# 1. Spawn vim
pilotty spawn --name editor vim /tmp/hello.txt

# 2. Wait for vim to load and capture baseline hash
pilotty wait-for -s editor "hello.txt"
HASH=$(pilotty snapshot -s editor | jq '.content_hash')

# 3. Enter insert mode
pilotty key -s editor i

# 4. Type content
pilotty type -s editor "Hello from pilotty!"

# 5. Wait for screen to update, then exit (no sleep needed!)
pilotty snapshot -s editor --await-change $HASH --settle 50
pilotty key -s editor "Escape : w q Enter"

# 6. Verify session ended
pilotty list-sessions

Alternative using individual keys:
```bash
pilotty key -s editor Escape
pilotty type -s editor ":wq"
pilotty key -s editor Enter
```

Example: Dialog checklist interaction

bash

# 1. Spawn dialog checklist (--name before command)
pilotty spawn --name opts dialog --checklist "Select features:" 12 50 4 \
    "notifications" "Push notifications" on \
    "darkmode" "Dark mode theme" off \
    "autosave" "Auto-save documents" on \
    "telemetry" "Usage analytics" off

# 2. Wait for dialog to render (use --settle, not sleep!)
pilotty snapshot -s opts --settle 200   # Wait for initial render to stabilize

# 3. Get snapshot and examine elements, capture hash
SNAP=$(pilotty snapshot -s opts)
echo "$SNAP" | jq '.elements[] | select(.kind == "toggle")'
HASH=$(echo "$SNAP" | jq '.content_hash')

# 4. Navigate to "darkmode" and toggle it
pilotty key -s opts Down    # Move to second option
pilotty key -s opts Space   # Toggle it on

# 5. Wait for change and verify
pilotty snapshot -s opts --await-change $HASH | jq '.elements[] | select(.kind == "toggle") | {text, checked}'

# 6. Confirm selection
pilotty key -s opts Enter

# 7. Clean up
pilotty kill -s opts

Example: Form filling with elements

bash

# 1. Spawn a form application
pilotty spawn --name form my-form-app

# 2. Get snapshot to understand form structure
pilotty snapshot -s form | jq '.elements'
# Shows inputs, toggles, and buttons with positions for click command

# 3. Type into the first input (likely already focused, so no Tab is needed)
pilotty type -s form "myusername"

# 4. Tab to password field
pilotty key -s form Tab
pilotty type -s form "mypassword"

# 5. Tab to remember me and toggle
pilotty key -s form Tab
pilotty key -s form Space

# 6. Tab to Login and activate
pilotty key -s form Tab
pilotty key -s form Enter

# 7. Check result
pilotty snapshot -s form --format text

Example: Monitor with htop

bash

# 1. Spawn htop
pilotty spawn --name monitor htop

# 2. Wait for display
pilotty wait-for -s monitor "CPU"

# 3. Take snapshot to see current state
pilotty snapshot -s monitor --format text

# 4. Send commands
pilotty key -s monitor F9   # Kill menu
pilotty key -s monitor q    # Quit

# 5. Kill session
pilotty kill -s monitor

Example: Interact with AI TUI (opencode, etc.)

AI-powered TUIs stream responses, requiring special handling:

bash

# 1. Spawn the AI app
pilotty spawn --name ai opencode

# 2. Wait for the prompt to be ready
pilotty wait-for -s ai -t 15000 "Ask anything"

# 3. Capture baseline hash
HASH=$(pilotty snapshot -s ai | jq -r '.content_hash')

# 4. Type prompt and submit
pilotty type -s ai "explain the architecture of this codebase"
pilotty key -s ai Enter

# 5. Wait for streaming response to complete
#    - settle=3000: Wait 3s of no changes to ensure streaming is done
#    - timeout=60000: Allow up to 60s for long responses
pilotty snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text

# 6. If response is long and scrolled, scroll up to see full output
pilotty scroll -s ai up 20
pilotty snapshot -s ai --format text

# 7. Clean up
pilotty kill -s ai


**Gotchas with AI apps:**
- Use `--settle 2000-3000` because AI responses pause between chunks
- Extend timeout with `-t 60000` for complex prompts
- Long responses may scroll the terminal; use `scroll up` to see the beginning
- The settle timer resets on each screen update, so it waits for true completion

---


Sessions

Each session is isolated with its own:

- PTY (pseudo-terminal)
- Screen buffer
- Child process

bash

# Run multiple apps (--name must come before the command)
pilotty spawn --name monitoring htop
pilotty spawn --name editor vim file.txt

# Target specific session
pilotty snapshot -s monitoring
pilotty key -s editor Ctrl+S

# List all
pilotty list-sessions

# Kill specific
pilotty kill -s editor


The first session spawned without `--name` is automatically named `default`.

> **Important:** The `--name` flag must come **before** the command. Everything after the command is passed as arguments to that command.
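
When a script starts several named sessions, it can help to tear them all down in one place. The following sketch uses only the `kill -s` form shown above; `kill_sessions` is a hypothetical helper name, and the assumption that `pilotty kill` exits non-zero for a missing session is mine, not something the text above states.

```shell
# Hypothetical helper: kill every session named on the command line.
# Keeps going after a failure (for example, a session that already exited)
# and returns non-zero if any kill failed.
# Usage: kill_sessions <session-name>...
kill_sessions() {
  status=0
  for session in "$@"; do
    pilotty kill -s "$session" || status=1
  done
  return "$status"
}

# Example (not run here): clean up on exit, whatever happens in between
#   trap 'kill_sessions monitoring editor' EXIT
#   pilotty spawn --name monitoring htop
#   pilotty spawn --name editor vim file.txt
```

Registering the cleanup with `trap ... EXIT` before spawning means the sessions are removed even if a later step in the script fails.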


Daemon Architecture

pilotty uses a background daemon for session management:

- Auto-start: Daemon starts on first command
- Auto-stop: Shuts down after 5 minutes with no sessions
- Session cleanup: Sessions removed when process exits (within 500ms)
- Shared state: Multiple CLI calls share sessions

You rarely need to manage the daemon manually.

Error Handling


Errors include actionable suggestions:

json

{
  "code": "SESSION_NOT_FOUND",
  "message": "Session 'abc123' not found",
  "suggestion": "Run 'pilotty list-sessions' to see available sessions"
}

json

{
  "code": "SPAWN_FAILED",
  "message": "Failed to spawn process: command not found",
  "suggestion": "Check that the command exists and is in PATH"
}
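
A script can branch on the `code` field of such an error object. The sketch below only parses text in the shape shown above; it makes no assumption about whether pilotty writes errors to stdout or stderr, or about its exit status. `json_field` and `next_step_for` are hypothetical helper names, the returned labels are arbitrary, and `sed` is used instead of a JSON parser only to keep the example dependency-free.

```shell
# Extract a top-level string field from a small JSON error object on stdin.
# Assumption: values contain no escaped double quotes, as in the samples above.
# Usage: json_field <field-name>
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical dispatcher: map an error code to a label for the next step.
next_step_for() {
  case "$1" in
    SESSION_NOT_FOUND) echo "list-sessions" ;;
    SPAWN_FAILED)      echo "check-path" ;;
    *)                 echo "inspect-manually" ;;
  esac
}
```

For example, piping the second error object above through `json_field code` yields `SPAWN_FAILED`, which `next_step_for` maps to `check-path`.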


Common Patterns

Reliable action + wait (recommended)

bash

# The pattern: capture hash, act, await change
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH --settle 50

# This replaces fragile patterns like:
pilotty key Enter && sleep 1 && pilotty snapshot   # BAD: guessing

Wait then act


bash

pilotty spawn my-app
pilotty wait-for "Ready"    # Ensure app is ready
pilotty snapshot            # Then snapshot


Check state before action


bash

pilotty snapshot --format text | grep "Error"  # Check for errors
pilotty key Enter                               # Then proceed
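
The two commands above run unconditionally: `grep` reports a match, but `Enter` is sent either way. To actually gate the action on the check, the test can be wrapped in a function. This is a minimal sketch; `key_unless_on_screen` is a hypothetical helper name and the return codes are an arbitrary choice.

```shell
# Hypothetical helper: press a key only when the screen does not show a pattern.
# Returns 0 if the key was sent, 1 if the pattern was found, 2 if the snapshot failed.
# Usage: key_unless_on_screen <pattern> <key>
key_unless_on_screen() {
  pattern=$1
  key=$2
  screen=$(pilotty snapshot --format text) || return 2
  if printf '%s\n' "$screen" | grep -q -- "$pattern"; then
    echo "refusing to send $key: screen matches '$pattern'" >&2
    return 1
  fi
  pilotty key "$key"
}
```

`key_unless_on_screen "Error" Enter` then proceeds only when no line of the current screen contains `Error`.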


Check for specific element

bash

# Check if the first toggle is checked (-c prints one object per line, so head -1 returns a whole object)
pilotty snapshot | jq -c '.elements[] | select(.kind == "toggle") | {text, checked}' | head -1

# Find element at specific position
pilotty snapshot | jq '.elements[] | select(.row == 5 and .col == 10)'

Retry on timeout


bash

pilotty wait-for -t 5000 "Ready" || {
  pilotty snapshot --format text   # Check what's on screen
  # Adjust approach based on actual state
}
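
The same idea extends to several attempts. The sketch below assumes, as the `||` pattern above implies, that `wait-for` exits non-zero on timeout; `wait_with_retry` is a hypothetical helper name and the defaults of 3 attempts and 5000 ms are arbitrary.

```shell
# Hypothetical helper: wait_with_retry <text> [attempts] [timeout_ms]
# Retries wait-for, printing the current screen to stderr after each miss.
# Returns 0 as soon as the text appears, 1 if every attempt times out.
wait_with_retry() {
  text=$1
  attempts=${2:-3}
  timeout_ms=${3:-5000}
  try=1
  while [ "$try" -le "$attempts" ]; do
    if pilotty wait-for -t "$timeout_ms" "$text"; then
      return 0
    fi
    echo "attempt $try/$attempts: '$text' not seen; current screen:" >&2
    pilotty snapshot --format text >&2
    try=$((try + 1))
  done
  return 1
}
```

Dumping the screen on each miss gives the caller something concrete to act on, rather than retrying blindly.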


Deep-dive Documentation

For detailed patterns and edge cases, see:

| Reference | Description |
|-----------|-------------|
| references/session-management.md | Multi-session patterns, isolation, cleanup |
| references/key-input.md | Complete key combinations reference |
| references/element-detection.md | Detection rules, confidence, patterns |

Ready-to-use Templates

Executable workflow scripts:

| Template | Description |
|----------|-------------|
| templates/vim-workflow.sh | Edit file with vim, save, exit |
| templates/dialog-interaction.sh | Handle dialog/whiptail prompts |
| templates/multi-session.sh | Parallel TUI orchestration |
| templates/element-detection.sh | Element detection demo |

Usage:

bash

./templates/vim-workflow.sh /tmp/myfile.txt "File content here"
./templates/dialog-interaction.sh
./templates/multi-session.sh
./templates/element-detection.sh
