pilotty
Automates terminal TUI applications (vim, htop, lazygit, dialog) through managed PTY sessions. Use when the user needs to interact with terminal apps, edit files in vim/nano, navigate TUI menus, click terminal buttons/checkboxes, or automate CLI workflows with interactive prompts.
Source: msmps/pilotty
NPX Install
npx skill4agent add msmps/pilotty pilotty
Terminal Automation with pilotty
CRITICAL: Argument Positioning
All flags (`--name`, `-s`, `--format`, etc.) MUST come BEFORE positional arguments:
bash
# CORRECT - flags before command/arguments
pilotty spawn --name myapp vim file.txt
pilotty key -s myapp Enter
pilotty snapshot -s myapp --format text
# WRONG - flags after command (they get passed to the app, not pilotty!)
pilotty spawn vim file.txt --name myapp # FAILS: --name goes to vim
pilotty key Enter -s myapp # FAILS: -s goes nowhere useful
This is the #1 cause of agent failures. When in doubt: flags first, then command/args.
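One way to make the flags-first rule hard to violate is to wrap the commands in small shell functions that always emit `-s` ahead of the positional arguments. The sketch below is only an illustration: `pty_key`, `pty_type`, and `pty_snapshot` are hypothetical helper names, not part of pilotty.

```bash
# Hypothetical wrappers (not part of pilotty). Each takes the session name
# as its first argument and always places -s before the positional arguments.
pty_key() {
  local session="$1"; shift
  pilotty key -s "$session" "$@"
}

pty_type() {
  local session="$1"; shift
  pilotty type -s "$session" "$@"
}

pty_snapshot() {
  local session="$1"; shift
  # Extra arguments are expected to be snapshot flags such as --format text.
  pilotty snapshot -s "$session" "$@"
}
```

With these in place, `pty_key myapp Enter` expands to `pilotty key -s myapp Enter`, so the session flag can never end up after the key name.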
Quick start
bash
pilotty spawn vim file.txt # Start TUI app in managed session
pilotty wait-for "file.txt" # Wait for app to be ready
pilotty snapshot # Get screen state with UI elements
pilotty key i # Enter insert mode
pilotty type "Hello, World!" # Type text
pilotty key Escape # Exit insert mode
pilotty kill # End session
Core workflow
- Spawn: `pilotty spawn <command>` starts the app in a background PTY
- Wait: `pilotty wait-for <text>` ensures the app is ready
- Snapshot: `pilotty snapshot` returns screen state with detected UI elements
- Understand: Parse `elements[]` to identify buttons, inputs, toggles
- Interact: Use keyboard commands (`key`, `type`) to navigate and interact
- Re-snapshot: Check `content_hash` to detect screen changes
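The first two steps of this workflow are often paired, so they can be folded into one function. This is a minimal sketch, assuming `wait-for` exits non-zero on timeout (as the retry pattern later in this document relies on); `spawn_ready` is a hypothetical name.

```bash
# Hypothetical helper: spawn a named session and block until READY_TEXT
# appears; if the wait fails, the half-started session is killed.
# Usage: spawn_ready <session> <ready_text> <command> [args...]
spawn_ready() {
  local session="$1" ready="$2"
  shift 2
  # --name goes before the command; everything in "$@" belongs to the app.
  pilotty spawn --name "$session" "$@" || return 1
  if ! pilotty wait-for -s "$session" "$ready"; then
    pilotty kill -s "$session"
    return 1
  fi
}
```

For example, `spawn_ready editor "hello.txt" vim /tmp/hello.txt` returns only once the file name is visible on screen, after which a snapshot is safe to take.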
Commands
Session management
bash
pilotty spawn <command> # Start TUI app (e.g., pilotty spawn htop)
pilotty spawn --name myapp <cmd> # Start with custom session name (--name before command)
pilotty kill # Kill default session
pilotty kill -s myapp # Kill specific session
pilotty list-sessions # List all active sessions
pilotty daemon # Manually start daemon (usually auto-starts)
pilotty shutdown # Stop daemon and all sessions
pilotty examples # Show end-to-end workflow example
Screen capture
bash
pilotty snapshot # Full JSON with text content and elements
pilotty snapshot --format compact # JSON without text field
pilotty snapshot --format text # Plain text with cursor indicator
pilotty snapshot -s myapp # Snapshot specific session
# Wait for screen to change (eliminates need for sleep!)
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH # Block until screen changes
pilotty snapshot --await-change $HASH --settle 50 # Wait for 50ms stability
Input
bash
pilotty type "hello" # Type text at cursor
pilotty type -s myapp "text" # Type in specific session
pilotty key Enter # Press Enter
pilotty key Ctrl+C # Send interrupt
pilotty key Escape # Send Escape
pilotty key Tab # Send Tab
pilotty key F1 # Function key
pilotty key Alt+F # Alt combination
pilotty key Up # Arrow key
pilotty key -s myapp Ctrl+S # Key in specific session
# Key sequences (space-separated, sent in order)
pilotty key "Ctrl+X m" # Emacs chord: Ctrl+X then m
pilotty key "Escape : w q Enter" # vim :wq sequence
pilotty key --delay 50 "a b c" # Send a, b, c with 50ms delay (flag before the keys)
pilotty key -s myapp "Tab Tab Enter" # Sequence in specific session
Interaction
bash
pilotty click 5 10 # Click at row 5, col 10
pilotty click -s myapp 10 20 # Click in specific session
pilotty scroll up # Scroll up 1 line
pilotty scroll down 5 # Scroll down 5 lines
pilotty scroll -s myapp up 10 # Scroll in specific session (flag before arguments)
Terminal control
bash
pilotty resize 120 40 # Resize terminal to 120 cols x 40 rows
pilotty resize -s myapp 80 24 # Resize specific session
pilotty wait-for "Ready" # Wait for text to appear (30s default)
pilotty wait-for -r "Error" # Wait for regex pattern
pilotty wait-for -t 5000 "Done" # Wait with 5s timeout
pilotty wait-for -s editor "~" # Wait in specific session
Global options
| Option | Description |
|---|---|
| `-s` | Target specific session (default: "default") |
| `--format` | Snapshot format: full, compact, text |
| `-t` | Timeout in ms for wait-for and await-change (default: 30000) |
| `-r` | Treat wait-for pattern as regex |
| `--name` | Session name for spawn command |
| `--delay` | Delay in ms between keys in a sequence (default: 0, max: 10000) |
| `--await-change` | Block snapshot until content_hash differs |
| `--settle` | Wait for screen to be stable for this many ms (default: 0) |
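The `-t` timeout combines naturally with a bounded retry loop. The sketch below assumes `wait-for` exits non-zero on timeout (the `||` pattern later in this document implies this); `wait_with_retry` and its default values are hypothetical.

```bash
# Hypothetical helper: retry `pilotty wait-for` a few times, printing the
# screen to stderr after each timeout so the caller can see what is shown.
# Usage: wait_with_retry <session> <text> [attempts] [timeout_ms]
wait_with_retry() {
  local session="$1" text="$2" attempts="${3:-3}" timeout="${4:-5000}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if pilotty wait-for -s "$session" -t "$timeout" "$text"; then
      return 0
    fi
    echo "attempt $i/$attempts timed out; current screen:" >&2
    pilotty snapshot -s "$session" --format text >&2
    i=$((i + 1))
  done
  return 1
}
```

A call such as `wait_with_retry myapp "Ready" 3 5000` waits at most about 15 seconds in total and leaves a text snapshot of each failed attempt in the log.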
Environment variables
bash
PILOTTY_SESSION="mysession" # Default session name
PILOTTY_SOCKET_DIR="/tmp/pilotty" # Override socket directory
RUST_LOG="debug" # Enable debug logging
Snapshot Output
The `snapshot` command returns structured JSON with detected UI elements:
json
{
"snapshot_id": 42,
"size": { "cols": 80, "rows": 24 },
"cursor": { "row": 5, "col": 10, "visible": true },
"text": "Settings:\n [x] Notifications [ ] Dark mode\n [Save] [Cancel]",
"elements": [
{ "kind": "toggle", "row": 1, "col": 2, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
{ "kind": "toggle", "row": 1, "col": 20, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
{ "kind": "button", "row": 2, "col": 2, "width": 6, "text": "[Save]", "confidence": 0.8 },
{ "kind": "button", "row": 2, "col": 10, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
],
"content_hash": 12345678901234567890
}
Use `--format text` for a plain text view with cursor indicator:
--- Terminal 80x24 | Cursor: (5, 10) ---
bash-3.2$ [_]
The `[_]` shows cursor position. Use the text content to understand screen state and navigate with keyboard commands.
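When only the terminal size and cursor position are needed, the header line of the text format can be parsed without JSON tooling. This sketch assumes the header always has the layout shown above, with the size as cols x rows and the cursor as (row, col), matching the JSON example; `parse_text_header` is a hypothetical name.

```bash
# Hypothetical helper: read a `--format text` snapshot on stdin and print
# "cols rows cursor_row cursor_col" parsed from its first line.
# Prints nothing if the first line does not match the assumed layout.
parse_text_header() {
  sed -n '1s/^--- Terminal \([0-9][0-9]*\)x\([0-9][0-9]*\) | Cursor: (\([0-9][0-9]*\), \([0-9][0-9]*\)) ---$/\1 \2 \3 \4/p'
}
```

Typical use would be `pilotty snapshot --format text | parse_text_header`, which can then be split with `read cols rows crow ccol`.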
Element Detection
pilotty automatically detects interactive UI elements in terminal applications. Elements provide read-only context to help understand UI structure.
Element Kinds
| Kind | Detection Patterns | Confidence | Fields |
|---|---|---|---|
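| toggle | Checkbox patterns such as `[x]`, `[ ]` | 1.0 | `checked` |
| button | Inverse video, bracket patterns such as `[Save]` | 1.0 / 0.8 | |
| input | Cursor position, underscore fields | 1.0 / 0.6 | |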
Element Fields
| Field | Type | Description |
|---|---|---|
| `kind` | string | Element type: `toggle`, `button`, or `input` |
| `row` | number | Row position (0-based from top) |
| `col` | number | Column position (0-based from left) |
| `width` | number | Width in terminal cells (CJK chars = 2) |
| `text` | string | Text content of the element |
| `confidence` | number | Detection confidence (0.0-1.0) |
| | bool | Whether element has focus (only present if true) |
| `checked` | bool | Toggle state (only present for toggles) |
Confidence Levels
| Confidence | Meaning |
|---|---|
| 1.0 | High confidence: Cursor position, inverse video, checkbox patterns |
| 0.8 | Medium confidence: Bracket patterns |
| 0.6 | Lower confidence: Underscore input fields |
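The confidence values can be used to discard weaker detections before acting on a snapshot. A minimal sketch with `jq`, where `confident_elements` and its default threshold of 0.8 are illustrative choices rather than anything pilotty prescribes:

```bash
# Hypothetical helper: read snapshot JSON on stdin and print, one compact
# JSON object per line, the elements whose confidence is at least MIN.
# Usage: pilotty snapshot | confident_elements [min]
confident_elements() {
  local min="${1:-0.8}"
  jq -c --argjson min "$min" '.elements[] | select(.confidence >= $min)'
}
```

Passing `1.0` keeps only cursor, inverse-video, and checkbox detections; passing `0.6` keeps everything, including underscore input fields.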
Wait for Screen Changes (Recommended)
Stop guessing sleep durations! Use `--await-change` to wait for the screen to actually update:
bash
# Capture baseline hash
HASH=$(pilotty snapshot | jq '.content_hash')
# Perform action
pilotty key Enter
# Wait for screen to change (blocks until hash differs)
pilotty snapshot --await-change $HASH
# Or wait for screen to stabilize (for apps that render progressively)
pilotty snapshot --await-change $HASH --settle 100
Flags:
| Flag | Description |
|---|---|
| `--await-change <HASH>` | Block until `content_hash` differs from `<HASH>` |
| `--settle <MS>` | After change detected, wait for screen to be stable for MS |
| `-t <MS>` | Maximum wait time in ms (default: 30000) |
Why this is better than sleep:
- `sleep 1` is a guess: too short causes race conditions, too long slows automation
- `--await-change` waits exactly as long as needed, no more, no less
- `--settle` handles apps that render progressively (show partial, then complete)
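The capture, act, await sequence can be packaged as a single function. The sketch below is an assumption-laden illustration: `snapshot_hash` and `act_and_wait` are hypothetical names, and the hash is extracted with `sed` instead of `jq` as a precaution, because hashes as large as the sample value exceed 2^53 and jq releases before 1.7 round such integers when printing them.

```bash
# Hypothetical helper: print the content_hash of a session's snapshot as the
# exact digit string, without passing it through a floating-point parser.
snapshot_hash() {
  pilotty snapshot -s "$1" |
    sed -n 's/.*"content_hash"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: send a key (or key sequence), then block until the
# screen has changed and stayed stable for SETTLE_MS (default 50).
# Usage: act_and_wait <session> <keys> [settle_ms]
act_and_wait() {
  local session="$1" keys="$2" settle="${3:-50}" hash
  hash=$(snapshot_hash "$session")
  [ -n "$hash" ] || return 1          # no hash found: do not send the key
  pilotty key -s "$session" "$keys" || return 1
  pilotty snapshot -s "$session" --await-change "$hash" --settle "$settle"
}
```

`act_and_wait myapp Enter 100` prints the post-change snapshot, so its output can be piped straight into `jq` for element inspection.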
Waiting for Streaming AI Responses
When interacting with AI-powered TUIs (such as opencode) that stream responses, you need a longer `--settle` time because the screen keeps updating as tokens arrive:
bash
# 1. Capture hash before sending prompt
HASH=$(pilotty snapshot -s myapp | jq -r '.content_hash')
# 2. Type prompt and submit
pilotty type -s myapp "write me a poem about ai agents"
pilotty key -s myapp Enter
# 3. Wait for streaming response to complete
# - Use longer settle (2-3s) since AI apps pause between chunks
# - Extend timeout for long responses (60s+)
pilotty snapshot -s myapp --await-change "$HASH" --settle 3000 -t 60000
# 4. Response may be scrolled - scroll up if needed to see full output
pilotty scroll -s myapp up 10
pilotty snapshot -s myapp --format text
Key parameters for streaming:
- `--settle` (2000-3000 ms): AI responses have pauses between chunks; 2-3 seconds ensures streaming is truly done
- `-t 60000`: Extend timeout beyond the 30s default for longer generations
- The settle timer resets on each screen change, so it naturally waits until streaming stops
Manual Change Detection
For manual polling (not recommended), use `content_hash` directly:
bash
# Get initial state
SNAP1=$(pilotty snapshot)
HASH1=$(echo "$SNAP1" | jq -r '.content_hash')
# Perform action
pilotty key Tab
# Check if screen changed
SNAP2=$(pilotty snapshot)
HASH2=$(echo "$SNAP2" | jq -r '.content_hash')
if [ "$HASH1" != "$HASH2" ]; then
echo "Screen changed - re-analyze elements"
fi
Using Elements Effectively
Elements are read-only context for understanding the UI. Use keyboard navigation for reliable interaction:
bash
# 1. Get snapshot to understand UI structure
pilotty snapshot | jq '.elements'
# Output shows toggles (checked/unchecked) and buttons with positions
# 2. Navigate and interact with keyboard (reliable approach)
pilotty key Tab # Move to next element
pilotty key Space # Toggle checkbox
pilotty key Enter # Activate button
# 3. Verify state changed
pilotty snapshot | jq '.elements[] | select(.kind == "toggle")'
Key insight: Use elements to understand WHAT is on screen, use keyboard to interact with it.
Navigation Approach
pilotty uses keyboard-first navigation, just like a human would:
bash
# 1. Take snapshot to see the screen
pilotty snapshot --format text
# 2. Navigate using keyboard
pilotty key Tab # Move to next element
pilotty key Enter # Activate/select
pilotty key Escape # Cancel/back
pilotty key Up # Move up in list/menu
pilotty key Space # Toggle checkbox
# 3. Type text when needed
pilotty type "search term"
pilotty key Enter
# 4. Click at coordinates for mouse-enabled TUIs
pilotty click 5 10 # Click at row 5, col 10
Key insight: Parse the snapshot text and elements to understand what's on screen, then use keyboard commands to navigate. This works reliably across all TUI applications.
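Keyboard navigation often means pressing the same key several times, for example Tab to reach the third field. Since `key` accepts space-separated sequences, the presses can be sent in one call. A small sketch, with `repeat_key` as a hypothetical helper name:

```bash
# Hypothetical helper: press KEY COUNT times by building one space-separated
# key sequence, e.g. repeat_key form Tab 3 sends "Tab Tab Tab".
# Usage: repeat_key <session> <key> <count>
repeat_key() {
  local session="$1" key="$2" count="$3" seq="" i=0
  while [ "$i" -lt "$count" ]; do
    seq="${seq:+$seq }$key"
    i=$((i + 1))
  done
  [ -n "$seq" ] || return 0   # count of 0: nothing to send
  pilotty key -s "$session" "$seq"
}
```

Whether the target app registers every press in a fast sequence is app-dependent; adding `--delay` before the sequence is the documented way to slow it down.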
Example: Edit file with vim
bash
# 1. Spawn vim
pilotty spawn --name editor vim /tmp/hello.txt
# 2. Wait for vim to load and capture baseline hash
pilotty wait-for -s editor "hello.txt"
HASH=$(pilotty snapshot -s editor | jq '.content_hash')
# 3. Enter insert mode
pilotty key -s editor i
# 4. Type content
pilotty type -s editor "Hello from pilotty!"
# 5. Wait for screen to update, then exit (no sleep needed!)
pilotty snapshot -s editor --await-change $HASH --settle 50
pilotty key -s editor "Escape : w q Enter"
# 6. Verify session ended
pilotty list-sessions
Alternative using individual keys:
bash
pilotty key -s editor Escape
pilotty type -s editor ":wq"
pilotty key -s editor Enter
Example: Dialog checklist interaction
bash
# 1. Spawn dialog checklist (--name before command)
pilotty spawn --name opts dialog --checklist "Select features:" 12 50 4 \
"notifications" "Push notifications" on \
"darkmode" "Dark mode theme" off \
"autosave" "Auto-save documents" on \
"telemetry" "Usage analytics" off
# 2. Wait for dialog to render (use --settle, not sleep!)
pilotty snapshot -s opts --settle 200 # Wait for initial render to stabilize
# 3. Get snapshot and examine elements, capture hash
SNAP=$(pilotty snapshot -s opts)
echo "$SNAP" | jq '.elements[] | select(.kind == "toggle")'
HASH=$(echo "$SNAP" | jq '.content_hash')
# 4. Navigate to "darkmode" and toggle it
pilotty key -s opts Down # Move to second option
pilotty key -s opts Space # Toggle it on
# 5. Wait for change and verify
pilotty snapshot -s opts --await-change $HASH | jq '.elements[] | select(.kind == "toggle") | {text, checked}'
# 6. Confirm selection
pilotty key -s opts Enter
# 7. Clean up
pilotty kill -s opts
Example: Form filling with elements
bash
# 1. Spawn a form application
pilotty spawn --name form my-form-app
# 2. Get snapshot to understand form structure
pilotty snapshot -s form | jq '.elements'
# Shows inputs, toggles, and buttons with positions for click command
# 3. Type into first input (likely already focused)
pilotty type -s form "myusername"
# 4. Tab to password field
pilotty key -s form Tab
pilotty type -s form "mypassword"
# 5. Tab to remember me and toggle
pilotty key -s form Tab
pilotty key -s form Space
# 6. Tab to Login and activate
pilotty key -s form Tab
pilotty key -s form Enter
# 7. Check result
pilotty snapshot -s form --format text
Example: Monitor with htop
bash
# 1. Spawn htop
pilotty spawn --name monitor htop
# 2. Wait for display
pilotty wait-for -s monitor "CPU"
# 3. Take snapshot to see current state
pilotty snapshot -s monitor --format text
# 4. Send commands
pilotty key -s monitor F9 # Kill menu
pilotty key -s monitor q # Quit
# 5. Kill session
pilotty kill -s monitor
Example: Interact with AI TUI (opencode, etc.)
AI-powered TUIs stream responses, requiring special handling:
bash
# 1. Spawn the AI app
pilotty spawn --name ai opencode
# 2. Wait for the prompt to be ready
pilotty wait-for -s ai "Ask anything" -t 15000
# 3. Capture baseline hash
HASH=$(pilotty snapshot -s ai | jq -r '.content_hash')
# 4. Type prompt and submit
pilotty type -s ai "explain the architecture of this codebase"
pilotty key -s ai Enter
# 5. Wait for streaming response to complete
# - settle=3000: Wait 3s of no changes to ensure streaming is done
# - timeout=60000: Allow up to 60s for long responses
pilotty snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text
# 6. If response is long and scrolled, scroll up to see full output
pilotty scroll -s ai up 20
pilotty snapshot -s ai --format text
# 7. Clean up
pilotty kill -s ai
Gotchas with AI apps:
- Use `--settle` (2000-3000 ms) because AI responses pause between chunks
- Extend timeout with `-t 60000` for complex prompts
- Long responses may scroll the terminal; use `scroll up` to see the beginning
- The settle timer resets on each screen update, so it waits for true completion
Sessions
Each session is isolated with its own:
- PTY (pseudo-terminal)
- Screen buffer
- Child process
bash
# Run multiple apps (--name must come before the command)
pilotty spawn --name monitoring htop
pilotty spawn --name editor vim file.txt
# Target specific session
pilotty snapshot -s monitoring
pilotty key -s editor Ctrl+S
# List all
pilotty list-sessions
# Kill specific
pilotty kill -s editor
The first session spawned without `--name` is automatically named `default`.
Important: The `--name` flag must come before the command. Everything after the command is passed as arguments to that command.
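Scripts that spawn several named sessions benefit from cleaning all of them up on exit. A minimal sketch, assuming `pilotty kill` exits non-zero when the session no longer exists; `kill_sessions` is a hypothetical helper.

```bash
# Hypothetical helper: kill every session named on the command line,
# continuing past sessions that have already exited.
# Usage: kill_sessions <session>...
kill_sessions() {
  local name
  for name in "$@"; do
    pilotty kill -s "$name" || echo "session '$name' could not be killed" >&2
  done
  return 0
}
```

Registering it once near the top of a script, for example with `trap 'kill_sessions monitoring editor' EXIT`, keeps stray sessions from outliving a failed run.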
Daemon Architecture
pilotty uses a background daemon for session management:
- Auto-start: Daemon starts on first command
- Auto-stop: Shuts down after 5 minutes with no sessions
- Session cleanup: Sessions removed when process exits (within 500ms)
- Shared state: Multiple CLI calls share sessions
You rarely need to manage the daemon manually.
Error Handling
Errors include actionable suggestions:
json
{
"code": "SESSION_NOT_FOUND",
"message": "Session 'abc123' not found",
"suggestion": "Run 'pilotty list-sessions' to see available sessions"
}
json
{
"code": "SPAWN_FAILED",
"message": "Failed to spawn process: command not found",
"suggestion": "Check that the command exists and is in PATH"
}
Common Patterns
Reliable action + wait (recommended)
bash
# The pattern: capture hash, act, await change
HASH=$(pilotty snapshot | jq '.content_hash')
pilotty key Enter
pilotty snapshot --await-change $HASH --settle 50
# This replaces fragile patterns like:
# pilotty key Enter && sleep 1 && pilotty snapshot # BAD: guessing
Wait then act
bash
pilotty spawn my-app
pilotty wait-for "Ready" # Ensure app is ready
pilotty snapshot # Then snapshot
Check state before action
bash
pilotty snapshot --format text | grep "Error" # Check for errors
pilotty key Enter # Then proceed
Check for specific element
bash
# Check if the first toggle is checked (-c prints one object per line, so head -1 returns a whole object)
pilotty snapshot | jq -c '.elements[] | select(.kind == "toggle") | {text, checked}' | head -1
# Find element at specific position
pilotty snapshot | jq '.elements[] | select(.row == 5 and .col == 10)'
Retry on timeout
bash
pilotty wait-for -t 5000 "Ready" || {
pilotty snapshot --format text # Check what's on screen
# Adjust approach based on actual state
}
Deep-dive Documentation
For detailed patterns and edge cases, see:
| Reference | Description |
|---|---|
| references/session-management.md | Multi-session patterns, isolation, cleanup |
| references/key-input.md | Complete key combinations reference |
| references/element-detection.md | Detection rules, confidence, patterns |
Ready-to-use Templates
Executable workflow scripts:
| Template | Description |
|---|---|
| templates/vim-workflow.sh | Edit file with vim, save, exit |
| templates/dialog-interaction.sh | Handle dialog/whiptail prompts |
| templates/multi-session.sh | Parallel TUI orchestration |
| templates/element-detection.sh | Element detection demo |
Usage:
bash
./templates/vim-workflow.sh /tmp/myfile.txt "File content here"
./templates/dialog-interaction.sh
./templates/multi-session.sh
./templates/element-detection.sh