Control the mouse, keyboard, and read screen content via accessibility. Use this skill when the user asks to click somewhere on screen, type text into an app, move the mouse, press keyboard shortcuts, read what's on screen, get the accessibility tree of the current window, automate desktop interactions, or control the computer.
npx skill4agent add dalehurley/phpbot desktop-control

| Script | Type | Description |
|---|---|---|
| mouse.py | Python | Mouse movement, clicking, dragging, scrolling |
| keyboard.py | Python | Text typing, key presses, hotkeys |
| screen.py | Python | Screen info, capture, accessibility tree reading |
pyautogui

| Parameter | Required | Description | Example |
|---|---|---|---|
| action (positional) | Yes | Action to perform | click |
| `--x` | For most | X coordinate (pixels from left) | 500 |
| `--y` | For most | Y coordinate (pixels from top) | 300 |
| | No | Mouse button | left |
| `--to-x` | For drag | Destination X coordinate | 700 |
| `--to-y` | For drag | Destination Y coordinate | 400 |
| `--amount` | For scroll | Scroll amount (positive=up, negative=down) | -3 |
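The parameter table above maps directly onto a command line. The sketch below is one possible way to assemble that command from Python before handing it to a process runner. The helper name `mouse_command` and its validation rules are hypothetical, not part of the skill; only the script path and the `--x`, `--y`, `--to-x`, `--to-y`, and `--amount` flags come from the examples in this document.

```python
from typing import List, Optional

# Path as used in the command examples in this document.
MOUSE_SCRIPT = "skills/desktop-control/scripts/mouse.py"


def mouse_command(action: str,
                  x: Optional[int] = None,
                  y: Optional[int] = None,
                  to_x: Optional[int] = None,
                  to_y: Optional[int] = None,
                  amount: Optional[int] = None) -> List[str]:
    """Build the argv list for a mouse.py call without running it.

    Hypothetical helper: the checks mirror the "Required" column of the
    table (drag needs a destination, scroll needs an amount).
    """
    if action == "drag" and any(v is None for v in (x, y, to_x, to_y)):
        raise ValueError("drag needs x, y, to_x and to_y")
    if action == "scroll" and amount is None:
        raise ValueError("scroll needs amount (positive=up, negative=down)")

    argv = ["python3", MOUSE_SCRIPT, action]
    flags = (("--x", x), ("--y", y), ("--to-x", to_x),
             ("--to-y", to_y), ("--amount", amount))
    for flag, value in flags:
        if value is not None:
            argv += [flag, str(value)]
    return argv


if __name__ == "__main__":
    # Prints the argv lists; nothing is executed.
    print(mouse_command("click", x=500, y=300))
    print(mouse_command("scroll", amount=-3))
```

Returning a list rather than a shell string avoids quoting problems if the result is later passed to `subprocess.run`.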
# Move mouse
python3 skills/desktop-control/scripts/mouse.py move --x 500 --y 300
# Click at position
python3 skills/desktop-control/scripts/mouse.py click --x 500 --y 300
# Double click
python3 skills/desktop-control/scripts/mouse.py doubleclick --x 500 --y 300
# Right click
python3 skills/desktop-control/scripts/mouse.py rightclick --x 500 --y 300
# Drag from one position to another
python3 skills/desktop-control/scripts/mouse.py drag --x 100 --y 100 --to-x 500 --to-y 500
# Scroll down 3 clicks
python3 skills/desktop-control/scripts/mouse.py scroll --amount -3
# Scroll up 5 clicks at specific position
python3 skills/desktop-control/scripts/mouse.py scroll --x 500 --y 300 --amount 5
# Get current mouse position
python3 skills/desktop-control/scripts/mouse.py position

| Parameter | Required | Description | Example |
|---|---|---|---|
| action (positional) | Yes | Action to perform | type |
| `--text` | For type | Text to type | Hello World |
| `--key` | For press | Key name to press | enter |
| `--keys` | For hotkey | Key combination, plus-separated | command+c |
| `--interval` | No | Delay between keystrokes in seconds (default: 0.02) | 0.05 |
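The keyboard parameters above pair each action with exactly one value flag, and hotkeys use a plus-separated string. A minimal sketch of that mapping follows. The helpers `split_hotkey` and `keyboard_command` are hypothetical names introduced for illustration, and the normalisation to lowercase is an assumption, not documented behaviour of `keyboard.py`.

```python
from typing import List, Optional

# Path as used in the command examples in this document.
KEYBOARD_SCRIPT = "skills/desktop-control/scripts/keyboard.py"

# Which flag carries the value for each action, per the parameter table.
VALUE_FLAG = {"type": "--text", "press": "--key", "hotkey": "--keys"}


def split_hotkey(keys: str) -> List[str]:
    """Split a plus-separated combination such as 'command+shift+s'."""
    parts = [part.strip().lower() for part in keys.split("+")]
    if not all(parts):
        raise ValueError(f"empty key in combination: {keys!r}")
    return parts


def keyboard_command(action: str, value: str,
                     interval: Optional[float] = None) -> List[str]:
    """Build the argv list for a keyboard.py call without running it."""
    if action not in VALUE_FLAG:
        raise ValueError(f"unknown action: {action!r}")
    if action == "hotkey":
        # Assumption: re-joining normalised parts is acceptable to the script.
        value = "+".join(split_hotkey(value))
    argv = ["python3", KEYBOARD_SCRIPT, action, VALUE_FLAG[action], value]
    if interval is not None:
        # The examples in this document only show --interval with "type".
        argv += ["--interval", str(interval)]
    return argv


if __name__ == "__main__":
    print(keyboard_command("hotkey", "Command+Shift+S"))
    print(keyboard_command("type", "Hello", interval=0.1))
```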
# Type text
python3 skills/desktop-control/scripts/keyboard.py type --text "Hello World"
# Type slowly
python3 skills/desktop-control/scripts/keyboard.py type --text "Hello" --interval 0.1
# Press a single key
python3 skills/desktop-control/scripts/keyboard.py press --key enter
python3 skills/desktop-control/scripts/keyboard.py press --key tab
python3 skills/desktop-control/scripts/keyboard.py press --key escape
# Keyboard shortcuts (hotkeys)
python3 skills/desktop-control/scripts/keyboard.py hotkey --keys "command+c"
python3 skills/desktop-control/scripts/keyboard.py hotkey --keys "command+shift+s"
python3 skills/desktop-control/scripts/keyboard.py hotkey --keys "alt+tab"
python3 skills/desktop-control/scripts/keyboard.py hotkey --keys "command+space"

Key names: enter, return, tab, space, backspace, delete, escape, up, down, left, right, home, end, pageup, pagedown, f1-f12, command, ctrl, alt, shift, capslock

| Parameter | Required | Description | Example |
|---|---|---|---|
| action (positional) | Yes | Action to perform | read-ui |
| `--output` | For capture | Screenshot output path | /tmp/screen.png |
| `--x`, `--y`, `--width`, `--height` | For capture region | Region to capture | |
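For the capture action, the region is optional and the output path is required. The sketch below shows one way to build that command so a full-screen capture and a region capture share the same code path. `capture_command` is a hypothetical helper, and the check for positive width and height is an assumption about what `screen.py` would accept.

```python
from typing import List, Optional, Tuple

# Path as used in the command examples in this document.
SCREEN_SCRIPT = "skills/desktop-control/scripts/screen.py"

# (x, y, width, height) in pixels.
Region = Tuple[int, int, int, int]


def capture_command(output: str, region: Optional[Region] = None) -> List[str]:
    """Build the argv list for a screen.py capture call without running it."""
    argv = ["python3", SCREEN_SCRIPT, "capture"]
    if region is not None:
        x, y, width, height = region
        if width <= 0 or height <= 0:
            # Assumption: an empty or negative region is never meaningful.
            raise ValueError("width and height must be positive")
        argv += ["--x", str(x), "--y", str(y),
                 "--width", str(width), "--height", str(height)]
    argv += ["--output", output]
    return argv


if __name__ == "__main__":
    print(capture_command("/tmp/screen.png"))
    print(capture_command("/tmp/region.png", region=(0, 0, 800, 600)))
```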
# Get screen size and mouse position
python3 skills/desktop-control/scripts/screen.py info
# Take a screenshot
python3 skills/desktop-control/scripts/screen.py capture --output /tmp/screen.png
# Capture a specific region
python3 skills/desktop-control/scripts/screen.py capture --x 0 --y 0 --width 800 --height 600 --output /tmp/region.png
# Read the accessibility tree of the frontmost application (MOST USEFUL)
python3 skills/desktop-control/scripts/screen.py read-ui
# Read accessibility tree with depth limit
python3 skills/desktop-control/scripts/screen.py read-ui --depth 3

read-ui

python3 skills/desktop-control/scripts/screen.py read-ui
python3 skills/desktop-control/scripts/mouse.py click --x 500 --y 300
python3 skills/desktop-control/scripts/keyboard.py type --text "search query"
python3 skills/desktop-control/scripts/keyboard.py press --key enter

click on the search bar
type "hello" into the text field
press command+s to save
what's on the screen right now
read the UI elements of the current window
move the mouse to the center of the screen
scroll down in this window
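
A request such as "click on the search bar" and type a query corresponds to the read-then-act sequence shown earlier: read the accessibility tree, click, type, press enter. The sketch below strings those four commands together with a dry-run default, so nothing touches the desktop unless explicitly requested. `search_workflow` and `run_steps` are hypothetical names, and the coordinates are placeholders that would normally come from the `read-ui` output.

```python
import shlex
import subprocess
from typing import List

# Directory as used in the command examples in this document.
SCRIPTS = "skills/desktop-control/scripts"


def search_workflow(x: int, y: int, query: str) -> List[List[str]]:
    """Return the four commands of the read, click, type, enter sequence."""
    return [
        ["python3", f"{SCRIPTS}/screen.py", "read-ui"],
        ["python3", f"{SCRIPTS}/mouse.py", "click", "--x", str(x), "--y", str(y)],
        ["python3", f"{SCRIPTS}/keyboard.py", "type", "--text", query],
        ["python3", f"{SCRIPTS}/keyboard.py", "press", "--key", "enter"],
    ]


def run_steps(steps: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_steps(search_workflow(500, 300, "search query"))
```

Stopping at the first failure matters here because later steps assume the earlier ones succeeded: typing after a missed click would send text to whatever window happens to have focus.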