Automation
web-infra-dev/midscene-sk...
computer-automation
Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands.
Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible elements on screen regardless of technology stack.
⚠️ Takes over the user's real mouse and keyboard. For web apps, prefer "Browser Automation".
Use this only for desktop-native apps (Electron, Qt, native macOS/Windows/Linux) that cannot run in a browser.
Triggers: open app, press key, desktop, computer, click on screen, type text, screenshot desktop,
launch application, switch window, desktop automation, control computer, mouse click, keyboard shortcut,
screen capture, find on screen, read screen, verify window, close app, test Electron app
Powered by Midscene.js (https://midscenejs.com)