open-computer-use

Open Computer Use

Overview

Open Computer Use exposes Computer Use as a local CLI and stdio MCP server. It is not Codex.app-specific; adapt the commands and MCP config to the agent runtime you are operating in.
It supports the same core tool surface across macOS, Linux, and Windows: `list_apps`, `get_app_state`, `click`, `perform_secondary_action`, `scroll`, `drag`, `type_text`, `press_key`, and `set_value`.

Core Workflow

  1. Check that the CLI is installed with `open-computer-use -h`. If installation or setup is missing, read references/installation.md.
  2. On macOS, run `open-computer-use doctor` before the first real GUI task. If permissions are missing, ask the user to approve Accessibility and Screen Recording in the onboarding UI.
  3. Inspect available apps before acting: `open-computer-use call list_apps`.
  4. Capture the current UI state with `open-computer-use call get_app_state --args '{"app":"TextEdit"}'`.
  5. Prefer element-targeted actions using `element_index` from the latest `get_app_state` result.
  6. For multi-step CLI work, use `open-computer-use call --calls '<json-array>'` so one process can reuse the latest element index mapping.
  7. For agent runtimes that support local MCP servers, configure `open-computer-use mcp` and call the exposed Computer Use tools directly. Read references/usage.md.
  8. If communication, permission, or desktop-session access fails, read references/troubleshooting.md.
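
Steps 1 to 3 can be sketched as a small preflight function. This is only an illustration: `OCU_BIN` and `ocu_preflight` are hypothetical names introduced here, and the sketch assumes that `doctor` exits non-zero when something is wrong, which the CLI may or may not do.

```sh
#!/bin/sh
# Preflight sketch for steps 1-3 of the workflow.
# OCU_BIN is a hypothetical override variable, not part of the CLI.
OCU_BIN="${OCU_BIN:-open-computer-use}"

ocu_preflight() {
  # Step 1: confirm the CLI is on PATH and answers -h.
  if ! command -v "$OCU_BIN" >/dev/null 2>&1; then
    echo "missing: $OCU_BIN (see references/installation.md)" >&2
    return 1
  fi
  "$OCU_BIN" -h >/dev/null || return 1

  # Step 2: on macOS only, run doctor before the first GUI task.
  # Assumption: doctor signals problems through its exit status.
  if [ "$(uname -s)" = "Darwin" ]; then
    "$OCU_BIN" doctor || return 2
  fi

  # Step 3: list the apps that can be targeted.
  "$OCU_BIN" call list_apps
}

# Run the checks; a non-zero status means setup is incomplete.
ocu_preflight || echo "preflight failed with status $?" >&2
```

A status of 1 points at installation, and 2 at the macOS permission check, so a wrapper script can decide which reference file to consult.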

Operating Rules

  • Treat the target desktop as the user's real session. Do not inspect password managers, unrelated private content, or sensitive apps unless the user explicitly asked for that task.
  • Ask before sending, deleting, purchasing, approving, uploading, or making other externally visible changes.
  • Do not assume Codex.app plugin helpers are available. Use the installed `open-computer-use` CLI or an explicit MCP config.
  • Always run `get_app_state` before using `element_index`; do not guess indexes across sessions or after large UI changes.
  • Prefer semantic actions and `set_value` for editable controls. Use coordinate `click`, `scroll`, and `drag` only when the element tree does not expose a safer target.
  • On macOS, do not enable `OPEN_COMPUTER_USE_ALLOW_GLOBAL_POINTER_FALLBACKS=1` unless the user explicitly wants diagnostic behavior that may move the real pointer.
  • On Windows and Linux, confirm the command is running inside the logged-in desktop session before assuming GUI automation is available.
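
For the last rule, one rough way to check for a desktop session on Linux is to look for the `DISPLAY` (X11) or `WAYLAND_DISPLAY` (Wayland) environment variables. This is a heuristic sketch, not the CLI's own detection logic; `ocu_has_desktop_session` is a hypothetical helper name, and Windows is not covered.

```sh
#!/bin/sh
# Heuristic: on Linux, a graphical session normally exports DISPLAY (X11)
# or WAYLAND_DISPLAY (Wayland). Other platforms are not checked here.
ocu_has_desktop_session() {
  case "$(uname -s)" in
    Linux)
      [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
      ;;
    *)
      # No check implemented for this platform in this sketch.
      return 0
      ;;
  esac
}

ocu_has_desktop_session || echo "no desktop session detected; see references/troubleshooting.md" >&2
```

A check like this is mainly useful over SSH or in CI, where the command often runs outside the logged-in session.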

Common CLI Actions

```sh
open-computer-use -h
open-computer-use doctor
open-computer-use call list_apps
open-computer-use call get_app_state --args '{"app":"TextEdit"}'
open-computer-use call click --args '{"app":"TextEdit","element_index":0}'
open-computer-use call type_text --args '{"app":"TextEdit","text":"Hello from Open Computer Use"}'
```

For a short sequence that reuses state in one process:

```sh
open-computer-use call --calls '[
  {"tool":"get_app_state","args":{"app":"TextEdit"}},
  {"tool":"press_key","args":{"app":"TextEdit","key":"Return"}}
]'
```
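
When sequences are assembled in a script, a small helper can make sure every `--calls` payload begins with `get_app_state`, in line with the rule about refreshing state before acting. This is a minimal sketch: `ocu_calls_with_state` is a hypothetical helper, and it does no JSON escaping, so it assumes the app name contains no quotes or backslashes.

```sh
#!/bin/sh
# Build a --calls JSON array that always starts with get_app_state.
# Usage: ocu_calls_with_state <app> <call-json>...
# The subshell body keeps the helper's variables out of the caller's scope.
ocu_calls_with_state() (
  app="$1"
  shift
  payload="[{\"tool\":\"get_app_state\",\"args\":{\"app\":\"$app\"}}"
  for call in "$@"; do
    payload="$payload,$call"
  done
  printf '%s]\n' "$payload"
)

calls=$(ocu_calls_with_state TextEdit \
  '{"tool":"type_text","args":{"app":"TextEdit","text":"Hello from Open Computer Use"}}' \
  '{"tool":"press_key","args":{"app":"TextEdit","key":"Return"}}')
printf '%s\n' "$calls"

# Only invoke the CLI when it is actually installed.
if command -v open-computer-use >/dev/null 2>&1; then
  open-computer-use call --calls "$calls"
fi
```

The tool names and argument shapes are the ones used in the examples above; nothing else about the `--calls` format is assumed.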

MCP Usage

For runtimes that can launch local MCP servers over stdio, use:

```toml
[mcp_servers.open_computer_use]
command = "open-computer-use"
args = ["mcp"]
```

Read references/usage.md for JSON config examples, direct tool-call patterns, and platform notes.
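
To check that the server starts before wiring it into a runtime, one option is to send a single MCP `initialize` request over stdin and read the first reply. This is a rough smoke test under assumptions: it presumes the server follows the standard MCP stdio transport (newline-delimited JSON-RPC 2.0), accepts protocol version `2024-11-05`, and exits when stdin closes. The client name `smoke-test` and the helper `mcp_initialize_request` are made up for illustration.

```sh
#!/bin/sh
# Emit a JSON-RPC 2.0 "initialize" request on a single line,
# as the MCP stdio transport expects.
mcp_initialize_request() {
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.0"}}}'
}

if command -v open-computer-use >/dev/null 2>&1; then
  # Send one request, close stdin, and show the first response line.
  mcp_initialize_request | open-computer-use mcp | head -n 1
else
  # CLI not installed: just show the request that would be sent.
  mcp_initialize_request
fi
```

A JSON reply carrying `"id":1` suggests the server is reachable over stdio; an agent runtime would then continue with the rest of the MCP handshake on its own.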

References

  • references/installation.md: one-time CLI install, agent MCP install commands, and macOS permissions.
  • references/usage.md: MCP config, direct CLI calls, sequencing, and platform behavior.
  • references/troubleshooting.md: permission, desktop-session, app discovery, and action failures.