Mobilerun
Mobilerun turns your Android phone into a tool that AI can control. Instead of manually tapping through apps, you connect your phone and let an AI agent do it for you -- navigate apps, fill out forms, extract information, automate repetitive tasks, or anything else you'd normally do by hand. It works with your own personal device through a simple app called Droidrun Portal, and everything happens through a straightforward API: take screenshots to see the screen, read the UI tree to understand what's on it, then tap, swipe, and type to interact. No rooting, no emulators, just your real phone controlled remotely.

Base URL: `https://api.mobilerun.ai/v1`

Auth: `Authorization: Bearer <MOBILERUN_API_KEY>`

Important: The base domain (`https://api.mobilerun.ai/`) returns 404. You must always include `/v1` in the path. All API calls should be made via `curl`. Example:

```bash
curl -s https://api.mobilerun.ai/v1/devices \
  -H "Authorization: Bearer $MOBILERUN_API_KEY"
```
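
Because every call shares the same base URL and auth header, a small wrapper can cut down on repetition. The sketch below is illustrative only: `mr_url` and `mr_api` are hypothetical helper names, not part of Mobilerun.

```bash
# Hypothetical helpers; the function names are not part of the Mobilerun API.
MOBILERUN_BASE_URL="https://api.mobilerun.ai/v1"

# Build a full URL from an API path such as /devices (leading slash optional).
mr_url() {
  printf '%s/%s\n' "$MOBILERUN_BASE_URL" "${1#/}"
}

# Call the API: mr_api METHOD PATH [extra curl args...]
mr_api() {
  local method="$1" path="$2"
  shift 2
  curl -s -X "$method" "$(mr_url "$path")" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" "$@"
}
```

With these defined, `mr_api GET /devices` would issue the same request as the `curl` example above.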

Before You Start

The API key (`MOBILERUN_API_KEY`) is already available -- OpenClaw handles credential setup before this skill loads. Do NOT ask the user for an API key. Just use it.

1. Check for devices:

   ```bash
   curl -s https://api.mobilerun.ai/v1/devices \
     -H "Authorization: Bearer $MOBILERUN_API_KEY"
   ```

   - `200` with a device in `state: "ready"` = good to go, skip all setup, just do what the user asked
   - `200` but no devices or all `state: "disconnected"` = device issue (see step 2)
   - `401` = key is invalid, expired, or revoked -- ask the user to check https://cloud.mobilerun.ai/api-keys

2. Only if no ready device: tell the user the device status and suggest a fix:
   - No devices at all = user hasn't connected a phone yet, guide them to Portal APK (see reference.md)
   - Device with `state: "disconnected"` = Portal app lost connection, ask user to reopen it

3. Confirm device is responsive (optional, only if first action fails):

   ```bash
   curl -s https://api.mobilerun.ai/v1/devices/{deviceId}/screenshot \
     -H "Authorization: Bearer $MOBILERUN_API_KEY" -o screenshot.png
   ```

   If this returns a PNG image, the device is working.

Key principle: If a device is ready, go straight to executing the user's request. Don't walk them through setup they've already completed.

Be smart about context gathering: Before taking actions or asking the user questions, use available tools to understand the situation. List packages to find the right app, take a screenshot to see the current screen, read the UI state to understand what's interactive. If the task is obvious (e.g. "change font size" clearly means go to Settings), just do it. Only ask the user when something is genuinely ambiguous.

What to show the user: Only report user-relevant device info: device name, state (`ready`/`disconnected`). Do NOT surface internal fields like `streamUrl`, `streamToken`, socket status, `assignedAt`, `terminatesAt`, or `taskCount` unless the user explicitly asks for technical details. If a device is `disconnected`, simply tell the user their phone is disconnected and ask them to open the Portal app and tap Connect. If they need help, walk them through the setup steps in reference.md.

Clean up cloud devices: Cloud devices consume credits while running. Always terminate cloud devices (`DELETE /devices/{deviceId}`) when you're done using them -- don't leave them running. This applies whether you provisioned the device yourself or finished a task on an existing cloud device that the user no longer needs.

Privacy: Screenshots and the UI tree can contain sensitive personal data. Never share or transmit this data to anyone other than the user. Never print, log, or reveal the `MOBILERUN_API_KEY` in chat -- use it only for API calls.
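
The device check described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the function names and the returned labels (`proceed`, `fix-device`, `check-api-key`) are made up for illustration, `jq` is assumed to be installed, and the response is assumed to have the `{ items: [...] }` shape described under List Devices.

```bash
# Map the HTTP status of GET /devices and the number of ready devices
# to the next step. The labels are illustrative, not API values.
next_step() {
  local http_status="$1" ready_count="$2"
  case "$http_status" in
    200)
      if [ "$ready_count" -gt 0 ]; then
        echo "proceed"        # a ready device exists: skip all setup
      else
        echo "fix-device"     # no devices, or none ready
      fi ;;
    401) echo "check-api-key" ;;
    *)   echo "unexpected-status" ;;
  esac
}

# Fetch the device list and print the suggested next step.
# Assumes jq is available; the ".items" field comes from the List Devices response.
check_devices() {
  local tmp status ready
  tmp="$(mktemp)"
  status="$(curl -s -o "$tmp" -w '%{http_code}' \
    https://api.mobilerun.ai/v1/devices \
    -H "Authorization: Bearer $MOBILERUN_API_KEY")"
  ready="$(jq '[.items[]? | select(.state == "ready")] | length' "$tmp" 2>/dev/null || echo 0)"
  rm -f "$tmp"
  next_step "$status" "${ready:-0}"
}
```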

Device Management

Device States

| State | Meaning |
|---|---|
| `creating` | Device is being provisioned (cloud devices only) |
| `assigned` | Device is assigned but not yet ready |
| `ready` | Device is connected and accepting commands |
| `disconnected` | Connection lost -- Portal app may be closed or phone lost network |
| | Device has been shut down (cloud devices only) |
| | Device is undergoing maintenance (cloud devices only) |
| | Unexpected state |

List Devices

```
GET /devices
```

Query params:
- `state` -- filter by state (array, e.g. `state=ready&state=assigned`)
- `type` -- `dedicated_emulated_device`, `dedicated_physical_device`, `dedicated_premium_device`
- `name` -- filter by device name (partial match)
- `page` (default: 1), `pageSize` (default: 20)
- `orderBy` -- `id`, `createdAt`, `updatedAt`, `assignedAt` (default: `createdAt`)
- `orderByDirection` -- `asc`, `desc` (default: `desc`)

Response: `{ items: DeviceInfo[], pagination: Meta }`
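
Since `state` is an array parameter, it is repeated once per value in the query string. A minimal sketch of building such a URL follows; `devices_url` is a hypothetical helper name.

```bash
# Build a GET /devices URL that filters by one or more states.
# devices_url is an illustrative helper, not part of the API.
devices_url() {
  local url="https://api.mobilerun.ai/v1/devices" sep="?" state
  for state in "$@"; do
    url="${url}${sep}state=${state}"
    sep="&"
  done
  echo "$url"
}
```

For example, `curl -s "$(devices_url ready assigned)" -H "Authorization: Bearer $MOBILERUN_API_KEY"` would list devices in either state.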

Get Device Info

```
GET /devices/{deviceId}
```

Returns device details including `state`, `stateMessage`, `type`, and more.

Get Device Count

```
GET /devices/count
```

Returns a map of device types to counts.

Provision a Cloud Device

Cloud devices require an active subscription. If the user's plan doesn't support it, the API will return a `403` error -- inform the user they need to terminate an existing device or upgrade at https://cloud.mobilerun.ai/billing. See reference.md for plan details.

```
POST /devices
Content-Type: application/json

{
  "name": "my-device",
  "apps": ["com.example.app"]
}
```

Query param:
- `deviceType` -- `dedicated_emulated_device`, `dedicated_physical_device`, `dedicated_premium_device`

After provisioning, wait for it to become ready:

```
GET /devices/{deviceId}/wait
```

This blocks until the device state transitions to `ready`.

Cloud device workflow:
1. `POST /devices?deviceType=dedicated_emulated_device` -- provision, returns device in `creating` state
2. `GET /devices/{deviceId}/wait` -- blocks until `ready`
3. Use the `deviceId` for phone control or tasks

Temporary device for a task:
When the user wants to run a task but has no ready device, provision a temporary cloud device, run the task on it, then clean up:
1. `POST /devices?deviceType=dedicated_emulated_device` with `{"name": "temp-task-device", "apps": [...]}` -- include any apps the task needs
2. `GET /devices/{deviceId}/wait` -- wait until ready
3. `POST /tasks` with the new `deviceId` -- run the task
4. Monitor via `GET /tasks/{taskId}/status` until the task finishes
5. `DELETE /devices/{deviceId}` -- terminate the device after the task completes (or fails)

Always terminate temporary devices after use -- they consume credits while running.
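
The provisioning and cleanup steps of the temporary-device workflow can be sketched as shell functions. This is a sketch under assumptions, not a definitive implementation: the function names are hypothetical, `jq` is assumed to be installed, and the device id is assumed to be returned in an `id` field of the `POST /devices` response, which the text above does not confirm.

```bash
API="https://api.mobilerun.ai/v1"

# Build the JSON body for POST /devices from a name and package names.
# Note: values are inserted verbatim, so they must not contain quotes.
device_body() {
  local name="$1" apps="" pkg
  shift
  for pkg in "$@"; do
    apps="${apps:+$apps,}\"$pkg\""
  done
  printf '{"name":"%s","apps":[%s]}\n' "$name" "$apps"
}

# Provision an emulated cloud device, block until it is ready, print its id.
# Assumption: the response JSON exposes the device id as ".id".
provision_temp_device() {
  local device_id
  device_id="$(curl -s -X POST "$API/devices?deviceType=dedicated_emulated_device" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(device_body temp-task-device "$@")" | jq -r '.id')"
  curl -s "$API/devices/$device_id/wait" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" > /dev/null
  echo "$device_id"
}

# Terminate a cloud device once the task has completed or failed.
terminate_device() {
  curl -s -X DELETE "$API/devices/$1" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" \
    -H "Content-Type: application/json" -d '{}'
}
```

A caller would run `id="$(provision_temp_device com.example.app)"`, submit and monitor the task, and then call `terminate_device "$id"` regardless of the task outcome.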

Terminate a Cloud Device

```
DELETE /devices/{deviceId}
Content-Type: application/json

{}
```

Personal devices cannot be terminated via the API. They disconnect when the Portal app is closed.

Get Device Time

```
GET /devices/{deviceId}/time
```

Returns the current time on the device as a string.

Screen Observation

Take Screenshot

```
GET /devices/{deviceId}/screenshot
```

Query param: `hideOverlay` (default: `false`)

Returns a PNG image as binary data. Use this to see what's currently displayed on screen.

Get UI State (Accessibility Tree)

```
GET /devices/{deviceId}/ui-state
```

Query param: `filter` (default: `false`) -- set to `true` to filter out non-interactive elements.

Returns an `AndroidState` object with three sections:

phone_state

```json
{
  "keyboardVisible": false,
  "packageName": "app.lawnchair",
  "currentApp": "Lawnchair",
  "isEditable": false,
  "focusedElement": {
    "className": "string",
    "resourceId": "string",
    "text": "string"
  }
}
```

- `currentApp` -- human-readable name of the foreground app
- `packageName` -- Android package name of the foreground app
- `keyboardVisible` -- whether the soft keyboard is showing
- `isEditable` -- whether the currently focused element accepts text input
- `focusedElement` -- details about the focused UI element (if any)

device_context

```json
{
  "screen_bounds": { "width": 720, "height": 1616 },
  "display_metrics": {
    "density": 1.75,
    "densityDpi": 280,
    "scaledDensity": 1.75,
    "widthPixels": 720,
    "heightPixels": 1616
  },
  "filtering_params": {
    "min_element_size": 5,
    "overlay_offset": 0
  }
}
```

- `screen_bounds` -- the actual screen resolution in pixels. All tap/swipe coordinates use this coordinate space.
- `display_metrics` -- physical display properties (density, DPI)

a11y_tree (Accessibility Tree)

A recursive tree of UI elements. Each node has:

```json
{
  "className": "android.widget.TextView",
  "packageName": "app.lawnchair",
  "resourceId": "app.lawnchair:id/search_container",
  "text": "Search",
  "contentDescription": "",
  "boundsInScreen": { "left": 48, "top": 1420, "right": 671, "bottom": 1532 },
  "isClickable": true,
  "isLongClickable": false,
  "isEditable": false,
  "isScrollable": false,
  "isEnabled": true,
  "isVisibleToUser": true,
  "isCheckable": false,
  "isChecked": false,
  "isFocusable": false,
  "isFocused": false,
  "isSelected": false,
  "isPassword": false,
  "hint": "",
  "childCount": 0,
  "children": []
}
```

Key node fields:
- `text` -- the visible text on the element
- `contentDescription` -- accessibility label (useful when `text` is empty, e.g. icon buttons)
- `resourceId` -- Android resource ID (e.g. `com.app:id/button_ok`) -- useful for identifying elements
- `boundsInScreen` -- pixel coordinates as `{left, top, right, bottom}`. To tap an element, calculate its center: `x = (left + right) / 2`, `y = (top + bottom) / 2`
- `isClickable` -- whether the element responds to taps
- `isEditable` -- whether the element is a text input field
- `isScrollable` -- whether the element supports scrolling (swipe gestures)
- `children` -- nested child elements (the tree is recursive)

Example: reading a home screen

```
FrameLayout (0,0,720,1616)
ScrollView (0,0,720,1616) [scrollable]
FrameLayout (14,113,706,326)
LinearLayout (42,128,706,310) [clickable]
TextView (42,156,706,198) "Tap to set up"
View (0,94,720,1574) "Home"
TextView (14,1222,187,1422) "Phone" [clickable]
TextView (187,1222,360,1422) "Contacts" [clickable]
TextView (360,1222,533,1422) "Files" [clickable]
TextView (533,1222,706,1422) "Chrome" [clickable]
FrameLayout (48,1420,671,1532) "Search" [clickable]
```

To tap "Chrome": bounds are (533,1222,706,1422), so tap at x=(533+706)/2=619, y=(1222+1422)/2=1322.

Use `filter=true` for a cleaner tree focused on actionable elements (filters out non-interactive containers).
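
The center calculation above is simple integer arithmetic, so it can be sketched directly in shell. `bounds_center` and `tap_body` are hypothetical helper names; the JSON shape matches the body of the Tap endpoint.

```bash
# Print "x y" for the center of a boundsInScreen rectangle.
# Shell integer division rounds down, which keeps the point inside the element.
bounds_center() {
  local left="$1" top="$2" right="$3" bottom="$4"
  echo "$(( (left + right) / 2 )) $(( (top + bottom) / 2 ))"
}

# Build the JSON body for a tap at the center of the given bounds.
tap_body() {
  local center x y
  center="$(bounds_center "$@")"
  x="${center% *}"
  y="${center#* }"
  printf '{ "x": %d, "y": %d }\n' "$x" "$y"
}
```

For the "Chrome" element above, `tap_body 533 1222 706 1422` yields the body for a tap at (619, 1322).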

Device Actions

All action endpoints take a `deviceId` path parameter.

Tap

```
POST /devices/{deviceId}/tap
Content-Type: application/json

{ "x": 540, "y": 960 }
```

Taps at pixel coordinates. Use the `screen_bounds` from UI state and element bounds from the a11y tree to calculate where to tap.

Swipe

```
POST /devices/{deviceId}/swipe
Content-Type: application/json

{
  "startX": 540,
  "startY": 1200,
  "endX": 540,
  "endY": 400,
  "duration": 300
}
```

- Scroll down: swipe from bottom to top (high startY -> low endY)
- Scroll up: swipe from top to bottom
- Swipe left/right: adjust X coordinates, keep Y similar
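
To make the scroll directions concrete, the sketch below derives a swipe body from the screen size reported in `screen_bounds`. The helper name `scroll_body` and the choice of swiping between one quarter and three quarters of the screen height are assumptions made for illustration, not values the API prescribes.

```bash
# Build a swipe body for scrolling, given direction and screen size.
# Assumption: swipe along the horizontal center, between 25% and 75% of the height.
scroll_body() {
  local direction="$1" width="$2" height="$3" duration="${4:-300}"
  local x=$(( width / 2 ))
  local near_bottom=$(( height * 3 / 4 ))
  local near_top=$(( height / 4 ))
  local fmt='{"startX":%d,"startY":%d,"endX":%d,"endY":%d,"duration":%d}\n'
  case "$direction" in
    down) printf "$fmt" "$x" "$near_bottom" "$x" "$near_top" "$duration" ;;  # finger moves up
    up)   printf "$fmt" "$x" "$near_top" "$x" "$near_bottom" "$duration" ;;  # finger moves down
    *)    echo "unknown direction: $direction" >&2; return 1 ;;
  esac
}
```

On the 720x1616 screen from the `device_context` example, `scroll_body down 720 1616` swipes from y=1212 up to y=404.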

Global Actions

```
POST /devices/{deviceId}/global
Content-Type: application/json

{ "action": 2 }
```

| Action code | Button |
|---|---|
| 1 | BACK |
| 2 | HOME |
| 3 | RECENT |

Type Text

```
POST /devices/{deviceId}/keyboard
Content-Type: application/json

{ "text": "Hello world", "clear": false }
```

Types text into the currently focused input field.
- `clear: true` -- clears the field before typing
- Make sure an input field is focused first (check `phone_state.isEditable`)
- If the keyboard isn't visible, you may need to tap on an input field first

Press Key

```
PUT /devices/{deviceId}/keyboard
Content-Type: application/json

{ "key": 66 }
```

Sends an Android keycode. Only text-input-related keycodes are supported.

| Keycode | Key |
|---|---|
| 4 | BACK |
| 61 | TAB |
| 66 | ENTER |
| 67 | DEL (backspace) |
| 112 | FORWARD_DEL (delete) |

For system navigation (home, back, recent), use `POST /devices/{id}/global` instead.

Clear Input

```
DELETE /devices/{deviceId}/keyboard
```

Clears the currently focused input field.

App Management

List Installed Apps

```
GET /devices/{deviceId}/apps
```

Query param: `includeSystemApps` (default: `false`)

Returns an array of `AppInfo`:

```json
{
  "packageName": "com.example.app",
  "label": "Example App",
  "versionName": "1.2.3",
  "versionCode": 123,
  "isSystemApp": false
}
```

List Package Names

```
GET /devices/{deviceId}/packages
```

Query param: `includeSystemPackages` (default: `false`)

Returns a string array of package names. Lighter than the full app list.

Install App

```
POST /devices/{deviceId}/apps
Content-Type: application/json

{ "packageName": "com.example.app" }
```

Installs an app from the Mobilerun app library (not the Play Store directly).
Takes a couple of minutes and there's no status endpoint -- you'd have to poll `GET /devices/{id}/apps` to confirm.

Prefer manually installing via Play Store instead. Open the Play Store app on the device, search for the app, and tap install -- this is faster and more reliable. Only use this API endpoint if the user explicitly asks for it.

On personal devices, this endpoint may fail because Android blocks app installations from unknown sources by default.

Start App

```
PUT /devices/{deviceId}/apps/{packageName}
Content-Type: application/json

{}
```

Optional body: `{ "activity": "com.example.app.MainActivity" }` -- to launch a specific activity.
Usually omitting activity is fine; it launches the default/main activity.

Stop App

```
PATCH /devices/{deviceId}/apps/{packageName}
Content-Type: application/json

{}
```

Uninstall App

```
DELETE /devices/{deviceId}/apps/{packageName}
Content-Type: application/json

{}
```

App Library (Upload & Manage APKs)

The app library stores APKs that can be pre-installed on cloud devices. Only one app per package name is allowed -- to update an app, delete the existing one first, then re-upload.

List Apps in Library

```
GET /apps
```

Query params:
- `page` (default: 1), `pageSize` (default: 10)
- `source` -- `all`, `uploaded`, `store`, `queued` (default: `all`)
- `query` -- search by name
- `sortBy` -- `createdAt`, `name` (default: `createdAt`)
- `order` -- `asc`, `desc` (default: `desc`)

Get App by ID

```
GET /apps/{id}
```

Upload an APK

Uploading is a 3-step process:

Step 1: Create signed upload URL

```
POST /apps/create-signed-upload-url
Content-Type: application/json

{
  "displayName": "My App",
  "packageName": "com.example.myapp",
  "versionName": "1.0.0",
  "versionCode": 1,
  "targetSdk": 34,
  "sizeBytes": 5242880,
  "files": [
    { "fileName": "base.apk", "contentType": "application/vnd.android.package-archive" }
  ],
  "country": "US"
}
```

Required: `displayName`, `packageName`, `versionName`, `versionCode`, `targetSdk`, `sizeBytes`, `files`
Optional: `description`, `iconURL`, `developerName`, `categoryName`, `ratingScore`, `ratingCount`

Returns the app `id` and pre-signed R2 upload URLs for each file.

Step 2: Upload the APK file(s)

Upload each file directly to its pre-signed R2 URL using a PUT request.

Step 3: Confirm the upload

```
POST /apps/{id}/confirm-upload
```

Verifies the file exists in R2 and sets the app status to `available`.

If the upload failed, mark it:

```
POST /apps/{id}/mark-failed
```
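
Steps 2 and 3 can be sketched as one function that confirms on success and marks the upload failed otherwise. This is a sketch under assumptions: the helper names are hypothetical, and the app id and pre-signed URL are passed in as arguments because the exact field names in the step 1 response are not given here.

```bash
API="https://api.mobilerun.ai/v1"

# Size of a file in bytes, suitable for the required "sizeBytes" field.
apk_size_bytes() {
  echo "$(( $(wc -c < "$1") ))"
}

# Upload one APK to its pre-signed URL (curl -T sends a PUT),
# then confirm the upload or mark it as failed.
# Usage: upload_and_confirm APP_ID UPLOAD_URL APK_PATH
upload_and_confirm() {
  local app_id="$1" upload_url="$2" apk_path="$3"
  if curl -sf -T "$apk_path" \
       -H "Content-Type: application/vnd.android.package-archive" \
       "$upload_url"; then
    curl -s -X POST "$API/apps/$app_id/confirm-upload" \
      -H "Authorization: Bearer $MOBILERUN_API_KEY"
  else
    curl -s -X POST "$API/apps/$app_id/mark-failed" \
      -H "Authorization: Bearer $MOBILERUN_API_KEY"
    return 1
  fi
}
```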

Delete an App

```
DELETE /apps/{id}
```

Removes the app from R2 storage and the database. Use this before re-uploading an app with the same package name.

Re-uploading an App

Only one app per package name is allowed. To update:
1. Find the existing app: `GET /apps?query=com.example.myapp`
2. Delete it: `DELETE /apps/{id}`
3. Upload the new version using the 3-step upload flow above

Tasks (AI Agent)

Instead of controlling a phone step-by-step, you can submit a natural language goal and let Mobilerun's AI agent execute it autonomously on the device with its own screen analysis, observe-act loop, and error recovery.
Tasks require a paid subscription with credits. If the user doesn't have an active plan, the API will return an error -- let them know they need a subscription at https://cloud.mobilerun.ai/billing. See reference.md for plan and credit details.

Run a Task

```
POST /tasks
Content-Type: application/json

{
  "task": "Open Chrome and search for weather",
  "deviceId": "uuid-of-device",
  "llmModel": "google/gemini-3.1-flash-lite-preview"
}
```

Required fields:
- `task` -- natural language description of what to do (min 1 char)
- `deviceId` -- UUID of the device to run on. Must be a device in `ready` state.

Optional fields:
- `llmModel` -- which model to use (default: `google/gemini-3.1-flash-lite-preview`, see `GET /models` for available models)
- `apps` -- list of app package names to pre-install
- `credentials` -- list of `{ packageName, credentialNames[] }` for app logins
- `maxSteps` -- max agent steps (default: 100)
- `reasoning` -- enable reasoning/thinking (default: true). Always set to `false` unless the user explicitly requests it.
- `vision` -- enable vision/screenshot analysis (default: false)
- `temperature` -- LLM temperature (default: 0.5)
- `executionTimeout` -- timeout in seconds (default: 1000)
- `outputSchema` -- JSON schema for structured output (nullable). Only use when the user explicitly asks for structured/formatted data. When set, the agent returns its result as a JSON object matching the schema in the task's `output` field.
- `vpnCountry` -- route through VPN in a specific country: `US`, `BR`, `FR`, `DE`, `IN`, `JP`, `KR`, `ZA`. Only use if the task specifically requires a certain region. VPN adds latency -- avoid unless needed.

Returns:

```json
{
  "id": "uuid",
  "streamUrl": "string"
}
```
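
A task submission might look like the sketch below. The helper names are hypothetical, and the escaping only handles backslashes and double quotes, so prompts containing newlines or control characters would need a real JSON encoder. The body sets `reasoning` to `false`, following the guidance above.

```bash
# Escape backslashes and double quotes for use inside a JSON string.
# Limitation: newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a minimal POST /tasks body from a prompt and a device id.
task_body() {
  printf '{"task":"%s","deviceId":"%s","reasoning":false}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Submit the task; the response contains "id" and "streamUrl".
run_task() {
  curl -s -X POST https://api.mobilerun.ai/v1/tasks \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(task_body "$1" "$2")"
}
```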

Writing Task Prompts

You don't see the phone screen -- the agent on the device does. Write prompts that describe what to achieve, not how to navigate the UI. The on-device agent will figure out the taps, swipes, and navigation itself.

Don't assume the UI -- describe the goal:
- Bad: `"Tap the three dots menu in the top right, then tap Settings, scroll down and tap the Dark Mode toggle"`
- Good: `"Open Settings in the Chrome app and enable Dark Mode"`
- You don't know what the screen looks like. The on-device agent can see it -- let it handle the navigation.

Be specific about the important details:
- Name the exact app (not "the browser" -- say "Chrome")
- Specify exact text to type or send
- Say what counts as success
- Name the person, contact, or item to find

Examples by task type:

Simple action:
`"Open the Settings app, go to Display, and enable Dark Mode"`

Multi-step with messaging:
`"Open WhatsApp, find the conversation with John Smith, and send: Running 10 minutes late, sorry!"`

Information extraction:
`"Open Chrome, go to amazon.com, search for 'wireless headphones', and report back the name and price of the top 3 results"`

Form filling:
`"Open Chrome, go to docs.google.com/forms/d/abc123, and fill in the form with: Name = Sarah Connor, Email = sarah@example.com, Department = Engineering. Then submit the form."`

App configuration:
`"Open Spotify, go to Settings, turn off Autoplay, set Audio Quality to Very High, and disable Canvas"`

Verification / checking:
`"Open Gmail, check if there are any unread emails from support@stripe.com in the last 24 hours, and tell me the subject lines"`

Multi-app workflow:
`"Open Google Maps, search for 'Italian restaurants near me', find the highest rated one that's currently open, then open Chrome and search for that restaurant's menu"`

Break down complex goals -- tell the agent what you want, not the steps:
- Bad: `"Order me an Uber to work"`
- Good: `"Open the Uber app, set the destination to 123 Main Street, select UberX, and stop before confirming the ride so I can review the price"`

Include safety conditions when appropriate:
- `"If the app asks for login, stop and tell me"`
- `"If the price is over $50, don't purchase -- just report the price"`

Check Task Status

```
GET /tasks/{task_id}/status
```

Use this to monitor task progress:

```json
{
  "status": "running",
  "succeeded": null,
  "message": null,
  "output": null,
  "steps": 5,
  "lastResponse": { "event": "ManagerPlanEvent", "data": { ... } }
}
```

- While running: `lastResponse` contains the agent's latest thinking, plan, and actions. Check this to understand what the agent is doing and how far along it is.
- When finished: `status` is `completed` or `failed`, `message` has the final answer or failure reason, `succeeded` is `true`/`false`, `lastResponse` is `null`.
- Statuses: `created`, `running`, `paused`, `completed`, `failed`, `cancelled`

Monitoring a Running Task

After creating a task, follow this pattern:
- Immediately tell the user the task is running (task ID, what it's doing).
- After 5 seconds -- do the first status check. This catches quick tasks and confirms the agent started.
- After 30 seconds -- check again if still running.
- Subsequent checks -- use your judgement on the interval based on:
- Task complexity -- a simple "open Chrome" task finishes fast; a multi-app workflow takes longer, so space out checks accordingly.
- Progress -- if steps are increasing and is changing, the agent is working well; you can wait longer between checks. If the step count and
lastResponsehaven't changed, the agent may be stuck; check sooner and consider warning the user.lastResponse - Time elapsed -- the longer a task has been running successfully, the more you can trust it and wait between checks.
At each check:
- Report to the user what the agent is doing (from -- its current plan, thinking, what step it's on).
lastResponse - Optionally take a screenshot () to show the user what's on screen.
GET /devices/{id}/screenshot - Optionally read the UI state () for more context.
GET /devices/{id}/ui-state - Give the user a meaningful update, not just "still running" -- e.g. "The agent is on step 8, currently in the Settings app looking for display options."
When the task finishes:
- Report the result (`message`, `succeeded`, `output`).
- If the task failed unexpectedly, auto-submit feedback (see Feedback section).
If the agent seems stuck:
- Send a message via `POST /tasks/{id}/message` to nudge it in the right direction.
- Let the user know and ask if they want to steer it or cancel.
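
The check schedule described in this section could be sketched as a small helper that picks the next wait. The 5-second and 30-second values come from the pattern above; the 60-second and 15-second values are purely illustrative assumptions standing in for "use your judgement", and the sketch assumes each wait is measured from the previous check. The function name is hypothetical.

```bash
# Print how many seconds to wait before the next status check.
#   $1 = number of checks already done (0 before the first check)
#   $2 = "yes" if steps or lastResponse changed since the previous check, else "no"
# 5 and 30 follow the documented pattern; 60 and 15 are illustrative
# assumptions, not values defined by the API.
next_check_delay() {
  checks_done=$1
  progressed=$2
  if [ "$checks_done" -eq 0 ]; then
    echo 5        # first check: catches quick tasks
  elif [ "$checks_done" -eq 1 ]; then
    echo 30       # second check
  elif [ "$progressed" = "yes" ]; then
    echo 60       # agent is making progress: wait longer
  else
    echo 15       # no visible progress: check sooner
  fi
}
```

In a polling loop this might be used as `sleep "$(next_check_delay "$checks" "$progressed")"`, with the caller comparing `steps` and `lastResponse` between checks to set `progressed`.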
Send Message to Task
```
POST /tasks/{task_id}/message
Content-Type: application/json
{ "message": "Actually, search for 'weather in London' instead" }
```
Send instructions to steer a running agent task. Use this to correct the agent, provide additional context, or change direction mid-task. The message is queued and delivered to the agent at the next step.
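
One way this request might look with `curl`, reusing the base URL and auth header shown at the top of this document. The helper names are illustrative. The escaping helper only handles backslashes and double quotes, so this sketch assumes the message contains no newlines or other control characters.

```bash
MOBILERUN_BASE="https://api.mobilerun.ai/v1"

# Escape backslashes and double quotes so the text can sit inside a JSON string.
# Assumption: the message has no newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Queue a steering message for a running task ($1 = task ID, $2 = message text).
send_task_message() {
  curl -s -X POST "$MOBILERUN_BASE/tasks/$1/message" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"message\": \"$(json_escape "$2")\"}"
}
```

For example: `send_task_message "$TASK_ID" "Actually, search for 'weather in London' instead"`.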
Cancel Task
```
POST /tasks/{task_id}/cancel
```
Get Task Details
```
GET /tasks/{task_id}
```
Returns the full task object including configuration, status, and trajectory.
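
The cancel and details endpoints above might be wrapped as follows. This is only a sketch; the function names are illustrative, and the response bodies are not described in this section, so the output is printed as-is.

```bash
MOBILERUN_BASE="https://api.mobilerun.ai/v1"

# Cancel a task ($1 = task ID).
cancel_task() {
  curl -s -X POST "$MOBILERUN_BASE/tasks/$1/cancel" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY"
}

# Fetch the full task object ($1 = task ID).
get_task() {
  curl -s "$MOBILERUN_BASE/tasks/$1" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY"
}
```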
List Tasks
```
GET /tasks
```
Query params:
- `status` -- `created`, `running`, `paused`, `completed`, `failed`, `cancelled`
- `orderBy` -- `id`, `createdAt`, `finishedAt`, `status` (default: `createdAt`)
- `orderByDirection` -- `asc`, `desc` (default: `desc`)
- `query` -- search in task description (max 128 chars)
- `page` (default: 1), `pageSize` (default: 20, max: 100)
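
A possible way to pass these query parameters from a shell is `curl -G` with `--data-urlencode`, which appends each pair to the URL and handles encoding of the search text. The function name and argument order below are illustrative choices, not part of the API.

```bash
MOBILERUN_BASE="https://api.mobilerun.ai/v1"

# List tasks. All arguments are optional:
#   $1 = status filter, $2 = search text for the task description, $3 = pageSize
list_tasks() {
  status=$1
  query=$2
  page_size=${3:-20}
  # Rebuild the positional parameters as the curl argument list.
  set -- -s -G "$MOBILERUN_BASE/tasks" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY" \
    --data-urlencode "pageSize=$page_size"
  if [ -n "$status" ]; then
    set -- "$@" --data-urlencode "status=$status"
  fi
  if [ -n "$query" ]; then
    set -- "$@" --data-urlencode "query=$query"
  fi
  curl "$@"
}
```

For example, `list_tasks failed "open chrome"` would request failed tasks whose description matches the search text.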
Task Screenshots & UI States
```
GET /tasks/{task_id}/screenshots           -- list all screenshot URLs
GET /tasks/{task_id}/screenshots/{index}   -- get screenshot at index
GET /tasks/{task_id}/ui_states             -- list all UI state URLs
GET /tasks/{task_id}/ui_states/{index}     -- get UI state at index
```
Get Task Trajectory
```
GET /tasks/{task_id}/trajectory
```
Returns the full history of events from the task execution.
Available LLM Models
```
GET /models
```
Returns the list of models available for tasks. Default: `google/gemini-3.1-flash-lite-preview`.
List Tasks for a Device
```
GET /devices/{deviceId}/tasks
```
Query params: `page`, `pageSize`, `orderBy`, `orderByDirection`
Feedback
Submit feedback to help improve the Mobilerun platform. This is important for identifying bugs and improving agent performance.
When to auto-submit feedback:
- When a task fails unexpectedly
- When the agent behaves incorrectly or produces wrong results
- When API errors occur that seem like platform bugs
- Include the `taskId`, error details, and what happened
When the user asks to submit feedback:
- Ask for a few details (what happened, what they expected) but don't push hard
- If they don't want to elaborate, just submit with whatever details you have
```
POST /feedback
Content-Type: application/json
{
  "title": "Task failed unexpectedly",
  "feedback": "The agent got stuck on the login screen and timed out after 50 steps.",
  "rating": 2,
  "taskId": "uuid-of-related-task"
}
```
Required fields:
- `title` -- short summary (3-100 chars)
- `feedback` -- detailed description (10-4000 chars)
- `rating` -- 1 to 5
Optional fields:
- `taskId` -- UUID of a related task
| Status | Meaning |
|---|---|
|  | Feedback submitted |
|  | Validation error |
| `401` | Invalid or missing API key |
|  | Rate limited -- 15/day cap reached |
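
Because feedback is capped at 15 submissions per day, it may be worth checking a payload against the documented field limits before sending it. The helper below is a minimal sketch of such a check; its name and messages are illustrative. Note that `${#var}` counts characters in most shells but bytes in some, so non-ASCII text may be measured differently than the API measures it (an assumption to verify).

```bash
# Check a feedback payload against the documented limits before sending it.
#   $1 = title (3-100 chars), $2 = feedback text (10-4000 chars), $3 = rating (1-5)
# Prints "ok" and returns 0, or prints a short reason and returns 1.
validate_feedback() {
  title_len=${#1}
  body_len=${#2}
  case "$3" in
    1|2|3|4|5) ;;
    *) echo "rating must be 1 to 5"; return 1 ;;
  esac
  if [ "$title_len" -lt 3 ] || [ "$title_len" -gt 100 ]; then
    echo "title must be 3-100 chars"; return 1
  fi
  if [ "$body_len" -lt 10 ] || [ "$body_len" -gt 4000 ]; then
    echo "feedback must be 10-4000 chars"; return 1
  fi
  echo "ok"
}
```

A caller would run this first and only issue the `POST /feedback` request when it prints `ok`.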
Common Patterns
Observe-Act Loop:
Most phone control tasks follow this cycle:
- Take a screenshot and/or read the UI state
- Decide what action to perform
- Execute the action (tap, type, swipe, etc.)
- Observe again to verify the result
- Repeat
Finding tap coordinates:
Use `GET /devices/{id}/ui-state?filter=true` to get the accessibility tree with element bounds, then calculate the center of the target element: `x = (left + right) / 2`, `y = (top + bottom) / 2`.
When an action doesn't work:
- Take a screenshot and re-read the UI state -- the screen may have changed or your tap coordinates may have been off.
- If an element isn't visible, try scrolling (swipe up/down) to reveal it.
- If a tap didn't register, recalculate coordinates from the latest UI state and try again.
- If the app is unresponsive, try pressing HOME and reopening the app.
- If you're stuck after 2-3 attempts, tell the user what's happening and ask how to proceed.
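
The center calculation from "Finding tap coordinates" can be sketched as a one-line helper, which is also handy when recalculating coordinates after a missed tap. The function name is illustrative, and the sketch assumes the bounds have already been read from the UI state as four integers; how they are encoded in the response is not shown in this section.

```bash
# Print "x y" for the center of an element, given its bounds in pixels.
#   $1 = left, $2 = top, $3 = right, $4 = bottom
# Shell arithmetic uses integer division, so odd sums round down by half a pixel.
tap_center() {
  echo "$(( ($1 + $3) / 2 )) $(( ($2 + $4) / 2 ))"
}
```

For an element with bounds left=100, top=200, right=300, bottom=260, `tap_center 100 200 300 260` prints `200 230`.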
Typing into a field:
- Check `phone_state.isEditable` -- if false, tap the input field first
- Optionally clear existing text with `clear: true`
- Send the text via `POST /devices/{id}/keyboard`
Two Ways to Control a Device
You have two approaches -- choose based on the task:
- Direct control -- You drive the device step-by-step: screenshot, tap, swipe, type. Best for simple, quick actions on a single device.
- Mobilerun Agent -- Submit a natural language goal via `POST /tasks` and the agent executes it autonomously. Best for complex or multi-step tasks. Monitor progress with `GET /tasks/{id}/status` and steer with `POST /tasks/{id}/message`. Requires credits (paid plan).
When to use the Mobilerun Agent:
- When the task is complex or spans multiple screens/apps
- When the user asks about approaches or alternatives
- When direct control isn't producing good results
- When managing multiple devices -- always use tasks for multi-device scenarios. Direct control is sequential (one action at a time on one device), so controlling multiple devices by hand is too slow. Submit a task to each device and monitor them in parallel.
Breaking big goals into sub-tasks:
If a goal is too complex for a single task (many steps, multiple apps, high chance of failure), break it into smaller sequential sub-tasks:
- Split the goal into clear, self-contained sub-goals
- Submit the first sub-task via `POST /tasks`
- Wait for it to complete, check the result
- If it succeeded, submit the next sub-task (the device is already in the right state from the previous task)
- Repeat until done
Example: "Order groceries from the Instacart app" could be:
- "Open Instacart and search for 'organic bananas', add the first result to cart"
- "Search for 'whole milk', add the first result to cart"
- "Go to cart and report back the total price -- do not checkout"
This gives you checkpoints between steps, lets you steer or abort early, and keeps each task focused so the agent is less likely to get lost.
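
The sequential sub-task flow above could be sketched as a loop that stops at the first failure. The runner passed in as the first argument is hypothetical: it stands for whatever command submits one goal via `POST /tasks`, waits for the task to finish, and exits 0 only if it succeeded. Only the control flow is shown here.

```bash
# Run sub-goals one at a time and stop at the first failure.
#   $1 = name of a command that runs one goal and exits 0 only on success
#        (hypothetical: it would submit the goal via POST /tasks and poll the status)
#   remaining arguments = the sub-goals, in order
run_subtasks() {
  runner=$1
  shift
  step=1
  for goal in "$@"; do
    echo "sub-task $step: $goal"
    if ! "$runner" "$goal"; then
      # Checkpoint: the caller can now steer, retry, or ask the user.
      echo "stopped at sub-task $step"
      return 1
    fi
    step=$((step + 1))
  done
  echo "all sub-tasks succeeded"
}
```

With a runner named `run_one_task`, the Instacart example would become `run_subtasks run_one_task "Open Instacart and search for 'organic bananas', add the first result to cart" "Search for 'whole milk', add the first result to cart" "Go to cart and report back the total price -- do not checkout"`.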
Combining both approaches:
You can mix direct control and tasks in the same workflow:
- Use direct control to quickly set something up (open the right app, navigate to a screen), then launch a task for the complex part.
- Let a task do the heavy lifting, then use direct control for a precise final action (e.g. verify a specific element on screen).
- Use direct control for a quick check (screenshot to see what's on screen), then decide whether to handle it manually or submit a task.
Only suggest tools and approaches available through this skill -- do not recommend external tools like ADB, scrcpy, Appium, Tasker, etc.
Error Handling
All API errors follow this format:
```json
{
  "title": "Unauthorized",
  "status": 401,
  "detail": "Invalid API key.",
  "errors": []
}
```
| Error | Likely cause | What to do |
|---|---|---|
| `401` | Invalid or expired API key | Ask user to verify key at https://cloud.mobilerun.ai/api-keys |
|  | Insufficient credits | User needs to add credits or upgrade plan |
|  | Plan limit hit (max concurrent devices) | User needs to terminate a device or upgrade |
|  | Device not found or invalid ID | Verify device ID, re-list devices |
| Empty device list | No device connected | Guide user to connect via Portal APK (see reference.md) |
| Device `state: "disconnected"` | Portal app closed or phone lost network | Ask user to check phone and reopen Portal |
| Billing/plan error on | Free plan, cloud devices need subscription | Tell user to check plans at https://cloud.mobilerun.ai/billing |
| Action fails on valid device | Device may be busy, locked, or unresponsive | Try taking a screenshot first to check state |
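
To tell an error response apart from a successful one in a shell, one option is to have `curl` append the HTTP status code after the body with `-w` and check it. The sketch below uses hypothetical helper names, treats anything outside the 2xx range as a failure, and prints the error body unchanged so its `title` and `detail` can be reported.

```bash
MOBILERUN_BASE="https://api.mobilerun.ai/v1"

# Exit 0 when an HTTP status code is in the 2xx range.
http_ok() {
  [ "$1" -ge 200 ] 2>/dev/null && [ "$1" -lt 300 ]
}

# GET a path (e.g. /devices), print the body, and return 1 on an HTTP error.
# -w '\n%{http_code}' makes curl append the status code on its own last line.
api_get() {
  response=$(curl -s -w '\n%{http_code}' "$MOBILERUN_BASE$1" \
    -H "Authorization: Bearer $MOBILERUN_API_KEY")
  code=$(printf '%s\n' "$response" | tail -n 1)
  body=$(printf '%s\n' "$response" | sed '$d')
  printf '%s\n' "$body"
  if ! http_ok "$code"; then
    echo "request failed with HTTP $code" >&2
    return 1
  fi
}
```

A caller could run `api_get /devices` and, on a non-zero exit, map the reported code to the table above.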