argent-device-interact

Unified tool surface

All interaction tools below accept a `udid` parameter and auto-dispatch to iOS or Android based on its shape (UUID → iOS simulator, anything else → Android adb serial). You use the same tool names on both platforms.

For platform-specific caveats (Metro `adb reverse`, locked-screen `describe` errors, etc.), see § 9 Platform-specific notes at the bottom.
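As a rough illustration of the dispatch rule above, the sketch below classifies an id by its shape. The function name `platform_for_udid` and the exact UUID pattern are assumptions made for this example; the real dispatcher may apply a different check.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID, as used by iOS simulator UDIDs.
_UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def platform_for_udid(udid: str) -> str:
    """Hypothetical helper: UUID-shaped ids map to iOS, anything else to Android."""
    return "ios" if _UUID_RE.fullmatch(udid) else "android"
```

With this sketch, an adb serial such as `emulator-5554` falls through to the Android branch because it does not match the UUID shape.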

1. Before You Start

If you delegate simulator tasks to sub-agents, make sure they have MCP permissions.

Use `list-devices` to get a target id. Results are tagged with `platform` (`ios` or `android`); booted/ready devices come first. Pick the first entry that matches the platform you need. If none are ready, call `boot-device` with `udid` (iOS) or `avdName` (Android). See `argent-ios-simulator-setup` / `argent-android-emulator-setup` for the full setup flow.

Load tool schemas before first use. Gesture tools (`gesture-tap`, `gesture-swipe`, `gesture-pinch`, `gesture-rotate`, `gesture-custom`) may be deferred: their parameter schemas are not loaded until fetched. Always use ToolSearch to load the schemas of all gesture tools you plan to use before calling any of them. If you skip this step, parameters may be coerced to strings instead of numbers, causing validation errors.
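The device-selection step can be sketched as follows. The `platform` tag comes from the text above; the `ready` flag and the overall entry shape are assumptions for illustration, since the actual `list-devices` output format is not shown here.

```python
from typing import Optional


def pick_device(devices: list, platform: str) -> Optional[dict]:
    """Return the first ready entry for the platform, or None if a boot is needed.

    Assumes each entry is a dict with a 'platform' tag and a hypothetical
    boolean 'ready' flag, listed in the order list-devices returned them.
    """
    for device in devices:
        if device.get("platform") == platform and device.get("ready"):
            return device
    return None
```

A `None` result corresponds to the "none are ready" case, where the next call would be `boot-device`.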

2. Best Practices

  1. Always refer to tapping_rule from your argent.md rule before tapping.
  2. Before performing interactions, consider whether they can be dispatched sequentially; more on that under `run-sequence`.
  3. Use `gesture-swipe` for lists/scrolling, not `gesture-custom`, unless you need non-linear movement. If you need multiple swipes, use `run-sequence`.
  4. Tap a text field before typing. On iOS try `paste` first, then fall back to `keyboard`; on Android use `keyboard` directly (`paste` is iOS-only).
  5. Coordinates are normalized: always 0.0–1.0, not pixels.
  6. For app navigation, prefer `describe` first. It works on any screen without app restart. Do not navigate from screenshots on regular in-app screens unless `describe` failed to expose a reliable target. Use `native-describe-screen` only when you need app-scoped UIKit properties.
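The text-entry rule in item 4 can be written down as a small lookup. This is only a sketch: `text_entry_tools` is a hypothetical helper name, and the returned list is simply the order in which to try the tools after tapping the field.

```python
def text_entry_tools(platform: str) -> list:
    """Tools to try, in order, for typing into an already-tapped field."""
    if platform == "ios":
        # paste is faster; keyboard is the fallback when paste fails.
        return ["paste", "keyboard"]
    if platform == "android":
        # paste is iOS-only, so go straight to keyboard.
        return ["keyboard"]
    raise ValueError(f"unknown platform: {platform!r}")
```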

3. Opening Apps

Never navigate to an app by tapping home-screen icons. Use `launch-app` or `open-url`; they are instant and reliable.

launch-app — by bundle ID

```json
{ "udid": "<UDID>", "bundleId": "com.apple.MobileSMS" }
```

Common IDs: `com.apple.MobileSMS` (Messages), `com.apple.mobilesafari` (Safari), `com.apple.Preferences` (Settings), `com.apple.Maps`, `com.apple.Photos`, `com.apple.mobilemail`, `com.apple.mobilenotes`, `com.apple.MobileAddressBook` (Contacts)

open-url — by URL scheme

```json
{ "udid": "<UDID>", "url": "messages://" }
```

Common schemes: `messages://`, `settings://`, `maps://?q=<query>`, `tel://<number>`, `mailto:<address>`, `https://...` (Safari)

4. Choosing the Right Tool

| Action | Tool | Notes |
| --- | --- | --- |
| Multiple actions | `run-sequence` | Batch steps in one call (no intermediate screenshots) |
| Open an app | `launch-app` | Always; never tap home-screen icons |
| Restart an app | `restart-app` | Terminate and relaunch by bundle ID |
| Open URL/scheme | `open-url` | Web pages, deep links, URL schemes |
| Single tap | `gesture-tap` | Buttons, links, checkboxes |
| Scroll/swipe | `gesture-swipe` | Straight-line scroll or swipe |
| Long press | `gesture-custom` | Context menus, drag start |
| Drag & drop | `gesture-custom` | Complex drag interactions |
| Pinch/zoom | `gesture-pinch` | Two-finger pinch with auto-interpolation |
| Rotation | `gesture-rotate` | Two-finger rotation with auto-interpolation |
| Custom gesture | `gesture-custom` | Arbitrary touch sequences, optional interpolation |
| Hardware key | `button` | Home, back, power, volume, appSwitch, actionButton |
| Type text (fast) | `paste` | iOS only. Form fields; uses clipboard |
| Type text | `keyboard` | iOS+Android. Fallback when `paste` fails; supports Enter, Escape, arrows |
| Rotate device | `rotate` | Orientation changes |

5. Finding Tap Targets

IMPORTANT. When an action moves you to a different screen, or you do not know the coordinates of a component, always perform proper discovery first.

| App type | Discovery tool | What it returns |
| --- | --- | --- |
| Target app discovery | `describe` | Accessibility element tree for the current device screen (iOS AX-service or Android uiautomator) with normalized frame coordinates. Works on any app, system dialogs, and Home screen; no app restart or `bundleId` required |
| React Native | `debugger-component-tree` | React component tree with names, text, testID, and (tap: x,y) |
| App-scoped native | `native-describe-screen` | Low-level app-scoped accessibility elements with normalized and raw coordinates; requires `bundleId` |
| Permission / system modal overlay | `describe` | `describe` detects system dialogs automatically and returns dialog buttons with tap coordinates. Fall back to `screenshot` only if `describe` does not expose the controls |
| Final visual fallback | `screenshot` | Use only when discovery tools cannot inspect the current UI reliably. Do not derive routine in-app navigation targets from screenshots |

Follow-up native diagnostics, for when you already have a candidate point:
  • `native-user-interactable-view-at-point`: deepest native view that would receive touch at a known raw iOS point; requires `bundleId`
  • `native-view-at-point`: deepest visible native view at a known raw iOS point; requires `bundleId`
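Since `describe` reports normalized frames, a tap target is typically the center of the element's frame. The sketch below shows that arithmetic; the frame keys `x`, `y`, `width`, and `height` are assumed names for illustration and may not match the actual `describe` output.

```python
def tap_point(frame: dict) -> dict:
    """Center of a normalized frame, shaped as gesture-tap x/y arguments.

    Assumes hypothetical keys x, y (top-left corner) and width, height,
    all expressed as 0.0-1.0 fractions of the screen.
    """
    x = frame["x"] + frame["width"] / 2
    y = frame["y"] + frame["height"] / 2
    return {"x": round(x, 4), "y": round(y, 4)}
```

Because the input is already normalized, the result can be passed to `gesture-tap` without any pixel conversion.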

If describe Fails

Read the exact error and choose the action that matches it:
  • Error mentions `ax-service` not available or daemon startup failure: the ax-service daemon could not start. Check that the simulator is booted. Use `screenshot` as a temporary fallback, or use `native-describe-screen` with an explicit `bundleId` if the app has native devtools injected.
  • `describe` returns an empty element list: the screen may be blank, loading, or showing content without accessibility labels. Use `screenshot` to see what is visible, then retry after the content has loaded.
  • `describe` succeeds but is not detailed enough for a React Native app: use `debugger-component-tree` next.
  • You need app-scoped inspection with full UIKit properties (`accessibilityIdentifier`, `viewClassName`): use `native-describe-screen` with an explicit `bundleId`. This requires native devtools (dylib) injection; call `restart-app` first if needed.
  • You already have a candidate point and want to confirm what would actually receive touch: use `native-user-interactable-view-at-point`. Use `native-view-at-point` when you want the visually deepest view instead of the hit-test target.
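The first three cases above amount to a small decision procedure, sketched below. The substring checks on the error text and the returned phrases are assumptions for illustration; real error messages may be worded differently, so read them rather than relying on matching.

```python
from typing import Optional


def next_step_after_describe(
    error: Optional[str],
    element_count: int,
    needs_more_detail: bool = False,
) -> str:
    """Suggest a follow-up after a describe call (hypothetical helper)."""
    if error is not None:
        lowered = error.lower()
        if "ax-service" in lowered or "daemon" in lowered:
            return "check simulator is booted; screenshot as temporary fallback"
        return "read the error text and handle it case by case"
    if element_count == 0:
        # Blank, loading, or unlabeled content.
        return "screenshot, then retry describe once content has loaded"
    if needs_more_detail:
        # React Native app where the accessibility tree is too coarse.
        return "debugger-component-tree"
    return "use describe output"
```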

6. Tool Usage

gesture-tap — Single tap at a point

```json
{ "udid": "<UDID>", "x": 0.5, "y": 0.5 }
```

Coordinates: `0.0` = left/top, `1.0` = right/bottom.

Before tapping near the bottom of the screen in React Native apps, check that "Open Debugger to View Warnings" banners are not visible: tapping them breaks the debugger connection. Close them with the X icon if present.
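If a coordinate is only known in pixels (for example, measured on a full-resolution screenshot), it has to be converted to the normalized range before tapping. The following is a minimal sketch of that conversion; `to_normalized` is a hypothetical helper, and clamping to 0.0–1.0 is a choice made here, not documented tool behavior.

```python
def to_normalized(px: float, py: float, width: float, height: float) -> tuple:
    """Convert a pixel position to normalized (x, y) fractions of the screen."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Clamp so slightly out-of-bounds measurements still yield valid input.
    x = min(max(px / width, 0.0), 1.0)
    y = min(max(py / height, 0.0), 1.0)
    return (x, y)
```

For example, the pixel point (195, 422) on a 390 by 844 screen maps to the center, `(0.5, 0.5)`.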

gesture-swipe — Straight-line gesture

```json
{ "udid": "<UDID>", "fromX": 0.5, "fromY": 0.7, "toX": 0.5, "toY": 0.3 }
```

Swipe up (`fromY > toY`) = scroll content down. Default duration: 300ms. Optional: `"durationMs": 500` for a slower swipe.
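Because the finger direction is the opposite of the scroll direction, it is easy to get the sign wrong. The sketch below maps a scroll intent to `gesture-swipe` arguments; the helper name and the 0.3/0.7 endpoints (taken from the example above) are illustrative choices.

```python
HIGH_ON_SCREEN = 0.3  # nearer the top edge
LOW_ON_SCREEN = 0.7   # nearer the bottom edge


def scroll_swipe_args(direction: str) -> dict:
    """gesture-swipe arguments that scroll the content 'down' or 'up'."""
    if direction == "down":
        # Finger moves upward (fromY > toY), so content scrolls down.
        from_y, to_y = LOW_ON_SCREEN, HIGH_ON_SCREEN
    elif direction == "up":
        # Finger moves downward, so content scrolls up.
        from_y, to_y = HIGH_ON_SCREEN, LOW_ON_SCREEN
    else:
        raise ValueError(f"direction must be 'down' or 'up', got {direction!r}")
    return {"fromX": 0.5, "fromY": from_y, "toX": 0.5, "toY": to_y}
```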

gesture-pinch — Two-finger pinch

```json
{ "udid": "<UDID>", "centerX": 0.5, "centerY": 0.5, "startDistance": 0.2, "endDistance": 0.6 }
```

All values are normalized 0.0–1.0 (fractions of screen, not pixels), the same as all other gesture tools. `startDistance: 0.2` means fingers start 20% of the screen apart; `endDistance: 0.6` means they end 60% apart. `startDistance < endDistance` = pinch out (zoom in). `startDistance > endDistance` = pinch in (zoom out). Defaults: `angle: 0` (horizontal), `durationMs: 300`. Optional: `"angle": 90` for vertical axis, `"durationMs": 500` for a slower pinch.

gesture-rotate — Two-finger rotation

```json
{
  "udid": "<UDID>",
  "centerX": 0.5,
  "centerY": 0.5,
  "radius": 0.15,
  "startAngle": 0,
  "endAngle": 90
}
```

All positions and radius are normalized 0.0–1.0 (fractions of screen, not pixels). `radius: 0.15` means each finger is 15% of the screen away from center. `endAngle > startAngle` = clockwise. Default duration: 300ms. Optional: `"durationMs": 500` for a slower rotation.

gesture-custom — Custom touch sequence

For long-press, drag-and-drop, and other complex sequences, see `references/gesture-examples.md`. Set `"interpolate": 10` to auto-generate smooth intermediate Move events between keyframes.

button — Hardware button press

```json
{ "udid": "<UDID>", "button": "home" }
```

Values: `home`, `back`, `power`, `volumeUp`, `volumeDown`, `appSwitch`, `actionButton`

paste — Type text into focused field (iOS only)

```json
{ "udid": "<UDID>", "text": "Hello, world!" }
```

Tap the field first, then paste. Fall back to `keyboard` if it doesn't work. On Android the call is rejected by the capability gate ("Tool 'paste' is not supported on android"), so use `keyboard` directly.

keyboard — Type text or press special keys

```json
{ "udid": "<UDID>", "text": "search query", "key": "enter" }
```

Special keys: `enter`, `escape`, `backspace`, `tab`, `space`, `arrow-up`, `arrow-down`, `arrow-left`, `arrow-right`, `f1`–`f12`. Optional: `"delayMs": 100` between keystrokes (default 50ms).

rotate — Change orientation

```json
{ "udid": "<UDID>", "orientation": "LandscapeLeft" }
```

Values: `Portrait`, `LandscapeLeft`, `LandscapeRight`, `PortraitUpsideDown`

7. Screenshots

Use the explicit `screenshot` tool only when:
  • You need the initial screen state before any action.
  • The auto-attached screenshot shows a transitional or loading frame.
  • You require extra context.
  • You want to check state after a delay (e.g. waiting for a network response).
  • A permission dialog, system alert, or native modal overlay is visible and `describe` did not expose reliable targets.

When using `screenshot` for permission or native modal navigation:
  • Do not switch to screenshot-driven navigation just because a modal is visible. On regular app screens and in-app modals, keep using `describe`.
  • Prefer obvious, centered alert buttons such as `Allow`, `OK`, `Don't Allow`, `Not Now`, or `Continue`.
  • Tap one control at a time and inspect the returned auto-screenshot before doing anything else.
  • After the modal is dismissed, return to normal discovery with `describe`, `native-describe-screen`, or `debugger-component-tree`.

Optional rotation parameter: `{ "udid": "<UDID>", "rotation": "LandscapeLeft" }` rotates the capture without changing simulator orientation.

Screenshots are downscaled by default (30% of original resolution) to reduce context size. `scale` accepts values from 0.01 to 1.0. If UI elements are hard to read or you need to inspect fine detail, pass `scale: 1.0` to get full resolution: `{ "udid": "<UDID>", "scale": 1.0 }`.
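To get a feel for what the default downscale means in practice, the sketch below estimates the returned image size for a given `scale`. The helper name and the rounding rule are assumptions; the tool's actual resampling may round differently.

```python
def scaled_size(width: int, height: int, scale: float = 0.3) -> tuple:
    """Approximate screenshot dimensions after downscaling by `scale`."""
    if not 0.01 <= scale <= 1.0:
        raise ValueError("scale must be between 0.01 and 1.0")
    # Rounding to the nearest pixel is an assumption of this sketch.
    return (round(width * scale), round(height * scale))
```

At the default scale a 1000 by 2000 capture comes back at roughly 300 by 600, which explains why small text can become hard to read.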

Troubleshooting

| Problem | Solution |
| --- | --- |
| Screenshot times out | Restart the simulator-server via the `stop-simulator-server` tool |
| No booted iOS simulator | Call `boot-device` with the iOS `udid` |
| No ready Android device | Call `boot-device` with `avdName` |

8. Action Sequencing with run-sequence

Use `run-sequence` to batch multiple interaction steps into a single tool call. Only one screenshot is returned, after all steps complete. Use cases: scrolling multiple times, typing and submitting automatically, known sequence of multiple taps, rotating device back and forth.

Do not use `run-sequence` when any step depends on observing the result of a previous step.

Use cases

Use sequencing when:
  • You know an action needs multiple steps and you do not need a screenshot between them
  • "scroll to bottom", "scroll to top", "scroll to do X" → sequence 3–5 scrolls
  • Form interactions such as "clear and retype field" → you may triple-tap to select all, then type the new value
  • "submit form" → fill all fields in sequence, tap submit
  • "go back to X" → defined tap sequence for the navigation

Allowed tools inside run-sequence

`gesture-tap`, `gesture-swipe`, `gesture-custom`, `gesture-pinch`, `gesture-rotate`, `button`, `keyboard`, `rotate`

The `udid` is shared; do not include it in each step's `args`. Optional `delayMs` per step (default 100ms).

Examples

Scroll down three times:

```json
{
  "udid": "<UDID>",
  "steps": [
    { "tool": "gesture-swipe", "args": { "fromX": 0.5, "fromY": 0.7, "toX": 0.5, "toY": 0.3 } },
    { "tool": "gesture-swipe", "args": { "fromX": 0.5, "fromY": 0.7, "toX": 0.5, "toY": 0.3 } },
    { "tool": "gesture-swipe", "args": { "fromX": 0.5, "fromY": 0.7, "toX": 0.5, "toY": 0.3 } }
  ]
}
```

Type into a focused field and submit:

```json
{
  "udid": "<UDID>",
  "steps": [
    { "tool": "keyboard", "args": { "text": "hello world" } },
    { "tool": "keyboard", "args": { "key": "enter" } }
  ]
}
```

Tap a known button, then scroll down:

```json
{
  "udid": "<UDID>",
  "steps": [
    { "tool": "gesture-tap", "args": { "x": 0.5, "y": 0.15 } },
    {
      "tool": "gesture-swipe",
      "args": { "fromX": 0.5, "fromY": 0.7, "toX": 0.5, "toY": 0.3 },
      "delayMs": 300
    }
  ]
}
```

Stops on the first error and returns partial results.
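When payloads like these are generated programmatically, it helps to enforce the two rules stated earlier: only the allowed tools may appear, and `udid` belongs at the top level rather than in each step's `args`. The sketch below does that; `step` and `build_sequence` are hypothetical helper names, and the validation is done client-side purely for illustration.

```python
from typing import Optional

ALLOWED_TOOLS = {
    "gesture-tap", "gesture-swipe", "gesture-custom", "gesture-pinch",
    "gesture-rotate", "button", "keyboard", "rotate",
}


def step(tool: str, args: dict, delay_ms: Optional[int] = None) -> dict:
    """One run-sequence step; delayMs is only included when given."""
    entry = {"tool": tool, "args": args}
    if delay_ms is not None:
        entry["delayMs"] = delay_ms
    return entry


def build_sequence(udid: str, steps: list) -> dict:
    """Assemble a run-sequence payload, rejecting steps that break the rules."""
    for entry in steps:
        if entry["tool"] not in ALLOWED_TOOLS:
            raise ValueError(f"tool not allowed in run-sequence: {entry['tool']}")
        if "udid" in entry["args"]:
            raise ValueError("udid is shared; do not repeat it in step args")
    return {"udid": udid, "steps": steps}
```

For instance, `build_sequence("<UDID>", [step("keyboard", {"text": "hello world"}), step("keyboard", {"key": "enter"})])` reproduces the type-and-submit example.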

9. Platform-specific notes

Android

  • Metro reachability: run `adb reverse tcp:8081 tcp:8081` against the device before the RN app starts, or Metro won't be reachable from the device. See `argent-metro-debugger` for the full workflow. Re-run if the device restarts.
  • First-launch permission prompts: `reinstall-app` on Android always installs with `-g`, so runtime permissions are pre-granted on first launch; there is no flag to pass.
  • Locked screen / secure surfaces: `describe` throws a clear error if it can't capture (keyguard, DRM, Play Integrity). Unlock the device or fall back to `screenshot`.
  • APK vs .app in `reinstall-app`: pass an absolute `.apk` path on Android; a `.app` directory on iOS.
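When more than one Android device is attached, the Metro port forward has to target a specific serial. The sketch below builds that command with adb's standard `-s <serial>` selector; the helper name is hypothetical, and the code only constructs the argument list so it can be inspected before running.

```python
def adb_reverse_command(serial: str, port: int = 8081) -> list:
    """Argument list for forwarding the device's Metro port back to the host."""
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    return ["adb", "-s", serial, "reverse", f"tcp:{port}", f"tcp:{port}"]


# Usage sketch (requires adb on PATH and an attached device):
#   import subprocess
#   subprocess.run(adb_reverse_command("emulator-5554"), check=True)
```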

iOS

(no iOS-only gotchas collected here yet — add them as they come up)