alibabacloud-emas-apm-query

1. Scenario Description & Architecture

After a mobile app integrates Alibaba Cloud EMAS APM, the crash / anr / lag / custom / memory_leak / memory_alloc events it produces every day are aggregated and reported by the SDK to the backend. A typical troubleshooting workflow is:
  1. Figure out which Issues are most worth fixing: sort by error rate / error count → pick Top 3~5
  2. Inspect what a specific Issue looks like: fetch its aggregated metrics and affected versions
  3. Find several representative samples: across different devices / versions / networks
  4. Read the stack + business log in a sample: find actionable clues
  5. Compare against the app source code and propose a fix
This skill stitches the 5 steps above into a single CLI pipeline. The entire process calls only the 4 read-only APIs of `aliyun emas-appmonitor` and depends on no database or log service:
GetIssues → GetIssue → GetErrors → GetError
(optional) stack ↔ user APP source → precise file:line + fix diff
Supported BizModules: `crash` / `anr` / `lag` / `custom` / `memory_leak` / `memory_alloc`
Supported OS: `android` / `iphoneos` / `harmony` (`harmony` does not have `anr` / `memory_*`)
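The support matrix above can be encoded as a small lookup, so a scan never requests modules that `harmony` does not have. This is a minimal sketch; the function name `modules_for_os` is hypothetical and not part of the skill's scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the biz-modules worth scanning for a given OS.
# The matrix mirrors the "Supported BizModules / Supported OS" text above.
modules_for_os() {
  case "$1" in
    android|iphoneos)
      echo "crash anr lag custom memory_leak memory_alloc" ;;
    harmony)
      # harmony has no anr / memory_* modules
      echo "crash lag custom" ;;
    *)
      echo "unsupported os: $1" >&2
      return 1 ;;
  esac
}
```

For example, `for bm in $(modules_for_os harmony); do ...; done` would loop over three modules instead of six.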

2. Prerequisites

| Item | Requirement | Self-check command |
| --- | --- | --- |
| Aliyun CLI version | >= `3.3.3` | `aliyun version` |
| Plugin | `aliyun-cli-emas-appmonitor` | `aliyun emas-appmonitor --help` |
| jq | any version (required by scripts) | `jq --version` |

Full installation steps: `references/cli-installation-guide.md`. Recommended: enable auto plugin installation once:
bash
aliyun configure set --auto-plugin-install true
aliyun plugin update
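The `>= 3.3.3` requirement can be checked mechanically. The sketch below compares two dotted version strings field by field; `version_ge` is a hypothetical helper, and it assumes purely numeric `major.minor.patch` fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 when dotted version $1 is >= dotted version $2.
# Only the first three numeric fields are compared; missing fields count as 0.
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for i in 0 1 2; do
    h=${have[i]:-0}
    w=${want[i]:-0}
    # 10# forces base 10 so a field such as "08" is not parsed as octal
    if (( 10#$h > 10#$w )); then return 0; fi
    if (( 10#$h < 10#$w )); then return 1; fi
  done
  return 0
}
```

A possible use is `version_ge "$(aliyun version)" 3.3.3 || echo "please upgrade the Aliyun CLI"`, under the assumption that `aliyun version` prints a bare version string; if the output carries extra text, it has to be trimmed first.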

3. Credential Pre-check

Do NOT print AK/SK values; just verify that an available profile exists:
bash
aliyun configure list
The expected output contains a `current` profile whose `Mode` / `RegionId` are non-empty. If not, configure one of AK / OAuth / StsToken / RamRoleArn per `references/cli-installation-guide.md#Configuration`.
This skill never reads or forwards the AK / SK field values at any point in its lifecycle.

4. AI-mode Lifecycle

Start (before the skill runs):
bash
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-emas-apm-query"
End (after troubleshooting completes):
bash
aliyun configure ai-mode disable
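Because the disable step is easy to forget when a troubleshooting command fails halfway, the start/end pair can be wrapped around the work itself. The following is a sketch only: `with_ai_mode` and the `ALIYUN_BIN` override are hypothetical names introduced here so the wrapper can be dry-run with `echo`; the three `aliyun configure ai-mode` commands are the ones listed above.

```shell
#!/usr/bin/env bash
# ALIYUN_BIN is a hypothetical override (e.g. ALIYUN_BIN=echo for a dry run).
ALIYUN_BIN="${ALIYUN_BIN:-aliyun}"

# with_ai_mode <command> [args...]
# Enables AI mode, runs the command, then disables AI mode again and
# returns the command's exit status.
with_ai_mode() {
  "$ALIYUN_BIN" configure ai-mode enable || return
  "$ALIYUN_BIN" configure ai-mode set-user-agent \
    --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-emas-apm-query"
  local rc=0
  "$@" || rc=$?
  # Switch AI mode off even when the wrapped command failed.
  "$ALIYUN_BIN" configure ai-mode disable
  return "$rc"
}
```

This covers ordinary command failures; it does not handle the shell being killed by a signal, for which a `trap` would be needed.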

5. RAM Permissions

This skill only needs the 4 read-only `emasha:View*` actions (`ViewIssues` / `ViewIssue` / `ViewErrors` / `ViewError`). EMAS AppMonitor's RamCode is `emasha` and does not support resource-level authorization, so `Resource` is fixed to `"*"`.
For the full least-privilege JSON policy, the equivalent system policies (`AliyunEMASAppMonitorReadOnlyAccess` / `AliyunEMASAppMonitorFullAccess`), and common permission-error troubleshooting, see `references/ram-policies.md`.
[MUST] Permission Failure Handling: When any command or API call fails due to a permission error at any point during execution, follow this process:
  1. Read `references/ram-policies.md` to get the full list of permissions required by this skill
  2. Use the `ram-permission-diagnose` skill to guide the user through requesting the necessary permissions
  3. Pause and wait until the user confirms that the required permissions have been granted

6. Parameter Confirmation

IMPORTANT: Parameter Confirmation — Before executing any command or API call, ALL user-customizable parameters (e.g., RegionId, instance names, CIDR blocks, passwords, domain names, resource specifications, etc.) MUST be confirmed with the user. Do NOT assume or use default values without explicit user approval.
| Parameter | Required | Type | Description | Default |
| --- | --- | --- | --- | --- |
| `app-key` | Yes | int64 | EMAS APP Key (typically 9+ digits). Prefer to infer it from SDK initialization code in the current workspace (see the 6 rule families in `references/appkey-detection.md`); if exactly one match is found, echo it and wait for user confirmation; if multiple are found, list the candidates and let the user pick; on a miss, ask the user to provide it manually. | None (default: probed from workspace) |
| `os` | Yes | enum | `android` / `iphoneos` / `harmony` (H5 goes to `h5`, which is out of scope). Inferred together with `app-key` from the project type: `build.gradle` / `AndroidManifest.xml` → `android`; `*.xcodeproj` / `Podfile` → `iphoneos`; `module.json5` + `ets/` → `harmony`. For cross-platform Flutter / Unity projects, the user MUST pick one (`android` / `iphoneos`). | None (default: probed from project type) |
| `time-range` | Yes | object | `StartTime=<ms> EndTime=<ms> Granularity=1 GranularityUnit=<HOUR\|DAY>` | Last 24 hours (user-overridable) |
| `biz-module` | No | list | If omitted, all 6 modules are scanned; if specified, only that module is analyzed | All 6 |
| `digest-hash` | No | string | If the user already knows a specific Issue, skip the Top-N stage and drill down directly | None |
| `top-n` | No | int | Number of Top issues | `5` |
| `filter-json` | No | string | Further narrows the query (specific version / device model / region ...); a JSON string | Not applied |
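The project-type rules for `os` in the table above can be sketched as a marker-file probe. This is an illustration, not the skill's own detection logic (that lives in `references/appkey-detection.md`): `detect_os`, `_has_marker`, and the search depth of 5 are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when `find` locates at least one match
# under the project root, looking at most 5 levels deep (assumed depth).
_has_marker() {
  local root="$1"; shift
  [ -n "$(find "$root" -maxdepth 5 "$@" 2>/dev/null | head -n 1)" ]
}

# detect_os [project-root]
# Prints every OS whose marker files exist, one per line.
# Zero lines means "ask the user"; more than one line means "let the user pick".
detect_os() {
  local root="${1:-.}"
  if _has_marker "$root" \( -name build.gradle -o -name AndroidManifest.xml \); then
    echo android
  fi
  if _has_marker "$root" \( -name '*.xcodeproj' -o -name Podfile \); then
    echo iphoneos
  fi
  if _has_marker "$root" -name module.json5 && _has_marker "$root" -type d -name ets; then
    echo harmony
  fi
  return 0
}
```

On a Flutter project this would typically print both `android` and `iphoneos`, which matches the rule that the user must pick one. Whatever is detected still has to be confirmed with the user before any call is made.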
Timestamp unit: every API uses Unix milliseconds. If the user passes a value in seconds (< 1e12), the scripts will automatically multiply by 1000.
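The seconds-to-milliseconds rule and the default 24-hour window can be sketched as follows. The helper names `to_ms` and `last_24h_window` are hypothetical; the skill's scripts may implement the same rule differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: normalise a Unix timestamp to milliseconds.
# Values below 1e12 are treated as seconds and multiplied by 1000.
to_ms() {
  local ts="$1"
  if (( ts < 1000000000000 )); then
    echo $(( ts * 1000 ))
  else
    echo "$ts"
  fi
}

# Hypothetical helper: print "<startMs> <endMs>" for the 24 hours ending
# at the given timestamp (seconds or milliseconds; default: now).
last_24h_window() {
  local end_ms
  end_ms="$(to_ms "${1:-$(date +%s)}")"
  echo "$(( end_ms - 24 * 60 * 60 * 1000 )) $end_ms"
}
```

For instance, `read -r start end < <(last_24h_window)` yields values that can be passed as `--start-time "$start" --end-time "$end"` to the scripts in section 7.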
`biz-module` pitfall: the CLI `--help` lists the legacy enum (`exception / crash / lag / custom / h5JsError / h5WhiteScreen`); however, `anr / memory_leak / memory_alloc` are actually forwarded to the backend and work. This skill scans all 6 modules requested by the user by default; see `references/biz-module-reference.md`.
`time-range` pitfall: in some environments `Granularity=60 GranularityUnit=MINUTE` is rejected by the backend (returns `Code: 200, Message: "unknown error"`). Always prefer `Granularity=1 GranularityUnit=DAY` or `GranularityUnit=HOUR`.
`--os` pitfall: the CLI `--help` marks `--os` as optional, but in practice omitting it returns an empty list without error (`Model.Items=[]`, `Total=0`). All 4 APIs must pass `--os` explicitly.
`--did` pitfall: `get-error`'s `--did` is also marked optional in `--help`, but is implicitly required by the backend. Omitting it returns `Code: 100011 Parameter Not Enough`. Take it from `Items[*].Did` in the `get-errors` response (already handled by `dig_issue.sh`; when calling `aliyun emas-appmonitor get-error` manually, pass it explicitly).
Dual semantics of `DigestHash`: `get-errors` returns `Items[*].DigestHash`, which is the hash of a single event, different from the aggregated `--digest-hash` you passed in. When calling `get-error` next, still use the aggregated hash (the one you used in `get-issues` / `get-issue`); do not switch to the single-event hash.
Reuse `biz-module`: whichever `bizModule` was used to obtain a Top Issue from `get-issues` must be reused for the next three steps (`get-issue` / `get-errors` / `get-error`); otherwise the response will be empty (the same `DigestHash` exists under only one bizModule). `list_top_issues.sh` already attaches a `bm` field to each row so it can be reused.
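The `--did`, `DigestHash`, and `biz-module` pitfalls above come down to carrying the right keys from `get-errors` into `get-error`. The sketch below pairs each sample's `ClientTime` / `Uuid` / `Did` with the aggregated hash and bizModule that were used upstream, and deliberately ignores the per-event `Items[*].DigestHash`. The function name `sample_keys` is hypothetical, and the `Model.Items[]` shape is inferred from the field paths quoted in this document.

```shell
#!/usr/bin/env bash
# sample_keys <bizModule> <aggregatedDigestHash> < get-errors-response.json
# Prints one tab-separated row per sample:
#   bizModule, aggregated hash, ClientTime, Uuid, Did
# The per-event Items[*].DigestHash is intentionally not used.
sample_keys() {
  local bm="$1" agg_hash="$2"
  jq -r --arg bm "$bm" --arg h "$agg_hash" \
    '.Model.Items[] | [$bm, $h, .ClientTime, .Uuid, .Did] | @tsv'
}
```

Each output row then holds everything a manual `get-error` call needs alongside `--os`; `dig_issue.sh` already performs this hand-off internally.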

7. Core Workflow

mermaid
flowchart TD
    Start[User request] --> DetectCtx{Workspace can infer AppKey+OS?}
    DetectCtx -- single match --> ConfirmCtx[Echo for user confirmation]
    DetectCtx -- multiple matches --> PickCtx[List candidates, let user pick]
    DetectCtx -- miss --> AskCtx[Ask user for AppKey + OS]
    ConfirmCtx --> HasHash
    PickCtx --> HasHash
    AskCtx --> HasHash
    HasHash{digestHash provided?}
    HasHash -- yes --> SingleIssue[get-issue: fetch single Issue metadata]
    HasHash -- no --> ParallelIssues[Parallel get-issues over 6 bizModules]
    ParallelIssues --> TopN[Sort by errorRate, take Top 3-5]
    TopN --> IterateTop[Iterate Top Issues]
    IterateTop --> SingleIssue
    SingleIssue --> GetErrors[get-errors: latest 3-5 samples]
    GetErrors --> PickSample["Sample policy: latest / hot device / latest affected version"]
    PickSample --> GetError[get-error: stack/threads/logs/dimensions]
    GetError --> HasCode{CWD has APP source?}
    HasCode -- yes --> CodeMatch[stack -> file + line -> diff]
    HasCode -- no --> CliReport[CLI-only diagnostic report]
    CodeMatch --> Report[Final report: issue list + root cause + fix]
    CliReport --> Report
    Report --> CallFailed{Any CLI call failed?}
    CallFailed -- yes --> CliSelfDiag["CLI self-diagnose: --log-level debug / --cli-dry-run / configure list / plugin update"]
    CallFailed -- no --> endNode[Done]
    CliSelfDiag --> endNode

7.0 Locate the Skill directory at runtime (`$SKILL_DIR`)

The Skill's own path is known to the Agent at the time SKILL.md is loaded (see the `fullPath` / `path` field under `<available_skills>` in the context). Before running any bash command that needs to read bundled resources from this Skill (`scripts/` / `assets/` / `references/`), the Agent MUST first export the directory of SKILL.md to `SKILL_DIR` exactly once:
bash
# The Agent fills in the absolute path of SKILL.md into the placeholder, then exports once
export SKILL_DIR="$(cd "$(dirname "<ABSOLUTE_PATH_OF_SKILL.md>")" && pwd)"

# Self-check: all three directories must exist
[[ -d "$SKILL_DIR/scripts" && -d "$SKILL_DIR/assets" && -d "$SKILL_DIR/references" ]] \
  || { echo "[ERROR] SKILL_DIR does not point to the root of this Skill: $SKILL_DIR" >&2; exit 1; }

Rules:

1. **Do not hardcode `~/.cursor/skills-cursor/...` or `~/.claude/skills/...`**: this Skill can be distributed in the repository (`.agent/skills/alibabacloud-emas-apm-query/`) or at the user level, and the absolute path varies with the host.
2. **Do not rely on `cd` into the Skill directory to use relative paths**: the scripts drop artifacts into the current working directory (the user's APP source root); `cd` would break this semantic.
3. **The bash scripts have a fallback**: `scripts/list_top_issues.sh` and `scripts/dig_issue.sh` auto-detect their own location via `BASH_SOURCE` at the top, so they can locate `$SKILL_DIR` even if it was not exported. Other inline `jq` / `rg` commands inside SKILL.md still require the Agent to export `$SKILL_DIR` first.

7.1 Stage A: Top N (when `digest-hash` is not provided)

Use `scripts/list_top_issues.sh` to scan the 6 biz_modules in parallel:
bash
bash "$SKILL_DIR/scripts/list_top_issues.sh" \
  --app-key <AppKey> \
  --os <iphoneos|android|harmony> \
  --start-time <startMs> \
  --end-time <endMs> \
  --top-n 5 \
  --order-by ErrorRate
The output is a merged Top-N table, each row containing `{bm, digestHash, ec, er, edc, edr, name, type, reason}`. To add a Filter (e.g. "only version 3.5.x"), append `--filter-json '{"Key":"appVersion","Operator":"in","Values":["3.5.0","3.5.1"]}'`; see `references/filter-reference.md`.
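Hand-writing the filter JSON inside shell quotes is error-prone, so it can be generated with `jq -cn` instead. The sketch below builds the same `in` filter shown above; `in_filter` is a hypothetical helper name, and it assumes the values themselves contain no commas.

```shell
#!/usr/bin/env bash
# in_filter <key> <value> [<value> ...]
# Prints a compact {"Key":...,"Operator":"in","Values":[...]} JSON string.
# Assumption: no value contains a comma (values are joined and re-split on ",").
in_filter() {
  local key="$1"; shift
  local IFS=,
  jq -cn --arg k "$key" --arg vs "$*" \
    '{Key: $k, Operator: "in", Values: ($vs | split(","))}'
}
```

The result can be passed straight through, for example `--filter-json "$(in_filter appVersion 3.5.0 3.5.1)"`.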

7.2 Stage B: Drill into a single Issue

Use `scripts/dig_issue.sh`:
bash
bash "$SKILL_DIR/scripts/dig_issue.sh" \
  --app-key <AppKey> \
  --os <iphoneos|android|harmony> \
  --biz-module <crash|anr|lag|custom|memory_leak|memory_alloc> \
  --digest-hash <13-char Base36> \
  --start-time <startMs> --end-time <endMs> \
  --sample-size 3
Output directory:
emas-apm-dig-<AppKey>-<DigestHash>-<epoch>/
  01-get-issue.json
  02-get-errors.json           (contains the ClientTime/Uuid/Did triples)
  samples/<Uuid>.json          (complete JSON per sample, includes Backtrace/EventLog etc.)
  report.md                    (structured markdown report)

7.3 Stage C: Code mapping + diff (if the CWD contains APP source)

Follow `references/troubleshoot-workflow.md`:
  1. Determine the platform (Android / iOS / Harmony / RN / Flutter / Web)
  2. `Model.Backtrace` → keep APP user frames → grep the source → locate file:line
  3. Enrich the timeline using `EventLog` + `Controllers` + `Threads` + `CustomInfo`
  4. Emit the smallest diff (≤ 20 lines + one sentence of "why")
If the CWD does not contain the source: emit only a CLI diagnostic report (Issue overview, sample dimension comparison, representative stack), and append a hint that "switching to the APP source directory enables code-level localization".
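Step 2 of the list above (keep APP user frames, then locate file:line) can be sketched for Java/Kotlin-style stack frames. This is an assumption-laden illustration: `user_frames` is a hypothetical helper, and it assumes frames look like `at com.example.app.Foo.bar(Foo.java:42)`; the actual `Model.Backtrace` layout may differ per platform.

```shell
#!/usr/bin/env bash
# user_frames <package-prefix> < backtrace.txt
# Keeps only frames that mention the app's own package prefix and prints
# "<File>:<line>" for each frame that carries a (File:line) suffix.
user_frames() {
  local prefix="$1"
  grep -F -- "$prefix" \
    | sed -n 's/.*(\([^():]*\):\([0-9][0-9]*\)).*/\1:\2/p'
}
```

Each printed `File:line` can then be looked up in the workspace, for example with `find . -name CartActivity.java`, before proposing a diff.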

7.4 Failure handling (CLI only)

When any `aliyun emas-appmonitor` call fails, run the following self-checks in order:
bash
aliyun configure list                                   # 1. current profile / mode / region
aliyun plugin update                                    # 2. latest plugin
aliyun emas-appmonitor <cmd> ... --cli-dry-run          # 3. parameter serialization check
aliyun emas-appmonitor <cmd> ... --log-level debug      # 4. HTTP body + RequestId
Do not guide the user to query any server-side data source.

8. Success Verification

The full 6-step CLI self-verification (with runnable commands and pass/fail criteria for each step) is in `references/verification-method.md`. The correct-vs-incorrect CLI pattern matrix is in `references/acceptance-criteria.md`. Core criteria:
  1. Reachable: `get-issues` dry-run prints the HTTP body successfully
  2. Non-empty: some biz_module has `Model.Total >= 1`
  3. Stable: two calls with identical parameters return the same Top 5 `DigestHash` values
  4. Filter works: after adding a filter, `Total` is less than or equal to the unfiltered count
  5. Three-level chain: `issues → issue → errors → error` can pull a Stack end to end
  6. Diagnosable: on induced errors, the output includes `RequestId` and `ErrorCode`

9. Cleanup

This skill is read-only; it does not create any cloud resources that need cleanup.
Tear-down consists of two steps, the second of which is optional:
bash
aliyun configure ai-mode disable
# (optional) delete the local JSON directories produced by dig_issue.sh
rm -rf ./emas-apm-dig-*

10. Best Practices

  1. Probe first, ask later: before entering the main flow, grep SDK initialization code from the user's workspace per `references/appkey-detection.md` to infer `app-key` / `os`; confirm with the user only after a hit, rather than asking upfront.
  2. Top first, then drill: do not run `dig_issue.sh` against every Issue from the start; first use `list_top_issues.sh` to aggregate the Top N, then drill into each of them. The total number of CLI calls is `O(N)` rather than `O(all)`.
  3. Always pass `--os`: `--os` on all 4 APIs is marked optional in `--help`, but omitting it returns empty results silently. Always specify `android / iphoneos / harmony` explicitly.
  4. `get-error` MUST carry `--did`: marked optional in `--help` but implicitly required by the backend; take it from `Items[*].Did` in the `get-errors` response.
  5. Reuse `biz-module`: the next `get-issue` / `get-errors` / `get-error` calls must use the same bizModule that produced the Issue in `get-issues`; switching will return empty.
  6. Shrink the time window from "wide" to "narrow": start diagnosis with 24h / `Granularity=1 GranularityUnit=DAY`; once a specific version / device is located, shrink to 1~4 hours with `GranularityUnit=HOUR`.
  7. Filters are JSON strings: the entire `--filter` value must be a single JSON string; build nested `SubFilters` with `jq -cn` to avoid manual escape errors (see `references/filter-reference.md`).
  8. Multi-account scenarios: confirm the profile via `aliyun configure list` and pass `--profile <name>` explicitly rather than relying on implicit env-var switching.
  9. Persist `get-error`: this API response can range from hundreds of KB to several MB; do not truncate the JSON with `head` / `tail`. Write it to a file first (`> /tmp/emas-error-XXX.json`) and then process it with `jq`.
  10. Android obfuscation: when you see class names like `a.a.a.b.c`, ask the user for `mapping.txt` before attempting code mapping rather than guessing.
  11. iOS not symbolicated: when `Model.SymbolicStatus=false`, the `Stack` contains many raw addresses; only emit conclusions at device / version dimensions, and re-analyze after dSYM is uploaded.
  12. Parallel QPS control: `list_top_issues.sh` has a built-in `sleep 0.3s` to avoid throttling; scanning 6 biz_modules takes 2~3 seconds in total and does not need extra concurrency.
  13. Empty `biz-module` results are not errors: `anr / memory_*` under `harmony` or very-low-traffic AppKeys returning `Total=0` is normal and should not be retried.
  14. Do not reverse-use this skill to write data: all 4 APIs are `Get*` / `View*`. If the user wants to "update Issue status" or "mark as fixed", that falls under write APIs like `UpdateIssueStatus` and is out of scope.

11. Reference Links

| Document | Purpose |
| --- | --- |
| `references/cli-installation-guide.md` | Aliyun CLI installation / configuration / plugins / credentials |
| `references/appkey-detection.md` | Identify AppKey and OS from the user's workspace across Android / iOS / Harmony / Flutter / Unity / H5 |
| `references/ram-policies.md` | Least-privilege JSON + Permission Failure Handling |
| `references/get-issues.md` | `GetIssues` parameters / response / ordering |
| `references/get-issue.md` | `GetIssue` parameters / response |
| `references/get-errors.md` | `GetErrors` parameters / response |
| `references/get-error.md` | `GetError` parameters / response |
| `references/filter-reference.md` | `--filter` structure / operators / SubFilters / dry-run validation |
| `references/biz-module-reference.md` | 6 biz_modules x platforms x available `filterCode` list |
| `references/troubleshoot-workflow.md` | Full flow for stack -> source -> diff |
| `references/related-commands.md` | Cheat sheet for all `aliyun emas-appmonitor` commands + skill boundary |
| `references/verification-method.md` | 6-step runnable CLI verification with pass/fail criteria |
| `references/acceptance-criteria.md` | Correct vs incorrect CLI pattern matrix (for review / self-check) |
| `assets/system-filters/index.json` | Index of 14 static filter snapshots (biz_module x platform) |