alibabacloud-ram-permission-diagnose


RAM Permission Diagnosis and Repair

Execution constraint: When executing `aliyun` CLI commands directly (not generating commands for the user to copy), always append `--user-agent AlibabaCloud-Agent-Skills`. Do not add this flag to commands shown to users in Path B output or Path C console guidance.
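The constraint above can be sketched as a small helper that builds the argument list for a CLI call. This is only an illustration: the function name `build_aliyun_command` and the `executed_by_agent` parameter are hypothetical, and only the flag itself comes from the text.

```python
from typing import List

# Flag required by the execution constraint for agent-run commands only.
USER_AGENT_FLAG = ["--user-agent", "AlibabaCloud-Agent-Skills"]


def build_aliyun_command(args: List[str], executed_by_agent: bool) -> List[str]:
    """Return the argv for an aliyun CLI call (hypothetical helper).

    The user-agent flag is appended only when the agent runs the command
    itself; commands shown to the user (Path B / Path C) stay unmodified.
    """
    command = ["aliyun", *args]
    if executed_by_agent:
        command += USER_AGENT_FLAG
    return command
```

Keeping the decision in one place makes it harder to leak the flag into commands that are printed for the user to copy.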

Overview

When a RAM permission error is detected, run through these steps:

1. Quick Analysis: parse raw error fields (no tool calls), output a brief summary, and ask the user to choose the analysis depth
2. Deep Analysis: (only if the user selects path B) decode if needed, run gap analysis, classify the root cause
3. Generate Recommendations: least-privilege authorization plan
4. Execute Repair: present repair options and wait for the user to choose

Permission level (L0–L3) is the agent's internal routing state, inferred implicitly from API call results during the flow. It determines diagnostic depth and available repair paths. Never declare or describe the level to the user. See `references/diagnose-flow.md` for level definitions.

Step 1: Quick Analysis

Parse raw error fields without any tool calls, then let the user decide how deep to go.

1a. Extract from raw error

- `error_code`: e.g., `NoPermission`, `Forbidden`, `InvalidSecurityToken`
- `missing_action`: e.g., `ecs:StopInstance`
- `principal_type`: `SubUser`, `AssumedRoleUser`, or `RootUser` (from `AuthPrincipalType`)
- `principal_display_name`: UserId or role:session (from `AuthPrincipalDisplayName`)
- `no_permission_type`: `ImplicitDeny` or `ExplicitDeny` (from `NoPermissionType`)
- `policy_type`: e.g., `AccountLevelIdentityBasedPolicy`, `AssumeRolePolicy` (from `PolicyType`)
- `encoded_message`: retain `EncodedDiagnosticMessage` if present, for use in Step 2 if needed
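As a rough sketch, the extraction above could look like the following. It assumes the raw error has already been decoded into a dictionary with a top-level `Code` key and a nested `AccessDeniedDetail` object, and that `missing_action` comes from `AuthAction`; both are assumptions for illustration, and `QuickFields` and `extract_fields` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class QuickFields:
    """Fields pulled from the raw error; None means the field was absent."""
    error_code: Optional[str]
    missing_action: Optional[str]
    principal_type: Optional[str]
    principal_display_name: Optional[str]
    no_permission_type: Optional[str]
    policy_type: Optional[str]
    encoded_message: Optional[str]


def extract_fields(error: Dict[str, Any]) -> QuickFields:
    # Assumed layout: {"Code": ..., "AccessDeniedDetail": {...}}.
    detail = error.get("AccessDeniedDetail") or {}
    return QuickFields(
        error_code=error.get("Code"),
        missing_action=detail.get("AuthAction"),  # assumed source field
        principal_type=detail.get("AuthPrincipalType"),
        principal_display_name=detail.get("AuthPrincipalDisplayName"),
        no_permission_type=detail.get("NoPermissionType"),
        policy_type=detail.get("PolicyType"),
        encoded_message=detail.get("EncodedDiagnosticMessage"),
    )
```

Missing fields stay `None` rather than raising, so the later depth choice can treat "field absent" as a signal in its own right.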

1b. Output brief summary

Based on the extracted fields, output a concise summary: who is affected, what action is missing, initial root cause inference.

1c. Present depth choice and wait for selection

Present the following and wait for the user to select — do not proceed until a choice is made:

- A. Quick path (recommended when: ImplicitDeny + all key fields present + common service): skip Step 2 and generate recommendations directly from raw fields and built-in knowledge.
- B. Deep path (recommended when: ExplicitDeny, missing fields, or unfamiliar service): run the full Step 2 analysis for a more precise result. Uses two optional permissions: `ram:DecodeDiagnosticMessage` (decode encoded errors) and the system policy `AliyunRAMReadOnlyAccess` (gap analysis). Missing permissions limit specific capabilities, but the flow continues.
- Skip: stop here and output manual troubleshooting links.

Mark the recommended option clearly and briefly explain why.

If user selects A: proceed to Step 3. Note in the recommendation that it is based on quick analysis; the user can request deep analysis at any time.

If user selects B: proceed to Step 2.

If user selects Skip: output the error summary, links to the RAM documentation (https://help.aliyun.com/document_detail/93733.html) and the RAM console (https://ram.console.aliyun.com/policies), and a note on how to restart diagnosis.

Edge case, ExplicitDeny with path A forced: if `NoPermissionType = ExplicitDeny` and the user still selects A, explain that the specific Deny policy cannot be identified without deep analysis, and provide a limited recommendation with explicit uncertainty noted.
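The rule for which option to mark as recommended can be sketched as below. The set of "key fields" and the set of "common services" are assumptions made for illustration (the service list borrows the popular services named in Step 3), and `recommend_path` is a hypothetical name. The function only picks the option to highlight; the user still makes the choice.

```python
from typing import Mapping, Optional

# Assumption: "common service" means one of the popular services from Step 3.
COMMON_SERVICES = {"ecs", "oss", "rds", "fc", "slb", "vpc", "sls", "sts"}

# Assumption: these are the fields treated as "key" for the quick path.
KEY_FIELDS = (
    "missing_action",
    "principal_type",
    "principal_display_name",
    "no_permission_type",
    "policy_type",
)


def recommend_path(fields: Mapping[str, Optional[str]]) -> str:
    """Return "A" (quick) or "B" (deep) as the option to mark recommended."""
    if any(not fields.get(name) for name in KEY_FIELDS):
        return "B"  # missing fields: deep analysis is more reliable
    if fields["no_permission_type"] != "ImplicitDeny":
        return "B"  # ExplicitDeny needs the matched Deny policy
    service = fields["missing_action"].split(":", 1)[0].lower()
    return "A" if service in COMMON_SERVICES else "B"
```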

Step 2: Deep Analysis

Entered only when the user selects path B in Step 1.

First attempt classification using the raw fields from Step 1. `DecodeDiagnosticMessage` is a supplement: invoke it only when raw data is insufficient to classify with confidence.

Decode when raw data alone cannot resolve the root cause: e.g., `ExplicitDeny` is present (need `MatchedPolicies`), `AccessDeniedDetail` was absent, or `PolicyType` is missing. For cases where `NoPermissionType`, `AuthAction`, `AuthPrincipalType`, and `PolicyType` are all available and point to a clear root cause, skip decode and proceed directly.

Transcribe `EncodedDiagnosticMessage` from the raw error and call:

```bash
aliyun ram DecodeDiagnosticMessage --EncodedDiagnosticMessage "<transcribed-value>"
```

If the call returns `EntityNotExist`, re-run the original failing command and save its output to a temp file (use the system temp dir; name the file after the command context, e.g. `/tmp/aliyun_ecs_stopinstance.txt`). Extract `EncodedDiagnosticMessage` from the file and retry the decode. If the field is not found in the file, mark as L0 and continue.

A `SubUser` identity needs UserName resolution before gap analysis; see `references/diagnose-flow.md` → Identity Resolution. If resolution fails, mark as L0 and continue.

Root cause categories:

MissingAction — identity policy lacks the required Action (most common)
ExplicitDeny — a Deny statement blocks access (may be identity policy or CP control policy)
TrustPolicy — role trust policy does not allow the caller to assume the role
STSInsufficient — STS temporary credential lacks permission; root cause is on the originating Role
TokenExpired — STS token has expired
SLRMissing — service-linked role has not been created
ResourcePolicy — resource-side policy (e.g., OSS Bucket Policy) is restricting access
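The full classification rules live in `references/diagnose-flow.md` and are not reproduced here. Purely as an illustration, a first pass over the raw fields might look like the sketch below. The three rules are simplifying assumptions rather than the documented logic, `classify_from_raw` is a hypothetical name, and every other category (STSInsufficient, TokenExpired, SLRMissing, ResourcePolicy) is deliberately left unclassified.

```python
from typing import Optional


def classify_from_raw(
    no_permission_type: Optional[str],
    policy_type: Optional[str],
) -> Optional[str]:
    """Heuristic first pass; returns None when raw fields are not enough."""
    if no_permission_type == "ExplicitDeny":
        return "ExplicitDeny"
    # Assumption: a denial evaluated against AssumeRolePolicy points at
    # the role's trust policy.
    if policy_type == "AssumeRolePolicy":
        return "TrustPolicy"
    # Assumption: an implicit deny on an identity-based policy is the
    # common "missing Action" case.
    if (
        no_permission_type == "ImplicitDeny"
        and policy_type == "AccountLevelIdentityBasedPolicy"
    ):
        return "MissingAction"
    return None  # not classifiable from raw fields; consider decoding
```

A `None` result corresponds to the "raw data is insufficient" case, where decoding is worth attempting.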

For gap analysis trigger rules and per-root-cause handling details, see `references/diagnose-flow.md`.

Gap analysis (when triggered): query current policies attached to the identity, then compare against the required Action. Use `ListPoliciesForUser` (SubUser), `ListPoliciesForRole` (AssumedRoleUser), or `ListControlPolicies` (RootUser). For Custom policies, fetch the policy document with `GetPolicyVersion`. System policies: use built-in knowledge, do not call `GetPolicyVersion`.
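A minimal sketch of the comparison step, assuming the policy documents have already been fetched and parsed into dictionaries. The matcher is deliberately simplified: it looks only at `Allow` statements and their `Action` patterns, and ignores `Resource`, `Condition`, `NotAction`, and `Deny`. Case-insensitive matching is an assumption, and `action_is_allowed` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Iterable, List

# Which list API to query for each principal type (from the text above).
LIST_API_BY_PRINCIPAL = {
    "SubUser": "ListPoliciesForUser",
    "AssumedRoleUser": "ListPoliciesForRole",
    "RootUser": "ListControlPolicies",
}


def _as_list(value: Any) -> List[str]:
    # "Action" may be a single string or a list of strings.
    return [value] if isinstance(value, str) else list(value or [])


def action_is_allowed(action: str, policy_documents: Iterable[Dict[str, Any]]) -> bool:
    """True if any Allow statement has an Action pattern matching `action`."""
    wanted = action.lower()
    for document in policy_documents:
        for statement in document.get("Statement", []):
            if statement.get("Effect") != "Allow":
                continue
            patterns = _as_list(statement.get("Action"))
            if any(fnmatchcase(wanted, p.lower()) for p in patterns):
                return True
    return False
```

If the required Action is not matched by any attached policy, that gap is what the recommendation in Step 3 needs to close.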

When permissions are insufficient: if `DecodeDiagnosticMessage` fails (L0) or policy queries fail (L1), inform the user of the limitation and provide ready-to-use permission request materials for a RAM admin as two independent options: ① decode permission (`ram:DecodeDiagnosticMessage`) as a custom policy; ② RAM read access via system policy `AliyunRAMReadOnlyAccess` (covers gap analysis). Either or both can be requested independently. Then continue to Step 3 without waiting.
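For option ①, the request material could include a custom policy document along these lines. This is a sketch of one plausible document, not a prescribed one: granting the action on `Resource: "*"` is an assumption, and `decode_permission_policy` is a hypothetical helper name.

```python
import json


def decode_permission_policy() -> str:
    """Return a custom policy JSON granting only the decode action."""
    document = {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ram:DecodeDiagnosticMessage",
                "Resource": "*",  # assumption: not scoped to a resource
            }
        ],
    }
    return json.dumps(document, indent=2)


# Option 2 needs no document: it is the existing system policy.
READ_ACCESS_SYSTEM_POLICY = "AliyunRAMReadOnlyAccess"
```

The two options stay separate so that a RAM admin can approve either one without the other.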

Step 3: Generate Recommendations

Before generating, check for caller skill permission hints (see `references/diagnose-flow.md` → Coverage Check).

Knowledge source priority:

1. Built-in knowledge: for popular services (ECS, OSS, RDS, FC, SLB, VPC, SLS, STS, etc.), use known Action semantics directly. Reference `references/hot-services-ram.md`.
2. Caller skill hints: if `ram-policies.md` was found, use it as supplementary context.
3. Web search: search `{product} RAM authorization site:help.aliyun.com`; prefer manually maintained docs with business examples over auto-generated Action tables.
4. System policy fallback: recommend `AliyunXxxReadOnlyAccess` or `AliyunXxxFullAccess` with a note to tighten further.

Custom policy naming: suggest a name based on service and task semantics (e.g., `ai-agent-ecs-permissions`), confirm once, reuse in the same session.

System policy: attach directly with a single command, no naming needed.

For the Trust Policy root cause path, recommendations differ; see `references/diagnose-flow.md` → Handling Each Root Cause.

After presenting the recommendation, add a brief note: the current plan is a starting point; the user can request further refinement at any time — for example, scoping down to specific resources, adding conditions, or using resource-level policies (such as OSS bucket policies) instead of identity-level grants.
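To make the refinement options concrete, the sketch below builds a broad starting statement and a scoped-down variant of it. The region, masked account ID, instance ID, and source IP range are placeholder values chosen for illustration, and `build_statement` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional


def build_statement(
    actions: List[str],
    resources: Optional[List[str]] = None,
    condition: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one Allow statement; defaults to all resources, no condition."""
    statement: Dict[str, Any] = {
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": list(resources) if resources else ["*"],
    }
    if condition:
        statement["Condition"] = condition
    return statement


# Starting point: the missing action on all resources.
broad = build_statement(["ecs:StopInstance"])

# Refinement: one instance only, and only from a given network range.
# All identifiers below are placeholders, not real values.
scoped = build_statement(
    ["ecs:StopInstance"],
    resources=["acs:ecs:cn-hangzhou:123456789012****:instance/i-example"],
    condition={"IpAddress": {"acs:SourceIp": ["203.0.113.0/24"]}},
)
```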

Step 4: Execute Repair

Before executing any write operation, present the change summary and all available paths to the user, then wait for the user to select a path — do not proceed or output any commands until the user has chosen:

- Target (user or role name)
- Change summary (policy name, action, undo method)
- Path options (always present all that are available for the current level; never skip any):
  - A. Direct CLI execution: agent runs commands now (only at L2)
  - B. Output CLI commands: user copies and runs in their own terminal (all levels)
  - C. Console guidance: step-by-step in RAM console (all levels)
  - Skip: do not execute
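The availability rules in the list above can be written down as a small lookup. This is a sketch that follows the list literally (Path A only at L2); `available_paths` is a hypothetical name, and the level strings are the internal routing labels, which are never shown to the user.

```python
from typing import List

KNOWN_LEVELS = {"L0", "L1", "L2", "L3"}


def available_paths(level: str) -> List[str]:
    """Return the repair options to present for an internal level."""
    if level not in KNOWN_LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    paths: List[str] = []
    if level == "L2":
        paths.append("A")  # direct CLI execution, only at L2
    paths += ["B", "C", "Skip"]  # offered at every level
    return paths
```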

For pre-query requirements before write operations and full CLI command examples, see `references/ram-cli-commands.md` and `references/diagnose-flow.md`.

Path A: agent executes via Bash. On success → L3 confirmed; report result and undo command. On NoPermission → switch to Path B automatically.

Path B at L0/L1: output incremental Statement JSON only, with a note that existing policies could not be read and the user must merge manually.

Path B at L2: offer two sub-options: ① incremental Statement only, ② complete merged policy JSON.
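The difference between the two Path B variants can be sketched as follows: without read access only the incremental statement is available, while at L2 the existing document can also be merged. `path_b_output` is a hypothetical helper, and the merge shown is a plain append with no de-duplication or size checks.

```python
import copy
from typing import Any, Dict, Optional


def path_b_output(
    level: str,
    new_statement: Dict[str, Any],
    existing_document: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the JSON payloads that Path B can offer at this level."""
    output: Dict[str, Any] = {"incremental": new_statement}
    if level in ("L0", "L1") or existing_document is None:
        # Existing policies could not be read; the user merges manually.
        return output
    merged = copy.deepcopy(existing_document)  # leave the input untouched
    merged.setdefault("Statement", []).append(new_statement)
    output["merged"] = merged
    return output
```

Deep-copying the existing document keeps the original available for the undo description in the change summary.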

Path C: provide the RAM console entry (https://ram.console.aliyun.com/policies) and step-by-step instructions for completing the change in the console UI.

After repair, suggest the user retry the previously failed operation. Offer to retry on their behalf if requested.
