alibabacloud-dataworks-metadata
DataWorks Metadata
Browse and curate DataWorks metadata via Data Map: catalogs, databases, tables, columns, partitions, lineage, datasets & versions, and metadata collections. Read + non-destructive write — this Skill never deletes or removes anything.
Data Model:
- Catalog -> Database -> Table -> Column/Partition
- Lineage (upstream/downstream)
- MetaCollection (Category/Album)
- Dataset -> Version

Prerequisites
Aliyun CLI >= 3.3.3 required. Run `aliyun version` to verify the version is `>= 3.3.3`. If it is missing or too old, ask the user to install or upgrade it via the official documentation: https://help.aliyun.com/document_detail/121541.html (or see `references/cli-installation-guide.md`).

[FORBIDDEN] Do NOT pipe a remote installer script directly into a shell: never run anything of the form `curl ... | bash` / `curl ... | sh` / `wget ... | bash`. Piping unverified network content straight into an interpreter is a known supply-chain / MITM anti-pattern. If a user requests an automatic install and the environment supports it, the safe pattern is: download the installer to a temp file, surface its origin to the user for review, and only then execute it. Otherwise, defer to the official package manager / installer URL above.

DataWorks plugin install. Product name is `dataworks-public` (not `dataworks`). Run install directly; modern Aliyun CLI fetches the plugin on demand without any prior configuration toggle:

    aliyun plugin install --names dataworks-public

Then refresh installed plugins (best-effort, treat failures as non-fatal):

    aliyun plugin update

[FORBIDDEN] Do NOT run `aliyun configure set --auto-plugin-install true`. Even when the Aliyun CLI prints an interactive tip such as `Tip: Run 'aliyun configure set --auto-plugin-install true' to skip this prompt.`, ignore the tip. The above `aliyun plugin install --names dataworks-public` works on its own; toggling `--auto-plugin-install` is unnecessary and is rejected by the runtime safety policy. Likewise, do NOT run any other `aliyun configure set ...` variant inside this Skill.

[MUST] AI-Mode & User-Agent. Before running any `aliyun dataworks-public` command in this Skill, enable AI-Mode and bind the Skill's user-agent for the session (run once at session start):

    aliyun configure ai-mode enable
    aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-dataworks-metadata"

AI-Mode applies the user-agent globally, so individual commands below do NOT need a `--user-agent` flag.

[MUST] Disable AI-Mode on completion. When the Skill's work is finished (or before yielding control back to the user for unrelated tasks), disable AI-Mode to avoid leaking the Skill's user-agent into subsequent CLI usage:

    aliyun configure ai-mode disable

Credentials. Run `aliyun configure list` to check for a valid profile.

Security: NEVER read/echo/print AK/SK values. NEVER pass literal credentials in CLI commands.

`aliyun configure` allow-list. Inside this Skill, the ONLY permitted `aliyun configure ...` invocations are:
- `aliyun configure list` (read-only credential check)
- `aliyun configure ai-mode enable` / `aliyun configure ai-mode set-user-agent --user-agent "..."` / `aliyun configure ai-mode disable` (AI-Mode lifecycle, defined above)

All other `aliyun configure set ...` / `aliyun configure delete ...` / `aliyun configure ...` forms are forbidden. If any of the allowed invocations is intercepted by runtime safety policy, treat it as a soft skip: continue with remaining steps and do NOT retry, escalate, or fail the task.

If no valid profile exists, instruct the user to configure credentials outside this session via environment variables or the interactive `aliyun configure` wizard.
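The version requirement above can be checked mechanically. The following is a minimal sketch, assuming `aliyun version` prints a plain numeric dotted version string such as `3.3.3`; the helper name `version_ge` is hypothetical and not part of the Aliyun CLI.

```bash
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Hypothetical helper. Assumes purely numeric fields; compares the first three.
version_ge() (
  a="$1.0.0"   # pad so that short versions like "4.0" still have three fields
  b="$2.0.0"
  for i in 1 2 3; do
    x=$(printf '%s' "$a" | cut -d. -f"$i")
    y=$(printf '%s' "$b" | cut -d. -f"$i")
    [ "$x" -gt "$y" ] && exit 0
    [ "$x" -lt "$y" ] && exit 1
  done
  exit 0       # all three fields equal
)
```

A possible use is `version_ge "$(aliyun version)" 3.3.3 || echo "Aliyun CLI is missing or older than 3.3.3"`, after which the user is pointed to the official installation documentation rather than any piped installer.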
Rules
- [MUST] No destructive operations. This Skill MUST NOT invoke any `delete-*` / `remove-*` DataWorks API. Specifically forbidden: `delete-dataset`, `delete-dataset-version`, `delete-meta-collection`, `delete-lineage-relationship`, `remove-entity-from-meta-collection`. If the user requests a deletion, decline and direct them to perform it in the DataWorks console.
- [MUST] CLI timeouts. Every `aliyun dataworks-public` invocation in this Skill MUST include both `--read-timeout 60` and `--connect-timeout 10` (seconds) to prevent commands from hanging indefinitely. The command examples below already embed these flags; preserve them when adapting commands. If a request times out, surface the error to the user and do NOT silently retry more than once.
- [MUST] Idempotency for create-/add- operations. Before invoking any `create-*` or `add-entity-*` command, perform a check-then-act: list or get to verify the target does not already exist (e.g. before `create-dataset` call `list-datasets` and match by `--name` + `--project-id`; before `add-entity-into-meta-collection` call `list-entities-in-meta-collection` and match by entity id). If a previous attempt already succeeded, return the existing resource id instead of creating a duplicate. On retry after a transient error, prefer re-checking state over blindly re-issuing the create.
- [MUST] User confirmation before any write. For any `update-*`, `create-*`, `add-entity-*`, or `register-*` (lineage) command, restate the exact target (Region / Project / Id / Name / new field values) to the user and obtain explicit confirmation BEFORE executing. Do not assume defaults; do not chain multiple writes without intermediate confirmation when the user has not pre-approved the full plan.
- All CLI flags use kebab-case (lowercase with hyphens). Always use exactly the flag names shown in the command examples below. Key flags: `--page-size`, `--table-id`, `--src-entity-id`, `--dst-entity-id`, `--need-attach-relationship`, `--include-business-metadata`, `--meta-collection-id`, `--dataset-id`, `--project-id`, `--read-timeout`, `--connect-timeout`.
- Entity IDs follow `${EntityType}:${InstanceId}:${CatalogId}:${DatabaseName}:${SchemaName}:${TableName}`. See `references/entity-id-formats.md`. Common MaxCompute: `maxcompute-table:::project_name::table_name` (no schema) or `maxcompute-table:::project_name:schema_name:table_name` (with schema). When the user gives `project.table`, try no-schema first; if not found, retry with schema `default`.
- Parameter confirmation. Confirm all user-customizable parameters (RegionId, entity IDs, etc.) before executing. Do NOT assume defaults.
- Permission errors. Read `references/ram-policies.md`, guide the user to grant permissions, and wait for confirmation before retrying.
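The first two rules (no destructive subcommands, always pass the timeout pair) lend themselves to a small guard. This is a sketch only: the wrapper name `dw` is hypothetical, and it assumes the subcommand is always the first argument.

```bash
# dw SUBCOMMAND [FLAGS...]: hypothetical wrapper around `aliyun dataworks-public`.
# Refuses delete-*/remove-* subcommands and appends the mandatory timeout pair.
dw() {
  case "$1" in
    delete-*|remove-*)
      echo "refused: '$1' is destructive; use the DataWorks console instead" >&2
      return 2
      ;;
  esac
  aliyun dataworks-public "$@" --read-timeout 60 --connect-timeout 10
}
```

With this in place, `dw list-crawler-types --region <RegionId>` expands to the full command shown in the Commands section, while `dw delete-dataset ...` returns a non-zero status without calling the CLI. The guard does not replace the user-confirmation rule for writes.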
Commands
All commands require `--region <RegionId>` and the timeout pair `--read-timeout 60 --connect-timeout 10`. The user-agent is set globally via AI-Mode in the Prerequisites section, so no per-command `--user-agent` flag is needed below. All list commands support `--page-number` and `--page-size`.
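Since every list command accepts `--page-number` and `--page-size`, walking a full result set is a matter of repeating the call once per page. The sketch below assumes two things the text does not state: that page numbering starts at 1, and that the total item count has already been read from the first response (the response field name is deliberately not assumed). Both helper names are hypothetical.

```bash
# page_count TOTAL PAGE_SIZE: number of pages needed (ceiling division).
# PAGE_SIZE must be greater than zero.
page_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# for_each_page TOTAL PAGE_SIZE CMD...: run CMD once per page, appending
# --page-number N --page-size PAGE_SIZE. Stops at the first failing page.
for_each_page() (
  total=$1
  size=$2
  shift 2
  pages=$(page_count "$total" "$size")
  n=1
  while [ "$n" -le "$pages" ]; do
    "$@" --page-number "$n" --page-size "$size" || exit 1
    n=$((n + 1))
  done
)
```

For example, with 45 tables and a page size of 20, `for_each_page 45 20 aliyun dataworks-public list-tables --region <RegionId> --parent-meta-entity-id "<ParentMetaEntityId>" --read-timeout 60 --connect-timeout 10` would issue three calls.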
1. Catalog & Entity Browsing
    # List crawler types
    aliyun dataworks-public list-crawler-types --region <RegionId> --read-timeout 60 --connect-timeout 10

    # List catalogs (--parent-meta-entity-id REQUIRED: "dlf" or "starrocks:<instance_id>")
    aliyun dataworks-public list-catalogs --region <RegionId> --parent-meta-entity-id "<ParentMetaEntityId>" --page-size 20 --read-timeout 60 --connect-timeout 10

    # Get database / table details
    aliyun dataworks-public get-database --region <RegionId> --id <DatabaseId> --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-table --region <RegionId> --id <TableId> --include-business-metadata true --read-timeout 60 --connect-timeout 10

    # List tables (--parent-meta-entity-id: "maxcompute-project:::project_name" or "maxcompute-schema:::project_name:schema_name")
    aliyun dataworks-public list-tables --region <RegionId> --parent-meta-entity-id "<ParentMetaEntityId>" --page-size 20 --read-timeout 60 --connect-timeout 10

    # Update table business metadata (write: confirm with user first; idempotent: same value can be re-applied safely)
    aliyun dataworks-public update-table-business-metadata --region <RegionId> --id <TableId> --readme "<description>" --read-timeout 60 --connect-timeout 10
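The two MaxCompute forms of `--parent-meta-entity-id` accepted by `list-tables` differ only in whether a schema is present. A small sketch of building the value, using a hypothetical helper name `mc_parent_id`:

```bash
# mc_parent_id PROJECT [SCHEMA]: build a --parent-meta-entity-id value for list-tables.
# Hypothetical helper; formats are the two MaxCompute forms quoted in this section.
mc_parent_id() {
  if [ -n "${2:-}" ]; then
    printf 'maxcompute-schema:::%s:%s\n' "$1" "$2"
  else
    printf 'maxcompute-project:::%s\n' "$1"
  fi
}
```

The result can be passed directly, for instance `--parent-meta-entity-id "$(mc_parent_id project_name)"`.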
2. Columns & Partitions
    # List / Get columns
    aliyun dataworks-public list-columns --region <RegionId> --table-id <TableId> --page-size 50 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-column --region <RegionId> --id <ColumnId> --read-timeout 60 --connect-timeout 10

    # Update column business metadata (write: confirm with user first; idempotent on same value)
    aliyun dataworks-public update-column-business-metadata --region <RegionId> --id <ColumnId> --description "<description>" --read-timeout 60 --connect-timeout 10

    # List / Get partitions (MaxCompute / HMS only)
    aliyun dataworks-public list-partitions --region <RegionId> --table-id <TableId> --page-size 20 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-partition --region <RegionId> --table-id <TableId> --name <PartitionName> --read-timeout 60 --connect-timeout 10
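Write commands such as `update-column-business-metadata` call for explicit user confirmation first. When these commands are driven from a script rather than a conversation, one way to enforce that is a prompt that restates the target and accepts only a literal `yes`. This is a sketch with a hypothetical helper name, not a mechanism provided by the CLI.

```bash
# confirm_write DESCRIPTION: restate the target on stderr and read one line from stdin.
# Succeeds only when the answer is exactly "yes"; end-of-input counts as a refusal.
confirm_write() (
  printf 'About to run a write operation:\n  %s\nType "yes" to proceed: ' "$1" >&2
  IFS= read -r answer || exit 1
  [ "$answer" = "yes" ]
)
```

A guarded call might read `confirm_write "update-column-business-metadata, Region <RegionId>, Id <ColumnId>, new description <description>" && aliyun dataworks-public update-column-business-metadata ...`, so the write is skipped unless the user agrees.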
3. Data Lineage
    # Downstream: use --src-entity-id | Upstream: use --dst-entity-id
    aliyun dataworks-public list-lineages --region <RegionId> --src-entity-id <EntityId> --need-attach-relationship true --page-size 20 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public list-lineages --region <RegionId> --dst-entity-id <EntityId> --need-attach-relationship true --page-size 20 --read-timeout 60 --connect-timeout 10

    # Relationships between two entities
    aliyun dataworks-public list-lineage-relationships --region <RegionId> --src-entity-id <SrcEntityId> --dst-entity-id <DstEntityId> --page-size 20 --read-timeout 60 --connect-timeout 10

    # Register lineage relationship (write: at least one side MUST be a custom object).
    # Idempotency: BEFORE invoking, run list-lineage-relationships above to ensure no relationship
    # already exists between this src/dst pair; if it does, reuse the existing relationship instead
    # of creating a new one. Deletion is out of scope; use the console if you need to revoke.
    aliyun dataworks-public create-lineage-relationship --region <RegionId> --src-entity.id <SrcEntityId> --src-entity.type <EntityType> --dst-entity.id <DstEntityId> --dst-entity.type <EntityType> --read-timeout 60 --connect-timeout 10
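Downstream lineage can be followed across several levels by re-querying with each result as the next `--src-entity-id`. The sketch below shows only the traversal logic as a breadth-first walk with cycle protection. `fetch_downstream` is a hypothetical hook that must print one downstream entity ID per line; here it is a stub graph so the example runs offline, and in practice it would wrap `list-lineages --src-entity-id` and extract IDs from the response (whose shape is not assumed here). Entity IDs are assumed to contain no whitespace.

```bash
# fetch_downstream ENTITY_ID: hypothetical hook, one downstream ID per line.
# Stub graph for illustration: ods -> dwd -> dws -> ads, plus a back-edge dws -> dwd.
fetch_downstream() {
  case "$1" in
    ods) echo dwd ;;
    dwd) echo dws ;;
    dws) echo ads; echo dwd ;;
  esac
}

# trace_downstream START: breadth-first walk printing each reachable entity once.
trace_downstream() (
  queue=$1        # space-separated FIFO of entities still to expand
  seen=" $1 "     # space-delimited set of entities already discovered
  while [ -n "$queue" ]; do
    current=${queue%% *}
    case "$queue" in
      *" "*) queue=${queue#* } ;;
      *) queue= ;;
    esac
    for next in $(fetch_downstream "$current"); do
      case "$seen" in
        *" $next "*) ;;   # already visited: skip to avoid looping on cycles
        *)
          seen="$seen$next "
          queue="${queue:+$queue }$next"
          echo "$next"
          ;;
      esac
    done
  done
)
```

With the stub graph, `trace_downstream ods` prints `dwd`, `dws`, and `ads` in that order. Swapping the hook for one that follows `--dst-entity-id` would give the upstream direction instead.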
4. Datasets & Versions
    # List / Get datasets (read)
    aliyun dataworks-public list-datasets --region <RegionId> --project-id <ProjectId> --page-size 20 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-dataset --region <RegionId> --id <DatasetId> --read-timeout 60 --connect-timeout 10

    # Create dataset (write). Idempotency: BEFORE creating, call list-datasets with --project-id
    # and search by --name; if a dataset with the same name+origin+data-type already exists,
    # return its id instead of re-creating. --init-version is REQUIRED, JSON with
    # Comment/Url/MountPath. Deletion is out of scope.
    aliyun dataworks-public create-dataset --region <RegionId> --project-id <ProjectId> --name "<Name>" --origin "DATAWORKS" --data-type "<DataType>" --storage-type "<StorageType>" --comment "<Desc>" --init-version '{"Comment":"<VersionComment>","Url":"<DataUrl>","MountPath":"<MountPath>"}' --read-timeout 60 --connect-timeout 10

    # Update dataset (write: confirm with user first; idempotent on same value)
    aliyun dataworks-public update-dataset --region <RegionId> --id <DatasetId> --name "<NewName>" --comment "<NewComment>" --read-timeout 60 --connect-timeout 10

    # List / Get / Preview dataset versions (read; max 20 versions per dataset)
    aliyun dataworks-public list-dataset-versions --region <RegionId> --dataset-id <DatasetId> --page-size 20 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-dataset-version --region <RegionId> --id <VersionId> --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public preview-dataset-version --region <RegionId> --id <VersionId> --read-timeout 60 --connect-timeout 10

    # Create dataset version (write). Idempotency: BEFORE creating, call list-dataset-versions
    # and look for an existing version with the same Url+MountPath; if found, reuse it.
    # Quota: max 20 versions per dataset. Deletion is out of scope.
    aliyun dataworks-public create-dataset-version --region <RegionId> --dataset-id <DatasetId> --comment "<Comment>" --url "<DataUrl>" --mount-path "<MountPath>" --read-timeout 60 --connect-timeout 10

    # Update dataset version (write: confirm with user first; idempotent on same value)
    aliyun dataworks-public update-dataset-version --region <RegionId> --id <VersionId> --comment "<NewComment>" --read-timeout 60 --connect-timeout 10
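The check-then-act pattern required before `create-dataset` and `create-dataset-version` has the same shape every time: look up first, create only on a miss. A generic sketch follows. `ensure_resource` and its two callback arguments are hypothetical names; the lookup callback is expected to print the existing resource id (or nothing), and how it parses the `list-*` response is left to the caller.

```bash
# ensure_resource FIND_CMD CREATE_CMD: check-then-act.
#   FIND_CMD   prints the id of an existing match, or nothing when there is none.
#   CREATE_CMD creates the resource and prints the new id.
# A failing lookup aborts without creating, so state is never guessed.
ensure_resource() (
  existing=$("$1") || exit 1
  if [ -n "$existing" ]; then
    printf '%s\n' "$existing"   # reuse the existing resource; no duplicate is created
  else
    "$2"
  fi
)
```

The same function covers a retry after a timeout: calling it again re-runs the lookup, so a create that actually succeeded on the first attempt is detected instead of repeated.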
5. Metadata Collections
    # List / Get collections (read; type: Category or Album, PascalCase, NOT uppercase)
    aliyun dataworks-public list-meta-collections --region <RegionId> --type "<Category|Album>" --page-size 20 --read-timeout 60 --connect-timeout 10
    aliyun dataworks-public get-meta-collection --region <RegionId> --id <CollectionId> --read-timeout 60 --connect-timeout 10

    # Create collection (write). Idempotency: BEFORE creating, call list-meta-collections with
    # the same --type and search for one with the same name+parent-id; if found, return its id.
    # Deletion is out of scope.
    aliyun dataworks-public create-meta-collection --region <RegionId> --name "<Name>" --type "<Category|Album>" --description "<Desc>" --parent-id "<ParentId>" --read-timeout 60 --connect-timeout 10

    # Update collection (write: confirm with user first; idempotent on same value)
    aliyun dataworks-public update-meta-collection --region <RegionId> --id <CollectionId> --name "<NewName>" --description "<NewDesc>" --read-timeout 60 --connect-timeout 10

    # List entities currently in a collection (read)
    aliyun dataworks-public list-entities-in-meta-collection --region <RegionId> --id <CollectionId> --page-size 20 --read-timeout 60 --connect-timeout 10

    # Add entity into collection (write). Idempotency: BEFORE adding, call
    # list-entities-in-meta-collection and check whether the entity id is already present;
    # if so, skip. Removal is out of scope.
    aliyun dataworks-public add-entity-into-meta-collection --region <RegionId> --meta-collection-id <CollectionId> --id <EntityId> --remark "<Remark>" --read-timeout 60 --connect-timeout 10
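Because `--type` only accepts the PascalCase values `Category` and `Album`, a quick local check can catch a wrongly cased value before any request is sent. A minimal sketch with a hypothetical helper name:

```bash
# valid_collection_type TYPE: succeed only for the exact values Category or Album.
valid_collection_type() {
  case "$1" in
    Category|Album) return 0 ;;
    *)
      echo "invalid --type '$1': expected Category or Album (PascalCase)" >&2
      return 1
      ;;
  esac
}
```

For example, `valid_collection_type "$type" && aliyun dataworks-public list-meta-collections --type "$type" ...` skips the call for values such as `ALBUM` or `category`.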
Tips
- Direct access: For MaxCompute, construct the entity ID directly (`maxcompute-table:::project::table`) and call `get-table`; there is no need to browse from catalogs.
- Lineage direction: `--src-entity-id` = downstream, `--dst-entity-id` = upstream. For full impact analysis, recursively query each downstream entity to trace multi-level lineage (ODS->DWD->DWS->ADS).
- Schema fallback: If a MaxCompute table is not found, retry with schema `:default:` (three-level model).
- Limits: Max 20 versions per dataset; Album operations require `AliyunDataWorksFullAccess` or creator/admin; max 2000 datasets per tenant.
- Deletions: Out of scope. If the user asks to delete a dataset/version/collection/lineage relationship or remove an entity from a collection, decline and tell them to use the DataWorks console.
- Retry safely: When a write times out or returns an ambiguous error, do NOT blindly retry. Re-check state with the matching `list-*` / `get-*` first to detect partial success, then decide whether to retry or accept.
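The direct-access and schema-fallback tips combine into a fixed lookup order for a user-supplied `project.table` name. The sketch below only generates the candidate entity IDs in that order; the helper name `mc_table_candidates` is hypothetical, and it assumes the input contains exactly one dot separating project and table.

```bash
# mc_table_candidates PROJECT.TABLE: print candidate MaxCompute table entity IDs
# in lookup order: first without a schema, then with the "default" schema.
mc_table_candidates() (
  project=${1%%.*}   # text before the first dot
  table=${1#*.}      # text after the first dot
  printf 'maxcompute-table:::%s::%s\n' "$project" "$table"
  printf 'maxcompute-table:::%s:default:%s\n' "$project" "$table"
)
```

A caller would try `get-table --id` with the first line and move to the second only when the table is reported as not found.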
References
| File | Description |
|---|---|
| `references/entity-id-formats.md` | Entity ID formats for all data source types |
|  | Complete CLI command reference (read + non-destructive write subset exposed by this Skill) |
| `references/ram-policies.md` | Required RAM permissions (read + non-destructive write) |
|  | Success verification steps |