alibabacloud-data-agent-skill

metadata: author: DataAgent Team version: "1.7.2"

Changelog

  • v1.7.2: Use Alibaba Cloud default credential chain instead of explicit AK/SK, add User-Agent header, fix RAM policy wildcard issues
  • v1.7.1: Fix API response parsing in the CLI `ls` command (support case-insensitive field names), optimize SKILL documentation structure, split out a separate ANALYSIS mode specification document
  • v1.7.0: API_KEY authentication support, native async execution mode, session isolation, enhanced attach mode, optimized log output


Installation

Configure Credentials

This Skill authenticates with the Alibaba Cloud default credential chain (recommended) or with an API_KEY.

Option 1: Default Credential Chain (Recommended)

The Skill relies on the Alibaba Cloud SDK's default credential chain to obtain credentials automatically. The chain supports environment variables, configuration files, instance roles, and other sources.

Option 2: API_KEY Authentication (File Analysis Only)

```bash
export DATA_AGENT_API_KEY=your-api-key
export DATA_AGENT_REGION=cn-hangzhou
```
Get API_KEY: Data Agent Console
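Before running the CLI it can help to check which of the two options the current environment is set up for. The sketch below is a hypothetical preflight helper, not part of the Skill. `DATA_AGENT_API_KEY` comes from this document; `ALIBABA_CLOUD_ACCESS_KEY_ID` and `ALIBABA_CLOUD_ACCESS_KEY_SECRET` are assumed here to be the variables the default credential chain reads from the environment. The chain's other sources (configuration files, instance roles) are not inspected, and the helper makes no claim about which source the Skill would prefer.

```python
import os
from typing import List, Mapping, Optional


def detect_auth_sources(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the credential sources that appear to be configured in `env`.

    Hypothetical helper: it only reports what is present in the environment.
    """
    env = os.environ if env is None else env
    found = []
    # Option 2 from this document: API_KEY authentication.
    if env.get("DATA_AGENT_API_KEY"):
        found.append("api_key")
    # Assumed environment-variable source of the default credential chain.
    if env.get("ALIBABA_CLOUD_ACCESS_KEY_ID") and env.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET"):
        found.append("default_chain_env")
    return found
```

Calling `detect_auth_sources()` with no argument inspects the current process environment; an empty list only means nothing was found in environment variables, so the credential chain may still succeed through another source.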

Permission Requirements

RAM users need the `AliyunDMSFullAccess` or `AliyunDMSDataAgentFullAccess` permission. See RAM-POLICIES.md for detailed permission information.

Debug Mode

```bash
DATA_AGENT_DEBUG_API=1 python3 scripts/data_agent_cli.py file example.csv -q "analyze"
```
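When the CLI is launched from another Python script rather than from a shell, the same effect can be had by setting the variable only in the child process's environment. This is a minimal sketch under that assumption; `debug_env` and `run_debug` are hypothetical helper names, and only `DATA_AGENT_DEBUG_API`, the script path, and the `file ... -q` arguments come from this document.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional


def debug_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and switch on DATA_AGENT_DEBUG_API in the copy only."""
    env = dict(os.environ if base is None else base)
    env["DATA_AGENT_DEBUG_API"] = "1"
    return env


def run_debug(args: List[str]) -> int:
    """Run scripts/data_agent_cli.py with debug enabled and return its exit code."""
    cmd = [sys.executable, "scripts/data_agent_cli.py", *args]
    return subprocess.run(cmd, env=debug_env()).returncode
```

For example, `run_debug(["file", "example.csv", "-q", "analyze"])` mirrors the shell command above without changing the parent process's environment.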

💡 Getting Started Tips

  • Use the built-in demo database `internal_data_employees` (DataAgent's built-in test database containing employee, department, and salary data) for a first-time experience
  • Or use the local file `assets/example_game_data.csv` to try file analysis

Data Agent CLI — Unified Command-Line Data Analysis Tool

Overview

`scripts/data_agent_cli.py` helps users complete the full workflow: discover data → initiate analysis → track progress → get results.

Core Concepts

⚠️ Key Prerequisite: Data Agent can only analyze databases that have been imported into the Data Agent Data Center.
  • Data Center: Data Agent's own data center; only databases that appear here can be analyzed
  • DMS: Alibaba Cloud Data Management Service, which stores metadata for all databases
  • Relationship: a database registered in DMS is not automatically in Data Center (databases in DMS ≠ databases in Data Center)
Usage Flow:
  1. First use `ls` to check whether the target database exists in Data Center
  2. If it is not found, use the `dms` subcommand to search for the database info, then use the `import` subcommand to import it
  3. After a successful import, use the `db` subcommand for analysis
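The usage flow above amounts to a small decision: the only branch is whether `ls` already shows the target database. A minimal sketch of that ordering follows; `plan_subcommands` is a hypothetical helper, and it returns subcommand names only, since the arguments of `dms` and `import` are not described in this document.

```python
from typing import List


def plan_subcommands(in_data_center: bool) -> List[str]:
    """Return the CLI subcommands to run, in order, for one target database.

    `in_data_center` is the result of inspecting the `ls` output, which the
    caller determines separately.
    """
    steps = ["ls"]  # step 1: always check Data Center first
    if not in_data_center:
        # step 2: look the database up in DMS, then import it
        steps += ["dms", "import"]
    steps.append("db")  # step 3: analyze
    return steps
```

See COMMANDS.md for the actual arguments each subcommand takes.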

Analysis Modes

  • ASK_DATA (default): synchronous execution with sub-second response, suitable for quick Q&A
  • ANALYSIS: deep analysis that takes 5-40 minutes; it must run asynchronously, either by spawning a sub-agent or by passing the `--async-run` parameter
See ANALYSIS_MODE.md for details.

Session Reuse

Use `db` / `file` to create a session for the initial analysis, then use `attach --session-id <ID>` to reuse the session for follow-up questions.
See COMMANDS.md and WORKFLOWS.md for details.

Quick Start

```bash
# 1. List available databases
python3 scripts/data_agent_cli.py ls

# 2. Query analysis (synchronous response)
python3 scripts/data_agent_cli.py db \
  --dms-instance-id <ID> --dms-db-id <ID> \
  --instance-name <NAME> --db-name <DB> \
  --tables "employees,departments" -q "Which department has the highest average salary"

# 3. Follow-up question (reuse session)
python3 scripts/data_agent_cli.py attach --session-id <ID> -q "Break down by month"
```

> 📖 See [WORKFLOWS.md](references/WORKFLOWS.md) and [COMMANDS.md](references/COMMANDS.md) for complete workflows, command reference, and best practices

---
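When the Quick Start commands are driven from a script, building the argument list in code avoids shell-quoting problems with questions that contain spaces or quotes. The sketch below uses only the subcommands and flags shown in the Quick Start; the function names `build_db_command` and `build_attach_command` are hypothetical, and joining table names with commas is an assumption taken from the `--tables "employees,departments"` example.

```python
import sys
from typing import List, Sequence

CLI = "scripts/data_agent_cli.py"  # entry point named in this document


def build_db_command(dms_instance_id: str, dms_db_id: str, instance_name: str,
                     db_name: str, tables: Sequence[str], question: str) -> List[str]:
    """Argument list for a `db` query; tables are comma-joined as in the example."""
    return [
        sys.executable, CLI, "db",
        "--dms-instance-id", dms_instance_id,
        "--dms-db-id", dms_db_id,
        "--instance-name", instance_name,
        "--db-name", db_name,
        "--tables", ",".join(tables),
        "-q", question,
    ]


def build_attach_command(session_id: str, question: str) -> List[str]:
    """Argument list for a follow-up question on an existing session."""
    return [sys.executable, CLI, "attach", "--session-id", session_id, "-q", question]
```

Either list can be passed to `subprocess.run(cmd, check=True)` from the Skill root directory; because no shell is involved, the question text needs no extra escaping.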

Project Structure

```
                          # Skill root directory
├── SKILL.md              # This document
├── scripts/              # Source code
│   ├── data_agent/       # SDK module
│   ├── cli/              # CLI module
│   ├── data_agent_cli.py # CLI entry point
│   └── requirements.txt  # Dependencies
├── sessions/             # Session data
└── references/           # Reference documents
```