alibabacloud-data-agent-skill
metadata: author: DataAgent Team version: "1.7.2"
Changelog
- v1.7.2: Use Alibaba Cloud default credential chain instead of explicit AK/SK, add User-Agent header, fix RAM policy wildcard issues
- v1.7.1: Fix CLI `ls` command API response parsing (support case-insensitive field names), optimize SKILL documentation structure, separate ANALYSIS mode specification document
- v1.7.0: API_KEY authentication support, native async execution mode, session isolation, enhanced attach mode, optimized log output
Installation
Configure Credentials
This Skill uses Alibaba Cloud default credential chain (recommended) or API_KEY authentication.
Option 1: Default Credential Chain (Recommended)
The Skill uses the Alibaba Cloud SDK's default credential chain to obtain credentials automatically. The chain supports environment variables, configuration files, instance roles, and other sources.
See the Alibaba Cloud credential chain documentation.
Option 2: API_KEY Authentication (File Analysis Only)
```bash
export DATA_AGENT_API_KEY=your-api-key
export DATA_AGENT_REGION=cn-hangzhou
```

Get API_KEY: Data Agent Console
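As a rough illustration of the two options above, a wrapper script could check whether `DATA_AGENT_API_KEY` is set to tell which authentication path is configured. The helper name `pick_auth_mode`, the returned labels, and the precedence (API key first) are assumptions made for this sketch; the CLI's actual selection logic may differ.

```python
import os
from typing import Mapping, Optional


def pick_auth_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a label for the authentication option that appears configured.

    Hypothetical helper: the labels and the precedence are assumptions
    for illustration, not the CLI's documented behavior.
    """
    env = os.environ if env is None else env
    if env.get("DATA_AGENT_API_KEY"):
        # Option 2: API_KEY authentication (file analysis only).
        return "api_key"
    # Option 1: rely on the Alibaba Cloud default credential chain.
    return "default_credential_chain"
```

Calling `pick_auth_mode()` with no argument inspects the current process environment; passing a plain dict makes the check easy to test.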
Permission Requirements
RAM users need `AliyunDMSFullAccess` or `AliyunDMSDataAgentFullAccess` permissions.
See RAM-POLICIES.md for detailed permission information.
Debug Mode
```bash
DATA_AGENT_DEBUG_API=1 python3 scripts/data_agent_cli.py file example.csv -q "analyze"
```

💡 Getting Started Tips
- Use the built-in demo database `internal_data_employees` (DataAgent's built-in test database containing employee, department, and salary data) for a first-time experience
- Or use the local file `assets/example_game_data.csv` to try file analysis
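For scripted runs, the file-analysis command and the debug switch described above could be assembled from Python. This is a minimal sketch: the helper `file_analysis_command` and its `base_env` parameter are hypothetical, and only the `file` subcommand, the `-q` flag, and the `DATA_AGENT_DEBUG_API` variable come from this document.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

CLI = "scripts/data_agent_cli.py"


def file_analysis_command(
    path: str,
    question: str,
    debug: bool = False,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argv list and environment for a `file` analysis run."""
    argv = ["python3", CLI, "file", path, "-q", question]
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if debug:
        # Same switch as the Debug Mode example.
        env["DATA_AGENT_DEBUG_API"] = "1"
    return argv, env
```

The result can be handed to `subprocess.run(argv, env=env, check=True)`; passing the arguments as a list avoids shell quoting problems in the question text.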
Data Agent CLI — Unified Command-Line Data Analysis Tool
Overview
`scripts/data_agent_cli.py`
Core Concepts
⚠️ Key Prerequisite: Data Agent can only analyze databases that have been imported into Data Agent Data Center.
- Data Center: Data Agent's data center; only databases here can be analyzed
- DMS: Alibaba Cloud Data Management Service, stores metadata of all databases
- Relationship: Databases registered in DMS ≠ Databases in Data Center
Usage Flow:
- First use `ls` to check if the target database exists in Data Center
- If not found, use the `dms` subcommand to search for database info, then use the `import` subcommand to import it
- After a successful import, you can use the `db` subcommand for analysis
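The usage flow above amounts to a small decision: once `ls` shows whether the target database is in Data Center, the remaining subcommands follow. The sketch below captures only that ordering. The function name `next_subcommands` is hypothetical, and the arguments each subcommand takes are not covered here (see COMMANDS.md).

```python
from typing import List


def next_subcommands(in_data_center: bool) -> List[str]:
    """Return the CLI subcommands to run, in order, for one target database.

    `in_data_center` is the outcome of checking the `ls` output.
    """
    if in_data_center:
        # Already imported: analysis can start right away.
        return ["db"]
    # Not in Data Center yet: look it up in DMS, import it, then analyze.
    return ["dms", "import", "db"]
```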
Analysis Modes
- ASK_DATA (default): Synchronous execution, sub-second response, suitable for quick Q&A
- ANALYSIS: Deep analysis that takes 5-40 minutes; requires spawning a sub-agent for async execution or using the `--async-run` parameter
See ANALYSIS_MODE.md for details
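One way to keep the two modes apart in a wrapper script is to map each mode to the extra arguments it needs. This sketch covers only the `--async-run` flag named above; how the mode itself is selected on the command line is not shown in this document, so that part is left out. The helper name `extra_args_for_mode` is hypothetical.

```python
from typing import List


def extra_args_for_mode(mode: str) -> List[str]:
    """Return extra CLI arguments suggested for the given analysis mode."""
    if mode == "ASK_DATA":
        # Synchronous and fast: no extra flags needed.
        return []
    if mode == "ANALYSIS":
        # Long-running (5-40 minutes): run asynchronously.
        return ["--async-run"]
    raise ValueError(f"unknown analysis mode: {mode!r}")
```

Raising on an unknown mode keeps a typo from silently falling back to a blocking run.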
Session Reuse
Use `db` / `file` to create a session for initial analysis, then use `attach --session-id <ID>` to reuse the session for follow-up questions.
See COMMANDS.md and WORKFLOWS.md for details
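A follow-up question on an existing session could be built as shown below. This is a sketch under the assumption that the session ID has already been obtained from an earlier `db` or `file` run; the helper name `attach_command` is hypothetical, and only the `attach`, `--session-id`, and `-q` arguments come from this document.

```python
from typing import List

CLI = "scripts/data_agent_cli.py"


def attach_command(session_id: str, question: str) -> List[str]:
    """Build the argv list for a follow-up question on an existing session."""
    if not session_id.strip():
        # Without a session ID there is nothing to attach to.
        raise ValueError("session_id must not be empty")
    return ["python3", CLI, "attach", "--session-id", session_id, "-q", question]
```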
Quick Start
```bash
# 1. List available databases
python3 scripts/data_agent_cli.py ls

# 2. Query analysis (synchronous response)
python3 scripts/data_agent_cli.py db \
    --dms-instance-id <ID> --dms-db-id <ID> \
    --instance-name <NAME> --db-name <DB> \
    --tables "employees,departments" -q "Which department has the highest average salary"

# 3. Follow-up question (reuse session)
python3 scripts/data_agent_cli.py attach --session-id <ID> -q "Break down by month"
```

> 📖 See [WORKFLOWS.md](references/WORKFLOWS.md) and [COMMANDS.md](references/COMMANDS.md) for complete workflows, command reference, and best practices
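The `db` query from step 2 takes several identifiers, so a script may find it convenient to group them. The following is a minimal sketch: the `DbTarget` dataclass and the `db_command` helper are hypothetical names, while the flags are the ones shown in the quick start.

```python
from dataclasses import dataclass
from typing import List, Sequence

CLI = "scripts/data_agent_cli.py"


@dataclass(frozen=True)
class DbTarget:
    """Identifiers for one database, matching the quick-start flags."""
    dms_instance_id: str
    dms_db_id: str
    instance_name: str
    db_name: str


def db_command(target: DbTarget, tables: Sequence[str], question: str) -> List[str]:
    """Build the argv list for a `db` query."""
    return [
        "python3", CLI, "db",
        "--dms-instance-id", target.dms_instance_id,
        "--dms-db-id", target.dms_db_id,
        "--instance-name", target.instance_name,
        "--db-name", target.db_name,
        # The CLI example passes tables as one comma-separated string.
        "--tables", ",".join(tables),
        "-q", question,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` directly, which sidesteps shell quoting of the question.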
---
Project Structure
```
# Skill root directory
├── SKILL.md                 # This document
├── scripts/                 # Source code
│   ├── data_agent/          # SDK module
│   ├── cli/                 # CLI module
│   ├── data_agent_cli.py    # CLI entry point
│   └── requirements.txt     # Dependencies
├── sessions/                # Session data
└── references/              # Reference documents
```