huggingface-import
HuggingFace to Coval Test Set Import
Import `$ARGUMENTS` from HuggingFace and convert it into Coval test sets with properly structured test cases.
Coval Context
Coval is an AI evaluation platform for testing voice and conversational AI agents. It runs simulations against AI agents and measures performance with configurable metrics.
| Concept | Description |
|---|---|
| Test Set | A collection of test cases, grouped by category or evaluation purpose |
| Test Case | A single evaluation scenario with |
| Persona | High-level user character (system prompt) - separate from test cases |
| Agent | The AI system being evaluated |
Key distinction:
- Persona = WHO is asking (character, traits)
- Test Case = WHAT they ask (prompts, scenarios)
Coval API
Base URL: https://api.coval.dev/v1
Fetch the OpenAPI spec before making API calls:
- List specs (no auth)
- Fetch specific spec
Workflow
Step 1: Identify the HuggingFace Source
If `$ARGUMENTS` is provided, navigate to it. Otherwise ask:
What is the HuggingFace repository, space, or dataset you want to import?
Then:
- Navigate to the HuggingFace source
- Find data files (CSV, JSON, Parquet)
- Examine structure and fields
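As a rough illustration of the "find data files" step, the sketch below filters a repository file listing down to the data formats named above. The function name `find_data_files` and the example listing are hypothetical; in practice the listing might come from something like `huggingface_hub.list_repo_files(repo_id, repo_type="dataset")`. Treating `.jsonl` as a data file is an assumption beyond the three formats listed.

```python
from pathlib import PurePosixPath

# Extensions for the formats named in Step 1 (CSV, JSON, Parquet).
# ".jsonl" is an assumption: JSON Lines files are common on the Hub.
DATA_EXTENSIONS = {".csv", ".json", ".jsonl", ".parquet"}


def find_data_files(repo_files):
    """Return the repo paths whose extension marks them as data files."""
    return sorted(
        path
        for path in repo_files
        if PurePosixPath(path).suffix.lower() in DATA_EXTENSIONS
    )


# Hypothetical listing, used only to show the filtering behaviour.
example_listing = [
    "README.md",
    "data/train.parquet",
    "data/question.jsonl",
    ".gitattributes",
]
print(find_data_files(example_listing))
```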
Step 2: Analyze Data Structure
Report to the user:
- Total records
- Available fields/columns
- Existing categorization
- 2-3 sample records
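The report described above could be assembled along these lines once the records are loaded as a list of dictionaries. This is a minimal sketch: `summarize_records`, its parameters, and the `"uncategorized"` fallback label are illustrative names, not part of Coval or HuggingFace.

```python
from collections import Counter


def summarize_records(records, category_field=None, sample_size=3):
    """Build the Step 2 report: totals, fields, categories, and samples."""
    # Collect field names in first-seen order across all records.
    fields = []
    for record in records:
        for key in record:
            if key not in fields:
                fields.append(key)

    # Count records per category when a category field is supplied.
    categories = Counter()
    if category_field is not None:
        categories.update(
            str(record.get(category_field, "uncategorized")) for record in records
        )

    return {
        "total_records": len(records),
        "fields": fields,
        "categories": dict(categories),
        "samples": records[:sample_size],
    }
```

Calling `summarize_records(records, category_field="category")` returns a dictionary that can be printed or formatted for the user before moving on to field mapping.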
Step 3: Interactive Field Mapping
Ask these questions to map HuggingFace data to Coval format:
Q1: Input Field
Which field contains the question/prompt for the test case `input`?
Q2: Categorization
How should test cases be organized into test sets?
- By existing category field
- Single test set
- Custom logic
Q3: Metadata
Which fields should be preserved in `metadata` JSON? (Recommend: preserve original IDs like `question_id`)
Q4: Multi-turn (if applicable)
How to handle multi-turn conversations?
- First turn only
- Concatenate turns
- Separate test cases per turn
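The three multi-turn options in Q4 can be expressed as a small helper that turns a list of user turns into one or more test-case inputs. This is only a sketch: the function name `expand_turns`, the strategy labels, and the blank-line separator used when concatenating are assumptions chosen for illustration.

```python
def expand_turns(turns, strategy="first"):
    """Map a list of user turns to a list of test-case inputs."""
    if not turns:
        return []
    if strategy == "first":
        # First turn only: one test case, later turns are dropped.
        return [turns[0]]
    if strategy == "concatenate":
        # Concatenate turns: one test case containing every turn.
        return ["\n\n".join(turns)]
    if strategy == "separate":
        # Separate test cases per turn: one test case for each turn.
        return list(turns)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For a two-turn record, `"first"` and `"concatenate"` each yield one input, while `"separate"` yields two.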
Step 4: Generate CSVs
Create Coval-compatible CSVs:
input,metadata
"Your question here","{""question_id"": ""123"", ""source"": ""mt-bench""}"
Requirements:
- `input` column MUST be first
- Proper quote escaping (double quotes)
- `metadata` as valid JSON string
- UTF-8 encoding
- One CSV per category (recommended)
Naming: `{source}_{category}.csv`
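One way to meet these requirements is to let Python's `csv` module handle quote escaping and `json.dumps` produce the metadata string. The sketch below assumes records are dictionaries; `write_coval_csvs`, `slugify`, and all parameter names are hypothetical, and adding a `source` key to the metadata simply mirrors the sample row above.

```python
import csv
import json
import re
from collections import defaultdict
from pathlib import Path


def slugify(value):
    """Lowercase a label and replace unsafe filename characters with '_'."""
    return re.sub(r"[^a-z0-9-]+", "_", str(value).lower()).strip("_") or "uncategorized"


def write_coval_csvs(records, source, out_dir, input_field,
                     category_field=None, metadata_fields=()):
    """Write one {source}_{category}.csv per category; return the paths."""
    # Group records by category, or into a single "all" group.
    groups = defaultdict(list)
    for record in records:
        category = record.get(category_field, "uncategorized") if category_field else "all"
        groups[slugify(category)].append(record)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for category, rows in groups.items():
        path = out_dir / f"{slugify(source)}_{category}.csv"
        # newline="" lets the csv module control line endings; UTF-8 as required.
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)  # doubles embedded quotes automatically
            writer.writerow(["input", "metadata"])  # input column first
            for row in rows:
                metadata = {key: row[key] for key in metadata_fields if key in row}
                metadata["source"] = source
                writer.writerow([row[input_field], json.dumps(metadata, ensure_ascii=False)])
        written.append(path)
    return written
```

A call such as `write_coval_csvs(records, "mt-bench", "out", input_field="prompt", category_field="category", metadata_fields=["question_id"])` would produce files like `mt-bench_math.csv`, with the field names depending on the dataset being imported.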
Step 5: Upload to Coval
Manual: Upload CSVs via Coval dashboard test sets page.
API: Fetch OpenAPI spec and use test set endpoints programmatically.
Common HuggingFace Sources
General Language Understanding
| Dataset | Description |
|---|---|
| MMLU | 15k+ multiple-choice questions across 57 subjects (STEM, humanities, law) |
| GLUE | Sentence-level tasks: sentiment, entailment, linguistic acceptability |
| | Reasoning tests for everyday world knowledge |
| | Common-sense inference and completion |
Reasoning & Problem-Solving
| Dataset | Description |
|---|---|
| GSM8K | ~8k grade-school math word problems (multi-step arithmetic) |
| DROP | Reading comprehension with discrete operations |
| BBH | BigBench Hard - challenging reasoning subset |
Supporting Files
- For a Python transformation example, see examples/huggingface-import.py
Checklist
- Identified input field
- Determined categorization
- Preserved original IDs in metadata
- Proper quote escaping
- Valid JSON in metadata
- Separate CSVs per category
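Some of the checklist items can be checked mechanically before upload. The sketch below validates CSV text against the format rules stated in Step 4 (the `input` column first, `metadata` parseable as JSON). `validate_coval_csv` is a hypothetical helper and does not reflect any validation Coval itself performs.

```python
import csv
import io
import json


def validate_coval_csv(text):
    """Return a list of problems found in CSV text; empty means it passed."""
    problems = []
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows or not rows[0] or rows[0][0] != "input":
        problems.append("first column must be named 'input'")
        return problems

    header = rows[0]
    meta_index = header.index("metadata") if "metadata" in header else None
    # Row numbers count CSV records, with the header as row 1.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(f"row {number}: expected {len(header)} columns, got {len(row)}")
            continue
        if not row[0].strip():
            problems.append(f"row {number}: empty input")
        if meta_index is not None and row[meta_index]:
            try:
                json.loads(row[meta_index])
            except json.JSONDecodeError:
                problems.append(f"row {number}: metadata is not valid JSON")
    return problems
```

An empty list means the file passed these checks; it does not confirm that the categorization or field mapping choices were the right ones.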