huggingface-import

HuggingFace to Coval Test Set Import

Import `$ARGUMENTS` from HuggingFace and convert it into Coval test sets with properly structured test cases.

Coval Context

Coval is an AI evaluation platform for testing voice and conversational AI agents. It runs simulations against AI agents and measures performance with configurable metrics.

| Concept | Description |
| --- | --- |
| Test Set | A collection of test cases, grouped by category or evaluation purpose |
| Test Case | A single evaluation scenario with `input` (prompt) and optional `metadata` |
| Persona | High-level user character (system prompt), kept separate from test cases |
| Agent | The AI system being evaluated |

Key distinction:

Persona = WHO is asking (character, traits)
Test Case = WHAT they ask (prompts, scenarios)

Coval API

Base URL: `https://api.coval.dev/v1`

Fetch the OpenAPI spec before making API calls:

List specs (no auth):

GET https://api.coval.dev/v1/openapi

Fetch a specific spec:

GET https://api.coval.dev/v1/openapi/{spec_name}
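The two requests above could be issued from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `spec_url` and `fetch_json` are hypothetical, and it assumes the endpoints return JSON, which the text above does not confirm; if a spec is served as YAML, the decoding step would need to change. Whether the per-spec endpoint requires authentication is also not stated, so no auth header is sent here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.coval.dev/v1"


def spec_url(spec_name=None):
    """Build the URL for the spec listing, or for one named spec."""
    if spec_name is None:
        return f"{BASE_URL}/openapi"
    # Percent-encode the name so it is safe as a single path segment.
    return f"{BASE_URL}/openapi/{quote(spec_name, safe='')}"


def fetch_json(url, timeout=30):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_json(spec_url())` would request the spec listing; pass one of the returned names to `spec_url` to request an individual spec.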

Workflow

Step 1: Identify the HuggingFace Source

If `$ARGUMENTS` is provided, navigate to it. Otherwise ask:

What is the HuggingFace repository, space, or dataset you want to import?

Then:

Navigate to the HuggingFace source
Find data files (CSV, JSON, Parquet)
Examine structure and fields
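Because the source may arrive as a full URL or as a bare repository id, it can help to normalize it first. The sketch below is one way to do that. The function name `parse_hf_source` is hypothetical, and it assumes `owner/name` style repository ids under the usual `huggingface.co/datasets/...` and `huggingface.co/spaces/...` URL prefixes; legacy single-name dataset ids would need extra handling.

```python
from urllib.parse import urlparse


def parse_hf_source(arg):
    """Return (kind, repo_id) for a HuggingFace URL or a bare repo id."""
    text = arg.strip()
    is_url = "://" in text
    path = urlparse(text).path if is_url else text
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] in ("datasets", "spaces"):
        kind = parts[0][:-1]  # "datasets" -> "dataset", "spaces" -> "space"
        parts = parts[1:]
    else:
        # A URL without a prefix points at a model repo; a bare id is ambiguous.
        kind = "model" if is_url else "unknown"
    # Keep only owner/name and drop trailing segments such as /viewer/...
    return kind, "/".join(parts[:2])
```

When the kind comes back as `"unknown"`, asking the user whether the id refers to a dataset, space, or model avoids guessing.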

Step 2: Analyze Data Structure

Report to the user:

Total records
Available fields/columns
Existing categorization
2-3 sample records
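The report items above can be computed with a few lines once the records are loaded as a list of dictionaries. This is an illustrative sketch: the function name `summarize` and the shape of the returned dictionary are assumptions, not part of Coval or HuggingFace.

```python
from collections import Counter


def summarize(records, category_field=None, n_samples=3):
    """Collect the figures to report: total, fields, categories, samples."""
    fields = []
    for rec in records:
        for key in rec:
            if key not in fields:
                fields.append(key)  # keep first-seen order
    summary = {
        "total": len(records),
        "fields": fields,
        "samples": records[:n_samples],
    }
    if category_field is not None:
        # Count how many records fall under each existing category value.
        summary["categories"] = dict(
            Counter(rec.get(category_field) for rec in records)
        )
    return summary
```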

Step 3: Interactive Field Mapping

Ask these questions to map HuggingFace data to Coval format:

Q1: Input Field

Which field contains the question/prompt for the test case `input`?

Q2: Categorization

How should test cases be organized into test sets?

By existing category field
Single test set
Custom logic

Q3: Metadata

Which fields should be preserved in `metadata` JSON? (Recommend: preserve original IDs like `question_id`)

Q4: Multi-turn (if applicable)

How to handle multi-turn conversations?

First turn only
Concatenate turns
Separate test cases per turn
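Once the answers to Q1, Q3, and Q4 are known, each source record can be converted mechanically. The sketch below shows one possible mapping. The function name `to_test_cases`, the option names `"first"`, `"concat"`, and `"separate"`, and the extra `turn` metadata key are all hypothetical; it also assumes a multi-turn field holds a list of strings.

```python
def to_test_cases(record, input_field, metadata_fields=(), multi_turn="first"):
    """Map one source record to a list of {"input", "metadata"} dicts."""
    value = record[input_field]
    turns = value if isinstance(value, list) else [value]
    if multi_turn == "first":
        inputs = turns[:1]
    elif multi_turn == "concat":
        inputs = ["\n".join(turns)]
    elif multi_turn == "separate":
        inputs = list(turns)
    else:
        raise ValueError(f"unknown multi_turn option: {multi_turn!r}")
    # Preserve only the fields the user asked to keep (Q3).
    metadata = {k: record[k] for k in metadata_fields if k in record}
    cases = []
    for index, text in enumerate(inputs, start=1):
        meta = dict(metadata)
        if multi_turn == "separate" and len(turns) > 1:
            meta["turn"] = index  # hypothetical key recording turn order
        cases.append({"input": text, "metadata": meta})
    return cases
```

Keeping the original ID in `metadata` means every generated test case can be traced back to its source record, including when one record is split into several cases.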

Step 4: Generate CSVs

Create Coval-compatible CSVs:

input,metadata
"Your question here","{""question_id"": ""123"", ""source"": ""mt-bench""}"

Requirements:

`input` column MUST be first
Proper quote escaping (double quotes)
`metadata` as valid JSON string
UTF-8 encoding
One CSV per category (recommended)

Naming:

{source}_{category}.csv
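The requirements above map closely onto Python's standard `csv` and `json` modules, which handle the quote doubling automatically. The following is a sketch rather than a definitive exporter: the function name `write_test_sets`, the `category_key` parameter, and the fallback category `"all"` are assumptions, and category values are used in file names as-is, so they may need sanitizing.

```python
import csv
import json
import os
from collections import defaultdict


def write_test_sets(cases, source, out_dir, category_key="category"):
    """Write one {source}_{category}.csv per category and return the paths."""
    groups = defaultdict(list)
    for case in cases:
        groups[case["metadata"].get(category_key, "all")].append(case)
    paths = []
    for category, group in groups.items():
        path = os.path.join(out_dir, f"{source}_{category}.csv")
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
            writer.writerow(["input", "metadata"])  # input column first
            for case in group:
                # json.dumps yields a valid JSON string; the csv writer
                # then doubles the embedded double quotes.
                meta = json.dumps(case["metadata"], ensure_ascii=False)
                writer.writerow([case["input"], meta])
        paths.append(path)
    return paths
```

With `QUOTE_ALL` the header row is quoted as well, which is equivalent CSV to the unquoted header in the example above.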

Step 5: Upload to Coval

Manual: Upload the CSVs on the test sets page of the Coval dashboard.

API: Fetch the OpenAPI spec and use the test set endpoints programmatically.
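The exact test set endpoints are not listed here, so they should be read from the fetched spec rather than guessed. As a sketch, assuming the spec has been parsed into a Python dictionary with the standard OpenAPI `paths` object, the hypothetical helper `find_operations` lists the operations whose path, summary, or `operationId` mentions a keyword.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def find_operations(spec, keyword):
    """Return sorted (METHOD, path) pairs whose description mentions keyword."""
    needle = keyword.lower()
    matches = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            haystack = " ".join([
                path,
                operation.get("summary") or "",
                operation.get("operationId") or "",
            ]).lower()
            if needle in haystack:
                matches.append((method.upper(), path))
    return sorted(matches)
```

The request body and authentication scheme for each matching operation should likewise be taken from the spec before any upload call is written.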

Common HuggingFace Sources

General Language Understanding

| Dataset | Description |
| --- | --- |
| `cais/mmlu` | 15k+ multiple-choice questions across 57 subjects (STEM, humanities, law) |
| `nyu-mll/glue` | Sentence-level tasks: sentiment, entailment, linguistic acceptability |
| `tau/commonsense_qa` | Reasoning tests for everyday world knowledge |
| `Rowan/hellaswag` | Common-sense inference and completion |

Reasoning & Problem-Solving

| Dataset | Description |
| --- | --- |
| `openai/gsm8k` | ~8k grade-school math word problems (multi-step arithmetic) |
| `ucinlp/drop` | Reading comprehension with discrete operations |
| `lukaemon/bbh` | BigBench Hard: a challenging reasoning subset |
