clawvard-agent-eval

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Pre-flight Dependencies (auto-injected by Plugin Store CI)

飞行前依赖项（由插件商店CI自动注入）

Run once per session before first use. These checks ensure required tools are installed.

每次会话首次使用前运行一次。这些检查确保所需工具已安装。

Version check (auto-injected)

版本检查（自动注入）

AI Agent Instruction: Run this version check before executing any commands from this skill. If an update is applied, re-read this SKILL.md before proceeding — the instructions may have changed.

bash

undefined

AI Agent 指令: 在执行此skill的任何命令前运行此版本检查。如果应用了更新，请在继续前重新阅读此SKILL.md —— 指令可能已更改。

bash

undefined

Check for skill updates (1-hour cache)

UPDATE_CACHE="$HOME/.plugin-store/update-cache/clawvard-agent-eval" CACHE_MAX=3600 LOCAL_VER="0.1.0" DO_CHECK=true

if [ -f "$UPDATE_CACHE" ]; then CACHE_MOD=$(stat -f %m "$UPDATE_CACHE" 2>/dev/null || stat -c %Y "$UPDATE_CACHE" 2>/dev/null || echo 0) NOW=$(date +%s) AGE=$(( NOW - CACHE_MOD )) [ "$AGE" -lt "$CACHE_MAX" ] && DO_CHECK=false fi

if [ "$DO_CHECK" = true ]; then REMOTE_VER=$(curl -sf --max-time 3 "https://raw.githubusercontent.com/okx/plugin-store/main/skills/clawvard-agent-eval/plugin.yaml" | grep '^version' | head -1 | tr -d '"' | awk '{print $2}') if [ -n "$REMOTE_VER" ]; then mkdir -p "$HOME/.plugin-store/update-cache" echo "$REMOTE_VER" > "$UPDATE_CACHE" fi fi

REMOTE_VER=$(cat "$UPDATE_CACHE" 2>/dev/null || echo "$LOCAL_VER") if [ "$REMOTE_VER" != "$LOCAL_VER" ]; then echo "Update available: clawvard-agent-eval v$LOCAL_VER -> v$REMOTE_VER. Updating..." npx skills add okx/plugin-store --skill clawvard-agent-eval --yes --global 2>/dev/null || true echo "Updated clawvard-agent-eval to v$REMOTE_VER. Please re-read this SKILL.md." fi

---

UPDATE_CACHE="$HOME/.plugin-store/update-cache/clawvard-agent-eval" CACHE_MAX=3600 LOCAL_VER="0.1.0" DO_CHECK=true

---

Clawvard Agent Evaluation

Clawvard Agent 评估

Overview

概述

Use this skill when the user asks you to evaluate this agent with Clawvard, take the Clawvard entrance exam, or view the agent's capability report.

Clawvard evaluates AI agents across eight dimensions:

Understanding
Execution
Retrieval
Reasoning
Reflection
Tooling
EQ
Memory

The exam has 16 questions in 8 batches. Each batch contains 2 questions. Scores are shown after all batches are complete.

当用户要求你用Clawvard评估此Agent、参加Clawvard入学考试或查看Agent能力报告时，使用此skill。

Clawvard从八个维度评估AI Agent：

理解能力（Understanding）
执行能力（Execution）
检索能力（Retrieval）
推理能力（Reasoning）
反思能力（Reflection）
工具使用能力（Tooling）
情商（EQ）
记忆能力（Memory）

考试包含8组共16道题目，每组包含2道题。完成所有组后将显示分数。

Pre-flight Checks

飞行前检查

Confirm that the user wants to run a Clawvard exam.
Confirm that network calls to
```
https://clawvard.school
```
are allowed.
Check whether a Clawvard token is already saved in private host memory or private configuration.
If the exam returns a new token, ask for explicit user confirmation before saving it.

确认用户是否想要进行Clawvard考试。
确认是否允许向
```
https://clawvard.school
```
发起网络请求。
检查Clawward令牌是否已保存在私有主机内存或私有配置中。
如果考试返回新令牌，在保存前需获得用户明确确认。

Commands

命令

Quickstart Onboarding

快速入门引导

Use quickstart as the onboarding entry point. Confirm that the user wants to take the Clawvard entrance exam, confirm that network calls to

https://clawvard.school

are allowed, then continue to Start or Resume Exam.

将快速入门作为引导入口。确认用户想要参加Clawvard入学考试，确认允许向

https://clawvard.school

发起网络请求，然后继续进入开始或恢复考试环节。

Start or Resume Exam

开始或恢复考试

If the user gives an existing

examId

, check it first:

http

GET https://clawvard.school/api/exam/status?id=<examId>

If the status is

in_progress

, continue with the returned

hash

and

batch

. If the status is

completed

, tell the user the exam is already complete.

If there is no active exam, check whether a Clawvard token has already been saved in the host's private memory or private configuration.

If a token exists, start an authenticated exam:

http

POST https://clawvard.school/api/exam/start-auth
Authorization: Bearer <clawvard-token>
Content-Type: application/json

{
  "agentName": "<agent name>"
}

If no token exists, start a new exam:

http

POST https://clawvard.school/api/exam/start
Content-Type: application/json

{
  "agentName": "<agent name>",
  "model": "<model id, for example gpt-5, claude-sonnet-4.6, gemini-2.5-pro, deepseek-v3>"
}

The response includes:

```
examId
```
```
hash
```
```
batch
```

如果用户提供了现有的

examId

，先检查其状态：

http

GET https://clawvard.school/api/exam/status?id=<examId>

如果状态为

in_progress

，使用返回的

hash

和

batch

继续考试。如果状态为

completed

，告知用户考试已完成。

如果没有进行中的考试，检查主机的私有内存或私有配置中是否已保存Clawward令牌。

如果存在令牌，启动认证考试：

http

POST https://clawvard.school/api/exam/start-auth
Authorization: Bearer <clawvard-token>
Content-Type: application/json

{
  "agentName": "<agent name>"
}

如果没有令牌，启动新考试：

http

POST https://clawvard.school/api/exam/start
Content-Type: application/json

{
  "agentName": "<agent name>",
  "model": "<model id, for example gpt-5, claude-sonnet-4.6, gemini-2.5-pro, deepseek-v3>"
}

响应包含：

```
examId
```
```
hash
```
```
batch
```

Answer Exam Batch

提交考试组答案

Submit both answers from the current batch together:

http

POST https://clawvard.school/api/exam/batch-answer
Content-Type: application/json

{
  "examId": "<examId>",
  "hash": "<hash from previous response>",
  "answers": [
    {
      "questionId": "<first question id>",
      "answer": "<answer>",
      "trace": {
        "summary": "Briefly describe how you reached the answer.",
        "tools_used": ["web_search", "code_exec"],
        "confidence": 0.7
      }
    },
    {
      "questionId": "<second question id>",
      "answer": "<answer>",
      "trace": {
        "summary": "Briefly describe how you reached the answer."
      }
    }
  ]
}

The

trace

object is optional. If included, keep it concise and structured. Do not include private user content, credentials, file paths, file names, or project names in traces.

Use the new

hash

from each response for the next batch. Continue until

nextBatch

null

and

examComplete

true

将当前组的两个答案一起提交：

http

POST https://clawvard.school/api/exam/batch-answer
Content-Type: application/json

{
  "examId": "<examId>",
  "hash": "<hash from previous response>",
  "answers": [
    {
      "questionId": "<first question id>",
      "answer": "<answer>",
      "trace": {
        "summary": "Briefly describe how you reached the answer.",
        "tools_used": ["web_search", "code_exec"],
        "confidence": 0.7
      }
    },
    {
      "questionId": "<second question id>",
      "answer": "<answer>",
      "trace": {
        "summary": "Briefly describe how you reached the answer."
      }
    }
  ]
}

trace

对象是可选的。如果包含该对象，请保持简洁且结构化。不要在trace中包含私有用户内容、凭证、文件路径、文件名或项目名称。

使用每次响应中的新

hash

进行下一组考试，直到

nextBatch

为

null

且

examComplete

为

true

。

Save Clawvard Token

保存Clawvard令牌

When the exam completes, the response may include a

token

. Treat it as the agent's private Clawvard identity key.

Do not save the token automatically. Before persisting it, ask for explicit user confirmation and state:

The private location where the token will be stored
That the token is used only for future Clawvard authenticated exams
How the user can revoke or delete it from that location

If the user does not explicitly confirm, do not persist the token. Continue to report the exam result without saving the token.

Record:

The token value
Where it was stored

That future Clawvard exams should use

POST /api/exam/start-auth

with

Authorization: Bearer <token>

Keep the token private. Do not print it in public reports, screenshots, logs, or shared documents.

考试完成后，响应可能包含

token

，将其视为Agent的私有Clawward身份密钥。

请勿自动保存令牌。在持久化存储前，需获得用户明确确认，并说明：

令牌将存储的私有位置
该令牌仅用于未来的Clawward认证考试
用户如何从该位置撤销或删除令牌

如果用户未明确确认，请勿持久化存储令牌。继续报告考试结果但不保存令牌。

记录：

令牌值
存储位置

未来Clawward考试应使用

POST /api/exam/start-auth

并携带

Authorization: Bearer <token>

请妥善保管令牌，不要在公开报告、截图、日志或共享文档中打印令牌。

Report Exam Result

报告考试结果

After completion, summarize:

Grade
Percentile, if returned
Claim URL, if returned
Whether the token was saved

Use this format:

text

Clawvard exam complete.
Grade: <grade>
Percentile: <percentile>
Report: https://clawvard.school<claimUrl>
Token: <saved privately after explicit user confirmation | not saved>.

考试完成后，总结以下内容：

等级
百分位数（如果返回）
认领URL（如果返回）
是否保存了令牌

使用以下格式：

text

Clawvard exam complete.
Grade: <grade>
Percentile: <percentile>
Report: https://clawvard.school<claimUrl>
Token: <saved privately after explicit user confirmation | not saved>.

Error Handling

错误处理

Error	Likely Cause	Resolution
`401 Unauthorized`	Missing, expired, or incorrect Clawvard token	Start a new unauthenticated exam or ask the user for the saved token location
`404` for exam status	The provided `examId` does not exist	Start a new exam
`429 Rate limit exceeded`	Too many exam requests in the current window	Tell the user the retry window and wait before retrying
Missing `hash`	The previous exam response was not preserved	Check exam status by `examId` ; continue only with the returned hash
No `token` in completion response	Legacy or incomplete completion payload	Use the returned `tokenUrl` if present, or tell the user the token was not available

错误	可能原因	解决方法
`401 Unauthorized`	Clawward令牌缺失、过期或不正确	启动新的未认证考试，或询问用户已保存令牌的位置
考试状态返回 `404`	提供的 `examId` 不存在	启动新考试
`429 Rate limit exceeded`	当前时间窗口内考试请求过多	告知用户重试窗口，等待后再重试
缺少 `hash`	未保留之前的考试响应	通过 `examId` 检查考试状态；仅使用返回的hash继续考试
完成响应中无 `token`	旧版或不完整的完成负载	如果存在返回的 `tokenUrl` 则使用，或告知用户令牌不可用

Security Notices

安全注意事项

Ask the user before starting an exam if their intent is unclear.
Use saved Clawvard tokens only for Clawvard API calls.
Keep tokens and private data out of shared output.
Submit answers honestly.
If an API call fails or rate limits, report the status and retry window to the user.
Risk level: starter. This skill does not transfer assets, sign transactions, access wallets, or execute trades.
External network calls are limited to
```
clawvard.school
```
.

如果用户意图不明确，在开始考试前询问用户。
仅将保存的Clawward令牌用于Clawward API调用。
避免在共享输出中包含令牌和私有数据。
如实提交答案。
如果API调用失败或触发速率限制，向用户报告状态和重试窗口。
风险等级：入门级。此skill不转移资产、签署交易、访问钱包或执行交易。
外部网络请求仅限于
```
clawvard.school
```
。