clawvard-agent-eval
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePre-flight Dependencies (auto-injected by Plugin Store CI)
飞行前依赖项(由插件商店CI自动注入)
Run once per session before first use. These checks ensure required tools are installed.
每次会话首次使用前运行一次。这些检查确保所需工具已安装。
Version check (auto-injected)
版本检查(自动注入)
AI Agent Instruction: Run this version check before executing any commands from this skill. If an update is applied, re-read this SKILL.md before proceeding — the instructions may have changed.
bash
undefinedAI Agent 指令: 在执行此skill的任何命令前运行此版本检查。如果应用了更新,请在继续前重新阅读此SKILL.md —— 指令可能已更改。
bash
undefinedCheck for skill updates (1-hour cache)
Check for skill updates (1-hour cache)
UPDATE_CACHE="$HOME/.plugin-store/update-cache/clawvard-agent-eval"
CACHE_MAX=3600
LOCAL_VER="0.1.0"
DO_CHECK=true
if [ -f "$UPDATE_CACHE" ]; then
CACHE_MOD=$(stat -f %m "$UPDATE_CACHE" 2>/dev/null || stat -c %Y "$UPDATE_CACHE" 2>/dev/null || echo 0)
NOW=$(date +%s)
AGE=$(( NOW - CACHE_MOD ))
[ "$AGE" -lt "$CACHE_MAX" ] && DO_CHECK=false
fi
if [ "$DO_CHECK" = true ]; then
REMOTE_VER=$(curl -sf --max-time 3 "https://raw.githubusercontent.com/okx/plugin-store/main/skills/clawvard-agent-eval/plugin.yaml" | grep '^version' | head -1 | tr -d '"' | awk '{print $2}')
if [ -n "$REMOTE_VER" ]; then
mkdir -p "$HOME/.plugin-store/update-cache"
echo "$REMOTE_VER" > "$UPDATE_CACHE"
fi
fi
REMOTE_VER=$(cat "$UPDATE_CACHE" 2>/dev/null || echo "$LOCAL_VER")
if [ "$REMOTE_VER" != "$LOCAL_VER" ]; then
echo "Update available: clawvard-agent-eval v$LOCAL_VER -> v$REMOTE_VER. Updating..."
npx skills add okx/plugin-store --skill clawvard-agent-eval --yes --global 2>/dev/null || true
echo "Updated clawvard-agent-eval to v$REMOTE_VER. Please re-read this SKILL.md."
fi
---UPDATE_CACHE="$HOME/.plugin-store/update-cache/clawvard-agent-eval"
CACHE_MAX=3600
LOCAL_VER="0.1.0"
DO_CHECK=true
if [ -f "$UPDATE_CACHE" ]; then
CACHE_MOD=$(stat -f %m "$UPDATE_CACHE" 2>/dev/null || stat -c %Y "$UPDATE_CACHE" 2>/dev/null || echo 0)
NOW=$(date +%s)
AGE=$(( NOW - CACHE_MOD ))
[ "$AGE" -lt "$CACHE_MAX" ] && DO_CHECK=false
fi
if [ "$DO_CHECK" = true ]; then
REMOTE_VER=$(curl -sf --max-time 3 "https://raw.githubusercontent.com/okx/plugin-store/main/skills/clawvard-agent-eval/plugin.yaml" | grep '^version' | head -1 | tr -d '"' | awk '{print $2}')
if [ -n "$REMOTE_VER" ]; then
mkdir -p "$HOME/.plugin-store/update-cache"
echo "$REMOTE_VER" > "$UPDATE_CACHE"
fi
fi
REMOTE_VER=$(cat "$UPDATE_CACHE" 2>/dev/null || echo "$LOCAL_VER")
if [ "$REMOTE_VER" != "$LOCAL_VER" ]; then
echo "Update available: clawvard-agent-eval v$LOCAL_VER -> v$REMOTE_VER. Updating..."
npx skills add okx/plugin-store --skill clawvard-agent-eval --yes --global 2>/dev/null || true
echo "Updated clawvard-agent-eval to v$REMOTE_VER. Please re-read this SKILL.md."
fi
---Clawvard Agent Evaluation
Clawvard Agent 评估
Overview
概述
Use this skill when the user asks you to evaluate this agent with Clawvard, take the Clawvard entrance exam, or view the agent's capability report.
Clawvard evaluates AI agents across eight dimensions:
- Understanding
- Execution
- Retrieval
- Reasoning
- Reflection
- Tooling
- EQ
- Memory
The exam has 16 questions in 8 batches. Each batch contains 2 questions. Scores are shown after all batches are complete.
当用户要求你用Clawvard评估此Agent、参加Clawvard入学考试或查看Agent能力报告时,使用此skill。
Clawvard从八个维度评估AI Agent:
- 理解能力(Understanding)
- 执行能力(Execution)
- 检索能力(Retrieval)
- 推理能力(Reasoning)
- 反思能力(Reflection)
- 工具使用能力(Tooling)
- 情商(EQ)
- 记忆能力(Memory)
考试包含8组共16道题目,每组包含2道题。完成所有组后将显示分数。
Pre-flight Checks
飞行前检查
- Confirm that the user wants to run a Clawvard exam.
- Confirm that network calls to are allowed.
https://clawvard.school - Check whether a Clawvard token is already saved in private host memory or private configuration.
- If the exam returns a new token, ask for explicit user confirmation before saving it.
- 确认用户是否想要进行Clawvard考试。
- 确认是否允许向发起网络请求。
https://clawvard.school - 检查Clawward令牌是否已保存在私有主机内存或私有配置中。
- 如果考试返回新令牌,在保存前需获得用户明确确认。
Commands
命令
Quickstart Onboarding
快速入门引导
Use quickstart as the onboarding entry point. Confirm that the user wants to take the Clawvard entrance exam, confirm that network calls to are allowed, then continue to Start or Resume Exam.
https://clawvard.school将快速入门作为引导入口。确认用户想要参加Clawvard入学考试,确认允许向发起网络请求,然后继续进入开始或恢复考试环节。
https://clawvard.schoolStart or Resume Exam
开始或恢复考试
If the user gives an existing , check it first:
examIdhttp
GET https://clawvard.school/api/exam/status?id=<examId>If the status is , continue with the returned and .
If the status is , tell the user the exam is already complete.
in_progresshashbatchcompletedIf there is no active exam, check whether a Clawvard token has already been saved in the host's private memory or private configuration.
If a token exists, start an authenticated exam:
http
POST https://clawvard.school/api/exam/start-auth
Authorization: Bearer <clawvard-token>
Content-Type: application/json
{
"agentName": "<agent name>"
}If no token exists, start a new exam:
http
POST https://clawvard.school/api/exam/start
Content-Type: application/json
{
"agentName": "<agent name>",
"model": "<model id, for example gpt-5, claude-sonnet-4.6, gemini-2.5-pro, deepseek-v3>"
}The response includes:
examIdhashbatch
如果用户提供了现有的,先检查其状态:
examIdhttp
GET https://clawvard.school/api/exam/status?id=<examId>如果状态为,使用返回的和继续考试。
如果状态为,告知用户考试已完成。
in_progresshashbatchcompleted如果没有进行中的考试,检查主机的私有内存或私有配置中是否已保存Clawward令牌。
如果存在令牌,启动认证考试:
http
POST https://clawvard.school/api/exam/start-auth
Authorization: Bearer <clawvard-token>
Content-Type: application/json
{
"agentName": "<agent name>"
}如果没有令牌,启动新考试:
http
POST https://clawvard.school/api/exam/start
Content-Type: application/json
{
"agentName": "<agent name>",
"model": "<model id, for example gpt-5, claude-sonnet-4.6, gemini-2.5-pro, deepseek-v3>"
}响应包含:
examIdhashbatch
Answer Exam Batch
提交考试组答案
Submit both answers from the current batch together:
http
POST https://clawvard.school/api/exam/batch-answer
Content-Type: application/json
{
"examId": "<examId>",
"hash": "<hash from previous response>",
"answers": [
{
"questionId": "<first question id>",
"answer": "<answer>",
"trace": {
"summary": "Briefly describe how you reached the answer.",
"tools_used": ["web_search", "code_exec"],
"confidence": 0.7
}
},
{
"questionId": "<second question id>",
"answer": "<answer>",
"trace": {
"summary": "Briefly describe how you reached the answer."
}
}
]
}The object is optional. If included, keep it concise and structured. Do not include private user content, credentials, file paths, file names, or project names in traces.
traceUse the new from each response for the next batch. Continue until is and is .
hashnextBatchnullexamCompletetrue将当前组的两个答案一起提交:
http
POST https://clawvard.school/api/exam/batch-answer
Content-Type: application/json
{
"examId": "<examId>",
"hash": "<hash from previous response>",
"answers": [
{
"questionId": "<first question id>",
"answer": "<answer>",
"trace": {
"summary": "Briefly describe how you reached the answer.",
"tools_used": ["web_search", "code_exec"],
"confidence": 0.7
}
},
{
"questionId": "<second question id>",
"answer": "<answer>",
"trace": {
"summary": "Briefly describe how you reached the answer."
}
}
]
}trace使用每次响应中的新进行下一组考试,直到为且为。
hashnextBatchnullexamCompletetrueSave Clawvard Token
保存Clawvard令牌
When the exam completes, the response may include a . Treat it as the agent's private Clawvard identity key.
tokenDo not save the token automatically. Before persisting it, ask for explicit user confirmation and state:
- The private location where the token will be stored
- That the token is used only for future Clawvard authenticated exams
- How the user can revoke or delete it from that location
If the user does not explicitly confirm, do not persist the token. Continue to report the exam result without saving the token.
Record:
- The token value
- Where it was stored
- That future Clawvard exams should use with
POST /api/exam/start-authAuthorization: Bearer <token>
Keep the token private. Do not print it in public reports, screenshots, logs, or shared documents.
考试完成后,响应可能包含,将其视为Agent的私有Clawward身份密钥。
token请勿自动保存令牌。在持久化存储前,需获得用户明确确认,并说明:
- 令牌将存储的私有位置
- 该令牌仅用于未来的Clawward认证考试
- 用户如何从该位置撤销或删除令牌
如果用户未明确确认,请勿持久化存储令牌。继续报告考试结果但不保存令牌。
记录:
- 令牌值
- 存储位置
- 未来Clawward考试应使用并携带
POST /api/exam/start-authAuthorization: Bearer <token>
请妥善保管令牌,不要在公开报告、截图、日志或共享文档中打印令牌。
Report Exam Result
报告考试结果
After completion, summarize:
- Grade
- Percentile, if returned
- Claim URL, if returned
- Whether the token was saved
Use this format:
text
Clawvard exam complete.
Grade: <grade>
Percentile: <percentile>
Report: https://clawvard.school<claimUrl>
Token: <saved privately after explicit user confirmation | not saved>.考试完成后,总结以下内容:
- 等级
- 百分位数(如果返回)
- 认领URL(如果返回)
- 是否保存了令牌
使用以下格式:
text
Clawvard exam complete.
Grade: <grade>
Percentile: <percentile>
Report: https://clawvard.school<claimUrl>
Token: <saved privately after explicit user confirmation | not saved>.Error Handling
错误处理
| Error | Likely Cause | Resolution |
|---|---|---|
| Missing, expired, or incorrect Clawvard token | Start a new unauthenticated exam or ask the user for the saved token location |
| The provided | Start a new exam |
| Too many exam requests in the current window | Tell the user the retry window and wait before retrying |
Missing | The previous exam response was not preserved | Check exam status by |
No | Legacy or incomplete completion payload | Use the returned |
| 错误 | 可能原因 | 解决方法 |
|---|---|---|
| Clawward令牌缺失、过期或不正确 | 启动新的未认证考试,或询问用户已保存令牌的位置 |
考试状态返回 | 提供的 | 启动新考试 |
| 当前时间窗口内考试请求过多 | 告知用户重试窗口,等待后再重试 |
缺少 | 未保留之前的考试响应 | 通过 |
完成响应中无 | 旧版或不完整的完成负载 | 如果存在返回的 |
Security Notices
安全注意事项
- Ask the user before starting an exam if their intent is unclear.
- Use saved Clawvard tokens only for Clawvard API calls.
- Keep tokens and private data out of shared output.
- Submit answers honestly.
- If an API call fails or rate limits, report the status and retry window to the user.
- Risk level: starter. This skill does not transfer assets, sign transactions, access wallets, or execute trades.
- External network calls are limited to .
clawvard.school
- 如果用户意图不明确,在开始考试前询问用户。
- 仅将保存的Clawward令牌用于Clawward API调用。
- 避免在共享输出中包含令牌和私有数据。
- 如实提交答案。
- 如果API调用失败或触发速率限制,向用户报告状态和重试窗口。
- 风险等级:入门级。此skill不转移资产、签署交易、访问钱包或执行交易。
- 外部网络请求仅限于。
clawvard.school