create-seed-skill

Create a Seed Skill

Scaffold an integration-type skill from scratch. A seed skill is SKILL.md + reference assets + `prompts.jsonl` entries that together teach an agent to sprout a working, testable integration.

When to Use This

Integration skills (use this skill):
  • Wrapping an SDK or framework with code generation + real API tests
  • The eval question is: "Did the agent write code that calls real APIs and passes tests?"
  • Examples: `ydc-ai-sdk-integration`, `teams-anthropic-integration`

Tool skills (do NOT use this skill):
  • CLI wrappers where the agent runs a command, not writes code
  • No test directory, no generated files
  • Examples: `youdotcom-cli`

Decision Point

Ask the developer first:
Is this an integration skill (the agent generates code + tests) or a tool skill (the agent runs a CLI command)?
  • Integration → Continue with this workflow
  • Tool → Stop. This skill is not needed for tool skills.

Workflow

Step 1 — Gather Information

Ask these questions all at once (do not ask one by one):
  1. Skill name: What should it be called? (lowercase, hyphens, e.g. `my-sdk-integration`)
  2. Package(s): What npm/pip package(s) does it wrap? Include exact package names.
  3. Language(s): TypeScript, Python, or both?
    • Both means: separate asset files for each language, two `prompts.jsonl` entries (one `-typescript`, one `-python`), and two `tests/` directories (`tests/<skill-name>-typescript/` and `tests/<skill-name>-python/`), each with a `.gitkeep` file.
    • One language means: one set of assets, one `prompts.jsonl` entry, one `tests/<skill-name>/` directory with a `.gitkeep` file.
  4. Path A (basic): What is the simplest working integration — one function, one API call?
  5. Path B (extended): What is the natural extension — MCP, streaming, tool filtering, etc.?
  6. Required env vars: Which API keys are required? (e.g. `MY_API_KEY`, `OPENAI_API_KEY`)
  7. Test query: What is a factual question with a deterministic, multi-keyword answer that the integration should be able to answer? (See Choosing a Test Query)
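Once collected, the seven answers fully determine the file set for the rest of the workflow. A minimal sketch of a hypothetical answer set (every value here is illustrative, not a real project's):

```python
# Hypothetical Step 1 answers; names, packages, and keys are illustrative.
seed_answers = {
    "skill_name": "my-sdk-integration",
    "packages": ["my-sdk"],
    "languages": ["python"],
    "path_a": "one function that sends a search query and returns the answer text",
    "path_b": "streaming responses",
    "env_vars": ["MY_API_KEY"],
    "test_query": "Search the web for the three branches of the US government",
}

# One language means one prompts.jsonl entry and one tests/ directory;
# both languages would double each of these.
entry_count = 2 if len(seed_answers["languages"]) == 2 else 1
print(entry_count)
```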

Step 2 — Read Reference Assets

Before writing any code, read the reference assets to understand the required structure:
  • `assets/example-SKILL.md` — complete SKILL.md template
  • `assets/example-path-a.ts` — TypeScript Path A asset structure
  • `assets/example-test.spec.ts` — TypeScript test asset structure
  • `assets/example-pyproject.toml` — Python project config asset
  • `assets/example-test_integration.py` — Python test asset structure

Step 3 — Create Files

Create all files in one pass. The skill lives in `skills/<skill-name>/`.

Directory layout:

```
skills/<skill-name>/
├── SKILL.md
└── assets/
    # TypeScript:
    ├── path-a-<variant>.ts       # Path A integration
    ├── path-b-<variant>.ts       # Path B integration (if applicable)
    ├── integration.spec.ts       # Bun test file
    # Python:
    ├── path_a_<variant>.py       # Path A integration
    ├── path_b_<variant>.py       # Path B integration (if applicable)
    ├── test_integration.py       # pytest file
    └── pyproject.toml            # Python project config (required for uv run pytest)
```
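The layout above can also be scaffolded programmatically. A minimal sketch using `pathlib`, assuming a hypothetical Python-only skill (adjust the file list for TypeScript or for both languages):

```python
from pathlib import Path

def scaffold(skill_name: str, root: Path) -> list[Path]:
    # Create the skill directory with its assets/ subdirectory, plus the
    # tests/ eval target directory containing a .gitkeep placeholder.
    assets = root / "skills" / skill_name / "assets"
    assets.mkdir(parents=True, exist_ok=True)
    files = [
        root / "skills" / skill_name / "SKILL.md",
        assets / "path_a_basic.py",
        assets / "test_integration.py",
        assets / "pyproject.toml",
        root / "tests" / skill_name / ".gitkeep",
    ]
    for f in files:
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()
    return files

if __name__ == "__main__":
    for f in scaffold("my-sdk-integration", Path(".")):
        print(f)
```

The file names mirror the Python half of the layout; the `.gitkeep` keeps the otherwise-empty `tests/` directory in git.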

**Also create the `tests/` eval target directory (or directories) with a `.gitkeep` file** so the directory exists in git before agents write to it:

- Single language: `tests/<skill-name>/.gitkeep`
- Both languages: `tests/<skill-name>-typescript/.gitkeep` **and** `tests/<skill-name>-python/.gitkeep`

```
tests/<skill-name>/          # single language
└── .gitkeep
```

— or for both languages —

```
tests/<skill-name>-typescript/
└── .gitkeep
tests/<skill-name>-python/
└── .gitkeep
```

Single language (adjust path as needed):

```bash
mkdir -p tests/<skill-name> && touch tests/<skill-name>/.gitkeep
```

Both languages:

```bash
mkdir -p tests/<skill-name>-typescript tests/<skill-name>-python
touch tests/<skill-name>-typescript/.gitkeep tests/<skill-name>-python/.gitkeep
```

Step 4 — Add prompts.jsonl Entry

Append one entry per language to `data/prompts/prompts.jsonl`.

Single language template (pick one concrete example — do NOT leave `TypeScript` or `Python` as a placeholder):

TypeScript:

```jsonl
{"id":"<skill-name>","input":["Using the <skill-name> skill, create a working TypeScript <description of Path A> integration. Write flat minimal code with no comments or TSDoc. Write integration tests that call real APIs and assert on meaningful response content. Save everything to the tests/<skill-name> directory.","Extend the integration with <description of Path B>. Write flat minimal code with no comments or TSDoc. Update the integration tests to verify the extended integration also works with a live query."],"metadata":{"cwd":"tests/<skill-name>","language":"typescript"}}
```

Python:

```jsonl
{"id":"<skill-name>","input":["Using the <skill-name> skill, create a working Python <description of Path A> integration. Write flat minimal code with no comments or docstrings. Write integration tests that call real APIs and assert on meaningful response content. Save everything to the tests/<skill-name> directory.","Extend the integration with <description of Path B>. Write flat minimal code with no comments or docstrings. Update the integration tests to verify the extended integration also works with a live query."],"metadata":{"cwd":"tests/<skill-name>","language":"python"}}
```

Both languages — append TWO entries:

```jsonl
{"id":"<skill-name>-typescript","input":["Using the <skill-name> skill, create a working TypeScript <description of Path A> integration. Write flat minimal code with no comments or TSDoc. Write integration tests that call real APIs and assert on meaningful response content. Save everything to the tests/<skill-name>-typescript directory.","Extend the integration with <description of Path B>. Write flat minimal code with no comments or TSDoc. Update the integration tests to verify the extended integration also works with a live query."],"metadata":{"cwd":"tests/<skill-name>-typescript","language":"typescript"}}
{"id":"<skill-name>-python","input":["Using the <skill-name> skill, create a working Python <description of Path A> integration. Write flat minimal code with no comments or docstrings. Write integration tests that call real APIs and assert on meaningful response content. Save everything to the tests/<skill-name>-python directory.","Extend the integration with <description of Path B>. Write flat minimal code with no comments or docstrings. Update the integration tests to verify the extended integration also works with a live query."],"metadata":{"cwd":"tests/<skill-name>-python","language":"python"}}
```

Rules for prompts:
  • Exactly 2 turns per entry
  • Name the language explicitly in Turn 1 ("TypeScript" or "Python") so the agent doesn't guess
  • Describe outcomes only — never mention class names, method names, or config keys
  • Turn 1: basic integration + tests
  • Turn 2: extension + update tests
  • `metadata.cwd` must match the `tests/` directory created in Step 3 (the grader reads files from here)
  • `metadata.language` must be `typescript` or `python`
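Most of these rules can be checked mechanically before committing. A minimal sketch of a validator for one prompts.jsonl line (the rule set mirrors the bullets above; the sample entry is hypothetical):

```python
import json

def validate_entry(line: str) -> list[str]:
    # Return a list of prompt-rule violations for a single prompts.jsonl line.
    entry = json.loads(line)
    errors = []
    turns = entry.get("input", [])
    if len(turns) != 2:
        errors.append("must have exactly 2 turns")
    turn1 = turns[0] if turns else ""
    if "TypeScript" not in turn1 and "Python" not in turn1:
        errors.append("Turn 1 must name the language explicitly")
    meta = entry.get("metadata", {})
    if meta.get("language") not in ("typescript", "python"):
        errors.append("metadata.language must be typescript or python")
    if not meta.get("cwd", "").startswith("tests/"):
        errors.append("metadata.cwd must point at a tests/ directory")
    return errors

sample = ('{"id":"my-sdk-integration","input":["Using the my-sdk-integration skill, '
          'create a working Python search integration. Write integration tests.",'
          '"Extend the integration with streaming. Update the tests."],'
          '"metadata":{"cwd":"tests/my-sdk-integration","language":"python"}}')
print(validate_entry(sample))  # an empty list means the entry passes
```

This does not catch everything (e.g. "describe outcomes only" needs human judgment), but it rejects the structural mistakes the grader would otherwise surface late.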

Step 5 — Create Symlink

```bash
ln -s ../../skills/<skill-name> .claude/skills/<skill-name>
```

Step 6 — Validate

```bash
bunx @plaited/development-skills validate-skill skills/<skill-name>
```

What Makes a Good Seed Skill

SKILL.md Must Contain

  1. Correct frontmatter — `name`, `description`, `license`, `compatibility`, `allowed-tools`, `assets` list, `metadata`
  2. Decision point — brief Path A vs Path B description, one clear question
  3. Install instructions — exact package name and install command
  4. Complete code templates — full working examples, not pseudo-code
  5. Security section — if the integration fetches untrusted web content, include prompt injection warning (W011)
  6. Generate Integration Tests section — markdown links to all asset files, explicit rules

Generate Integration Tests Rules (include all of these)

```markdown
**Rules:**
- No mocks — call real APIs
- Assert on keywords from a deterministic query, not just `length > 0`
- Validate required env vars at test start (inside the test function, not at module scope)
- TypeScript: use `bun:test`, dynamic imports inside tests, `timeout: 60_000`
- Python: use `pytest`, import inside test function; always include `pyproject.toml` with `pytest` in `[dependency-groups] dev`
- Run TypeScript tests: `bun test` | Run Python tests: `uv run pytest`
```
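Under those rules, a Python test asset looks roughly like this. A sketch, assuming a hypothetical Path A module `path_a_basic` exposing `main(query)` and a hypothetical `MY_API_KEY` env var:

```python
import os

def test_search_returns_expected_keywords():
    # Env check and imports live inside the test function, not at module
    # scope, so pytest collection never fails when the key is absent.
    import pytest
    if not os.environ.get("MY_API_KEY"):
        pytest.skip("MY_API_KEY not set")
    from path_a_basic import main  # hypothetical Path A integration module
    answer = main("Search the web for the three branches of the US government")
    text = answer.lower()
    # Keyword assertions on a deterministic query, not just length > 0.
    for keyword in ("legislative", "executive", "judicial"):
        assert keyword in text
```

The TypeScript equivalent follows the same shape with `bun:test`, a dynamic `import()` inside the test body, and `timeout: 60_000`.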

Reference Assets Must

  • Compile and run — no pseudo-code, no placeholders
  • Include security instructions in the agent's `instructions`/`system_prompt`
  • Use the test query from Step 1 in both the integration files (as the `__main__` example) and the test assertions
  • Assert on keywords — not just `length > 0`
  • TypeScript: export a callable function, use dynamic `import()` in tests
  • Python: define a `main(query: str) -> str` function, use deferred imports inside test functions

Choosing a Test Query

The real goal: verify the integration code ran, not that the LLM knows the answer.

Most LLMs can answer factual questions from memory without calling any tool. A query the model can answer without searching doesn't prove the MCP server or SDK tool was invoked — the integration may silently skip the tool and still pass.

What actually matters: the test passes only if the code path worked — the SDK was initialized, the MCP server was reached, and a response was returned. A keyword assertion that matches typical tool output is better than no assertion, but it doesn't prove the tool fired.

Practical guidance:
  • Prefix the query with an explicit instruction to use the tool — this forces invocation rather than relying on the model's judgment
  • Use a query with a stable, multi-keyword answer so you can assert on content, not just `length > 0`
  • The LLM-as-judge grader (`scripts/grader.ts`) also evaluates the generated code structure, not just whether keywords appear

Good examples (explicit tool instruction + stable keywords):
  • "Search the web for the three branches of the US government" → assert `legislative`, `executive`, `judicial`
  • "Use web search to find what programming language TypeScript compiles to" → assert `javascript`
  • "Search the web for the four classical elements" → assert `earth`, `water`, `fire`, `air`

The "Search the web for..." or "Use [tool name] to find..." prefix makes tool use an explicit instruction, not an inference — the model must call the tool to follow the prompt.

Avoid:
  • Plain factual queries ("What are the three branches...") — model may answer from memory, skipping the tool entirely
  • "What is the latest news in AI?" — changes daily, no predictable keywords
  • "Say hello in one sentence." — no meaningful content assertion possible
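The keyword checks in the good examples above reduce to one small helper; a minimal sketch (the function name is illustrative):

```python
def assert_keywords(answer: str, keywords: tuple[str, ...]) -> None:
    # Case-insensitive containment check: every expected keyword must
    # appear somewhere in the response text.
    text = answer.lower()
    missing = [kw for kw in keywords if kw.lower() not in text]
    assert not missing, f"response missing keywords: {missing}"

assert_keywords(
    "The US government has legislative, executive, and judicial branches.",
    ("legislative", "executive", "judicial"),
)
```

Substring matching is deliberately loose: it tolerates casing and surrounding prose while still failing on empty or off-topic responses, which is exactly the middle ground the guidance above asks for.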

Eval Grader Notes

The grader (`scripts/grader.ts`) runs from `metadata.cwd`:
  • TypeScript: runs `bun test`, scans `**/*.{ts,js}` for generated files
  • Python: runs `uv run pytest` (requires `pyproject.toml` in the test dir)
  • Sends test output + generated file contents to Haiku for LLM-as-judge scoring (0.0–1.0)
  • Pass threshold is ~0.65; target is 0.95

Common failure modes to prevent in your SKILL.md:
  • pytest missing: always include the `pyproject.toml` asset with pytest in dev deps
  • Module-scope env checks: Python imports should be inside test functions to avoid collection errors
  • Prescriptive prompts: prompts.jsonl should describe outcomes, not implementation steps
  • Tool introspection: never instruct agents to assert on SDK event streams or tool call objects
  • Open-ended test queries: "latest AI news" produces unpredictable content — use factual queries with deterministic keywords

Example: Adding a New Python Integration Skill

Given: wrapping the `httpx` library with You.com search, Python only.

Files to create:

```
skills/ydc-httpx-integration/
├── SKILL.md
└── assets/
    ├── path_a_basic.py
    ├── test_integration.py
    └── pyproject.toml

tests/ydc-httpx-integration/
└── .gitkeep
```

Create the tests directory:

```bash
mkdir -p tests/ydc-httpx-integration && touch tests/ydc-httpx-integration/.gitkeep
```

prompts.jsonl entry:

```jsonl
{"id":"ydc-httpx-integration","input":["Using the ydc-httpx-integration skill, create a working Python application that calls the You.com search API directly with httpx and returns search results. Write integration tests that call the real API and verify the response contains expected keywords. Save everything to the tests/ydc-httpx-integration directory.","Extend the integration to also support content extraction from URLs. Update the integration tests to verify both search and content extraction work with live queries."],"metadata":{"cwd":"tests/ydc-httpx-integration","language":"python"}}
```
需求:封装
httpx
库实现You.com搜索,仅支持Python。
需创建的文件:
skills/ydc-httpx-integration/
├── SKILL.md
└── assets/
    ├── path_a_basic.py
    ├── test_integration.py
    └── pyproject.toml

tests/ydc-httpx-integration/
└── .gitkeep
创建测试目录:
bash
mkdir -p tests/ydc-httpx-integration && touch tests/ydc-httpx-integration/.gitkeep
prompts.jsonl条目:
jsonl
{"id":"ydc-httpx-integration","input":["Using the ydc-httpx-integration skill, create a working Python application that calls the You.com search API directly with httpx and returns search results. Write integration tests that call the real API and verify the response contains expected keywords. Save everything to the tests/ydc-httpx-integration directory.","Extend the integration to also support content extraction from URLs. Update the integration tests to verify both search and content extraction work with live queries."],"metadata":{"cwd":"tests/ydc-httpx-integration","language":"python"}}