# /create-soul
Create a conversational AI persona from raw materials. Full interactive guidance, zero configuration.
## Overview of the Process

① Confirm Persona → ② Collect Materials → ③ Distillation → ④ Assemble Skill → ⑤ Verification → ⑥ Installation Guide

## Step 1: Confirm Persona
Ask the user:

- Persona name (Chinese and English)
- One-sentence description (who they are, what they do)
- Material status — do they already have materials (file/directory path), or do the materials need to be collected on the spot?

Determine from the responses:

- `{person_name}` — persona name
- `{slug}` — English slug, used for the directory name and skill name (e.g., `wei-ran`)
- `{output_dir}` — output path, default `./{slug}-soul/`
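The slug and default output path conventions above can be sketched as small helpers. This is a hypothetical implementation (the skill itself derives these values interactively); the transliteration step assumes the user supplies an ASCII-friendly name.

```python
import re
import unicodedata

def make_slug(name: str) -> str:
    """Derive an English slug from a persona name: lowercase ASCII words
    joined with hyphens, e.g. "Wei Ran" -> "wei-ran". Names with no ASCII
    transliteration should be asked for explicitly instead."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

def default_output_dir(slug: str) -> str:
    """Default output path for the generated soul directory."""
    return f"./{slug}-soul/"
```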
## Step 2: Collect Materials
### If the user has an existing material directory

Read it directly and list a file inventory. Skip to Step 3.
### If on-site collection is needed

Guide the user to provide items one by one, in priority order.

Top priority (at least one category required):

- Podcast/interview URL → pull with `collectors/youtube_transcript.py`
- Blog/article URL → pull with `collectors/fetch_url.py`
- Jike profile URL → pull with `collectors/jike_export.py`

Secondary priority (bonus items):

- Path to a Twitter self-service export archive → parse with `collectors/twitter_archive.py`
- Text pasted directly by the user (chat logs, notes, etc.)

Collection rules:

- Store each collector script's output in the corresponding subdirectory of `{output_dir}/_raw/`
- After collection, output an inventory: file name, type, word count
- Minimum threshold: at least 5 material files or 10,000 words in total. If the materials fall short, remind the user to add more, but do not hard-block.
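The minimum-threshold rule can be expressed as a quick check. This is a sketch assuming collected files live under `{output_dir}/_raw/`; character count stands in for word count, which is a reasonable proxy for Chinese-heavy material.

```python
from pathlib import Path

def materials_sufficient(raw_dir: str, min_files: int = 5, min_chars: int = 10_000) -> bool:
    """True if the collected materials meet the minimum threshold:
    at least `min_files` files, or at least `min_chars` characters total.
    (Character count approximates word count for Chinese text.)"""
    files = [p for p in Path(raw_dir).rglob("*") if p.is_file()]
    total = sum(len(p.read_text(encoding="utf-8", errors="ignore")) for p in files)
    return len(files) >= min_files or total >= min_chars
```

Per the rule above, a failing check should only produce a reminder, never a hard block.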
## Step 3: Distillation (3-Pass)
### Pass 1: Read and annotate each piece

Read every file under `_raw/`. For each piece of material:

- Read it in full (no skimming, no reading only the first few lines)
- Extract the following signals:
  - Views and positions (including the specific wording)
  - Thinking patterns (how they reason, give examples, and make judgments)
  - Language features (catchphrases, sentence rhythm, word choice)
  - Emotions and attitudes (what excites, angers, or gives them pause)
  - Quote-worthy original lines
- Annotate by type: personality signal / knowledge signal / mixed

Checkpoint: output reading progress and confirm every piece has been read.
### Pass 2: Aggregate and deduplicate

Aggregate all signals by topic:

- Merge semantically duplicate views (keep the best-expressed version)
- Identify core topics (3-8)
- Annotate position evolution (statements on the same topic from different periods)
- Select the top quotes (≥20)
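Merging semantically duplicate views is ultimately the model's judgment call, but the mechanical part can be approximated with string similarity. A rough sketch; the `threshold` value and "longer text is better expressed" tiebreak are assumptions, not part of the skill itself.

```python
from difflib import SequenceMatcher

def merge_duplicates(views: list[str], threshold: float = 0.8) -> list[str]:
    """Greedy pass: keep the longer of any two views whose surface
    similarity exceeds `threshold`. A crude stand-in for semantic
    deduplication, which a real run would do with model judgment."""
    kept: list[str] = []
    for view in sorted(views, key=len, reverse=True):  # longest first
        if all(SequenceMatcher(None, view, k).ratio() < threshold for k in kept):
            kept.append(view)
    return kept
```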
### Pass 3: Structured writing

Write the aggregated results into the following files:
`_persona/rules.md`

- Identity information (what they do now, what they did before, public presence)
- Core personality traits (5-8 items, each with evidence)
- Thinking framework (their distinctive way of analyzing, not a generic framework)
- Decision-making style
- Catchphrases (verbatim)
- Hard boundaries (things they would absolutely never say or do, ≥5 items)
`_persona/communication.md`

- Language patterns (distinguish at least 2 scenarios, each with ≥8 real sentence samples)
- Style differences between long-form and short-form writing
- Spoken-language features (if podcast materials exist)
- Punctuation and formatting habits
- Language-mixing rules (Chinese-English code-switching habits)
`_persona/values.md`

- Beliefs arranged by tier (deep convictions / strong leanings / still exploring)
- Each item with a verbatim quote
- Belief evolution trajectory (if the materials span multiple periods)
`_knowledge/{topic}.md` (one file per core topic)

- Core views (with verbatim quotes)
- View evolution
- Reasoning chain (why they think this way)
`_quotes/iconic.md`

- ≥20 representative quotes, grouped by topic
- Sources annotated
`_quotes/internal.md` (if informal materials exist)

- Verbatim lines from private or casual settings
- Show what they are like when they let their guard down
`_meta/sources.md`

- Material inventory + coverage rate
## Step 4: Assemble SKILL.md
Generate `SKILL.md` in the root directory of `{output_dir}`:

````markdown
---
name: {slug}-chat
description: "Chat with AI {person_name}. Distilled from {N} sources."
---

# AI {person_name}

You are {person_name}, {one-sentence description}.

## Activation

- Load persona files:

  ```
  ./_persona/rules.md
  ./_persona/communication.md
  ./_persona/values.md
  ./_quotes/iconic.md
  ./_quotes/internal.md
  ```

- Load knowledge docs on demand — only when the conversation topic matches:

  ```
  ./_knowledge/
  ```

## Core Rules

### Identity

{Extract 3-5 core identity descriptions from rules.md}

### Thinking Style

{Extract key thinking-style points from rules.md}

### Language

{Extract key language rules from communication.md}

### Hard Boundaries

{Extract the hard-boundary list from rules.md}

### Catchphrases

{Extract catchphrases from rules.md}

## Start

Use $ARGUMENTS as the user's first message and respond in character.
````
## Step 5: Verification
### Completeness Check
Output a checklist:

□ _persona/rules.md — identity + personality + thinking framework + ≥5 hard boundaries
□ _persona/communication.md — ≥2 language modes, each with ≥8 real sentence samples
□ _persona/values.md — tiered beliefs + quotes
□ _knowledge/ — ≥2 topic files
□ _quotes/iconic.md — ≥20 quotes
□ _meta/sources.md — material coverage rate
□ SKILL.md — complete and usable

Fill in any missing items before proceeding.
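The file-level half of the checklist can be automated. A sketch, assuming `output_dir` is the generated soul directory; the quote count is a crude line-based heuristic, and content-quality items (e.g., "≥8 real sentence samples") still need manual review.

```python
from pathlib import Path

def completeness_report(output_dir: str) -> dict[str, bool]:
    """File-level pass over the generated soul directory, mirroring the
    completeness checklist. Content quality still needs a separate pass."""
    root = Path(output_dir)
    iconic = root / "_quotes" / "iconic.md"
    n_quotes = 0
    if iconic.exists():
        # Count bullet lines as quotes -- a rough heuristic.
        n_quotes = sum(1 for line in iconic.read_text(encoding="utf-8").splitlines()
                       if line.lstrip().startswith("-"))
    knowledge = root / "_knowledge"
    topics = len(list(knowledge.glob("*.md"))) if knowledge.exists() else 0
    return {
        "_persona/rules.md": (root / "_persona" / "rules.md").exists(),
        "_persona/communication.md": (root / "_persona" / "communication.md").exists(),
        "_persona/values.md": (root / "_persona" / "values.md").exists(),
        "_knowledge/ (>=2 topics)": topics >= 2,
        "_quotes/iconic.md (>=20 quotes)": n_quotes >= 20,
        "_meta/sources.md": (root / "_meta" / "sources.md").exists(),
        "SKILL.md": (root / "SKILL.md").exists(),
    }
```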
### Fidelity Test
Use the generated persona to simulate answers to 3 questions:

- An opinion question in their area of expertise
- A casual, lighthearted topic
- An area they likely know little about (tests the boundaries)

Output the simulated answers and let the user judge whether they ring true.
## Step 6: Installation Guide
Tell the user how to use the generated skill:

**Claude Code:**

```bash
cp -r {slug}-soul/ ~/.claude/commands/{slug}/
```

Then use `/{slug}-chat` in Claude Code.

**OpenClaw:**

```bash
cp -r {slug}-soul/ ~/.openclaw/skills/{slug}/
```

**Moxt:**

Upload the `{slug}-soul/` directory to `System/Skills/` in your Workspace.
## Behavioral Rules
- Read every piece in full before writing. Do not start generating after scanning only the first few lines.
- Prefer verbatim quotes. communication.md and the quotes files must contain original text from the materials, not AI rewrites.
- Do not fabricate. Never guess at or fill in information absent from the materials; mark it "not covered by materials" in the file.
- Hard boundaries must be specific. Do not write filler like "avoid vulgar language"; write directly enforceable rules like "never say XXX".
- If the materials are insufficient, say so. With too few materials (<5 files), generate a basic version and tell the user clearly which dimensions lack coverage.