skill-idea-miner

Skill Idea Miner

Automatically extract skill idea candidates from Claude Code session logs, score them for novelty, feasibility, and trading value, and maintain a prioritized backlog for downstream skill generation.

When to Use

  • Weekly automated pipeline run (Saturday 06:00 via launchd)
  • Manual backlog refresh: python3 scripts/run_skill_generation_pipeline.py --mode weekly
  • Dry run to preview candidates without LLM scoring

Workflow

Stage 1: Session Log Mining

  1. Enumerate session logs from allowlisted projects in ~/.claude/projects/
  2. Filter to the past 7 days by file mtime, then confirm with the timestamp field
  3. Extract user messages (type: "user", userType: "external")
  4. Extract tool usage patterns from assistant messages
  5. Run deterministic signal detection:
    • Skill usage frequency (skills/*/ path references)
    • Error patterns (non-zero exit codes, is_error flags, exception keywords)
    • Repetitive tool sequences (3+ tools repeated 3+ times)
    • Automation request keywords (English and Japanese)
    • Unresolved requests (5+ minute gap after user message)
  6. Invoke Claude CLI headless for idea abstraction
  7. Output raw_candidates.yaml
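
Steps 2, 3, and the repetitive-sequence signal from step 5 can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/mine_session_logs.py: it assumes each session log is a JSONL file with one JSON record per line, that timestamp is an ISO 8601 string, and that a "repetitive sequence" is a fixed window of exactly three consecutive tool names. The function names are hypothetical.

```python
import json
from collections import Counter
from datetime import datetime, timedelta, timezone


def recent_user_messages(lines, now, days=7):
    """Yield external user records whose timestamp falls within `days` of `now`.

    Assumes one JSON object per line and an ISO 8601 `timestamp` field.
    """
    cutoff = now - timedelta(days=days)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the run
        if not isinstance(record, dict):
            continue
        if record.get("type") != "user" or record.get("userType") != "external":
            continue
        raw_ts = record.get("timestamp")
        if not isinstance(raw_ts, str):
            continue
        try:
            when = datetime.fromisoformat(raw_ts.replace("Z", "+00:00"))
        except ValueError:
            continue
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        if when >= cutoff:
            yield record


def repeated_tool_sequences(tool_names, length=3, min_repeats=3):
    """Return each window of `length` consecutive tools seen at least `min_repeats` times."""
    windows = Counter(
        tuple(tool_names[i:i + length])
        for i in range(len(tool_names) - length + 1)
    )
    return {seq: count for seq, count in windows.items() if count >= min_repeats}
```

A fixed window keeps the detection deterministic and cheap. Longer sequences would need either several passes with different window lengths or a different matching strategy.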

Stage 2: Scoring and Deduplication

  1. Load existing skills from skills/*/SKILL.md frontmatter
  2. Deduplicate via Jaccard similarity (threshold > 0.5) against:
    • Existing skill names and descriptions
    • Existing backlog ideas
  3. Score non-duplicate candidates with Claude CLI:
    • Novelty (0-100): differentiation from existing skills
    • Feasibility (0-100): technical implementability
    • Trading Value (0-100): practical value for investors/traders
    • Composite = 0.3 * Novelty + 0.3 * Feasibility + 0.4 * Trading Value
  4. Merge scored candidates into logs/.skill_generation_backlog.yaml
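
The deduplication check and the composite formula can be illustrated with a short sketch. It is written under assumptions and is not the logic in scripts/score_ideas.py: the tokenization (lowercased alphanumeric words) and the round-half-up behavior are guesses, the latter inferred from the sample backlog entry where scores of 75, 60, and 80 yield a composite of 73 rather than 72.5. The function names are hypothetical.

```python
import re


def tokens(text):
    """Split text into a set of lowercase alphanumeric words (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_duplicate(candidate, existing_texts, threshold=0.5):
    """A candidate is a duplicate if any existing text is strictly above the threshold."""
    return any(jaccard(candidate, text) > threshold for text in existing_texts)


def composite(novelty, feasibility, trading_value):
    """0.3 * novelty + 0.3 * feasibility + 0.4 * trading value, rounded half up.

    Integer arithmetic avoids floating-point surprises; the rounding mode is an assumption.
    """
    weighted_tenths = 3 * novelty + 3 * feasibility + 4 * trading_value
    return (weighted_tenths + 5) // 10
```

For example, composite(75, 60, 80) evaluates to 73, and a candidate titled "Earnings Whispers Parser" would count as a duplicate of "Earnings Whispers Image Parser" because three of the four distinct words overlap (similarity 0.75).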

Output Format

raw_candidates.yaml

yaml
generated_at_utc: "2026-03-08T06:00:00Z"
period: {from: "2026-03-01", to: "2026-03-07"}
projects_scanned: ["claude-trading-skills"]
sessions_scanned: 12
candidates:
  - id: "raw_2026w10_001"
    title: "Earnings Whispers Image Parser"
    source_project: "claude-trading-skills"
    evidence:
      user_requests: ["Extract earnings dates from screenshot"]
      pain_points: ["Manual image reading"]
      frequency: 3
    raw_description: "Parse Earnings Whispers screenshots to extract dates."
    category: "data-extraction"

Backlog (logs/.skill_generation_backlog.yaml)

yaml
updated_at_utc: "2026-03-08T06:15:00Z"
ideas:
  - id: "idea_2026w10_001"
    title: "Earnings Whispers Image Parser"
    description: "Skill that parses Earnings Whispers screenshots..."
    category: "data-extraction"
    scores: {novelty: 75, feasibility: 60, trading_value: 80, composite: 73}
    status: "pending"
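
One way to keep a backlog of this shape prioritized is sketched below. The source does not describe the merge rules, so everything here is an assumption: ideas are matched by id, an existing entry is never overwritten (its status is preserved), new entries default to "pending", and the list is sorted by composite score in descending order. The function name merge_into_backlog is hypothetical, and the sketch works on already-parsed dictionaries instead of reading the YAML file.

```python
def merge_into_backlog(backlog, scored_ideas):
    """Return a new backlog dict with unseen ideas added and ideas sorted by composite.

    Assumptions: ideas are keyed by `id`, existing entries win on conflict,
    and new entries start with status "pending".
    """
    by_id = {idea["id"]: idea for idea in backlog.get("ideas", [])}
    for idea in scored_ideas:
        if idea["id"] not in by_id:
            by_id[idea["id"]] = {**idea, "status": idea.get("status", "pending")}
    ideas = sorted(
        by_id.values(),
        key=lambda idea: idea["scores"]["composite"],
        reverse=True,  # highest composite first
    )
    return {**backlog, "ideas": ideas}
```

Keeping existing entries untouched means a rerun over the same week does not reset the status of an idea that downstream skill generation has already picked up.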

Resources

  • references/idea_extraction_rubric.md: Signal detection criteria and scoring rubric
  • scripts/mine_session_logs.py: Session log parser
  • scripts/score_ideas.py: Scorer and deduplicator