self-improving-agent

Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." (Based on 2025 lifelong learning research.)

Overview

This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:

Multi-Memory Architecture: Semantic + Episodic + Working memory
Self-Correction: Detects and fixes skill guidance errors
Self-Validation: Periodically verifies skill accuracy
Hooks Integration: Auto-triggers on skill events (before_start, after_complete, on_error)
Evolution Markers: Traceable changes with source attribution

Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
| --- | --- | --- |
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |

The Self-Improvement Loop

┌─────────────────────────────────────────────────────────────────┐
│                    UNIVERSAL SELF-IMPROVEMENT                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Skill Event → Extract Experience → Abstract Pattern → Update  │
│        │                  │                │         │          │
│        ▼                  ▼                ▼         ▼          │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              MULTI-MEMORY SYSTEM                      │       │
│   ├─────────────────────────────────────────────────────┤       │
│   │  Semantic Memory   │  Episodic Memory  │ Working Memory │  │
│   │  (Patterns/Rules)  │  (Experiences)    │  (Current)     │  │
│   │  memory/semantic/  │  memory/episodic/ │  memory/working/│  │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              FEEDBACK LOOP                            │       │
│   │  User Feedback → Confidence Update → Pattern Adapt   │       │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
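
The loop in the diagram can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: `MemoryStore`, `extract_experience`, `abstract_pattern`, and `run_loop` are hypothetical names, and the in-memory dictionaries stand in for the `memory/` directories.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemoryStore:
    """In-memory stand-in for the three memory areas (hypothetical)."""
    semantic: dict = field(default_factory=dict)   # pattern text -> times seen
    episodic: list = field(default_factory=list)   # raw experiences
    working: dict = field(default_factory=dict)    # current session context


def extract_experience(event: dict) -> dict:
    """Skill Event -> Extract Experience."""
    return {
        "skill": event["skill"],
        "outcome": event["outcome"],
        "lesson": event.get("lesson", ""),
    }


def abstract_pattern(experience: dict) -> Optional[str]:
    """Extract Experience -> Abstract Pattern (None when nothing reusable)."""
    lesson = experience["lesson"].strip()
    return lesson or None


def run_loop(event: dict, memory: MemoryStore) -> Optional[str]:
    """Run one pass of the loop and update all three memories."""
    memory.working["last_event"] = event
    experience = extract_experience(event)
    memory.episodic.append(experience)
    pattern = abstract_pattern(experience)
    if pattern is not None:
        memory.semantic[pattern] = memory.semantic.get(pattern, 0) + 1
    return pattern
```

Every event lands in episodic memory, while only events that yield a lesson touch semantic memory.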

When This Activates

Automatic Triggers (via hooks)

| Event | Trigger | Action |
| --- | --- | --- |
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit | Capture error context, trigger self-correction |
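
One way to picture the event-to-action mapping is a small dispatch table. The sketch below is illustrative only: the handler functions and the `ctx` dictionary keys are assumptions, and only the three event names come from the table above.

```python
from typing import Callable, Dict


def log_session_start(ctx: dict) -> str:
    return f"session start logged for {ctx['skill']}"


def extract_and_update(ctx: dict) -> str:
    return f"patterns extracted from {ctx['skill']}"


def capture_error(ctx: dict) -> str:
    return f"error captured (exit code {ctx['exit_code']})"


# Event names are from the table; the handlers are hypothetical placeholders.
HOOKS: Dict[str, Callable[[dict], str]] = {
    "before_start": log_session_start,
    "after_complete": extract_and_update,
    "on_error": capture_error,
}


def dispatch(event: str, ctx: dict) -> str:
    """Route a skill event to its handler, rejecting unknown events."""
    try:
        handler = HOOKS[event]
    except KeyError:
        raise ValueError(f"unknown hook event: {event}") from None
    return handler(ctx)
```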

Manual Triggers

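User says "自我进化" (self-evolve), "self-improve", or "从经验中学习" (learn from experience)
User says "分析今天的经验" (analyze today's experience) or "总结教训" (summarize lessons learned)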
User asks to improve a specific skill
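
Detecting the phrase-based triggers could be as simple as a substring check. This is a sketch under the assumption that matching is case-insensitive and substring-based, which the source does not specify. `MANUAL_TRIGGERS` and `is_manual_trigger` are hypothetical names.

```python
# Trigger phrases copied from the list above.
MANUAL_TRIGGERS = (
    "自我进化",
    "self-improve",
    "从经验中学习",
    "分析今天的经验",
    "总结教训",
)


def is_manual_trigger(message: str) -> bool:
    """Return True when the message contains any known trigger phrase."""
    text = message.casefold()
    return any(phrase.casefold() in text for phrase in MANUAL_TRIGGERS)
```

The third trigger, a request to improve a specific skill, needs intent recognition rather than phrase matching and is not covered here.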

Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

| Trigger | Target Skill | Priority | Action |
| --- | --- | --- | --- |
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |
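
The matrix can be read as a lookup table that turns triggers into an ordered list of skill updates. The sketch below encodes four of the rows. The dictionary keys such as `debugging_fix` are made-up identifiers, and `plan_updates` is a hypothetical helper.

```python
# trigger key (hypothetical) -> (target skills, priority, action), from the matrix
EVOLUTION_MATRIX = {
    "prd_pattern": (["prd-planner"], "high", "Add to quality checklist"),
    "debugging_fix": (["debugger"], "high", "Add to anti-patterns"),
    "perf_security_insight": (
        ["performance-engineer", "security-auditor"], "high", "Add to patterns",
    ),
    "ci_deploy_fix": (["deployment-engineer"], "medium", "Add to troubleshooting"),
}

PRIORITY_ORDER = {"high": 0, "medium": 1}


def plan_updates(triggers):
    """Return (skill, action) pairs, high-priority rows first."""
    rows = [EVOLUTION_MATRIX[trigger] for trigger in triggers]
    rows.sort(key=lambda row: PRIORITY_ORDER[row[1]])  # stable sort keeps input order within a priority
    return [(skill, action) for skills, _, action in rows for skill in skills]
```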

Multi-Memory Architecture

1. Semantic Memory (memory/semantic-patterns.json)

Stores abstract patterns and rules reusable across contexts:

```json
{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
```
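
A pattern could be added to this file along the following lines. This is a sketch, not the skill's own code: `add_pattern` is a hypothetical helper, the set of required fields is a guess based on the schema above, and the 0 to 1 confidence range is an assumption inferred from the `0.95` example.

```python
import json
from pathlib import Path

REQUIRED_FIELDS = {"id", "name", "source", "confidence", "pattern"}  # assumed minimum


def add_pattern(store_path: Path, key: str, pattern: dict) -> dict:
    """Insert or replace one pattern in the semantic memory file."""
    if store_path.exists():
        store = json.loads(store_path.read_text(encoding="utf-8"))
    else:
        store = {"patterns": {}}

    missing = REQUIRED_FIELDS - pattern.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not 0.0 <= pattern["confidence"] <= 1.0:  # assumed range
        raise ValueError("confidence must be between 0 and 1")

    store["patterns"][key] = pattern
    store_path.parent.mkdir(parents=True, exist_ok=True)
    store_path.write_text(
        json.dumps(store, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return store
```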

2. Episodic Memory (memory/episodic/)

Stores specific experiences and what happened:

```
memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json
```

```json
{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
```
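
Writing an episode to the layout shown in the tree might look like this. The sketch assumes the year subdirectory and the `YYYY-MM-DD-{slug}.json` file name seen in the tree. `episode_path`, `save_episode`, and the `slug` argument are hypothetical.

```python
import json
from datetime import datetime
from pathlib import Path


def episode_path(root: Path, timestamp: str, slug: str) -> Path:
    """Build <root>/<year>/<date>-<slug>.json from an ISO-8601 UTC timestamp."""
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return root / f"{when:%Y}" / f"{when:%Y-%m-%d}-{slug}.json"


def save_episode(root: Path, episode: dict, slug: str) -> Path:
    """Persist one episode record and return the file it was written to."""
    path = episode_path(root, episode["timestamp"], slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(episode, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path
```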

3. Working Memory (memory/working/)

Stores current session context:

```
memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker
```
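
As an illustration of how `last_error.json` might be populated, the sketch below records a failed command. The payload fields (`command`, `exit_code`, `stderr`, `captured_at`) are assumptions, since the source does not define the file's schema. `record_error` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def record_error(
    working_dir: Path, command: str, exit_code: int, stderr: str
) -> Optional[Path]:
    """Write last_error.json for a failed command; do nothing on success."""
    if exit_code == 0:
        return None
    working_dir.mkdir(parents=True, exist_ok=True)
    target = working_dir / "last_error.json"
    payload = {
        "command": command,
        "exit_code": exit_code,
        "stderr": stderr[-2000:],  # keep only the tail of long output
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target
```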

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

```yaml
What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}
```
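
The extraction template maps naturally onto a typed record. The following is a sketch: the `Experience` class is hypothetical, while the field names, the three outcome values, and the 1 to 10 rating range come from the template above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

OUTCOMES = {"success", "partial", "failure"}


@dataclass
class Experience:
    skill_used: str
    task: str
    outcome: str
    what_went_well: List[str] = field(default_factory=list)
    what_went_wrong: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None
    rating: Optional[int] = None   # 1-10, only when the user provided one
    comments: str = ""

    def __post_init__(self) -> None:
        if self.outcome not in OUTCOMES:
            raise ValueError(f"outcome must be one of {sorted(OUTCOMES)}")
        if self.rating is not None and not 1 <= self.rating <= 10:
            raise ValueError("rating must be between 1 and 10")
```

Validating at construction time keeps malformed records out of episodic memory.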

Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
| --- | --- | --- |
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

Abstraction Rules:

```yaml
If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
```
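
The abstraction rules translate directly into a small classifier. The source does not say whether the rules are mutually exclusive, so this sketch assumes they are independent and returns every level that applies. `classify` is a hypothetical name.

```python
from typing import List, Optional


def classify(repeats: int, solution_effective: bool, rating: Optional[int]) -> List[str]:
    """Return every pattern level whose rule matches the experience."""
    levels = []
    if repeats >= 3:
        levels.append("critical")        # goes to "Critical Mistakes"
    if solution_effective:
        levels.append("best_practice")   # goes to "Best Practices"
    if rating is not None and rating >= 7:
        levels.append("strength")        # reinforce this approach
    if rating is not None and rating <= 4:
        levels.append("weakness")        # goes to "What to Avoid"
    return levels
```

Ratings of 5 and 6 match neither rating rule, so on their own they produce no level.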

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

```markdown
<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->

## Pattern Added (2025-01-12)

Pattern: Always verify callbacks are not empty functions

Source: Episode ep-2025-01-12-001

Confidence: 0.95

### Updated Checklist

- Verify all callbacks have implementations
- Test callback execution paths
```

**Correction Markers** (when fixing wrong guidance):

````markdown
<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->

## Corrected Guidance

Use direct state monitoring instead of callback chains:

```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```
````
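
Because the markers are plain HTML comments with `key: value` fields separated by `|`, they are easy to generate and read back. The sketch below assumes that field values never contain `|` or `-->`. `format_evolution_marker` and `parse_marker` are hypothetical helpers.

```python
import re
from typing import Optional

MARKER_RE = re.compile(
    r"<!--\s*(Evolution|Correction):\s*(\d{4}-\d{2}-\d{2})\s*\|(.*?)-->"
)


def format_evolution_marker(date: str, source: str, skill: str) -> str:
    """Render an evolution marker in the format shown above."""
    return f"<!-- Evolution: {date} | source: {source} | skill: {skill} -->"


def parse_marker(line: str) -> Optional[dict]:
    """Parse an evolution or correction marker into a flat dictionary."""
    match = MARKER_RE.search(line)
    if match is None:
        return None
    kind, date, rest = match.groups()
    fields = {"kind": kind, "date": date}
    for part in rest.split("|"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    return fields
```

Parsing markers back out of skill files is one way to audit which episodes drove which changes.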

Phase 4: Memory Consolidation

Update semantic memory (`memory/semantic-patterns.json`)
Store episodic memory (`memory/episodic/YYYY-MM-DD-{skill}.json`)
Update pattern confidence based on applications/feedback
Prune outdated patterns (low confidence, no recent applications)
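
The source does not define how confidence is updated or when a pattern counts as outdated, so the following is one possible sketch. It uses an exponential moving average toward the normalized user rating and prunes patterns that are both low-confidence and idle. The weight, the thresholds, and the `last_applied` field are all assumptions.

```python
from datetime import date, timedelta


def update_confidence(confidence: float, rating: int, weight: float = 0.2) -> float:
    """Move confidence toward the rating, rescaled from 1-10 to 0-1."""
    target = (rating - 1) / 9
    return round((1 - weight) * confidence + weight * target, 4)


def prune(
    patterns: dict,
    today: date,
    min_confidence: float = 0.3,
    max_idle_days: int = 90,
) -> dict:
    """Drop patterns that are both low-confidence and not recently applied."""
    cutoff = today - timedelta(days=max_idle_days)
    return {
        key: pattern
        for key, pattern in patterns.items()
        if not (pattern["confidence"] < min_confidence
                and pattern["last_applied"] < cutoff)
    }
```

Requiring both conditions keeps a new, still-unproven pattern from being pruned before it has had a chance to be applied.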

Self-Correction (on_error hook)

Triggered when:

Bash command returns non-zero exit code
Tests fail after following skill guidance
User reports the guidance produced incorrect results

Process:

```markdown
## Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed
2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?
3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory
4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify
```

**Example:**

```markdown
<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->

## Self-Correction: Click-Time Computation

Issue: Using useMemo for claimable IDs caused stale data
Fix: Compute at click time for always-fresh data
Pattern: click_time_vs_open_time_computation
```

Self-Validation

Use the validation template in `references/appendix.md` when reviewing updates.

Hooks Integration

Wiring Hooks in Claude Code Settings

Add to Claude Code settings (`~/.claude/settings.json`):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}
```

Replace `${SKILLS_DIR}` with your actual skills path.
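
If you keep the settings as a template, the substitution can be scripted. The sketch below uses Python's `string.Template`, whose `safe_substitute` fills in `${SKILLS_DIR}` while leaving other `$NAME` references such as `$TOOL_NAME` untouched. `render_settings` is a hypothetical helper, and the path `/opt/skills` in the usage note is only an example.

```python
import json
from string import Template


def render_settings(template_text: str, skills_dir: str) -> dict:
    """Fill in ${SKILLS_DIR} and return the parsed settings.

    Assumes skills_dir contains no backslashes or double quotes, which
    would need escaping before being embedded in JSON.
    """
    rendered = Template(template_text).safe_substitute(SKILLS_DIR=skills_dir)
    return json.loads(rendered)
```

For example, `render_settings(template_text, "/opt/skills")` yields hook commands that start with `bash /opt/skills/self-improving-agent/hooks/`. Parsing the result with `json.loads` also catches a malformed template before it reaches the settings file.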

Additional References

See `references/appendix.md` for memory structure, workflow diagrams, metrics, feedback templates, and research links.

Best Practices

DO

✅ Learn from EVERY skill interaction
✅ Extract patterns at the right abstraction level
✅ Update multiple related skills
✅ Track confidence and application counts
✅ Ask for user feedback on improvements
✅ Use evolution/correction markers for traceability
✅ Validate guidance before applying broadly

DON'T

❌ Over-generalize from single experiences
❌ Update skills without confidence tracking
❌ Ignore negative feedback
❌ Make changes that break existing functionality
❌ Create contradictory patterns
❌ Update skills without understanding context

Quick Start

After any skill completes, this agent automatically:

Analyzes what happened
Extracts patterns and insights
Updates relevant skill files
Logs to memory for future reference
Reports summary to user
