self-improving-agent
Self-Improving Agent
"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." — Based on 2025 lifelong learning research
Overview
This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:
- Multi-Memory Architecture: Semantic + Episodic + Working memory
- Self-Correction: Detects and fixes skill guidance errors
- Self-Validation: Periodically verifies skill accuracy
- Hooks Integration: Auto-triggers on skill events (before_start, after_complete, on_error)
- Evolution Markers: Traceable changes with source attribution
Research-Based Design
Based on 2025 research:
| Research | Key Insight | Application |
|---|---|---|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |
The Self-Improvement Loop
┌─────────────────────────────────────────────────────────────────┐
│                   UNIVERSAL SELF-IMPROVEMENT                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Skill Event → Extract Experience → Abstract Pattern → Update   │
│       │                │                   │             │      │
│       ▼                ▼                   ▼             ▼      │
│  ┌──────────────────────────────────────────────────────┐       │
│  │                 MULTI-MEMORY SYSTEM                  │       │
│  ├──────────────────────────────────────────────────────┤       │
│  │ Semantic Memory  │ Episodic Memory  │ Working Memory │       │
│  │ (Patterns/Rules) │ (Experiences)    │ (Current)      │       │
│  │ memory/semantic/ │ memory/episodic/ │ memory/working/│       │
│  └──────────────────────────────────────────────────────┘       │
│                                                                 │
│  ┌──────────────────────────────────────────────────────┐       │
│  │                    FEEDBACK LOOP                     │       │
│  │  User Feedback → Confidence Update → Pattern Adapt   │       │
│  └──────────────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
When This Activates
Automatic Triggers (via hooks)
| Event | Trigger | Action |
|---|---|---|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit | Capture error context, trigger self-correction |
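The trigger table can be read as a small dispatch map. The sketch below is illustrative only: `SkillHookEvent`, `HOOK_ACTIONS`, and `actionsFor` are hypothetical names, and the action strings are labels for the table's actions rather than real commands.

```typescript
// Hypothetical names throughout; this models the trigger table, not a real API.
type SkillHookEvent = "before_start" | "after_complete" | "on_error";

// Each event maps to the ordered actions listed in the table.
const HOOK_ACTIONS: Record<SkillHookEvent, string[]> = {
  before_start: ["log_session_start"],
  after_complete: ["extract_patterns", "update_skills"],
  on_error: ["capture_error_context", "trigger_self_correction"],
};

// Returns the actions for an event. A Bash result only counts as an
// error event when its exit code is non-zero.
function actionsFor(hookEvent: SkillHookEvent, exitCode: number = 0): string[] {
  if (hookEvent === "on_error" && exitCode === 0) {
    return []; // zero exit code: nothing to correct
  }
  return HOOK_ACTIONS[hookEvent].slice(); // copy so callers cannot mutate the map
}
```

Keeping the mapping in one table makes it easy to check that every event in the settings file has a matching action.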
Manual Triggers
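- User says "自我进化" (self-evolve), "self-improve", or "从经验中学习" (learn from experience)
- User says "分析今天的经验" (analyze today's experience) or "总结教训" (summarize the lessons learned)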
- User asks to improve a specific skill
Evolution Priority Matrix
Trigger evolution when new reusable knowledge appears:
| Trigger | Target Skill | Priority | Action |
|---|---|---|---|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |
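One way to apply the matrix is as a lookup table that orders pending work by priority. This is a minimal sketch, not the skill's actual implementation: `EvolutionRoute`, `EVOLUTION_ROUTES`, and `planEvolutions` are hypothetical names, and only three rows of the matrix are copied in.

```typescript
type EvolutionPriority = "High" | "Medium";

interface EvolutionRoute {
  trigger: string;
  targetSkills: string[];
  priority: EvolutionPriority;
  action: string;
}

// Three rows copied from the matrix; the remaining rows follow the same shape.
const EVOLUTION_ROUTES: EvolutionRoute[] = [
  { trigger: "Debugging fix discovered", targetSkills: ["debugger"], priority: "High", action: "Add to anti-patterns" },
  { trigger: "Perf/security insight", targetSkills: ["performance-engineer", "security-auditor"], priority: "High", action: "Add to patterns" },
  { trigger: "CI/deploy fix", targetSkills: ["deployment-engineer"], priority: "Medium", action: "Add to troubleshooting" },
];

// Looks up each observed trigger and returns the matching routes,
// High priority first. Unknown triggers are ignored.
function planEvolutions(triggers: string[]): EvolutionRoute[] {
  const rank: Record<EvolutionPriority, number> = { High: 0, Medium: 1 };
  const matched: EvolutionRoute[] = [];
  for (const trigger of triggers) {
    for (const route of EVOLUTION_ROUTES) {
      if (route.trigger === trigger) {
        matched.push(route);
      }
    }
  }
  return matched.sort((a, b) => rank[a.priority] - rank[b.priority]);
}
```

For example, `planEvolutions(["CI/deploy fix", "Debugging fix discovered"])` returns the debugger route before the deployment-engineer route.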
Multi-Memory Architecture
1. Semantic Memory (memory/semantic-patterns.json)
Stores abstract patterns and rules reusable across contexts:
json
{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
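To show how this structure might be consumed, here is a hedged sketch of a typed view over the file plus a lookup by skill. The field names come from the JSON above; `patternsForSkill` and its default `minConfidence` of 0.5 are assumptions for illustration, and the elided `solution` and `quality_rules` fields are omitted because their shape is not specified.

```typescript
// Field names follow the JSON above; elided fields are left out on purpose.
interface SemanticPattern {
  id: string;
  name: string;
  source: string;
  confidence: number; // 0..1
  applications: number;
  created: string; // YYYY-MM-DD
  category: string;
  pattern: string;
  problem: string;
  target_skills: string[];
}

interface SemanticMemory {
  patterns: Record<string, SemanticPattern>;
}

// Hypothetical helper: patterns that target a skill, most confident first.
// The 0.5 default threshold is an assumption, not a value from the source.
function patternsForSkill(
  memory: SemanticMemory,
  skill: string,
  minConfidence: number = 0.5
): SemanticPattern[] {
  return Object.keys(memory.patterns)
    .map((key) => memory.patterns[key])
    .filter((p) => p.target_skills.indexOf(skill) !== -1 && p.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence);
}
```

Sorting by confidence means a skill sees its best-supported guidance first when it loads its patterns.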
2. Episodic Memory (memory/episodic/)
Stores specific experiences and what happened:
memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json
json
{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
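The directory listing and the record above suggest a naming scheme of year folder, date prefix, and a short label. The sketch below derives such a path from an episode; `episodePath` and its `label` parameter are hypothetical, and it assumes the timestamp is an ISO 8601 string in UTC.

```typescript
interface EpisodeRecord {
  id: string;
  timestamp: string; // ISO 8601, e.g. "2025-01-11T10:30:00Z"
  skill: string;
  situation: string;
  root_cause: string;
  solution: string;
  lesson: string;
  related_pattern: string;
  user_feedback?: { rating: number; comments: string };
}

// Builds a path such as memory/episodic/2025/2025-01-11-debug-session.json.
// `label` is a hypothetical short session description ("debug-session").
function episodePath(episode: EpisodeRecord, label: string): string {
  const day = episode.timestamp.slice(0, 10); // "YYYY-MM-DD"
  const year = day.slice(0, 4);
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything unsafe into a dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return "memory/episodic/" + year + "/" + day + "-" + slug + ".json";
}
```

Deriving the path from the timestamp keeps episodes sorted chronologically within each year folder.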
3. Working Memory (memory/working/)
Stores current session context:
memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker
Self-Improvement Process
Phase 1: Experience Extraction
After any skill completes, extract:
yaml
What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}
Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}
User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}
Phase 2: Pattern Abstraction
Convert experiences to reusable patterns:
| Concrete Experience | Abstract Pattern | Target Skill |
|---|---|---|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |
Abstraction Rules:
yaml
If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section
If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section
If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach
If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
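The abstraction rules can be expressed as a small classifier. This sketch assumes the four rules are checked independently, so one experience can receive several levels; the source does not say whether they are mutually exclusive. `ExperienceSignals` and `classifyExperience` are hypothetical names.

```typescript
type PatternLevel = "critical" | "best_practice" | "strength" | "weakness";

interface ExperienceSignals {
  repeats: number; // how many times this experience has occurred
  solutionWasEffective: boolean;
  userRating?: number; // 1-10, only if the user provided one
}

// Applies every rule that matches. Ratings of 5 or 6 match neither
// rating rule, mirroring the gap in the rules above.
function classifyExperience(signals: ExperienceSignals): PatternLevel[] {
  const levels: PatternLevel[] = [];
  if (signals.repeats >= 3) {
    levels.push("critical");
  }
  if (signals.solutionWasEffective) {
    levels.push("best_practice");
  }
  if (signals.userRating !== undefined) {
    if (signals.userRating >= 7) {
      levels.push("strength");
    }
    if (signals.userRating <= 4) {
      levels.push("weakness");
    }
  }
  return levels;
}
```

Returning a list rather than a single level lets the caller update several sections of a skill file from one experience.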
Phase 3: Skill Updates
Update the appropriate skill files with evolution markers:
markdown
<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->
Pattern Added (2025-01-12)
Pattern: Always verify callbacks are not empty functions
Source: Episode ep-2025-01-12-001
Confidence: 0.95
Updated Checklist
- Verify all callbacks have implementations
- Test callback execution paths
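Because the evolution marker is a single structured comment line, it can be written and read back mechanically. The following is a minimal sketch assuming exactly the three fields shown in the example marker; `formatEvolutionMarker` and `parseEvolutionMarker` are hypothetical helpers.

```typescript
interface EvolutionMarker {
  date: string; // YYYY-MM-DD
  source: string; // episode id, e.g. "ep-2025-01-12-001"
  skill: string;
}

// Renders the marker in the same layout as the example above.
function formatEvolutionMarker(marker: EvolutionMarker): string {
  return "<!-- Evolution: " + marker.date +
    " | source: " + marker.source +
    " | skill: " + marker.skill + " -->";
}

// Returns null when the line is not an evolution marker.
function parseEvolutionMarker(line: string): EvolutionMarker | null {
  const match = /^<!-- Evolution: (\d{4}-\d{2}-\d{2}) \| source: (\S+) \| skill: (\S+) -->$/.exec(line.trim());
  if (match === null) {
    return null;
  }
  return { date: match[1], source: match[2], skill: match[3] };
}
```

Parsing markers back out of a skill file is what makes each change traceable to the episode that produced it.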
**Correction Markers** (when fixing wrong guidance):
markdown
<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->
Corrected Guidance
Use direct state monitoring instead of callback chains:
typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
Phase 4: Memory Consolidation
- Update semantic memory (memory/semantic-patterns.json)
- Store episodic memory (memory/episodic/YYYY-MM-DD-{skill}.json)
- Update pattern confidence based on applications/feedback
- Prune outdated patterns (low confidence, no recent applications)
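The source does not give a formula for the confidence update or thresholds for pruning, so the sketch below picks simple stand-ins: confidence moves a fixed fraction toward the normalized rating, and a pattern is pruned only when it is both low-confidence and stale. The `lastApplied` field, the 0.2 weight, the 0.3 confidence floor, and the 90-day window are all assumptions.

```typescript
interface TrackedPattern {
  id: string;
  confidence: number; // 0..1
  applications: number;
  lastApplied: string; // YYYY-MM-DD; hypothetical field, not in the schema above
}

// Moves confidence a fraction `weight` of the way toward rating/10
// and records the application. Returns a new object.
function applyFeedback(p: TrackedPattern, rating: number, today: string, weight: number = 0.2): TrackedPattern {
  const target = Math.min(Math.max(rating, 1), 10) / 10; // clamp to the 1-10 scale
  const confidence = p.confidence + weight * (target - p.confidence);
  return { id: p.id, confidence: confidence, applications: p.applications + 1, lastApplied: today };
}

// Keeps a pattern unless it is both low-confidence and not applied recently.
function prunePatterns(
  patterns: TrackedPattern[],
  today: string,
  minConfidence: number = 0.3,
  maxIdleDays: number = 90
): TrackedPattern[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  return patterns.filter((p) => {
    const idleDays = (Date.parse(today) - Date.parse(p.lastApplied)) / msPerDay;
    return !(p.confidence < minConfidence && idleDays > maxIdleDays);
  });
}
```

Requiring both conditions before pruning protects a new pattern that has low confidence only because it has had few applications so far.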
Self-Correction (on_error hook)
Triggered when:
- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results
Process:
markdown
Self-Correction Workflow
1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed
2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?
3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory
4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify
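The "Add correction marker with reason" step can be made mechanical. This is a hedged sketch that follows the Correction comment layout used in this document; `CorrectionNote` and `formatCorrectionMarker` are hypothetical names, and rejecting an empty reason is an assumption drawn from the step's wording.

```typescript
interface CorrectionNote {
  date: string; // YYYY-MM-DD
  was: string; // the guidance being replaced
  reason: string; // why the old guidance was wrong
}

// Renders a correction marker. Throws if no reason is given, since the
// workflow asks for a marker "with reason".
function formatCorrectionMarker(note: CorrectionNote): string {
  if (note.reason.trim().length === 0) {
    throw new Error("A correction marker needs a reason");
  }
  return "<!-- Correction: " + note.date +
    ' | was: "' + note.was + '"' +
    " | reason: " + note.reason.trim() + " -->";
}
```

Recording what the guidance used to say keeps the old advice visible to reviewers without leaving it active in the skill.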
**Example:**
markdown
<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->
Self-Correction: Click-Time Computation
Issue: Using useMemo for claimable IDs caused stale data
Fix: Compute at click time for always-fresh data
Pattern: click_time_vs_open_time_computation
Self-Validation
Use the validation template in references/appendix.md when reviewing updates.
Hooks Integration
Wiring Hooks in Claude Code Settings
Add to Claude Code settings (~/.claude/settings.json):
json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}
Replace ${SKILLS_DIR} with your actual skills path.
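The hook scripts themselves are not shown in this document. As an illustration of the decision a post-Bash hook has to make, the sketch below takes the same two inputs the command line passes (tool output and exit code) and decides what, if anything, to record for self-correction. The `LastErrorContext` shape, `captureBashError`, and the 2000-character cap are assumptions; the sketch does not model how Claude Code delivers data to a hook.

```typescript
// Hypothetical shape for memory/working/last_error.json; the source does not define it.
interface LastErrorContext {
  exitCode: number;
  outputTail: string; // last part of the command output
  capturedAt: string; // ISO 8601 timestamp
}

// Returns the context to persist when the command failed, or null on success.
// Only the tail of the output is kept so one noisy command cannot bloat working memory.
function captureBashError(
  toolOutput: string,
  exitCode: number,
  now: Date,
  maxChars: number = 2000
): LastErrorContext | null {
  if (exitCode === 0) {
    return null; // success: nothing for on_error to handle
  }
  const start = Math.max(toolOutput.length - maxChars, 0);
  return {
    exitCode: exitCode,
    outputTail: toolOutput.slice(start),
    capturedAt: now.toISOString(),
  };
}
```

A null result means the on_error path is skipped entirely, which matches the "non-zero exit" condition in the trigger table.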
Additional References
See references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.
Best Practices
DO
- ✅ Learn from EVERY skill interaction
- ✅ Extract patterns at the right abstraction level
- ✅ Update multiple related skills
- ✅ Track confidence and apply counts
- ✅ Ask for user feedback on improvements
- ✅ Use evolution/correction markers for traceability
- ✅ Validate guidance before applying broadly
DON'T
- ❌ Over-generalize from single experiences
- ❌ Update skills without confidence tracking
- ❌ Ignore negative feedback
- ❌ Make changes that break existing functionality
- ❌ Create contradictory patterns
- ❌ Update skills without understanding context
Quick Start
After any skill completes, this agent automatically:
- Analyzes what happened
- Extracts patterns and insights
- Updates relevant skill files
- Logs to memory for future reference
- Reports summary to user