hermes-agent-self-evolution

Hermes Agent Self-Evolution

Skill by ara.so, part of the Hermes Skills collection.

Hermes Agent Self-Evolution provides evolutionary self-improvement for Hermes Agent using DSPy + GEPA (Genetic-Pareto Prompt Evolution). It automatically evolves and optimizes skills, tool descriptions, system prompts, and code through reflective evolutionary search. No GPU training is required; everything runs through API calls.

What It Does

  • Skill Evolution: Optimizes SKILL.md files using execution traces and targeted mutations
  • Prompt Optimization: Improves system prompts and tool descriptions through evolutionary search
  • Code Evolution: Planned support for code-level optimization via Darwinian Evolver
  • Trace-Based Learning: Analyzes why things fail, not just that they failed
  • Cost-Effective: ~$2-10 per optimization run using LLM APIs

Installation

```bash
# Clone the repository, then run the following from its root

# Install with development dependencies
pip install -e ".[dev]"

# Set required environment variables
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
export OPENAI_API_KEY=your_openai_api_key
```

Core Workflow

The evolution pipeline follows this flow:
  1. Read current skill/prompt/tool definition
  2. Generate evaluation dataset (synthetic or from session history)
  3. Run GEPA optimizer to create candidate variants
  4. Evaluate variants against execution traces
  5. Apply constraint gates (tests, size limits, benchmarks)
  6. Select best variant and generate PR
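
The six steps above can be sketched as a small loop. This is a toy illustration rather than the project's implementation: `propose_variants`, `score`, and `passes_gates` are hypothetical stand-ins (a real run would use GEPA and LLM calls), and the scoring rule here is plain keyword coverage.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    score: float = 0.0


def propose_variants(current: str) -> list[Candidate]:
    """Hypothetical stand-in for step 3: a real run would ask the optimizer for mutations."""
    additions = ["Check for SQL injection.", "Check test coverage.", "Be concise."]
    return [Candidate(f"{current} {extra}") for extra in additions]


def score(candidate: Candidate, eval_set: list[str]) -> float:
    """Hypothetical stand-in for step 4: fraction of expected keywords the text mentions."""
    hits = sum(1 for keyword in eval_set if keyword.lower() in candidate.text.lower())
    return hits / len(eval_set)


def passes_gates(candidate: Candidate, max_chars: int = 200) -> bool:
    """Hypothetical stand-in for step 5: only a size limit is checked here."""
    return len(candidate.text) <= max_chars


def run_pipeline(current: str, eval_set: list[str]) -> Candidate:
    baseline = Candidate(current)
    baseline.score = score(baseline, eval_set)
    candidates = propose_variants(current)
    for candidate in candidates:
        candidate.score = score(candidate, eval_set)
    survivors = [c for c in candidates if passes_gates(c)]
    # Step 6: keep the baseline unless a surviving variant scores higher
    return max(survivors + [baseline], key=lambda c: c.score)


best = run_pipeline(
    "Review GitHub pull requests for code quality.",
    eval_set=["SQL injection", "code quality"],
)
print(best.text)
```

Keeping the baseline in the final comparison means a run can never select something that scores worse than the current text.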

Evolving Skills

Basic Skill Evolution with Synthetic Data

From the command line:

```bash
python -m evolution.skills.evolve_skill \
  --skill github-code-review \
  --iterations 10 \
  --eval-source synthetic
```

Or programmatically:

```python
from evolution.skills.evolve_skill import evolve_skill

result = evolve_skill(
    skill_name="github-code-review",
    iterations=10,
    eval_source="synthetic",
    hermes_repo_path="~/.hermes/hermes-agent",
)

print(f"Best variant score: {result.best_score}")
print(f"Improvement: {result.improvement_pct}%")
```

Using Real Session History

```bash
# Use actual session data from Claude Code, Copilot, Hermes
python -m evolution.skills.evolve_skill \
  --skill github-code-review \
  --iterations 10 \
  --eval-source sessiondb \
  --session-db-path ~/.hermes/sessions.db
```

Custom Evaluation Dataset

```python
from evolution.skills.evolve_skill import evolve_skill
from evolution.eval.dataset import EvalDataset, EvalExample

# Create custom evaluation examples
dataset = EvalDataset(examples=[
    EvalExample(
        input_query="Review this PR for security issues",
        expected_behavior="Check for SQL injection, XSS, secrets in code",
        context={"pr_url": "https://github.com/org/repo/pull/123"},
    ),
    EvalExample(
        input_query="Analyze code quality in this commit",
        expected_behavior="Check complexity, test coverage, documentation",
        context={"commit_sha": "abc123"},
    ),
])

result = evolve_skill(
    skill_name="github-code-review",
    custom_dataset=dataset,
    iterations=10,
)
```

DSPy + GEPA Integration

Understanding GEPA

GEPA (Genetic-Pareto Prompt Evolution) reads execution traces to understand failures and propose targeted improvements:

```python
from evolution.gepa.optimizer import GEPAOptimizer
from evolution.gepa.trace import ExecutionTrace

# Initialize GEPA optimizer
optimizer = GEPAOptimizer(
    population_size=20,
    mutation_rate=0.3,
    crossover_rate=0.5,
)

# Load execution traces
traces = [
    ExecutionTrace(
        input="Review PR #123",
        output="Checked syntax only",
        error="Failed to identify security issues",
        metadata={"skill": "github-code-review"},
    )
]

# Generate improved variants
variants = optimizer.evolve(
    current_prompt="Review GitHub pull requests for code quality",
    traces=traces,
    num_generations=10,
)

best_variant = variants[0]
print(f"Improved prompt: {best_variant.text}")
print(f"Fitness score: {best_variant.fitness}")
```

Custom Mutation Strategies

```python
from evolution.gepa.optimizer import GEPAOptimizer
from evolution.gepa.mutations import (
    AddDetailMutation,
    SimplifyMutation,
    ReframeMutation,
    ExampleMutation,
)

optimizer = GEPAOptimizer(
    mutations=[
        AddDetailMutation(weight=0.4),
        SimplifyMutation(weight=0.2),
        ReframeMutation(weight=0.2),
        ExampleMutation(weight=0.2),
    ]
)
```
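
One plausible reading of the `weight` arguments is that they set how often each mutation type is sampled. The sketch below illustrates that interpretation using only the standard library. How `GEPAOptimizer` actually consumes the weights is not specified here, and the `pick_mutations` helper and the string labels are hypothetical.

```python
import random
from collections import Counter

# Weights copied from the mutation example; treating them as sampling
# probabilities is an assumption, not documented behavior.
mutation_weights = {
    "AddDetailMutation": 0.4,
    "SimplifyMutation": 0.2,
    "ReframeMutation": 0.2,
    "ExampleMutation": 0.2,
}


def pick_mutations(weights: dict[str, float], k: int, seed: int = 0) -> list[str]:
    """Draw k mutation names, each chosen with probability proportional to its weight."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


counts = Counter(pick_mutations(mutation_weights, k=10_000))
for name, n in counts.most_common():
    print(f"{name}: {n / 10_000:.2f}")
```

Under this reading, raising one weight shifts the search toward that kind of edit without disabling the others.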

Configuration

Evolution Config File

Create `evolution_config.yaml`:

```yaml
# Optimization parameters
gepa:
  population_size: 20
  num_generations: 10
  mutation_rate: 0.3
  crossover_rate: 0.5
  elitism: 2  # Keep top 2 variants

# Constraint gates
constraints:
  max_skill_size_kb: 15
  max_tool_description_chars: 500
  min_test_pass_rate: 1.0
  semantic_drift_threshold: 0.15

# Evaluation
evaluation:
  metrics:
    - accuracy
    - execution_success
    - response_quality
  weights:
    accuracy: 0.4
    execution_success: 0.4
    response_quality: 0.2

# API settings
api:
  provider: openai  # or anthropic, together
  model: gpt-4
  temperature: 0.7
  max_tokens: 2000
```

Load and use:

```python
from evolution.config import EvolutionConfig
from evolution.skills.evolve_skill import evolve_skill

config = EvolutionConfig.from_yaml("evolution_config.yaml")

result = evolve_skill(
    skill_name="github-code-review",
    config=config,
)
```
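
The `evaluation.weights` block suggests that per-metric scores are combined as a weighted sum. A minimal sketch of that arithmetic, assuming each metric is normalized to the range 0 to 1 and the weights sum to 1 (the `combined_score` helper is hypothetical and not part of the package):

```python
import math

# Weights taken from the evaluation section of the config example
weights = {"accuracy": 0.4, "execution_success": 0.4, "response_quality": 0.2}


def combined_score(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of metric scores; rejects unnormalized weights and missing metrics."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(metrics)
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return sum(weights[name] * metrics[name] for name in weights)


score = combined_score(
    {"accuracy": 0.9, "execution_success": 1.0, "response_quality": 0.5},
    weights,
)
print(round(score, 2))  # 0.4*0.9 + 0.4*1.0 + 0.2*0.5 = 0.86
```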

Guardrails and Constraints

All evolved variants must pass these gates:

```python
from evolution.constraints import (
    TestSuiteConstraint,
    SizeLimitConstraint,
    SemanticPreservationConstraint,
    CachingCompatibilityConstraint,
)
from evolution.validation import validate_variant

constraints = [
    TestSuiteConstraint(
        test_command="pytest tests/ -q",
        required_pass_rate=1.0,
    ),
    SizeLimitConstraint(
        max_skill_kb=15,
        max_tool_desc_chars=500,
    ),
    SemanticPreservationConstraint(
        drift_threshold=0.15,
        embedding_model="text-embedding-3-small",
    ),
    CachingCompatibilityConstraint(
        allow_mid_conversation_changes=False,
    ),
]

# Validate a variant
is_valid, violations = validate_variant(
    variant_text="...",
    constraints=constraints,
)

if not is_valid:
    print(f"Constraint violations: {violations}")
```
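
To make the size gate concrete, here is a standalone sketch of what such a check might compute. It is not the package's `SizeLimitConstraint`. The `size_violations` helper is hypothetical, and counting 1 KB as 1024 UTF-8 bytes is an assumption.

```python
def size_violations(
    skill_text: str,
    tool_description: str,
    max_skill_kb: int = 15,
    max_tool_desc_chars: int = 500,
) -> list[str]:
    """Return human-readable violations; an empty list means the gate passes."""
    violations = []
    skill_kb = len(skill_text.encode("utf-8")) / 1024  # assumption: 1 KB = 1024 bytes
    if skill_kb > max_skill_kb:
        violations.append(f"skill is {skill_kb:.1f} KB, limit is {max_skill_kb} KB")
    if len(tool_description) > max_tool_desc_chars:
        violations.append(
            f"tool description is {len(tool_description)} chars, "
            f"limit is {max_tool_desc_chars}"
        )
    return violations


ok = size_violations("# Skill\n" + "x" * 1000, "Reviews pull requests.")
too_big = size_violations("x" * 20_000, "d" * 600)
print(ok)
print(too_big)
```

Returning every violation, not just the first, gives the optimizer or a human reviewer the full picture in one pass.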

Monitoring Evolution Progress

```python
from evolution.callbacks import (
    LoggingCallback,
    MetricsCallback,
    CheckpointCallback,
)
from evolution.skills.evolve_skill import evolve_skill

callbacks = [
    LoggingCallback(verbose=True),
    MetricsCallback(
        track_metrics=["fitness", "diversity", "constraint_violations"]
    ),
    CheckpointCallback(
        checkpoint_dir="./checkpoints",
        save_every=5,  # Save every 5 generations
    ),
]

result = evolve_skill(
    skill_name="github-code-review",
    iterations=20,
    callbacks=callbacks,
)

# Access metrics
print(result.metrics_history)
```

Generating Evaluation Datasets

Synthetic Data Generation

```python
from evolution.eval.synthetic import SyntheticDataGenerator

generator = SyntheticDataGenerator(
    skill_name="github-code-review",
    num_examples=50,
    difficulty_distribution={
        "easy": 0.3,
        "medium": 0.5,
        "hard": 0.2,
    },
)

dataset = generator.generate()
dataset.save("eval_datasets/github-code-review.json")
```
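
A difficulty distribution has to be turned into whole-number counts at some point. The sketch below shows one way to do that with largest-remainder rounding, so the counts always add up to `num_examples`. The `allocate_examples` helper is hypothetical, and whether `SyntheticDataGenerator` rounds this way is not stated.

```python
def allocate_examples(num_examples: int, distribution: dict[str, float]) -> dict[str, int]:
    """Split num_examples across levels in proportion to their shares."""
    raw = {level: num_examples * share for level, share in distribution.items()}
    counts = {level: int(value) for level, value in raw.items()}
    # Hand out any examples lost to truncation, largest fractional part first
    leftover = num_examples - sum(counts.values())
    by_remainder = sorted(raw, key=lambda level: raw[level] - counts[level], reverse=True)
    for level in by_remainder[:leftover]:
        counts[level] += 1
    return counts


counts = allocate_examples(50, {"easy": 0.3, "medium": 0.5, "hard": 0.2})
print(counts)
```

With the distribution from the example, 50 examples split into 15 easy, 25 medium, and 10 hard.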

From Session History

```python
from evolution.eval.session_extractor import SessionExtractor

extractor = SessionExtractor(
    session_db_path="~/.hermes/sessions.db",
    skill_filter="github-code-review",
)

# Extract examples from last 30 days
dataset = extractor.extract(
    days_back=30,
    min_quality_score=0.7,
    max_examples=100,
)
```
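
The three `extract` parameters read like a recency filter, a quality filter, and a cap. The sketch below shows that selection logic on in-memory records. The record fields (`query`, `quality`, `timestamp`) and the `select_examples` helper are hypothetical, and the real session database schema is not described here.

```python
from datetime import datetime, timedelta


def select_examples(records, now, days_back=30, min_quality_score=0.7, max_examples=100):
    """Keep recent, high-quality records and return the best ones first."""
    cutoff = now - timedelta(days=days_back)
    kept = [
        r for r in records
        if r["timestamp"] >= cutoff and r["quality"] >= min_quality_score
    ]
    kept.sort(key=lambda r: r["quality"], reverse=True)
    return kept[:max_examples]


now = datetime(2025, 1, 31)
records = [
    {"query": "Review PR for secrets", "quality": 0.9, "timestamp": datetime(2025, 1, 20)},
    {"query": "Old session", "quality": 0.95, "timestamp": datetime(2024, 11, 1)},
    {"query": "Low quality session", "quality": 0.4, "timestamp": datetime(2025, 1, 25)},
]
selected = select_examples(records, now)
print([r["query"] for r in selected])
```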

Advanced Patterns

Multi-Objective Optimization

```python
from evolution.gepa.optimizer import GEPAOptimizer
from evolution.objectives import (
    AccuracyObjective,
    LatencyObjective,
    TokenEfficiencyObjective,
)

optimizer = GEPAOptimizer(
    objectives=[
        AccuracyObjective(weight=0.5),
        LatencyObjective(weight=0.3),
        TokenEfficiencyObjective(weight=0.2),
    ],
    pareto_frontier=True,  # Find Pareto-optimal solutions
)

# current_prompt and traces are defined as in the GEPA example earlier
variants = optimizer.evolve(current_prompt, traces, num_generations=15)

# Get Pareto frontier
pareto_variants = [v for v in variants if v.is_pareto_optimal]
```
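
For readers unfamiliar with the term, a variant is Pareto-optimal when no other variant is at least as good on every objective and strictly better on at least one. The sketch below computes that set by brute force. The `Variant` tuple and its numbers are made up for illustration and are unrelated to the package's variant objects.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str
    accuracy: float   # higher is better
    latency_s: float  # lower is better
    tokens: int       # lower is better


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.latency_s <= b.latency_s and a.tokens <= b.tokens
    better = a.accuracy > b.accuracy or a.latency_s < b.latency_s or a.tokens < b.tokens
    return no_worse and better


def pareto_front(variants: list[Variant]) -> list[Variant]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


candidates = [
    Variant("A", accuracy=0.90, latency_s=2.0, tokens=900),
    Variant("B", accuracy=0.85, latency_s=1.0, tokens=600),
    Variant("C", accuracy=0.80, latency_s=2.5, tokens=950),  # worse than A on every objective
    Variant("D", accuracy=0.90, latency_s=2.0, tokens=700),  # matches A but uses fewer tokens
]
front = pareto_front(candidates)
print([v.name for v in front])
```

Here B and D both survive: B is faster and cheaper, D is more accurate, and neither beats the other on all three objectives.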

Batch Evolution of Multiple Skills

```python
from evolution.batch import batch_evolve_skills

skills_to_evolve = [
    "github-code-review",
    "python-debugging",
    "api-design",
    "docker-optimization",
]

results = batch_evolve_skills(
    skills=skills_to_evolve,
    iterations=10,
    parallel=True,
    max_workers=4,
)

for skill, result in results.items():
    print(f"{skill}: {result.improvement_pct}% improvement")
```
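
Since each evolution run is dominated by API calls, a thread pool is a natural way to implement the `parallel` and `max_workers` options. The sketch below shows the general pattern with `concurrent.futures`. It is not the package's implementation, and `evolve_one` is a hypothetical placeholder that returns a dummy number, not a real measurement.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def evolve_one(skill: str) -> float:
    """Hypothetical stand-in for one evolution run; the return value is a dummy."""
    return float(len(skill) % 7)


def batch_evolve(skills: list[str], max_workers: int = 4) -> dict[str, object]:
    """Run evolve_one for each skill in parallel, collecting results or exceptions."""
    outcomes: dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(evolve_one, skill): skill for skill in skills}
        for future in as_completed(futures):
            skill = futures[future]
            try:
                outcomes[skill] = future.result()
            except Exception as exc:  # one failed skill should not abort the batch
                outcomes[skill] = exc
    return outcomes


outcomes = batch_evolve(["github-code-review", "python-debugging", "api-design"])
for skill, outcome in sorted(outcomes.items()):
    print(skill, outcome)
```

Capturing exceptions per skill keeps one rate-limited or failing run from discarding the results of the others.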

Integration with CI/CD

```python
# evolution_pipeline.py
import os

from evolution.ci import create_evolution_pr
from evolution.skills.evolve_skill import evolve_skill

# Run in GitHub Actions or similar
if __name__ == "__main__":
    result = evolve_skill(
        skill_name="github-code-review",
        iterations=10,
        eval_source="synthetic",
    )

    if result.improvement_pct > 5.0:  # Only PR if >5% improvement
        pr = create_evolution_pr(
            skill_name="github-code-review",
            variant_text=result.best_variant,
            metrics=result.metrics,
            repo_path=os.getenv("HERMES_AGENT_REPO"),
            branch_name=f"evolution/github-code-review-{result.run_id}",
        )
        print(f"Created PR: {pr.url}")
```

Troubleshooting

Evolution Gets Stuck in Local Optimum

```python
from evolution.gepa.optimizer import GEPAOptimizer

# Increase mutation rate and diversity
optimizer = GEPAOptimizer(
    mutation_rate=0.5,     # Higher mutation
    diversity_bonus=0.1,   # Reward novel variants
    restart_threshold=5,   # Restart if no improvement for 5 gens
)
```
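
The `restart_threshold` idea can be illustrated with a small stagnation detector: count how many generations have passed since the best fitness last improved, and restart once that count reaches the threshold. Both helpers are hypothetical sketches, and the optimizer's own restart logic may differ.

```python
def generations_since_improvement(best_per_generation: list[float], tolerance: float = 1e-6) -> int:
    """Count trailing generations whose best fitness did not beat the running best."""
    best_so_far = float("-inf")
    stalled = 0
    for fitness in best_per_generation:
        if fitness > best_so_far + tolerance:
            best_so_far = fitness
            stalled = 0
        else:
            stalled += 1
    return stalled


def should_restart(best_per_generation: list[float], restart_threshold: int = 5) -> bool:
    return generations_since_improvement(best_per_generation) >= restart_threshold


history = [0.52, 0.61, 0.66, 0.66, 0.65, 0.66, 0.66, 0.66]
print(generations_since_improvement(history))
print(should_restart(history))
```

The small `tolerance` keeps floating-point noise from being counted as progress.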

Variants Fail Constraint Gates

```python
# Debug constraint failures
from evolution.debug import diagnose_constraints

diagnosis = diagnose_constraints(
    variant_text="...",
    constraints=constraints,
    verbose=True,
)

print(diagnosis.summary())
```

API Rate Limits

```python
import os

from evolution.api import RateLimitedClient

client = RateLimitedClient(
    provider="openai",
    api_key=os.getenv("OPENAI_API_KEY"),
    requests_per_minute=50,
    retry_on_limit=True,
    backoff_factor=2.0,
)
```
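
The `retry_on_limit` and `backoff_factor` options suggest exponential backoff. A standalone sketch of that retry pattern follows. `RateLimitError` and `call_with_backoff` are hypothetical names, and the one-second base delay and 10% jitter are assumptions and not documented defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error type standing in for a provider's HTTP 429 response."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, backoff_factor=2.0, sleep=time.sleep):
    """Call func, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise
            delay = base_delay * (backoff_factor ** attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))  # small jitter to spread retries


# Demo: fail twice, then succeed; record the delays and skip the actual sleeping
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)
print(result, len(delays))
```

Passing `sleep` as a parameter makes the retry schedule testable without waiting in real time.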

Low Quality Evaluation Data

```python
# Filter and augment evaluation dataset
from evolution.eval.dataset import EvalDataset
from evolution.eval.quality import filter_by_quality, augment_dataset

dataset = EvalDataset.load("eval_datasets/raw.json")
dataset = filter_by_quality(dataset, min_score=0.7)
dataset = augment_dataset(dataset, augmentation_factor=2)
```

Environment Variables

```bash
# Required
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
export OPENAI_API_KEY=your_api_key

# Optional
export EVOLUTION_CONFIG_PATH=./evolution_config.yaml
export EVOLUTION_CHECKPOINT_DIR=./checkpoints
export EVOLUTION_LOG_LEVEL=INFO
export SESSION_DB_PATH=~/.hermes/sessions.db
```

Integration with Hermes Agent

Evolved skills automatically integrate with Hermes Agent:

```python
# After evolution completes, the improved skill is available
from hermes_agent import HermesAgent

agent = HermesAgent()
agent.load_skill("github-code-review")  # Uses evolved version

response = agent.chat("Review this PR for security issues")
```

Project Structure

hermes-agent-self-evolution/
├── evolution/
│   ├── skills/           # Skill evolution
│   ├── prompts/          # Prompt optimization
│   ├── tools/            # Tool description evolution
│   ├── gepa/             # GEPA optimizer implementation
│   ├── eval/             # Evaluation datasets and metrics
│   ├── constraints/      # Constraint gates
│   └── callbacks/        # Evolution callbacks
├── tests/                # Test suite
├── eval_datasets/        # Evaluation data
└── evolution_config.yaml # Default configuration