metaclaw-evolving-agent


# MetaClaw Evolving Agent

*Skill by ara.so — Daily 2026 Skills collection*

MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.


## Installation

```bash
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```

---

## Quick Start

```bash
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl

# RL without scheduler (same as above, explicit)
metaclaw start --mode rl
```

After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:<port>` instead of the upstream LLM endpoint.

---

## Configuration

`metaclaw setup` writes a config file (default: `~/.metaclaw/config.yaml`). You can also edit it directly:

```yaml
# ~/.metaclaw/config.yaml
proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi            # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5           # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto             # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false        # optional teacher distillation

scheduler:                  # madmax mode only
  enabled: true
  sleep_hours: [22, 7]      # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false    # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```

### Environment Variables

```bash
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"          # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"              # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"  # scheduler
```

## Operating Modes

| Mode | Command | GPU Required | Description |
|------|---------|--------------|-------------|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |

## Python API

### Programmatic startup

```python
import asyncio
from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```

### Manual skill injection

```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"],
)

# Retrieve the top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```

### Intercepting and recording conversations

```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record   # called after each turn with (messages, response)
)
```

`buffer.record` signature:

```python
async def on_complete(messages: list[dict], response: dict) -> None: ...
```
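The `max_size` cap above implies an eviction policy once the buffer fills. A minimal sketch, assuming oldest-first eviction (`ToyBuffer` is a hypothetical name; the real `ExperienceBuffer` also stores metadata and split assignments):

```python
from collections import deque

class ToyBuffer:
    """Toy ring buffer: once max_size turns are stored, the oldest is evicted."""

    def __init__(self, max_size: int):
        self.items = deque(maxlen=max_size)  # deque drops from the left when full

    def record(self, messages: list[dict], response: dict) -> None:
        self.items.append((messages, response))
```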

### Triggering RL training manually

```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",       # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)

# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")   # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```

### Reward modeling

```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```
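Since the score is documented to land in [-1.0, 1.0], the judge LLM's free-text reply has to be parsed and clamped at some point. A hedged sketch of that step (`parse_judge_score` is purely illustrative, not part of the MetaClaw API):

```python
import re

def parse_judge_score(raw: str) -> float:
    """Extract the first number from an LLM judge reply; clamp to [-1.0, 1.0]."""
    match = re.search(r"-?\d+(?:\.\d+)?", raw)
    if match is None:
        return 0.0  # neutral reward when the reply contains no number
    return max(-1.0, min(1.0, float(match.group())))
```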

## Skills Lifecycle

```
Conversation turn
  SkillInjector.retrieve()   ← vector search over SkillStore
        │  injects top-k skills into system prompt
  LLM responds
  ExperienceBuffer.record()  ← stores (context, response, metadata)
        ▼ (end of session)
  SkillSummarizer.run()      ← LLM extracts reusable patterns
  SkillStore.upsert()        ← new/updated skills persisted to disk
```
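The retrieve-and-inject steps can be sketched end to end. This toy version swaps the vector search for plain word overlap; `retrieve` and `inject` here are illustrative stand-ins for `SkillInjector`, not its real internals:

```python
def retrieve(skills: dict[str, str], query: str, top_k: int = 3) -> list[str]:
    """Rank skill names by word overlap between the query and skill content."""
    q = set(query.lower().split())
    ranked = sorted(
        skills,
        key=lambda name: len(q & set(skills[name].lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def inject(system_prompt: str, skills: dict[str, str], names: list[str]) -> str:
    """Append the selected skills to the system prompt as labeled blocks."""
    blocks = [f"[skill:{n}] {skills[n]}" for n in names]
    return "\n\n".join([system_prompt] + blocks)
```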

## Integration: OpenAI SDK as Client

Point any OpenAI SDK client at the MetaClaw proxy:

```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk",
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",  # passed through to upstream
    messages=[
        {"role": "user", "content": "Review my pull request strategy."}
    ],
)
print(response.choices[0].message.content)
```

Skills are injected transparently — the client code does not change.

---

## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),          # train between 22:00–07:00 local time
        idle_timeout_minutes=15,      # train after 15 min of no conversations
        google_calendar=True,         # also train during calendar meetings
        credentials_path="creds.json",
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```
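The gating conditions are easy to state precisely. A minimal sketch under the config's assumptions: `sleep_hours=(22, 7)` wraps past midnight, and "idle" means no conversation for `idle_timeout_minutes` (these helper names are illustrative, not the `MadMaxScheduler` internals):

```python
from datetime import datetime, timedelta

def in_sleep_hours(hour: int, sleep_hours: tuple[int, int] = (22, 7)) -> bool:
    """True if `hour` falls in the sleep window, handling wrap past midnight."""
    start, end = sleep_hours
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # e.g. 22:00–07:00 spans midnight

def is_idle(last_turn: datetime, now: datetime, timeout_minutes: int = 15) -> bool:
    """True if no conversation has arrived within the idle timeout."""
    return now - last_turn >= timedelta(minutes=timeout_minutes)
```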

### Google Calendar Setup

```bash
# 1. Enable the Google Calendar API in Google Cloud Console
# 2. Download OAuth2 credentials as creds.json
# 3. Set the path in config or env
export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"

# 4. The first run opens a browser for OAuth consent
metaclaw start
```

---

## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5   # 50% support, 50% query
)
```

During training:

```python
support_batch = buffer.sample(n=16, split="support")  # used to compute the reward signal
query_batch = buffer.sample(n=16, split="query")      # used for the gradient update

await trainer.train_meta(support=support_batch, query=query_batch)
```

---
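One way to keep the two sets disjoint is to assign each recorded experience to a split deterministically, for instance by hashing its id. This is an assumption about the mechanism, not MetaClaw's actual implementation; only the `support_ratio` parameter mirrors the real API:

```python
import hashlib

def assign_split(experience_id: str, support_ratio: float = 0.5) -> str:
    """Deterministically map an experience id to 'support' or 'query'."""
    digest = hashlib.sha256(experience_id.encode()).digest()
    bucket = digest[0] / 256.0  # stable pseudo-uniform value in [0, 1)
    return "support" if bucket < support_ratio else "query"
```

Because the assignment depends only on the id, an experience never migrates between sets as the buffer fills.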

## RL Backends

### Tinker (default)

```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```

### MinT

```bash
# Install the MinT compatibility layer separately
pip install metaclaw-mint
```

```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```

### Auto-detection

```yaml
rl:
  backend: auto   # tries tinker first, falls back to mint, errors if neither is available
```
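The fallback order can be pinned down as a tiny resolver. `pick_backend` is an illustrative stand-in: availability is modeled here as a plain set, whereas the real check presumably probes installed packages and API keys:

```python
def pick_backend(available: set[str]) -> str:
    """Resolve backend 'auto': prefer tinker, then mint, else raise."""
    for backend in ("tinker", "mint"):
        if backend in available:
            return backend
    raise RuntimeError("No training backend available")
```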

## Troubleshooting

**Proxy not reachable after `metaclaw start`**
- Check for port conflicts: `lsof -i :8080`
- Change `proxy.port` in the config and restart

**`rl` mode: "No training backend available"**
- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**
- Confirm `skills.summarize_after_session: true` in the config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**
- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check the scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**
- Re-run the OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure the Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**
- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in the config:

  ```yaml
  rl:
    opd_teacher: true
    teacher_base_url: https://api.openai.com/v1
    teacher_model: gpt-4o
  ```

## CLI Reference

```bash
metaclaw setup                   # interactive config wizard
metaclaw start                   # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml

metaclaw skills list             # show all stored skills
metaclaw skills delete <name>    # remove a skill
metaclaw skills export skills.json

metaclaw status                  # show proxy, scheduler, training status
metaclaw logs                    # tail all logs
metaclaw logs --component scheduler
```