# MetaClaw Evolving Agent

Skill by ara.so — Daily 2026 Skills collection

MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.
## Installation

```bash
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```
## Quick Start

```bash
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl
```

After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:<port>` instead of the upstream LLM endpoint.
## Configuration

`metaclaw setup` writes its answers to `~/.metaclaw/config.yaml`:

```yaml
proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi              # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key is loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5             # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto               # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false          # optional teacher distillation

scheduler:                    # madmax mode only
  enabled: true
  sleep_hours: [22, 7]        # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false      # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```
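The commented `api_key` line means the key never lives in the config file; resolution is just an environment lookup at startup. A minimal sketch of that fail-fast pattern (`resolve_api_key` is illustrative, not a MetaClaw function):

```python
import os

def resolve_api_key(env_var: str = "METACLAW_LLM_API_KEY") -> str:
    """Fail fast when the key is missing instead of sending empty auth upstream."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the proxy")
    return key

os.environ["METACLAW_LLM_API_KEY"] = "sk-demo"  # demo value only
print(resolve_api_key())  # → sk-demo
```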
## Environment Variables

```bash
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"          # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"              # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"  # scheduler
```

## Operating Modes

| Mode | Command | GPU Required | Description |
|---|---|---|---|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |
## Python API

### Programmatic startup

```python
import asyncio
from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```

### Manual skill injection

```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"]
)

# Retrieve top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```
### Intercepting and recording conversations

```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record  # called after each turn with (messages, response)
)
```

`buffer.record` signature:

```python
async def on_complete(messages: list[dict], response: dict) -> None:
    ...
```
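A minimal stand-in shows the contract the real `ExperienceBuffer` honors: a bounded store whose `record` matches the `on_complete` signature above. `TinyBuffer` and `demo` are hypothetical names for illustration:

```python
import asyncio
from collections import deque

class TinyBuffer:
    """Bounded experience store: oldest turns are evicted once max_size is reached."""
    def __init__(self, max_size: int):
        self._items = deque(maxlen=max_size)

    async def record(self, messages: list[dict], response: dict) -> None:
        # Same shape as the on_complete callback the interceptor expects
        self._items.append((messages, response))

    def __len__(self) -> int:
        return len(self._items)

async def demo() -> int:
    buf = TinyBuffer(max_size=2)
    for i in range(3):
        await buf.record([{"role": "user", "content": f"turn {i}"}], {"id": i})
    return len(buf)  # third record evicted the first

print(asyncio.run(demo()))  # → 2
```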
### Triggering RL training manually

```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",  # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)

# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")  # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```
### Reward modeling

```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```

## Skills Lifecycle
```
Conversation turn
      │
      ▼
SkillInjector.retrieve()    ← vector search over SkillStore
      │  injects top-k skills into system prompt
      ▼
LLM responds
      │
      ▼
ExperienceBuffer.record()   ← stores (context, response, metadata)
      │
      ▼  (end of session)
SkillSummarizer.run()       ← LLM extracts reusable patterns
      │
      ▼
SkillStore.upsert()         ← new/updated skills persisted to disk
```
## Integration: OpenAI SDK as Client
Point any OpenAI SDK client at the MetaClaw proxy:

```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk"
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",  # passed through to upstream
    messages=[
        {"role": "user", "content": "Review my pull request strategy."}
    ]
)
print(response.choices[0].message.content)
```

Skills are injected transparently — the client code does not change.
## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),      # train between 22:00–07:00 local time
        idle_timeout_minutes=15,  # train after 15 min of no conversations
        google_calendar=True,     # also train during calendar meetings
        credentials_path="creds.json"
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```
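The only subtle part of the window logic is that `sleep_hours: [22, 7]` wraps past midnight. A standalone sketch of the two checks (hypothetical helpers, not MetaClaw's implementation):

```python
from datetime import datetime, timedelta

def in_sleep_window(now: datetime, start_hour: int, end_hour: int) -> bool:
    # Handles windows that wrap past midnight, e.g. 22:00–07:00
    h = now.hour
    if start_hour <= end_hour:
        return start_hour <= h < end_hour
    return h >= start_hour or h < end_hour

def is_idle(last_activity: datetime, now: datetime, timeout_minutes: int) -> bool:
    # No conversations for at least timeout_minutes
    return now - last_activity >= timedelta(minutes=timeout_minutes)

print(in_sleep_window(datetime(2026, 1, 1, 23, 30), 22, 7))  # → True
print(in_sleep_window(datetime(2026, 1, 1, 12, 0), 22, 7))   # → False
```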
### Google Calendar Setup

```bash
# 1. Enable Google Calendar API in Google Cloud Console
# 2. Download OAuth2 credentials as creds.json
# 3. Set path in config or env
export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"

# 4. First run will open browser for OAuth consent
metaclaw start
```
## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5  # 50% support, 50% query
)

# During training:
support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
query_batch = buffer.sample(n=16, split="query")      # used for gradient update
await trainer.train_meta(support=support_batch, query=query_batch)
```
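The split itself can be pictured as a seeded shuffle-and-cut over the buffered turns; `partition` below is illustrative, not the library's implementation:

```python
import random

def partition(items: list, support_ratio: float, seed: int = 0) -> tuple[list, list]:
    """Shuffle and split experience into disjoint support/query sets."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * support_ratio)
    return shuffled[:cut], shuffled[cut:]

support, query = partition(list(range(10)), support_ratio=0.5)
print(len(support), len(query))  # → 5 5
```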
## RL Backends

### Tinker (default)

```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```

### MinT

```bash
# Install MinT compatibility layer separately
pip install metaclaw-mint
```

```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```

### Auto-detection

```yaml
rl:
  backend: auto  # tries tinker first, falls back to mint, errors if neither available
```
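The `auto` policy is a try-in-order fallback. Sketched with a hypothetical helper:

```python
def pick_backend(available: set[str]) -> str:
    """auto: prefer tinker, fall back to mint, fail loudly otherwise."""
    for candidate in ("tinker", "mint"):
        if candidate in available:
            return candidate
    raise RuntimeError("no RL backend available: install tinker or metaclaw-mint")

print(pick_backend({"mint"}))  # → mint
```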
## Troubleshooting

**Proxy not reachable after `metaclaw start`**
- Check port conflicts: `lsof -i :8080`
- Change `proxy.port` in config and restart

**`rl` mode fails to start**
- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**
- Confirm `skills.summarize_after_session: true` in config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**
- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**
- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure the Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**
- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in config:

```yaml
rl:
  opd_teacher: true
  teacher_base_url: https://api.openai.com/v1
  teacher_model: gpt-4o
```
## CLI Reference

```bash
metaclaw setup                        # interactive config wizard
metaclaw start                        # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml
metaclaw skills list                  # show all stored skills
metaclaw skills delete <name>         # remove a skill
metaclaw skills export skills.json
metaclaw status                       # show proxy, scheduler, training status
metaclaw logs                         # tail all logs
metaclaw logs --component scheduler
```