Search: rl-training - AI Agent Skills

AI & Machine Learning | kiterlin/intelligent-dete...

miles-rl-training

Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.

16

AI & Machine Learning | huggingface/skills

trl-training

Train and fine-tune transformer language models using TRL (Transformer Reinforcement Learning). Supports SFT, DPO, GRPO, KTO, RLOO, and reward model training via CLI commands.

15

AI & Machine Learning | kiterlin/intelligent-dete...

torchforge-rl-training

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or scalable training with Monarch and TorchTitan.

13

AI & Machine Learning | kiterlin/intelligent-dete...

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training

13

2 scripts/Attention

AI & Machine Learning | kiterlin/intelligent-dete...

slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

11

AI & Machine Learning | aradotso/hermes-skills

openclaw-rl-training

Train personalized AI agents with reinforcement learning from conversational feedback using OpenClaw-RL's async framework

11

AI & Machine Learning | kiterlin/intelligent-dete...

verl-rl-training

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

10

AI & Machine Learning | aradotso/trending-skills

openclaw-rl-training

OpenClaw-RL framework for training personalized AI agents via reinforcement learning from natural conversation feedback

9

AI & Machine Learning | huggingface/skills

huggingface-llm-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

20

7 scripts/Checked

AI & Machine Learning | sickn33/antigravity-aweso...

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for...

10

AI & Machine Learning | huggingface/skills

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

9

6 scripts/Checked

AI & Machine Learning | aradotso/trending-skills

metaclaw-evolving-agent

Deploy and configure MetaClaw — an agent that meta-learns and evolves from live conversations using skills injection, RL training, and smart scheduling.

8

Search Results: rl-training

miles-rl-training

trl-training

torchforge-rl-training

grpo-rl-training

slime-rl-training

openclaw-rl-training

verl-rl-training

openclaw-rl-training

huggingface-llm-trainer

hugging-face-model-trainer

hugging-face-model-trainer

metaclaw-evolving-agent
