Search Results: reinforcement-learning

Found 22 Skills

DevOps & Cloud Services | aws/agent-toolkit-for-aws

aws-lambda-microvms

Builds, runs, debugs, and operates applications on AWS Lambda MicroVMs: Firecracker-isolated, snapshot-resumable serverless compute environments running inside a container with lifetimes of up to 8 hours. Applicable when workloads need strong isolation between tenants, isolated serverless compute, sandbox compute, or secure multi-tenant execution. Also suited for AI/agent code-execution sandboxes, interactive code playgrounds and notebooks (Jupyter, REPLs, dev environments running user-supplied code), reinforcement-learning environments, multi-tenant CI executors and build runners, sessionful game or simulation servers, and isolated security scanners. Also applicable when the workload needs long-lived sessions, a real port-listening server (gRPC, WebSocket, custom TCP protocols), state preserved across periods of inactivity (suspend/resume), container-level access (FUSE, eBPF, custom syscalls), or session-affine routing.

AI & Machine Learning | nvidia/skills

nemo-rl-brev-etiquette

Brev instance operating guidance for NeMo-RL agents working in /home/ubuntu/RL with limited workspace disk, a larger /ephemeral volume, and optional /home/ubuntu/RL/.env secrets. Use when running nemo-rl-auto-research campaigns, experiments, training jobs, model or dataset downloads, shared cache-heavy commands, log-producing runs, checkpoint generation, W&B or Hugging Face authenticated workflows, or any workflow that may create large files on Brev.

AI & Machine Learning | qodex-ai/ai-agent-skills

autonomous-agent-gaming

Build autonomous game-playing agents using AI and reinforcement learning. Covers game environments, agent decision-making, strategy development, and performance optimization. Use when creating game-playing bots, testing game AI, strategic decision-making systems, or game theory applications.

10 scripts/Checked

AI & Machine Learning | huggingface/skills

trl-training

Train and fine-tune transformer language models using TRL (Transformer Reinforcement Learning). Supports SFT, DPO, GRPO, KTO, RLOO, and reward model training via CLI commands.

AI & Machine Learning | kinhluan/skills

federated-learning-dqn

Federated learning with Deep Q-Networks for privacy-preserving optimization

AI & Machine Learning | kiterlin/intelligent-dete...

slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

AI & Machine Learning | nvidia/skills

nemo-gym-debugging

Use when debugging a NeMo Gym run or reward profiling job. Covers rollout collection failures, empty or partial JSONL outputs, stale materialized inputs, verifier/schema errors, Ray or Slurm issues, vLLM readiness, judge failures, tool/sandbox failures, cache problems, and throughput bottlenecks.

1 scripts/Checked

AI & Machine Learning | aradotso/hermes-skills

openclaw-rl-training

Train personalized AI agents with reinforcement learning from conversational feedback using OpenClaw-RL's async framework

AI & Machine Learning | theneoai/awesome-skills

deepmind-researcher

DeepMind Researcher: AGI through deep understanding, AlphaGo/AlphaZero RL, AlphaFold scientific discovery, Gemini multimodal, neuroscience-inspired architectures. Scientific rigor + industrial scale. Triggers: DeepMind research, AlphaGo algorithms, protein folding AI, scientif...

AI & Machine Learning | nvidia/skills

auto-research

Autonomous NeMo-RL research agent workflow for directed hypothesis testing and open-ended discovery. Guides agents through the full experiment lifecycle: understanding recipes and environments, wiring RL or NeMo Gym runs, launching reproducible baselines and iterations, analyzing results, preserving human oversight, and using git plus TSV logs as the research ledger.

AI & Machine Learning | aradotso/trending-skills

openclaw-rl-training

OpenClaw-RL framework for training personalized AI agents via reinforcement learning from natural conversation feedback

AI & Machine Learning | nvidia/skills

nemo-gym-pivot-datasets

Use when creating, validating, or documenting NeMo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

5 scripts/Checked