nemotron-customize
Purpose
Use this skill to turn a model-customization request into a repo-native Nemotron step pipeline. It plans the step DAG, validates artifact wiring, and creates only the YAML configs needed to run existing steps.
Use it only for inspecting, configuring, validating, running, or submitting
existing Nemotron steps or multi-step training/customization pipelines. If the
request is a frontend, dashboard, visualization, generic ML-advice,
billing/access, or unrelated coding task, stop with a short scope note and do
not inspect the step catalog or edit files in that turn.
Security Notes
This skill may create or modify YAML/README files and run repository commands.
Confirm with the user before file writes or shell execution. Keep Bash usage
scoped to repo-safe commands such as uv run nemotron steps ... and targeted
validation commands. Never run environment dumps or commands that expose
secret values.
Requirements
- Checkout of this Nemotron repo with present.
- Invoke from the repo root. All paths in this document are repo-root-relative.
- User-provided model, data, hardware, backend, and output constraints before writing configs.
- Backend credentials only when the selected step needs them (translation, W&B, hosted endpoints).
Limitations
- Does not invent new catalog steps when an existing one fits.
- New Python/shell code only in Explorer mode after the gap is explicit.
- Post-training deployment-only requests are out of scope.
Invocation:
. The repo under
is the
source of truth; this skill orchestrates and does not duplicate per-step
knowledge.
Priority order: (1) reuse existing repo code, CLIs, recipes, steps, runners,
and configs; (2) add YAML configs for the user's request; (3) generate new
Python/shell only when the repo cannot satisfy the request, and name the gap
first.
For a command request: verify repo root, read the step catalog, read the
selected
, verify the requested config exists, read the active env
TOML for any remote profile, then emit the complete command. Do not guess
profiles from examples or naming conventions.
Quick Decision Tree
- AutoModel vs Megatron-Bridge: small GPU count, Hugging Face model,
LoRA/PEFT, or OpenAI-style chat JSONL → AutoModel path (
or the matching PEFT AutoModel step). Large distributed training, packed
Parquet/binidx data, or full fine-tuning → Megatron-Bridge, but verify
against and the category README first.
- BYOB / MCQ benchmark inputs route to , NOT
. BYOB preserves the multiple-choice schema
(question, choices, answer); the translate path would flatten or strip
those fields. Trigger on phrases like "BYOB benchmark", "MCQ", "evaluation
benchmark Parquet", "multiple-choice prep".
- Curate then translate: when the user says "curate and translate",
"filter then translate", or "prep data before translating", chain
(filter raw JSONL) →
(translate curated JSONL). Do not skip the curate stage.
- Checkpoint conversion: route "Megatron to HF", "HF export", "convert
checkpoint", or "iter_* to safetensors" to ; route
"HF to Megatron" imports to . Use a concrete
source for Megatron exports.
- Existing endpoint or checkpoint eval: route hosted endpoint smoke tests
and benchmark requests to ; use for hosted chat
smoke and for Megatron checkpoint evaluation.
- No env TOML profile present: do not invent Lepton or
profiles; ask the user or fall back to local execution.
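The AutoModel vs Megatron-Bridge rule in the list above can be read as a small signal-matching function. The following is a minimal sketch only: the `Signals` fields and the `suggest_stack` name are hypothetical, and the result is a starting recommendation that still has to be checked against the repo's own docs.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical flags extracted from the user's request."""
    small_gpu_count: bool = False
    hugging_face_model: bool = False
    lora_or_peft: bool = False
    chat_jsonl: bool = False
    large_distributed: bool = False
    packed_parquet_or_binidx: bool = False
    full_fine_tuning: bool = False


def suggest_stack(s: Signals) -> str:
    """Return 'AutoModel', 'Megatron-Bridge', or 'ask' when signals conflict or are absent."""
    automodel = any([s.small_gpu_count, s.hugging_face_model,
                     s.lora_or_peft, s.chat_jsonl])
    bridge = any([s.large_distributed, s.packed_parquet_or_binidx,
                  s.full_fine_tuning])
    if automodel and bridge:
        return "ask"  # conflicting signals: surface the tradeoff to the user
    if automodel:
        return "AutoModel"
    if bridge:
        return "Megatron-Bridge"
    return "ask"  # nothing to go on yet
```

Returning "ask" for mixed signals keeps the sketch in line with the rule that ambiguous requests go back to the user rather than being guessed.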
Required inputs before finalizing configs or commands:
- , , , hardware/GPU count, backend/env profile,
and any needed API key environment variable name such as or an
evaluator key.
- For translation commands, also collect , target/source languages,
and the runtime-visible input/output paths.
- For BYOB, collect benchmark/source document path, stage (,
, , or ), target/source languages when translating,
and output directory.
- For conversion, collect source checkpoint path, output path, model/config
source, and whether the source is HF, Megatron , or LoRA adapter.
- For eval, collect endpoint URL/model ID or checkpoint path, task IDs,
endpoint type, API-key environment variable name, and sample limit.
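One way to enforce the "collect before finalizing" rule is a small check that reports which inputs are still missing. This sketch covers the eval inputs listed above; the dictionary keys are hypothetical names chosen for illustration, not fields defined by the repo.

```python
# Hypothetical keys for the eval inputs listed above.
# "target" stands for the endpoint URL/model ID or the checkpoint path.
EVAL_REQUIRED = ("target", "task_ids", "endpoint_type",
                 "api_key_env_var", "sample_limit")


def missing_inputs(provided: dict, required=EVAL_REQUIRED) -> list:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if provided.get(key) in (None, "", [])]


# Note: "api_key_env_var" holds the NAME of the variable, never the secret value.
request = {"target": "checkpoints/run1", "task_ids": ["task_a"]}
still_needed = missing_inputs(request)
```

If `still_needed` is non-empty, the skill would ask for those items instead of writing a config.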
Response shape for recommendations:
,
,
,
,
, and
. Always call out the stack to avoid
when the user's constraints make it a poor fit.
How information is split (and where to find it)
| Question | Look here |
|---|---|
| What does step X consume / produce / parameterize? | src/nemotron/steps/<cat>/<X>/step.toml |
| When/why pick step X over its siblings? | src/nemotron/steps/<cat>/<X>/README.md |
| Which step in category C should I pick? | src/nemotron/steps/<cat>/README.md |
| What runner code does step X use? | src/nemotron/steps/<cat>/<X>/step.py → src/nemotron/steps/_runners/ |
| Cross-step constraint (tokenizer lock, sequence packing, data quality, ...) | src/nemotron/steps/patterns/<id>.md |
| Artifact compatibility / hierarchy | src/nemotron/steps/types.toml |
| GPU memory / parallelism heuristics | src/nemotron/steps/hardware.md |
| Library API extracts for exceptional code generation | references/context/index.toml → references/context/<pack>.txt |
| Project scaffold rules, only when repo code cannot support the request | references/act/PROJECT.md |
| Per-stage code rules, only when repo code cannot support the request | |
If two sources cover the same point, the deeper, more specific one wins
(step-level > category > this file).
Instructions
Pipeline workflow (≥2 stages): Orient → Plan → Act → Verify. Discover
candidate steps, propose a DAG with validated artifact wiring, wait for
approval, create the minimal YAML configs, and re-check before reporting done.
Not general ML advice —
is the source of truth.
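Validated artifact wiring means that every edge in the proposed DAG connects a produced type to a compatible consumed type. The sketch below illustrates that check under stated assumptions: the type names, the `parents` hierarchy mapping, and both function names are hypothetical, and the real schema of the step manifests and of types.toml is not shown in this document.

```python
def is_compatible(produced, consumed, parents):
    """True if `produced` equals `consumed` or descends from it in the hierarchy."""
    seen = set()
    current = produced
    while current is not None and current not in seen:
        if current == consumed:
            return True
        seen.add(current)  # guard against cycles in the hierarchy
        current = parents.get(current)
    return False


def check_edges(edges, produces, consumes, parents):
    """Return one message per DAG edge whose artifact types do not chain."""
    problems = []
    for up, down in edges:
        if not is_compatible(produces[up], consumes[down], parents):
            problems.append(
                f"{up} -> {down}: {produces[up]} does not satisfy {consumes[down]}"
            )
    return problems


# Hypothetical steps and types, for illustration only.
parents = {"filtered_jsonl": "jsonl"}
produces = {"curate": "filtered_jsonl", "translate": "jsonl"}
consumes = {"translate": "jsonl", "train": "packed_parquet"}
edges = [("curate", "translate"), ("translate", "train")]
problems = check_edges(edges, produces, consumes, parents)
```

A non-empty `problems` list is the signal to change the DAG before any config is written.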
Single-step command flow:
- Confirm the repo root has and .
- Run uv run nemotron steps list --json when available; otherwise read
src/nemotron/steps/STEPS.md.
- Read the selected step's and the requested checked-in config.
- For remote execution, read or a repo-root
and pick an actual section whose profile matches the step.
- Emit the full command in one reply; then add brief rationale for the
config/profile choices. For translation, also read
src/nemotron/steps/translate/README.md
and return , ,
, , .
Source tiers for command answers — Verified (CLI + manifest + config +
env + dry-run all succeeded), Repo-grounded (manifest/config/env read, no
dry-run), Blocked (a required repo file or env TOML is missing — name it and
stop before guessing).
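The three source tiers reduce to a short decision function. The sketch below is illustrative; the function and argument names are hypothetical, and `env_read` should be passed as `True` when the command needs no remote profile.

```python
def source_tier(cli_ok: bool, manifest_read: bool, config_read: bool,
                env_read: bool, dry_run_ok: bool) -> str:
    """Classify a command answer as Verified, Repo-grounded, or Blocked."""
    # A missing required repo file or env TOML blocks the answer outright.
    if not (manifest_read and config_read and env_read):
        return "Blocked"
    # Verified requires the CLI check and the dry-run to have succeeded as well.
    if cli_ok and dry_run_ok:
        return "Verified"
    # Everything was read, but nothing was executed successfully.
    return "Repo-grounded"
```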
Canonical commands:
bash
uv run nemotron steps run <step_id> -c <config-or-path> --dry-run
uv run nemotron steps run <step_id> -c <config-or-path> --dry-run --batch <profile>
uv run nemotron steps run <step_id> -c <config-or-path> --batch <profile>
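The canonical commands differ only in whether `--dry-run` and `--batch <profile>` are present, so they can be assembled from one helper. This is a sketch with a hypothetical `build_command` function; it uses only the flags shown above, and the config file name in the usage line is a placeholder.

```python
import shlex


def build_command(step_id: str, config: str, profile=None, dry_run: bool = True) -> str:
    """Assemble a `nemotron steps run` command line; dry-run is the default."""
    argv = ["uv", "run", "nemotron", "steps", "run", step_id, "-c", config]
    if dry_run:
        argv.append("--dry-run")
    if profile is not None:
        argv += ["--batch", profile]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(argv)


# Placeholder config name; the step ID is taken from the translate example.
preview = build_command("translate/nemo_curator", "translate-config.yaml")
```

Defaulting `dry_run` to `True` means a real submission has to be requested explicitly.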
Workflow
Four phases, in order:
Orient → Plan → Act → Verify. Never skip Verify.
For detailed phase checklists and Explorer-mode implementation rules, read
.
Operational Nuances
- Smoke configs (, ) are wiring tests, not quality evidence.
- references belong in recipe-backed configs; standalone YAML uses plain paths.
- Keep pretraining data and from the same Nemotron release.
Examples
- Single step: read manifest + config + env profile, then return a complete
uv run nemotron steps run <step_id> -c <config> --dry-run command.
-
Translate (one-shot command): for "translate EN → <lang>" requests,
collect
,
, source/target language,
, and
runtime-visible input/output paths first, then emit the full command in one
reply (do not split across turns):
bash
uv run nemotron steps run translate/nemo_curator \
-c <translate-config.yaml> \
--batch <env-profile-from-env.toml>
-
Curate then translate: chain
→
. The curate stage produces filtered JSONL that
becomes the translate stage input. Both steps need YAML overlays; wire
curate's
to translate's
.
-
BYOB benchmark prep: route MCQ Parquet inputs through
, not
, so the multiple-choice schema is preserved.
-
SFT pipeline: plan the DAG (
→
or
), validate artifact edges via
, then create the
YAML overlays.
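For chained examples such as curate then translate, the handoff between two overlays can be checked before anything runs. This sketch treats each overlay as an already-parsed dictionary; the key names `output_path` and `input_path` and the paths are hypothetical stand-ins, since the actual keys are defined by each step's own config.

```python
def check_handoff(upstream: dict, downstream: dict,
                  out_key: str = "output_path", in_key: str = "input_path"):
    """Return None if the upstream output feeds the downstream input, else a message."""
    produced = upstream.get(out_key)
    expected = downstream.get(in_key)
    if produced is None or expected is None:
        return f"missing {out_key!r} or {in_key!r}"
    if produced != expected:
        return f"mismatch: upstream writes {produced!r}, downstream reads {expected!r}"
    return None


# Hypothetical overlays, for illustration only.
curate_cfg = {"output_path": "data/curated.jsonl"}
translate_cfg = {"input_path": "data/curated.jsonl"}
result = check_handoff(curate_cfg, translate_cfg)
```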
Two modes
Catalog mode — a step exists
Fast path: STEPS.md → category/README.md → step.toml → step.py → adapt YAML config.
Use whenever the user's request maps to a step in the catalog.
Explorer mode — no repo path supports it
Use only after confirming no existing step, runner, recipe, CLI, or YAML config
surface can satisfy the request. Follow
.
Choosing a mode
| User says | Mode |
|---|---|
| "SFT with Megatron-Bridge / AutoModel" | Catalog |
| "DPO / RLVR / GRPO / RLHF" | Catalog: |
| "Synthesize preference / SFT data" | Catalog: |
| "Translate EN → <lang> for training data" | Catalog: |
| "Curate and translate" / "filter then translate" | Catalog chain: → |
| "Curate web text" | Catalog: |
| "BYOB benchmark" / "MCQ benchmark prep" | Catalog: (preserves MCQ schema) |
| "Train with X exotic backend" | Explorer or ask |
| Post-training deployment-only request | Out of scope; redirect to a more appropriate workflow. |
| Ambiguous | Ask |
Boundaries
Do: build pipelines from existing steps and cite
directly;
reuse repo CLIs/runners/recipes first; adapt configs (don't copy
blindly); ask about hardware/data/backend/output path; surface
tradeoffs (Megatron-Bridge vs AutoModel, full FT vs LoRA); present the plan
and wait for approval.
Don't: invent steps; skip Plan for pipelines ≥2 stages; generate Python or
shell when YAML suffices; import modules outside the step's reference code;
add monitoring/W&B unless asked; tune parallelism beyond
and
; assume GPU count; generate Slurm/Airflow/Kubeflow wrappers;
handle non-training requests in this skill; modify
;
restate per-step rules here — link the step's
.
Troubleshooting
| Situation | Action |
|---|---|
| Artifact types do not chain | Recheck ; change the DAG before writing configs. |
| Remote profile unclear / ambiguous | Read the active env TOML; do not guess. |
| Config key unclear | Read the step config, , and shared runner before editing. |
| Strategy points to a missing skill file | Skip the load; use the text and flag the plan with WARNING: <topic> docs unavailable. |
| Hardware too small | Show ; suggest smaller model → AutoModel → LoRA. |
| Two failed Act attempts | Stop, explain what was tried and what failed, ask the user how to proceed. |
| No existing repo path matches | Check libraries cited in . If supported, use Explorer mode; otherwise ask. |