NV-Generate-MR-Brain-Finetune
Purpose
- Used for finetuning the NV-Generate-CTMR diffusion UNet from user-supplied NIfTI training volumes.
- Not for clinical interpretation, regulatory use, or approving synthetic data for production training.
- The wrapper stages the config glue locally and delegates execution to existing upstream scripts: `scripts.diff_model_create_training_data`, `scripts.diff_model_train`, and optionally `scripts.diff_model_infer`. It does not execute the notebook.
- Manifest I/O: inputs are and ; outputs are , optional , and .
- The underlying training contract is the upstream config/env JSON (the same one driven from a cell of `train_diff_unet_tutorial.ipynb`). The wrapper stages those JSON files for you and exposes the most-tuned fields as CLI flags; the sections below document the fields, their defaults, and how to monitor and tune a run.
Instructions
- Read before changing arguments, side effects, or validation gates.
- Run `scripts/run_mr_brain_finetune.py` from the Medical AI Skills repo root.
- If a host agent exposes `run_script`, use `run_script("scripts/run_mr_brain_finetune.py", args=[...])`; otherwise run the Bash/Python command below.
- Use `--preflight` first when checking a new datalist; remove it only when the user explicitly wants to launch GPU finetuning.
- For a staged preflight input bundle directory, use `BUNDLE/preflight_datalist.json` as the datalist and `BUNDLE/preflight_dataset` as `--data-base-dir` when those files are present.
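As a minimal sketch, the argument list for `run_script` or a subprocess call could be assembled as follows. The helper name `build_finetune_args` is hypothetical and not part of the skill; the flags are the ones this document lists for `scripts/run_mr_brain_finetune.py`, and preflight is on by default because a preflight check is the recommended first step.

```python
from typing import List, Optional


def build_finetune_args(
    datalist: str,
    data_base_dir: str,
    output_dir: str,
    preflight: bool = True,
    epochs: Optional[int] = None,
    modality: Optional[str] = None,
    run_inference: bool = False,
) -> List[str]:
    """Assemble wrapper arguments (hypothetical helper, not part of the skill)."""
    args = [datalist, "--data-base-dir", data_base_dir, "--output-dir", output_dir]
    if epochs is not None:
        args += ["--epochs", str(epochs)]
    if modality is not None:
        args += ["--modality", modality]
    if run_inference:
        args.append("--run-inference")
    if preflight:
        # Preflight validates and stages only; pass preflight=False to launch GPU finetuning.
        args.append("--preflight")
    return args


if __name__ == "__main__":
    print(build_finetune_args(
        "BUNDLE/preflight_datalist.json", "BUNDLE/preflight_dataset", "OUT_DIR"
    ))
```

The returned list can be passed as `args=[...]` to `run_script`, or appended to a `python .../run_mr_brain_finetune.py` command line.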
Examples
Validate and stage a preflight finetune check from an input bundle (the recommended first step: no GPU, no training). This is the single canonical command; replace `INPUT_BUNDLE` and `OUT_DIR` with your paths:
```bash
export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir INPUT_BUNDLE/preflight_dataset \
  --output-dir OUT_DIR \
  --modality mri_t1 \
  --preflight
```
For real GPU finetuning and other variations, see Usage below.
Available Scripts
| Script | Purpose | Arguments |
|---|---|---|
| `scripts/run_mr_brain_finetune.py` | Primary entrypoint. | `DATALIST.json --data-base-dir DATA_DIR --output-dir OUT_DIR [--epochs N] [--modality mri_t1] [--num-gpus N] [--no-amp] [--model-config FILE] [--run-inference] [--preflight]` |
Prerequisites
- `NV_GENERATE_ROOT` may point to a current checkout of `https://github.com/NVIDIA-Medtech/NV-Generate-CTMR` containing `scripts/diff_model_create_training_data.py`, `scripts/diff_model_train.py`, and `scripts/diff_model_infer.py`.
- If `NV_GENERATE_ROOT` is unset, the wrapper searches `.workbench_data/upstreams/NV-Generate-CTMR`.
- is optional and can be used to select the GPU for real training.
- Runtime requirements: NVIDIA CUDA GPU for real training, Python packages from the upstream `requirements.txt`, and downloaded MR-brain weights.
- Side effects: writes staged configs, embeddings, checkpoints, optional inference images, and logs under the caller-provided ; may write model caches under the upstream checkout and ; may contact for model assets and for the upstream checkout.
- The datalist is a MONAI-style JSON object with paths relative to . is optional and defaults to .
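The relative-path convention can be checked before any run with a few lines of Python. This is a sketch under assumptions: the top-level `training` list and the per-entry `image` key follow the common MONAI datalist layout and are not confirmed by this document, and `find_missing_images` is a hypothetical helper, not part of the wrapper.

```python
import json
from pathlib import Path
from typing import List


def find_missing_images(
    datalist_path: str,
    data_base_dir: str,
    section: str = "training",  # assumed section name (common MONAI layout)
    key: str = "image",         # assumed per-entry key (common MONAI layout)
) -> List[str]:
    """Return entries that are absolute or do not resolve to a file under data_base_dir."""
    with open(datalist_path, "r", encoding="utf-8") as handle:
        datalist = json.load(handle)
    base = Path(data_base_dir)
    problems = []
    for entry in datalist.get(section, []):
        rel = Path(entry[key])
        # Paths are expected to be relative to the data root and to exist there.
        if rel.is_absolute() or not (base / rel).is_file():
            problems.append(str(rel))
    return problems
```

An empty return value means every listed volume resolves under the data root; anything else is worth fixing before a preflight run.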
1. Config and environment JSON (adapt to your data)
This is a thin wrapper around the upstream `train_diff_unet_tutorial.ipynb` flow. Each run performs four steps, delegating the heavy lifting to the model author's scripts:
- Stage configs: copy the three config JSONs and rewrite only the run-specific paths and settings (notebook cell 15).
- Create training data: `python -m scripts.diff_model_create_training_data` produces the latent embeddings (cell 17).
- Write embedding sidecars: one sidecar per embedding (including body-region indices when the model uses them). This is the one piece of glue that lives in the notebook (cell 19) rather than in the upstream scripts, and the training step requires it; the skill owns it.
- Train: `python -m scripts.diff_model_train` (cell 21), optionally followed by `python -m scripts.diff_model_infer`.
Tune by editing the config JSON, not by adding flags. All training and inference hyperparameters live in `config_maisi_diff_model_rflow-mr-brain.json`. Edit the upstream copy, or pass your own with `--model-config` (the other two config files have their own override flags). The wrapper only ever rewrites the fields below.
Environment JSON (`environment_maisi_diff_model_rflow-mr-brain.json`), fields the wrapper rewrites per run:
| Field | Set from | Notes |
|---|---|---|
| | | Root for relative paths. |
| | your datalist | Staged copy with per-entry values filled in. |
| | | Latent embeddings, checkpoints, inference images. |
| | upstream | Maps modality name → integer code. |
| | | Output checkpoint name (default `diff_unet_3d_rflow-mr-brain_v0.pt`). |
| | upstream weights / CLI flag | Starting checkpoint; a wrapper flag clears it. |
| | upstream weights / `--trained-autoencoder-path` | VAE used to encode/decode latents. |
Model config (`config_maisi_diff_model_rflow-mr-brain.json`), the only fields the wrapper touches:
| Field | Set from | Default | Notes |
|---|---|---|---|
| `diffusion_unet_train.n_epochs` | `--epochs` | | Convenience override (cell 15 does the same); the wrapper default is small for verification and differs from the value the upstream config ships. |
| `diffusion_unet_inference.modality` | `--modality` | | Kept consistent with the training modality for optional inference. |
Everything else in that file is left exactly as written; edit the JSON to change it.
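The two overrides can be pictured as a small JSON rewrite. This is a sketch, not the wrapper's actual code: `stage_model_config` is a hypothetical name, and it assumes the dotted field names `diffusion_unet_train.n_epochs` and `diffusion_unet_inference.modality` correspond to nested JSON objects and that the modality is stored as an integer code.

```python
import json
from pathlib import Path


def stage_model_config(src: str, dst: str, n_epochs: int, modality_code: int) -> dict:
    """Copy a model config and override only the two run-specific fields."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    # Assumption: dotted names map to nested objects in the JSON file.
    config.setdefault("diffusion_unet_train", {})["n_epochs"] = n_epochs
    config.setdefault("diffusion_unet_inference", {})["modality"] = modality_code
    out = Path(dst)
    out.parent.mkdir(parents=True, exist_ok=True)
    # Every other key is written back unchanged.
    out.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

Writing to a separate destination keeps the upstream copy untouched, which matches the idea of staging per-run configs under the output directory.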
Runtime flags (not config fields): `--num-gpus` (multi-GPU launch) and `--no-amp` (disables mixed precision).

`--modality` selects the integer code from `configs/modality_mapping.json`. Supported brain codes are 8, 9 (default), 10, 11, and 20, plus variant codes 29/30/31/32; the mapping file lists the name for each code. A per-case modality in the datalist overrides the flag. The modality also feeds the step-3 embedding sidecars.
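A name-to-code lookup of this kind might look like the sketch below. It assumes the mapping file is a flat JSON object of modality name to integer code, which this document implies but does not show; `resolve_modality_code` is a hypothetical helper, not the wrapper's function.

```python
import json
from typing import Dict


def resolve_modality_code(mapping_path: str, modality: str) -> int:
    """Look up a modality name in a name -> integer-code JSON mapping."""
    with open(mapping_path, "r", encoding="utf-8") as handle:
        mapping: Dict[str, int] = json.load(handle)  # assumed flat {name: code}
    if modality not in mapping:
        known = ", ".join(sorted(mapping))
        # Fail loudly instead of silently conditioning on the wrong modality.
        raise ValueError(f"Unknown modality {modality!r}; expected one of: {known}")
    return int(mapping[modality])
```

Raising on an unknown name mirrors the behaviour described in this document, where a modality mismatch is reported as an error rather than guessed.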
For an end-to-end reference including example data download and checkpoint loading, see the upstream tutorial `train_diff_unet_tutorial.ipynb`.
2. Usage (one-line training)
Preflight only:
```bash
export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_mr_brain_finetune_preflight \
  --preflight
```
Preflight bundle input:
```bash
export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir PATH_TO_INPUT_BUNDLE/preflight_dataset \
  --output-dir runs/nv_generate_mr_brain_finetune_preflight \
  --preflight
```
GPU finetuning:
```bash
export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python -m pip install -r "$NV_GENERATE_ROOT/requirements.txt" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_mr_brain_finetune \
  --epochs 2 \
  --modality mri_t1 \
  --run-inference
```
Replace `PATH_TO_DATALIST.json` and `PATH_TO_DATA_ROOT` with the user's actual paths. Do not use the fixture datalist for real training; it is a preflight-only placeholder.
3. Monitor training (TensorBoard)
Training writes TensorBoard event files under the run's `artifacts` directory. Launch TensorBoard against the output directory and watch the loss curve:
```bash
python -m pip install tensorboard && \
tensorboard --logdir runs/nv_generate_mr_brain_finetune/artifacts
```
The run summary is written to `OUT_DIR/artifacts/workflow_summary.json` (checkpoint path, embedding sidecars, inference outputs); the JSON the wrapper prints to stdout mirrors the same paths plus extra status fields for quick triage.
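For scripted triage, the summary file can be read back with the standard library. The sketch below assumes only that the file is a JSON object; it makes no assumption about specific key names, and `load_workflow_summary` and `format_summary` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import List


def load_workflow_summary(output_dir: str) -> dict:
    """Read OUT_DIR/artifacts/workflow_summary.json and return it as a dict."""
    summary_path = Path(output_dir) / "artifacts" / "workflow_summary.json"
    if not summary_path.is_file():
        raise FileNotFoundError(f"No run summary at {summary_path}")
    return json.loads(summary_path.read_text(encoding="utf-8"))


def format_summary(summary: dict) -> List[str]:
    """Render top-level entries as 'key: value' lines, sorted by key."""
    return [f"{key}: {summary[key]}" for key in sorted(summary)]


# Example usage after a finished run:
#   for line in format_summary(load_workflow_summary("runs/nv_generate_mr_brain_finetune")):
#       print(line)
```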
4. Hyperparameter tuning and common pitfalls
- Loss not decreasing or unstable: lower the learning rate in the model-config JSON, or keep AMP on (the default); `--no-amp` is slower but more numerically stable on older GPUs.
- Out-of-memory: keep `diffusion_unet_train.batch_size` small in the config JSON, and confirm the autoencoder and UNet fit on your GPU before scaling. Multi-GPU (`--num-gpus`) shards the batch across devices.
- Few cases or a quick check: keep `--epochs` small (the wrapper default is meant for verification, not convergence, and differs from the value the upstream config ships).
- Wrong modality conditioning: set `--modality` or the per-case modality to a value present in `configs/modality_mapping.json`; a mismatch produces a clear error rather than silently mislabeling latents.
- Slow startup on first run: `diff_model_create_training_data` precomputes latent embeddings once; reuse the same `--output-dir` to avoid recomputing them.
5. Evaluate the finetuned model
Use the staged checkpoint (`OUT_DIR/artifacts/models/<model_filename>`) as the diffusion UNet for generation, then inspect the synthesized volumes:
- Pass `--run-inference` here for a quick built-in sanity render, or
- Point the inference skill at the finetuned checkpoint to generate fresh brain MRI volumes for qualitative review.
This skill gates file accounting and command provenance only — anatomical realism and downstream utility must be judged by a domain expert on the generated images.
Limitations
- Requires a current upstream checkout with the existing diffusion training scripts. The skill itself stages the required config and datalist glue locally and does not depend on the notebook or PR #33.
- Full training can be expensive and is not deterministic across hardware, CUDA, and package versions.
- The wrapper gates file accounting and command provenance, not anatomical realism or downstream model utility.
- Not for clinical deployment, clinical interpretation, autonomous diagnosis, regulatory submission, or production training-data approval.
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| `diffusion training scripts were not found` | `NV_GENERATE_ROOT` does not point at a current NV-Generate-CTMR checkout. | Clone or update `https://github.com/NVIDIA-Medtech/NV-Generate-CTMR` and set `NV_GENERATE_ROOT`. |
| | Datalist paths are not relative to the data root, or files are absent. | Fix the datalist or pass the correct data root. |
| CUDA or MONAI import failure | Runtime environment lacks upstream dependencies. | Install `"$NV_GENERATE_ROOT/requirements.txt"` in the selected environment. |
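The first error can be anticipated by checking the checkout before launching anything. A minimal sketch, assuming the three script paths named under Prerequisites and the documented fallback location; `missing_upstream_scripts` is a hypothetical helper and not the wrapper's own check.

```python
import os
from pathlib import Path
from typing import List

# Script paths as listed under Prerequisites.
REQUIRED_SCRIPTS = (
    "scripts/diff_model_create_training_data.py",
    "scripts/diff_model_train.py",
    "scripts/diff_model_infer.py",
)
# Fallback location the document says the wrapper searches.
DEFAULT_ROOT = ".workbench_data/upstreams/NV-Generate-CTMR"


def missing_upstream_scripts(root: str = "") -> List[str]:
    """Return the required upstream scripts that are absent under the checkout root."""
    checkout = Path(root or os.environ.get("NV_GENERATE_ROOT", DEFAULT_ROOT))
    return [rel for rel in REQUIRED_SCRIPTS if not (checkout / rel).is_file()]
```

Calling `missing_upstream_scripts()` with no argument uses `NV_GENERATE_ROOT` when it is set; an empty list suggests the checkout has the scripts the wrapper delegates to.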