Read and write large cuPyNumeric arrays to HDF5 with Legate's parallel, distributed HDF5 I/O (legate.io.hdf5: to_file, from_file, from_file_batched). Use when a developer needs to save a cuPyNumeric array to an .h5/.hdf5 file, load an HDF5 dataset into a distributed cuPyNumeric array, read a large HDF5 dataset in chunks, hand arrays to an HPC pipeline as a single file, or accelerate HDF5 disk I/O with GPUDirect Storage (GDS). Do not use it for Parquet/cuDF/raw-binary or other sharded/custom layouts (see the cupynumeric-parallel-data-load skill), Zarr or object-store/S3 output, .npz or pickled archives, plain h5py without cuPyNumeric, or pure array compute such as FFT, matmul, or reductions.
npx skill4agent add nvidia/skills cupynumeric-hdf5

`legate.io.hdf5` covers `.h5` and `.hdf5` files; runnable examples are in `assets/`. Other formats do not need `legate.io.hdf5`:

For `.npz`, use `np.load` and then `cn.asarray(...)`. For `.npy`, use `cupynumeric.load`. For plain h5py access:

with h5py.File(path, "r") as f:
    arr = f["dataset"][:]

Installing h5py is a prerequisite for `legate.io.hdf5`:

conda install -c conda-forge h5py # required; legate/io/hdf5.py imports it at load

Without it, `from legate.io.hdf5 import ...` raises `ModuleNotFoundError` for `h5py`.

| Function | Signature | Purpose |
|---|---|---|
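| `to_file` | `to_file(array, path, dataset_name)` | Write a cuPyNumeric array to an HDF5 dataset. |
| `from_file` | `from_file(path, dataset_name)` | Read one HDF5 dataset into a distributed array. |
| `from_file_batched` | `from_file_batched(path, dataset_name, chunk_size)` | Read a dataset in chunks; this chunks the file read, not the assembled array. |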
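To make "chunks the file read" concrete, here is a minimal pure-Python sketch of how a dataset shape can be covered by tiles of a given `chunk_size`. The helper `iter_tiles` is hypothetical and is not part of `legate.io.hdf5`. The row-major tile order, the clipped edge tiles, and the error conditions are assumptions made for illustration; the actual iteration order and checks in `from_file_batched` may differ.

```python
from itertools import product


def iter_tiles(shape, chunk_size):
    """Yield (offsets, tile_shape) for every tile covering `shape`.

    Hypothetical helper for illustration; not part of legate.io.hdf5.
    """
    if len(shape) != len(chunk_size):
        raise ValueError("chunk_size needs one entry per dataset dimension")
    if any(c <= 0 for c in chunk_size):
        raise ValueError("chunk_size entries must be positive")
    # One range of start offsets per dimension, stepping by the chunk extent.
    starts_per_dim = [range(0, extent, step) for extent, step in zip(shape, chunk_size)]
    for offsets in product(*starts_per_dim):
        # Edge tiles are clipped so they never run past the dataset bounds.
        tile_shape = tuple(
            min(step, extent - start)
            for start, step, extent in zip(offsets, chunk_size, shape)
        )
        yield offsets, tile_shape


# A 10 x 6 dataset read in 4 x 4 tiles needs six tiles, some of them clipped.
tiles = list(iter_tiles((10, 6), (4, 4)))
```

Each `(offsets, tile_shape)` pair plays the same role as the offset and `chunk.shape` used when copying a chunk into a preallocated output array.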
In `legate.io.hdf5`, `dataset_name` is the dataset's path inside the file, for example `"/data"` or `"/group/x"`.

import cupynumeric as cn
from legate.core import get_legate_runtime
from legate.io.hdf5 import from_file, to_file
a = cn.arange(64, dtype=cn.float32).reshape(8, 8)
# Write: pass the cuPyNumeric ndarray straight in - no manual conversion.
to_file(array=a, path="out.h5", dataset_name="/data")
get_legate_runtime().issue_execution_fence(block=True) # needed before any external reader
# Read: from_file returns a legate LogicalArray; cn.asarray bridges it back.
b = cn.asarray(from_file("out.h5", dataset_name="/data"))
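assert cn.array_equal(a, b)

A runnable version of this round trip is in assets/hdf5_roundtrip.py.

`from_file_batched` yields `LogicalArray` chunks sized by `chunk_size`, each paired with its offsets into the dataset; the loop below copies every chunk into `out`.

import h5py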
import cupynumeric as cn
from legate.core import get_legate_runtime
from legate.io.hdf5 import from_file_batched
with h5py.File("big.h5", "r") as f: # read shape/dtype without loading data
shape, dtype = f["data"].shape, f["data"].dtype
out = cn.empty(shape, dtype=dtype)
for chunk, (r0, c0) in from_file_batched("big.h5", "data", chunk_size=(4096, 4096)):
out[r0:r0 + chunk.shape[0], c0:c0 + chunk.shape[1]] = cn.asarray(chunk)
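get_legate_runtime().issue_execution_fence(block=True)

A runnable version of this batched read is in assets/hdf5_batched_read.py. `from_file_batched` raises `ValueError` when `chunk_size` is invalid.

`to_file` accepts any object that exposes `__legate_data_interface__` (`LogicalArrayLike`), so a cuPyNumeric array needs no `np.array(...)` or `cn.asarray(...)` conversion first.

`from_file` and `from_file_batched` return `LogicalArray` objects; convert one with `cn.asarray(la)`.

`to_file` is asynchronous: call `get_legate_runtime().issue_execution_fence(block=True)` before anything outside Legate reads the file.

Run from outside the cupynumeric source tree (for example `cd /tmp`). Inside it, the local `cupynumeric/` directory on `sys.path` shadows the installed package and the import fails with `ModuleNotFoundError: cupynumeric.install_info`.

path to_file from_file path tempfile.mkstemp() to_file h5py to_file to_file path to_file path /path/to/file.h5 ValueError to_file ValueError create_array(dtype, ndim=n) LogicalArray

To enable GPUDirect Storage, set `LEGATE_IO_USE_VFD_GDS=1`:

export LEGATE_IO_USE_VFD_GDS=1 # set before launching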
# or, with the legate driver:
legate --io-use-vfd-gds my_script.py

export CUFILE_ALLOW_COMPAT_MODE=true

When GDS is active, the log contains: H5FD__gds_open: Successfully opened file w/GDS VFD

| Symptom | Cause and fix |
|---|---|
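| `ModuleNotFoundError` on `from legate.io.hdf5 import ...` | h5py is missing; run `conda install -c conda-forge h5py`. |
| File looks empty/truncated to h5py right after `to_file` | The async write hasn't landed; add `get_legate_runtime().issue_execution_fence(block=True)`. |
| `ModuleNotFoundError: cupynumeric.install_info` | Running inside the source tree; `cd /tmp` (or any directory outside it) first. |
| Abort/crash reading a GPU array ≳128 MB | Default 128 MB ZCMEM staging buffer; set a larger ZCMEM size. |
| `from_file` returns a `LogicalArray`, not a cuPyNumeric array | Expected; wrap it with `cn.asarray(...)`. |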
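Two of the symptoms above (missing h5py, running inside the source tree) can be caught before launching. The sketch below is a hypothetical preflight helper, not part of Legate or cuPyNumeric. It assumes that a `cupynumeric/` directory in the working directory means the script is being run from the source tree, which is only a heuristic.

```python
import importlib.util
import tempfile
from pathlib import Path


def preflight(cwd=None):
    """Return a list of likely problems; an empty list means none were found.

    Hypothetical helper; the checks are heuristics, not part of Legate.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    problems = []
    # legate.io.hdf5 needs h5py to be importable.
    if importlib.util.find_spec("h5py") is None:
        problems.append("h5py is not installed")
    # Assumption: a local cupynumeric/ directory means we are in the source tree.
    if (cwd / "cupynumeric").is_dir():
        problems.append(f"{cwd} looks like the cupynumeric source tree")
    return problems


# Demonstrate the source-tree check on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    clean = preflight(tmp)
    (Path(tmp) / "cupynumeric").mkdir()
    inside_tree = preflight(tmp)
```

Calling `preflight()` with no argument checks the current working directory; an empty list does not guarantee the run will succeed, it only rules out these two causes.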
legate.io.hdf5 legate.core.io.hdf5 dataset_name LEGATE_IO_USE_VFD_GDS=1

cd /tmp # outside the cupynumeric source tree
conda install -c conda-forge h5py # one-time, if not already present
LEGATE_CONFIG="--cpus 4" LEGATE_AUTO_CONFIG=0 python <skill>/assets/hdf5_roundtrip.py
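LEGATE_CONFIG="--cpus 4" LEGATE_AUTO_CONFIG=0 python <skill>/assets/hdf5_batched_read.py

On success the scripts print HDF5 ROUND TRIP OK and HDF5 BATCHED READ OK. For a GPU run, use `--gpus 1` in `LEGATE_CONFIG`, optionally together with `LEGATE_IO_USE_VFD_GDS=1`.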
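The same launch settings can be assembled in Python when the scripts are started from a test harness. This is a minimal sketch: `legate_env` is a hypothetical helper, and it only sets the three environment variables named in this document.

```python
import os


def legate_env(cpus=4, use_gds=False, base=None):
    """Build an environment mapping for launching a Legate script.

    Hypothetical helper; it only sets variables named in this document.
    """
    env = dict(os.environ if base is None else base)
    env["LEGATE_CONFIG"] = f"--cpus {cpus}"
    env["LEGATE_AUTO_CONFIG"] = "0"
    if use_gds:
        # Must be set before the process starts, not from inside the script.
        env["LEGATE_IO_USE_VFD_GDS"] = "1"
    return env


# Built on an empty base so the result is easy to inspect.
example = legate_env(cpus=4, use_gds=True, base={})
```

The mapping can be passed as `env=` to `subprocess.run`, with `cwd` pointing outside the cupynumeric source tree.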