Hello,
I am using GPUDirect Storage (GDS) on a machine with an NVIDIA RTX A6000 (48 GB GDDR6) running Ubuntu 24.04 with kernel 6.8.0-85-generic. I am trying to pre-register a large GPU buffer with cuFileBufRegister() for use as a GPU-side block cache.
The issue:
- Registration works for sizes up to ~256 MB.
- Any registration size above ~256 MB fails with error 5016 (CU_FILE_INVALID_MAPPING_SIZE).
- I attempted increasing the GDS limits in /etc/cufile.json:
  "max_direct_io_size_kb": 2097152,
  "max_device_cache_size_kb": 2097152,
  "max_device_pinned_mem_size_kb": 33554432
  but registration still fails above 256 MB.
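To put the configured values in perspective, here is a quick unit conversion. It is only a sketch: the helper name kb_to_gib is made up for illustration, and it assumes the _kb suffix means units of 1024 bytes.

```python
# Values copied from the /etc/cufile.json excerpt above.
limits_kb = {
    "max_direct_io_size_kb": 2097152,
    "max_device_cache_size_kb": 2097152,
    "max_device_pinned_mem_size_kb": 33554432,
}


def kb_to_gib(kb: int) -> float:
    """Convert a size in KB to GiB (assumption: 1 KB = 1024 bytes)."""
    return kb / (1024 * 1024)


# The ~256 MB point above which registration fails, expressed in KB.
threshold_kb = 256 * 1024

for name, kb in limits_kb.items():
    above = kb > threshold_kb
    print(f"{name}: {kb_to_gib(kb):g} GiB (above 256 MB: {above})")
```

Under that assumption the three limits come out to 2 GiB, 2 GiB, and 32 GiB, so all of them are set well above the ~256 MB point where registration starts failing.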
Additional details:
- cuFileDriverOpen() succeeds.
- The kernel does not have an nvidia_fs module (modprobe nvidia_fs fails, and no nvidia-fs systemd service exists).
- GDS otherwise works fine (I/O flows through cuFile and I can do direct GPU reads).
- The system is running CUDA 12.x with the standard NVIDIA driver (same issue persists across restarts).
Questions:
- Is there a known upper limit (≈256 MB) for cuFileBufRegister() on systems without the nvidia_fs kernel module?
- How can I properly increase the maximum GPU buffer size that can be registered with GDS on this configuration?
- Does the absence of nvidia_fs indicate that GDS is running in a fallback / compatibility mode that enforces smaller registration limits?
- What configuration or driver changes are required to allow registering a larger GPU buffer (e.g., 512 MB, 1 GB, or 2 GB)?
Thank you!