GPUDirect Storage access to a remote SSD

Hi there,

I’m looking for help setting up GPUDirect Storage (GDS) to access a remote NVMe device over RDMA. I have a remote NVMe-over-RDMA block device (/dev/nvme4n1), and I’m trying to use gdsio to perform read/write operations on it.

Hardware Setup

  • CPU: INTEL(R) XEON(R) GOLD 6526Y, 64 cores
  • GPU: NVIDIA A100-SXM4-40GB
  • SSD: SAMSUNG MZQL21T9HCJR-00A07 (remote)

Software Setup

  • Ubuntu 22.04, Linux kernel 5.15.0
  • MLNX_OFED: MLNX_OFED_LINUX-24.10-1.1.4.0-ubuntu22.04-x86_64
  • CUDA: 12.6
  • GDS release version: 1.11.1.6
  • nvidia_fs version: 2.22
  • libcufile version: 2.12
  • Platform: x86_64
  • NVIDIA driver: 560.35.05
  • IOMMU: disabled

Output of gdscheck:

$ sudo gdscheck -p
 GDS release version: 1.11.1.6
 nvidia_fs version:  2.22 libcufile version: 2.12
 Platform: x86_64
 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Supported
 NVMeOF             : Supported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 BeeGFS             : Unsupported
 WekaFS             : Supported
 Userspace RDMA     : Supported
 --Mellanox PeerDirect : Enabled
 --rdma library        : Loaded (libcufile_rdma.so)
 --rdma devices        : Configured
 --rdma_device_status  : Up: 1 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_size : 128
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 16384
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 33554432
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.beegfs.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 fs.gpfs.gds_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 execution.max_io_threads : 4
 execution.max_io_queue_depth : 128
 execution.parallel_io : true
 execution.min_io_threshold_size_kb : 8192
 execution.max_request_parallelism : 4
 properties.force_odirect_mode : false
 properties.prefer_iouring : false
 =========
 GPU INFO:
 =========
 GPU index 0 NVIDIA A100-SXM4-40GB bar:1 bar size (MiB):65536 supports GDS, IOMMU State: Disabled
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Nvidia Driver Info Status: Supported(Nvidia Open Driver Installed)
 Cuda Driver Version Installed:  12060
 Platform: R283-S93-AAF1-000, Arch: x86_64(Linux 5.15.134+release+2.10.0r8-amd64)
 Platform verification succeeded

Mount Setup

$ df -Th | grep nvme4n1
/dev/nvme4n1                         ext4   1.8T  153G  1.5T  10% /mnt/remote

$ findmnt -o TARGET,FSTYPE,OPTIONS,SOURCE /mnt/remote
TARGET      FSTYPE OPTIONS                            SOURCE
/mnt/remote ext4   rw,relatime,stripe=32,data=ordered /dev/nvme4n1

Output of stat

$ stat /mnt/remote/
  File: /mnt/remote/
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 10305h/66309d	Inode: 2           Links: 3
Access: (0755/drwxr-xr-x)  Uid: ( 1000/  user)   Gid: ( 1000/  user)
Access: 2025-05-08 20:12:36.544308291 +0000
Modify: 2025-05-08 18:46:39.982090301 +0000
Change: 2025-05-08 18:46:39.982090301 +0000
 Birth: 2025-05-08 17:49:35.000000000 +0000

Topology

$ nvidia-smi topo -m
	    GPU0	NIC0	NIC1	NIC2	CPU Affinity	NUMA Affinity	GPU NUMA ID
GPU0	X 	    SYS	    SYS	    NODE	16-31,48-63	    1		        N/A
NIC0	SYS	    X 	    PIX	    SYS				
NIC1	SYS	    PIX	    X 	    SYS				
NIC2	NODE	SYS	    SYS	    X 				

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2

I used mlx5_2 to establish the RDMA connection between the local and remote nodes.

Observations

When I run the following command:

$ sudo gdsio -f /mnt/remote/32GFile -d 0 -i 4K -w 1 -x 0 -I 2 -T 10 -k 42
file register error: internal error filename :/mnt/remote/32GFile

This issue does not happen when using a local NVMe SSD (e.g., /dev/nvme0n1) mounted in the same way. It only fails on the NVMe-over-RDMA device.

There are also error messages in cuFile.log:

 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-udev:67 udev property not found: ID_FS_USAGE nvme4n1
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-fs:742 error getting volume attributes error for device: dev_no: 259:5
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] NOTICE  cufio:293 cuFileHandleRegister GDS not supported or disabled by config, using cuFile posix read/write with compat mode enabled
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-udev:67 udev property not found: ID_FS_USAGE nvme4n1
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-fs:742 error getting volume attributes error for device: dev_no: 259:5
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-obj:215 unable to get volume attributes for fd 56
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio:311 cuFileHandleRegister error, failed to allocate file object
 08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio:339 cuFileHandleRegister error: internal error
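The recurring "udev property not found: ID_FS_USAGE" lines are the key symptom here. A small sketch that scans a cuFile.log for this error and extracts the affected property/device pairs (the regex assumes the log format shown above, as emitted by GDS 1.11.x; adjust it if yours differs):

```python
import re

# Matches the cufio-udev error line format seen in the log above
# (assumed stable for this GDS release; the pattern is an assumption).
UDEV_ERR = re.compile(
    r"ERROR\s+cufio-udev:\d+ udev property not found: (\S+) (\S+)"
)

def missing_udev_properties(log_text):
    """Return the set of (property, device) pairs cuFile failed to read."""
    return {m.groups() for m in UDEV_ERR.finditer(log_text)}

sample = (
    "08-05-2025 20:38:46:605 [pid=610313 tid=610313] ERROR  cufio-udev:67 "
    "udev property not found: ID_FS_USAGE nvme4n1\n"
)
print(missing_udev_properties(sample))  # {('ID_FS_USAGE', 'nvme4n1')}
```

Any device that shows up here is one cuFile could not classify, which is why cuFileHandleRegister then falls back to compat mode or fails outright.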

I believe GPUDirect Storage supports NVMe-oF (gdscheck reports NVMeOF: Supported), but I’m not sure how to resolve this issue. I haven’t modified the cuFile.json configuration; it’s currently using all default settings.

@yk1234 Please check whether the field is somehow not populated:

$ udevadm info /dev/nvme4n1 | grep ID_FS_USAGE
E: ID_FS_USAGE=filesystem

$ udevadm info /dev/nvme0n1 | grep ID_FS_USAGE
E: ID_FS_USAGE=filesystem

You are right! The property is indeed missing on the remote device:

$ udevadm info /dev/nvme4n1 | grep ID_FS_USAGE
$ udevadm info /dev/nvme0n1 | grep ID_FS_USAGE
E: ID_FS_USAGE=filesystem

I resolved the issue by formatting /dev/nvme4n1. Now, the command udevadm info /dev/nvme4n1 | grep ID_FS_USAGE returns the expected output, and gdsio is working.
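For anyone hitting the same thing, the fix boiled down to recreating the filesystem so udev re-probes the device and repopulates the ID_FS_* properties. A sketch of the steps I used (destructive: mkfs wipes the device; the device path and mount point are from my setup, substitute your own):

```shell
# WARNING: mkfs.ext4 destroys all data on the device.
sudo umount /mnt/remote
sudo mkfs.ext4 /dev/nvme4n1

# Ask udev to re-probe the block device and wait for it to finish,
# so ID_FS_USAGE and friends get populated.
sudo udevadm trigger /dev/nvme4n1
sudo udevadm settle

# Verify the property cuFile needs is now present
# (expected: E: ID_FS_USAGE=filesystem)
udevadm info /dev/nvme4n1 | grep ID_FS_USAGE

sudo mount /dev/nvme4n1 /mnt/remote
```

If reformatting is not an option, `udevadm trigger` plus `udevadm settle` alone may be worth trying first, since the filesystem was evidently already there and only the udev database was missing the entry.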

Thanks!