GDS error: nvidia-fs MAP ioctl failed

I was trying to write files with cuFileWrite() but encountered the errors below:

 29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:40:34:300 [pid=34095 tid=34095] ERROR  0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
 29-03-2023 19:40:34:300 [pid=34095 tid=34095] ERROR  0:515 map failed

 29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR  0:809 Buffer map failed for PCI-Group: 0 GPU: 0
 29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR  0:921 Failed to obtain bounce buffer from domain: 0 GPU: 0
 29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR  0:1234 failed to get bounce buffer for PCI group 0 GPU 0
 29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR  cufio:2431 could not perform unaligned writes for fd: 37
 29-03-2023 19:40:34:304 [pid=34095 tid=34095] ERROR  0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
 29-03-2023 19:40:34:304 [pid=34095 tid=34095] ERROR  0:515 map failed

Meanwhile, when I try to use batchIO, I get the following:

 29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  cufio_batch:1183 cuFile error issuing IO, file descriptor is not registered fdinfo =  -22 for batch: 0
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  cufio_batch:1459 Error while submitting IO events 0 status:  5027
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  0:106 cuDeviceGet failed with error 4
 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR  0:106 cuDeviceGet failed with error 4


FYI, my GDS configuration:

 GDS release version: 1.6.0.25
 nvidia_fs version:  2.15 libcufile version: 2.12
 Platform: x86_64
 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Supported
 NVMeOF             : Unsupported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 BeeGFS             : Unsupported
 WekaFS             : Unsupported
 Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Disabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_size : 128
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 1024
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 18014398509481980
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.beegfs.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 fs.gpfs.gds_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 execution.max_io_threads : 0
 execution.max_io_queue_depth : 128
 execution.parallel_io : false
 execution.min_io_threshold_size_kb : 1024
 execution.max_request_parallelism : 0
 =========
 GPU INFO:
 =========
 GPU index 0 NVIDIA GeForce RTX 4080 bar:1 bar size (MiB):16384, IOMMU State: Disabled
 GPU index 1 Quadro GV100 bar:1 bar size (MiB):256 supports GDS, IOMMU State: Disabled
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Platform verification succeeded

When I switch to the GV100 instead of the RTX 4080, I get the following instead:

 29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR  0:1529 IOCTL failed io-type 1 ret -5 expected 1048576 gpu_page_offset 0
 29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR  0:904 write failed at file_offset 0 cur_size 1048576 retval -5
 29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR  0:106 cuDeviceGet failed with error 4
 29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR  0:106 cuDeviceGet failed with error 4
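
A note for anyone reading these logs: the negative values (`ioctl_return: -22` in the MAP ioctl failure, and `ret -5` / `retval -5` in the GV100 write failure) look like negated Linux errno codes, which is the usual convention for kernel ioctl returns. Assuming that convention applies to nvidia-fs (nothing in this thread confirms it), here is a small Python sketch to decode them. `describe_errno` is a hypothetical helper name, not part of any GDS tooling.

```python
import errno
import os


def describe_errno(ret: int) -> str:
    """Map a negative kernel-style return code (e.g. -22) to its errno name and message.

    Assumption: the value is a negated errno, as is conventional for ioctl returns.
    """
    code = abs(ret)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{ret}: {name} ({os.strerror(code)})"


# Values taken from the logs in this thread.
for ret in (-22, -5):
    print(describe_errno(ret))
```

Under that assumption, -22 decodes to EINVAL and -5 to EIO. That only narrows down the kind of failure; it does not by itself say why the GPU buffer mapping was rejected.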

Hi, I have encountered the same problem. I tried different NVIDIA drivers, but it still does not work. Have you managed to solve this problem?

Thank you so much.

I’m also facing the same error. I am trying to run the sample ./cufile_sample_001, compiled from cufile_sample_001.cc in /usr/local/cuda/gds/samples. I set the logging level to trace to capture as much information as possible, but the output is hard to understand. Please advise.
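
Since trace-level output is long, one way to make it easier to read is to keep only the ERROR lines and count the repeats. The sketch below is Python and assumes the line layout seen in the logs in this thread (`DD-MM-YYYY HH:MM:SS:mmm [pid=… tid=…] LEVEL origin:line message`). The regular expression and the `summarize_errors` helper are illustrative, not part of any GDS tooling.

```python
import re
from collections import Counter

# Pattern inferred from the log lines pasted in this thread (an assumption,
# not a documented format).
LOG_LINE = re.compile(
    r"^\s*(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"\[pid=(?P<pid>\d+) tid=(?P<tid>\d+)\] "
    r"(?P<level>\w+)\s+(?P<origin>\S+) (?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count distinct (origin, message) pairs among ERROR-level lines."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("level") == "ERROR":
            counts[(m.group("origin"), m.group("msg").strip())] += 1
    return counts


# Sample lines copied from the logs in this thread.
sample = [
    " 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   cufio:609 Loaded successfully  libmount.so",
    " 04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3",
    " 04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3",
    " 29-03-2023 19:40:34:300 [pid=34095 tid=34095] ERROR  0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1",
]

for (origin, msg), n in summarize_errors(sample).items():
    print(f"{n}x {origin}: {msg}")
```

To run it on a real log, pass an open file object instead of `sample`; the log file location depends on the local cuFile configuration.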

GDS check:

 GDS release version: 1.6.0.25
 nvidia_fs version:  2.15 libcufile version: 2.12
 Platform: x86_64
 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Supported
 NVMeOF             : Unsupported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 BeeGFS             : Unsupported
 WekaFS             : Unsupported
 Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Disabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_size : 128
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 16384
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 33554432
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.beegfs.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 fs.gpfs.gds_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 execution.max_io_threads : 0
 execution.max_io_queue_depth : 128
 execution.parallel_io : false
 execution.min_io_threshold_size_kb : 8192
 execution.max_request_parallelism : 0
 =========
 GPU INFO:
 =========
 GPU index 0 NVIDIA GeForce RTX 4090 bar:1 bar size (MiB):32768, IOMMU State: Disabled
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Platform verification succeeded

How to reproduce:

sudo ./cufile_sample_001 /home/user/Documents/gdstest/test 0

 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   0:324 Lib being used for urcup concurrency : libcufile_ck 
 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   cufio:609 Loaded successfully  libcufile_ck.so
 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   cufio:609 Loaded successfully  libmount.so
 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   cufio:609 Loaded successfully  libudev.so
 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   cufio:613 Using CKIT static library
 04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO   0:167 nvidia_fs driver open invoked
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:401 GDS release version: 1.6.0.25
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:404 nvidia_fs version:  2.15 libcufile version: 2.12
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:408 Platform: x86_64
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:290 NVMe: driver support OK
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:329 WekaFS: driver support OK
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:528 nvidia_fs driver version check ok
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:290 NVMe: driver support OK
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:329 WekaFS: driver support OK
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:189 ============
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:190 ENVIRONMENT:
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:191 ============
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:204 =====================
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:205 DRIVER CONFIGURATION:
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:206 =====================
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:208 NVMe               : Supported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:209 NVMeOF             : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:210 SCSI               : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:211 ScaleFlux CSD      : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:212 NVMesh             : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:215 DDN EXAScaler      : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:219 IBM Spectrum Scale : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:223 NFS                : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-drv:226 BeeGFS             : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-rdma:145 No valid ip addresses specified for RDMA devices. Disabling GDS userspace RDMA access

 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1127 WekaFS             : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1129 Userspace RDMA     : Unsupported
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1137 --Mellanox PeerDirect : Disabled
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1145 --rdma library        : Not Loaded (libcufile_rdma.so)
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1148 --rdma devices        : Not configured
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-rdma:1151 --rdma_device_status  : Up: 0 Down: 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio:885 =====================
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio:886 CUFILE CONFIGURATION:
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio:887 =====================
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1263 properties.use_compat_mode : true
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1265 properties.force_compat_mode : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1267 properties.gds_rdma_write_support : true
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1269 properties.use_poll_mode : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1271 properties.poll_mode_max_size_kb : 4
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1273 properties.max_batch_io_size : 128
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1275 properties.max_batch_io_timeout_msecs : 5
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1277 properties.max_direct_io_size_kb : 16384
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1279 properties.max_device_cache_size_kb : 131072
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1281 properties.max_device_pinned_mem_size_kb : 33554432
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1283 properties.posix_pool_slab_size_kb : 4 1024 16384 
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1285 properties.posix_pool_slab_count : 128 64 32 
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1287 properties.rdma_peer_affinity_policy : RoundRobin
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1289 properties.rdma_dynamic_routing : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1296 fs.generic.posix_unaligned_writes : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1299 fs.lustre.posix_gds_min_kb: 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1313 fs.beegfs.posix_gds_min_kb: 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1328 fs.weka.rdma_write_support: false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1354 fs.gpfs.gds_write_support: false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1367 profile.nvtx : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1369 profile.cufile_stats : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1371 miscellaneous.api_check_aggressive : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1379 execution.max_io_threads : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1380 execution.max_io_queue_depth : 128
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1381 execution.parallel_io : false
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1382 execution.min_io_threshold_size_kb : 8192
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   0:1383 execution.max_request_parallelism : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:790 =========
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:791 GPU INFO:
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:792 =========
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE  cufio-plat:306 gpu attribute read , cuDeviceGetAttribute GPU_DIRECT_RDMA_SUPPORTED value: 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-plat:379 GPU BDF: 0000:01:00.0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-plat:349 Searching IOMMU entries in /sys/bus/pci/devices/0000:01:00.0/iommu
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-plat:388 cuda GPU device attributes:  gpu :0 model :NVIDIA GeForce RTX 4090 nvdirect :0 numa:-1 pcibridge: bar :1 barBase :274877906956 barSize :34359738368 streamMemOps :0 dmaBufCapable:0 GDRBufCapable:0 bdf :0 : 1 : 0 : 0

 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:446 GPU index 0 NVIDIA GeForce RTX 4090 bar:1 bar size (MiB):32768, IOMMU State: Disabled
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:452 Total GPUS supported on this platform 1
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:803 ==============
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:804 PLATFORM INFO:
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:805 ==============
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:147 device pci path string : 0000:01:00.0->0000:00:01.0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-plat:674 GPU Dev: 0 numa_node: -1 PCI Group 0000:00:01.0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:570 ACS not enabled in GPU paths
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:723 cannot open scsi_mod path, skip scsi check
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:810 use_mq not detected in scsi configuration.cannot support SCSI disks!
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:695 IOMMU: disabled
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-plat:846 Platform verification succeeded
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE  cufio-drv:457 0000:01:00.0  0000:04:00.0  0x00820090 0x0082 0x04 0x04 0xffffffff 0 nvme
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE  cufio-drv:457 0000:01:00.0  0000:06:00.0  0x0082009e 0x0082 0x01 0x02 0xffffffff 0 network
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: nvme path: /sys/devices/pci0000:00/0000:00:1b.4/0000:04:00.0/nvme/nvme0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: nvme0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x144d
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:04:00.0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:84 adding attributes for device nvme0 device link width: 4 device link speed: 4 ) numa node : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO   cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:75 device name attribute already set nvme0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:84 adding attributes for device nvme0 device link width: 4 device link speed: 4 ) numa node : 0
 04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO   cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:75 device name attribute already set eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO   cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:545 vendor id attribute for device found: 0x8086
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:538 sys attribute sysname for device found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:543 vendor id attribute for device not found: lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR  cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:75 device name attribute already set eno2
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:273 printing cufile platform topology using nvfs probe:
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:283 gpu 0000:01:00.0 peers : 0000:04:00.0(8519824) 0000:06:00.0(8519838) 
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:291 peer 0000:06:00.0 gpus : 0000:01:00.0(8519838) 
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-topo-nvfs:291 peer 0000:04:00.0 gpus : 0000:01:00.0(8519824) 
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-drv:559 checking GPU attributes 0000:00:01.0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio-drv:568 Assigning numanode :  -1 to PCI Group
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio-drv:586 Added group : pgroup 0 0x56349f63e790 to map
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-drv:588 new group 0000:00:01.0 groupid 0 size 1
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-drv:640 ngpus 1 pos 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:290 Bounce buffers initializing... PCI-Groups 1
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:301 Buffer pool initializing for GPU 0 PCI-Group 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:223 Buffer pool initialized with 128 slots and priority: default
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:311 Buffer pool setup for GPU 0 with 128 slots Caching enabled: 1 priority: default PCI-Group: 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:341 Bounce buffers initialization complete
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-drv:677 PCI Groups initialized
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-px-pool:374 Initializing cufile POSIX pool
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:223 Buffer pool initialized with 128 slots and priority: default
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 4 slots: 128
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:223 Buffer pool initialized with 64 slots and priority: default
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 1024 slots: 64
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  0:223 Buffer pool initialized with 32 slots and priority: default
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG  cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 16384 slots: 32
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO   cufio-px-pool:448 POSIX pool buffer initialization complete
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO   curdma-ldbal:510 No RDMA devices configured,skipping RDMA load balancer initialization
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio:471 Threadpool Initialize Obtained pgroup 0x56349f63e790 from map
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio:482 Threadpool Initialize Obtained pgroup : numa : numgpus 0x56349f63e790 0 1
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  0:71 numa_num_configured_nodes obtained numNumaNodes :  1
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio:34 Discovered 1 numa nodes on this system
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  0:77 setting numa_set_bind_policy preferred policy
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio:24 Threapool workqueue 0x56349f30f230 for numa node 0
 04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE  cufio:64 create workqueue 0 0 0x56349f30f230
 04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE  cufio:38 Started tid:  140200718102528
 04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE  cufio:41 Creating a thread pool with 0 threads
 04-04-2023 14:31:56:901 [pid=6480 tid=6484] TRACE  cufio:78 Started Thread: 0x56349f63e370
 04-04-2023 14:31:56:901 [pid=6480 tid=6480] INFO   cufio:951 CUFile initialization complete
 04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE  cufio:3260 cuFileDriverOpen success
 04-04-2023 14:31:56:901 [pid=6480 tid=6480] DEBUG  cufio:1460 cuFileHandleRegister invoked
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found model nvme0n1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:299 detected nvme model: Samsung SSD 990 PRO 2TB                  wwid: eui.0025384a21403079 xport: pcie /sys/devices/pci0000:00/0000:00:1b.4/0000:04:00.0/nvme/nvme0/nvme0n1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1p1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1p1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found model nvme0n1p1
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1p2
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1p2
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found model nvme0n1p2
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1p3
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1p3
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found model nvme0n1p3
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found model nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:147 device pci path string : 0000:04:00.0->0000:00:1b.4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found integrity/device_is_integrity_capable nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-fs:284 block device nvme0n1p4 drive integrity check capability not present. Ok
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] INFO   cufio-fs:357 Block dev: /dev/nvme0n1p4 numa node: 0 pci bridge: 0000:00:1b.4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found device/transport nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found wwid nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-udev:94 sysfs attribute found queue/logical_block_size nvme0n1p4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG  cufio-fs:706 vol pciGroup : 0000:00:1b.4
 04-04-2023 14:31:56:902 [pid=6480 tid=6480] TRACE  cufio-fs:720 block device supported by cufile, bdev : /dev/nvme0n1p4 module nvme xport pcie
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio-fs:736 added volume attributes for device: dev_no: 259:4
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1145 cuFile DIO status for file descriptor 45 DirectIO supported
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio-fs:676 Found cached Volume Attributes for device: dev_no: 259:4 isDFS: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio-obj:201 setting default GDS write support for bdev /dev/nvme0n1p4 xport pcie module: nvme to true
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio-obj:350 Compatibility Mode: 1 Compat Read Mode: 0 Compat Write Mode: 0 Needs RDMA: 0 Needs Unaligned Access: 0 posix_io_threshold: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio-obj:356 Needs Kernel RDMA: 0 use_posix_for_unaligned_write: 0 gds batch enabled: 1 Posix retry on -ENOTSUPP: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1584 cuFileHandleRegister success
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1657 cuFileBufRegister invoked devPtr 0x7f82eac00000
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1232 devPtr: 0x7f82eac00000 chunk: 0 chunk base: 0x7f82eac00000 chunk offset: 0 chunk size: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1677 cuFile buffer checks passed devPtr 0x7f82eac00000 req size: 131072 mapped size: 131072 registered size: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:278 io priority: default stream level: -2
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio-obj:107 mapping nvinfo: 0x56349f876080 size: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:335 map buf 0x7f82eac00000 Size 131072 sbuf_size 16777216 pin_gpu_memory 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:336 map buf 0x7f82eac00000 bounce-buffer 0 groupId 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1032 Inc-bar-usage domain: 0 GPU: 0 size: 131072 cache: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:849 PCI Group found for domain: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:1057 Total usage 0 Max Usage 33554432
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:441 mmap shadow buffers size: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:487 MAP gpu index : 0 bdf: 0 1 0 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:515 map failed

 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:849 PCI Group found for domain: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1114 Dec-bar-usage domain 0 GPUID 0 size 131072 cache 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  cufio-obj:112 error allocating nvfs handle, size: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  cufio:1708 cuFileBufRegister error, object allocation failed
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio-obj:43 deleted chunk devPtr: 0x7f82eac00000 len: 131072
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  cufio:1786 cuFileBufRegister error cufile success
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:3202 cuFileWrite invoked
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:2801 cuFileReadWriteCheckandSubmit invoked
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:180 Hash Lookup nvinfo 0 key 0x7f82eac00000
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:335 map buf 0x7f82eac00000 Size 131072 sbuf_size 16777216 pin_gpu_memory 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:336 map buf 0x7f82eac00000 bounce-buffer 0 groupId -1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1988 GetPCIGroupIDAndDomain cuFile using GPU PCIGroup 0 for bounce buffer, fdpciGroupID: -1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:2543 write inode: 67403417 offset: 0 length: 131072 devPtr: 0x7f82eac00000 bufOff: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:2546 write dev: 4
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:1704 nvfs_io_submit file_offset 0 size 131072 gpu_offset 0 nvbuf 0x7ffcf53ce900 is_unaligned 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1207 io submit bb io_type: 1 file_offset: 0 size: 131072 gpu_index 0 gpu_buffer_offset 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1221 get bb:  1 cross_domain 0 unaligned 0 GPU 0 PCI-Group 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:849 PCI Group found for domain: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:712 Get buffer from PCI-Group 0 GPU 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:590 Found slot 127 Avaliable slots 127
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:747 Allocating and pinning buffer from PCI-Group 0 GPU 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:441 current cuda context present
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:57 push primary context: 0x56349ee494a0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:452 Allocate buffer of size 1048576 on GPU 0 PCI-Group 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:381 Bounce buffer page aligned
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:461 Buffer from aligned alloc, dptr 140200261058560 aligned_dptr 140200261058560 size 1048576
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:278 io priority: default stream level: -2
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:335 map buf 0x7f82eac20000 Size 1048576 sbuf_size 1048576 pin_gpu_memory 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:336 map buf 0x7f82eac20000 bounce-buffer 1 groupId 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:407 cuda stream 0x56349f62eeb0 created with priority: -2
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1032 Inc-bar-usage domain: 0 GPU: 0 size: 1048576 cache: 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:849 PCI Group found for domain: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:1057 Total usage 0 Max Usage 33554432
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:441 mmap shadow buffers size: 1048576
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:487 MAP gpu index : 0 bdf: 0 1 0 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:515 map failed

 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:849 PCI Group found for domain: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  0:1114 Dec-bar-usage domain 0 GPUID 0 size 1048576 cache 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:473 map failed for GPU: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:80 pop context: 0x56349ee494a0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:84 push context: 0x7ffcf53ce3c8
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:809 Buffer map failed for PCI-Group: 0 GPU: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:921 Failed to obtain bounce buffer from domain: 0 GPU: 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  0:1234 failed to get bounce buffer for PCI group 0 GPU 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:743 release unregistered nvHandle: 0x7ffcf53ce900
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:213 cufio-internal - free buf  0x7ffcf53ce900
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:3111 cuFileReadWriteCheckandSubmit done
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:3227 cuFileWrite done
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1867 cuFileBufDeregister invoked
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1882 Deregistering devptr: 0x7f82eac00000
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:180 Hash Lookup nvinfo 0 key 0x7f82eac00000
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1886 nvinfo obtained from hash table during cuFileBufDeregister 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:1904 Calling cuFileBufDeregister from here as nvinfo already removed from ht 0
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  cufio:1810 cuFileBufDeregister error, object for device pointer is not registered
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR  cufio:1932 cuFileBufDeregister error: device pointer lookup failure
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1609 cuFileHandleDeregister invoked
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1636 cuFileHandleDeregister done
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE  cufio:3294 cuFileDriver closing
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1023 cuFile clearing active batch operations
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  cufio:1025 Destroying Batch Pool
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG  0:378 Batch Ctx state 1
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1028 cuFile clearing buffer hashtable
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE  0:200 Bounce buffer io is not in-progress
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE  cufio-px-pool:99 Posix Bounce buffer io is not in-progress
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1057 cuFile destroying posix buffer pool
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE  cufio-px-pool:460 Releasing POSIX pool buffers
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:468 Releasing POSIX pool size: 4096 for GPU: 0
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:141 Tearing down pci-info with 1 GPUs
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 128
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 128
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:468 Releasing POSIX pool size: 1048576 for GPU: 0
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:141 Tearing down pci-info with 1 GPUs
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 64
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 64
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:468 Releasing POSIX pool size: 16777216 for GPU: 0
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:141 Tearing down pci-info with 1 GPUs
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 32
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 32
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] INFO   cufio-px-pool:479 POSIX pool buffer release complete
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1062 cuFile clearing file hashtable
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1065 cuFile clearing volumeAttributes
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1068 cuFile cleanring pciGroupMap
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1072 cuFile cleanring pci group number map
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1076 cuFile clearing Dynamic Routing info
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1079 cuFile clearing pci topology
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1082 cuFile clearing all gpu entries
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  cufio:1085 cuFile closing Driver
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:175 Tearing down bounce buffers
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:141 Tearing down pci-info with 1 GPUs
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:103 Tearing down buffers from GPU 0
 04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG  0:110 free buffers 128
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO   0:140 nvidia_fs driver closed
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] DEBUG  cufio:1094 cuFile clearing all hashtables
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] DEBUG  cufio:1097 cuFile shutting threadpool
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:110 killing pollworker 0x56349f63e370
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:45 marking thread exit 140200718102528 thread: 0x56349f63e370
 04-04-2023 14:31:57:911 [pid=6480 tid=6484] TRACE  cufio:80 Thread func exited 0x56349f63e370 tid: 140200718102528
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:48 Thread exited
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:117 killing cuFileWaitQueue 0x56349f626360
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:71 delete workqueue 0x56349f30f230
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE  cufio:33 Killing workQueue 0x56349f30f230
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO   cufio:1106 cuFile shutdown
 04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO   cufio:1108 Logger Shutdown

Vaz.Valois,

GPUDirect Storage is currently supported only on Quadro and Tesla GPUs in p2p mode. This is indicated by the ‘supported’ string in the gdscheck output.

Also, the block filesystems that GPUDirect Storage supports are ext4 (mounted in ordered mode) and XFS.
The Linux distros tested are Ubuntu 18.04 through 22.04 and RHEL 8.4 and above.
GPUDirect Storage is supported on Linux kernels from 4.15.x to 5.15.x.
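
As a rough way to check a machine against the constraints listed above, here is a minimal Python sketch. The helper names (`find_mount`, `is_gds_block_fs`, `kernel_in_range`) are hypothetical, the kernel range is the one quoted in this reply, and treating a missing `data=` option as ordered mode is an assumption about ext4 defaults, not something gdscheck reports.

```python
import os
import re

# Kernel range quoted in the reply above (4.15.x to 5.15.x).
SUPPORTED_KERNELS = ((4, 15), (5, 15))


def find_mount(path, mounts_text):
    """Return (fstype, options) of the longest mount point containing path."""
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fstype, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype, options)
    return None if best is None else (best[1], best[2])


def is_gds_block_fs(fstype, options):
    """True for XFS, or ext4 without a non-ordered data= mount option."""
    if fstype == "xfs":
        return True
    if fstype == "ext4":
        data_modes = [o for o in options if o.startswith("data=")]
        # Assumption: no explicit data= option means the ext4 default (ordered).
        return not data_modes or data_modes == ["data=ordered"]
    return False


def kernel_in_range(release):
    """Compare the major.minor part of a release string with the quoted range."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        return False
    version = (int(m.group(1)), int(m.group(2)))
    return SUPPORTED_KERNELS[0] <= version <= SUPPORTED_KERNELS[1]
```

On a real system this could be called as `is_gds_block_fs(*find_mount(path, open("/proc/mounts").read()))` and `kernel_in_range(os.uname().release)`. It only mirrors the list above and does not replace gdscheck.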

04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed

This comes from cuFileBufRegister, which is failing to pin the BAR1 space.

ExtremeViscent,

29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:1529 IOCTL failed io-type 1 ret -5 expected 1048576 gpu_page_offset 0
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:904 write failed at file_offset 0 cur_size 1048576 retval -5

The IO call is returning -EIO.
This could come from the filesystem or from an error in the kernel.
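
For reading these logs, the negative return values are negated Linux errno codes (-22 is EINVAL, -5 is EIO). A small Python sketch for pulling them out of a log line follows. The function name `decode_returns` and the set of field names it looks for are assumptions based only on the lines quoted in this thread.

```python
import errno
import os
import re

# Field names as they appear in the log lines quoted in this thread.
# "ioctl_ret: -1" is deliberately not matched: it is the raw syscall
# return value rather than a negated errno code.
RET_PATTERN = re.compile(r"\b(?:ioctl_return|ret|retval)[:\s]+(-\d+)")


def decode_returns(log_line):
    """Return a list of (errno name, description) for codes found in a line."""
    results = []
    for match in RET_PATTERN.finditer(log_line):
        code = -int(match.group(1))
        name = errno.errorcode.get(code, "UNKNOWN")
        results.append((name, os.strerror(code)))
    return results
```

For example, passing the "MAP ioctl failed" line to `decode_returns` yields an EINVAL entry, and the "write failed" line yields EIO.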

Please check that the file path you are registering is indeed supported by GDS.

cuFileHandleRegister should return CU_FILE_SUCCESS.

@kmodukuri,

thank you for your reply. Is there a way to use it without p2p mode? I have a different computer with different GPUs where GDS is working fine, but I have been having trouble installing it on this other machine.

The machine that works shows this output in gdscheck:

 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Unsupported
 NVMeOF             : Unsupported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 WekaFS             : Unsupported
 Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Disabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 16384
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 33554432
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 =========
 GPU INFO:
 =========
 GPU index 0 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
 GPU index 1 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
 GPU index 2 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Platform verification succeeded

I see that it reports "supports GDS" for the GPUs but "NVMe: Unsupported". Is this specific to the GPU version? Or can I configure my computer the same way?
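
To compare two machines field by field, the gdscheck text can be parsed into a dictionary. This is a minimal Python sketch that assumes the layout of the output pasted in this thread (upper-case section headers ending in a colon, `name : value` lines, and `GPU index ...` lines). `parse_gdscheck` is a hypothetical helper, not part of the GDS tools.

```python
import re

# Matches lines such as:
# "GPU index 0 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS"
GPU_LINE = re.compile(r"GPU index (\d+) (.+?) bar:\d+ bar size \(MiB\):(\d+) (.+)")

SECTION_KEYS = {
    "DRIVER CONFIGURATION": "drivers",
    "CUFILE CONFIGURATION": "config",
}


def parse_gdscheck(text):
    """Split gdscheck output into driver status, cuFile config and GPU entries."""
    report = {"drivers": {}, "config": {}, "gpus": []}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"="}:
            continue  # skip blank lines and "=====" rulers
        if line.endswith(":") and line.isupper():
            section = line[:-1]  # e.g. "GPU INFO"
            continue
        if section == "GPU INFO":
            m = GPU_LINE.match(line)
            if m:
                report["gpus"].append({
                    "index": int(m.group(1)),
                    "name": m.group(2),
                    "bar_size_mib": int(m.group(3)),
                    "status": m.group(4),
                })
        elif section in SECTION_KEYS:
            key, sep, value = line.partition(":")
            if sep:
                # Sub-entries are prefixed with "--" in the output.
                report[SECTION_KEYS[section]][key.strip().lstrip("-").strip()] = value.strip()
    return report
```

Running this on the output from both machines and diffing `report["drivers"]`, `report["config"]` and `report["gpus"]` shows exactly which lines differ, for example the `properties.use_compat_mode` value.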

Hi Kmodukuri,

I can confirm that the path is on ext4, mounted with data=ordered. The hardware model is shown as:

Samsung 980 PRO with Heatsink 2TB

Are there any further clues to solve the issue?

To force compat mode with cuFile, you can try setting CUFILE_FORCE_COMPAT_MODE=True or, if you are the only user and have admin privileges, removing the nvidia-fs.ko driver.
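
One way to apply the environment variable to a single program without changing the shell is to launch it from a small Python wrapper. The variable name and the value `True` are taken from this reply. The helper names `compat_env` and `run_with_compat` are hypothetical, and the command passed in stands for whatever cuFile application is being tested.

```python
import os
import subprocess


def compat_env(value="True", base=None):
    """Copy an environment and set CUFILE_FORCE_COMPAT_MODE in the copy."""
    env = dict(os.environ if base is None else base)
    env["CUFILE_FORCE_COMPAT_MODE"] = value  # name and value as given in the reply
    return env


def run_with_compat(cmd, value="True"):
    """Run cmd (a list of arguments) with compat mode forced for that process only."""
    return subprocess.run(cmd, env=compat_env(value), check=False)
```

For example, `run_with_compat(["./my_cufile_app"])` (a placeholder binary name) starts the application with the variable set, leaving the parent environment untouched.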

Hi Kmodukuri,
The details of compat mode are not described in the docs, so I would like to ask for a few clarifications:

  • Are there any trade-offs in using compat mode?
  • Are CPU bounce buffers used in compat mode?
  • What causes the incompatibility with my NVMe?

Hello, I have encountered the same problem, and I found a description of compat mode in the NVIDIA GPUDirect Storage Benchmarking and Configuration Guide. The documentation says:

So, to get the full performance of GDS at present, do we need to switch to Quadro or Tesla GPUs?