I was trying to write files with cuFileWrite(), but I encountered the errors below:
29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:40:34:296 [pid=34095 tid=34095] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:40:34:300 [pid=34095 tid=34095] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
29-03-2023 19:40:34:300 [pid=34095 tid=34095] ERROR 0:515 map failed
29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR 0:809 Buffer map failed for PCI-Group: 0 GPU: 0
29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR 0:921 Failed to obtain bounce buffer from domain: 0 GPU: 0
29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR 0:1234 failed to get bounce buffer for PCI group 0 GPU 0
29-03-2023 19:40:34:302 [pid=34095 tid=34095] ERROR cufio:2431 could not perform unaligned writes for fd: 37
29-03-2023 19:40:34:304 [pid=34095 tid=34095] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
29-03-2023 19:40:34:304 [pid=34095 tid=34095] ERROR 0:515 map failed
When I try to use the batch I/O API instead, I get the following:
29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:44:01:814 [pid=34630 tid=34630] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio_batch:1183 cuFile error issuing IO, file descriptor is not registered fdinfo = -22 for batch: 0
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio_batch:1459 Error while submitting IO events 0 status: 5027
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR 0:106 cuDeviceGet failed with error 4
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR 0:106 cuDeviceGet failed with error 4
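Since many of these lines repeat, it can help to collapse the log into one entry per distinct error. The following is only a sketch: the regular expression is inferred from the log lines pasted in this thread (timestamp, `[pid= tid=]`, level, source, message) and is not an official cuFile log format specification. The name `summarize_errors` is made up for illustration.

```python
import re
from collections import Counter

# Line layout inferred from the logs in this thread, e.g.:
# 29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio-obj:61 message...
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) \[pid=(?P<pid>\d+) tid=(?P<tid>\d+)\] "
    r"(?P<level>[A-Z]+) (?P<src>\S+) (?P<msg>.*)$"
)

def summarize_errors(log_text):
    """Count ERROR lines, grouped by (source, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "ERROR":
            counts[(m.group("src"), m.group("msg"))] += 1
    return counts

sample = """\
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR cufio-obj:61 cuFile error fetching fd, invalid CUfileHandle
29-03-2023 19:44:01:818 [pid=34630 tid=34630] ERROR 0:106 cuDeviceGet failed with error 4
"""

for (src, msg), n in summarize_errors(sample).items():
    print(f"{n}x {src} {msg}")
```

Running it over a full cufile.log gives a short list of distinct failures, which is easier to compare between runs than the raw output.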
FYI, here is my GDS check output:
GDS release version: 1.6.0.25
nvidia_fs version: 2.15 libcufile version: 2.12
Platform: x86_64
============
ENVIRONMENT:
============
=====================
DRIVER CONFIGURATION:
=====================
NVMe : Supported
NVMeOF : Unsupported
SCSI : Unsupported
ScaleFlux CSD : Unsupported
NVMesh : Unsupported
DDN EXAScaler : Unsupported
IBM Spectrum Scale : Unsupported
NFS : Unsupported
BeeGFS : Unsupported
WekaFS : Unsupported
Userspace RDMA : Unsupported
--Mellanox PeerDirect : Disabled
--rdma library : Not Loaded (libcufile_rdma.so)
--rdma devices : Not configured
--rdma_device_status : Up: 0 Down: 0
=====================
CUFILE CONFIGURATION:
=====================
properties.use_compat_mode : true
properties.force_compat_mode : false
properties.gds_rdma_write_support : true
properties.use_poll_mode : false
properties.poll_mode_max_size_kb : 4
properties.max_batch_io_size : 128
properties.max_batch_io_timeout_msecs : 5
properties.max_direct_io_size_kb : 1024
properties.max_device_cache_size_kb : 131072
properties.max_device_pinned_mem_size_kb : 18014398509481980
properties.posix_pool_slab_size_kb : 4 1024 16384
properties.posix_pool_slab_count : 128 64 32
properties.rdma_peer_affinity_policy : RoundRobin
properties.rdma_dynamic_routing : 0
fs.generic.posix_unaligned_writes : false
fs.lustre.posix_gds_min_kb: 0
fs.beegfs.posix_gds_min_kb: 0
fs.weka.rdma_write_support: false
fs.gpfs.gds_write_support: false
profile.nvtx : false
profile.cufile_stats : 0
miscellaneous.api_check_aggressive : false
execution.max_io_threads : 0
execution.max_io_queue_depth : 128
execution.parallel_io : false
execution.min_io_threshold_size_kb : 1024
execution.max_request_parallelism : 0
=========
GPU INFO:
=========
GPU index 0 NVIDIA GeForce RTX 4080 bar:1 bar size (MiB):16384, IOMMU State: Disabled
GPU index 1 Quadro GV100 bar:1 bar size (MiB):256 supports GDS, IOMMU State: Disabled
==============
PLATFORM INFO:
==============
IOMMU: disabled
Platform verification succeeded
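One detail in the GPU INFO section above is easy to miss: the Quadro GV100 line contains "supports GDS", while the RTX 4080 line does not. A small sketch that pulls this out of the text, assuming the line layout shown above (the regular expression and the helper name `parse_gpu_lines` are illustrative, not part of any NVIDIA tool):

```python
import re

# Layout taken from the "GPU INFO" lines in the output above.
GPU_RE = re.compile(
    r"^GPU index (?P<idx>\d+) (?P<name>.+?) bar:(?P<bar>\d+) "
    r"bar size \(MiB\):(?P<size>\d+)(?P<rest>.*)$"
)

def parse_gpu_lines(text):
    """Return one dict per GPU line, noting whether 'supports GDS' was printed."""
    gpus = []
    for line in text.splitlines():
        m = GPU_RE.match(line.strip())
        if m:
            gpus.append({
                "index": int(m["idx"]),
                "name": m["name"],
                "bar_mib": int(m["size"]),
                "supports_gds": "supports GDS" in m["rest"],
            })
    return gpus

sample = """\
GPU index 0 NVIDIA GeForce RTX 4080 bar:1 bar size (MiB):16384, IOMMU State: Disabled
GPU index 1 Quadro GV100 bar:1 bar size (MiB):256 supports GDS, IOMMU State: Disabled
"""

for gpu in parse_gpu_lines(sample):
    print(gpu["index"], gpu["name"], "supports GDS:", gpu["supports_gds"])
```

This only reports what the tool printed. It does not by itself explain why the phrase is missing for one GPU.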
When I switch to the Quadro GV100 instead of the RTX 4080, I get the following:
29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 20:07:43:611 [pid=37710 tid=37710] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:1529 IOCTL failed io-type 1 ret -5 expected 1048576 gpu_page_offset 0
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:904 write failed at file_offset 0 cur_size 1048576 retval -5
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:106 cuDeviceGet failed with error 4
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:106 cuDeviceGet failed with error 4
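The negative numbers in the logs above (`ioctl_return: -22`, `ret -5`, `retval -5`) look like negated Linux errno values. That reading is an assumption based on the usual kernel convention, not something the logs state. Under that assumption, a quick way to turn them into names:

```python
import errno
import os

def describe_errno(ret):
    """Return 'NAME: message' for a negated errno-style return code."""
    code = abs(ret)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"

# Values copied from the logs above.
for ret in (-22, -5):
    print(ret, describe_errno(ret))
```

On Linux, 22 maps to EINVAL (invalid argument) and 5 to EIO (I/O error), so the MAP ioctl on the RTX 4080 appears to be rejected outright, while on the GV100 the write itself fails with an I/O error.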
Hi, I have encountered the same problem. I tried different NVIDIA drivers, but it still does not work. Could you please tell me whether you have solved this problem?
Thank you so much.
I’m also facing the same error. I am trying to run the sample ./cufile_sample_001, compiled from cufile_sample_001.cc in /usr/local/cuda/gds/samples. I set the logging level to trace to capture as much information as possible, but the output is hard to understand. Please advise.
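For anyone who wants to reproduce the trace output, here is a rough sketch of switching the log level in a copy of the cuFile configuration. It assumes the config has a `"logging"` section with a `"level"` key, as in the stock `/etc/cufile.json`, and that the library is pointed at the edited copy through the `CUFILE_ENV_PATH_JSON` environment variable. The sample text is a hand-written excerpt, and `set_log_level` is a hypothetical helper. Plain text substitution is used because the stock file contains `//` comments that a strict JSON parser would reject.

```python
import re

def set_log_level(config_text, level):
    """Replace the first "level" value inside the "logging" section."""
    pattern = re.compile(
        r'("logging"\s*:\s*\{.*?"level"\s*:\s*")[A-Z]+(")', re.DOTALL
    )
    new_text, n = pattern.subn(
        lambda m: m.group(1) + level + m.group(2), config_text, count=1
    )
    if n == 0:
        raise ValueError("no logging.level entry found")
    return new_text

# Hand-written excerpt in the style of cufile.json (assumed layout).
sample = """{
    "logging": {
        // NOTICE|ERROR|WARN|INFO|DEBUG|TRACE
        "level": "ERROR"
    }
}
"""

print(set_log_level(sample, "TRACE"))
```

After writing the result to, say, a file in your home directory, exporting `CUFILE_ENV_PATH_JSON` with that path before running the sample should make the library pick it up.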
GDS check:
GDS release version: 1.6.0.25
nvidia_fs version: 2.15 libcufile version: 2.12
Platform: x86_64
============
ENVIRONMENT:
============
=====================
DRIVER CONFIGURATION:
=====================
NVMe : Supported
NVMeOF : Unsupported
SCSI : Unsupported
ScaleFlux CSD : Unsupported
NVMesh : Unsupported
DDN EXAScaler : Unsupported
IBM Spectrum Scale : Unsupported
NFS : Unsupported
BeeGFS : Unsupported
WekaFS : Unsupported
Userspace RDMA : Unsupported
--Mellanox PeerDirect : Disabled
--rdma library : Not Loaded (libcufile_rdma.so)
--rdma devices : Not configured
--rdma_device_status : Up: 0 Down: 0
=====================
CUFILE CONFIGURATION:
=====================
properties.use_compat_mode : true
properties.force_compat_mode : false
properties.gds_rdma_write_support : true
properties.use_poll_mode : false
properties.poll_mode_max_size_kb : 4
properties.max_batch_io_size : 128
properties.max_batch_io_timeout_msecs : 5
properties.max_direct_io_size_kb : 16384
properties.max_device_cache_size_kb : 131072
properties.max_device_pinned_mem_size_kb : 33554432
properties.posix_pool_slab_size_kb : 4 1024 16384
properties.posix_pool_slab_count : 128 64 32
properties.rdma_peer_affinity_policy : RoundRobin
properties.rdma_dynamic_routing : 0
fs.generic.posix_unaligned_writes : false
fs.lustre.posix_gds_min_kb: 0
fs.beegfs.posix_gds_min_kb: 0
fs.weka.rdma_write_support: false
fs.gpfs.gds_write_support: false
profile.nvtx : false
profile.cufile_stats : 0
miscellaneous.api_check_aggressive : false
execution.max_io_threads : 0
execution.max_io_queue_depth : 128
execution.parallel_io : false
execution.min_io_threshold_size_kb : 8192
execution.max_request_parallelism : 0
=========
GPU INFO:
=========
GPU index 0 NVIDIA GeForce RTX 4090 bar:1 bar size (MiB):32768, IOMMU State: Disabled
==============
PLATFORM INFO:
==============
IOMMU: disabled
Platform verification succeeded
How to reproduce:
sudo ./cufile_sample_001 /home/user/Documents/gdstest/test 0
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO 0:324 Lib being used for urcup concurrency : libcufile_ck
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO cufio:609 Loaded successfully libcufile_ck.so
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO cufio:609 Loaded successfully libmount.so
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO cufio:609 Loaded successfully libudev.so
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO cufio:613 Using CKIT static library
04-04-2023 14:31:56:898 [pid=6480 tid=6480] INFO 0:167 nvidia_fs driver open invoked
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:401 GDS release version: 1.6.0.25
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:404 nvidia_fs version: 2.15 libcufile version: 2.12
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:408 Platform: x86_64
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:290 NVMe: driver support OK
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:329 WekaFS: driver support OK
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:528 nvidia_fs driver version check ok
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:290 NVMe: driver support OK
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:329 WekaFS: driver support OK
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:189 ============
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:190 ENVIRONMENT:
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:191 ============
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:204 =====================
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:205 DRIVER CONFIGURATION:
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:206 =====================
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:208 NVMe : Supported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:209 NVMeOF : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:210 SCSI : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:211 ScaleFlux CSD : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:212 NVMesh : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:215 DDN EXAScaler : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:219 IBM Spectrum Scale : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:223 NFS : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-drv:226 BeeGFS : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-rdma:145 No valid ip addresses specified for RDMA devices. Disabling GDS userspace RDMA access
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1127 WekaFS : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1129 Userspace RDMA : Unsupported
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1137 --Mellanox PeerDirect : Disabled
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1145 --rdma library : Not Loaded (libcufile_rdma.so)
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1148 --rdma devices : Not configured
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-rdma:1151 --rdma_device_status : Up: 0 Down: 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio:885 =====================
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio:886 CUFILE CONFIGURATION:
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio:887 =====================
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1263 properties.use_compat_mode : true
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1265 properties.force_compat_mode : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1267 properties.gds_rdma_write_support : true
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1269 properties.use_poll_mode : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1271 properties.poll_mode_max_size_kb : 4
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1273 properties.max_batch_io_size : 128
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1275 properties.max_batch_io_timeout_msecs : 5
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1277 properties.max_direct_io_size_kb : 16384
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1279 properties.max_device_cache_size_kb : 131072
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1281 properties.max_device_pinned_mem_size_kb : 33554432
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1283 properties.posix_pool_slab_size_kb : 4 1024 16384
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1285 properties.posix_pool_slab_count : 128 64 32
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1287 properties.rdma_peer_affinity_policy : RoundRobin
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1289 properties.rdma_dynamic_routing : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1296 fs.generic.posix_unaligned_writes : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1299 fs.lustre.posix_gds_min_kb: 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1313 fs.beegfs.posix_gds_min_kb: 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1328 fs.weka.rdma_write_support: false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1354 fs.gpfs.gds_write_support: false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1367 profile.nvtx : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1369 profile.cufile_stats : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1371 miscellaneous.api_check_aggressive : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1379 execution.max_io_threads : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1380 execution.max_io_queue_depth : 128
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1381 execution.parallel_io : false
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1382 execution.min_io_threshold_size_kb : 8192
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO 0:1383 execution.max_request_parallelism : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:790 =========
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:791 GPU INFO:
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:792 =========
04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE cufio-plat:306 gpu attribute read , cuDeviceGetAttribute GPU_DIRECT_RDMA_SUPPORTED value: 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-plat:379 GPU BDF: 0000:01:00.0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-plat:349 Searching IOMMU entries in /sys/bus/pci/devices/0000:01:00.0/iommu
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-plat:388 cuda GPU device attributes: gpu :0 model :NVIDIA GeForce RTX 4090 nvdirect :0 numa:-1 pcibridge: bar :1 barBase :274877906956 barSize :34359738368 streamMemOps :0 dmaBufCapable:0 GDRBufCapable:0 bdf :0 : 1 : 0 : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:446 GPU index 0 NVIDIA GeForce RTX 4090 bar:1 bar size (MiB):32768, IOMMU State: Disabled
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:452 Total GPUS supported on this platform 1
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:803 ==============
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:804 PLATFORM INFO:
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:805 ==============
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:147 device pci path string : 0000:01:00.0->0000:00:01.0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-plat:674 GPU Dev: 0 numa_node: -1 PCI Group 0000:00:01.0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:570 ACS not enabled in GPU paths
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:723 cannot open scsi_mod path, skip scsi check
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:810 use_mq not detected in scsi configuration.cannot support SCSI disks!
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:695 IOMMU: disabled
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-plat:846 Platform verification succeeded
04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE cufio-drv:457 0000:01:00.0 0000:04:00.0 0x00820090 0x0082 0x04 0x04 0xffffffff 0 nvme
04-04-2023 14:31:56:899 [pid=6480 tid=6480] TRACE cufio-drv:457 0000:01:00.0 0000:06:00.0 0x0082009e 0x0082 0x01 0x02 0xffffffff 0 network
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: nvme path: /sys/devices/pci0000:00/0000:00:1b.4/0000:04:00.0/nvme/nvme0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: nvme0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x144d
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:04:00.0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:84 adding attributes for device nvme0 device link width: 4 device link speed: 4 ) numa node : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] INFO cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:75 device name attribute already set nvme0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:84 adding attributes for device nvme0 device link width: 4 device link speed: 4 ) numa node : 0
04-04-2023 14:31:56:899 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:75 device name attribute already set eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO cufio-udev:503 no devices found by udev enumeration for pci class: infiniband
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:14.3/net/wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: wlp0s20f3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/pci0000:00/0000:00:1c.3/0000:06:00.0/net/eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:545 vendor id attribute for device found: 0x8086
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:571 sys attribute uevent for device found: 0000:06:00.0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: docker0 class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:523 scanning sys CLASS: net path: /sys/devices/virtual/net/lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:538 sys attribute sysname for device found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:543 vendor id attribute for device not found: lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-udev:566 sys attribute for device not found: lo class: net sysattr: device/uevent
04-04-2023 14:31:56:900 [pid=6480 tid=6480] ERROR cufio-topo-nvfs:78 pci device not present in topology device attribute table: 0000:00:14.3
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:75 device name attribute already set eno2
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:84 adding attributes for device eno2 device link width: 1 device link speed: 2 ) numa node : 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : docker0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:51 bus-device-function not found in the device attribute : lo
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:273 printing cufile platform topology using nvfs probe:
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:283 gpu 0000:01:00.0 peers : 0000:04:00.0(8519824) 0000:06:00.0(8519838)
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:291 peer 0000:06:00.0 gpus : 0000:01:00.0(8519838)
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-topo-nvfs:291 peer 0000:04:00.0 gpus : 0000:01:00.0(8519824)
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-drv:559 checking GPU attributes 0000:00:01.0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio-drv:568 Assigning numanode : -1 to PCI Group
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio-drv:586 Added group : pgroup 0 0x56349f63e790 to map
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-drv:588 new group 0000:00:01.0 groupid 0 size 1
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-drv:640 ngpus 1 pos 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:290 Bounce buffers initializing... PCI-Groups 1
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:301 Buffer pool initializing for GPU 0 PCI-Group 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:223 Buffer pool initialized with 128 slots and priority: default
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:311 Buffer pool setup for GPU 0 with 128 slots Caching enabled: 1 priority: default PCI-Group: 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:341 Bounce buffers initialization complete
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-drv:677 PCI Groups initialized
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-px-pool:374 Initializing cufile POSIX pool
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:223 Buffer pool initialized with 128 slots and priority: default
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 4 slots: 128
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:223 Buffer pool initialized with 64 slots and priority: default
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 1024 slots: 64
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG 0:223 Buffer pool initialized with 32 slots and priority: default
04-04-2023 14:31:56:900 [pid=6480 tid=6480] DEBUG cufio-px-pool:124 POSIX buffer pool initialized for GPU 0 slab size (KiB): 16384 slots: 32
04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO cufio-px-pool:448 POSIX pool buffer initialization complete
04-04-2023 14:31:56:900 [pid=6480 tid=6480] INFO curdma-ldbal:510 No RDMA devices configured,skipping RDMA load balancer initialization
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio:471 Threadpool Initialize Obtained pgroup 0x56349f63e790 from map
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio:482 Threadpool Initialize Obtained pgroup : numa : numgpus 0x56349f63e790 0 1
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE 0:71 numa_num_configured_nodes obtained numNumaNodes : 1
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio:34 Discovered 1 numa nodes on this system
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE 0:77 setting numa_set_bind_policy preferred policy
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio:24 Threapool workqueue 0x56349f30f230 for numa node 0
04-04-2023 14:31:56:900 [pid=6480 tid=6480] TRACE cufio:64 create workqueue 0 0 0x56349f30f230
04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE cufio:38 Started tid: 140200718102528
04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE cufio:41 Creating a thread pool with 0 threads
04-04-2023 14:31:56:901 [pid=6480 tid=6484] TRACE cufio:78 Started Thread: 0x56349f63e370
04-04-2023 14:31:56:901 [pid=6480 tid=6480] INFO cufio:951 CUFile initialization complete
04-04-2023 14:31:56:901 [pid=6480 tid=6480] TRACE cufio:3260 cuFileDriverOpen success
04-04-2023 14:31:56:901 [pid=6480 tid=6480] DEBUG cufio:1460 cuFileHandleRegister invoked
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found model nvme0n1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:299 detected nvme model: Samsung SSD 990 PRO 2TB wwid: eui.0025384a21403079 xport: pcie /sys/devices/pci0000:00/0000:00:1b.4/0000:04:00.0/nvme/nvme0/nvme0n1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1p1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1p1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found model nvme0n1p1
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1p2
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1p2
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found model nvme0n1p2
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1p3
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1p3
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found model nvme0n1p3
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found model nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:147 device pci path string : 0000:04:00.0->0000:00:1b.4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found integrity/device_is_integrity_capable nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-fs:284 block device nvme0n1p4 drive integrity check capability not present. Ok
04-04-2023 14:31:56:902 [pid=6480 tid=6480] INFO cufio-fs:357 Block dev: /dev/nvme0n1p4 numa node: 0 pci bridge: 0000:00:1b.4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found device/transport nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found wwid nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-udev:94 sysfs attribute found queue/logical_block_size nvme0n1p4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] DEBUG cufio-fs:706 vol pciGroup : 0000:00:1b.4
04-04-2023 14:31:56:902 [pid=6480 tid=6480] TRACE cufio-fs:720 block device supported by cufile, bdev : /dev/nvme0n1p4 module nvme xport pcie
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio-fs:736 added volume attributes for device: dev_no: 259:4
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1145 cuFile DIO status for file descriptor 45 DirectIO supported
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio-fs:676 Found cached Volume Attributes for device: dev_no: 259:4 isDFS: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio-obj:201 setting default GDS write support for bdev /dev/nvme0n1p4 xport pcie module: nvme to true
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio-obj:350 Compatibility Mode: 1 Compat Read Mode: 0 Compat Write Mode: 0 Needs RDMA: 0 Needs Unaligned Access: 0 posix_io_threshold: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio-obj:356 Needs Kernel RDMA: 0 use_posix_for_unaligned_write: 0 gds batch enabled: 1 Posix retry on -ENOTSUPP: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1584 cuFileHandleRegister success
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1657 cuFileBufRegister invoked devPtr 0x7f82eac00000
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1232 devPtr: 0x7f82eac00000 chunk: 0 chunk base: 0x7f82eac00000 chunk offset: 0 chunk size: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1677 cuFile buffer checks passed devPtr 0x7f82eac00000 req size: 131072 mapped size: 131072 registered size: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:278 io priority: default stream level: -2
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio-obj:107 mapping nvinfo: 0x56349f876080 size: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:335 map buf 0x7f82eac00000 Size 131072 sbuf_size 16777216 pin_gpu_memory 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:336 map buf 0x7f82eac00000 bounce-buffer 0 groupId 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1032 Inc-bar-usage domain: 0 GPU: 0 size: 131072 cache: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:849 PCI Group found for domain: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:1057 Total usage 0 Max Usage 33554432
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:441 mmap shadow buffers size: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:487 MAP gpu index : 0 bdf: 0 1 0 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:849 PCI Group found for domain: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1114 Dec-bar-usage domain 0 GPUID 0 size 131072 cache 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR cufio-obj:112 error allocating nvfs handle, size: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR cufio:1708 cuFileBufRegister error, object allocation failed
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio-obj:43 deleted chunk devPtr: 0x7f82eac00000 len: 131072
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR cufio:1786 cuFileBufRegister error cufile success
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:3202 cuFileWrite invoked
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:2801 cuFileReadWriteCheckandSubmit invoked
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:180 Hash Lookup nvinfo 0 key 0x7f82eac00000
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:335 map buf 0x7f82eac00000 Size 131072 sbuf_size 16777216 pin_gpu_memory 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:336 map buf 0x7f82eac00000 bounce-buffer 0 groupId -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1988 GetPCIGroupIDAndDomain cuFile using GPU PCIGroup 0 for bounce buffer, fdpciGroupID: -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:2543 write inode: 67403417 offset: 0 length: 131072 devPtr: 0x7f82eac00000 bufOff: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:2546 write dev: 4
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:1704 nvfs_io_submit file_offset 0 size 131072 gpu_offset 0 nvbuf 0x7ffcf53ce900 is_unaligned 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1207 io submit bb io_type: 1 file_offset: 0 size: 131072 gpu_index 0 gpu_buffer_offset 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1221 get bb: 1 cross_domain 0 unaligned 0 GPU 0 PCI-Group 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:849 PCI Group found for domain: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:712 Get buffer from PCI-Group 0 GPU 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:590 Found slot 127 Avaliable slots 127
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:747 Allocating and pinning buffer from PCI-Group 0 GPU 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:441 current cuda context present
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:57 push primary context: 0x56349ee494a0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:452 Allocate buffer of size 1048576 on GPU 0 PCI-Group 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:381 Bounce buffer page aligned
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:461 Buffer from aligned alloc, dptr 140200261058560 aligned_dptr 140200261058560 size 1048576
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:278 io priority: default stream level: -2
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:335 map buf 0x7f82eac20000 Size 1048576 sbuf_size 1048576 pin_gpu_memory 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:336 map buf 0x7f82eac20000 bounce-buffer 1 groupId 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:407 cuda stream 0x56349f62eeb0 created with priority: -2
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1032 Inc-bar-usage domain: 0 GPU: 0 size: 1048576 cache: 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:849 PCI Group found for domain: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:1057 Total usage 0 Max Usage 33554432
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:441 mmap shadow buffers size: 1048576
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:487 MAP gpu index : 0 bdf: 0 1 0 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:849 PCI Group found for domain: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE 0:1114 Dec-bar-usage domain 0 GPUID 0 size 1048576 cache 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:473 map failed for GPU: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:80 pop context: 0x56349ee494a0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:84 push context: 0x7ffcf53ce3c8
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:809 Buffer map failed for PCI-Group: 0 GPU: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:921 Failed to obtain bounce buffer from domain: 0 GPU: 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:1234 failed to get bounce buffer for PCI group 0 GPU 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:743 release unregistered nvHandle: 0x7ffcf53ce900
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:213 cufio-internal - free buf 0x7ffcf53ce900
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:3111 cuFileReadWriteCheckandSubmit done
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:3227 cuFileWrite done
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1867 cuFileBufDeregister invoked
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1882 Deregistering devptr: 0x7f82eac00000
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:180 Hash Lookup nvinfo 0 key 0x7f82eac00000
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1886 nvinfo obtained from hash table during cuFileBufDeregister 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:1904 Calling cuFileBufDeregister from here as nvinfo already removed from ht 0
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR cufio:1810 cuFileBufDeregister error, object for device pointer is not registered
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR cufio:1932 cuFileBufDeregister error: device pointer lookup failure
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1609 cuFileHandleDeregister invoked
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1636 cuFileHandleDeregister done
04-04-2023 14:31:56:903 [pid=6480 tid=6480] TRACE cufio:3294 cuFileDriver closing
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1023 cuFile clearing active batch operations
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG cufio:1025 Destroying Batch Pool
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:904 [pid=6480 tid=6480] DEBUG 0:378 Batch Ctx state 1
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1028 cuFile clearing buffer hashtable
04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE 0:200 Bounce buffer io is not in-progress
04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE cufio-px-pool:99 Posix Bounce buffer io is not in-progress
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1057 cuFile destroying posix buffer pool
04-04-2023 14:31:56:905 [pid=6480 tid=6480] TRACE cufio-px-pool:460 Releasing POSIX pool buffers
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:468 Releasing POSIX pool size: 4096 for GPU: 0
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:141 Tearing down pci-info with 1 GPUs
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 128
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 128
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:468 Releasing POSIX pool size: 1048576 for GPU: 0
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:141 Tearing down pci-info with 1 GPUs
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 64
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 64
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:468 Releasing POSIX pool size: 16777216 for GPU: 0
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:141 Tearing down pci-info with 1 GPUs
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:46 Tearing down POSIX pool slab for gpu 0 num objects: 32
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio-px-pool:65 Freed POSIX pool slab for gpu 0 num objects: 32
04-04-2023 14:31:56:905 [pid=6480 tid=6480] INFO cufio-px-pool:479 POSIX pool buffer release complete
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1062 cuFile clearing file hashtable
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1065 cuFile clearing volumeAttributes
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1068 cuFile cleanring pciGroupMap
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1072 cuFile cleanring pci group number map
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1076 cuFile clearing Dynamic Routing info
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1079 cuFile clearing pci topology
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1082 cuFile clearing all gpu entries
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG cufio:1085 cuFile closing Driver
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:175 Tearing down bounce buffers
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:141 Tearing down pci-info with 1 GPUs
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:103 Tearing down buffers from GPU 0
04-04-2023 14:31:56:905 [pid=6480 tid=6480] DEBUG 0:110 free buffers 128
04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO 0:140 nvidia_fs driver closed
04-04-2023 14:31:57:911 [pid=6480 tid=6480] DEBUG cufio:1094 cuFile clearing all hashtables
04-04-2023 14:31:57:911 [pid=6480 tid=6480] DEBUG cufio:1097 cuFile shutting threadpool
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:110 killing pollworker 0x56349f63e370
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:45 marking thread exit 140200718102528 thread: 0x56349f63e370
04-04-2023 14:31:57:911 [pid=6480 tid=6484] TRACE cufio:80 Thread func exited 0x56349f63e370 tid: 140200718102528
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:48 Thread exited
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:117 killing cuFileWaitQueue 0x56349f626360
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:71 delete workqueue 0x56349f30f230
04-04-2023 14:31:57:911 [pid=6480 tid=6480] TRACE cufio:33 Killing workQueue 0x56349f30f230
04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO cufio:1106 cuFile shutdown
04-04-2023 14:31:57:911 [pid=6480 tid=6480] INFO cufio:1108 Logger Shutdown
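A trace this long is easier to read once the repeated DEBUG/TRACE lines are filtered out. Below is a minimal sketch, assuming the line layout seen in the log above (timestamp, `[pid=… tid=…]`, level, `component:line`, message). The names `LINE_RE`, `summarize` and `sample` are made up for illustration and are not part of cuFile or any NVIDIA tool.

```python
import re
from collections import Counter

# Layout inferred from the log above (an assumption, not a documented format):
# "DD-MM-YYYY HH:MM:SS:mmm [pid=N tid=N] LEVEL component:line message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"\[pid=(?P<pid>\d+) tid=(?P<tid>\d+)\] "
    r"(?P<level>[A-Z]+) (?P<site>\S+) (?P<msg>.*)$"
)


def summarize(lines, level="ERROR"):
    """Count distinct (site, message) pairs at the given level, in first-seen order."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == level:
            counts[(m.group("site"), m.group("msg"))] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "04-04-2023 14:31:56:903 [pid=6480 tid=6480] DEBUG 0:487 MAP gpu index : 0 bdf: 0 1 0 0",
    "04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1",
    "04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed",
    "04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1",
    "04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed",
]

for (site, msg), n in summarize(sample).items():
    print(f"{n}x {site} {msg}")
```

To run it on a real log file, pass the open file object instead of `sample`, since `summarize` accepts any iterable of lines.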
Vaz.Valois,
GPUDirect Storage in p2p mode is currently supported only on Quadro and Tesla GPUs. gdscheck reports this with the "supported" string in its output.
Also, the block filesystems that GPUDirect Storage supports are ext4 (in ordered mode) and XFS.
The linux distros tested are Ubuntu 18.02- 22.04 , Rhel 8.4 and above.
GPUDirect storage is supported on linux kernel from 4.15.x to 5.15.x
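As a quick sanity check against the kernel range quoted above, a small script can compare the running kernel's release string with 4.15 through 5.15. This is only a sketch: the helper names are made up for illustration, and the bounds are taken from this post rather than from any NVIDIA tool.

```python
import platform
import re

# Bounds quoted in the post above: kernels 4.15.x through 5.15.x.
MIN_KERNEL = (4, 15)
MAX_KERNEL = (5, 15)


def kernel_in_gds_range(release: str) -> bool:
    """Return True if a release string such as '5.15.0-67-generic' is in range."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return MIN_KERNEL <= version <= MAX_KERNEL


def running_kernel_in_gds_range() -> bool:
    """Apply the same check to the kernel this script is running on."""
    return kernel_in_gds_range(platform.release())
```

Calling running_kernel_in_gds_range() on the affected machine shows whether the kernel itself is a likely suspect before looking at the GPU or filesystem.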
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:501 nvidia-fs MAP ioctl failed : ioctl_return: -22 ioctl_ret: -1
04-04-2023 14:31:56:903 [pid=6480 tid=6480] ERROR 0:515 map failed
This error comes from cuFileBufRegister, which is failing to pin the BAR1 space.
ExtremeViscent,
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:1529 IOCTL failed io-type 1 ret -5 expected 1048576 gpu_page_offset 0
29-03-2023 20:07:43:630 [pid=37710 tid=37710] ERROR 0:904 write failed at file_offset 0 cur_size 1048576 retval -5
The IO call is getting -EIO.
This could come from the filesystem or from an error in the kernel.
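The return values in the logs quoted in this thread (-22 from the MAP ioctl and -5 from the write) are negated Linux errno values: 22 is EINVAL and 5 is EIO. A minimal sketch for decoding such values with the Python standard library follows; the function name is hypothetical.

```python
import errno
import os


def describe_kernel_return(ret: int) -> str:
    """Translate a negative kernel return value (e.g. -22) into errno name and message."""
    if ret >= 0:
        return f"{ret}: not an error code"
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{ret}: {name} ({os.strerror(code)})"


# The two values that appear in the logs in this thread.
for value in (-22, -5):
    print(describe_kernel_return(value))
```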
Please check that the file path you are registering is indeed supported by GDS.
You should get CU_FILE_SUCCESS from cuFileHandleRegister.
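One way to check the file path is to look up its mount entry and compare it with the filesystems named earlier in the thread (ext4 in ordered mode, or XFS). The sketch below works on the text of /proc/mounts. The function names are hypothetical, it treats an ext4 mount with no explicit data= option as ordered (the ext4 default), and it ignores escaped characters in mount points.

```python
import posixpath


def find_mount(path: str, mounts_text: str):
    """Return (mount_point, fs_type, options) for the longest mount point containing path."""
    path = posixpath.normpath(path)  # expects an absolute POSIX path
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type, options)
    return best


def block_fs_matches_thread(fs_type: str, options) -> bool:
    """True for XFS, or for ext4 whose journaling mode is ordered (explicitly or by default)."""
    if fs_type == "xfs":
        return True
    if fs_type == "ext4":
        modes = [opt.split("=", 1)[1] for opt in options if opt.startswith("data=")]
        return not modes or modes[-1] == "ordered"
    return False
```

On a live system the second argument would come from open("/proc/mounts").read(). A True result only means the mount matches the list in this thread; it does not guarantee that cuFileHandleRegister will succeed.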
@kmodukuri,
Thank you for your reply. Is there a way to use it without p2p mode? I have a different computer with different GPUs where GDS is working fine, but I have been having trouble installing it on this other machine.
The machine that works shows this output in gdscheck:
============
ENVIRONMENT:
============
=====================
DRIVER CONFIGURATION:
=====================
NVMe : Unsupported
NVMeOF : Unsupported
SCSI : Unsupported
ScaleFlux CSD : Unsupported
NVMesh : Unsupported
DDN EXAScaler : Unsupported
IBM Spectrum Scale : Unsupported
NFS : Unsupported
WekaFS : Unsupported
Userspace RDMA : Unsupported
--Mellanox PeerDirect : Disabled
--rdma library : Not Loaded (libcufile_rdma.so)
--rdma devices : Not configured
--rdma_device_status : Up: 0 Down: 0
=====================
CUFILE CONFIGURATION:
=====================
properties.use_compat_mode : true
properties.gds_rdma_write_support : true
properties.use_poll_mode : false
properties.poll_mode_max_size_kb : 4
properties.max_batch_io_timeout_msecs : 5
properties.max_direct_io_size_kb : 16384
properties.max_device_cache_size_kb : 131072
properties.max_device_pinned_mem_size_kb : 33554432
properties.posix_pool_slab_size_kb : 4 1024 16384
properties.posix_pool_slab_count : 128 64 32
properties.rdma_peer_affinity_policy : RoundRobin
properties.rdma_dynamic_routing : 0
fs.generic.posix_unaligned_writes : false
fs.lustre.posix_gds_min_kb: 0
fs.weka.rdma_write_support: false
profile.nvtx : false
profile.cufile_stats : 0
miscellaneous.api_check_aggressive : false
=========
GPU INFO:
=========
GPU index 0 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
GPU index 1 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
GPU index 2 NVIDIA RTX A5000 bar:1 bar size (MiB):256 supports GDS
==============
PLATFORM INFO:
==============
IOMMU: disabled
Platform verification succeeded
I see that it reports that the GPUs support GDS, but it also shows NVMe: Unsupported. Is this specific to the GPU model? Or can I configure my computer the same way?
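To compare two machines, it can help to reduce the gdscheck output to the driver rows and the GPU rows. The sketch below assumes the output has been saved to a string and relies only on the line layout visible in the paste above ("name : Supported/Unsupported" and "GPU index ... supports GDS"). The function names are hypothetical, and other gdscheck versions may format these lines differently.

```python
def parse_driver_support(gdscheck_text: str) -> dict:
    """Map each driver row (e.g. 'NVMe') to True if it is reported as Supported."""
    support = {}
    for line in gdscheck_text.splitlines():
        name, sep, status = line.partition(" : ")
        status = status.strip()
        if sep and status in ("Supported", "Unsupported"):
            support[name.strip()] = status == "Supported"
    return support


def gpus_reporting_gds(gdscheck_text: str) -> list:
    """Return the GPU rows whose text ends with 'supports GDS'."""
    return [
        line.strip()
        for line in gdscheck_text.splitlines()
        if line.startswith("GPU index") and line.strip().endswith("supports GDS")
    ]
```

For the output pasted above, every driver row maps to False while all three GPU rows are returned, which is the combination the question is about.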
Hi Kmodukuri,
I can confirm that the path is on ext4 mounted with data=ordered. The hardware model is shown as:
Samsung 980 PRO with Heatsink 2TB
Are there any further clues for solving the issue?
To force compat mode with cuFile, you can try CUFILE_FORCE_COMPAT_MODE=True or, if you are the only user and have admin privileges, remove the nvidia-fs.ko driver.
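The environment variable has to be set in the environment of the process that loads cuFile. Below is a minimal sketch of launching a program that way. The wrapper name is hypothetical, and the value defaults to the lowercase "true" as an assumption, whereas the post writes "True"; whether case matters is not confirmed here.

```python
import os
import subprocess
import sys


def run_with_forced_compat(command, value="true"):
    """Run command with CUFILE_FORCE_COMPAT_MODE set in a copy of the current environment."""
    env = dict(os.environ)
    env["CUFILE_FORCE_COMPAT_MODE"] = value
    return subprocess.run(command, env=env, check=True, capture_output=True, text=True)


# Demonstration with a child Python process that just echoes the variable.
result = run_with_forced_compat(
    [sys.executable, "-c", "import os; print(os.environ['CUFILE_FORCE_COMPAT_MODE'])"]
)
print(result.stdout.strip())
```

In practice the command would be the cuFile application itself. Setting the variable in the shell before starting the program has the same effect.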
Hi Kmodukuri,
Compat mode is not described in detail in the docs, so I would like to ask for a few clarifications:
Are there any trade-offs in using compat mode?
Are CPU bounce buffers used in compat mode?
What causes the incompatibility with my NVMe drive?
Hello, I have encountered the same problem, and I found a description of compatibility mode in the NVIDIA GPUDirect Storage Benchmarking and Configuration Guide. The documentation says:
Poll Mode: The cuFile API set includes an interface to put the driver in polling mode. Refer to cuFileDriverSetPollMode() in the cuFile API Reference Guide for more information. When the poll mode is set, a read or write issued that is less than or equal to properties:poll_mode_max_size_kb (4KB by default) will result in the library polling for IO completion, rather than blocking (sleep). For small IO size workloads, enabling poll mode may reduce latency.
Compatibility Mode: There are several possible scenarios where GDS might not be available or supported, for example, when the GDS software is not installed, the target file system is not GDS supported, O_DIRECT cannot be enabled on the target file, and so on. When you enable compatibility mode, and GDS is not functional for the IO target, the code that uses the cuFile APIs falls back to the standard POSIX read/write path. To learn more about compatibility mode, refer to cuFile Compatibility Mode.
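The two rules quoted from the guide can be restated as a toy model. This is not cuFile code: the function names are made up, 1 KB is assumed to mean 1024 bytes, and raising an error when GDS is not functional and compatibility mode is off is an assumption based on the failures reported earlier in this thread.

```python
def uses_polling(io_size_bytes: int, poll_mode_enabled: bool,
                 poll_mode_max_size_kb: int = 4) -> bool:
    """Poll for completion only when poll mode is set and the IO is small enough."""
    return poll_mode_enabled and io_size_bytes <= poll_mode_max_size_kb * 1024


def select_io_path(gds_functional: bool, compat_mode_enabled: bool) -> str:
    """Pick the path an IO would take under the compatibility-mode rule quoted above."""
    if gds_functional:
        return "gds"
    if compat_mode_enabled:
        return "posix"  # fall back to standard POSIX read/write
    # Assumption: without compat mode there is no fallback, so the IO fails.
    raise RuntimeError("GDS not functional and compatibility mode disabled")
```

Under this model, the machine in this thread would take the "posix" branch once compat mode is forced, which means the IO no longer uses the GDS path.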
So if we want the full performance of GDS at present, do we need to switch to Quadro or Tesla GPUs?