Understanding Read and Write Op Counts in Async GDS Operations

Dear GDS Team,

I’ve been exploring the Async GDS API and noticed a discrepancy in the read and write I/O request counts reported in the cufile.log file.

Specifically, after running the first Async API tutorial sample (cufile_sample_031.cc under gds/samples/ in the NVIDIA/MagnumIO repository on GitHub), I observed the following statistics:

GLOBAL STATS:
Read: ok = 2 err = 0 
Write: ok = 2 err = 0 
HandleRegister: ok = 2 err = 0 
HandleDeregister: ok = 2 err = 0 
BufRegister: ok = 1 err = 0 
BufDeregister: ok = 1 err = 0 
BatchSubmit: ok = 0 err = 0 
BatchComplete: ok = 0 err = 0 
BatchSetup: ok = 0 err = 0 
BatchCancel: ok = 0 err = 0 
BatchDestroy: ok = 0 err = 0 
BatchEnqueued: ok = 0 err = 0 
PosixBatchEnqueued: ok = 0 err = 0 
BatchProcessed: ok = 0 err = 0 
PosixBatchProcessed: ok = 0 err = 0

I’m puzzled by the first two lines: 2 reads and 2 writes, even though the sample submits only one cuFileReadAsync and one cuFileWriteAsync operation. Could you please clarify why GDS appears to double the read and write counts in this scenario?

I hope this doubled count in cufile.log is reproducible on your side. I ran the code with only one GPU and one SSD. If it is not reproducible, I will post more details about my configuration.
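In case it helps with reproduction: per the CUFILE CONFIGURATION section of the log below, stats collection is enabled at level 3 on my machine. This corresponds to the following fragment of my /etc/cufile.json (the default config path; shown only as a sketch of the relevant setting, the rest of the file is at defaults):

```json
{
    "profile": {
        // 0 disables cuFile stats; level 3 adds the per-GPU and histogram output shown below
        "cufile_stats": 3
    }
}
```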

Thank you for your assistance.


Versions:

$ gdscheck -v
 GDS release version: 1.7.2.10
 nvidia_fs version: 2.17, libcufile version: 2.12
 Platform: x86_64

From nvidia-smi:
 NVIDIA-SMI 535.129.03
 Driver Version: 535.129.03
 CUDA Version: 12.2

The full cufile.log:

 27-02-2024 23:14:47:639 [pid=1516982 tid=1516982] INFO   0:324 Lib being used for urcup concurrency : libcufile_ck 
 27-02-2024 23:14:47:639 [pid=1516982 tid=1516982] INFO   cufio_core:556 Loaded successfully  libcufile_ck.so
 27-02-2024 23:14:47:640 [pid=1516982 tid=1516982] INFO   cufio_core:556 Loaded successfully  libmount.so
 27-02-2024 23:14:47:640 [pid=1516982 tid=1516982] INFO   cufio_core:556 Loaded successfully  libudev.so
 27-02-2024 23:14:47:640 [pid=1516982 tid=1516982] INFO   cufio_core:560 Using CKIT static library
 27-02-2024 23:14:47:640 [pid=1516982 tid=1516982] INFO   0:163 nvidia_fs driver open invoked
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:401 GDS release version: 1.7.2.10
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:404 nvidia_fs version:  2.17 libcufile version: 2.12
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:408 Platform: x86_64
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:290 NVMe: driver support OK
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:329 WekaFS: driver support OK
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:528 nvidia_fs driver version check ok
 27-02-2024 23:14:47:642 [pid=1516982 tid=1516982] INFO   cufio-drv:290 NVMe: driver support OK
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:329 WekaFS: driver support OK
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:189 ============
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:190 ENVIRONMENT:
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:191 ============
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:204 =====================
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:205 DRIVER CONFIGURATION:
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:206 =====================
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:208 NVMe               : Supported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:209 NVMeOF             : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:210 SCSI               : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:211 ScaleFlux CSD      : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:212 NVMesh             : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:215 DDN EXAScaler      : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:219 IBM Spectrum Scale : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:223 NFS                : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-drv:226 BeeGFS             : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1126 WekaFS             : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1128 Userspace RDMA     : Unsupported
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1136 --Mellanox PeerDirect : Enabled
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1144 --rdma library        : Not Loaded (libcufile_rdma.so)
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1147 --rdma devices        : Not configured
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-rdma:1150 --rdma_device_status  : Up: 0 Down: 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio_core:938 =====================
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio_core:939 CUFILE CONFIGURATION:
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio_core:940 =====================
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1321 properties.use_compat_mode : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1323 properties.force_compat_mode : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1325 properties.gds_rdma_write_support : true
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1327 properties.use_poll_mode : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1329 properties.poll_mode_max_size_kb : 4
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1331 properties.max_batch_io_size : 128
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1333 properties.max_batch_io_timeout_msecs : 5
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1335 properties.max_direct_io_size_kb : 16384
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1337 properties.max_device_cache_size_kb : 1048576
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1339 properties.max_device_pinned_mem_size_kb : 33554432
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1341 properties.posix_pool_slab_size_kb : 4 1024 16384 
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1343 properties.posix_pool_slab_count : 128 64 32 
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1345 properties.rdma_peer_affinity_policy : RoundRobin
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1347 properties.rdma_dynamic_routing : 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1354 fs.generic.posix_unaligned_writes : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1357 fs.lustre.posix_gds_min_kb: 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1371 fs.beegfs.posix_gds_min_kb: 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1386 fs.weka.rdma_write_support: false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1412 fs.gpfs.gds_write_support: false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1425 profile.nvtx : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1427 profile.cufile_stats : 3
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1429 miscellaneous.api_check_aggressive : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1439 execution.max_io_threads : 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1440 execution.max_io_queue_depth : 128
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1441 execution.parallel_io : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1442 execution.min_io_threshold_size_kb : 1024
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1443 execution.max_request_parallelism : 0
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1444 properties.force_odirect_mode : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   0:1446 properties.prefer_iouring : false
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:801 =========
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:802 GPU INFO:
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:803 =========
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:436 GPU index 0 Tesla V100-SXM2-16GB bar:1 bar size (MiB):16384 supports GDS, IOMMU State: Pass-through or Enabled
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:436 GPU index 1 Tesla V100-SXM2-16GB bar:1 bar size (MiB):16384 supports GDS, IOMMU State: Pass-through or Enabled
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:450 Total GPUS supported on this platform 2
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:814 ==============
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:815 PLATFORM INFO:
 27-02-2024 23:14:47:643 [pid=1516982 tid=1516982] INFO   cufio-plat:816 ==============
 27-02-2024 23:14:47:644 [pid=1516982 tid=1516982] WARN   cufio-plat:564 Found ACS enabled for switch 0000:5d:00.0
 27-02-2024 23:14:47:644 [pid=1516982 tid=1516982] WARN   cufio-plat:564 Found ACS enabled for switch 0000:85:00.0
 27-02-2024 23:14:47:644 [pid=1516982 tid=1516982] INFO   cufio-plat:734 cannot open scsi_mod path, skip scsi check
 27-02-2024 23:14:47:644 [pid=1516982 tid=1516982] INFO   cufio-plat:821 use_mq not detected in scsi configuration.cannot support SCSI disks!
 27-02-2024 23:14:47:644 [pid=1516982 tid=1516982] INFO   cufio-plat:705 IOMMU: Pass-through or enabled
 27-02-2024 23:14:47:649 [pid=1516982 tid=1516982] INFO   cufio-plat:723 WARN: GDS is not guaranteed to work functionally or in a performant way with iommu=on/pt
 27-02-2024 23:14:47:649 [pid=1516982 tid=1516982] INFO   cufio-plat:857 Platform verification succeeded
 27-02-2024 23:14:47:657 [pid=1516982 tid=1516982] INFO   cufio-px-pool:453 POSIX pool buffer initialization complete
 27-02-2024 23:14:47:657 [pid=1516982 tid=1516982] INFO   curdma-ldbal:510 No RDMA devices configured,skipping RDMA load balancer initialization
 27-02-2024 23:14:47:659 [pid=1516982 tid=1516982] INFO   cufio_core:1004 CUFile initialization complete
 27-02-2024 23:14:47:670 [pid=1516982 tid=1516982] INFO   cufio-fs:357 Block dev: /dev/nvme1n1 numa node: 0 pci bridge: 0000:3a:02.0
 27-02-2024 23:14:47:671 [pid=1516982 tid=1516982] INFO   cufio-fs:357 Block dev: /dev/nvme2n1 numa node: 0 pci bridge: 0000:3a:00.0
 27-02-2024 23:14:47:707 [pid=1516982 tid=1516982] INFO   cufio_core:118 cuFile STATS VERSION : 8
GLOBAL STATS:
Read: ok = 2 err = 0
Write: ok = 2 err = 0
HandleRegister: ok = 2 err = 0
HandleDeregister: ok = 2 err = 0
BufRegister: ok = 1 err = 0
BufDeregister: ok = 1 err = 0
BatchSubmit: ok = 0 err = 0
BatchComplete: ok = 0 err = 0
BatchSetup: ok = 0 err = 0
BatchCancel: ok = 0 err = 0
BatchDestroy: ok = 0 err = 0
BatchEnqueued: ok = 0 err = 0
PosixBatchEnqueued: ok = 0 err = 0
BatchProcessed: ok = 0 err = 0
PosixBatchProcessed: ok = 0 err = 0
Total Read Size (MiB): 2
Read BandWidth (GiB/s): 0
Avg Read Latency (us): 0
Total Write Size (MiB): 2
Write BandWidth (GiB/s): 0
Avg Write Latency (us): 0
Total Batch Read Size (MiB): 0
Total Batch Write Size (MiB): 0
Batch Read BandWidth (GiB/s): 0
Batch Write BandWidth (GiB/s): 0
Avg Batch Submit Latency (us): 0
Avg Batch Completion Latency (us): 0
READ-WRITE SIZE HISTOGRAM : 
0-4(KiB): 0  0
4-8(KiB): 0  0
8-16(KiB): 0  0
16-32(KiB): 0  0
32-64(KiB): 0  0
64-128(KiB): 0  0
128-256(KiB): 0  0
256-512(KiB): 0  0
512-1024(KiB): 0  0
1024-2048(KiB): 2  2
2048-4096(KiB): 0  0
4096-8192(KiB): 0  0
8192-16384(KiB): 0  0
16384-32768(KiB): 0  0
32768-65536(KiB): 0  0
65536-...(KiB): 0  0
PER_GPU STATS:
GPU 0(UUID: fb621244b33a3625ba373712b627b55) Read: bw=0 util(%)=0 n=1 posix=0 unalign=0 dr=0 r_sparse=0 r_inline=0 err=0 MiB=1 Write: bw=0 util(%)=0 n=1 posix=0 unalign=0 dr=0 err=0 MiB=1 BufRegister: n=1 err=0 free=1 MiB=0
GPU 1(UUID: e019366ad84f43dada9287dd2d9f) Read: bw=0 util(%)=0 n=0 posix=0 unalign=0 dr=0 r_sparse=0 r_inline=0 err=0 MiB=0 Write: bw=0 util(%)=0 n=0 posix=0 unalign=0 dr=0 err=0 MiB=0 BufRegister: n=0 err=0 free=0 MiB=0
PER_GPU POOL BUFFER STATS:
GPU : 0 pool_size_MiB : 1 usage : 0/1 used_MiB : 0
PER_GPU POSIX POOL BUFFER STATS:

PER_GPU RDMA STATS:
GPU 0000:62:00.0(UUID: fb621244b33a3625ba373712b627b55) : 
GPU 0000:89:00.0(UUID: e019366ad84f43dada9287dd2d9f) : 

RDMA MRSTATS:
peer name   nr_mrs      mr_size(MiB)
PER GPU THREAD POOL STATS:
gpu node: 0 enqueues:0 completes:0 pending suspends:0 pending yields:0 active:0 suspends:0 
gpu node: 1 enqueues:0 completes:0 pending suspends:0 pending yields:0 active:0 suspends:0 

 27-02-2024 23:14:47:707 [pid=1516982 tid=1516982] INFO   cufio-px-pool:484 POSIX pool buffer release complete
 27-02-2024 23:14:48:720 [pid=1516982 tid=1516982] INFO   0:136 nvidia_fs driver closed