GDS error: cuDeviceGet failed with error 4

Hi,
I am currently seeing similar performance from cuFileRead in POSIX compatibility mode and in GDS mode. gdscheck reports that the NVMe driver is supported. However, when I run the sample programs, the only error I see in cufile.log is:

ERROR  0:106 cuDeviceGet failed with error 4

Does this affect performance in any way? Why is my CUDA kernel shutting down, as indicated by error 4 (CUDA_ERROR_DEINITIALIZED)?
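
For reference, this is how I am reading that log line. It is only a minimal Python sketch: the helper name decode_cufile_error and the regular expression are my own assumptions and not part of any NVIDIA tool, and the lookup table covers only the first few CUresult values from cuda.h.

```python
import re

# First few CUresult values as defined in cuda.h (partial table, not exhaustive).
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    4: "CUDA_ERROR_DEINITIALIZED",
}

# Assumed shape of the message, based on the single log line quoted above.
_PATTERN = re.compile(r"(\w+) failed with error (\d+)")


def decode_cufile_error(line):
    """Return (function, code, symbolic name), or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    func, code = match.group(1), int(match.group(2))
    return func, code, CU_RESULT_NAMES.get(code, "UNKNOWN")


print(decode_cufile_error("ERROR  0:106 cuDeviceGet failed with error 4"))
```

Codes outside the table come back as "UNKNOWN", so the sketch never guesses a name it does not know.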

GDS release version: 1.5.0.59
 nvidia_fs version:  2.14 libcufile version: 2.12
 Platform: x86_64
 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Supported
 NVMeOF             : Unsupported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 BeeGFS             : Unsupported
 WekaFS             : Unsupported
 Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Enabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_size : 128
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 16384
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 33554432
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.beegfs.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 fs.gpfs.gds_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 execution.max_io_threads : 0
 execution.max_io_queue_depth : 128
 execution.parallel_io : false
 execution.min_io_threshold_size_kb : 8192
 execution.max_request_parallelism : 0
 =========
 GPU INFO:
 =========
 GPU index 0 NVIDIA A100-SXM4-80GB bar:1 bar size (MiB):131072 supports GDS, IOMMU State: Disabled
 GPU index 1 NVIDIA A100-SXM4-80GB bar:1 bar size (MiB):131072 supports GDS, IOMMU State: Disabled
 GPU index 2 NVIDIA A100-SXM4-80GB bar:1 bar size (MiB):131072 supports GDS, IOMMU State: Disabled
 GPU index 3 NVIDIA A100-SXM4-80GB bar:1 bar size (MiB):131072 supports GDS, IOMMU State: Disabled
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Platform verification succeeded
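
In case it helps anyone compare settings, here is a minimal Python sketch that pulls the dotted "key : value" entries (properties.*, fs.*, profile.*, execution.*, and so on) out of the gdscheck output pasted above. The function name parse_cufile_config and the line pattern are my own assumptions about this text layout, not an official parser, and the sample input is just a few lines copied from the dump.

```python
import re

# Matches lines such as "properties.use_compat_mode : true" or
# "fs.lustre.posix_gds_min_kb: 0" (pattern is an assumption based on the dump).
_ENTRY = re.compile(r"^\s*([a-z0-9_]+(?:\.[a-z0-9_]+)+)\s*:\s*(.*?)\s*$")


def parse_cufile_config(text):
    """Collect dotted configuration keys and their raw string values."""
    config = {}
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if match:
            config[match.group(1)] = match.group(2)
    return config


sample = """
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 fs.lustre.posix_gds_min_kb: 0
 NVMe               : Supported
"""

config = parse_cufile_config(sample)
for key, value in config.items():
    print(key, "=", value)
```

Lines without a dotted key, such as the driver table and the section banners, are skipped on purpose.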