I am transferring data (images) via RDMA into pinned CPU memory and then copying it to the GPU with cudaMemcpyAsync.
I can successfully transfer 1000 images or more, but issues occur occasionally and at random.
nvvp shows cudaMemcpyAsync calls that block unexpectedly, and even overlapping Memcpy (HtoD) operations. Is this an artifact of the profiler (the log file is huge), or am I missing something?
Is there any tool to inspect what is happening in the driver/GPU?
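To cross-check the profiler timeline, each copy could be timed with CUDA events, independently of nvvp. Below is a minimal sketch of that idea, not my actual code: the image size, image count, single stream, and the memset standing in for the RDMA write are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Print the CUDA error and bail out of main() on failure.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    const size_t kImageBytes = 4u << 20;  // hypothetical image size (4 MiB)
    const int kNumImages = 16;            // hypothetical image count

    unsigned char *h_img = nullptr;
    unsigned char *d_img = nullptr;
    CHECK(cudaMallocHost((void **)&h_img, kImageBytes));  // pinned host buffer
    CHECK(cudaMalloc((void **)&d_img, kImageBytes));

    cudaStream_t stream;
    cudaEvent_t start, stop;
    CHECK(cudaStreamCreate(&stream));
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    for (int i = 0; i < kNumImages; ++i) {
        // Stand-in for the RDMA write into the pinned buffer (assumption).
        memset(h_img, i, kImageBytes);

        // Bracket the asynchronous copy with events on the same stream.
        CHECK(cudaEventRecord(start, stream));
        CHECK(cudaMemcpyAsync(d_img, h_img, kImageBytes,
                              cudaMemcpyHostToDevice, stream));
        CHECK(cudaEventRecord(stop, stream));

        // Wait for the copy to finish before the buffer is overwritten again.
        CHECK(cudaEventSynchronize(stop));

        float ms = 0.0f;
        CHECK(cudaEventElapsedTime(&ms, start, stop));
        printf("image %d: HtoD copy took %.3f ms\n", i, ms);
    }

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_img));
    CHECK(cudaFreeHost(h_img));
    return 0;
}
```

If the event timings stay regular while the nvvp timeline still shows overlapping copies on a single stream, that would point toward a display artifact rather than a real problem; if the timings show outliers too, the stall is probably real.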
See the nvvp screenshots (PNG) below.