Nvprof warning: records have invalid time stamps due to

nvprof gives the following warnings:

==179614== Warning: 541 records have invalid timestamps due to insufficient device buffer space. You can configure the buffer space using the option --device-buffer-size.
==179614== Warning: 541 records have invalid timestamps due to insufficient semaphore pool size. You can configure the pool size using the option --profiling-semaphore-pool-size.

I tried increasing both of those sizes when launching nvprof, to no avail. What is doubly confusing is that some kernels (not shown here) do get profiled successfully. The application depends on two sets of kernels: one set is compiled when the application is built, and the other is contained in dynamically loaded libraries that are compiled elsewhere. The only kernels that get profiled are the ones loaded from the libraries; the kernels implemented and built by the application itself are not profiled.

The application contains plenty of cudaCheckErrors() calls throughout, which I would assume catch any errors along the way.
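For readers unfamiliar with this pattern, here is a minimal sketch of what such a check commonly looks like. It assumes cudaCheckErrors() is a macro wrapping cudaGetLastError(); the macro body, dummyKernel and runDummyKernelChecked below are hypothetical illustrations, not the application's actual code. Note that a check placed right after a launch only reports launch-time errors, so errors raised while the kernel runs surface only after a synchronization.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical definition; the application's real cudaCheckErrors() may differ.
#define cudaCheckErrors(msg)                                        \
    do {                                                            \
        cudaError_t err_ = cudaGetLastError();                      \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s: %s (%s:%d)\n", (msg),              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);  \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Hypothetical kernel used only to exercise the checks.
__global__ void dummyKernel(int *out) { out[threadIdx.x] = threadIdx.x; }

// Returns true if every step passed its check (the macro exits otherwise).
bool runDummyKernelChecked() {
    int *d_out = nullptr;
    cudaMalloc(&d_out, 32 * sizeof(int));
    cudaCheckErrors("cudaMalloc failed");

    dummyKernel<<<1, 32>>>(d_out);
    cudaCheckErrors("kernel launch failed");     // launch-time errors only

    cudaDeviceSynchronize();
    cudaCheckErrors("kernel execution failed");  // errors raised while the kernel ran

    cudaFree(d_out);
    cudaCheckErrors("cudaFree failed");
    return true;
}
```

If the real macro is only called directly after launches, without a following synchronization, an asynchronous failure in one of the application's own kernels could go unnoticed until a later call.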

Compiled with nvcc flags:

-O3 -std=c++14 -Xcompiler -Wall -D_BSD_SOURCE -g -rdc=true --generate-code arch=compute_50,code=sm_50 --generate-code arch=compute_60,code=sm_60 --generate-code arch=compute_61,code=sm_61 --generate-code arch=compute_70,code=sm_70 --generate-code arch=compute_75,code=sm_75



Can you please provide a few details about the system:

  1. CUDA toolkit version (can be obtained with the command nvcc -V)
  2. OS
  3. GPU

Since you have already tried increasing the size of the profiling buffers, I am wondering whether the application runs cleanly under cuda-memcheck.
Would it also be possible for you to try the latest CUDA toolkit release, i.e. CUDA 11.4?