I am trying to do a quick profile of my code by setting PGI_ACC_TIME=1.
This is part of a large older script so I would prefer not to move to nsight-systems just yet.
Setting it does seem to enable profiling, but I get this error printed out:
“libcupti.so not found”
and then all the timing totals for routines are 0 (although the times per kernel are there).
When I looked for libcupti.so, I noticed that it is not in CUDA 11.0 but is there in CUDA 10.1 and 10.2. I added the lib64 folders from those CUDA directories to my LD_LIBRARY_PATH, but I still got the error.
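One way to double-check that each LD_LIBRARY_PATH entry actually exists and contains the library is a short script like the one below. It is only a sketch in Python: the function name check_library_path and the status strings are made up for illustration, and it assumes the library is located through LD_LIBRARY_PATH and that its file name starts with libcupti.so.

```python
import os


def check_library_path(search_path, prefix="libcupti.so"):
    """Return a (directory, status) pair for each entry of a path list.

    Hypothetical helper for illustration; it only inspects directories and
    does not load anything.
    """
    report = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # empty entries are ignored in this sketch
        if not os.path.isdir(entry):
            # A mistyped path (for example a missing slash) ends up here.
            report.append((entry, "missing directory"))
        elif any(name.startswith(prefix) for name in os.listdir(entry)):
            report.append((entry, "found"))
        else:
            report.append((entry, "not here"))
    return report


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, status in check_library_path(current):
        print(f"{status:>18}  {directory}")
```

Any entry reported as "missing directory" points to a path that does not exist as written, which is worth checking for typos before suspecting the CUDA installation.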
UPDATE:
It turns out I was missing a “slash” in my paths to CUDA 10.1 and 10.2.
It now seems to be working.
However, I am still wondering whether this will be supported in the future, since libcupti does not seem to be in CUDA 11.0 (at least not in the one that came with the HPC SDK). Will PGI_ACC_TIME=1 continue to work, or will it be deprecated?