Profiling failure due to CUDNN_STATUS_INTERNAL_ERROR

I’m running the Nsight Compute profiler on a CNN with PyTorch, and it fails with CUDNN_STATUS_INTERNAL_ERROR. The end of the trace is:
‘lib/python2.7/site-packages/torch/nn/modules/", line 313, in forward’

I tried both TensorFlow and PyTorch on several machines. I’m using a GTX 1080 Ti, tried both CUDA 10.0 and 9.0, and meet all the minimum requirements.

How can I fix this?


Could you please describe in more detail the exact steps or commands you used to profile with Nsight Compute? Does your PyTorch code work fine when not profiling with Nsight Compute, or are you seeing issues there, too? Note also that Nsight Compute 1.0 does not support profiling child processes, so if your use of CUDA or a CUDA-accelerated library is not directly within the process launched by Nsight Compute, you will not be able to profile it.
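To check whether the failure also occurs outside the profiler, something like the following minimal sketch may help. It runs a single small convolution on the GPU and reports the outcome. The layer sizes, the input shape, and the function name check_cnn_without_profiler are hypothetical placeholders, not taken from your model, and it assumes a PyTorch version that accepts the device= argument.

```python
def check_cnn_without_profiler():
    """Run one small convolution on the GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"

    if not torch.cuda.is_available():
        return "cuda-not-available"

    try:
        # Hypothetical toy layer and input; substitute the real model here.
        conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).cuda()
        x = torch.randn(4, 3, 32, 32, device="cuda")
        with torch.no_grad():
            y = conv(x)
        # CUDA work is asynchronous, so synchronize to surface errors here.
        torch.cuda.synchronize()
    except RuntimeError as err:
        return "forward-failed: {}".format(err)
    return "ok: output shape {}".format(tuple(y.shape))


if __name__ == "__main__":
    print(check_cnn_without_profiler())
```

If this reports a forward-failed result when run as a plain Python script, the problem is probably in the CUDA/cuDNN setup rather than in the profiler.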

Please note that Nsight Compute 2019.1 has been released with CUDA Toolkit 10.1
and as a stand-alone download.
This version has an option to profile child processes. Let us know if this fixes your issue.
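As a rough sketch of how that could look when driven from Python: the snippet below builds a profiler command line and launches it. The executable name nv-nsight-cu-cli and the option --target-processes all are assumptions about the 2019.1 command-line interface, so please check them against the help output of your installed version. The script name train_cnn.py is a hypothetical placeholder.

```python
import subprocess
import sys


def build_profile_command(script, profiler="nv-nsight-cu-cli"):
    """Build the argument list for profiling a Python script.

    Assumption: the profiler accepts "--target-processes all" to include
    child processes; verify this against the installed version.
    """
    return [profiler, "--target-processes", "all", sys.executable, script]


def profile(script):
    """Launch the profiler and return its exit code, or None if not found."""
    cmd = build_profile_command(script)
    try:
        return subprocess.call(cmd)
    except OSError:
        # Raised when the profiler executable is not on PATH.
        print("Profiler not found; would have run: {}".format(" ".join(cmd)))
        return None


if __name__ == "__main__":
    # "train_cnn.py" is a hypothetical script name.
    profile("train_cnn.py")
```

Passing the command as a list avoids shell quoting problems if the script path contains spaces.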

Also, as Felix mentioned, let us know if the problem goes away when you are not profiling.