Hi,
I’m running the Nsight Compute profiler on a CNN in PyTorch, and it fails with the following message:
“RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR”
End of trace:
‘lib/python2.7/site-packages/torch/nn/modules/conv.py", line 313, in forward’
I tried both TensorFlow and PyTorch on several machines. I’m using a GTX 1080 Ti, tried both CUDA 10.0 and 9.0, and my setup meets all the minimum requirements.
Could you please describe in more detail the exact steps or commands you used to profile with Nsight Compute? Does your PyTorch code work correctly when it is not run under Nsight Compute, or do you see the same error there, too? Note also that Nsight Compute 1.0 does not support profiling child processes, so if your use of CUDA or a CUDA-accelerated library does not happen directly in the process launched by Nsight Compute, you will not be able to profile it.
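To illustrate the child-process point, here is a minimal sketch using only the Python standard library (no CUDA or PyTorch involved). The helper names `pid_of_direct_call` and `pid_of_child_call` are hypothetical and exist only for this example. It shows that work started through `subprocess` runs under a different process ID than the interpreter that was launched, which is the situation in which a profiler attached only to the launched process would not see the work.

```python
import os
import subprocess
import sys


def pid_of_direct_call():
    # Work done here runs inside the process that was launched
    # (the one a profiler would have started).
    return os.getpid()


def pid_of_child_call():
    # Work done here runs in a new child process, as happens when a
    # wrapper script or launcher spawns the real job.
    out = subprocess.check_output(
        [sys.executable, "-c", "import os; print(os.getpid())"]
    )
    return int(out.decode().strip())


launched_pid = os.getpid()
direct_pid = pid_of_direct_call()
child_pid = pid_of_child_call()

print("launched process:   ", launched_pid)
print("direct call ran in: ", direct_pid)
print("child call ran in:  ", child_pid)
```

If your training script is started through a wrapper, a shell script, or a worker process, the CUDA calls may be happening in a child like the one above, so it could be worth launching the Python script that actually calls CUDA directly under the profiler.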