Using nvprof on python.exe with PyTorch code

nvprof works for profiling a C++ CUDA executable, but not for Python with PyTorch code:

nvprof python -c "import torch; torch.randperm(10, device='cuda')"
======== Warning: No CUDA application was profiled, exiting

According to "How do I know randperm is performed on GPU" (reply #2 by ptrblck, PyTorch Forums), it should work. Is there anything else I need to configure?

The PyTorch code calls torch.cuda.profiler.cudart().cudaProfilerStart() and cudaProfilerStop(), but nvprof still reports nothing.
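For reference, here is a minimal sketch of marking a profiled region from Python. It uses torch.cuda.profiler.start() and torch.cuda.profiler.stop(), which wrap the same CUDA runtime calls mentioned above. The function name profiled_randperm and the script name profile_region.py are made up for illustration, and the sketch assumes a CUDA-enabled PyTorch build; it returns None when PyTorch or a GPU is not available.

```python
# Minimal sketch: restrict profiling to an explicit region.
# Assumption: a CUDA-enabled PyTorch build; otherwise the function returns None.
try:
    import torch
except ImportError:  # allows the script to run where PyTorch is not installed
    torch = None


def profiled_randperm(n=10):
    """Run torch.randperm on the GPU inside a profiler start/stop region."""
    if torch is None or not torch.cuda.is_available():
        return None  # nothing to profile without a CUDA device

    torch.cuda.synchronize()         # finish earlier GPU work before the region
    torch.cuda.profiler.start()      # wraps cudaProfilerStart()
    try:
        perm = torch.randperm(n, device="cuda")
        torch.cuda.synchronize()     # make sure the kernel completes inside the region
    finally:
        torch.cuda.profiler.stop()   # wraps cudaProfilerStop()
    return perm.cpu()


if __name__ == "__main__":
    print(profiled_randperm())
```

If only the marked region should be collected, the script can be launched as "nvprof --profile-from-start off python profile_region.py". Without that flag, nvprof profiles from process start and the start/stop calls do not narrow the capture.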

Thank you!

I have PyTorch compiled locally, in Release mode.

Do I need to compile with CUDA_DEBUG for nvprof profiling to work?

Thank you!

I don't know what the issue was, but now

nvprof python -c "import torch; torch.randperm(10, device='cuda')"

seems to work.
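In case it helps someone reproduce this, below is a small sketch that launches the same one-liner under nvprof from Python. The helper names build_nvprof_command and run_under_nvprof are hypothetical. Using sys.executable is a design choice so that the interpreter being profiled is the same one that has PyTorch installed; whether an interpreter mismatch was the original cause here is unknown.

```python
import shutil
import subprocess
import sys

# The one-liner from this thread, with plain ASCII quotes.
SNIPPET = "import torch; torch.randperm(10, device='cuda')"


def build_nvprof_command(code=SNIPPET, python=sys.executable):
    """Return the argv list for profiling a Python one-liner under nvprof."""
    return ["nvprof", python, "-c", code]


def run_under_nvprof(code=SNIPPET):
    """Run the snippet under nvprof; return None if nvprof is not on PATH."""
    if shutil.which("nvprof") is None:
        return None
    return subprocess.run(
        build_nvprof_command(code), capture_output=True, text=True
    )


if __name__ == "__main__":
    result = run_under_nvprof()
    if result is None:
        print("nvprof not found on PATH")
    else:
        # Print both streams so the profiler report is visible either way.
        print(result.stdout)
        print(result.stderr)
```

If the report contains "No CUDA application was profiled", it is worth checking that torch.cuda.is_available() returns True in that same interpreter before looking at build flags.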
