Hello,
I am trying to use the nvprof tool to profile my code. My CUDA version is 9.0.
Before that, I tried running nvprof on one of the CUDA samples, ‘vectorAdd’, but nvprof becomes unresponsive without reporting any error. The profiling starts, then the terminal freezes. I cannot even interrupt the execution with Ctrl+C; the terminal has to be force-closed. Here is the terminal log after waiting for a long time:
nvprof ./vectorAdd
[Vector addition of 50000 elements]
==8794== NVPROF is profiling process 8794, command: ./vectorAdd
After this, I get absolutely no response.
I also tried simpler samples that use ‘cuda_profiler_api.h’; the result is the same.
When no kernel is profiled, for instance with the CUDA sample ‘deviceQuery’, nvprof does generate profiling results for the API calls.
In addition, cuda-memcheck runs and reports its results just fine:
cuda-memcheck ./vectorAdd
========= CUDA-MEMCHECK
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
========= ERROR SUMMARY: 0 errors
What could be the cause of this issue, and how can it be solved?
Is this a PC or an Optimus-enabled laptop? I haven’t tried profiling on Optimus-based systems, so maybe that’s the issue; otherwise I have no idea. Please post your hardware configuration and the output of nvidia-smi.
Are you profiling the code on one or multiple GPUs? Try using the CUDA_VISIBLE_DEVICES environment variable to target a single GPU and see whether the issue with the hanging profiler goes away.
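A minimal sketch of the single-GPU suggestion, assuming a POSIX shell. The helper name run_on_single_gpu and the GPU index 0 are placeholders chosen for illustration and are not part of the CUDA toolkit; only the CUDA_VISIBLE_DEVICES variable itself comes from CUDA.

```shell
# Hypothetical helper: run a command with only one GPU visible to CUDA.
# The first argument is the GPU index to expose; the rest is the command.
run_on_single_gpu() {
    gpu_index="$1"
    shift
    # env sets the variable for this one command only, so the
    # surrounding shell session is left unchanged.
    env CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}

# Intended use with the command from the question (assumes GPU index 0):
#   run_on_single_gpu 0 nvprof ./vectorAdd
```

Without the helper, the same thing can be typed directly as CUDA_VISIBLE_DEVICES=0 nvprof ./vectorAdd. The variable then applies to that single invocation, which makes it easy to compare against a run with all GPUs visible.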
Hello,
Normally I work with multiple GPUs, but for testing I used a single GPU as well.
I tried many other solutions along with these, but nothing helped.
Then I rebooted the machine, and this time it worked…
Maybe some of the updates had not been fully applied before the reboot, and the reboot solved the problem.