nvprof becomes unresponsive

zulal-b · June 21, 2018, 11:51am

Hello,
I am trying to use the nvprof tool to profile my code. The current CUDA version is cuda-9.0.
Before profiling my own code, I tried nvprof on one of the CUDA samples, ‘vectorAdd’, but nvprof becomes unresponsive without reporting any error. Profiling starts, then the terminal freezes. I cannot even interrupt execution with Ctrl+C; the terminal has to be force-closed. Here is the terminal log after waiting for a long time:

nvprof ./vectorAdd
[Vector addition of 50000 elements]
==8794== NVPROF is profiling process 8794, command: ./vectorAdd

After this, I get absolutely no response.
I also tried simpler samples that use ‘cuda_profiler_api.h’; the result is the same.
When no kernel is profiled, for instance in the CUDA sample ‘deviceQuery’, the profiling results for API calls are generated, though.
Also, cuda-memcheck produces its results just fine:

cuda-memcheck ./vectorAdd
========= CUDA-MEMCHECK
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
========= ERROR SUMMARY: 0 errors

What could be the reason and solution for this issue?

vacaloca · June 22, 2018, 3:42pm

Is this a PC or an Optimus-enabled laptop? I haven’t tried profiling on Optimus-based systems, so maybe that’s the issue; otherwise, I have no idea. Please post your hardware configuration and the output of nvidia-smi.

zulal-b · June 24, 2018, 6:27pm

Hello,
This is a PC server. And the complete output of nvidia-smi is:

nvidia-smi
Sun Jun 24 21:24:39 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         Off  | 00000000:02:00.0 Off |                    0 |
| N/A   35C    P0    58W / 235W |      0MiB /  5699MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 00000000:03:00.0 Off |                    0 |
| N/A   35C    P0    57W / 235W |      0MiB /  5699MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K20Xm         Off  | 00000000:83:00.0 Off |                    0 |
| N/A   31C    P0    58W / 235W |      0MiB /  5699MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K20Xm         Off  | 00000000:84:00.0 Off |                    0 |
| N/A   30C    P0    57W / 235W |      0MiB /  5699MiB |     36%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

vacaloca · June 25, 2018, 12:08pm

Are you profiling the code on one or multiple GPUs? Try setting the CUDA_VISIBLE_DEVICES environment variable to target a single GPU and see whether the issue with the hanging profiler goes away.
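
A minimal sketch of what that could look like in a shell, assuming the sample binary is ./vectorAdd in the current directory and nvprof is on PATH. CUDA_VISIBLE_DEVICES is the environment variable read by the CUDA runtime; run_on_gpu is a hypothetical helper name used only for illustration, not part of the toolkit.

```shell
# Hypothetical helper: run a command with only one GPU visible to CUDA.
# Usage: run_on_gpu <device-index> <command> [args...]
run_on_gpu() {
  gpu="$1"
  shift
  # env sets CUDA_VISIBLE_DEVICES for the child process only,
  # so the variable is not changed in the calling shell.
  env CUDA_VISIBLE_DEVICES="$gpu" "$@"
}

# Example (assumes nvprof is on PATH and ./vectorAdd has been built):
#   run_on_gpu 0 nvprof ./vectorAdd
```

With a single device index, the process sees only that GPU, which it then addresses as device 0.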

zulal-b · June 26, 2018, 1:34pm

Hello,
Normally, I work with multiple GPUs, but for testing I used a single GPU as well.

I tried many other solutions along with these, and nothing helped.
Then I rebooted the machine, and it worked this time…
Maybe some of the updates hadn’t been applied before the reboot, and the reboot solved the problem.

Anyway, thanks for your attention, @vacaloca!

zjw518 · June 27, 2018, 2:07pm

What version of the CUDA toolkit are you using? I have had strange issues running the profiler with unified memory profiling; adding the flag

--unified-memory-profiling off

resolved these issues in CUDA 8.0 for me. I’ve found that CUDA versions 9.* no longer have this issue.
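
For reference, a sketch of how that flag might be passed on the command line, assuming nvprof is on PATH and the program being profiled is ./vectorAdd. The option itself is the one quoted above; profile_no_um and the PROFILER override are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: profile a program with unified memory profiling disabled.
# PROFILER defaults to nvprof; set PROFILER=echo to print the arguments instead
# of launching the profiler.
profile_no_um() {
  "${PROFILER:-nvprof}" --unified-memory-profiling off "$@"
}

# Example (assumes nvprof is on PATH and ./vectorAdd has been built):
#   profile_no_um ./vectorAdd
```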

zulal-b · June 27, 2018, 2:16pm

I am using CUDA 9.0. I no longer have the problem after rebooting once more.
