Do nvprof and gprof data combined give a full picture of performance?

Hi,
I had a really tough time running the Visual Profiler from my laptop. When I try to run it, it keeps generating the timer, and when the application closes it either exits with a signal 134 error or reports a display problem.
Since I needed a profile urgently, I used nvprof and gprof and got profiles for the CUDA cores and the CPU. But I am not sure whether I can rely on these two data sets, since they are generated separately. Can someone from NVIDIA comment on this and guide me on how to profile the whole system (CPU + GPU), so that I can look at hotspots?

I am using a Jetson TX2 with JetPack 3.3.

Please help!

Hi,

It’s recommended to run nvprof on the device, save the result to a file, and open it with the NVIDIA Visual Profiler on the host.
To profile a CUDA application, please use nvprof.
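
A minimal sketch of that workflow, assuming the application binary is `./my_app` and the output file is `my_app.nvvp` (both names are placeholders, not taken from this thread):

```shell
# Placeholder names: adjust to your application and preferred output file.
APP=./my_app
OUT=my_app.nvvp

# On the Jetson: record the GPU profile into a file instead of printing it.
# --export-profile (short form: -o) writes the file; -f overwrites an old one.
if command -v nvprof >/dev/null 2>&1 && [ -x "$APP" ]; then
    nvprof -f --export-profile "$OUT" "$APP"
else
    echo "nvprof or $APP not found, nothing recorded" >&2
fi

# Afterwards, copy "$OUT" to the host (for example with scp) and load it
# in the Visual Profiler through File > Import.
```

The guard only keeps the snippet from failing where nvprof or the binary is missing; on the device the single nvprof line is the part that matters.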

Thanks.

Thanks AastaLLL. I wanted a combined profile on the Jetson TX2, which I tried to get as follows:
nvprof --cpu-profiling on
But it shows the warning below:

======== Warning: CPU Profiling is not supported on the underlying platform

Hence I would like your help on how to profile and get both results.
Or is my approach below correct?
Approach:
1) Use nvprof for CUDA kernels
2) Use gprof for CPU functions

Steps 1 and 2 are run separately, so the two profiles come from different runs, because gprof requires modifying the makefile to add the -pg option.
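
The two steps above could be sketched as follows. This is only an illustration: `./my_app`, `gpu_profile.nvvp`, and `cpu_profile.txt` are placeholder names, and it assumes the binary was already built with `-pg` on both the compile and link lines. Running the `-pg` build under nvprof, so that a single execution yields both files, is an assumption that has not been verified on the TX2, and the gprof instrumentation adds overhead of its own.

```shell
# Placeholder names: adjust to your application and output files.
APP=./my_app                 # assumed to be built and linked with -pg
GPU_OUT=gpu_profile.nvvp
CPU_OUT=cpu_profile.txt

if command -v nvprof >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Step 1 (GPU side): nvprof records kernels and memory copies of this run.
    nvprof -f --export-profile "$GPU_OUT" "$APP"

    # Step 2 (CPU side): a -pg binary writes gmon.out into the current
    # directory when it exits normally; gprof turns it into a text report.
    if [ -f gmon.out ]; then
        gprof "$APP" gmon.out > "$CPU_OUT"
    fi
else
    echo "nvprof or $APP not found, nothing recorded" >&2
fi
```

Even when both files come from one run, they remain two separate reports with no shared timeline, so correlating CPU and GPU hotspots is still a manual step.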

Please advise.

Hi,

It’s recommended to use NVIDIA System Profiler to profile CPU tasks.
https://developer.nvidia.com/nvidia-system-profiler

You can install it with JetPack 3.3.

Thanks.

Thanks AastaLLL.
I wanted to profile the CPU and GPU together, not separately.
I believe the System Profiler covers only the CPU.
Please let me know if combined (CPU + GPU) profiling is not available.

Regards
Vinay

Hi,

I suppose you can use Nsight Systems to get profiling results for both the CPU and GPU:
https://developer.nvidia.com/nsight-systems#features

Feature                                             | Jetson Autonomous Machines
CPU cores utilization, process, & thread activities | yes
CPU thread periodic sampling backtraces             | yes
CPU thread blocked state backtraces                 | yes
CPU performance counter sampling                    | yes
GPU workload trace                                  | yes
GPU context switch trace                            | yes
SOC hypervisor trace                                | -
SOC memory bandwidth sampling                       | yes
SOC Accelerators trace                              | Xavier

Thanks.

I am unable to get this, as Nsight gives me a warning saying CPU profiling is not enabled for this platform.
Did you try it on the TX2?

Hi,

Okay, I can reproduce this issue in my environment.
It looks like the behavior is inconsistent with what the website states.

Let me check this in detail internally.
I will update you later.

Thanks.

Thanks AastaLLL.
I look forward to your observations and solutions.

Hi,

Sorry for the confusion.
Nsight and nvprof are different profiling tools.
Although nvprof doesn’t support CPU profiling on Jetson, Nsight can do this, as the website mentions.

Please switch to Nsight Systems:
https://developer.nvidia.com/nsight-systems#features

Thanks.

Hi AastaLLL,
I tried Nsight, but there too I got the same warning and hence couldn’t get CPU data.
Please check my earlier comments in this thread.

I also tried a few samples such as dct8x8 and vecadd.
Neither Nsight nor nvprof gives both CPU and GPU data. They only give GPU data, so I am confused about how system profiling can be done.
Did you try these samples?

Hi vinaybk,
Could you provide a few details about the Nsight Systems profiling:

  • The Nsight Systems version (can be found in the Help->About dialog)
  • A screenshot of the warning Nsight Systems gives (about CPU profiling not being enabled)
  • The report file generated by Nsight Systems (you can obtain the report file via Project Explorer: right-click on the report -> "Show in folder")

Thank you!