Understanding CPU and GPU Behavior with NVIDIA Visual Profiler

hirakawa.yuya · March 4, 2024, 2:04am

I am trying to deepen my understanding of CPU and GPU behavior using the NVIDIA Visual Profiler.

There are a few things I would like to know about the NVIDIA Visual Profiler:

1. What is the range of time supported by NVIDIA Visual Profiler? For example, is it in seconds, milliseconds, microseconds, or nanoseconds?
2. The attached image is a screenshot from running a simple CUDA program.[1]

2-1. When horizontal bars for cudaMalloc or cudaMemcpy appear, is the CPU actually running, or is it just waiting?

2-2. Do MemCpy(HtoD) and MemCpy(DtoH) represent the actual data transfers, or something else?

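[1] The simple CUDA program I used is “mult.cu”, found on the following web page: 第6回 GPU の仕組みと PyTorch 入門 / 真面目なプログラマのためのディープラーニング入門 (in English: "Lecture 6: How GPUs Work and an Introduction to PyTorch / Introduction to Deep Learning for Serious Programmers")

Sorry, this web page is in Japanese.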

mjain · March 8, 2024, 1:01pm

What is the range of time supported by NVIDIA Visual Profiler? For example, is it in seconds, milliseconds, microseconds, or nanoseconds?

Time units are shown wherever times are displayed. For example, in the timeline the unit appears in the bar at the top. As you zoom in or out, the unit may change between s (seconds), ms (milliseconds), us (microseconds), and so on.

2-1. When horizontal bars for cudaMalloc or cudaMemcpy appear, is the CPU actually running, or is it just waiting?

For CUDA APIs, these bars represent the entire duration of the API call, from when CUDA starts processing it to when it finishes. The CPU is not necessarily busy for that whole duration.
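One rough way to check this outside the profiler is to compare wall-clock time with the calling thread's CPU time around a single API call. The sketch below is only an illustration and is not taken from mult.cu: it assumes a POSIX system (for `clock_gettime` and `CLOCK_THREAD_CPUTIME_ID`), and the helper `read_clock` and the 64 MiB buffer size are made up for the example. The wall-clock interval roughly corresponds to the length of the API bar in the timeline.

```cuda
#include <cstdio>
#include <ctime>
#include <cuda_runtime.h>

// Hypothetical helper: read a POSIX clock and return its value in seconds.
static double read_clock(clockid_t id) {
    timespec ts;
    clock_gettime(id, &ts);
    return static_cast<double>(ts.tv_sec) + ts.tv_nsec * 1e-9;
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, an arbitrary size for illustration
    float *d_buf = nullptr;

    // Create the CUDA context up front so its cost is not charged to cudaMalloc.
    cudaFree(0);

    double wall0 = read_clock(CLOCK_MONOTONIC);
    double cpu0  = read_clock(CLOCK_THREAD_CPUTIME_ID);
    cudaError_t err = cudaMalloc(&d_buf, bytes);
    double cpu1  = read_clock(CLOCK_THREAD_CPUTIME_ID);
    double wall1 = read_clock(CLOCK_MONOTONIC);

    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Wall time is the full duration of the call; CPU time is how long
    // the calling thread was actually scheduled on a core during it.
    printf("cudaMalloc wall time: %.3f ms\n", (wall1 - wall0) * 1e3);
    printf("cudaMalloc CPU time : %.3f ms\n", (cpu1 - cpu0) * 1e3);

    cudaFree(d_buf);
    return 0;
}
```

If the wall time is clearly larger than the CPU time, the thread spent part of the call blocked. If the two are close, the thread was executing for most of the call, which can include spin-waiting inside the driver, so "executing" does not always mean useful work. The numbers will vary with the driver, the GPU, and the device scheduling flags.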

2-2. Do MemCpy(HtoD) and MemCpy(DtoH) represent the actual data transfers, or something else?

Yes. Activities shown under the CUDA device and context are those that execute on the CUDA device. The MemCpy traces represent data transfers.
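To cross-check the transfer durations shown in the timeline, the copies can also be timed on the device side with CUDA events. This is a minimal sketch under a few assumptions that are not from the original program: it uses a pinned host buffer with `cudaMemcpyAsync` on the default stream, and the `CHECK` macro, the `gb_per_s` helper, and the 64 MiB size are hypothetical names and values chosen for the example.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical error-checking macro: print the failing call and exit main.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical helper: effective bandwidth in GB/s for `bytes` moved in `ms`.
static double gb_per_s(size_t bytes, float ms) {
    return (bytes / 1e9) / (ms / 1e3);
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, arbitrary
    float *h_buf = nullptr, *d_buf = nullptr;
    CHECK(cudaMallocHost(&h_buf, bytes));  // pinned host memory
    CHECK(cudaMalloc(&d_buf, bytes));
    memset(h_buf, 0, bytes);

    cudaEvent_t start, mid, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&mid));
    CHECK(cudaEventCreate(&stop));

    // Events are recorded in stream order, so each pair brackets one copy.
    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(mid));
    CHECK(cudaMemcpyAsync(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));  // wait until both copies have finished

    float h2d_ms = 0.0f, d2h_ms = 0.0f;
    CHECK(cudaEventElapsedTime(&h2d_ms, start, mid));
    CHECK(cudaEventElapsedTime(&d2h_ms, mid, stop));
    printf("HtoD: %.3f ms (%.2f GB/s)\n", h2d_ms, gb_per_s(bytes, h2d_ms));
    printf("DtoH: %.3f ms (%.2f GB/s)\n", d2h_ms, gb_per_s(bytes, d2h_ms));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(mid));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

The event-based times should be in the same range as the MemCpy(HtoD) and MemCpy(DtoH) bars for a transfer of the same size, though small differences are expected because events carry their own overhead and limited resolution.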

