CUDA VISUAL PROFILER

Kiran_CUDA · December 4, 2009, 12:27pm

Hi,

I have a .cu file that I have compiled with nvcc. Now I want a deeper understanding of how the program actually interacts with the hardware, and I suppose the CUDA Visual Profiler will help me here. Can anybody tell me the commands for running the profiler on my .cu file?
Also, can we use the CUDA Visual Profiler in emulation mode?

Thanks

Sanjiv.Satoor · January 1, 2010, 8:33am

Build the CUDA program executable.

Run the CUDA Visual Profiler and select your program. After the program execution completes, the profiler output is displayed in the Visual Profiler. See the CUDA Visual Profiler document ‘cudaprof.html’ for details.

You cannot use CUDA Visual Profiler in emulation mode.
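To make the build step concrete, here is a minimal sketch of a program that could be built and then selected in the profiler. The file name vec_add.cu, the kernel name vecAdd, and the build command in the comment are assumptions for illustration only, not something taken from the original question. The program exits on its own without waiting for keyboard input, so a profiled run can finish unattended.

```cuda
// vec_add.cu (hypothetical file name, used here only for illustration)
// Assumed build command: nvcc -o vec_add vec_add.cu
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

int main()
{
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);

    // Host buffers.
    float *ha = (float *)malloc(bytes);
    float *hb = (float *)malloc(bytes);
    float *hc = (float *)malloc(bytes);
    if (!ha || !hb || !hc) {
        fprintf(stderr, "host allocation failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        ha[i] = (float)i;
        hb[i] = 2.0f * i;
    }

    // Device buffers.
    float *da = NULL, *db = NULL, *dc = NULL;
    if (cudaMalloc((void **)&da, bytes) != cudaSuccess ||
        cudaMalloc((void **)&db, bytes) != cudaSuccess ||
        cudaMalloc((void **)&dc, bytes) != cudaSuccess) {
        fprintf(stderr, "device allocation failed\n");
        return 1;
    }
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // 256 threads per block; enough blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(da, db, dc, n);

    // Report launch errors, then copy back; this cudaMemcpy blocks
    // until the kernel has finished.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Verify on the host: every element should equal 3 * i.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (hc[i] != 3.0f * i) {
            ++mismatches;
        }
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return mismatches == 0 ? 0 : 1;  // no "press ENTER" prompt
}
```

After building, the resulting executable is what you would pick as the program in a profiler session; the kernel launch and the memory copies are the GPU activity it would be expected to report.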

Kiran_CUDA · January 1, 2010, 8:52am

Thanks for your reply, satoor!

I have sent you a PM.

biebo · January 1, 2010, 1:24pm

It is not showing any output. Please help.

=== Start profiling for session ‘Session1’ ===
Start program ‘/home/bibrak/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery’ run #1 …
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: “GeForce 9200M GE”
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 267714560 bytes
Number of multiprocessors: 1
Number of cores: 8
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit…

Program run #1 was aborted after maximum program execution time duration of 10 seconds.
Start program ‘/home/bibrak/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery’ run #2 …
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: “GeForce 9200M GE”
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 267714560 bytes
Number of multiprocessors: 1
Number of cores: 8
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit…

Program run #2 was aborted after maximum program execution time duration of 10 seconds.
Start program ‘/home/bibrak/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery’ run #3 …
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: “GeForce 9200M GE”
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 267714560 bytes
Number of multiprocessors: 1
Number of cores: 8
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit…

Program run #3 was aborted after maximum program execution time duration of 10 seconds.
Start program ‘/home/bibrak/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery’ run #4 …
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: “GeForce 9200M GE”
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 267714560 bytes
Number of multiprocessors: 1
Number of cores: 8
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit…

Program run #4 was aborted after maximum program execution time duration of 10 seconds.
Error in reading profiler output.

mitchde · January 2, 2010, 8:46am

Is there a newer version also available for Mac OS X?
