OpenCL Visual Profiling

Opencl_jedi · April 23, 2010, 8:24am

Hi,

I’m trying to profile my kernel with the NVIDIA OpenCL profiler,
but I can’t find anything about the code needed to enable profiling.
Does anyone have an idea what the command is?

Thanks.

fcs · April 23, 2010, 8:32am

I’m not sure what you need, but you don’t need to modify your code to use the profiler. Just run it and select your binary and its options; it activates the hardware counter collection for you.

Opencl_jedi · April 23, 2010, 9:20am

Actually, I’m trying to use the OpenCL Visual Profiler. It takes an .oclpj file and shows a table with kernel performance figures.

I don’t know how to get the .oclpj file from my visual project.

fcs · April 23, 2010, 1:08pm

The .oclpj file is the format in which your project is saved after profiling.

If your CUDA app is on the same machine as the profiler, it’s easy to use: just click ‘Start’ and fill in the information about your application.

Otherwise, you will need to do this (Linux syntax):

export CUDA_PROFILE=1

export CUDA_PROFILE_CSV=1

export CUDA_PROFILE_CONFIG=~/script/cudaprofile.sh

where cudaprofile.sh contains the counters you need:

[codebox]#The profiler supports the following options:

#Time stamps for kernel launches and memory transfers.

#This can be used for timeline analysis.

timestamp

#Number of blocks in a grid along the X and Y dimensions for a kernel launch

gridsize

#Number of threads in a block along the X, Y and Z dimensions for a kernel launch

threadblocksize

#Size of dynamically allocated shared memory per block in bytes for a kernel launch

dynsmemperblock

#Size of statically allocated shared memory per block in bytes for a kernel launch

stasmemperblock

#Number of registers used per thread for a kernel launch

regperthread

#Memory transfer direction

#a direction value of 0 is used for host->device memory copies and a value of 1 is used for device->host

memtransferdir

#Memory copy size in bytes

memtransfersize

#Stream Id for a kernel launch

streamid

##The profiler supports logging of following counters during kernel execution

##There is a max of 4 profiler counters

##Non-coalesced (incoherent) global memory loads (always zero on compute capability 1.3)

#gld_incoherent

##Coalesced (coherent) global memory loads

#gld_coherent

##32-byte global memory load transactions

#gld_32b

##64-byte global memory load transactions

gld_64b

##128-byte global memory load transactions

gld_128b

##Global memory loads (invalid on compute capability 1.3)

#gld_request

##Non-coalesced (incoherent) global memory stores (always zero on compute capability 1.3)

#gst_incoherent

##Coalesced (coherent) global memory stores

#gst_coherent

##32-byte global memory store transactions

#gst_32b

##64-byte global memory store transactions

#gst_64b

##128-byte global memory store transactions

#gst_128b

##Global memory stores (invalid on compute capability 1.3)

#gst_request

##Local memory loads

#local_load

##Local memory stores

#local_store

##Branches taken by threads executing a kernel

#branch

##Divergent branches taken by threads executing a kernel

divergent_branch

##Instructions executed

instructions

##Number of thread warps that serialize on address conflicts to either shared or constant memory

#warp_serialize

##Number of thread blocks executed

#cta_launched

[/codebox]
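
For what it’s worth, the setup above can be sketched as one small shell script. This is only an illustration under a few assumptions: the config path under /tmp and the particular options picked are placeholders, and the variable names are simply the ones quoted in this post. It also counts the enabled hardware counters, since the config comments mention a limit of 4.

```shell
#!/bin/sh
# Sketch only: write a small profiler config and export the variables
# quoted in the post. The path below is an illustrative assumption.
CONFIG_FILE="${TMPDIR:-/tmp}/cudaprofile.cfg"

cat > "$CONFIG_FILE" <<'EOF'
# launch options (not hardware counters)
timestamp
gridsize
threadblocksize
# hardware counters (the post mentions a maximum of 4)
gld_64b
gld_128b
divergent_branch
instructions
EOF

# Option names listed in the post that are not hardware counters.
OPTIONS='timestamp|gridsize|threadblocksize|dynsmemperblock|stasmemperblock|regperthread|memtransferdir|memtransfersize|streamid'

# Drop comments and blank lines, then drop the non-counter options;
# whatever remains is treated as an enabled hardware counter.
COUNTERS=$(grep -v -e '^#' -e '^[[:space:]]*$' "$CONFIG_FILE" \
  | grep -c -v -x -E "$OPTIONS")

if [ "$COUNTERS" -gt 4 ]; then
  echo "warning: $COUNTERS counters enabled, the stated limit is 4" >&2
fi

export CUDA_PROFILE=1
export CUDA_PROFILE_CSV=1
export CUDA_PROFILE_CONFIG="$CONFIG_FILE"

echo "enabled counters: $COUNTERS"
# Launch the application from this same shell afterwards, for example:
# ./my_app    (hypothetical binary name)
```

Source the script (or paste it into the same shell session) before launching the application, so that the exported variables are inherited by the process being profiled.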
