New compute profiler explanation

Valmass · December 7, 2010, 6:25pm

Hi,

I’m running some tests and collecting results with the NVIDIA Compute Visual Profiler.
I initialize my NDRange global work size with the values x = 159, y = 159, z = 1 (total = 25281),
but I don’t explicitly set the local_work_size,
leaving the choice of size to the OpenCL runtime.

The Compute Visual Profiler, however, reports quite different values:
NDRangex NDRangey
1 159

Does anyone have an explanation for such strange behaviour?
How does the runtime choose the allocation strategy?

Thanks in advance for any help

Valerio
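
One constraint that may be relevant here, offered as a sketch rather than a confirmed explanation of what the profiler reports: in OpenCL 1.x the global work size in each dimension must be evenly divisible by the local work size, so a global size of 159 leaves very few candidates to pick from. The helper names and the `max_local` limit below are hypothetical and only illustrate the arithmetic; the real device limit would come from a query such as CL_DEVICE_MAX_WORK_GROUP_SIZE.

```python
def valid_local_sizes(global_size, max_local=1024):
    """List every local size that divides global_size evenly.

    max_local is an assumed device limit, not a value from the post.
    """
    upper = min(global_size, max_local)
    return [n for n in range(1, upper + 1) if global_size % n == 0]


def round_up(global_size, local_size):
    """Smallest multiple of local_size that is >= global_size."""
    return ((global_size + local_size - 1) // local_size) * local_size


if __name__ == "__main__":
    # 159 = 3 * 53, so only four local sizes divide it evenly.
    print(valid_local_sizes(159))  # [1, 3, 53, 159]
    # Padding the global size to a multiple of 16 gives more freedom.
    print(round_up(159, 16))       # 160
```

A common workaround, under the same assumptions, is to pad the global size up to a multiple of the desired local size (for example 160 x 160 with a 16 x 16 work-group) and have the kernel return early for work-items whose index falls outside 159 x 159.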
