Observation about performance change with grid size

Hi all,

I am using an 8800GT for an image processing algorithm. Originally, the grid size for my application was 128*128 with a block size of 8*8 (this block size gives the best performance), which covers the whole image (1024*1024). Each block processes one portion of the image in parallel.
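For reference, the launch configuration is roughly this (a sketch only; processImage, d_image, and launchFullGrid are placeholder names, not my exact code):

// Hypothetical sketch of the original configuration: one 8*8 block per 8*8 tile.
__global__ void processImage(float *image, int width);   // placeholder declaration

void launchFullGrid(float *d_image)
{
    dim3 block(8, 8);                // 8*8 = 64 threads per block (best-performing size)
    dim3 grid(1024 / 8, 1024 / 8);   // 128*128 blocks cover the 1024*1024 image
    processImage<<<grid, block>>>(d_image, 1024);
}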

Suppose the kernel run time in this configuration is X.
Now I am running some scalability tests on my algorithm, so I force the whole task to be done by a smaller number of blocks. For example, with just one block in the grid, that single block performs the work of all 128*128 blocks serially, using a for loop in the kernel (see the sketch just below).
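Concretely, the scaled-down kernel does something like this (a sketch of what I described above, not my exact code; processImageScaled and processPixel are made-up names, and the per-pixel work is a stand-in):

// Stand-in for the real per-pixel work done by each thread.
__device__ void processPixel(float *image, int x, int y, int width)
{
    image[y * width + x] *= 2.0f;
}

// Each of numBlocks blocks loops serially over its share of the
// 128*128 = 16384 tiles; each thread handles one pixel of an 8*8 tile.
__global__ void processImageScaled(float *image, int width, int numBlocks)
{
    int tilesPerRow = width / 8;                  // 128 tiles per row for width = 1024
    int totalTiles  = tilesPerRow * tilesPerRow;  // 16384 tiles in all
    for (int t = blockIdx.x; t < totalTiles; t += numBlocks) {
        int x = (t % tilesPerRow) * 8 + threadIdx.x;   // pixel column in the image
        int y = (t / tilesPerRow) * 8 + threadIdx.y;   // pixel row in the image
        processPixel(image, x, y, width);
    }
}

// Launched with the reduced grid, e.g. for numBlocks = 1, 2, 4, ... 64:
// processImageScaled<<<numBlocks, dim3(8, 8)>>>(d_image, 1024, numBlocks);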

As I increase the number of blocks, the run time scales as follows:

No. of blocks    Time taken
 1               25X
 2               13X
 4               6.5X
 8               3.3X
16               1.6X
32               X
64               0.75X

Question (finally :-) !!): Why is there a speed-up when I keep only 64 blocks and make the kernel repeat the work on different parts of the image? In other words, why do fewer blocks (64) outperform 128*128 blocks for the same job? Doesn't this conflict with the general notion that more blocks yield better performance?