problem passing big loop to cuda

mc_line · March 22, 2008, 4:22pm

Hi

I have a huge loop like the following (simplified for brevity)

for (i = 1; i < …; i++)
    for (k = 1; k < …; k++) {
        .
        .
        .
        if (…)
            b[i][k] = b[i-1][k] + some stuff
        .
        .
        .
    }

So basically, in this loop the value at [i] depends on the value at [i-1] of the shared array b, which was assigned in the previous iteration.

I have already written the code in CUDA form.

The problem is that I do not know how to ensure that [i-1] will always be calculated before [i] for the same k without running into deadlocks…

__syncthreads() won't work because I am using many blocks, not only threads within one block.
Please help

DenisR · March 22, 2008, 9:40pm

This looks like a cumulative sum, so check out the scan algorithm to maybe get some ideas on how to do this.

mc_line · March 23, 2008, 12:16am

No, actually it is not exactly a sum.

I wrote “b[i][k] = b[i-1][k] + some stuff” just to show that the ith element depends on the (i-1)th element… what the algorithm does is not exactly a “sum”.

DenisR · March 23, 2008, 7:09am

Well, then you can keep the loop over i as a simple solution.
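
A minimal sketch of what keeping the loop over i could look like, assuming b is stored row-major as a flat array and the dependency really is only between [i-1] and [i] for the same k. The names `some_stuff`, `column_recurrence` and `run_column_recurrence`, the sizes, and the choice to cover every k starting from 0 are all made up for illustration; the real "some stuff" is not given in the thread. Each thread owns one k and walks i in order, so no synchronization between blocks is needed.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical placeholder for the elided "some stuff"; the real term is
// not given in the thread.
__device__ float some_stuff(int /*i*/, int /*k*/) { return 1.0f; }

// One thread per column k. Each thread walks i in order, so b[i-1][k] is
// always written (by the same thread) before b[i][k] reads it.
__global__ void column_recurrence(float* b, int rows, int cols)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= cols) return;
    for (int i = 1; i < rows; ++i)
        b[i * cols + k] = b[(i - 1) * cols + k] + some_stuff(i, k);
}

// Host-side check with made-up sizes: with b[0][k] = 0 and some_stuff == 1,
// every b[i][k] should end up equal to i.
bool run_column_recurrence()
{
    const int rows = 64, cols = 1000;
    const size_t bytes = size_t(rows) * cols * sizeof(float);
    std::vector<float> h(size_t(rows) * cols, 0.0f);

    float* d = nullptr;
    if (cudaMalloc(&d, bytes) != cudaSuccess) return false;
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (cols + threads - 1) / threads;
    column_recurrence<<<blocks, threads>>>(d, rows, cols);

    // This blocking copy waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < rows; ++i)
        for (int k = 0; k < cols; ++k)
            if (h[size_t(i) * cols + k] != float(i)) return false;
    return true;
}
```

Calling `run_column_recurrence()` from `main` exercises the kernel. With k as the fastest-varying index, neighbouring threads touch neighbouring addresses, which tends to suit global memory access.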

What is possible is the following: split k over your blocks, so that all i's belonging to the same k stay within one block. Then, with your N threads within the block, calculate the first N, then the second N, and so on until you have calculated all i's. Then you can use __syncthreads() to make sure all your N values have been calculated, since it synchronizes the threads within a single block.
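
One possible reading of the per-block scheme, as a rough sketch only: each block owns one k, its N threads compute the independent "some stuff" terms for N consecutive i's in parallel, and after a `__syncthreads()` a single thread chains them in order. This assumes the "some stuff" term does not itself depend on b, which the thread does not confirm. `TILE`, `some_stuff`, `tiled_recurrence` and `run_tiled_recurrence` are hypothetical names, and the sizes are made up.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define TILE 32  // N threads per block (illustrative value)

// Hypothetical placeholder for the elided "some stuff".
__device__ float some_stuff(int /*i*/, int /*k*/) { return 1.0f; }

// One block per column k; launch with TILE threads per block.
__global__ void tiled_recurrence(float* b, int rows, int cols)
{
    __shared__ float term[TILE];
    const int k = blockIdx.x;

    for (int base = 1; base < rows; base += TILE) {
        // Parallel part: each thread computes the term for one i of the tile.
        const int i = base + threadIdx.x;
        if (i < rows) term[threadIdx.x] = some_stuff(i, k);
        __syncthreads();  // all terms of this tile are now in shared memory

        // Sequential part: thread 0 applies the dependency on i-1 in order.
        if (threadIdx.x == 0) {
            const int end = (base + TILE < rows) ? base + TILE : rows;
            for (int j = base; j < end; ++j)
                b[j * cols + k] = b[(j - 1) * cols + k] + term[j - base];
        }
        __syncthreads();  // keep term[] intact until thread 0 is done
    }
}

// Host-side check: with b[0][k] = 0 and some_stuff == 1, b[i][k] should be i.
bool run_tiled_recurrence()
{
    const int rows = 100, cols = 8;
    const size_t bytes = size_t(rows) * cols * sizeof(float);
    std::vector<float> h(size_t(rows) * cols, 0.0f);

    float* d = nullptr;
    if (cudaMalloc(&d, bytes) != cudaSuccess) return false;
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);

    tiled_recurrence<<<cols, TILE>>>(d, rows, cols);

    // This blocking copy waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < rows; ++i)
        for (int k = 0; k < cols; ++k)
            if (h[size_t(i) * cols + k] != float(i)) return false;
    return true;
}
```

Every thread runs the same number of tile iterations, so both `__syncthreads()` calls are reached by the whole block. This layout probably only pays off when the per-element term is expensive, because the chained step is still serial within each block.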
