Sum over a matrix: how to parallelize

jack_folla · November 6, 2009, 11:43am

I need to parallelize a sum like this over a matrix (the thread index can be used to address elements of the matrix):

[codebox]for (unsigned int l = 0; l < (blockSize * blockSize); l++)
{
    accsum    += *(sum + l);
    accsumsqr += *(sumsqr + l);
    accsumqrt += *(sumqrt + l);
}[/codebox]

How can I do this while avoiding bank conflicts?

Quoc_Vinh · November 6, 2009, 3:13pm

The “Reduction” sample in the NVIDIA CUDA SDK is a good reference tutorial for solving your problem.
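For illustration, here is a minimal sketch of the sequential-addressing pattern that this kind of reduction relies on, adapted to the three arrays in the question. It is not the SDK's code. The names `reduceThree`, `sumThree`, `kThreads`, and `partial` are hypothetical, the arrays are assumed to hold `float` values, and the block size is assumed to be a power of two equal to `kThreads`.

```cuda
#include <cuda_runtime.h>

// Assumed threads per block for this sketch; must be a power of two.
constexpr int kThreads = 256;

// Each block reduces its share of the three arrays to one partial sum per array.
// Must be launched with blockDim.x == kThreads.
__global__ void reduceThree(const float* sum, const float* sumsqr,
                            const float* sumqrt, float* partial, int n)
{
    __shared__ float s0[kThreads];
    __shared__ float s1[kThreads];
    __shared__ float s2[kThreads];

    const int tid = threadIdx.x;
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f;

    // Grid-stride loop: consecutive threads read consecutive elements.
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += blockDim.x * gridDim.x) {
        a0 += sum[i];
        a1 += sumsqr[i];
        a2 += sumqrt[i];
    }
    s0[tid] = a0;
    s1[tid] = a1;
    s2[tid] = a2;
    __syncthreads();

    // Sequential addressing: thread tid adds element tid + stride, so the
    // active threads touch consecutive shared-memory words (no bank conflicts).
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            s0[tid] += s0[tid + stride];
            s1[tid] += s1[tid + stride];
            s2[tid] += s2[tid + stride];
        }
        __syncthreads();
    }

    // Thread 0 writes this block's three partial sums.
    if (tid == 0) {
        partial[3 * blockIdx.x + 0] = s0[0];
        partial[3 * blockIdx.x + 1] = s1[0];
        partial[3 * blockIdx.x + 2] = s2[0];
    }
}

// Hypothetical host wrapper: copies the inputs to the device, launches the
// kernel, and adds up the per-block partial sums on the CPU.
// Returns false if a CUDA call fails.
bool sumThree(const float* hSum, const float* hSumsqr, const float* hSumqrt,
              int n, float out[3])
{
    const int blocks = 32;  // arbitrary grid size chosen for this sketch
    const size_t bytes = n * sizeof(float);
    float* dIn = nullptr;
    float* dPartial = nullptr;
    if (cudaMalloc(&dIn, 3 * bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dPartial, 3 * blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(dIn);
        return false;
    }
    cudaMemcpy(dIn,         hSum,    bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dIn + n,     hSumsqr, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dIn + 2 * n, hSumqrt, bytes, cudaMemcpyHostToDevice);

    reduceThree<<<blocks, kThreads>>>(dIn, dIn + n, dIn + 2 * n, dPartial, n);

    float hPartial[3 * blocks];
    const cudaError_t err =
        cudaMemcpy(hPartial, dPartial, sizeof(hPartial), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dPartial);
    if (err != cudaSuccess) return false;

    out[0] = out[1] = out[2] = 0.0f;
    for (int b = 0; b < blocks; ++b) {
        out[0] += hPartial[3 * b + 0];
        out[1] += hPartial[3 * b + 1];
        out[2] += hPartial[3 * b + 2];
    }
    return true;
}
```

Calling `sumThree(sum, sumsqr, sumqrt, blockSize * blockSize, out)` on host arrays would leave the three totals in `out`. If the whole matrix already sits in one block's shared memory, the tree loop alone is the relevant part; the grid-stride loop and the host-side final pass only matter for larger inputs.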

Cygnus_X1 · November 6, 2009, 9:40pm

Also check these sources:
http://www.cse.chalmers.se/~billeter/papers.html - compaction
http://www.cse.chalmers.se/~billeter/pub/pp/index.html - compaction, prefix sum, sorting

Those present one of the fastest algorithms currently available.
