Make a kernel

antothenewbi · July 17, 2009, 12:54pm

Could someone help me with my kernel? I want to implement this function:

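void Removemean(float* rfData)
{
    // nmpts is my number of columns
    // nmlne is my number of rows

    float avg = 0.0f;
    for (int i = 0; i < nmlne; i++)
    {
        avg = 0.0f;
        // Sum row i, then divide by the row length to get its mean
        for (int j = 0; j < nmpts; j++) avg += rfData[i * nmpts + j];
        avg /= nmpts;
        // Subtract the mean from every element of row i
        for (int j = 0; j < nmpts; j++) rfData[i * nmpts + j] -= avg;
    }
}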

I have a kernel that works, but it is not fast enough (its memory accesses are not coalesced).

How can I do that?
Many thanks

Philipp82 · July 17, 2009, 3:18pm

Hi,

Your problem is not very easy to parallelize, because the calculation of the average value can’t easily be done in parallel. You cannot just let all threads write to the same position…
Maybe it’s best to use the CPU, or to calculate the average value on the host and do the rest on the GPU. But I guess that with all the data transfer that won’t be very good. Maybe someone can explain to you how to do a reduction for the calculation of the average… (I can’t, because I never did that.)

Philipp.

jack · July 17, 2009, 3:58pm

I’d say that your best bet is to have a kernel that calculates the average value for each row and stores it in an array in global memory (i.e. ‘reduces’ the matrix to a column vector), then have another kernel that reads the average value for each row and subtracts it from each element in the row (i.e. subtracting that column vector from each column vector in the matrix and storing it back in the matrix column).
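
The two-kernel approach described above could be sketched roughly as follows. This is only an illustration under several assumptions: the matrix is stored row-major on the device with nmlne rows and nmpts columns, one thread block handles one row, and the names rowMeanKernel, subtractMeanKernel, removeMean and the block size kThreads are made up for this example rather than taken from the original poster's code.

```cuda
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // assumed block size; must be a power of two

// Kernel 1: one block per row. Each thread sums a strided slice of the row
// (neighbouring threads read neighbouring addresses, so the loads coalesce),
// then the block combines the partial sums in shared memory.
__global__ void rowMeanKernel(const float* data, float* rowMean, int nmpts)
{
    __shared__ float partial[kThreads];
    const float* row = data + (size_t)blockIdx.x * nmpts;

    float sum = 0.0f;
    for (int j = threadIdx.x; j < nmpts; j += blockDim.x)
        sum += row[j];
    partial[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction: halve the number of active threads at each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            partial[threadIdx.x] += partial[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        rowMean[blockIdx.x] = partial[0] / nmpts;
}

// Kernel 2: one block per row; subtract that row's mean from every element.
__global__ void subtractMeanKernel(float* data, const float* rowMean, int nmpts)
{
    float* row = data + (size_t)blockIdx.x * nmpts;
    const float avg = rowMean[blockIdx.x];
    for (int j = threadIdx.x; j < nmpts; j += blockDim.x)
        row[j] -= avg;
}

// Hypothetical host wrapper; d_rfData must already be a device pointer.
cudaError_t removeMean(float* d_rfData, int nmlne, int nmpts)
{
    float* d_rowMean = nullptr;
    cudaError_t err = cudaMalloc(&d_rowMean, nmlne * sizeof(float));
    if (err != cudaSuccess) return err;

    rowMeanKernel<<<nmlne, kThreads>>>(d_rfData, d_rowMean, nmpts);
    subtractMeanKernel<<<nmlne, kThreads>>>(d_rfData, d_rowMean, nmpts);

    err = cudaGetLastError();                               // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
    cudaFree(d_rowMean);
    return err;
}
```

Because each block owns a whole row here, the two steps could probably also be fused into a single kernel; they are kept separate only to mirror the suggestion above. Whether one block per row is the best mapping depends on how large nmpts and nmlne are.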
