Point-wise multiplication

RyuKa · July 5, 2011, 7:07am

Hello,

I have two matrices and I need to multiply them element by element.

Like this:

__global__ void pixelbypixelmultiplication_kernel(float* d_Data,float* d_Data2,float* d_Product,int data1H, int data1W)
{
	int offset = threadIdx.x + blockIdx.x*blockDim.x;
	if(offset<data1H*data1W)
	{
		d_Product[offset]=d_Data[offset]*d_Data2[offset];
	}
}

Is there any way to do this faster?

avidday · July 5, 2011, 8:27am

There is a lot of setup overhead for 1 FLOP of "real" work in that code. Try having each thread do multiple calculations rather than just one.
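A minimal sketch of what "multiple calculations per thread" could look like, assuming a fixed number of elements per thread. The names ELEMS_PER_THREAD, multiply_multi_kernel and check_multiply_multi, and the value 4, are illustrative assumptions and not part of the original code:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical constant: how many elements each thread handles.
#define ELEMS_PER_THREAD 4

__global__ void multiply_multi_kernel(const float* a, const float* b,
                                      float* out, int n)
{
    // Each block covers blockDim.x * ELEMS_PER_THREAD consecutive elements.
    int base = blockIdx.x * blockDim.x * ELEMS_PER_THREAD + threadIdx.x;

    #pragma unroll
    for (int k = 0; k < ELEMS_PER_THREAD; ++k)
    {
        // Step by blockDim.x so that neighbouring threads touch neighbouring
        // addresses on every iteration (coalesced access).
        int idx = base + k * blockDim.x;
        if (idx < n)
            out[idx] = a[idx] * b[idx];
    }
}

// Hypothetical host helper: runs the kernel on n elements and compares
// the result against the same product computed on the CPU.
bool check_multiply_multi(int n)
{
    std::vector<float> h_a(n), h_b(n), h_out(n);
    for (int i = 0; i < n; ++i) { h_a[i] = 0.5f * i; h_b[i] = 2.0f; }

    size_t bytes = size_t(n) * sizeof(float);
    float *d_a, *d_b, *d_out;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);

    const int T = 512;                          // threads per block
    const int perBlock = T * ELEMS_PER_THREAD;  // elements per block
    const int B = (n + perBlock - 1) / perBlock;
    multiply_multi_kernel<<<B, T>>>(d_a, d_b, d_out, n);

    // This copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    bool ok = true;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != h_a[i] * h_b[i]) ok = false;

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

The bounds check inside the loop handles sizes that are not a multiple of the per-block element count. Whether this is actually faster than one element per thread depends on the hardware and should be measured.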

RyuKa · July 5, 2011, 2:15pm

Okay, thank you !!

I’ll edit my post later.

EDIT:

I tried:

__global__ void pixelbypixelmultiplication_kernel(float* d_Data,float* d_Data2,float* d_Product,int data1H, int data1W)
{
	int offset = threadIdx.x + blockIdx.x*blockDim.x;
	while(offset<data1H*data1W)
	{
		d_Product[offset]=d_Data[offset]*d_Data2[offset];
		offset+=gridDim.x*blockDim.x;
	}
}

and launching with

	const int N = data0W*data0H/8;  // total number of threads
	int T = 512;                    // threads per block
	const int B = (N+T-1)/T;        // number of blocks

instead of N = data0W*data0H (so 8 times fewer blocks), but it changes almost nothing :(
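One plausible explanation, which is an assumption and not something established in this thread, is that the kernel is limited by memory bandwidth rather than by per-thread overhead: every multiply reads two floats and writes one, so 12 bytes are moved per FLOP. A rough way to check is to time the kernel with CUDA events and compare the effective bandwidth with the card's peak. The sketch below does that for a grid-stride kernel like the one above; the names multiply_stride_kernel and measure_bandwidth_gbs are hypothetical:

```cuda
#include <cuda_runtime.h>

// Same grid-stride pattern as the kernel above, on a flat array of n floats.
__global__ void multiply_stride_kernel(const float* a, const float* b,
                                       float* out, int n)
{
    int stride = gridDim.x * blockDim.x;
    for (int i = threadIdx.x + blockIdx.x * blockDim.x; i < n; i += stride)
        out[i] = a[i] * b[i];
}

// Hypothetical helper: times one launch and returns the effective
// bandwidth in GB/s (bytes read plus bytes written, per second).
double measure_bandwidth_gbs(int n, int blocks, int threads)
{
    size_t bytes = size_t(n) * sizeof(float);
    float *d_a, *d_b, *d_out;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_a, 0, bytes);
    cudaMemset(d_b, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time start-up costs are not included in the timing.
    multiply_stride_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);

    cudaEventRecord(start);
    multiply_stride_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);

    // Two reads and one write of 4 bytes each per element.
    double gigabytes = 3.0 * double(bytes) / 1e9;
    return gigabytes / (double(ms) / 1e3);
}
```

Calling measure_bandwidth_gbs with the original and the reduced block count for the same n gives two numbers to compare. If both are already close to the device's peak memory bandwidth, changing the work per thread is unlikely to help much.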
