Double pointer multiplication in CUDA C

YianSti · October 22, 2017, 3:06pm

Hello,
I am trying to multiply two double-pointer arrays that are allocated and filled in pinned memory as follows:

int **A_d;
cudaHostAlloc((void **)&A_d, N*sizeof(int *), cudaHostAllocMapped);

for (i = 0; i < N; i++) {
    cudaHostAlloc((void**)&A_d[i], N*sizeof(int), cudaHostAllocMapped);
    for (j = 0; j < N; j++) {
        A_d[i][j] = A[i][j];
    }
}

The gpu code is:

__global__ void kernel(int **A_d, int **B_d, int **C_d, int N)
{
    int i = threadIdx.x + blockIdx.x*blockDim.x;
    int j = threadIdx.y + blockIdx.y*blockDim.y;
    if ((i < N) && (j < N)) {
        int sum = 0;
        for (int k = 0; k < N; k++)
        {
            sum = sum + A_d[i][k]*B_d[k][j];
        }
        C_d[i][j] = sum;
    }
}

With small arrays (N=3, N=256) I get correct results. But with a bigger size such as N=1024 (Nrows=1024, Ncols=1024) the result is not correct.
Any ideas?
Thank you in advance!

svennevs · October 23, 2017, 1:12pm

int i=threadIdx.x+blockIdx.x*blockDim.x;
int j=threadIdx.y+blockIdx.y*blockDim.y;

I’d be willing to bet it’s how you are launching the kernel or indexing things. Once N gets large enough, more than one warp gets launched. Previously only one was, which may be why the smaller problems come out correct?

Shot in the dark, but seems likely.
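
Since the question does not show the launch itself, here is a minimal sketch of one way to launch this kernel so that the grid covers any N, and to check for errors before reading the result on the host. The names blocks_for, TILE, alloc_matrix, free_matrix and run_matmul_check are made up for this example and are not from the original post. It also assumes a 64-bit platform with unified virtual addressing, so that pointers returned by cudaHostAlloc can be passed to the kernel directly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical tile width for this sketch: 16 x 16 = 256 threads per block,
// which stays under the 1024 threads-per-block limit.
#define TILE 16

// Blocks needed to cover n elements with `tile` threads each (ceiling division).
static int blocks_for(int n, int tile) { return (n + tile - 1) / tile; }

// Same kernel as in the question, with the index arithmetic written out in full.
__global__ void kernel(int **A_d, int **B_d, int **C_d, int N)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    int j = threadIdx.y + blockIdx.y * blockDim.y;
    if (i < N && j < N) {
        int sum = 0;
        for (int k = 0; k < N; k++)
            sum += A_d[i][k] * B_d[k][j];
        C_d[i][j] = sum;
    }
}

// Allocate an n x n matrix as mapped pinned rows, every entry set to `value`.
// Returns NULL on failure (rows already allocated are not released; sketch only).
static int **alloc_matrix(int n, int value)
{
    int **m = NULL;
    if (cudaHostAlloc((void **)&m, n * sizeof(int *), cudaHostAllocMapped) != cudaSuccess)
        return NULL;
    for (int i = 0; i < n; i++) {
        if (cudaHostAlloc((void **)&m[i], n * sizeof(int), cudaHostAllocMapped) != cudaSuccess)
            return NULL;
        for (int j = 0; j < n; j++)
            m[i][j] = value;
    }
    return m;
}

static void free_matrix(int **m, int n)
{
    for (int i = 0; i < n; i++)
        cudaFreeHost(m[i]);
    cudaFreeHost(m);
}

// Multiplies an all-ones matrix by an all-twos matrix, so every entry of the
// product should equal 2*n. Returns the number of wrong entries, or -1 on a
// CUDA error.
int run_matmul_check(int n)
{
    int **A = alloc_matrix(n, 1);
    int **B = alloc_matrix(n, 2);
    int **C = alloc_matrix(n, 0);
    if (!A || !B || !C)
        return -1;

    dim3 block(TILE, TILE);
    dim3 grid(blocks_for(n, TILE), blocks_for(n, TILE));
    kernel<<<grid, block>>>(A, B, C, n);

    // The launch is asynchronous: check that it was accepted, then wait for
    // the kernel to finish before the host touches C.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }

    int wrong = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (C[i][j] != 2 * n)
                wrong++;

    free_matrix(A, n);
    free_matrix(B, n);
    free_matrix(C, n);
    return wrong;
}
```

Calling run_matmul_check(1024) from your own main and printing the return value gives a quick check: a nonzero count means wrong entries, and -1 comes with the CUDA error string on stderr. If the original code reads C_d without a cudaDeviceSynchronize() (or other synchronizing call) after the launch, that alone could explain results that are right for small N and wrong for large N, but that is a guess without seeing the host code.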
