Hey coders! I want to play with 2D arrays in CUDA. I googled a lot about cudaMallocPitch() but didn't find any good example. Can any one of you provide a good example of cudaMallocPitch() used together with cudaMemcpy2D()?
I would like to run a kernel like this:
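__global__ void kernel(float* a, float* b, float* c, int n, int m, int p)
{
    // Judging by the indexing: a is n x m, b is n x p, c is p x m, column-major.
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // row of the result
    int j = blockIdx.y;                             // column of the result (one block per column)
    // Skip threads that fall outside the n x m result.
    if (i < n && j < m)
    {
        float sum = 0.0f;
        for (int k = 0; k < p; ++k)
        {
            sum += b[i + n * k] * c[k + p * j];
        }
        a[i + n * j] = sum;
    }
}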
If there is something wrong with my kernel, please tell me. I am new to CUDA and parallel processing, so bear with me if anything is off.
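A minimal sketch of how the cudaMallocPitch() + cudaMemcpy2D() pattern might fit the kernel above. It assumes the same column-major indexing (a[i + n*j]), so each pitched "row" holds one column of the matrix. The names pitchedMatMul, runPitchedExample and CUDA_TRY, the matrix sizes and the 16 x 16 block shape are made up for illustration; treat this as a starting point, not a tuned implementation.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: print the error and bail out of the calling function.
// (Device buffers are not freed on this early-exit path; fine for a sketch.)
#define CUDA_TRY(call)                                                  \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s\n", cudaGetErrorString(err_));     \
            return 1;                                                   \
        }                                                               \
    } while (0)

// Same product as the original kernel, but every matrix is pitched.
// The pitch is in BYTES, so step through a char* before casting to float*.
__global__ void pitchedMatMul(float* a, size_t pitchA,
                              const float* b, size_t pitchB,
                              const float* c, size_t pitchC,
                              int n, int m, int p)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < n && j < m) {
        const float* cLine = (const float*)((const char*)c + j * pitchC);
        float sum = 0.0f;
        for (int k = 0; k < p; ++k) {
            const float* bLine = (const float*)((const char*)b + k * pitchB);
            sum += bLine[i] * cLine[k];            // b[i + n*k] * c[k + p*j]
        }
        float* aLine = (float*)((char*)a + j * pitchA);
        aLine[i] = sum;                            // a[i + n*j]
    }
}

// Returns 0 if the GPU result matches a CPU reference, 1 otherwise.
int runPitchedExample()
{
    const int n = 5, m = 4, p = 3;                 // illustrative sizes
    std::vector<float> hA(n * m, 0.0f), hB(n * p), hC(p * m);
    for (int x = 0; x < n * p; ++x) hB[x] = 0.5f * x;
    for (int x = 0; x < p * m; ++x) hC[x] = 1.0f + x;

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    size_t pitchA = 0, pitchB = 0, pitchC = 0;
    // Width is given in bytes, height in number of pitched lines.
    CUDA_TRY(cudaMallocPitch((void**)&dA, &pitchA, n * sizeof(float), m));
    CUDA_TRY(cudaMallocPitch((void**)&dB, &pitchB, n * sizeof(float), p));
    CUDA_TRY(cudaMallocPitch((void**)&dC, &pitchC, p * sizeof(float), m));

    // The host arrays are densely packed, so the host pitch equals the width.
    CUDA_TRY(cudaMemcpy2D(dB, pitchB, hB.data(), n * sizeof(float),
                          n * sizeof(float), p, cudaMemcpyHostToDevice));
    CUDA_TRY(cudaMemcpy2D(dC, pitchC, hC.data(), p * sizeof(float),
                          p * sizeof(float), m, cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (m + block.y - 1) / block.y);
    pitchedMatMul<<<grid, block>>>(dA, pitchA, dB, pitchB, dC, pitchC, n, m, p);
    CUDA_TRY(cudaGetLastError());
    CUDA_TRY(cudaDeviceSynchronize());

    CUDA_TRY(cudaMemcpy2D(hA.data(), n * sizeof(float), dA, pitchA,
                          n * sizeof(float), m, cudaMemcpyDeviceToHost));

    // Compare against a plain CPU loop using the original indexing.
    int status = 0;
    for (int j = 0; j < m; ++j)
        for (int i = 0; i < n; ++i) {
            float ref = 0.0f;
            for (int k = 0; k < p; ++k) ref += hB[i + n * k] * hC[k + p * j];
            if (std::fabs(ref - hA[i + n * j]) > 1e-4f) status = 1;
        }

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return status;
}
```

To try it, call runPitchedExample() from main() and build with nvcc. The part that is easy to get wrong is that the pitch returned by cudaMallocPitch() is in bytes and can be larger than the requested width, so device code has to step from line to line with the pitch rather than with n or p.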
LONG LIVE CODERS :)