Hello. I'm working on a program that applies a blur algorithm to a matrix. I first stored the matrix in a 1D array in row-major order, which worked fine, but now I want to modify the program to use 2D arrays.
float **h_iA, **d_iA;
h_iA = (float**)malloc(sizeof(float*)*N);
for (int m = 0; m < N; m++)
    h_iA[m] = (float*)malloc(sizeof(float)*N);
InitMat(h_iA, N);
cudaMalloc((void**) &d_iA, sizeof(float*)*N);
for (int u = 0; u < N; u++)
    cudaMalloc((void**) &d_iA[u], sizeof(float)*N);
for (int i = 0; i < N; i++)
    cudaMemcpy(d_iA, h_iA, sizeof(float*)*N, cudaMemcpyHostToDevice);
for (int i = 0; i < N; i++)
    cudaMemcpy(d_iA[i], h_iA[i], sizeof(float)*N, cudaMemcpyHostToDevice);
Is this the proper way to allocate the device memory for d_iA and copy the data from h_iA to it?
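For comparison, here is a minimal sketch of one commonly used pattern for this kind of pointer-to-pointer layout. It rests on the fact that d_iA points to device memory, so host code cannot index it as d_iA[u]. The sketch therefore keeps the device row pointers in a host-side staging array, fills each row, and then copies the pointer table to the device in a single cudaMemcpy. The names h_rows and scale, the size N = 4, and the fill values standing in for InitMat are assumptions made for illustration, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each element, only to show A[r][c] indexing on the device.
__global__ void scale(float **A, int n) {
    int r = blockIdx.x;
    int c = threadIdx.x;
    if (r < n && c < n) A[r][c] *= 2.0f;
}

int main() {
    const int N = 4;  // assumed small size for illustration

    // Host matrix as an array of row pointers (stand-in for InitMat).
    float **h_iA = (float**)malloc(sizeof(float*) * N);
    for (int m = 0; m < N; m++) {
        h_iA[m] = (float*)malloc(sizeof(float) * N);
        for (int n = 0; n < N; n++) h_iA[m][n] = (float)(m * N + n);
    }

    // Host-side staging array that holds DEVICE row pointers.
    float **h_rows = (float**)malloc(sizeof(float*) * N);
    for (int u = 0; u < N; u++) {
        cudaMalloc((void**)&h_rows[u], sizeof(float) * N);
        cudaMemcpy(h_rows[u], h_iA[u], sizeof(float) * N, cudaMemcpyHostToDevice);
    }

    // Device pointer table: copy the device row pointers, not the host ones.
    float **d_iA;
    cudaMalloc((void**)&d_iA, sizeof(float*) * N);
    cudaMemcpy(d_iA, h_rows, sizeof(float*) * N, cudaMemcpyHostToDevice);

    scale<<<N, N>>>(d_iA, N);
    cudaDeviceSynchronize();

    // Copy each row back through the staging array.
    for (int u = 0; u < N; u++)
        cudaMemcpy(h_iA[u], h_rows[u], sizeof(float) * N, cudaMemcpyDeviceToHost);
    printf("h_iA[1][2] after kernel: %f\n", h_iA[1][2]);

    // Release device rows, the device pointer table, and host memory.
    for (int u = 0; u < N; u++) {
        cudaFree(h_rows[u]);
        free(h_iA[u]);
    }
    cudaFree(d_iA);
    free(h_rows);
    free(h_iA);
    return 0;
}
```

This layout costs N separate allocations and N copies in each direction, so a single contiguous buffer (the earlier 1D row-major version, or cudaMallocPitch) is often simpler and faster when the kernel does not strictly need A[r][c] syntax.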