How can I call the global function when I use a 3d array

mtf_sn_1996 · May 5, 2021, 11:22pm

I’m working on a project where I need to store arrays of different sizes inside one array, so I allocate each sub-array with its own size. Below are the loops where I allocate the memory; they run without problems.

double*** d_A = new double**[Asize];
for (int i = 0; i < Asize; i++) {
	int twoSize = A[i][0][3];
	d_A[i] = new double*[twoSize];
	for (int j = 0; j < twoSize; j++) {
		d_A[i][j] = new double[4];
	}
}
for (int i = 0; i < Asize; i++) {
	int twoSize = A[i][0][3];
	for (int j = 0; j < twoSize; j++) {
		cudaMalloc(&d_A[i][j], 4 * sizeof(double));
	}
}

Should I now pass ‘d_A’ or ‘&d_A’ to the global function (kernel) I am going to call?

Robert_Crovella · May 5, 2021, 11:34pm

Wow, fun. You pretty much never pass & of anything as a CUDA kernel argument. The address of a host variable is always a pointer to host memory. That is never usable in CUDA device code. The only possible exception would be if using managed memory.

Here is a worked example of how to do it (the second code example in that answer). That isn’t actually demonstrating variable row length, but I assume you can figure that part out.

For a collection of other approaches to multi-dimensional array handling, see here.

mtf_sn_1996 · May 6, 2021, 12:29am

Thank you for your response. Is there anything wrong with this code? I don’t get any errors when I run it.

In other parts of the code, the data comes in fragments. Fragmented and different in size. Is there any other way I can do this?

Although I tried both ‘d_A’ and ‘&d_A’ when calling it, the global function was not called.

Robert_Crovella · May 6, 2021, 12:41am

Yes, there are problems with your code. The first is that your d_A pointer (if that is what you are passing to the kernel) cannot be allocated with new. No pointer that you dereference in device code can be allocated with new. I suggest studying the link I gave you. It’s a complex procedure to make this work.

Regarding other methods, I gave you a link with some suggestions. The most canonical (and probably best) suggestion is to flatten your data. For uneven length rows, this means you’ll need an array of row start offsets. Something like this
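A minimal host-side sketch of that flattened layout, assuming (as in the original post) that every row holds 4 doubles and only the number of rows per block varies. The names FlatBlocks, flatten and flatIndex are made up for illustration and do not come from any library. It is plain C++ with no CUDA calls: once built, data and offsets are two ordinary contiguous buffers, so each could be copied to the device with a single cudaMalloc/cudaMemcpy pair and indexed in the kernel with the same arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption taken from the original post: each row has exactly 4 doubles.
constexpr std::size_t kRowWidth = 4;

// Hypothetical container for the flattened data (illustrative name).
struct FlatBlocks {
    std::vector<double> data;          // all rows of all blocks, back to back
    std::vector<std::size_t> offsets;  // offsets[i] = index in data where block i starts;
                                       // the last entry equals data.size()
};

// Copy a ragged [block][row][4] structure into one contiguous buffer
// and record where each block begins.
FlatBlocks flatten(const std::vector<std::vector<std::vector<double>>>& blocks) {
    FlatBlocks flat;
    flat.offsets.reserve(blocks.size() + 1);
    for (const auto& block : blocks) {
        flat.offsets.push_back(flat.data.size());
        for (const auto& row : block) {
            assert(row.size() == kRowWidth);
            flat.data.insert(flat.data.end(), row.begin(), row.end());
        }
    }
    flat.offsets.push_back(flat.data.size());  // end marker
    return flat;
}

// Index of element k in row j of block i inside flat.data.
// The same expression works in device code on the copied buffers.
inline std::size_t flatIndex(const FlatBlocks& flat,
                             std::size_t i, std::size_t j, std::size_t k) {
    return flat.offsets[i] + j * kRowWidth + k;
}
```

For example, two blocks with 1 and 3 rows give 16 doubles in data and offsets {0, 4, 16}, so element [1][2][3] sits at index 4 + 2*4 + 3 = 15. The number of rows in block i can be recovered as (offsets[i+1] - offsets[i]) / 4, so no per-row pointers are needed at all.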

mtf_sn_1996 · May 6, 2021, 12:46am

Why don’t I get an error when I run it? And why does d_A take up space in the video card’s memory?

Robert_Crovella · May 6, 2021, 12:48am

If your code is working to your satisfaction then full speed ahead! No need to ask me about it. I assumed you were having trouble based on statements like this:

If you want to see the errors that are resulting in your global function not being called, use proper CUDA error checking (just google that) and run your code with cuda-memcheck. If neither of those report errors, then that is good.

And anytime you do a cudaMalloc operation, it takes up space in video card memory. Even if your approach is wrong.

mtf_sn_1996 · May 6, 2021, 12:55am

You got it right. I think I misrepresented it. :)

So I may think the code is working correctly. I’ll review the response in the link and try to integrate it in my own way. Thank you for your support. :)

mtf_sn_1996 · May 6, 2021, 12:59am

Should I use ‘malloc’ instead of the ‘new’ operator when I want to create an array?

Robert_Crovella · May 6, 2021, 1:48am

either one will work.

mtf_sn_1996 · May 7, 2021, 1:21pm

How can I assign a value to the code in this link in main? the link

Robert_Crovella · May 7, 2021, 1:24pm

I don’t know what “assign a value to the code” means. The kernel in that example demonstrates how to assign a value to the 3D array in device code.

mtf_sn_1996 · May 7, 2021, 1:30pm

#include <cstdio>
#include <cstdlib>

// Report a CUDA runtime error with its source location and optionally abort.
inline void GPUassert(cudaError_t code, const char* file, int line, bool Abort = true)
{
	if (code != cudaSuccess) {
		fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
		if (Abort) exit(code);
	}
}

#define GPUerrchk(ans) { GPUassert((ans), __FILE__, __LINE__); }

// Print every element of the 2x2x2 array from a single device thread.
__global__ void doSmth(int*** a) {
	//int threadID = blockDim.x * blockIdx.x + threadIdx.x;
	for (int i = 0; i < 2; i++)
		for (int j = 0; j < 2; j++)
			for (int k = 0; k < 2; k++)
				printf("[%d][%d][%d]=%d\n", i, j, k, a[i][j][k]);
}

int main() {
	// Host arrays of pointers; the innermost pointers are device allocations.
	int*** h_c = (int***)malloc(2 * sizeof(int**));
	for (int i = 0; i < 2; i++) {
		h_c[i] = (int**)malloc(2 * sizeof(int*));
		for (int j = 0; j < 2; j++)
			GPUerrchk(cudaMalloc((void**)&h_c[i][j], 2 * sizeof(int)));
	}
	// Device copies of the middle-level pointer arrays.
	int*** h_c1 = (int***)malloc(2 * sizeof(int**));
	for (int i = 0; i < 2; i++) {
		GPUerrchk(cudaMalloc((void***)&(h_c1[i]), 2 * sizeof(int*)));
		GPUerrchk(cudaMemcpy(h_c1[i], h_c[i], 2 * sizeof(int*), cudaMemcpyHostToDevice));
	}
	// Host-side assignment I would like to make (h_c[i][j] holds a device pointer).
	for (int i = 0; i < 2; i++)
		for (int j = 0; j < 2; j++)
			for (int k = 0; k < 2; k++)
				h_c[i][j][k] = i + j + k;
	// Device copy of the top-level pointer array, passed to the kernel.
	int*** d_c;
	GPUerrchk(cudaMalloc((void****)&d_c, 2 * sizeof(int**)));
	GPUerrchk(cudaMemcpy(d_c, h_c1, 2 * sizeof(int**), cudaMemcpyHostToDevice));
	doSmth<<<1, 1>>>(d_c);
	GPUerrchk(cudaPeekAtLastError());
	GPUerrchk(cudaDeviceSynchronize());
	// Copy the innermost rows back to the host.
	int res[2][2][2];
	for (int i = 0; i < 2; i++)
		for (int j = 0; j < 2; j++)
			GPUerrchk(cudaMemcpy(&res[i][j][0], h_c[i][j], 2 * sizeof(int), cudaMemcpyDeviceToHost));
}

I meant to say “assign a value to the array”, like in the code above.
