cudaMalloc issue

cudaMancpy · January 23, 2018, 1:55am

Hello.

I have a question about cudaMalloc.
I have a structure of 24 bytes (two unsigned ints, two floats, and a pointer).
I already asked a cudaMalloc question before, but I didn’t understand the answer, so I need some help.

This structure is used in a PageRank algorithm and is called vertex.
First, an array of vertex structures is allocated, one per node (vertex).
Second, for each vertex, successors is allocated with one entry per adjacent node (it is just an array of pointers for reaching the adjacent nodes).

I understand how to do this with malloc in C, but I don’t know how to allocate it in CUDA.
I have been stuck on this for three days. Please help me. It’s really hard.

-------------------------------- structure

struct vertex {
	unsigned int vertex_id;
	float pagerank;
	float pagerank_next;
	unsigned int n_successors;
	vertex ** successors;
};

-------------------------------- allocation in C

vertices = (vertex *)malloc(n_vertices*sizeof(vertex));

for (i = 0; i < n_vertices; i++) {
	if (vertices[i].n_successors > 0) {
		vertices[i].successors = (vertex **)malloc(vertices[i].n_successors * sizeof(vertex *));

		if (!vertices[i].successors) {
			fprintf(stderr, "Malloc failed for successors of vertex %d.\n", i);
			exit(-1);
		}
		memset((void *)vertices[i].successors, 0, (size_t)(vertices[i].n_successors * sizeof(vertex *)));
	}
	else
		vertices[i].successors = NULL;
}

-------------------------------- before allocation in CUDA, filling in the successor pointers on the host

while (fscanf(fp, "%d %d", &vertex_from, &vertex_to) != EOF) {
	for (i = 0; i < vertices[vertex_from].n_successors; i++) {
		if (vertices[vertex_from].successors[i] == NULL) {
			vertices[vertex_from].successors[i] = &vertices[vertex_to];
			break;
		}
		else if (i == vertices[vertex_from].n_successors - 1) {
			printf("Setting up the successor pointers of vertex %u failed", vertex_from);
			return -1;
		}
	}
}

-------------------------------- allocation in CUDA

for (i = 0; i<n_vertices; i++) {
	if (vertices[i].n_successors > 0) {
		vertex ** d_testVar;
		cudaMalloc(&d_testVar, vertices[i].n_successors * sizeof(vertex*));
		
		if (!vertices[i].successors) {
			fprintf(stderr, "Malloc failed for successors of vertex %d.\n", i);
			exit(-1);
		}

		cudaMemcpy(&d_testVar, &vertices[i].successors, vertices[i].n_successors * sizeof(vertex*), cudaMemcpyHostToDevice);
		cudaMemcpy(&(d_vertices[i].successors), &d_testVar, sizeof(vertex**), cudaMemcpyDeviceToDevice);
	}
	else
		cudaMemcpy(&(d_vertices[i].successors), &(vertices[i].successors), sizeof(vertex**), cudaMemcpyHostToDevice);
}

Robert_Crovella · January 23, 2018, 2:39am

Are you doing proper CUDA error checking? It doesn’t look like it.

cudaMancpy · January 23, 2018, 3:54am

Sure. This is the error-checking code.

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true)
{
	if (code != cudaSuccess)
	{
		fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
		if (abort) exit(code);
	}
}

The message is “invalid argument”, reported at this line:
cudaMemcpy(&(d_vertices[i].successors), &d_testVar, sizeof(vertex**), cudaMemcpyDeviceToDevice);

Robert_Crovella · January 23, 2018, 4:01am

That’s because &d_testVar is a host address, so the correct copy direction is cudaMemcpyHostToDevice.

You have a much bigger issue on the previous line. It has to do with fixup/conversion of host pointers to device pointers:

cudaMemcpy(&d_testVar, &vertices[i].successors, vertices[i].n_successors * sizeof(vertex*), cudaMemcpyHostToDevice);

The &d_testVar is incorrect there; it should just be d_testVar (and you should have gotten an error about that as well).

But the bigger issue is that the pointers you are copying are presumably pointers to host addresses, and they will be useless on the device/in device code. They need to be converted.
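As a minimal sketch of what that conversion could look like, assuming every successor pointer points into the host vertices array and that d_vertices mirrors it element for element: each host pointer is turned into an index, and the same index is applied to the device base address. The helper name translate_successors is hypothetical and is not from the thread. It is host-only code and makes no CUDA calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same layout as the struct in the question.
struct vertex {
    unsigned int vertex_id;
    float pagerank;
    float pagerank_next;
    unsigned int n_successors;
    vertex **successors;
};

// Hypothetical helper (not from the thread): builds a copy of one vertex's
// successor array whose entries point into the device array.
// h_base is the host `vertices` array, d_base the device `d_vertices` array.
// d_base is used for address arithmetic only and is never dereferenced here.
std::vector<vertex *> translate_successors(const vertex &v,
                                           const vertex *h_base,
                                           vertex *d_base)
{
    std::vector<vertex *> fixed(v.n_successors, nullptr);
    for (unsigned int k = 0; k < v.n_successors; ++k) {
        if (v.successors[k] == nullptr) continue;       // unfilled slot stays null
        std::ptrdiff_t idx = v.successors[k] - h_base;  // index of the target vertex
        fixed[k] = d_base + idx;                        // same index, device base
    }
    return fixed;
}
```

With the translated array in hand, the per-vertex copy would plausibly become cudaMemcpy(d_testVar, fixed.data(), fixed.size() * sizeof(vertex *), cudaMemcpyHostToDevice), where the source is the array itself rather than the address of the pointer field. The address held in d_testVar is then stored with cudaMemcpy(&(d_vertices[i].successors), &d_testVar, sizeof(vertex **), cudaMemcpyHostToDevice).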

That’s as far as I’ll go without a complete test code to look at. The complexity here is what causes many people to think about alternate data organization schemes.
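The reply does not name a specific alternative. One common choice, offered here only as an assumption, is a flat layout in the style of compressed sparse row (CSR): successors are stored as indices in one contiguous array, with a second array of per-vertex offsets. The sketch below converts the pointer-based host graph into that form. FlatGraph and flatten are illustrative names, and the code assumes every successor slot has been filled with a pointer into the vertices array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same layout as the struct in the question.
struct vertex {
    unsigned int vertex_id;
    float pagerank;
    float pagerank_next;
    unsigned int n_successors;
    vertex **successors;
};

// Hypothetical flat layout (not from the thread).
struct FlatGraph {
    std::vector<unsigned int> offsets;     // n_vertices + 1 entries
    std::vector<unsigned int> successors;  // all successor indices, concatenated
};

// Successors of vertex i end up in successors[offsets[i] .. offsets[i + 1] - 1].
// Assumes no successor slot is null.
FlatGraph flatten(const vertex *vertices, unsigned int n_vertices)
{
    FlatGraph g;
    g.offsets.assign(n_vertices + 1, 0);
    for (unsigned int i = 0; i < n_vertices; ++i)
        g.offsets[i + 1] = g.offsets[i] + vertices[i].n_successors;

    g.successors.reserve(g.offsets[n_vertices]);
    for (unsigned int i = 0; i < n_vertices; ++i)
        for (unsigned int k = 0; k < vertices[i].n_successors; ++k)
            g.successors.push_back(
                static_cast<unsigned int>(vertices[i].successors[k] - vertices));
    return g;
}
```

Because the two arrays hold indices rather than addresses, each can be moved to the device with a single cudaMalloc and a single cudaMemcpy, and no per-vertex pointer fixup is needed.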

cudaMancpy · January 23, 2018, 4:49am

Thanks a lot!
I have found the right code!
