Problem With GPU Memory

Mugen77 · May 28, 2009, 10:56pm

Hello everyone, I’m having a small problem copying/allocating memory on the GPU.
Here’s my problem:

// I’m creating a Bounding Volume Hierarchy, basically a binary tree, in a raytracing program that uses interoperability with DirectX 9
// (using a texture to display the raytracer’s rendering)

The Program:

__device__ CBVH *cuda_BVH; // Tree on GPU
BVH bvh; // Tree on CPU

// Build the Tree on CPU
BuildBHV(spheres, nbSpheres, bvhDepth, bvh);

// Allocate the Structure on GPU
cutilSafeCall(cudaMalloc((void**) &cuda_BVH, sizeof(CBVH))); // --> WORKS

// Copy the depth attribute of the Tree to GPU
cutilSafeCall(cudaMemcpy(&cuda_BVH->depth, &bvh.depth, sizeof(unsigned int), cudaMemcpyHostToDevice)); // --> WORKS

// Allocate the root node on GPU
cutilSafeCall(cudaMalloc((void**) &cuda_BVH->root, sizeof(CBVHNode))); // --> DOESN’T WORK

Whenever I add the last line, Visual Studio pops up a window at run time, as the program enters the message loop, saying an error occurred. The debugger points to the function
int __cdecl _write_nolock (int fh, const void *buf, unsigned cnt) in “write.c - write to a file handle”, at line 335:

                /* write the lf buf and update total */
                if ( WriteFile( (HANDLE)_osfhnd(fh),
                            lfbuf,
                            (int)(q - lfbuf),
                            (LPDWORD)&written,
                            NULL) )

-----> {
                    charcount += written;
                    if (written < q - lfbuf)
                        break;

Important point: I don’t execute any kernel. This is just the memory initialisation.
THX for your help.

Jamie_K · May 29, 2009, 12:10am

cuda_BVH points to device memory.
cuda_BVH->root is a variable that resides on the device.
&cuda_BVH->root is an address in device memory. It cannot be assigned directly using cudaMalloc. Using cudaMalloc, the pointer to the allocated block must be saved in host memory only. (Though it can later be transferred to the device.)

It’s easy to make this kind of mistake due to the way cudaMalloc returns its result in a void ** parameter.
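
A minimal sketch of the pattern described above, under some assumptions: the CBVH and CBVHNode definitions were not posted, so the structs here are hypothetical stand-ins, and allocDeviceBVH is an illustrative name. Both cudaMalloc calls write into pointers that live in host memory; the root's device address then reaches the device as part of an ordinary cudaMemcpy.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical stand-ins for the thread's types; the real definitions were not posted.
struct CBVHNode { float bmin[3]; float bmax[3]; CBVHNode *left; CBVHNode *right; };
struct CBVH     { unsigned int depth; CBVHNode *root; };

// Allocates a CBVH plus its root node on the device.
// Returns the device address of the CBVH, or NULL if any CUDA call fails.
CBVH *allocDeviceBVH(unsigned int depth)
{
    CBVH     *d_bvh  = NULL;  // host variables that will hold device addresses
    CBVHNode *d_root = NULL;

    // cudaMalloc stores its result through a pointer to HOST memory,
    // so &d_bvh and &d_root are valid targets (unlike &cuda_BVH->root).
    if (cudaMalloc((void **)&d_bvh, sizeof(CBVH)) != cudaSuccess) return NULL;
    if (cudaMalloc((void **)&d_root, sizeof(CBVHNode)) != cudaSuccess) {
        cudaFree(d_bvh);
        return NULL;
    }

    // Build a host-side image of the struct, including the root's device address.
    CBVH h_bvh;
    h_bvh.depth = depth;
    h_bvh.root  = d_root;

    // Zero the root node, then transfer the whole struct (pointer included) in one copy.
    bool ok = cudaMemset(d_root, 0, sizeof(CBVHNode)) == cudaSuccess
           && cudaMemcpy(d_bvh, &h_bvh, sizeof(CBVH), cudaMemcpyHostToDevice) == cudaSuccess;
    if (!ok) {
        cudaFree(d_root);
        cudaFree(d_bvh);
        return NULL;
    }
    return d_bvh;
}
```

The returned pointer can be passed to a kernel as an argument. Every further node would need the same two-step treatment, which is part of why pointer-linked trees are awkward to set up from the host.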

Linked structures in general are difficult to manipulate and transfer. I’d recommend allocating an array of node structures and referring to them by index instead of using individually-allocated pieces.
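
As a rough illustration of the index-based layout, assuming a hypothetical FlatNode struct (the thread does not show the real node type): children are stored as indices into a single array, so the whole tree moves to the device with one cudaMalloc and one cudaMemcpy, and no pointer needs fixing up. The small kernel is only there to show traversal by index.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical node layout: children are indices into one array, -1 means "no child".
struct FlatNode {
    float bmin[3], bmax[3];
    int left;
    int right;
};

// Single-thread kernel that follows left-child indices from the root (index 0)
// and stores the number of steps taken.
__global__ void leftmostDepth(const FlatNode *nodes, int *out)
{
    int depth = 0;
    for (int i = 0; nodes[i].left >= 0; i = nodes[i].left)
        ++depth;
    *out = depth;
}

// Uploads the tree with one cudaMalloc/cudaMemcpy pair and runs the kernel.
// Returns the length of the leftmost path, or -1 if a CUDA call fails.
int leftmostDepthOnDevice(const std::vector<FlatNode> &nodes)
{
    if (nodes.empty()) return -1;
    FlatNode *d_nodes = NULL;
    int *d_out = NULL;
    int result = -1;
    size_t bytes = nodes.size() * sizeof(FlatNode);

    if (cudaMalloc((void **)&d_nodes, bytes) == cudaSuccess &&
        cudaMalloc((void **)&d_out, sizeof(int)) == cudaSuccess &&
        cudaMemcpy(d_nodes, nodes.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess) {
        leftmostDepth<<<1, 1>>>(d_nodes, d_out);
        // The blocking cudaMemcpy below waits for the kernel to finish.
        if (cudaGetLastError() != cudaSuccess ||
            cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
            result = -1;
    }
    cudaFree(d_out);    // cudaFree(NULL) is a no-op
    cudaFree(d_nodes);
    return result;
}
```

Because the indices mean the same thing on host and device, the CPU-built tree can be copied as-is, and copying it back for debugging is just as simple.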

Mugen77 · May 29, 2009, 12:32am

Thanks for the explanation. Very helpful.

I also thought about using an array instead; that seems much more doable. I’ll go try that.

Sarnath · May 29, 2009, 6:23am

Don’t use “cutil”. It is not supported by NVIDIA.

It will change from one CUDA version to another and will break your code.

One fine day, NVIDIA may even stop shipping cutil stuff with CUDA. Nothing is guaranteed for “cutil”.

Feel nervous.
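
For anyone wanting to drop cutilSafeCall, one possible replacement is a small project-local check built only on cudaGetErrorString from the CUDA runtime. This is a sketch: checkCuda and CUDA_CHECK are made-up names, not part of any NVIDIA header.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Reports a failed CUDA runtime call and returns false; returns true on cudaSuccess.
// checkCuda and CUDA_CHECK are hypothetical names, not part of the CUDA toolkit.
inline bool checkCuda(cudaError_t err, const char *expr, const char *file, int line)
{
    if (err == cudaSuccess) return true;
    fprintf(stderr, "%s:%d: %s failed: %s\n", file, line, expr, cudaGetErrorString(err));
    return false;
}

// Wrapper in the spirit of cutilSafeCall: exit the program if the wrapped call fails.
#define CUDA_CHECK(call)                                       \
    do {                                                       \
        if (!checkCuda((call), #call, __FILE__, __LINE__))     \
            exit(EXIT_FAILURE);                                \
    } while (0)
```

Usage would look like CUDA_CHECK(cudaMalloc((void **)&d_ptr, bytes)); where d_ptr is a pointer variable in host memory.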

Cygnus_X1 · May 29, 2009, 7:27am

I implemented a BVH builder recently.
My memory is managed as follows:

one array for all primitives
one array for tree structure (generated by the algorithm) with indexes ‘from’ and ‘to’ describing which primitives it contains
one array for ‘active front’ - list of nodes yet to be processed.

I launch a single kernel for every layer of the tree to be created, with one block per node.
The first few steps are the hardest, because very few blocks are spawned and those are very big, but later ones quickly use the full power of the GPU. The final steps are also demanding: lots of tiny blocks to process.
I believe it is the best strategy for this task.

Currently I have about 400ms construction time on Conference (283k triangles) on my GTX 260 but I am working to make it better…
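
The layout and launch pattern described in this post might look roughly like the sketch below. It is not Cygnus_X1's builder: BuildNode, splitLevel and buildByLevels are hypothetical names, and the "split" simply halves each index range, a placeholder for a real split that would partition the primitives in [from, to) using every thread of the block. The primitive array is left out because the placeholder never reads it.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical node record: primitives [from, to) belong to the node; -1 means no child.
struct BuildNode { int from, to, left, right; };

// One block per node of the active front. Thread 0 halves the node's range,
// standing in for a real split that would use the whole block.
__global__ void splitLevel(BuildNode *nodes, int *nodeCount,
                           const int *front, int *nextFront, int *nextCount, int leafSize)
{
    if (threadIdx.x != 0) return;
    int n = front[blockIdx.x];
    int from = nodes[n].from, to = nodes[n].to;
    if (to - from <= leafSize) return;        // small enough: stays a leaf

    int mid   = from + (to - from) / 2;
    int child = atomicAdd(nodeCount, 2);      // reserve two slots in the node array
    nodes[child].from = from;    nodes[child].to = mid;
    nodes[child].left = -1;      nodes[child].right = -1;
    nodes[child + 1].from = mid; nodes[child + 1].to = to;
    nodes[child + 1].left = -1;  nodes[child + 1].right = -1;
    nodes[n].left = child;       nodes[n].right = child + 1;

    int slot = atomicAdd(nextCount, 2);       // append both children to the next front
    nextFront[slot] = child;
    nextFront[slot + 1] = child + 1;
}

// Builds the tree one level per kernel launch.
// Returns the number of nodes created, or -1 on bad arguments or a CUDA error.
int buildByLevels(int numPrims, int leafSize)
{
    if (numPrims < 1 || leafSize < 1) return -1;
    BuildNode *d_nodes = NULL;
    int *d_front = NULL, *d_next = NULL;
    int *d_counts = NULL;                     // d_counts[0] = node count, d_counts[1] = next front size
    BuildNode root = {0, numPrims, -1, -1};
    int counts[2] = {1, 0}, rootIndex = 0, zero = 0, frontCount = 1, total = -1;
    size_t n = (size_t)numPrims;              // at most 2n-1 nodes, at most n nodes per level

    bool ok = cudaMalloc((void **)&d_nodes, 2 * n * sizeof(BuildNode)) == cudaSuccess
           && cudaMalloc((void **)&d_front, n * sizeof(int)) == cudaSuccess
           && cudaMalloc((void **)&d_next, n * sizeof(int)) == cudaSuccess
           && cudaMalloc((void **)&d_counts, sizeof(counts)) == cudaSuccess
           && cudaMemcpy(d_nodes, &root, sizeof(root), cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_front, &rootIndex, sizeof(int), cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_counts, counts, sizeof(counts), cudaMemcpyHostToDevice) == cudaSuccess;

    while (ok && frontCount > 0) {
        splitLevel<<<frontCount, 32>>>(d_nodes, d_counts, d_front, d_next, d_counts + 1, leafSize);
        // Read the size of the next front, then reset the counter for the following level.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&frontCount, d_counts + 1, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(d_counts + 1, &zero, sizeof(int), cudaMemcpyHostToDevice) == cudaSuccess;
        int *tmp = d_front; d_front = d_next; d_next = tmp;   // next front becomes the active front
    }
    if (ok && cudaMemcpy(&total, d_counts, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
        total = -1;

    cudaFree(d_counts); cudaFree(d_next); cudaFree(d_front); cudaFree(d_nodes);
    return total;
}
```

With 8 primitives and a leaf size of 1, the loop runs four launches with 1, 2, 4 and 8 blocks, which mirrors the observation above that the early levels have few large blocks and the late levels many tiny ones.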
