CUDA Ray Tracing - error when the mesh has many faces

Hello, everyone. I’m currently building a BVH ray tracer with CUDA. When I try to do the cudaMemcpy from device to host, it gives me ‘unspecified launch error’. There is no recursion in my code. The code worked well when I ran it with a 12-face mesh, but it gave the above error when I ran it with a 33-face mesh. Can someone please help me?

Thanks a lot.

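It’s likely that your kernel (the one immediately preceding the cudaMemcpy that is reporting the error) is performing an invalid operation of some sort once you increase the mesh to 33 faces. Kernel launches are asynchronous, so a fault inside the kernel is often only reported by the next blocking call, which here is the cudaMemcpy.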

  1. Use proper CUDA error checking throughout your code
  2. Run your code with cuda-memcheck. You should get an indication of the actual problem in the kernel
  3. Follow the procedure here:

     and recompile (with -lineinfo) and run your code with cuda-memcheck again to have cuda-memcheck report the actual line of kernel code that is causing the fault
  4. Use printf or other debugging techniques to further expose the nature of the problem in that line of code.
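
As a rough illustration of steps 1 and 4, here is a minimal sketch of one common error-checking pattern. The macro name CUDA_CHECK, the kernel shadeFaces and the host function runShadeFaces are made up for this example and are not taken from the original poster's code; the kernel body is only a placeholder.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: print the error string with file/line and abort.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Placeholder kernel: one thread per face, with a bounds guard so that
// threads beyond numFaces do not touch memory.
__global__ void shadeFaces(float *out, int numFaces)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numFaces) {
        out[i] = 2.0f * i;
        // Device-side printf can help narrow down a faulting line, e.g.:
        // if (i == 0) printf("numFaces=%d\n", numFaces);
    }
}

// Launches the kernel for numFaces faces and returns the last value copied back.
float runShadeFaces(int numFaces)
{
    float *d_out = nullptr;
    float *h_out = (float *)malloc(numFaces * sizeof(float));
    if (h_out == nullptr) exit(EXIT_FAILURE);
    CUDA_CHECK(cudaMalloc(&d_out, numFaces * sizeof(float)));

    int threads = 32;
    int blocks = (numFaces + threads - 1) / threads;
    shadeFaces<<<blocks, threads>>>(d_out, numFaces);
    CUDA_CHECK(cudaGetLastError());       // reports launch-configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports faults raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(h_out, d_out, numFaces * sizeof(float),
                          cudaMemcpyDeviceToHost));
    float last = h_out[numFaces - 1];
    CUDA_CHECK(cudaFree(d_out));
    free(h_out);
    return last;
}
```

With the two checks right after the launch, a kernel fault is reported at the kernel rather than at a later cudaMemcpy. Assuming the file is called app.cu, something like `nvcc -lineinfo -o app app.cu` followed by `cuda-memcheck ./app` is the usual way to combine this with steps 2 and 3; on recent toolkits compute-sanitizer fills the role of cuda-memcheck.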

Hello, txbob
I ran cuda-memcheck and found my mistakes.

Is this a closed project, or can you make it open later? I’d be very interested in seeing a BVH implementation in CUDA.

Hello, cbuchner1
I guess it’s closed, since it is my bachelor’s thesis… But I’ll check my campus’s regulations about this later.

I’d be curious as well. I’m looking at the wiki now. Maybe they use one thread to solve a pair-wise intersection test?

The issue is, imo, that for this to be effective on the GPU, the tree needs to be relatively shallow and wide. I’m imagining one thread per node at a certain depth in the tree. I’d be curious to see how this fares against a well-implemented CPU version.

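A CPU version might be better in the sense that various geometric objects will have their own intersection routines, which works against the SIMT execution model of CUDA: threads in the same warp that take different branches get serialized. That matters less if all your objects are of the same type.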
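
To make the divergence concern concrete, here is a small sketch with one thread per primitive and a per-type branch. Everything in it (Prim, hitSphere, hitBox, intersectAll, countHitsDemo) is hypothetical and only for illustration; it is not how the original poster's tracer works, and the slab test is the textbook version without special handling of edge cases.

```cuda
#include <cfloat>
#include <cuda_runtime.h>

// Hypothetical primitive layout, for illustration only.
enum PrimType { SPHERE = 0, BOX = 1 };

struct Prim {
    int    type;
    float3 a;   // sphere: center           box: min corner
    float3 b;   // sphere: radius in b.x    box: max corner
};

// Ray/sphere test; the direction d is assumed to be normalized.
__device__ bool hitSphere(const Prim &p, float3 o, float3 d)
{
    float3 oc = make_float3(o.x - p.a.x, o.y - p.a.y, o.z - p.a.z);
    float b = oc.x * d.x + oc.y * d.y + oc.z * d.z;
    float c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - p.b.x * p.b.x;
    float disc = b * b - c;
    return disc >= 0.0f && (-b + sqrtf(disc)) >= 0.0f;
}

// Clip the ray interval [tmin, tmax] against one axis-aligned slab.
__device__ void slab(float lo, float hi, float o, float d, float &tmin, float &tmax)
{
    float inv = 1.0f / d;
    float t0 = (lo - o) * inv, t1 = (hi - o) * inv;
    if (inv < 0.0f) { float t = t0; t0 = t1; t1 = t; }
    tmin = fmaxf(tmin, t0);
    tmax = fminf(tmax, t1);
}

// Ray/box slab test, restricted to t >= 0.
__device__ bool hitBox(const Prim &p, float3 o, float3 d)
{
    float tmin = 0.0f, tmax = FLT_MAX;
    slab(p.a.x, p.b.x, o.x, d.x, tmin, tmax);
    slab(p.a.y, p.b.y, o.y, d.y, tmin, tmax);
    slab(p.a.z, p.b.z, o.z, d.z, tmin, tmax);
    return tmax >= tmin;
}

__global__ void intersectAll(const Prim *prims, int n, float3 o, float3 d, int *hits)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    bool hit;
    // Threads of one warp may take different branches here; the warp then
    // runs both paths one after the other instead of in parallel.
    switch (prims[i].type) {
        case SPHERE: hit = hitSphere(prims[i], o, d); break;
        default:     hit = hitBox(prims[i], o, d);    break;
    }
    hits[i] = hit ? 1 : 0;
}

// Host-side demo: one ray along +z against two spheres and two boxes.
// Returns the number of primitives hit, or -1 on a CUDA error.
int countHitsDemo()
{
    Prim h[4] = {
        { SPHERE, make_float3( 0.0f,  0.0f, 5.0f), make_float3(1.0f, 0.0f, 0.0f) },
        { SPHERE, make_float3( 3.0f,  0.0f, 5.0f), make_float3(1.0f, 0.0f, 0.0f) },
        { BOX,    make_float3(-1.0f, -1.0f, 2.0f), make_float3(1.0f, 1.0f, 3.0f) },
        { BOX,    make_float3( 2.0f,  2.0f, 2.0f), make_float3(3.0f, 3.0f, 3.0f) },
    };
    Prim *dPrims = nullptr;
    int *dHits = nullptr;
    int hHits[4] = {0, 0, 0, 0};
    if (cudaMalloc(&dPrims, sizeof(h)) != cudaSuccess) return -1;
    if (cudaMalloc(&dHits, sizeof(hHits)) != cudaSuccess) { cudaFree(dPrims); return -1; }
    cudaMemcpy(dPrims, h, sizeof(h), cudaMemcpyHostToDevice);
    intersectAll<<<1, 32>>>(dPrims, 4, make_float3(0.0f, 0.0f, 0.0f),
                            make_float3(0.0f, 0.0f, 1.0f), dHits);
    // This blocking copy also reports any fault raised by the kernel.
    cudaError_t err = cudaMemcpy(hHits, dHits, sizeof(hHits), cudaMemcpyDeviceToHost);
    cudaFree(dPrims);
    cudaFree(dHits);
    if (err != cudaSuccess) return -1;
    int count = 0;
    for (int i = 0; i < 4; ++i) count += hHits[i];
    return count;
}
```

If the scene holds only one primitive type (triangles only, say), the switch disappears and all threads in a warp follow the same instruction stream. Sorting or binning primitives by type before launching would be another way to reduce the divergence, though whether that pays off is something to measure.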

Eh, all I read was the wiki for 5 minutes. I’d be curious to see the paper.