I am trying to implement a raytracer with CUDA, but I have some questions:
I’m reading some OBJ files that store the vertices for the raytracer. When I upload this data to the GPU with cudaMalloc, it takes too much time and the kernel launch fails. Is it better to put this data in a texture or in constant memory? There will be thousands of triangles… or is there another solution that I have overlooked?
When I render an image with two or three triangles, my frame rate is only 10 fps. What could be the problem? I would like to use ray packets, but I don’t know how to do this.
During rendering I can see some noise, as if it were the rays… I think this problem could be solved if I could raise the frame rate.
Get it working with the mesh data in global memory first, then move on to binding it to a texture.
You’ll need to profile your application. There’s not much point to ray packets with CUDA, as the SIMD architecture essentially gives you packets for free.
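To make the "packets for free" point concrete, here is a minimal sketch of one thread per pixel, where each thread builds its own primary ray. The pinhole camera, the kernel name generatePrimaryRays, and the launcher launchPrimaryRays are assumptions for illustration, not anything from the poster's code.

```cuda
#include <cuda_runtime.h>

// One thread per pixel. Threads of a warp execute together, so
// neighbouring pixels are traced side by side, much like a packet.
__global__ void generatePrimaryRays(float4* dirs, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Assumed camera: pinhole at the origin, image plane at z = -1.
    float u = (x + 0.5f) / width  * 2.0f - 1.0f;
    float v = (y + 0.5f) / height * 2.0f - 1.0f;
    float invLen = rsqrtf(u * u + v * v + 1.0f);

    // Store the normalized ray direction for this pixel.
    dirs[y * width + x] = make_float4(u * invLen, v * invLen, -invLen, 0.0f);
}

// Hypothetical launcher: a 16x16 block covers a 16x16 pixel tile.
cudaError_t launchPrimaryRays(float4* devDirs, int width, int height)
{
    dim3 block(16, 16);
    dim3 grid((width  + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    generatePrimaryRays<<<grid, block>>>(devDirs, width, height);

    cudaError_t err = cudaGetLastError();   // launch configuration errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();         // errors raised while running
}
```

The tile-shaped block keeps the rays handled together spatially close, which is roughly what a hand-built packet would try to achieve.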
Oooh… I thought I had to allocate the memory on the GPU with cudaMalloc and then copy it up with cudaMemcpy…
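For reference, that allocate-then-copy pattern could look roughly like the sketch below. The helper name uploadVertices and the choice of float4 per vertex are assumptions for illustration. The point is that the upload happens once at load time, not every frame.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: copies packed vertices into global memory once.
// Returns nullptr on failure; the caller releases the buffer with cudaFree.
float4* uploadVertices(const std::vector<float4>& hostVerts)
{
    size_t bytes = hostVerts.size() * sizeof(float4);
    if (bytes == 0) return nullptr;

    float4* devVerts = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&devVerts), bytes) != cudaSuccess)
        return nullptr;

    // Host-to-device copy of the whole mesh in a single call.
    if (cudaMemcpy(devVerts, hostVerts.data(), bytes,
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(devVerts);
        return nullptr;
    }
    return devVerts;
}
```

Checking the return codes here also helps narrow down whether a failure comes from the allocation, the copy, or the kernel itself.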
What do you mean by this?
Thank you for this advice, but I thought I could generate rays for a block, and such a block would be a ray packet. So I was wrong about that…
From looking at your code, it seems that you are doing a few cudaMemcpys per frame. How are you displaying the final pixels? Do you do a cudaMemcpy back to the host? If so, this is likely your bottleneck.
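One way to check whether the copy back to the host is what costs the time is to measure it with CUDA events. This is only a sketch: timeReadbackMs is a hypothetical helper, and the buffer names are placeholders for whatever the renderer actually uses.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns the time in milliseconds that one
// device-to-host copy of the frame takes, or -1.0f on failure.
float timeReadbackMs(void* hostPixels, const void* devPixels, size_t bytes)
{
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    cudaEventRecord(start);
    cudaError_t err = cudaMemcpy(hostPixels, devPixels, bytes,
                                 cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);   // wait until the stop event has completed

    float ms = -1.0f;
    if (err == cudaSuccess &&
        cudaEventElapsedTime(&ms, start, stop) != cudaSuccess)
        ms = -1.0f;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```

Wrapping the kernel launch in the same pair of events gives a number to compare against. At 10 fps a frame takes about 100 ms, so the two measurements should show which part dominates.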
Kernel arguments are already in shared memory. You are not copying anything into shared memory, just the pointer.
I think it is best to put your geometry into 1D arrays (float4), bind a texture to them, and access the geometry through the texture. Right now you appear to be reading it directly from global memory, which means all your accesses are non-coalesced.
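A rough sketch of the float4-plus-texture idea follows. It assumes a toolkit that provides texture objects (older toolkits bound a texture reference instead), and the layout of three consecutive float4 elements per triangle, like the names makeVertexTexture and fetchTriangle, is made up for illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: wraps a linear float4 buffer that is already in
// global memory in a texture object. Returns 0 on failure.
cudaTextureObject_t makeVertexTexture(const float4* devVerts, size_t count)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = const_cast<float4*>(devVerts);
    res.res.linear.desc = cudaCreateChannelDesc<float4>();
    res.res.linear.sizeInBytes = count * sizeof(float4);

    cudaTextureDesc tex = {};
    tex.readMode = cudaReadModeElementType;   // return the raw float4 values

    cudaTextureObject_t obj = 0;
    if (cudaCreateTextureObject(&obj, &res, &tex, nullptr) != cudaSuccess)
        return 0;
    return obj;
}

// Assumed layout: triangle i occupies elements 3i, 3i+1 and 3i+2.
// This kernel only copies one triangle out, to show the fetch itself.
__global__ void fetchTriangle(cudaTextureObject_t verts, int tri, float4* out)
{
    for (int k = 0; k < 3; ++k)
        out[k] = tex1Dfetch<float4>(verts, 3 * tri + k);
}
```

In a real intersection kernel the three fetches would feed the ray-triangle test directly. Release the texture with cudaDestroyTextureObject before freeing the underlying buffer.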
Run your program through the visual profiler; there you can collect all the statistics on non-coalesced accesses you may ever need.
Hey man, you don’t have to call that a noob question… I don’t even know this programming language that you guys are speaking. I can talk hardware, but not software programming.