CUDA Radiosity and hardware rasterizer

Hello everyone!:)
This is my first post here.
I’m doing a radiosity simulation in CUDA. It’s the progressive refinement variant, using hemicubes to calculate form factors.
As far as I can tell, every operation I’ve implemented to simulate the ‘rendering pipeline’ (to ‘render’ onto the hemicubes) runs reasonably fast.
Everything is fine until rasterization… I’ve tried several different approaches, and none of them runs at a decent speed. I’ve read some threads here and there about software rasterization on CUDA, and it seems it is not the best idea.

On to the problem. Just before the rasterization phase, I have in global memory the screen-space vertices of every triangle that needs to be rendered at the current stage. So basically I have data like this for each triangle:

vertices: (123,56) (110, 99) (33, 59)
triangle index: 4561 (this can be converted to color if needed)
depth: 15.3 (for zbuffer)
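For reference, a record like that and the z-buffered coverage test it implies can be sketched on the CPU (this is just a minimal illustration, not my CUDA code; the struct and function names are made up):

```cpp
#include <cassert>
#include <cfloat>
#include <vector>

struct ScreenTri {            // one entry per triangle, as laid out in global memory
    float x[3], y[3];         // screen-space vertex positions
    int   id;                 // triangle index (becomes the "color")
    float depth;              // depth value used for the z-test
};

// Signed double area of triangle (a, b, p): sign tells which side of edge a->b p is on.
static float edgeFn(float ax, float ay, float bx, float by, float px, float py) {
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

// Rasterize with a z-buffer: the nearest triangle wins, its id is written to idBuf.
void rasterize(const std::vector<ScreenTri>& tris, int w, int h,
               std::vector<float>& zBuf, std::vector<int>& idBuf) {
    zBuf.assign(w * h, FLT_MAX);
    idBuf.assign(w * h, -1);
    for (const ScreenTri& t : tris) {
        for (int py = 0; py < h; ++py)
            for (int px = 0; px < w; ++px) {
                float cx = px + 0.5f, cy = py + 0.5f;   // pixel center
                float w0 = edgeFn(t.x[0], t.y[0], t.x[1], t.y[1], cx, cy);
                float w1 = edgeFn(t.x[1], t.y[1], t.x[2], t.y[2], cx, cy);
                float w2 = edgeFn(t.x[2], t.y[2], t.x[0], t.y[0], cx, cy);
                // inside if all edge functions agree in sign (either winding)
                bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                              (w0 <= 0 && w1 <= 0 && w2 <= 0);
                if (inside && t.depth < zBuf[py * w + px]) {
                    zBuf[py * w + px]  = t.depth;
                    idBuf[py * w + px] = t.id;
                }
            }
    }
}
```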

I think this is all the data needed for rasterization. So my question is this:
Is it possible to pass data from CUDA directly into the hardware rasterizer? Or is there any other reasonable way to do this? I don’t need textures, shading, or lighting, just pure z-buffered triangle shapes :]
Any thoughts, ideas, suggestions, links, or articles would be very useful :)


Yes, but it’s not pretty.

There’s no access to the rasterizer from CUDA itself. You need to use OpenGL interop: set up your vertices in a CUDA-mapped buffer, exit the kernel, use OpenGL to rasterize, then start another kernel to use the Z results.
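The round trip can be sketched roughly like this (a sketch only, using the CUDA graphics interop API; `writeProjectedVertices`, `grid`, and `block` are placeholders for your own projection kernel and launch configuration):

```cuda
#include <cuda_gl_interop.h>

extern __global__ void writeProjectedVertices(float* verts, int nTris);  // your kernel
dim3 grid, block;                   // your launch configuration

GLuint vbo;                         // created beforehand with glGenBuffers/glBufferData
cudaGraphicsResource* vboRes;

// One-time setup: tell CUDA about the GL vertex buffer.
void registerVbo() {
    cudaGraphicsGLRegisterBuffer(&vboRes, vbo, cudaGraphicsRegisterFlagsNone);
}

void renderHemicubeFace(int nTris) {
    // 1. Map the VBO and let a kernel write screen-space vertices into it.
    float* dVerts = 0;
    size_t bytes  = 0;
    cudaGraphicsMapResources(1, &vboRes, 0);
    cudaGraphicsResourceGetMappedPointer((void**)&dVerts, &bytes, vboRes);
    writeProjectedVertices<<<grid, block>>>(dVerts, nTris);
    cudaGraphicsUnmapResources(1, &vboRes, 0);   // hand the buffer back to GL

    // 2. Draw with plain GL: depth test on, triangle id packed into the color.
    glEnable(GL_DEPTH_TEST);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glDrawArrays(GL_TRIANGLES, 0, nTris * 3);

    // 3. Copy the framebuffer into a PBO and map that into CUDA
    //    for the next kernel (the form-factor accumulation).
}
```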

This is obviously hacky, obviously annoying, and obviously has lots of overhead, but it is possible.

It is not that difficult or slow. Check the SDK samples that use Direct3D and OpenGL; there are several samples with fast rendering. In any case, the radiosity itself will probably be the slow part.

Thanks for the responses!

OK, so I will try OpenGL interop. I’ve read some tutorials and I think I can handle the rendering part, but one thing is still unclear to me.

How do I get the result back from the framebuffer into global memory? In some older thread somebody asked a similar question:

and one of the answers:

(Has anything changed since 2007 in this matter?)

I’m not sure how to do this copy from OpenGL so that it stays a GPU-to-GPU copy.
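One way I’m considering, based on what I’ve read (a sketch, not tested; buffer names and the RGBA format are my assumptions): `glReadPixels` into a bound `GL_PIXEL_PACK_BUFFER` stays on the GPU, and that PBO can then be mapped into CUDA:

```cuda
#include <cuda_gl_interop.h>

GLuint pbo;                        // pixel buffer object for the readback
cudaGraphicsResource* pboRes;

// One-time setup; W, H are the hemicube face resolution.
void setupReadback(int W, int H) {
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, W * H * 4, 0, GL_DYNAMIC_READ);
    cudaGraphicsGLRegisterBuffer(&pboRes, pbo, cudaGraphicsRegisterFlagsReadOnly);
}

void readbackToCuda(int W, int H) {
    // With a PACK buffer bound, glReadPixels writes into the PBO on the GPU,
    // not into host memory; the last argument is an offset into the PBO.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, 0);

    uchar4* dPixels = 0;
    size_t  bytes   = 0;
    cudaGraphicsMapResources(1, &pboRes, 0);
    cudaGraphicsResourceGetMappedPointer((void**)&dPixels, &bytes, pboRes);
    // ... launch the form-factor accumulation kernel on dPixels ...
    cudaGraphicsUnmapResources(1, &pboRes, 0);
}
```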

thanks for your help :)


I am interested in implementing hemicube-based radiosity calculation in CUDA. Any chance I can look at your code?



This is possible to do.

You perform off-screen rendering in OpenGL into a framebuffer that has a color and a depth attachment. Then you map this buffer into CUDA and use it.
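Setting up such an off-screen target looks roughly like this (a sketch using the era’s EXT framebuffer-object functions; sizes and variable names are illustrative):

```cuda
// An FBO with a color renderbuffer (for the triangle ids) and a depth
// renderbuffer (for the z-test). W, H are the hemicube face resolution.
GLuint fbo, colorRb, depthRb;

void createHemicubeTarget(int W, int H) {
    glGenRenderbuffersEXT(1, &colorRb);
    glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, colorRb);
    glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_RGBA8, W, H);

    glGenRenderbuffersEXT(1, &depthRb);
    glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, depthRb);
    glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT24, W, H);

    glGenFramebuffersEXT(1, &fbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                                 GL_RENDERBUFFER_EXT, colorRb);
    glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT,
                                 GL_RENDERBUFFER_EXT, depthRb);
}
```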

Look at the post-process OpenGL SDK example.

What they do is first render geometry using OpenGL, then use CUDA to blur the pixel buffer, and finally display the pixel buffer using OpenGL.

As I understand it, you will need the depth component of the framebuffer. I haven’t had any luck mapping a depth buffer to CUDA, but there is a small hack you can do: write a shader program that outputs the depth into a color channel, and map the color buffer instead.
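That hack can be as small as a fragment shader like this (a sketch; the GLSL source and variable name are illustrative):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical GLSL fragment shader that packs the fragment depth into the
// color output, so the color attachment can be mapped to CUDA instead of the
// (unmappable) depth attachment.
const char* depthToColorFS =
    "void main() {\n"
    "    // gl_FragCoord.z is the window-space depth in [0,1]\n"
    "    gl_FragColor = vec4(gl_FragCoord.z, 0.0, 0.0, 1.0);\n"
    "}\n";
```

Compile and attach it with the usual `glCreateShader`/`glShaderSource`/`glCompileShader` calls, and render into the FBO’s color attachment.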

Is anyone doing research on radiosity based on CUDA? I have some problems calculating the form factors when there is occlusion among the patches. Is there anyone who can help me?