I modified the postProcessGL example from the SDK to measure bandwidth from CUDA to OpenGL. It works by generating an image in a CUDA kernel and then transferring it to the OpenGL context through a PBO. Nothing is transferred from CPU to CUDA or from OpenGL to CUDA. The modified source can be loaded from here.
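For readers who haven't seen the interop path being described, a minimal sketch follows, using the CUDA 2.0-era GL interop calls (`cudaGLRegisterBufferObject`, `cudaGLMapBufferObject`). The `pbo`, `width`, `height` parameters and the `fillImage` kernel are assumed placeholders, not names from the posted code:

```cuda
// Sketch of the CUDA -> PBO -> OpenGL path discussed in this thread.
// Assumes a GL context, GLEW/extension loading, a bound GL_TEXTURE_2D,
// and a PBO of width*height*4 bytes already created by the caller.
#include <cuda_gl_interop.h>

// Hypothetical kernel that generates the image on the device.
extern __global__ void fillImage(uchar4 *dst, int w, int h);

void renderFrame(GLuint pbo, int width, int height)
{
    // In real code this registration is done once at setup, not per frame.
    cudaGLRegisterBufferObject(pbo);

    // Map the PBO into the CUDA address space and let the kernel write it.
    uchar4 *devPtr = 0;
    cudaGLMapBufferObject((void **)&devPtr, pbo);
    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    fillImage<<<grid, block>>>(devPtr, width, height);
    cudaGLUnmapBufferObject(pbo);

    // GL side: source the texture upload from the PBO, so in theory
    // no CPU copy is involved.
    glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pbo);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, 0);
    glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
}
```

The map/unmap pair is the part the bandwidth measurement stresses; everything between them stays on the device.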
I seem to get really bad results, only 390 MB/s. Am I doing something wrong, or is it really that slow?
The bandwidth is about equal to what I get by transferring data from CUDA to the CPU and then to OpenGL… so is that what the CUDA drivers currently do?
I’m mostly running quad-head with two such cards, or two 9600 GTs, but this has also been tested on a machine with only one 8800 GTS. The workstation is a Sun Ultra 40M2.
I tried the program with the new CUDA 2.0 on my Quadro FX6500 and the result is 465 MB/s.
That is an improvement over 390 MB/s, but still not good enough for us to exploit OpenGL functions as part of the computing process.
I still cannot understand why the speed is so far from the roughly 70 GB/s device-to-device memory bandwidth, and even lower than host-to-device transfers, even with pageable memory (1278 MB/s).