brook's streamRead truncates to 0..1 on Tesla

I’ve been running the exact same brook binary on two different cards, an NVIDIA GeForce 7900 and an NVIDIA Tesla C870 (both in the same machine). While trying to figure out why I was getting very different results on the two GPUs, I finally narrowed it down to the following: when running on the Tesla, streamRead turns every negative value into 0 and every value > 1 into 1 (values between 0 and 1 are transferred correctly)! Nothing like that happens with the GeForce 7900.

This can be observed in two ways. Following a streamRead directly with a streamWrite shows that all values were truncated. Alternatively, inserting a simple “out = in * 100 - 50” kernel call between the streamRead and streamWrite calls returns values in the -50…50 range, which demonstrates that it’s the streamRead that truncates, not the streamWrite. I tried both 1D and 2D streams of floats, with the same result.
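The comparison behind the first check can be sketched in plain C. This is only a minimal sketch, assuming the data passed to streamRead and the data returned by streamWrite are ordinary float arrays on the host; `classify_roundtrip`, `clamp01` and the `ROUNDTRIP_*` names are hypothetical helpers and not part of brook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for comparing the data handed to streamRead
   with the data that streamWrite returns. Not part of brook. */
typedef enum {
    ROUNDTRIP_EXACT,    /* every value came back unchanged */
    ROUNDTRIP_CLAMPED,  /* every mismatch equals the input clamped to [0, 1] */
    ROUNDTRIP_OTHER     /* at least one mismatch is not explained by clamping */
} roundtrip_result;

/* Clamp a single value to the [0, 1] range. */
static float clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Compare the values that were sent with the values that came back.
   A plain pass-through should be bit-exact, so exact float comparison
   is intentional here. */
roundtrip_result classify_roundtrip(const float *sent, const float *received, size_t n)
{
    assert(sent != NULL && received != NULL);

    roundtrip_result result = ROUNDTRIP_EXACT;
    for (size_t i = 0; i < n; ++i) {
        if (received[i] == sent[i])
            continue;                      /* value survived unchanged */
        if (received[i] == clamp01(sent[i]))
            result = ROUNDTRIP_CLAMPED;    /* mismatch matches 0..1 clamping */
        else
            return ROUNDTRIP_OTHER;        /* some other corruption */
    }
    return result;
}
```

Filling the input with values on both sides of the 0..1 range (for example -2, 0.25 and 3) makes the three outcomes easy to tell apart.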

I am using brook 0.5 beta1 with the ogl backend on Red Hat Enterprise Linux, with NVIDIA drivers and libraries v. 169.09.

What’s going on?! How do I get it to stop? Any help would be greatly appreciated.

P.S. Crossposted to gpgpu.org forum

Seems like a brook issue. I’m not sure what this has to do with CUDA at all.

As far as I can tell, the brook runtime calls

glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, x, y, w, h, elemsize==1?GL_RED:(elemsize==3?GL_RGB:GL_RGBA), GL_FLOAT, 0);

It does work correctly on the NVIDIA GeForce 7900, so the problem I am seeing appears to be more Tesla-specific than brook-specific.

Well, there is no “General-purpose computing using the Tesla card on Linux” forum, so “CUDA on Linux” seemed like the closest I could get. Sorry if this was a bad idea.

No problem. I didn’t mean to sound so snappy. Most users on this forum are CUDA-only users who have never touched brook or similar tools before, so you aren’t likely to get a lot of responses here.

Converting brook code to CUDA would be pretty easy, and I’m sure you can get a lot of help with that here ;)

If you can reproduce your problem with a simple OpenGL program, then the OpenGL forums might be a better place to get help, even if it is on a Tesla card. You may have exposed some kind of driver bug related to the Tesla OpenGL drivers. An NVIDIA rep is sure to respond if you can post a short OpenGL test program.

So, what would be the appropriate forum for discussing Tesla-specific OpenGL issues? I could not find any OpenGL forums on nvidia.com :(

Thanks!

P.S. Further testing showed that only the first call to streamRead does this 0…1 truncation; all the subsequent calls are correct! Weird…