I’ve been running the exact same brook binary on two different cards - NVIDIA GeForce 7900 and NVIDIA Tesla C870 (both on the same machine). While trying to figure out why I was getting very different results on the two GPUs, I finally narrowed it down to the following - when running on Tesla, streamRead would turn every negative value into 0 and every value > 1 into 1 (values between 0 and 1 would be transferred correctly)! Nothing like that happens with NVIDIA GeForce 7900.
There are two ways to observe this. Following a streamRead directly with a streamWrite shows that all values come back clamped. Alternatively, inserting a simple “out = in * 100 - 50” kernel call between the streamRead and streamWrite calls returns values in the -50…50 range, which demonstrates that it’s the streamRead that clamps, not the streamWrite. I tried both 1D and 2D streams of floats, with the same result.
I am using brook 0.5 beta1, ogl backend, on Red Hat Enterprise Linux with NVIDIA drivers and libraries v. 169.09.
What’s going on?! How do I get it to stop? Any help would be greatly appreciated.
glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, x, y, w, h, elemsize==1?GL_RED:(elemsize==3?GL_RGB:GL_RGBA), GL_FLOAT, 0);
It does work correctly on NVIDIA GeForce 7900, so the problem I am seeing appears more Tesla-specific than brook-specific.
Well, there is no “General-purpose computing using the Tesla card on Linux” forum, so “CUDA on Linux” seemed like the closest I could get. Sorry if this was a bad idea.
No problem. I didn’t mean to sound so snappy. Most users on this forum are CUDA-only users who have never touched brook or similar tools before, so you aren’t likely to get a lot of responses here.
Converting brook code to CUDA would be pretty easy, and I’m sure you can get a lot of help with that here ;)
If you can reproduce your problem with a simple OpenGL program, then the OpenGL forums might be a better place to get help, even if it is on a Tesla card. You may have exposed some kind of driver bug related to the Tesla OpenGL drivers. An NVIDIA rep is sure to respond if you can post a short OpenGL test program.