Issues with GTX Titan

We have been developing an application that uses an optimization algorithm known as L-BFGS-B. A CUDA implementation of it was written a few years ago and is available on the web. We have been using it successfully for a few months.

The other day I installed a new GTX Titan card. Suddenly the optimization stopped working correctly in release mode. It works fine in debug mode, and it works fine in both debug and release modes when I run it under Nsight for CUDA debugging.

I installed the Titan in another computer on which I had first verified that the code worked fine, and the problem followed the card.

We are trying to find other testers with a GK110 or higher GPU to run it and tell us their results. We have put the source code for a very simple test app on github ( that should be run in both debug and release modes. It works fine on the GTX 770, 690, 680 and lower cards, but not on the Titan.

It works on GK110 (K20c) and GK208 (GT630) devices. Both are sm_35 devices.

Note that the 770, 690 and 680 are sm_30 devices.

I made some modifications though:

  1. Set the target architecture to sm_35 since I’m targeting a GK110/GK208:

  2. Allow a device to be selected from the command line:
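As a rough illustration of the second modification, here is a minimal host-side sketch of reading a device index from the command line. It is not the actual change made to the test app. The function name `parse_device_index` and the fallback-to-device-0 behaviour are assumptions for this example. In a real CUDA build the device count would come from `cudaGetDeviceCount` and the returned index would be passed to `cudaSetDevice`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not from the original test app).
// Returns the device index given as the first command-line argument,
// or `fallback` if the argument is missing, not a number, or out of range.
// In a CUDA program, `device_count` would be filled in by cudaGetDeviceCount
// and the result would be handed to cudaSetDevice.
int parse_device_index(int argc, const char* const* argv,
                       int device_count, int fallback = 0) {
    assert(device_count > 0);            // caller must have at least one device
    if (argc < 2) return fallback;       // no argument: keep the default device

    errno = 0;
    char* end = nullptr;
    long value = std::strtol(argv[1], &end, 10);

    // Reject empty input, trailing junk, and overflow.
    if (end == argv[1] || *end != '\0' || errno == ERANGE) return fallback;
    // Reject indices that do not name an installed device.
    if (value < 0 || value >= device_count) return fallback;

    return static_cast<int>(value);
}
```

With two cards installed, running the test as `test 1` would select the second card, while `test`, `test 5` or `test abc` would all fall back to device 0.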

This was built in VS2012 (Update 3) on Win7/x64.

I would also suggest exploring whether you can use a 32-bit CUDA build instead of 64-bit. It’s often slightly faster and if you don’t actually need 64-bit pointers then it’s a win.

Finally, the test seems a little sensitive. The GT 630 card will sometimes “fail” (and sometimes “pass”):


I would be happy to test your code on my 2 cards (1x 660 Ti and 1x Titan).

Assuming that the Titan cards you tried are not damaged and the NVIDIA driver you use is the latest, I would throw in here a wild guess that there might be a bug in the code which is exposed only on the Titan card.

But first please give some instructions (like some lines I have to copy and paste into my Linux console) and I will test it on my cards (which so far appear to run correctly, at least for code based on the cuFFT library).

Hello AllanMac,
Thanks for the test data and good suggestions. Looking at your results I realized that I didn’t design the test quite correctly. In both debug and release mode it should run for 12 iterations. The fact that it is stopping before that tells me that it isn’t working correctly on your machine either. I suspect that if you ran it a few times in release mode you would get a failed message with the max weight error set to a large value (~10^20), which is what I am seeing on the Titan.
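To make that pass criterion explicit, here is a minimal sketch of a stricter check: a run counts as passed only if it completes all 12 iterations and the max weight error is finite and small. The `RunResult` struct, the function names and the `1e-3` tolerance are assumptions for illustration and are not taken from the test app.

```cpp
#include <cmath>
#include <vector>

// Hypothetical summary of one optimizer run (not the test app's own type).
struct RunResult {
    int iterations;           // iterations completed before stopping
    double max_weight_error;  // largest deviation from the expected weights
};

// A run passes only if it ran the expected number of iterations AND the
// error is finite and within tolerance. The tolerance is an assumed value.
bool run_passed(const RunResult& r,
                int expected_iterations = 12,
                double tolerance = 1e-3) {
    if (r.iterations != expected_iterations) return false;  // stopped early
    if (!std::isfinite(r.max_weight_error)) return false;   // NaN or inf
    return std::fabs(r.max_weight_error) <= tolerance;
}

// Counts failing runs, useful when a card passes only intermittently.
int count_failures(const std::vector<RunResult>& runs) {
    int failures = 0;
    for (const RunResult& r : runs) {
        if (!run_passed(r)) ++failures;
    }
    return failures;
}
```

Under this check, a run that stops before iteration 12 is reported as a failure even if its error looks small, and an error around 10^20 fails regardless of iteration count. Repeating the run several times and counting failures should also catch cards that only fail now and then.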

Let me know if you need any more tests run. :)