CUDA 8.0 RC (cuda_8.0.27) is much faster than CUDA 8.0 (cuda_8.0.44) on Pascal

On a few different machines, some with a Pascal Titan X and some with a GTX 1070, running different versions of Linux with current drivers, I’ve found the CUDA 8.0 release (installed by …) to be much slower than the RC (installed by …). My code uses current Theano and the latest cuDNN. The performance difference is quite large: 50-100% slower per epoch of neural network training. If others can’t replicate this with their code, I’ll try to put together an example.

Probably a good idea to double-check your build settings, in particular debug vs release.

Given the magnitude of the difference, I would suggest putting together a self-contained example (as small and simple as possible) that reproduces the issue, and filing a bug with NVIDIA; the bug reporting form is linked directly from the CUDA registered developer website.
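
As a starting point for such a reproducer, a small timing harness along these lines could be run unchanged under both toolkit versions. This is only a sketch: `time_epochs`, `slowdown_percent` and `dummy_epoch` are hypothetical names, the placeholder workload stands in for one epoch of the real training loop, and it assumes the timed function returns only after all GPU work has finished.

```python
import statistics
import time


def time_epochs(run_epoch, warmup=1, repeats=5):
    """Return the median wall-clock seconds of run_epoch() over `repeats` runs."""
    for _ in range(warmup):
        run_epoch()  # discard warm-up runs (compilation, caches, clock ramp-up)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_epoch()  # assumed to block until the work is really done
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slowdown_percent(baseline_s, candidate_s):
    """How much slower the candidate is than the baseline, in percent."""
    return 100.0 * (candidate_s - baseline_s) / baseline_s


def dummy_epoch():
    # Hypothetical placeholder; replace with one epoch of the real training loop.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    seconds = time_epochs(dummy_epoch)
    print(f"median epoch time: {seconds:.4f} s")
```

Recording the median under each install and passing the two numbers to `slowdown_percent` gives a single figure to quote in the bug report; the median is used so that one unusually slow run does not skew the comparison.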

Both versions were installed the same way, so build settings shouldn’t be any different (and it should work in any case).

sudo sh

Given that others are complaining about speed, I’d bet it would be easy for them to reproduce by uninstalling 8.0.44 and installing 8.0.27. I don’t think the RC run file is available to download anymore, but I could post it somewhere if it would be helpful.
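
For anyone trying that swap, it is worth confirming which toolkit build is actually being picked up before timing anything. A rough sketch, assuming `nvcc` is on the PATH and that its `--version` output contains a token of the form `V8.0.44`; the function names here are made up for illustration.

```python
import re
import subprocess


def parse_nvcc_version(text):
    """Extract a version such as '8.0.44' from `nvcc --version` output, or None."""
    match = re.search(r"\bV(\d+\.\d+\.\d+)\b", text)
    return match.group(1) if match else None


def installed_cuda_version(nvcc="nvcc"):
    """Run nvcc and return its version string, or None if nvcc cannot be run."""
    try:
        result = subprocess.run(
            [nvcc, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_version(result.stdout)


if __name__ == "__main__":
    print(installed_cuda_version())
```

Note that this reports the `nvcc` found first on the PATH, which is not necessarily the one a framework such as Theano is configured to use.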

You could wait and hope to find a volunteer to do work on your behalf, or you could do some due diligence on your own and then file a bug with NVIDIA (which gets the issue in front of people who can do something about it).

As for other reports of “slowdown with CUDA 8”, I will predict that some percentage of those will turn out to be user error, while there will be multiple different root causes for actual performance bugs. That said, your issue might be related to the changes in SGEMM behavior reported in a neighboring thread (…).

Thanks, I did file a bug, but I think the issue described in that reply in the other thread you linked to is likely the cause. I wasn’t really looking for others to volunteer to do work for me; I was more looking for confirmation that others had seen such things (which the other thread provides) before spending a bunch of time trying to track it down (which I guess they already did). But thanks anyway, your reply was helpful.