I’m getting noticeably different numerical results from cublasSgemm when running identical test code on identical inputs; the only difference between the two test executables is the CUDA SDK version (CUDA 9.1 vs. CUDA 8.0).
The differences are in the 0.00000001 to 0.00001 range (1e-8 to 1e-5), and they are not limited to a few isolated values; nearly the whole result set is affected: https://www.dropbox.com/s/fvu5pridh4d3he9/cublasSgemm.png?dl=0
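To put numbers on the discrepancy rather than relying on the screenshot, something like the following host-side comparison could be used. This is only a minimal sketch: `DiffStats` and `compare_results` are hypothetical names that are not part of the linked test code, and it assumes both result matrices have already been copied back to host memory as flat float arrays of equal length.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary of how two result sets differ (hypothetical helper type).
struct DiffStats {
    float max_abs;           // largest |a[i] - b[i]|
    float max_rel;           // largest |a[i] - b[i]| / max(|a[i]|, |b[i]|)
    std::size_t mismatches;  // number of elements that are not bit-for-bit equal in value
};

// Compare two host-side result buffers element by element.
// Assumes both buffers hold the same number of elements.
inline DiffStats compare_results(const std::vector<float>& a,
                                 const std::vector<float>& b) {
    assert(a.size() == b.size());
    DiffStats s = {0.0f, 0.0f, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float abs_diff = std::fabs(a[i] - b[i]);
        if (abs_diff == 0.0f) {
            continue;  // identical values contribute nothing
        }
        ++s.mismatches;
        // scale is non-zero here: if both values were zero, abs_diff would be zero too.
        const float scale = std::max(std::fabs(a[i]), std::fabs(b[i]));
        s.max_abs = std::max(s.max_abs, abs_diff);
        s.max_rel = std::max(s.max_rel, abs_diff / scale);
    }
    return s;
}
```

Calling `compare_results` on the CUDA 8.0 and CUDA 9.1 outputs would report how many elements differ and the largest absolute and relative difference. The relative figure is the more useful one to compare against single-precision epsilon (about 1.2e-7), since the absolute difference depends on the magnitude of the values.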
The code to reproduce is less than 200 lines long and is available here (together with the input data set):
https://www.dropbox.com/s/djk8kbokt3eg2yy/cublas_test.zip?dl=0. I’m compiling it with VS2012.
Is there a good explanation for this discrepancy, or does this smell like a bug?
Thanks in advance,