So far, CUDA 9.0 RC is 5-10% slower than CUDA 8.0 in my typical benchmarks. These are applications which do not use any of the NVIDIA libraries, though I am getting to those tests next.
I am also getting weird compiler warnings related to ‘__shfl()’.
This is on Windows 8.1, Visual Studio 2015, GTX 1080 Ti.
Anyone else have good or bad experiences with CUDA 9.0 RC?
The compiler warnings related to __shfl probably come from the synchronized shuffle functions (__shfl_sync and its variants) introduced in CUDA 9. The previous shuffle ops are now deprecated.
Refer to section B.15 in the CUDA 9 RC programming guide.
This change is in preparation for Volta's independent thread scheduling model. Refer to section H.6.2.
I haven’t tried it yet, but I suspect that if you compiled for Volta (sm_70) those warnings would turn into errors.
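For reference, here is a minimal sketch of what the migration looks like for a simple warp reduction. It assumes all 32 lanes of the warp take part, so a full mask is used; the names FULL_MASK, warp_reduce_sum, warp_sum_kernel and run_warp_sum are made up for this example and are not from the original poster's code. If only some lanes reach the shuffle, the mask has to reflect that instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumption for this sketch: every lane of the warp participates.
#define FULL_MASK 0xffffffffu

// Warp-level sum using the CUDA 9 synchronized shuffle.
__inline__ __device__ int warp_reduce_sum(int val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        // Deprecated pre-CUDA 9 form: val += __shfl_down(val, offset);
        val += __shfl_down_sync(FULL_MASK, val, offset);
    }
    return val;  // lane 0 ends up holding the sum over the warp
}

// Each thread contributes its lane index; lane 0 writes the result.
__global__ void warp_sum_kernel(int *out) {
    int sum = warp_reduce_sum(static_cast<int>(threadIdx.x));
    if (threadIdx.x == 0) {
        *out = sum;
    }
}

// Host helper: launches one warp and returns the sum, or -1 on a CUDA error.
int run_warp_sum() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    warp_sum_kernel<<<1, 32>>>(d_out);
    cudaError_t err =
        cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}

int main() {
    printf("warp sum = %d\n", run_warp_sum());
    return 0;
}
```

With one warp of 32 threads, each contributing its lane index, the result should be 0 + 1 + ... + 31 = 496. Building this with the CUDA 9 toolkit should not produce the deprecation warning, whereas the commented-out __shfl_down form would.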