DGEMM performance in CUBLAS 3.2 vs 4.0 RC2

Are there any DGEMM performance improvements on the C2050 in CUBLAS 4.0 RC2 compared with the CUDA 3.2 CUBLAS?

No, but ZGEMM has been improved by about 10%.

I think so much performance has already been squeezed out of the GEMMs… It's time to leave them alone.
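
For anyone who wants to check figures like these on their own card, here is a minimal sketch of how a measured GEMM time is usually turned into GFLOP/s and a percentage speedup. The function names are hypothetical helpers, not part of CUBLAS, and the flop counts are the conventional ones (2·m·n·k for real GEMM, 8·m·n·k for complex GEMM), which is an assumption about how the numbers in this thread were obtained.

```c
#include <assert.h>

/* Nominal flop count for C = alpha*A*B + beta*C, with A m-by-k and B k-by-n.
   Convention (assumed here): 2*m*n*k for real GEMM (one multiply and one add
   per inner-product term) and 8*m*n*k for complex GEMM (a complex
   multiply-add costs 4 real multiplies and 4 real adds). */
double gemm_flops(int m, int n, int k, int is_complex)
{
    double per_term = is_complex ? 8.0 : 2.0;
    return per_term * (double)m * (double)n * (double)k;
}

/* Convert a measured run time in seconds into GFLOP/s. */
double gemm_gflops(int m, int n, int k, int is_complex, double seconds)
{
    assert(seconds > 0.0);
    return gemm_flops(m, n, k, is_complex) / seconds / 1e9;
}

/* Speedup of a new timing over an old one, in percent
   (positive means the new run is faster). */
double speedup_percent(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0 && new_seconds > 0.0);
    return (old_seconds / new_seconds - 1.0) * 100.0;
}
```

The timings themselves would have to come from your own runs, for example by bracketing the `cublasDgemm` or `cublasZgemm` call with CUDA events and reading the elapsed time with `cudaEventElapsedTime` (which reports milliseconds, so divide by 1000 first). Run the same matrix sizes under both toolkit versions and average over several calls, since the first call typically includes one-time setup cost.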