I am using a GTX 580 on Windows 7 64-bit.
When I set the GPU architecture in VS 2008 to sm_13, performance is 2x better than with sm_20, regardless of whether the cache configuration is set to prefer L1 or prefer shared memory using cudaFuncSetCacheConfig().
All other settings are the same.
I think the compute capability of the GTX 580 is 2.0, so sm_20 should be the right setting, right? My guess is that cudaFuncSetCacheConfig() is not working.
Does anyone have the same problem? Any suggestions or comments are appreciated.
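
One way to test the guess that cudaFuncSetCacheConfig() is not working would be to check its return value and to confirm the compute capability the runtime reports. The following is only a minimal sketch, not my actual code: `myKernel`, the `CHECK` macro, and the buffer size are hypothetical placeholders, and it assumes a toolkit that provides cudaDeviceSynchronize() (older toolkits use cudaThreadSynchronize() instead).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real one.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err));                         \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main()
{
    // Report what the runtime thinks the device is.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    // Request the larger L1 split for this kernel and check the result.
    // Use cudaFuncCachePreferShared for the shared-memory-preferred split.
    CHECK(cudaFuncSetCacheConfig(myKernel, cudaFuncCachePreferL1));

    const int n = 1 << 20;
    float *d = NULL;
    CHECK(cudaMalloc(&d, n * sizeof(float)));
    CHECK(cudaMemset(d, 0, n * sizeof(float)));

    myKernel<<<(n + 255) / 256, 256>>>(d, n);
    CHECK(cudaGetLastError());       // launch errors
    CHECK(cudaDeviceSynchronize());  // execution errors

    CHECK(cudaFree(d));
    return 0;
}
```

If the call returns cudaSuccess, the preference was at least accepted. Note that it is a preference rather than a guarantee, so the runtime may still choose a different configuration if the kernel needs it.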