I recently upgraded my Windows 7 x64 development system by replacing 3 C2050 boards with 3 K20c boards. I left one C2050 in for display.
After doing so, I discovered a very strange error. When I attempt to run code that includes both cuBLAS and thrust::sort, cublasCreate() will hang for a very long time (minutes at least). This only occurs when thrust::sort is in the code - other thrust calls don’t cause problems. This happens regardless of whether I set the CUDA device to be a K20c or the C2050.
I did try replacing thrust::sort with cub::DeviceRadixSort but encountered the same issue, so perhaps thrust is using cub underneath?
A simple example of code that has this problem is below (examples of other thrust calls that do not cause problems are commented out). I am using Visual Studio 2010 for Windows 7 x64 with Nsight 3.0, CUDA Toolkit 5.0, NVIDIA Tesla Driver version 320.49. I compiled using all of the default options for a CUDA 5.0 project using Nsight.
thrust::fill( a.begin(), a.end(), 22 );
//thrust::inclusive_scan( a.begin(), a.end(), a.begin() );
//thrust::transform( a.begin(), a.end(), a.begin(), thrust::negate<int>() ); // element type assumed to be int
thrust::sort( a.begin(), a.end() );
std::cout << "all done\n" << std::endl;
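To confirm that the stall really is inside cublasCreate() and not in the sort itself, each step can be timed separately. The snippet below is only a sketch, not the code that was actually run: `timed_seconds` and `create_handle_placeholder` are made-up names, and the placeholder stands in for the real `cublasCreate(&handle)` call. It sticks to `<ctime>` so that it should build with older compilers such as Visual Studio 2010; one-second resolution is coarse, but enough for a hang that lasts minutes.

```cpp
#include <ctime>
#include <iostream>

// Hypothetical helper: runs `step`, prints and returns the elapsed
// wall-clock time in seconds (one-second resolution via std::time).
template <typename Step>
double timed_seconds(const char* label, Step step)
{
    std::time_t start = std::time(0);
    step();
    double elapsed = std::difftime(std::time(0), start);
    std::cout << label << ": " << elapsed << " s" << std::endl;
    return elapsed;
}

// Stand-in for the real call. In the actual program this body would
// call cublasCreate(&handle) and check the returned status.
void create_handle_placeholder()
{
}
```

Calling `timed_seconds("cublasCreate", create_handle_placeholder)` with the real cuBLAS call in the placeholder, and doing the same around the thrust::sort line, would show which of the two accounts for the delay.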
Is there anyone out there with any idea what’s going on? Is this a known issue?