After I installed the 320.00 beta driver (in order to use the final build of Nsight 3.0), all applications (including the CUDA samples) that use the driver API fail to start with CUDA_ERROR_UNKNOWN (returned from cuCtxCreate), while those that use the runtime API fail with error 46. The issue reproduces on two computers running Windows 7 x64; one has a GTX 470 and the other a GTX Titan.
With 314.22 everything works fine, but Nsight Monitor complains about the driver version.
Exactly the same bug here, just tested on 320.18. No CUDA or OpenCL program works: each one fails either with error 46 or with a failed cuCtxCreate.
Windows 7 64-bit, two cards installed at the same time: a GTX 460 and a GTX 650 Ti.
Downgrading to the 314 drivers fixes the problem.
I’ve no idea what’s wrong.
I repeated the tests on 320.18 and the problem did not go away. On both of our machines the NVIDIA GPUs run headless (maybe this is the key, but I have not tried connecting a monitor).
Using CUDA Device [0]: GeForce GTX TITAN
GPU Device has SM 3.5 compute capability
Total amount of global memory: 6442254336 bytes
64-bit Memory Address: YES
initCUDA() returned 999
→ CUDA_ERROR_UNKNOWN
C:\Users\All Users\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release>matrixMul.exe
[Matrix Multiply Using CUDA] - Starting…
GPU Device 0: “GeForce GTX TITAN” with compute capability 3.5
MatrixA(320,320), MatrixB(640,320)
cudaMalloc d_A returned error code 46, line(164)
No, my GPU is connected to a monitor. Since the problem is quite rare, we need to find what our hardware has in common. I have an ASUS P7P55 WS Supercomputer motherboard with an Intel Core i5-750 and three GPUs (an NVIDIA GTX 650 Ti, a GTX 460, and one AMD card); the 650 Ti is the one connected to the monitor. The last non-standard thing is that CUDA is installed in a non-default location.
Using CUDA Device [0]: GeForce GTX 650 Ti
GPU Device has SM 3.0 compute capability
Total amount of global memory: 1073414144 bytes
64-bit Memory Address: NO
initCUDA() returned 999
→ CUDA_ERROR_UNKNOWN
C:\Visual\CUDA\SDK\v5.0\bin\win64\Release>matrixMul.exe
[Matrix Multiply Using CUDA] - Starting…
GPU Device 0: “GeForce GTX 650 Ti” with compute capability 3.0
MatrixA(320,320), MatrixB(640,320)
cudaMalloc d_A returned error code 46, line(164)
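For anyone comparing their own output with the logs above, it may help to see the two numeric codes by name. The sketch below is only a small lookup table, not part of the CUDA samples: runtimeErrorName and driverErrorName are hypothetical helpers made up for illustration, and the numbers are the ones I believe the CUDA 5.0 headers use (cudaError_t in driver_types.h, CUresult in cuda.h). It deliberately lists just a few codes. In a real program, calling cudaGetErrorString on the runtime error is the better route.

```cpp
#include <string>

// Hypothetical helper: names for a few runtime API codes (cudaError_t),
// numbered as in the CUDA 5.0 headers. Not an exhaustive list.
inline std::string runtimeErrorName(int code) {
    switch (code) {
        case 0:  return "cudaSuccess";
        case 35: return "cudaErrorInsufficientDriver";
        case 38: return "cudaErrorNoDevice";
        case 46: return "cudaErrorDevicesUnavailable";  // the "error 46" in this thread
        default: return "unrecognized runtime error " + std::to_string(code);
    }
}

// Hypothetical helper: names for a few driver API codes (CUresult).
inline std::string driverErrorName(int code) {
    switch (code) {
        case 0:   return "CUDA_SUCCESS";
        case 100: return "CUDA_ERROR_NO_DEVICE";
        case 999: return "CUDA_ERROR_UNKNOWN";  // what initCUDA() returned above
        default:  return "unrecognized driver error " + std::to_string(code);
    }
}
```

So runtimeErrorName(46) gives cudaErrorDevicesUnavailable and driverErrorName(999) gives CUDA_ERROR_UNKNOWN. Both point at the driver refusing to create a context rather than at anything in the samples themselves, which fits the observation that downgrading the driver makes the failures disappear.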
You’re not alone. We’re having this problem as well (in both CUDA and OpenCL) on two machines, both of which have Supermicro motherboards.
The only fix we’ve found is to remove some cards, then reboot (say, with only one card), then start adding cards back in. Somehow, this gets the driver back in shape… I have no idea why, but it seems that if you can get Windows to re-install 320 the problem goes away. (And, then mysteriously, the problem doesn’t come back if you downgrade to an older driver then upgrade again.)
Very scientific, right?
BTW, we saw something like this a while back but only on one machine, and the same “pull a card then reboot, then add the card back in” trick worked then too.