CUDA Samples only run with an annoying 'trick'

Hi!

I am trying to get CUDA working on my laptop, which has both Intel HD Graphics 5500 integrated graphics and an NVIDIA GeForce 830M graphics card. It is running 64-bit Windows 7.

PROBLEM:

I followed the installation instructions from the CUDA documentation, i.e. I installed Visual Studio 2015 and then the CUDA Toolkit 8.0. However, I have trouble running the provided samples, starting with the one mentioned in the documentation to verify the installation, ‘deviceQuery’.

The program compiles fine, but when I run it I get the following output:

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL
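
For readers who want to reproduce the failing check outside the sample, here is a minimal sketch of the same kind of query. It is not the actual deviceQuery source, only an illustration using the runtime API calls cudaGetDeviceCount, cudaGetErrorString and cudaGetDeviceProperties. The message "no CUDA-capable device is detected" is the string the runtime reports for cudaErrorNoDevice; the numeric code printed next to it may differ between toolkit versions.

```cuda
// Minimal sketch (not the real deviceQuery sample): ask the CUDA runtime
// how many devices it can see and print the error if the query fails.
// Build with: nvcc -o devcount devcount.cu
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);

    if (err != cudaSuccess) {
        // On the laptop described above, this is the branch that is taken
        // while the NVIDIA GPU is not available to the process.
        std::printf("cudaGetDeviceCount returned %d\n-> %s\n",
                    static_cast<int>(err), cudaGetErrorString(err));
        return 1;
    }

    std::printf("Detected %d CUDA-capable device(s)\n", deviceCount);

    // Print the name and compute capability of each visible device.
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::printf("cudaGetDeviceProperties(%d) failed: %s\n",
                        dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

Running a small program like this before and after the workaround described below makes it easy to see whether the runtime can reach the GPU at all, without building the full sample.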

I also get a pop-up notification informing me that “The ‘NVIDIA GeForce 830M’ device is not removable and cannot be ejected or unplugged.” After this, the graphics card is no longer recognized in the Device Manager or in the NVIDIA Control Panel, and running ‘deviceQuery’ again yields error 30 instead. I have to restart my laptop for everything to return to normal.

SOLUTION ATTEMPT:

I’ve been browsing the internet for a day or so now. Basically all I’ve found is that error 38 usually appears when there is an API version mismatch or the drivers are out of date. I’ve made sure I’m using the latest drivers for both my processor and my graphics card, and I’ve also run nvidia-smi.exe to verify that there is no API mismatch. The problem still persists.

HOW I CAN MAKE IT WORK:

I found a rather unconventional way of making everything work. Starting from a freshly restarted laptop, I first have to run a game or anything else that uses my graphics card. I can verify with nvidia-smi that a process is using the GPU. If I run ‘deviceQuery’ (or any other sample) while this process is running, the expected output is produced. After that I can exit the game, put the laptop to sleep, or do basically anything else, and I am still able to run the samples. The trick does not work if I exit the game before I run any of the samples.

So, my question is: what can I do to make CUDA work normally, without having to go through the above trick every time I restart my laptop? I feel like there is some initialization that the samples do not perform but that happens when I run a game. That doesn’t seem to be the whole story, though: if it were, the samples would also work when I run them only after exiting the game, and I would not need to run the first sample while the game is still running.

Any help is very much appreciated. Also, I am new to CUDA and GPU programming, so please let me know if there is any information I did not supply which would be helpful.

Thanks!
-Tusike

It’s an Optimus laptop. When the NVIDIA GPU is powered off, it won’t respond to CUDA operations.

Try googling “CUDA Optimus”.

Indeed, forcing the samples to run on the high-performance GPU in the NVIDIA settings solves the issue. Thanks!

Where did you find that setting?