cuInit failing on Quadro FX 1600M

I’m trying to use the driver API on a Quadro FX 1600M running Windows XP SP3. The card is listed as supported under CUDA 2.0, at least in the README.

I can successfully load cuda.dll, but cuInit(0) returns error code 100, which is CUDA_ERROR_NO_DEVICE.
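For anyone who wants to reproduce the check outside a full build, here is a minimal sketch that loads the driver library through Python's ctypes, calls cuInit(0), and prints the symbolic name of the result. The library names (nvcuda.dll on Windows, libcuda.so.1 on Linux) and the helper names result_name, load_driver and probe are assumptions made for illustration, and the error table covers only a handful of CUresult values.

```python
import ctypes
import sys

# A few CUresult values as defined in cuda.h; this table is not exhaustive.
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    101: "CUDA_ERROR_INVALID_DEVICE",
}


def result_name(code):
    """Translate a numeric CUresult into its symbolic name (hypothetical helper)."""
    return CU_RESULT_NAMES.get(code, "unknown CUresult %d" % code)


def load_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded."""
    # Assumption: the driver library is nvcuda.dll on Windows, libcuda.so.1 on Linux.
    if sys.platform == "win32":
        loader, name = ctypes.WinDLL, "nvcuda.dll"
    else:
        loader, name = ctypes.CDLL, "libcuda.so.1"
    try:
        return loader(name)
    except OSError:
        return None


def probe():
    """Call cuInit(0); return (status, device_count), or None without a driver library."""
    lib = load_driver()
    if lib is None:
        return None
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    status = lib.cuInit(0)
    count = ctypes.c_int(0)
    if status == 0:
        # Only ask for the device count once initialisation has succeeded.
        lib.cuDeviceGetCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
        lib.cuDeviceGetCount.restype = ctypes.c_int
        lib.cuDeviceGetCount(ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    outcome = probe()
    if outcome is None:
        print("CUDA driver library could not be loaded")
    else:
        status, devices = outcome
        print("cuInit(0) ->", status, result_name(status))
        print("devices reported:", devices)
```

A status of 100 here corresponds to the CUDA_ERROR_NO_DEVICE result described above: the library loaded, but the driver found no usable device.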

I tried rebuilding some of the CUDA SDK samples and noticed that Device 0 was running under CPU emulation.

I understood “supported” as meaning that the device was supported by both the driver and runtime APIs.

Can anyone confirm that this is indeed the issue?


What driver are you running? Make sure it’s a 177 or 178.xx release if you’re using CUDA 2.0.

I downloaded/installed the driver, tools, and SDK this morning (10/5). So I’m using the latest available.



What OS? Are you sure the driver is installed correctly, etc.? Does driverQuery return anything about a CUDA device?

Specifically which driver version did you install?

This is on Windows XP (SP3) with a fresh install of the CUDA packages.

What am I looking for in the driverquery output? I see no mention of a CUDA or NVIDIA driver version (though that could be the problem).



I just reinstalled everything (after manually uninstalling). I have the latest driver, toolkit and SDK. The version is 177.84.

Still no luck getting anything but CPU emulation working. I can confirm this by running the samples. They all return a failure and I can see the “Device 0: CPU emulation” in the startup banner.

Can I at least get an answer as to whether CUDA 2.0 is actually supported on this device? Again, this is a Quadro FX 1600M mobile part.


Sounds like a G84, so yes, it should be supported.

From where did you obtain the 177.84 driver package?
What kind of notebook are you using?

It definitely is: I am using it on an HP 8710W laptop. Here is the output produced by the device query SDK example:

There is 1 device supporting CUDA

Device 0: “Quadro FX 1600M”

Major revision number: 1

Minor revision number: 1

Total amount of global memory: 536543232 bytes

Number of multiprocessors: 4

Number of cores: 32

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 8192

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 262144 bytes

Texture alignment: 256 bytes

Clock rate: 1.25 GHz

Concurrent copy and execution: Yes


Press ENTER to exit…

I am also using the latest CUDA release (2.0). However, I am running Windows XP SP2.

Good luck. Jeff

I figured it out. I was running remotely over RDP. I thought RDP worked more like the VNC protocol, in that it simply displays the remote frame buffer. It does not: over RDP you only get CPU emulation.

When I run on the physical machine, everything is working fine and I get the hardware acceleration.

This may be obvious to some, but it wasn’t for me. Could this be added to a CUDA FAQ somewhere? It might save someone else a few hours of hair pulling.

Thanks to the folks at NVIDIA and others on this forum for suggestions and help.


Are you looking for something like CUDA FAQ…

13: Can I run CUDA remotely?

Under Linux it is possible to run CUDA programs via remote login. We currently recommend running with an X-server.

CUDA does not work with Windows Remote Desktop, although it does work with VNC.
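
Since the Remote Desktop case is easy to miss, a program can check for it before initialising CUDA and print a clearer message than a bare "no device" error. The following is only a sketch: it relies on the Win32 GetSystemMetrics call with the SM_REMOTESESSION index, and the function name in_remote_desktop_session is made up for this example. On anything other than Windows it simply returns False.

```python
import ctypes
import sys

# GetSystemMetrics index from winuser.h; nonzero means a Remote Desktop session.
SM_REMOTESESSION = 0x1000


def in_remote_desktop_session():
    """Return True when the process runs inside a Windows Remote Desktop session."""
    if sys.platform != "win32":
        # The RDP restriction discussed here is specific to Windows.
        return False
    return ctypes.windll.user32.GetSystemMetrics(SM_REMOTESESSION) != 0


if __name__ == "__main__":
    if in_remote_desktop_session():
        print("Running over Remote Desktop: expect no CUDA device to be found.")
    else:
        print("Not a Remote Desktop session.")
```

Running the check first lets the program tell the user to log in at the console or use VNC instead.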