compute-exclusive mode and cudaGetDevice(...) always claims to be running on device 0.

I have a setup with one S1070 and one GTX 280, which makes 5 devices. As I am not the only person using the system, I decided to switch all devices to compute-exclusive mode. I leave nvidia-smi running in the background to ensure the setting is kept. All tools tell me that the cards are in compute-exclusive mode. In addition, if I now run more than 5 applications in parallel (or set one card to prohibited), the additional applications fail as expected. So far, so good.

Now my problem is the result of cudaGetDevice( int* devId ). If I explicitly choose a device, this works as expected; but when the driver assigns a GPU, it always reports to be running on device 0. Of course this cannot be the case, as otherwise the applications shouldn't fail when I overcommit. Now I am wondering: is cudaGetDevice(…) broken in this case, or am I doing something wrong?

Below is an outline of my code:

// CUDA initialization
int devId = -1;
if( vm.count( "device" ) )
{
	// select user specified device
	devId = vm[ "device" ].as< unsigned int >();

	cudaErr = cudaSetDevice( devId );
	if( cudaErr )
		throw CudaError( "Failed to initialize device", cudaErr );
}
else
{
	// force CUDA to select a device (just so we can be sure the queried
	// device is the one we run on)
	cudaErr = cudaSetValidDevices( 0, 0 );
	if( cudaErr )
		throw CudaError( "Failed to initialize device", cudaErr );
}

// check which device we got
cudaErr = cudaGetDevice( &devId );
if( cudaErr )
	throw CudaError( "Failed to retrieve the used device", cudaErr );

cudaDeviceProp props;
cudaErr = cudaGetDeviceProperties( &props, devId );
if( cudaErr )
	throw CudaError( "Failed to get device properties", cudaErr );

cout << "Running on device " << devId << ": " << props.name << endl;

// do your work on the GPU

For testing I use the following command line:

for i in 1 2 3 4 5; do
  ./gpu_run &
done

I just upgraded to CUDA 2.3 and the problem persists.

I think this may help you…-Exclusive+Mode

Thanks for the pointer. However, your example shows the same problem: the run with no device selected is obviously executing on device 1, yet it reports to be running on device 0. Well, at least it seems I am doing everything correctly, thanks.

It is possible that the device number reported by cudaGetDevice() is a logical number in this case and actually represents physical device 1…

Me just guessing here.

Try printing the device name, properties, etc.

Also, check the time to completion – that will give you a clue.

There is no concept of a “logical device” vs “physical device” in CUDA. cudaGetDevice will identify the actual device number in use as consistently listed by all other CUDA commands.

theMarix, I’m not sure what is going on in your case, but I certainly do not see the behavior that you do.

Code:

#include <iostream>
#include <unistd.h>

using namespace std;

int main()
{
	int *d_ptr;
	cudaError_t error = cudaMalloc((void**)&d_ptr, sizeof(int));
	if (error != cudaSuccess)
	{
		cout << cudaGetErrorString(error) << endl;
		return 1;
	}

	int dev;
	cudaGetDevice(&dev);
	cout << "Running on device " << dev << endl;

	int left = sleep(10);
	while (left > 0)
		left = sleep(left);

	return 0;
}
Execution on my 9800 GX2 system:

$ for i in 1 2 3; do

> ./devtest & done;

[1] 2195

[2] 2196

[3] 2197

user@host ~/cuda_test $ Running on device 0

Running on device 1

no CUDA-capable device is available

Now I see what you are doing. You are calling cudaGetDevice before a context is initialized. If I put the cudaGetDevice before the cudaMalloc in my code, I get the same behavior as you.

It is unfortunate that the documentation does not mention this behavior for cudaGetDevice().
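For anyone landing on this thread later, the conclusion above suggests a workaround: make any runtime call that forces context creation before querying the device. The following is only a sketch of that idea (not from the original posts); it uses cudaFree(0), a commonly used no-op that triggers context initialization, and the error handling is simplified compared to the outline above:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
	// Let the driver pick any valid device, as in the original outline.
	// A null list tells the runtime that all devices are acceptable.
	cudaSetValidDevices( 0, 0 );

	// Force the runtime to actually create a context on the assigned
	// device; cudaFree(0) is a common no-op used for exactly this.
	cudaError_t err = cudaFree( 0 );
	if( err != cudaSuccess )
	{
		fprintf( stderr, "Context creation failed: %s\n",
		         cudaGetErrorString( err ) );
		return 1;
	}

	// Only now does cudaGetDevice report the device we really got,
	// rather than the default device 0.
	int dev = -1;
	cudaGetDevice( &dev );

	cudaDeviceProp props;
	cudaGetDeviceProperties( &props, dev );
	printf( "Running on device %d: %s\n", dev, props.name );
	return 0;
}
```

In compute-exclusive mode this should print the device the driver actually bound the process to, since the context already exists when cudaGetDevice is called.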