Howdy,
I have code similar to the outline below. When I specify device 0 (via cudaSetDevice()), everything works fine. When I specify any other device, the code "runs to completion" but gives random incorrect answers. Other codes I have written continue to run just fine on those other devices.
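For reference, here is roughly how I select and verify the device (the index 2 is just an example, and the printf is only a sanity check):

[codebox]
int dev = 2; /* e.g. one of the C1060s */
cudaSetDevice(dev);

/* sanity check: confirm the runtime actually selected the device I asked for */
int active;
cudaGetDevice(&active);

cudaDeviceProp prop;
cudaGetDeviceProperties(&prop, active);
printf("active device %d: %s\n", active, prop.name);
[/codebox]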
Pseudocode:
[codebox]
for (i = 0; i < numunknowns; ++i)
{
    zerovec<<<(length / THREAD_CNT) + 1, THREAD_CNT>>>(Atmp, length); // a kernel that just zeros out the array Atmp
    cudaThreadSynchronize();
    fillAtmp<<<dimGrid, THREAD_CNT>>>(Atmp, extra data); // a busy kernel that fills Atmp with some data; nothing too weird going on here, no use of shared mem, no divergent branches (according to cudaprof)
    cudaThreadSynchronize();
    fillA<<<dimGrid2, THREAD_CNT>>>(A, Atmp, extra data); // a simple kernel that sums up columns of Atmp and inserts those results along the columns of matrix A
    cudaThreadSynchronize();
}
[/codebox]
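One thing I notice writing this up: none of those calls are error-checked. Here is a sketch of the same loop with checks after each launch (the CUDA_CHECK macro is my own helper, not something from the toolkit), in case a launch is failing silently on the non-zero devices:

[codebox]
#define CUDA_CHECK(call) do { \
    cudaError_t e = (call); \
    if (e != cudaSuccess) { \
        fprintf(stderr, "CUDA error %s:%d: %s\n", \
                __FILE__, __LINE__, cudaGetErrorString(e)); \
        exit(1); \
    } \
} while (0)

for (i = 0; i < numunknowns; ++i)
{
    zerovec<<<(length / THREAD_CNT) + 1, THREAD_CNT>>>(Atmp, length);
    CUDA_CHECK(cudaGetLastError());      /* catch launch failures */
    CUDA_CHECK(cudaThreadSynchronize()); /* catch async execution errors */

    fillAtmp<<<dimGrid, THREAD_CNT>>>(Atmp, extra data);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaThreadSynchronize());

    fillA<<<dimGrid2, THREAD_CNT>>>(A, Atmp, extra data);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaThreadSynchronize());
}
[/codebox]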
To me, the loop (and the kernels) seem very straightforward. I don't understand what I'm doing that would cause any dependence on the device number.
Things I have tried (not a complete list):
- I have verified this behavior on two different machines with different cards/hardware. Here are the specs:
Machine 1:
Linux x64, RHEL5, CUDA 2.3b
Device 0 & 1: GTX 295
Device 2: Tesla C1060
Device 3: Tesla C1060
Machine 2:
Linux x64, RHEL5, CUDA 2.3b
Device 0: GTX 285
- I have experimented with changing the compute mode via nvidia-smi, since we usually leave our cards in Exclusive mode.
- I have also tried CUDA 2.2 and 2.3b.
- I checked the temperature of all the cards; they seem to hang around 70-75 C.
- I know there were a number of other things, but now I'm drawing a blank.