Very strange problem. Different behavior on different device numbers.

sckulp · May 11, 2013, 3:24am

Hello,

For the last couple of days, I’ve been struggling with a strange problem. I have two GeForce GTX 590s installed in my work machine. Since each 590 is a dual-GPU card, it is as if I have 4 GPU devices in total.

The problem is that if I call cudaSetDevice(n) with n = 1, 2, or 3, I get different behavior than with n = 0. More specifically, I have a thrust::device_vector called zIndices_d. I first initialize a thrust::host_vector zIndices_h, setting each of its 320 elements in a loop, and then set

zIndices_d=zIndices_h;

However, if I try to use it in a CUDA kernel, it always throws an out-of-bounds exception when debugging with Nsight, even when reading zIndices_d[0], but ONLY if the device is set to 1, 2, or 3. It works perfectly if the device is set to 0.

Additionally, I should be able to do something like this, in host code:
int temp = int4(zIndices_d[0]).x;
cout << temp << endl;

(Note that since zIndices_d is a Thrust vector, its elements can be accessed this way in host code without writing the copies to/from device memory by hand; Thrust performs the copy behind the scenes.)

However, when the device is 1, 2, or 3, the program just completely crashes here, reporting an out-of-bounds error. On device 0, it works fine.

Also, if I instead try

int temp = int4(zIndices_d[72]).x;
cout << temp << endl;

it no longer crashes, but it gives the wrong answer on devices 1, 2, and 3. Device 0 still works fine.

I’ve tried this program on a different machine with a GeForce GTX 560, and it works fine there.

I’ve tried updating the drivers, reinstalling CUDA, reinstalling Nsight, and restarting the computer many times. Nothing fixes the problem.

I am running Windows 7 with 16 GB of RAM and an Intel i7 processor, and I use Visual Studio for compiling and debugging.

Anyone have any thoughts?

Thanks!

vacaloca · May 11, 2013, 1:57pm

Do you get the same behavior without using Thrust on that particular machine w/ the 2 GTX 590s? If not, it sounds like it could be a bug in Thrust w/ multi-GPUs, but that’s just a guess. Try CUDA 4.2 if you’re using CUDA 5, or vice versa, and see if you’re able to replicate the issue on that machine.

Edit: See “Multiple GPUs with Cuda Thrust?” on Stack Overflow. In particular, “Just keep in mind that you will need to create and operate on separate vectors on each device” might be the issue you’re encountering.
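
As a rough illustration of the "separate vectors on each device" advice, here is a minimal sketch that selects each device in turn, creates a vector while that device is current, and checks a round trip back to the host. The element type (int), the size of 320, and the use of thrust::sequence are assumptions made for the example and are not taken from the original program.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/sequence.h>
#include <iostream>

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess) {
        std::cerr << "cudaGetDeviceCount failed" << std::endl;
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        // Select the device first, so the allocation below lands on it.
        if (cudaSetDevice(dev) != cudaSuccess) {
            std::cerr << "cudaSetDevice(" << dev << ") failed" << std::endl;
            return 1;
        }

        {
            // This vector belongs to device `dev` only. The inner scope
            // destroys it while `dev` is still the current device.
            thrust::device_vector<int> data(320);
            thrust::sequence(data.begin(), data.end());  // 0, 1, 2, ...

            // Copy back to the host and verify the contents.
            thrust::host_vector<int> back = data;
            bool ok = true;
            for (int i = 0; i < 320; ++i) {
                ok = ok && (back[i] == i);
            }
            std::cout << "device " << dev << ": "
                      << (ok ? "ok" : "MISMATCH") << std::endl;
        }
    }
    return 0;
}
```

If every device reports "ok" here, Thrust itself is probably working on all devices, and the order of calls in the failing program is the next thing to look at.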

sckulp · May 14, 2013, 11:18pm

I see, thanks! It turns out that I was calling cudaSetDevice AFTER the thrust vectors were instantiated. Switching the order fixed the problem.
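
For anyone who finds this thread later, the corrected ordering can be sketched roughly as follows. A device_vector allocates its memory on whichever device is current at the time it is constructed, so the device has to be selected first. The device index 1, the fill values, and the int4 element type (inferred from the int4(...) cast in the first post) are assumptions for the example, not the actual program.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <iostream>

int main() {
    const int device = 1;  // hypothetical choice; any valid device index

    // 1. Select the device BEFORE any Thrust vector is instantiated.
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        std::cerr << "cudaSetDevice failed: "
                  << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    // 2. Fill the host vector in a loop (320 elements, as in the post).
    thrust::host_vector<int4> zIndices_h(320);
    for (int i = 0; i < 320; ++i) {
        zIndices_h[i] = make_int4(i, 0, 0, 0);  // placeholder values
    }

    // 3. Only now create the device vector. Its storage is allocated on
    //    the device selected in step 1.
    thrust::device_vector<int4> zIndices_d = zIndices_h;

    // 4. Host-side element access. Thrust copies the single element
    //    back from the device behind the scenes.
    int4 first = zIndices_d[0];
    std::cout << first.x << std::endl;

    return 0;
}
```

With the two steps reversed (vector created first, cudaSetDevice called afterwards), the vector's memory stays on the previously current device, usually device 0, while later kernels and copies run against the newly selected one. That is consistent with the out-of-bounds errors seen only on devices 1, 2, and 3.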
