I’m a bit confused. I’m working on a project for which I have a system with two GPUs: a Quadro 600 for display purposes and a GTX Titan for computing.
I want to store data in constant memory (on the Titan) using cudaMemcpyToSymbol, and my code seems to work properly. That is: I can write data and read it back, the input data matches the output, and CUDA doesn’t return any errors.
Later I found out I forgot to select one of the GPUs (the Titan), so I’m very surprised the code worked at all. Where does the data go?
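For context, the relevant part of my code is roughly the sketch below (the symbol name, size and device index are simplified placeholders); the cudaSetDevice call at the top is the one I had forgotten:

```
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder constant-memory symbol; my real data is larger.
__constant__ float d_coeffs[256];

int main()
{
    // This is the call I had left out: without it, the runtime
    // silently uses device 0 (the default device).
    cudaSetDevice(1);  // index of the Titan on my system (placeholder)

    float h_coeffs[256];
    for (int i = 0; i < 256; ++i) h_coeffs[i] = (float)i;

    // Copy to constant memory, then read it back to verify.
    cudaMemcpyToSymbol(d_coeffs, h_coeffs, sizeof(h_coeffs));

    float h_check[256];
    cudaMemcpyFromSymbol(h_check, d_coeffs, sizeof(h_check));

    printf("round trip ok: %d\n", h_check[255] == h_coeffs[255]);
    printf("last CUDA error: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}
```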
Note that if you have several GPUs, you should not rely on the driver’s ordering to select the “best” card. On a system with 4 GPUs, I see this ordering:
My guess is that the driver tries to ensure that CUDA device 0 is not the display device when there is more than one device available, but the ordering beyond that is arbitrary.
I don’t recall seeing the device order change after switching versions of CUDA (this particular computer has had CUDA 4-5.5 on it), but given the lack of specification, it is probably best to assume that it could change.
In my code I have a couple of lines to ensure my calculations are performed on the Titan. Basically, I query the device properties and select the device named “Titan”.
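For what it’s worth, the selection logic is just a few lines along these lines (error checking omitted; “TITAN” is the substring I match against):

```
#include <cstring>
#include <cuda_runtime.h>

// Pick the first device whose name contains the given substring,
// e.g. "TITAN"; returns the chosen device index, or -1 if none matches.
int selectDeviceByName(const char* substring)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        if (strstr(prop.name, substring) != NULL) {
            cudaSetDevice(dev);
            return dev;
        }
    }
    return -1;  // no matching device found
}
```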
Maybe I should look into using NVML.
What I find surprising is that there is no straightforward way to select a device based on a unique serial number or something similar.
Also, CUDA deciding which device is fastest and making it device 0 seems a bit arbitrary. What happens if I have, for example, a system with multiple Titans? Which of them will then become device 0? And is the enumeration the same every time I start the system?
I think my example above shows that CUDA does not map the fastest card to device 0 in general. The GTX 680 is, for many (but not all) applications, a better card than the GTX 580, yet it is device 2.
The device properties structure is pretty extensive, and should let you create a device selection heuristic appropriate for your application based on compute capability, memory size, # of CUDA cores, whether or not a display is connected, etc. I don’t think you’ll need to use NVML to pick a CUDA device.
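A minimal sketch of such a heuristic might look like the following (the particular criteria here, a minimum compute capability followed by the largest global memory, are just placeholders for whatever matters to your application):

```
#include <cuda_runtime.h>

// Hypothetical heuristic: among devices with compute capability >= 3.0,
// pick the one with the largest amount of global memory.
int pickDeviceByHeuristic()
{
    int count = 0, best = -1;
    size_t bestMem = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        if (prop.major < 3) continue;            // skip older architectures
        if (prop.totalGlobalMem > bestMem) {     // prefer more memory
            bestMem = prop.totalGlobalMem;
            best = dev;
        }
    }
    return best;  // -1 if nothing qualified
}
```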
There does not appear to be a unique card serial number in the device property structure, but it does have fields for the PCI Express “coordinates” (domain, bus, device) of the device, which should be stable as long as the card is not moved to a different slot in the computer.
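If you already know the PCI address of the card you want, the runtime can also look it up directly; a short sketch (the bus-ID string below is just an example, substitute the one reported by deviceQuery or nvidia-smi):

```
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Resolve a device index from its PCI address.
    int dev = -1;
    if (cudaDeviceGetByPCIBusId(&dev, "0000:03:00.0") == cudaSuccess) {
        cudaSetDevice(dev);

        // The same coordinates are also exposed as device properties.
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("device %d: domain %d, bus %d, slot %d\n",
               dev, prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}
```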
You’re right about using the device properties to select the appropriate device. At the moment selection based on the device name suffices, but in the future I may have to resort to using the PCI bus ID etc. as well.
I’m curious, what parameter in the device properties structure indicates whether a display is connected to the device or not?
I agree with seibert that it is more probable that device IDs are assigned according to the “physical location” of the device rather than to performance heuristics, contrary to the answer at
where there is a code snippet to select the card with the largest number of multiprocessors; some CUDA SDK multi-GPU examples (p2p) also include code to make such a selection.
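The idea behind that snippet is roughly the following (my own sketch, not a verbatim copy of the SDK code):

```
#include <cuda_runtime.h>

// Pick the device with the largest number of multiprocessors,
// in the spirit of the SDK's gpuGetMaxGflopsDeviceId helper.
int maxMultiprocessorDevice()
{
    int count = 0, best = 0, bestSMs = -1;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        if (prop.multiProcessorCount > bestSMs) {
            bestSMs = prop.multiProcessorCount;
            best = dev;
        }
    }
    return best;
}
```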
Today I installed a PC with a Tesla C2050 card for computation and an old 8084 GS card for visualization, and tried switching their positions between the first two PCI-E slots. Using deviceQuery, I noticed that GPU 0 is always the card in the first PCI-E slot and GPU 1 is always the card in the second PCI-E slot. I do not know whether this is a general rule, but it shows that, at least on my system, GPUs are numbered according to their position, not their “power”.