Max number of CUDA devices


I have a rather simple question that people at NVIDIA can surely answer.

Say we have a very large motherboard, or a backplane with plenty of PCIe slots (talking about 10+), we also have 10 GTX 1080 Ti cards, and the OS can detect all of these devices. Is there some hardcoded limit in the NVIDIA drivers on how many of these devices we can access for CUDA work?

Thanks for the answers.

This post is from a while ago, back when Tim Murray was at NVIDIA, but 16 was the most they had tested at that time. (I’m assuming that even with the newer drivers since then, you should be able to get 10 going fine.)

Overall, though, it mainly depends on the BIOS.

There is no hardcoded limit at or in the vicinity of 10. There are examples of systems that I am aware of where people have gotten in the range of 10-16 GPU devices working correctly. However, it’s not trivial: the system BIOS can be a challenge, in addition to what you mention.
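Once the OS enumerates the cards, you can check what the driver actually exposes to CUDA with the runtime API. A minimal sketch (untested on a 10+ GPU box; the PCI fields let you match each CUDA device against what the BIOS/OS enumerated, e.g. in lspci output):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA sees %d device(s)\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // Domain/bus/device IDs identify the card's slot on the PCI tree,
        // which helps spot devices the BIOS dropped during enumeration.
        printf("  %d: %s (PCI %04x:%02x:%02x)\n", i, prop.name,
               prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}
```

If the count here is lower than what lspci (or Device Manager) shows, the missing devices were enumerated by the OS but not brought up by the driver, which narrows the problem down.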

Also, P2P is not supported beyond 8 or 9 devices at any given instant. This is documented in the programming guide.
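Rather than relying only on the documented ceiling, you can probe per-pair P2P support on a given box. A rough sketch (assumes nothing beyond the standard runtime API; even when all devices enumerate, cudaDeviceCanAccessPeer may report 0 for pairs outside the supported P2P group):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    // Print the full P2P accessibility matrix.
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int ok = 0;
            cudaDeviceCanAccessPeer(&ok, i, j);
            printf("P2P %d -> %d: %s\n", i, j, ok ? "yes" : "no");
        }
    }
    return 0;
}
```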

So, if I understand correctly, there should be no limits imposed by NVIDIA, regardless of OS (Windows or Linux) and regardless of GPU model (they do not have to be Teslas)?

I am aware of BIOS issues. I am in the process of finding the right motherboard that would work with PCIe expansion backplanes and make devices available on all host PCIe slots as well as the PCIe slots on the backplane. Pretty much all cheap desktop and even expensive gaming motherboards top out at about 6 devices (if more devices are plugged in, the machine won’t boot). I am about to try an ASUS workstation board (with 7 host PCIe slots) now and see if that makes any difference.

If anyone has more information about suitable motherboards, it would greatly help my endeavours. Thanks!

That’s not what I said. I said there are no limits imposed by NVIDIA in the 10-16 range. There are certainly (mostly unpublished) software/driver limits, which may indeed vary by OS. But I’m not aware of any that are at or below 16.

And to be clear, 16 means 16 distinct PCI enumerated devices. A K80 for example consists of two distinct PCI enumerated devices. I’m not suggesting you can put 16 K80 GPUs in a system with any success.

LinusTechTips (a YouTube channel) built a 10-GPU (NVIDIA) system where all of the cards worked in conjunction.
That was using server-grade hardware, though, and they used VMs to split it into 10 separate clients.
Still, the host had to operate all of the GPUs, so in theory you should be able to do the same using them as CUDA devices.
The video is called “8 (or is it 10?) Gamers, “1” CPU- Taking it to the Next Level!”
If you look it up you can learn more about it.

This guy claimed that he got 18 GPUs to work in one system:

Note that one issue with many GPUs in one system is the PCIe interconnect: commonly used CPUs don’t offer more than 40 PCIe lanes, so lanes per GPU drop quickly as you add cards.
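On a running system you can see how the lanes are actually shared by dumping the GPU topology matrix; nvidia-smi ships with the driver, so this works on any box where the cards are up:

```shell
# Show the GPU/GPU and GPU/CPU connection matrix
# (PIX/PXB = via PCIe switch, PHB = via host bridge, SYS = across sockets)
nvidia-smi topo -m
```

GPUs connected through a PLX switch show up as PIX/PXB rather than PHB, which tells you whether your expansion backplane is wired the way you expect.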

I can confirm that motherboard ASUS X99-E-10G WS cannot enumerate more than 8 GPUs.

I will try soon with ASUS Z10PE-D8 WS.

The 1080 Ti does not allow TCC mode. It’s impossible to know what does and doesn’t work, because motherboard manufacturers and Microsoft/Linux don’t provide this information. Even the experts here can’t give you clean answers!

Although people here say that it’s the system BIOS that matters, it has been shown that Linux can run more GPUs on consumer motherboards: Windows tops out at 6, Linux at 8 GPUs.

Unfortunately, because nothing is published about any of this, which is extremely frustrating when designing GPU farms, you need to distribute your GPUs over the network.

I suggest you look at Supermicro boards with PLX chips, and then buy expansion boards, each of which has a PLX chip (e.g. Amfeltec).

If you work out how to do it, please let us know.

The ASUS X99-E-10G WS works fine with 8 various desktop GeForce GTX cards under Windows.

I already have Amfeltec’s expansion backplane with PLX chip.

The issue is when I plug in a 9th card: the BIOS cannot get past code 95 - it simply reboots the machine and enters a loop of initializing up to code 95, rebooting, and so on…

I was looking into one Supermicro board that has 10 open-ended 8-lane PCIe slots, but some of those slots don’t have enough clearance (the RAM modules are in the way), so I would have to use risers for at least some of the cards, which is not as convenient as plugging them directly into the board, as the ASUS motherboards allow me to do.