Max # GPUs supported?

I was wondering whether there is a maximum number of GPUs that CUDA supports. For example, if I plug 8 dual cards (16 GPUs) into the same server, is that going to function? I seem to recall either an 8-GPU or a 16-GPU limit for some reason, but maybe I imagined it.

What I’m looking at:

Reason for not using a cluster: a ton of inter-GPU communication that would be severely hurt by the latency and bandwidth of a cluster interconnect.

I’m pretty sure the current limit is 8 CUDA-capable devices per host.
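For what it’s worth, you can check what the runtime actually sees on a given box with `cudaGetDeviceCount` rather than guessing at the limit. A minimal sketch (standard CUDA runtime API calls, nothing system-specific assumed):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices visible: %d\n", count);

    // List each device the runtime enumerates
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  device %d: %s\n", i, prop.name);
    }
    return 0;
}
```

Each GPU on a dual card shows up as its own device here, so 8 dual cards should enumerate as 16 entries if the driver handles them all.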

Take a look at this thread:

That’s very impressive.

As an aside: what sort of bandwidth might you expect for a `cudaMemcpy` between the two halves of a dual-GPU card? As far as I can tell it shouldn’t need to transfer over the PCIe interface.
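The simplest way to answer that is to measure it. Here’s a rough bandwidth probe between device 0 and device 1 (assumed here to be the two halves of the dual card; check the enumeration order on your system). It uses `cudaMemcpyPeer`, which needs a CUDA 4.0+ toolkit; if direct peer access isn’t available, the runtime stages the copy through host memory, so the number reflects whatever path is actually taken:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 64 << 20;  // 64 MiB test buffer
    void *src = nullptr, *dst = nullptr;

    // One buffer on each GPU
    cudaSetDevice(0);
    cudaMalloc(&src, bytes);
    cudaSetDevice(1);
    cudaMalloc(&dst, bytes);

    // Warm-up copy so setup costs aren't timed
    cudaMemcpyPeer(dst, 1, src, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int reps = 10;
    cudaEventRecord(start);
    for (int i = 0; i < reps; ++i)
        cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("~%.2f GB/s device 0 -> device 1\n",
           (double)bytes * reps / 1e9 / (ms / 1e3));

    cudaFree(dst);
    cudaSetDevice(0);
    cudaFree(src);
    return 0;
}
```

Comparing this figure against your measured host-to-device bandwidth should tell you whether the copy is crossing PCIe or taking a shorter path.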