Multiple GPUs of different types

immo · July 15, 2008, 2:28pm

Hello,

I am developing my application so that it runs on multiple CUDA devices of different types; a workload divider makes sure that the fastest GPU gets the largest share of the work. It all appeared to work quite well until a strange access violation error (cudaError_enum at memory location…) occurred. I have now done some reading (I know, I should have done it before), and the manual actually states:

The use of multiple GPUs as CUDA devices by an application running on a multi-GPU system is only guaranteed to work if these GPUs are of the same type.

The program seemed to work fine in the beginning but now fails for larger inputs. I have tried to find the source of the error, but it occurs at random times (after 5 minutes or after 2 hours), although it seems to fail in the same part of the code.

Has anybody tried to create an application that runs on multiple GPUs of different types, and did you encounter any problems? I don't want to chase a bug that exists only because I am doing something that is not allowed. Could this error really be the result of my program running on multiple GPUs of different types? And might this actually be allowed or supported in the future?

Any feedback is appreciated,

Kevin
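
For readers wondering what a workload divider like the one mentioned above might look like, here is a minimal host-side sketch. It is not the code from this post: the function name divide_work and the idea of using one integer weight per device (for example, a throughput figure from a short benchmark) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split total_items across devices in proportion to weights.
// weights[i] is a hypothetical relative speed for device i (any positive
// integer scale works, e.g. items per second from a short benchmark).
std::vector<std::uint64_t> divide_work(std::uint64_t total_items,
                                       const std::vector<std::uint64_t>& weights) {
    assert(!weights.empty());
    std::uint64_t weight_sum = 0;
    for (std::uint64_t w : weights) weight_sum += w;
    assert(weight_sum > 0);

    std::vector<std::uint64_t> shares(weights.size());
    std::uint64_t assigned = 0;
    std::size_t fastest = 0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Integer division rounds down, so the shares never exceed total_items.
        // Assumes total_items * weights[i] fits in 64 bits.
        shares[i] = total_items * weights[i] / weight_sum;
        assigned += shares[i];
        if (weights[i] > weights[fastest]) fastest = i;
    }
    // Hand the rounding remainder to the fastest device.
    shares[fastest] += total_items - assigned;
    return shares;
}
```

With weights {3, 1}, for instance, 1000 items are split into 750 and 250. Each share would then be handed to the host thread that drives the corresponding GPU.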

BarsMonster · July 18, 2008, 12:21am

What is the maximum execution time of your kernels on the slowest GPU?

immo · July 18, 2008, 7:11am

It falls well below the 5-second limit, if it is the watchdog timer you are referring to. I actually solved the problem yesterday, and it turned out to have nothing to do with running on multiple GPUs of different types. I launched too many threads, which were apparently indexing an array out of bounds, strangely making it past a basic indexing guard (if index >=0 && index < maxindex)… For some reason this failed consistently on only one specific type of GPU, presumably because it has less memory, so that the memory blocks in use are packed more tightly together, which leads to an access violation more quickly. Regardless, it seems that running an application on multiple GPUs of different types, while not supported, works perfectly fine.

Regards,

Kevin
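
As an illustration of the launch-size issue described above, here is a small host-side sketch of the usual pattern: round the block count up so every element is covered, then mask off the surplus threads with a bounds guard. It is plain C++ standing in for a 1D kernel launch, not the code from this thread; blocks_for and simulate_launch are hypothetical names. Note that with an unsigned index a check such as index >= 0 is always true, so only the upper bound does any work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks needed to cover n elements (ceiling division).
std::size_t blocks_for(std::size_t n, std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// Host-side stand-in for a 1D kernel launch. Each simulated thread computes
// its global index as block * threads_per_block + thread and writes only if
// that index is in range. Returns the number of threads the guard masked off.
std::size_t simulate_launch(std::vector<int>& data, std::size_t threads_per_block) {
    const std::size_t blocks = blocks_for(data.size(), threads_per_block);
    std::size_t skipped = 0;
    for (std::size_t block = 0; block < blocks; ++block) {
        for (std::size_t thread = 0; thread < threads_per_block; ++thread) {
            const std::size_t index = block * threads_per_block + thread;
            if (index < data.size()) {
                data[index] = 1;   // in range: do the work
            } else {
                ++skipped;         // surplus thread: must not touch memory
            }
        }
    }
    return skipped;
}
```

For 1000 elements and 256 threads per block this launches 4 blocks (1024 threads), of which 24 are masked off by the guard.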

AndreiB · July 18, 2008, 8:22am

Yes, I’ve created an application that runs on multiple GPUs. It runs fine on different cards (tested 8800GTX+8600GTS; 8800GTX+8800GT; C870+8600GTS).

Can someone (maybe from NVIDIA) explain what the potential problems with different GPUs are?
