I'm working on porting an existing application to run on multiple GPUs. It already works on a single GPU, but I have a 9800 GTX and a GTX 280 in this PC at work, so I might as well see if multi-GPU can help speed things up.
I've pretty much copied the multi-GPU example in the SDK, but I get (way) worse performance.
The code used to run on the GTX 280 alone, and the profiler reported 430 ms. As a first try, I've split the workload in half between the GTX 280 and the 9800 GTX.
The profiler now reports 1.5 s for the half of the work done on the GTX 280. I've also tried limiting the number of GPUs used to 1, so basically all the work is still done on the GTX 280 but through a spawned host thread, and the computation time logically rose to around 3 s.
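For reference, here is a simplified sketch of what my per-GPU worker thread looks like, loosely following the SDK multi-GPU example (the struct and function names are placeholders, not my actual code, and the kernel launch is elided):

```cuda
// Sketch of a per-GPU worker: each host thread drives one device.
// In this CUDA version, the first CUDA call made by a thread creates
// that thread's context on its device, which is itself expensive.
#include <cuda_runtime.h>
#include <pthread.h>
#include <stddef.h>

struct GpuPlan {
    int    device;   // which GPU this thread drives
    float *h_data;   // this thread's slice of the input
    int    n;        // number of elements in the slice
};

static void *worker(void *arg)
{
    struct GpuPlan *plan = (struct GpuPlan *)arg;

    cudaSetDevice(plan->device);
    cudaFree(0);     // force context creation here, before any timing starts

    float *d_data;
    cudaMalloc((void **)&d_data, plan->n * sizeof(float));
    cudaMemcpy(d_data, plan->h_data, plan->n * sizeof(float),
               cudaMemcpyHostToDevice);

    /* ... launch the kernel on this slice and copy the result back ... */

    cudaFree(d_data);
    return NULL;
}
```

One thing I'm wondering: if the context creation and thread spawn end up inside the timed region, that alone could explain a jump from hundreds of milliseconds to seconds even with a single GPU.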
I don't get why this is so much slower than the non-threaded version where all the work is done on the GTX 280. Any ideas?
In case it matters, I'm using constant memory and textures defined in another file, and each host thread populates both.
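Since constant memory and texture references live per device, each thread has to set them up again on its own GPU after cudaSetDevice(). Roughly what I mean (placeholder names, simplified):

```cuda
// Constant memory and texture references are per-device state, so
// each host thread must populate them on the device it owns.
__constant__ float c_coeffs[16];

texture<float, 1, cudaReadModeElementType> texInput;

// Must be called from the thread that owns this device's context,
// i.e. after that thread has called cudaSetDevice().
void setupDeviceState(const float *h_coeffs, cudaArray *d_array)
{
    cudaMemcpyToSymbol(c_coeffs, h_coeffs, sizeof(c_coeffs));
    cudaBindTextureToArray(texInput, d_array);
}
```

Could the fact that these are defined in another compilation unit cause problems here?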