GTX295 multi GPU programming

I am new to CUDA.
When I run a CUDA program, Windows hangs.
I think the program may be using all of the GPU's resources.

I want to run the program on the other GPU in the GTX295.
How can I do that?
I am using CUDA 2.2.

Check out the SimpleMultiGPU project in the CUDA SDK for a start.

I can only list one device on my computer.
I have one GTX295 card.
It should have two GPUs inside.

You need to disable multi-GPU mode in the NVIDIA Control Panel to use both GPUs :thumbup:

Just one side question: are there techniques (SLI/drivers …) that let the CUDA system treat multiple GPU devices (on one or multiple PCIe slots) in a PC (or a cluster node) as a single virtual CUDA device whose computing capability is the combination of all component devices? That would make manual inter-GPU scheduling in CUDA programming much easier.

I don't think so. I have 2 GTX 295 cards, so I work with 4 GPUs. I use multithreading: I launch four tasks, transfer each result back to the host with cudaMemcpy, and then combine the results on the host.

Thanks, x248

I’ve realized that the main barrier is combining the global memories. Each GPU has its own memory, and making all global memories transparently accessible would require intensive work on the CPU side.

BTW, does SLI help with anything in CUDA?

I had already set that.

But cuDeviceGetCount(int *) still reports only 1 device.
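A quick way to check what CUDA itself sees is to enumerate the devices directly. The following is only a minimal sketch, not code from the SDK: it uses the runtime API calls cudaGetDeviceCount and cudaGetDeviceProperties, and the helper name visibleDeviceCount is just an illustrative choice. If you stay with the driver API instead, cuInit(0) has to be called before cuDeviceGetCount.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of CUDA devices the runtime can see, or -1 on error.
int visibleDeviceCount() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return -1;
    }
    return count;
}

int main() {
    int count = visibleDeviceCount();
    if (count < 0) return 1;
    std::printf("CUDA sees %d device(s)\n", count);

    // Print a few properties per device so the two halves of a dual-GPU card can be told apart.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        std::printf("  device %d: %s, %d multiprocessors, %zu MB global memory\n",
                    dev, prop.name, prop.multiProcessorCount,
                    prop.totalGlobalMem / (1024 * 1024));
    }
    return 0;
}
```

If this still lists a single device on a GTX295, the limitation is probably below CUDA (driver, control panel, or motherboard settings) rather than in the program.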

As for SLI, I don't think so. I read that you have to work without it; otherwise CUDA sees only one GPU, and it really is just one GPU, not the sum of them.

You should try to structure your program so that it does only one data transfer at the start and one at the end.

You split your program into four equivalent tasks, one per GPU.

For each GPU, you transfer the data into global memory.

After that, try to work only with registers, shared memory, or cached constant memory, which are a lot faster than global memory.

When you have the result you need from a thread (or a block), you put it in an array in global memory.

Only at the end, after all your tasks have finished, the host does one transfer of everything to host memory.

Then you do the final calculation on the host.
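The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions, not the SDK's code: it sums an array, uses one host thread per GPU via C++11 std::thread (which needs a much newer toolkit than CUDA 2.2, where you would use platform threads as the SDK samples do), and the names blockSum, sumOnDevice and multiGpuSum are made up for the example. Error checking is left out to keep it short.

```cuda
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

constexpr int THREADS = 256;  // threads per block; must be a power of two for this reduction

// Each block reads its elements from global memory once, reduces them in
// shared memory, and writes ONE partial sum back to global memory.
__global__ void blockSum(const float* in, float* partial, int n) {
    __shared__ float cache[THREADS];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) cache[threadIdx.x] += cache[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = cache[0];
}

// Runs in its own host thread: one upload, one kernel, one download for device `dev`.
void sumOnDevice(int dev, const float* data, int n, double* result) {
    *result = 0.0;
    if (n <= 0) return;
    cudaSetDevice(dev);
    int blocks = (n + THREADS - 1) / THREADS;
    float *dIn = nullptr, *dPartial = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMemcpy(dIn, data, n * sizeof(float), cudaMemcpyHostToDevice);   // transfer at the start
    blockSum<<<blocks, THREADS>>>(dIn, dPartial, n);
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);                                 // transfer at the end
    for (float p : partial) *result += p;
    cudaFree(dIn);
    cudaFree(dPartial);
}

// Splits `data` into one contiguous slice per GPU and does the final calculation on the host.
// Assumes gpus >= 1.
double multiGpuSum(const std::vector<float>& data, int gpus) {
    int n = static_cast<int>(data.size());
    int chunk = (n + gpus - 1) / gpus;
    std::vector<double> results(gpus, 0.0);
    std::vector<std::thread> workers;
    for (int g = 0; g < gpus; ++g) {
        int begin = std::min(g * chunk, n);
        int len = std::min(chunk, n - begin);
        workers.emplace_back(sumOnDevice, g, data.data() + begin, len, &results[g]);
    }
    for (std::thread& w : workers) w.join();  // wait until every GPU task has finished
    double total = 0.0;
    for (double r : results) total += r;
    return total;
}

int main() {
    int gpus = 0;
    if (cudaGetDeviceCount(&gpus) != cudaSuccess || gpus < 1) {
        std::printf("no CUDA device found\n");
        return 1;
    }
    std::vector<float> data(1 << 20, 1.0f);
    std::printf("sum over %d GPU(s) = %.0f\n", gpus, multiGpuSum(data, gpus));
    return 0;
}
```

On Linux you may need to pass -Xcompiler -pthread to nvcc for std::thread. The point of the layout is that each GPU touches host memory only twice, and everything in between stays in shared memory and registers.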

If you absolutely need intense work from the CPU, then your program is not suited to the GPU, and maybe it is better to stay on the CPU. A GPU is not a better CPU: it is different. Some programs can get a factor of x1000 :thumbup: , while others will see no gain :unsure: .

If you get a gain with one GPU, you should get x2 with 2 GPUs if you do it the way I described. :thumbup:

When I bought the GTX295 I had the same problem. :">

I had to update my motherboard's driver (my chipset was an nForce 680 SLI), and after that it was OK. :thumbup:

The problem was that my motherboard did not see the 2 GPUs with the old driver.

Go into your system properties and you should see 2 GPUs listed under “graphic cards” :whistling:

For me, the link for the upgrade was:…tional_whql.exe, :mellow:

but of course you have to check what applies to your motherboard and your system.

Maybe you need to configure your BIOS (e.g. disable SLI, enable multi-GPU).

It wouldn’t really be possible (or at least feasible) to let all the GPUs transparently pool their global memories like you suggested. Just think about the access latencies! If that were abstracted away from developers, it could become a source of performance issues for those new to CUDA.

However, there is an idea I’ve mentioned before, which would be some kind of API call that lets you pass a kernel (and an array of pointers to memory locations) to multiple cards at once for simultaneous multi-GPU launches. Besides that, if NVIDIA ever makes DMA to/from the cards an available feature in CUDA, I don’t see why you couldn’t explicitly DMA between devices if you needed that for your multi-GPU programs.

Thanks for your discussion.

Access latencies would not be too serious for computing-intensive jobs. What I am talking about is the scalability of CUDA programs. With the current CUDA support, programming multiple GPUs is much harder than programming a single GPU. Unfortunately, they cannot put as many cores as we like on a single GPU package.


I don't think the BIOS can set this.

But I will try it.


Did you upgrade your driver?
I am sending you a screenshot of the “control manager” of my system showing the graphics cards.
I don’t know the English translation, so I think you will recognize it from the screenshot.

Here I have 2 GTX295 cards, so you will see 4 GPUs.

So in your case, you should see 2 GPUs.

If that is not the case, it is not a CUDA problem, but a problem between the motherboard and the GTX295.

I have one GTX295.

The CUDA program finds only one GPU.

If I have two or more GPUs, how can I make them do different computations at the same time? And how do I sync them?

I don’t understand; it seems to be OK in your control panel.

Try installing the 2.3 beta: see this link…&hl=gtx+295 and tmurray's remark.

Maybe this version will solve your problem :rolleyes: , because it has something special for the GTX295.

You should look at the multithreaded example in the SDK.
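As a rough idea of how two GPUs can run different computations at the same time and then be synchronized, here is a small sketch. It is not the SDK example, and it assumes a newer toolkit (CUDA 4.0 or later) where a single host thread can switch devices with cudaSetDevice; with CUDA 2.x you need one host thread per GPU instead. The kernels scale and shift and the function runDifferentKernels are hypothetical names, and error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Two unrelated kernels, one for each GPU.
__global__ void scale(float* v, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= k;
}

__global__ void shift(float* v, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] += k;
}

// Runs scale() on device 0 and shift() on device 1, then waits for both.
// Returns false if fewer than two devices are visible or an input is empty.
bool runDifferentKernels(std::vector<float>& a, std::vector<float>& b) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 2) return false;
    if (a.empty() || b.empty()) return false;

    std::vector<float>* host[2] = {&a, &b};
    float* dev[2] = {nullptr, nullptr};
    const int threads = 256;

    for (int g = 0; g < 2; ++g) {  // upload each input to its own GPU
        size_t bytes = host[g]->size() * sizeof(float);
        cudaSetDevice(g);
        cudaMalloc(&dev[g], bytes);
        cudaMemcpy(dev[g], host[g]->data(), bytes, cudaMemcpyHostToDevice);
    }

    // Kernel launches return immediately, so the second launch is issued
    // while the first GPU is still working.
    int na = static_cast<int>(a.size()), nb = static_cast<int>(b.size());
    cudaSetDevice(0);
    scale<<<(na + threads - 1) / threads, threads>>>(dev[0], na, 3.0f);
    cudaSetDevice(1);
    shift<<<(nb + threads - 1) / threads, threads>>>(dev[1], nb, 5.0f);

    for (int g = 0; g < 2; ++g) {  // sync point: block until each GPU is done
        cudaSetDevice(g);
        cudaDeviceSynchronize();
    }

    for (int g = 0; g < 2; ++g) {  // download results and release device memory
        cudaSetDevice(g);
        cudaMemcpy(host[g]->data(), dev[g], host[g]->size() * sizeof(float),
                   cudaMemcpyDeviceToHost);
        cudaFree(dev[g]);
    }
    return true;
}

int main() {
    std::vector<float> a(1 << 20, 1.0f), b(1 << 20, 1.0f);
    if (!runDifferentKernels(a, b)) {
        std::printf("need at least 2 CUDA devices\n");
        return 1;
    }
    std::printf("a[0] = %.1f, b[0] = %.1f\n", a[0], b[0]);
    return 0;
}
```

With one host thread per GPU, as in the SDK's multithreaded example, the sync point would be joining the threads instead of the explicit loop over devices.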