Device and memory: GTX 690


I am having a problem with my GTX 690.
I use the cuSPARSE and cuBLAS libraries, and I don't know how to make my program work with the two devices, or how to manage the private memory of each device.

Thanks, Ana.

Use cudaSetDevice(0 or 1) to specify which card to use; otherwise it will always use the one with id 0. The memory is separate: even though the two GPUs are in a single package, they act as two independent cards. If you want your program to use both at the same time you will have to do a little more work. For your card the only way is by using streams.
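A minimal sketch of what "the memory is separate" means in practice (array sizes and variable names here are just illustrative): allocations are issued against whichever device is current at the time of the call, so each GPU of the 690 gets its own pointer.

```cuda
// Sketch: each GPU of the GTX 690 has its own memory, so allocations
// must be issued after selecting that device with cudaSetDevice().
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    int n = 0;
    cudaGetDeviceCount(&n);          // a GTX 690 should report 2 devices
    printf("devices: %d\n", n);

    float *d0 = NULL, *d1 = NULL;

    cudaSetDevice(0);                // all following calls target GPU 0
    cudaMalloc(&d0, 1024 * sizeof(float));   // d0 lives in GPU 0's memory

    cudaSetDevice(1);                // switch: following calls target GPU 1
    cudaMalloc(&d1, 1024 * sizeof(float));   // d1 lives in GPU 1's memory

    cudaSetDevice(0); cudaFree(d0);  // free each pointer on its own device
    cudaSetDevice(1); cudaFree(d1);
    return 0;
}
```

Note that d0 is not usable while device 1 is current; each pointer belongs to the device that was selected when it was allocated.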

I am specifying the device with cudaSetDevice(0 or 1). However, I think something strange is happening. Every time I run my program, the result values overlap. I think the memory is keeping some residue. Do I have to specify any parameter for the memory too? Every time I run my program I get different values.

Thanks, Ana.

Memory copies will be done on whatever GPU is currently set, so you need to copy the respective data you're using to each GPU separately.
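Since you are using cuBLAS, the same rule applies to library handles: create the handle and the data while the intended device is current. A hedged sketch (the vector and its size are made up for illustration):

```cuda
// Sketch: a cuBLAS handle is bound to the device that was current when
// cublasCreate() was called, so select the GPU first, then create the
// handle and allocate/copy the data it will operate on.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    float host[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float *dev = NULL;

    cudaSetDevice(0);                       // target GPU 0 for everything below
    cublasHandle_t handle;
    cublasCreate(&handle);                  // handle is now tied to GPU 0

    cudaMalloc(&dev, sizeof(host));
    cublasSetVector(4, sizeof(float), host, 1, dev, 1);  // host -> GPU 0

    float result = 0.0f;
    cublasSasum(handle, 4, dev, 1, &result);  // sum of |x_i|, runs on GPU 0
    printf("asum = %f\n", result);

    cudaFree(dev);
    cublasDestroy(handle);
    return 0;
}
```

To use the second GPU of the 690 as well, you would repeat the same sequence after cudaSetDevice(1) with a second handle and a second set of device pointers.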

This might help:

Please tell us more about what you are doing. I assumed you have some basic understanding; otherwise I suggest you read some tutorials or the first 3-4 chapters of the book CUDA by Example.

I work with the cuSPARSE and cuBLAS libraries. I understand several of the functions, and I use cudaMalloc and cudaMemcpy every time, as well as cuBLAS and cuSPARSE calls. My old GPU (GT 240) doesn't have this overlapping/residue problem. However, my new GPU (GTX 690) doesn't work well: sometimes the results are OK, sometimes they aren't. So I think the problem is in the data transfers. I use only one device, not two, and I do set it with cudaSetDevice. Is there another parameter that I have to specify?

If you already have a program which worked on the GT 240 but does not work on the 690, there might be some problem. I suggest running it several times with cudaSetDevice(0) and then with cudaSetDevice(1), and comparing with another card on which your program works, using the same compiler flags. If the results change between runs, there might be a problem with the card's memory.

I had a similar problem:

My code originally worked flawlessly on a GTX 280. When I got a GTX 690 and ran the same code (recompiled) on it, the results were unstable, that is, they changed from run to run, but only in the third or higher significant figures. I at first thought I had a bad card, and so I returned it and got a GTX 670 instead. But the problem was still present. After examining my code carefully, I found there was a race condition in some obscure place. I fixed the bug [by adding __syncthreads()] and then it ran flawlessly on the GTX 670. So, I returned the GTX 670 and got a new GTX 690. Again, the code ran flawlessly.
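For readers who haven't hit this before, here is an illustrative sketch (not the poster's actual code) of the kind of race a missing __syncthreads() causes, and why the symptom is small run-to-run differences rather than obviously wrong output:

```cuda
// Sketch: a shared-memory reduction. Without the __syncthreads() calls,
// some threads read partial sums before other threads have written them,
// so the result varies slightly from run to run depending on timing.
__global__ void sum_reduce(const float *in, float *out) {
    __shared__ float s[256];
    int t = threadIdx.x;
    s[t] = in[blockIdx.x * blockDim.x + t];
    __syncthreads();                      // removing this creates the race

    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (t < stride) s[t] += s[t + stride];
        __syncthreads();                  // needed after every step as well
    }
    if (t == 0) out[blockIdx.x] = s[0];
}
```

The nondeterminism comes from warp scheduling: most of the time the threads happen to be in the right order, so the bug only shows up occasionally and in the low significant figures.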

I suggest you examine your code carefully to see if there is any race condition.

If you are suspecting a race condition, you might want to try the racecheck tool in cuda-memcheck:
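The invocation is a single command; the exact report format varies by CUDA version, and the binary name here is just a placeholder:

```
# Run your application under the racecheck tool; it reports shared-memory
# data races (read-after-write hazards, etc.) with source locations if the
# binary was compiled with -lineinfo.
cuda-memcheck --tool racecheck ./my_app
```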

Hi nasacort!

I work with the cuBLAS and cuSPARSE libraries, which are sets of functions provided by NVIDIA. I just call the functions with their parameters. In this case, I tried cudaThreadSynchronize() and the problem still continues. I don't write kernels myself. What do you think about this?


You might have to post the smallest possible code snippet that replicates the problem you’re experiencing. Try dumbing down your code until you find a series of calls that reproduces the incorrect output.
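While trimming the code down, it's also worth checking every return code; an unnoticed failure earlier in the program can look like "random" results later. A hedged sketch (the macro name and sizes are my own, not from any library):

```cuda
// Sketch: checking return codes from the CUDA runtime and cuBLAS can turn
// silently-wrong results into a concrete error message at the failing call.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define CUDA_CHECK(call) do {                                   \
    cudaError_t e = (call);                                     \
    if (e != cudaSuccess)                                       \
        fprintf(stderr, "CUDA error '%s' at %s:%d\n",           \
                cudaGetErrorString(e), __FILE__, __LINE__);     \
} while (0)

int main(void) {
    cublasHandle_t h;
    if (cublasCreate(&h) != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasCreate failed\n");
        return 1;
    }
    float *d = NULL;
    CUDA_CHECK(cudaMalloc(&d, 256 * sizeof(float)));
    CUDA_CHECK(cudaMemset(d, 0, 256 * sizeof(float)));
    CUDA_CHECK(cudaDeviceSynchronize());  // surfaces asynchronous errors
    CUDA_CHECK(cudaFree(d));
    cublasDestroy(h);
    return 0;
}
```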

The chance that this is a hardware problem is very small. Most likely, it is a software problem.