Strange results

Hi Everyone,

I have a piece of C# code that I’ve parallelized using CUDA. It does some matrix-vector multiplication and prints out the results. I then have a little program to compare the CPU results to the GPU results.

Now, I’ve got several generated matrices that I am testing this with. When I load the matrices in the following order, the last set of results does not match: 0, 1, 2, 3. However, if I load the matrices in a different order (specifically, 2, 1, 0, 3), all of the results match.

As far as I can tell, I am freeing all of the memory being used by CUDA and C, but I wonder: is there some sort of function that can “reset” my GPU to make sure there is nothing hanging around? I should also mention that each run is completely separate from the other runs - meaning, I run the program, which loads a matrix, does its CUDA magic, then exits; then I run the program again with the next matrix, and so on.

Any insight would be greatly appreciated.


You are probably using memory without initializing it somewhere. Unlike host memory, device memory is not cleared between program runs.
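A minimal sketch of that pattern (buffer name and size are illustrative, not from the original post): clear each device buffer right after allocating it, before any kernel reads from it.

```cpp
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    const size_t n = 1024;
    float *d_vec = NULL;

    // Memory returned by cudaMalloc is NOT guaranteed to be zeroed;
    // it may contain whatever a previous program (or a previous run
    // of this one) left behind on the device.
    cudaError_t err = cudaMalloc((void **)&d_vec, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Zero the buffer immediately after allocation, before any kernel
    // touches it - not at the end of the program before cudaFree.
    cudaMemset(d_vec, 0, n * sizeof(float));

    // ... launch kernels that read and write d_vec ...

    cudaFree(d_vec);
    return 0;
}
```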

Tera - thank you for your quick reply! I added a couple of cudaMemset calls to my program and that seems to have fixed the issue. Oddly, I had to add them to the end of my program - right before I call cudaFree in order to completely fix my problem. I am calling cudaMemset before I start to do anything on the card, so I would have thought that this would fix the issue. Any thoughts on that? Again, thanks for your reply.

That doesn’t seem to be the real fix. If clearing the memory just before freeing it fixes a problem, that means there still is some use of uninitialized memory somewhere that you just managed to paper over. Any other user of the GPU who does not clear memory before freeing it (i.e., basically everyone) could still cause your kernel to produce wrong results. This could even be the GUI, so you wouldn’t even have to start another program to suddenly get wrong results again.
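One way to hunt down the actual uninitialized read (assuming a toolkit that ships cuda-memcheck; the program name below is a placeholder) is the initcheck tool, which flags device-memory reads that were never preceded by a write:

```shell
# Reports each global memory read of data the program never wrote,
# along with the kernel name and the offending address.
cuda-memcheck --tool initcheck ./my_program
```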