I am currently working with a GTX 590 on Linux CentOS 5, and I have latency issues with my CUDA programs. First I will explain the problem, and then I will provide a simple example that shows the issue. If any GTX 590 owner has ever had this kind of problem and found a cause and/or a solution, I would be very interested to hear about it.
1) The Problem :
While measuring the processing time of one of my programs (to put it simply: it computes the mean value of all the pixels of an image), I found out that the CUDA processing sometimes takes much longer than usual. When measuring the time, latency spikes appear quite randomly.
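To give a rough idea of what the program does, here is a minimal sketch of the kind of kernel involved: a block-level sum reduction (this is not my actual code; ‘sumReduce’ and the launch configuration are just placeholder names, and my real program works on image data):

[code]
// Minimal sketch: block-level sum reduction in shared memory.
// The host then sums the per-block partial results and divides by
// the pixel count to get the mean. Assumes blockDim.x is a power of two.
__global__ void sumReduce(const float *in, float *partial, int n)
{
    extern __shared__ float sdata[];

    unsigned int tid = threadIdx.x;
    unsigned int i   = blockIdx.x * blockDim.x + threadIdx.x;

    // Load one element per thread (0 past the end of the data).
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction within the block.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1)
    {
        if (tid < s)
            sdata[tid] += sdata[tid + s];
        __syncthreads();
    }

    // One partial sum per block.
    if (tid == 0)
        partial[blockIdx.x] = sdata[0];
}

// Launched e.g. as:
// sumReduce<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_partial, n);
[/code]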
The first thing I did was to run the same program on other cards (GTX 460, GTX 480, GTX 560 Ti) and there was no problem at all, so for now I only see this issue with the GTX 590.
I also know this issue occurred with another GTX 590.
2) My config :
OS : Linux CentOS 5 (2.6.18-274.el5)
CPU : Core i7 950 (@3.3 GHz)
MB : Asus Rampage 3
PSU : Corsair 850W
Graphics card : GTX 590
GPU0 BIOS : 70.10.42.00.02
GPU1 BIOS : 70.10.42.00.02
NVIDIA driver : 280.13
3) A simple example :
My issue can be seen with a modified version of the ‘reduction’ program from the NVIDIA SDK.
The ‘reduction’ program is quite similar to my own program, so it makes a perfect basis. However, keep in mind that my problem occurred with every program I have tried so far (my own convolution program, the NPP convolution example, etc.).
For my example I made some quick modifications to the NVIDIA reduction file :
The NVIDIA reduction program performs 100 iterations of the ‘reduction’ algorithm, measures the overall time and then divides it to get the mean processing time. This averaging hides the latencies of the individual calls …
… so I made a modified version of the program (see the attached .cpp files) which measures the time of every single call to the ‘reduction’ algorithm (the maximum and minimum times are checked/recorded after each call, and displayed at the end). My loop is 1000 iterations long, to be sure the latency occurs (as I said, it happens randomly). A simplified sketch of this loop is shown below.
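Here is that sketch (not the attached file verbatim; ‘reduce_kernel’ and its launch configuration are placeholders for the SDK’s templated kernels):

[code]
#include <cstdio>
#include <cfloat>
#include <cuda_runtime.h>

// Placeholder for the SDK's reduction kernel (the real one is templated).
__global__ void reduce_kernel(const float *in, float *out, int n)
{
    // ... actual reduction here ...
}

void benchmark(float *d_in, float *d_out, int n)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    float minTime = FLT_MAX, maxTime = 0.0f;

    // 1000 iterations so the random latency spikes have time to show up.
    for (int i = 0; i < 1000; ++i)
    {
        cudaEventRecord(start, 0);
        reduce_kernel<<<64, 256>>>(d_in, d_out, n);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);             // wait for this call only

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop); // time of THIS call, in ms

        if (ms < minTime) minTime = ms;         // track the extremes instead
        if (ms > maxTime) maxTime = ms;         // of averaging them away
    }

    printf("minTime = %.3f ms, maxTime = %.3f ms\n", minTime, maxTime);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
}
[/code]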
I provided as an attachment (.png file) a quick summary of the NVIDIA SDK reduction program and of my modified version.
Here are some of the results I get :
- GTX 590 : minTime = 0.524 ms, maxTime = 2.307 ms
- GTX 560 Ti : minTime = 0.665 ms, maxTime = 0.697 ms
In a nutshell, with the GTX 590, 90~95% of the results roughly equal the minimum time (only a very slight difference), and the remaining results are ~2 ms higher than the minimum time. Those ‘peaks’ appear quite randomly throughout the run.
4) My questions :
Has anyone else had this issue ? If so, did you find a cause or a solution ?
Is there a link with LPC latency ? I read something about people having trouble watching videos on Windows, which “seemed” to be caused by LPC latency.
Is this a known problem that can’t be fixed ?
5) Attachments :
reduction_mod.png : a simple schematic showing the differences between the original ‘reduction’ program and my modified version.
reduction.cpp : the original ‘reduction’ program by NVIDIA, freshly taken from the SDK.
reduction_Modified.cpp : my modification of the NVIDIA program, which highlights the issue I have with the GTX 590. I put ‘MODIFICATION’ labels across the file to simplify comparison with the original file.