Is CUDA 7.5 supposed to work with Quadro M2000M?

Changyun · April 15, 2016, 7:59pm

I ran some tests and the GPU timer failed: the elapsed time it reported was an unreasonable value. The driver bundled with CUDA 7.5 is 353.90, which does not support the Quadro M2000M (support seems to have started with 354.56), so I manually updated my GPU driver to the latest, 362.13. I then picked the custom installation for CUDA 7.5 and skipped the Graphics Driver and GPU Deployment Kit components. Is this the correct way? And are there any known issues regarding the GPU’s timer?
My test is on Windows 7 64-bit.
Thanks.

Robert_Crovella · April 15, 2016, 8:39pm

Yes, your method for install sounds correct. Either keep the driver that was originally installed, or update to the latest. Then install CUDA while deselecting the driver install.

I have no idea what you are referring to by “GPU timer”. CUDA event based timing? clock() or clock64() based timing?

Windows WDDM can definitely interfere with getting sensible results from CUDA event based timing. Furthermore I have seen that CUDA event based timing can give strange results on Windows WDDM when you are timing host code (only - no CUDA calls).

Changyun · April 18, 2016, 4:47pm

Thanks for the reply, txbob.
My timer was CUDA event based timing, through cudaEventElapsedTime(), and the problem did indeed happen when measuring host code. The expected elapsed time is around 300ms, while CUDA event based timing returned something like 0.00112ms. The exact same code looks OK on my colleague’s Linux machine (although it is running a different graphics card). So it sounds consistent with your observation? Thanks.

Robert_Crovella · April 18, 2016, 4:56pm

It’s consistent with what I’ve seen, and I don’t have a ready explanation for it. On Linux, cudaEvent based timing seems to work fine even for host code. I think the same is true on Windows for device code. But for host-only code on Windows, I’ve seen similar odd results. I suspect WDDM command queue batching is involved in the explanation but I can’t go further than that.

My suggestion would be to use an ordinary Windows host-based timing function (e.g. QueryPerformanceCounter) for timing host code on Windows. If you can tolerate appropriate synchronization, then it can also be safely used to time device code or mixed host/device sequences.
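
A minimal sketch of the host-based approach, not a definitive implementation. It swaps in the portable std::chrono::steady_clock for a direct QueryPerformanceCounter call (as far as I know, the MSVC standard library builds steady_clock on QueryPerformanceCounter, but treat that as an assumption). The names time_host_ms and simulated_host_work are hypothetical, and the 300ms sleep is only a stand-in for the host section being timed.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: run `work` once and return the elapsed wall-clock
// time in milliseconds, measured with a monotonic host clock.
template <typename Work>
double time_host_ms(Work&& work) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    work();
    // If `work` launches device code, synchronize before reading the clock
    // (e.g. cudaDeviceSynchronize()), so that queued GPU work has finished.
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in for the host-only section being timed (assumption: about 300 ms).
inline void simulated_host_work() {
    std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
```

Calling time_host_ms(simulated_host_work) should return a value of at least roughly 300, since the clock is read on the host and does not depend on when the driver submits commands to the GPU.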

cudaEvent timing can give reasonable results for timing e.g. a kernel call on Windows, but even then WDDM command queue batching can inject difficult-to-understand behavior into any but the most trivial timing sequences.

Changyun · April 18, 2016, 5:44pm

Thank you, txbob.
