10 ms block every second during execution

GloWondub · December 22, 2011, 11:13am

Hello

I’m currently working on a cuBLAS program that works very well.

However, profiling it has revealed some very strange behavior.

During execution, here is what happens:

920 ms execution | 10 ms block | 60 ms execution | 10 ms block | 920 ms execution … and so on…

The pattern repeats with a period of exactly one second, and it does not depend on the code.
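The pattern above adds up to 920 + 10 + 60 + 10 = 1000 ms, which matches the one-second period. One way to check whether these blocks are visible from the host thread, independently of any profiler, is a busy loop that timestamps every iteration and reports the intervals where it made no progress. The following is a minimal host-only sketch under that assumption: `Stall`, `StallDetector` and `probe_host_stalls` are hypothetical names made up for illustration, and nothing in it calls CUDA or cuBLAS.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper types for illustration; not part of CUDA or cuBLAS.
struct Stall {
    double start_ms;   // timestamp of the last sample before the gap
    double length_ms;  // time elapsed until the next sample
};

// Flags every interval between consecutive samples longer than threshold_ms.
class StallDetector {
public:
    explicit StallDetector(double threshold_ms) : threshold_ms_(threshold_ms) {
        assert(threshold_ms > 0.0);
    }

    // Feed one timestamp (in milliseconds, non-decreasing).
    void observe(double t_ms) {
        if (has_prev_ && t_ms - prev_ms_ > threshold_ms_) {
            stalls_.push_back(Stall{prev_ms_, t_ms - prev_ms_});
        }
        prev_ms_ = t_ms;
        has_prev_ = true;
    }

    const std::vector<Stall>& stalls() const { return stalls_; }

private:
    double threshold_ms_;
    double prev_ms_ = 0.0;
    bool has_prev_ = false;
    std::vector<Stall> stalls_;
};

// Spin on the host for duration_ms and return the stalls that were seen.
std::vector<Stall> probe_host_stalls(double duration_ms, double threshold_ms) {
    using clock = std::chrono::steady_clock;
    StallDetector detector(threshold_ms);
    const auto t0 = clock::now();
    double now_ms = 0.0;
    while (now_ms < duration_ms) {
        now_ms = std::chrono::duration<double, std::milli>(clock::now() - t0).count();
        detector.observe(now_ms);
    }
    return detector.stalls();
}
```

Calling something like `probe_host_stalls(3000.0, 5.0)` once before and once after the first CUDA call, then comparing the two lists, would show whether stalls of roughly 10 ms appear at a one-second period only after the runtime is initialized. Ordinary OS scheduling can also produce occasional gaps, so only a regular pattern is meaningful.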

The only way to remove these gaps in execution is to remove every call to CUDA or cuBLAS. Even if my program runs without executing the cuBLAS functions or making any CUDA allocations, I still get the gaps with just a call to cublasInit.

Even a single call to cudaSetDevice or cudaGetDeviceCount gives me the gaps.

I’ve tried my code on another machine, and there is no problem there.

There are two differences between these machines:

The machine with the bug is 64-bit; the other is 32-bit.
The machine with the bug has two Tesla GPUs; the other has a Quadro 4000.

I suspect this is related to having two GPUs in the machine with the bug, but for now I have no idea what I can do.

GloWondub · January 3, 2012, 2:05pm

It appears that any call to cudaSetDevice or cudaGetDeviceCount triggers the bug.

Other calls, such as cublasMalloc or cublasCreate, do not trigger it.
