I am using the TensorFlow library in my app to run model inference. When I run a single model, inference takes around X milliseconds. When I run a second binary with the same model concurrently, each instance takes longer. I want to determine whether the GPU is actually saturated, i.e. whether all of its multiprocessors are in use and that is why inference slows down. In the Visual Profiler I can see that total compute time has increased and that each individual kernel takes longer to execute. What could be the reason that the kernels take longer to execute, and how can I analyze this issue further?
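For reference, this is roughly how I measure per-inference latency in each binary (a minimal sketch; `run_inference` is a hypothetical stand-in for my actual TensorFlow session/model call):

```python
import time

def time_inference(run_inference, warmup=5, iters=50):
    """Return the mean latency of run_inference() in milliseconds."""
    # Discard warm-up iterations (graph construction, cuDNN autotuning, etc.)
    for _ in range(warmup):
        run_inference()
    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1000.0  # mean milliseconds per call

# Dummy workload standing in for the real model call, e.g. sess.run(...)
mean_ms = time_inference(lambda: sum(i * i for i in range(10000)))
print(f"mean latency: {mean_ms:.2f} ms")
```

The X-millisecond figure above comes from this kind of averaged wall-clock timing, so it includes launch overhead, not just kernel time.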