I am writing an OpenGL 4.4/GLSL library for prediction with deep learning convolutional neural network (CNN) models.
In order to profile the GPU load of my compute shaders (computing feature maps, pooling operations, etc.), I wrote a simple Windows application with an OpenGL window that calls glClearColor() and runs a CNN prediction with compute shaders at each redraw. Nsight 5.5 crashes when I start “Performance analysis”, but that is not my issue here.
The frame rate reported by Nsight is about 22.5 FPS (43 ms/frame). But when I do “Pause and capture Frame”, the full scale of the Scrubber timeline covers only 3 ms (whether I choose the GPU or the CPU duration scale).
I think a lot of CPU time is being omitted (for example, the time spent waiting in glMapBuffer() after a memory barrier). Note that I use glDispatchComputeIndirect(), so the time reported for this call is not the time the GPU needs to process and finish the computations. This is why I use memory barriers before reading back the resulting textures.
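To check this independently of Nsight, the blocking call could be timed on the CPU side. The following is only a minimal sketch, assuming the wait really happens inside the readback call: `elapsed_ms` and `fake_blocking_readback` are hypothetical helper names, and the sleep is a stand-in for the real glMemoryBarrier() + glMapBuffer() sequence so that the snippet compiles without a GL context.

```cpp
#include <chrono>
#include <thread>

// Returns the wall-clock time, in milliseconds, spent inside `call`.
// In the real application `call` would wrap the glMemoryBarrier() +
// glMapBuffer() readback; it is kept generic here (assumption) so the
// sketch does not depend on an OpenGL context.
template <typename F>
double elapsed_ms(F&& call) {
    const auto t0 = std::chrono::steady_clock::now();
    call();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Hypothetical stand-in for a blocking readback: it simply sleeps for
// `wait_ms` milliseconds to imitate the CPU stalling on the GPU.
void fake_blocking_readback(int wait_ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
}
```

For example, `elapsed_ms([] { fake_blocking_readback(20); })` would return roughly 20 ms or slightly more. If the same wrapper around the real readback accounted for most of the 43 ms frame, that would suggest the missing time is CPU-side waiting rather than work shown on the scrubber.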
Does anyone have the same problem? Why does the scrubber timeline not show the actual elapsed time?
My config: VS2015, Nsight 5.5, GeForce GTX 1060, driver 388.71 (OpenGL version 4.6.0)