I’m trying to profile my OpenGL application using the Nsight Perf SDK. I need to know whether Nsight Perf captures GL calls from all the threads in the application. For example, in my main thread I call the following functions:
m_nvperf.StartCollectionOnNextFrame();
m_nvperf.OnFrameStart();
// Do some rendering on multiple threads...
m_nvperf.OnFrameEnd();
Is nvperf going to capture all GL calls between OnFrameStart() and OnFrameEnd(), including those from other threads? Or is it only going to capture the GL calls made from the main thread?
Hi ystozlu,
The short answer is “yes”: GL calls from the thread where the session was started, as well as from other threads that share its GPU context, are captured within the Start-to-End range in most circumstances. The caveat is that certain driver optimizations might cause a thread’s GL context to be placed in a separate GPU context. It is also important to note that Push/Pop commands are only captured per GL context. Push/Pop commands from other GL contexts are ignored, but if such a GL context is in the same GPU context as the session, its work is attributed to the ranges from the session’s GL context.
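To make the attribution rules above concrete, here is a small toy model in C++. It is only a sketch of the behavior as described in this reply, not the Nsight Perf SDK: `RangeAttributionModel`, `Push`, `Pop`, `Draw`, `SimulateFrame`, the integer context IDs and the `<frame>` bucket for work outside any range are all hypothetical names and assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of the attribution rules described above.
// NOT the Nsight Perf SDK; every name here is hypothetical.
struct RangeAttributionModel {
    int sessionGlContext;   // GL context the profiling session was started on
    int sessionGpuContext;  // GPU context backing that GL context
    std::vector<std::string> openRanges;      // ranges pushed from the session's GL context
    std::map<std::string, int> workPerRange;  // draw calls attributed to each range
    int uncapturedWork = 0;                   // draw calls issued on a different GPU context

    RangeAttributionModel(int glContext, int gpuContext)
        : sessionGlContext(glContext), sessionGpuContext(gpuContext) {}

    // Push/Pop are only honored when they come from the session's GL context.
    void Push(int glContext, const std::string& name) {
        if (glContext == sessionGlContext) openRanges.push_back(name);
    }
    void Pop(int glContext) {
        if (glContext == sessionGlContext && !openRanges.empty()) openRanges.pop_back();
    }

    // Work is captured only if it runs on the session's GPU context; it is then
    // attributed to whatever range the session's GL context currently has open.
    void Draw(int gpuContext) {
        if (gpuContext != sessionGpuContext) { ++uncapturedWork; return; }
        // Assumption: work outside any range goes into a "<frame>" bucket.
        const std::string range = openRanges.empty() ? "<frame>" : openRanges.back();
        ++workPerRange[range];
    }
};

// One simulated frame: GL context 1 on GPU context 10 owns the session.
RangeAttributionModel SimulateFrame() {
    RangeAttributionModel m(1, 10);
    m.Push(1, "Shadows");     // session GL context: range is recorded
    m.Draw(10);               // main thread work, attributed to "Shadows"
    m.Push(2, "WorkerPass");  // other GL context: Push is ignored
    m.Draw(10);               // worker on the same GPU context, still "Shadows"
    m.Draw(20);               // worker moved to a separate GPU context: not captured
    m.Pop(2);                 // other GL context: Pop is ignored
    m.Pop(1);                 // closes "Shadows"
    return m;
}
```

In this model, calling `SimulateFrame()` leaves two draw calls under "Shadows", no "WorkerPass" range at all, and one draw call uncaptured, which corresponds to the case where the driver gives a thread’s GL context its own GPU context.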
I see. I only create one GL context in my application, which is shared across multiple threads. I know this is a long shot, but is there a way to “force” the driver not to do that optimization? I’m not sure why the driver does such an extreme thing in the first place…
Hi ystozlu,
No, unfortunately we do not have a way to force the driver to not do that optimization.
However, I think it would help us if you explained in more detail what you believe your problem actually is. That way, instead of only addressing specific questions, we can help resolve your root issue. We can probably guess, but it would be better if you explained it.