Hi!
In my Vulkan application, I render one frame on each of my two GPUs. I record one command buffer per GPU, submit both in a single vkQueueSubmit call, and wait on a fence before recording the commands for the next frames.
When I run my application through Nsight Graphics, I see that one GPU starts rendering the next frame before the other GPU has finished rendering the current one. That shouldn’t be possible, since the fence blocks on the CPU. It looks to me as if the timelines of the two GPUs are not synchronized. Can someone explain why this happens?
Thank you for using Nsight Graphics and providing your feedback. I am not quite sure I understand your issue. Could you please provide a simple example that would allow us to reproduce it? This will help us investigate and resolve the problem more efficiently.
Hi Ayan,
thanks for your quick reply. I’m a bit busy right now and cannot provide a simple example (not so simple anyway with Vulkan). Maybe it helps if I show you an image of what I meant.
I drew rectangles on top of a capture to highlight the work done for each frame. The upper row corresponds to GPU 0 and the lower row to GPU 1. The work for the frames marked with the red and green rectangles is submitted at the same time, and I wait on the CPU with a fence for that work to finish before recording any more commands.
As you can see, Nsight Graphics shows that the work for the following frame on GPU 1 (yellow rectangle) starts before the work for the previous frames (red and green) finishes. That’s why I think the timelines of the two GPUs are not synchronized.
It’s hard to say anything yet, but maybe you can try capturing more than one frame and see what happens? Just set Max Number of Frames in the Start Activity dialog box.
My application uses an offscreen engine, so the only way I found to use Nsight Graphics was with GPU Trace Profile and One-Shot, which basically collects everything. I could show you even more frames, but they follow the same pattern. Can you see from the image what I mean?
Yes, that’s the issue I’m seeing. I create two logical devices (VkDevice) that correspond to my two physical GPUs.
Now I realize I need to correct what I said before. The commands are recorded separately for each GPU and sent with separate vkQueueSubmit calls, and each submission is waited on with its own VkFence object, one after the other.
What version of Nsight Graphics are you using? Could you try the latest release? I should mention that Nsight Graphics GPU Trace doesn’t support multiple GPUs; only one GPU is supported. In theory, recent releases of Nsight Graphics should show the workload of only one GPU.
I’m using version 2022.7.0.0. I can check with the latest version tomorrow, since I need to update my drivers in order to use it. But I guess your comment explains why multi-GPU doesn’t work. Is there any plan to support multiple GPUs at some point?
Side question: how should I modify my offscreen application in order to use the Frame Debugger and not the GPU Trace Profiler? I think that I could get more information about the GPU utilization with it, but is it worth the trouble anyway?