Inconsistent times when profiling Vulkan-based render engine compared to D3D11 profiling

Hi,

We are currently completing the implementation of Vulkan in our render engine.

We compared the performance of the Vulkan-based renderer to the previous D3D11-based one.

Using a typical scene, we achieve 136 FPS (7.35ms) with D3D11 and 129 FPS (7.75ms) with the current Vulkan renderer.

However, when profiling both with Nsight, we got weird timings. While the D3D11 range profiler shows the expected durations, the times shown for the various actions in the Vulkan capture, as well as for the entire frame (about 17 ms), are completely off compared to the framerate given above. This makes any optimisation investigation impossible…
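
To make the mismatch concrete, here is a minimal sketch of the arithmetic behind the numbers above, assuming the usual relation frame time (ms) = 1000 / FPS. The function names are only illustrative and are not part of any Nsight or engine API; the 17 ms value is the approximate frame duration quoted above.

```python
def frame_time_ms(fps: float) -> float:
    """Convert a frame rate in frames per second to a frame time in milliseconds."""
    return 1000.0 / fps


def fps_from_frame_time(ms: float) -> float:
    """Convert a frame time in milliseconds back to frames per second."""
    return 1000.0 / ms


# Frame rates measured in the running application.
d3d11_ms = frame_time_ms(136)
vulkan_ms = frame_time_ms(129)

# Approximate whole-frame duration shown by the profiler for the Vulkan capture.
profiler_ms = 17.0
profiler_fps = fps_from_frame_time(profiler_ms)

print(f"D3D11 frame time:  {d3d11_ms:.2f} ms")
print(f"Vulkan frame time: {vulkan_ms:.2f} ms")
print(f"Profiler frame:    {profiler_ms:.2f} ms (~{profiler_fps:.0f} FPS)")
print(f"Profiler / measured Vulkan frame time: {profiler_ms / vulkan_ms:.2f}x")
```

In other words, the profiled Vulkan frame is more than twice as long as the frame time implied by the measured framerate.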

Are we doing anything wrong?

See the D3D11 capture:

And the Vulkan capture:

Any ideas?

I used an RTX 2060 SUPER for that, with Windows 10 21H1 (19043.1202), Nsight Graphics 2021.4 and NVIDIA driver version 471.96.

Thanks a lot!

PS: BTW, Nsight reports a weird error on my Windows 10 system (“Unsupported OS detected: Windows 7. Not all Nsight features are supported on this version of Windows. Your debug session may become unstable.”)

Hello,
Thank you for using Nsight Graphics, and sorry you ran into these issues. I have filed a bug on your behalf for our engineering team to investigate and resolve. In the meantime, I also want to point you to another Nsight Graphics feature that you can try: GPU Trace. Let me know if this helps.

Regards,

Hi,

Do you have any update regarding this issue?

It is still present in the latest 2022.1, and occurs with all the GPUs we have here (2060 SUPER, 3060 Ti, 3070, 3070 Ti, 3080, etc.)…

It is very annoying to not have any Vulkan profiler that gives correct timings…

Hello,

We were wondering whether you were able to try GPU Trace as suggested previously. We were hoping this would at least unblock your profiling needs. Also, is it possible for you to profile a cpp capture of your app, to try to isolate profiling from replay? This may help us distinguish between “replay induced” and “timestamping induced” WFIs.

Hi Darrell,

Thanks for getting back to me. I really apologize for the late answer, it has been a busy few days here :/

Anyway, yes, I’ve been able to try GPU Trace and I confirm that timings are correct using that activity (Nsight Graphics 2022.1.1).

I’ve also made a cpp capture of a test scene. I’ll send you a PM.