Hey NVidia! I was curious if you guys intended to add support for profiling Vulkan applications and/or running applications which are taking advantage of transform feedback? Currently it is nearly impossible to track down certain classes of performance issues within an application. Additionally, any application which is using Transform Feedback immediately causes the application to crash when run under Nsight Graphics.
Thanks for the questions!
RE: Vulkan Profiling Support - If all goes to plan, we will have something early next year. ;)
RE: VK_EXT_transform_feedback support - We currently don’t have plans to support this extension since we hope to be able to profile the “base” API shortly. Is this an important use case for you? Would you expect the higher level API to be shown in the tool?
Thanks a lot for your prompt reply!
RE Profiling: I’m glad to hear that Vulkan Profiling is definitely on the road-map. Is there any chance to get a hold of a beta version of that at some point down the road? I’d be happy to help prove out the feature, even if incomplete / unstable. It would be extraordinarily helpful.
RE VK_EXT_transform_feedback: I’m not strictly sure what you mean by the base API. I’ll add that I’m not necessarily concerned with being able to inspect the transform feedback queries and draws (since we seldom have them show up in a frame), but rather with being able to run the debugger at all. Today any use of those APIs by my application (possibly even just enabling that extension) causes Graphics Studio to crash, which is a major problem when trying to debug. We’ve essentially reverted to using RenderDoc (which is workable today, but will be an issue when profiling becomes available).
Good news! We may have a workaround for the Transform Feedback issue.
Could you try setting an Environment Variable in the launch options? Try: NSIGHT_VULKAN_SAFE_OBJECT_LOOKUP=1
Let me know if it still crashes after this.
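For anyone who starts their application from a script rather than typing the variable into the launch options, here is a minimal Python sketch of the same idea: copy the environment, add the variable from the reply above, and start a child process with it. The helper names `with_safe_object_lookup` and `child_sees_variable` are made up for illustration, and the child is just a Python one-liner that echoes the variable back. Whether Nsight honors the variable when it is inherited this way, rather than set in its own launch options, is an assumption this sketch does not verify.

```python
import os
import subprocess
import sys

VAR_NAME = "NSIGHT_VULKAN_SAFE_OBJECT_LOOKUP"  # variable name taken from the reply above


def with_safe_object_lookup(base_env):
    """Return a copy of base_env with the workaround variable set to "1".

    Hypothetical helper; the caller's mapping is left untouched.
    """
    env = dict(base_env)
    env[VAR_NAME] = "1"
    return env


def child_sees_variable():
    """Start a child process with the modified environment and report what it sees."""
    # The probe is a stand-in for the real application being launched.
    probe = f"import os; print(os.environ.get({VAR_NAME!r}))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=with_safe_object_lookup(os.environ),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `child_sees_variable()` reports the value the child process actually received, which is a quick way to confirm the variable reaches the launched process before blaming the workaround itself.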
Just wanted to follow up with you again to see if this worked around your issue.
I’ve been busy with another aspect of the project and have not had an opportunity to test this out. I’ll definitely let you know once I do whether or not this resolves the issue though!
P.S. What does changing NSIGHT_VULKAN_SAFE_OBJECT_LOOKUP actually do?
Without going into a ton of detail, it essentially modifies how our capture works. For performance reasons, we use object wrapping, but when we come across an extension that we don’t support, we can possibly crash. One of our engineers looked through the extensions we currently don’t support and this is actually the only extension that will cause this crash. Lucky you!
Using the Safe Object Lookup uses a different, but less performant, path through capture that doesn’t have this issue. Note that the extension still isn’t “supported” in this case, but it won’t crash.
Hopefully it will work around the issue for you until we have full support for the extension!
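To make the trade-off above concrete, here is a toy Python model of the two general approaches an API interception layer can take: wrapping handles versus keeping a side table keyed by the driver's own handle. This is a sketch of the general pattern only, not Nsight's implementation, which the reply does not describe in detail. Every class and method name here (`Driver`, `WrappingLayer`, `LookupLayer`, `unknown_extension_call`) is hypothetical.

```python
class Driver:
    """Stand-in for a driver that only accepts the raw handles it created."""

    def __init__(self):
        self._next_handle = 1
        self._live = set()

    def create_object(self):
        handle = self._next_handle
        self._next_handle += 1
        self._live.add(handle)
        return handle

    def use_object(self, handle):
        if handle not in self._live:
            raise RuntimeError("driver received a handle it does not recognize")
        return f"used {handle}"


class WrappedHandle:
    """What a wrapping layer hands to the application instead of the raw handle."""

    def __init__(self, raw):
        self.raw = raw
        self.use_count = 0  # per-object tracking lives directly on the wrapper


class WrappingLayer:
    def __init__(self, driver):
        self.driver = driver

    def create_object(self):
        return WrappedHandle(self.driver.create_object())

    def known_call(self, handle):
        handle.use_count += 1  # fast: no table lookup needed
        return self.driver.use_object(handle.raw)  # unwrap before forwarding

    def unknown_extension_call(self, handle):
        # The layer does not know this entry point, so nothing unwraps the handle.
        return self.driver.use_object(handle)


class LookupLayer:
    def __init__(self, driver):
        self.driver = driver
        self.use_counts = {}

    def create_object(self):
        raw = self.driver.create_object()
        self.use_counts[raw] = 0
        return raw  # the application keeps the driver's own handle

    def known_call(self, handle):
        self.use_counts[handle] += 1  # slower: one table lookup per call
        return self.driver.use_object(handle)

    def unknown_extension_call(self, handle):
        return self.driver.use_object(handle)  # raw handle passes through intact
```

In this model, `WrappingLayer.unknown_extension_call` raises because the driver is handed a wrapper it never created, while the same call through `LookupLayer` succeeds at the cost of a lookup on every tracked call.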
I managed to give your suggestion a shot today. It looks like the application still crashes even with Safe Object Lookup. On my first attempt, it flat out failed due to the unsupported extension. I then enabled the Troubleshooting option for “Ignore Incompatibilities” and tried again; however, doing a capture causes a small delay, after which the application crashes and the capture fails.
Picture of my configuration as of right now:
Following up: I selected a different test which does not exercise transform feedback and disabled Xfb completely; it seems Nsight causes my application to crash regardless of whether the extension is used. Any suggestions on how to find information on why the crash may have occurred?
Here is the Nsight Graphics build info:
Version 2018.6.0.0 (Build 25144971) (public-release)
Here is my GPU info:
NVidia 1080 Ti - Driver 417.23
I think I’ve narrowed this particular issue down to Nsight Graphics not correctly handling cases where a descriptor set is not entirely populated. We have a particular descriptor layout which has 48 entries, but only a few of those are used by our shaders, and thus only a few of the entries are populated. It looks like when Nsight tries to replay it, it fails to set up the descriptors correctly.
I enabled Vulkan validation, and started to see errors such as this one:
[ UNASSIGNED-GeneralParameterError-RequiredParameter ] Object: VK_NULL_HANDLE (Type = 0) | vkUpdateDescriptorSets: required parameter pDescriptorWrites.pImageInfo.imageView specified as VK_NULL_HANDLE
I’ve tried disabling the parts of our implementation which create those descriptors, and flipping that on/off determines whether Nsight crashes or not, so it seems a likely candidate.
P.S. It seems like Nsight actually generates quite a few other validation errors as well.
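The situation described above can be modelled in a few lines of Python: a 48-entry layout where only some bindings have an image view, and a function that builds update writes only for the populated ones so that no write carries a null view. This is an illustrative sketch, not Nsight's replay code and not necessarily how the issue would be fixed. `ImageWrite` and `build_image_writes` are hypothetical stand-ins, and `None` plays the role of `VK_NULL_HANDLE`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

LAYOUT_SIZE = 48  # number of entries in the layout, as described in the post


@dataclass
class ImageWrite:
    """Hypothetical stand-in for one image descriptor write (binding + image view)."""
    binding: int
    image_view: int


def build_image_writes(populated: Dict[int, Optional[int]]) -> List[ImageWrite]:
    """Build writes only for bindings that actually have an image view.

    `populated` maps binding index -> image view handle; None models VK_NULL_HANDLE.
    Bindings outside the layout are ignored.
    """
    writes = []
    for binding in range(LAYOUT_SIZE):
        view = populated.get(binding)
        if view is None:
            continue  # skip unpopulated entries instead of writing a null view
        writes.append(ImageWrite(binding=binding, image_view=view))
    return writes
```

With only a handful of bindings populated, the resulting list has one write per populated binding and none with a null view, which is the condition the validation message above complains about.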
Thank you for the very detailed data! I’ll file a bug with the engineering team to see if we can get a repro on our side too.
Also, since you asked in the post above, here is our user guide on collecting more in-depth crash data: https://docs.nvidia.com/nsight-graphics/UserGuide/index.html#troubleshooting_crash_reporting
I’ll let you know if they need any more information, but based on the experiments you have done, I bet we can fix it.
Just wanted to give you an update. We found the issue and have fixed it! The fix will be included in our next release that should be out at the end of the month.
I’ll ping back here to follow up after our next release.
Hope you are doing well. We just released the new version of Nsight Graphics yesterday that contains your fix! Can you give 2019.1 a try and see if it fixed your issue?