We are developing a simulation using OptiX that we would like to make as fast as possible. We understand that, until now, using OptiX with Nsight was not really possible. With OptiX 5.x, I was able to profile at the megakernel level, but not at any finer granularity. I’d really like to be able to profile individual kernels, since they can be relatively complex.
I recently installed OptiX 6.0 and immediately saw a 6-8x performance improvement (really great!). But I was disappointed to see that support for Nsight profiling does not seem to have been included in this release. I was particularly interested because this talk was given at SIGGRAPH last year:
http://on-demand.gputechconf.com/siggraph/2018/video/sig1843-johann-korndoerfer-nsight-compute.html
I was excited to see this talk since it gave me hope that we could more effectively profile and debug our simulation using Nsight. Unfortunately, the speaker skips the most important part of the process: how to create the report.
Is there currently a way to create a profiling report for an OptiX application? Perhaps a compiler option? When I try to use Nsight Compute, I get an error that says “FailedReadingMagicNumber”. I am able to profile pure CUDA code without any issues.