Optix 6.5 Demo Performance Concern

droettger · June 18, 2020, 8:26am

The overall GPU load in the Task Manager might not show the compute workload by default.
Please select the Task Manager’s Performance tab, click on the GPU entry to show the individual engines, and change one of the graphs to CUDA.
That graph should show over 90% load when running the optixPathTracer. The 3D load in the other graph is the OpenGL texture blit of the rendered image.

It is normal that the OptiX 6.5.0 optixPathTracer example runs at only around 60 fps in its default window size and much slower in full screen. It implements an iterative global illumination path tracer which shoots multiple paths per pixel at once. It’s expensive.

The final display in the OptiX SDK examples is normally done with an OpenGL texture blit to the back buffer followed by a buffer swap. By default that swap is synchronized to the monitor refresh rate, which limits the frame rate to a maximum of 60 fps if your monitor is running at 60 Hz.
For benchmarking it’s recommended to disable the vertical sync inside the NVIDIA Control Panel.
Right-click on the desktop, select NVIDIA Control Panel, go to 3D Settings → Manage 3D Settings → Settings → Vertical Sync and change the value to Off.
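
To make the effect of vertical sync on the reported numbers concrete, here is a minimal sketch of the cap described above. The function name `displayedFps` and its parameters are made up for illustration and are not part of the OptiX SDK. The model is deliberately simplified: it only clamps the frame rate to the refresh rate and ignores any further quantization a real swap chain may introduce.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not an OptiX or OpenGL call).
// Simplified model: with vertical sync on, buffer swaps cannot happen more
// often than the monitor refreshes, so the displayed frame rate is clamped
// to the refresh rate. With vertical sync off, the renderer's rate is shown.
double displayedFps(double renderFps, double refreshHz, bool vsyncOn)
{
    assert(renderFps > 0.0 && refreshHz > 0.0);
    return vsyncOn ? std::min(renderFps, refreshHz) : renderFps;
}
```

Under this model a renderer capable of 650 fps reports 60 fps on a 60 Hz monitor with vertical sync on, which is why the setting should be off when benchmarking.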

Then run the simple OptiX examples like the optixMeshViewer again. In a small window that should run well in the hundreds of frames per second. (On my Quadro RTX 6000 that runs at >650 fps at default window size.)
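
If you want to measure the frame rate yourself instead of relying on an on-screen counter, something along these lines could work. This is only a sketch: `measureFps` and the `renderFrame` callback are hypothetical names, and the callback is assumed to contain everything that makes up one frame in your application (launch, any required synchronization, and the buffer swap).

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: calls renderFrame() `frames` times and returns the
// average frames per second over the whole run.
// Assumption: renderFrame() only returns once the frame is really finished.
double measureFps(const std::function<void()>& renderFrame, int frames)
{
    assert(frames > 0);
    using clock = std::chrono::steady_clock; // monotonic, suited for intervals

    const auto start = clock::now();
    for (int i = 0; i < frames; ++i)
    {
        renderFrame();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;

    // Guard against a zero interval when the callback does no real work.
    if (elapsed.count() <= 0.0)
    {
        return 0.0;
    }
    return static_cast<double>(frames) / elapsed.count();
}
```

Averaging over a few hundred frames gives more stable numbers than timing a single frame, and the first frames are best skipped because they may include one-time setup costs.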

The memory usage on the device depends on what CUDA allocates for the CUDA context alone, plus all OptiX buffers, textures, acceleration structures, shader programs, etc.

Performance analysis with Nsight needs to happen with the standalone programs: Nsight Systems for the overall application behaviour and Nsight Compute for the individual CUDA device kernels you programmed in OptiX. (Nsight Graphics won’t help with CUDA compute workloads.)
Use the latest Nsight versions and display drivers, and make sure your PTX code was compiled with --generate-line-info (-lineinfo) to be able to match the CUDA source code to the PTX and SASS assembly.

When starting new projects with OptiX, I would recommend using OptiX 7.0.0, which has a completely different, much more modern host API and is generally faster. Everything around the actual OptiX calls is handled with native CUDA Runtime or Driver API calls, which gives you much better control and flexibility.

OptiX 7 based examples implementing rather fast and flexible unidirectional path tracers can be found here.
https://forums.developer.nvidia.com/t/optix-advanced-samples-on-github/48410
They should generally be more interactive than the OptiX 6.5.0 optixPathTracer example.
