If you have any advice on how and where best to engage with the complexity OptiX and CUDA bring to the table, I’d be glad to hear it.
Well, that’s a wide topic.
My approach to learning new things is first to understand what is possible while having a specific use case in the back of my mind.
That’s where Programming Guides and API references come into play.
The online OptiX Programming Guide and API Reference have a nice search function in the top right.
The search results of the online CUDA Programming Manual site are sometimes a little broad, so I sometimes search inside the PDF version instead.
After finding out what is possible and what is not, the next step is to figure out how the necessary things work.
The easiest way to learn that is from existing code and tutorials inside the SDK and whatever you can lay your hands on.
The tricky part is finding the good ones and not learning from bad example code, so the more programming experience you have, the better.
My most frequently used tool is Find In Files in source code editors (MSVS, VS Code, Notepad++, etc.)
VS Code can open whole folders and quickly search through all files conveniently. I sometimes have multiple of them open with different folders just for the searching functionality. 4K and multi-monitor setups for the win.
When you have an MSVS project running, all the tools that jump to function and structure definitions help you learn how things work, since the headers defining them usually contain the documentation.
That means if you have, for example, questions about how to copy memory, just search for words related to memory, copy, memcpy, etc. in all related example sources you have. Once you have found the function doing that, look it up inside the include headers or the Programming Manual again. Rinse, repeat.
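For instance, the pattern that kind of search usually turns up for memory copies looks like this (a minimal CUDA Runtime API sketch of my own, not taken from any SDK example; error checking omitted for brevity):

```cuda
#include <cuda_runtime.h>
#include <vector>

void copy_roundtrip()
{
    std::vector<float> host(1024, 1.0f);
    const size_t bytes = host.size() * sizeof(float);

    // Allocate device memory and upload the host data.
    float* device = nullptr;
    cudaMalloc(&device, bytes);
    cudaMemcpy(device, host.data(), bytes, cudaMemcpyHostToDevice);

    // ... launch kernels or an optixLaunch working on `device` here ...

    // Download the results and release the device allocation.
    cudaMemcpy(host.data(), device, bytes, cudaMemcpyDeviceToHost);
    cudaFree(device);
}
```

Once you have spotted `cudaMemcpy` in an example like this, the documentation of its direction flags and synchronization behavior is right there in `cuda_runtime_api.h` and the Programming Manual.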
When using OptiX, there is not really a need to dive too deep into the CUDA kernel programming because the OptiX single-ray programming model makes it simple to concentrate on what should happen per ray.
Some of the things CUDA device programming offers are not even available because the scheduling in OptiX forbids them, like warp-wide synchronizations, shared memory, etc. You can still use everything in native CUDA kernels running on the input or output data outside the optixLaunch, as the optixRaycasting example shows.
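To illustrate the single-ray model: each ray generation invocation handles exactly one launch index, so the device code reads like a per-ray recipe. Here is a minimal sketch; the `Params` struct, the fixed camera ray, and the grayscale write-out are placeholders of my own, not from any SDK sample:

```cuda
#include <optix.h>

// Hypothetical launch parameters; field names are illustrative only.
struct Params
{
    uchar4*                image;
    unsigned int           width;
    OptixTraversableHandle handle;
};
extern "C" __constant__ Params params;

extern "C" __global__ void __raygen__rg()
{
    // One thread == one launch index == one primary ray.
    const uint3 idx = optixGetLaunchIndex();

    // Real code would compute origin/direction from a camera here.
    const float3 origin    = make_float3(0.0f, 0.0f, -1.0f);
    const float3 direction = make_float3(0.0f, 0.0f,  1.0f);

    unsigned int p0 = 0; // payload register filled by closest-hit/miss programs
    optixTrace(params.handle, origin, direction,
               0.0f, 1e16f, 0.0f,        // tmin, tmax, ray time
               OptixVisibilityMask(255),
               OPTIX_RAY_FLAG_NONE,
               0, 1, 0,                  // SBT offset, SBT stride, miss index
               p0);

    params.image[idx.y * params.width + idx.x] =
        make_uchar4(p0, p0, p0, 255u);
}
```

Note that there is no block/thread indexing, no shared memory, no `__syncthreads()` anywhere; OptiX hands you one ray's worth of work and schedules the rest itself.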
You have to make a choice about the CUDA API you’re using on the host. The CUDA Runtime API is more high level and easier to use. There might also be some CUDA libraries which only work together with the runtime API. (Not my expertise.)
The CUDA Driver API is more low-level, so it’s slightly harder to use, especially when launching native CUDA kernels, but it offers better control over CUDA contexts, which is why I use it especially for multi-GPU use cases.
(I’m showing the CUDA API differences between runtime and driver API in one introduction example.)
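To give a feel for the difference, here is the same device allocation expressed in both APIs (a sketch of my own with all error checking omitted): the runtime API creates and manages the context implicitly, while the driver API makes you do it yourself, which is exactly the extra control that helps with multi-GPU setups.

```cuda
#include <cuda.h>          // CUDA Driver API
#include <cuda_runtime.h>  // CUDA Runtime API

void runtime_alloc(size_t bytes)
{
    // The first runtime call implicitly initializes the primary context.
    void* d = nullptr;
    cudaMalloc(&d, bytes);
    cudaFree(d);
}

void driver_alloc(size_t bytes)
{
    // The driver API requires explicit init, device, and context handling.
    cuInit(0);
    CUdevice  dev;
    CUcontext ctx;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    CUdeviceptr d;
    cuMemAlloc(&d, bytes);
    cuMemFree(d);

    cuCtxDestroy(ctx);
}
```

With the driver API you decide per GPU which context is current on which thread; with the runtime API that bookkeeping is hidden from you.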
Then there are also books.
For CUDA programming, just search the web for “CUDA books” and you’ll get the standard ones (e.g. CUDA by Example, Programming Massively Parallel Processors, etc.).
For ray tracing beginners, have a look at the books from Peter Shirley. Online editions here: https://raytracing.github.io/.
Then there are the Raytracing Gems Books with a wide variety of topics.