The OptiX SDK examples usually assume that all *.cu files inside a project are OptiX device code.
These files are translated only to OptiX module input code, which is either PTX source code or OptiX-IR binary code.
The only exception is the optixRaycasting example, which demonstrates how to use OptiX solely for ray-intersection tests while doing ray generation and shading calculations in native CUDA kernels.
Please note that none of the OptiX device functions can be used inside native CUDA kernels.
https://forums.developer.nvidia.com/t/can-cuda-launch-a-optix-kernel/243688/2
If you want to use standalone CUDA kernels alongside OptiX programs, how they can be integrated into an OptiX application depends on which CUDA API you're using (Runtime or Driver).
1.) When using the CUDA Driver API (which the OptiX SDK examples are not doing), you can compile your kernels to PTX offline, load and JIT-compile that PTX at runtime into a module, and launch the kernel functions inside that module.
I’m doing that for a simple multi-GPU compositing kernel in my OptiX examples:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/src/Device.cpp#L293
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/MDL_renderer/src/Device.cpp#L2065
That cuLaunchKernel() mechanism is quite awkward, and care needs to be taken not to exceed any hardware limits (e.g. maximum grid and block dimensions).
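A minimal sketch of that Driver API flow could look like the following. The kernel name "compositeKernel", the function name launchFromPtx, and the parameter list are hypothetical placeholders; the PTX string is assumed to have been compiled offline or via NVRTC. (Error checking on the CUresult return values is omitted for brevity but should be added in real code.)

```cpp
#include <cuda.h>

// Hypothetical example: load precompiled PTX and launch a kernel named
// "compositeKernel" via the CUDA Driver API.
void launchFromPtx(CUcontext context, CUstream stream,
                   const char* ptxSource,  // PTX text, null-terminated
                   CUdeviceptr dst, CUdeviceptr src, unsigned int count)
{
  cuCtxSetCurrent(context); // make sure we launch into the right context

  CUmodule module = nullptr;
  cuModuleLoadData(&module, ptxSource); // JIT-compiles the PTX for this device

  CUfunction kernel = nullptr;
  cuModuleGetFunction(&kernel, module, "compositeKernel");

  // Kernel arguments are passed as an array of pointers to the argument values.
  void* params[] = { &dst, &src, &count };

  const unsigned int blockSize = 128;
  const unsigned int gridSize  = (count + blockSize - 1) / blockSize;

  // Grid and block dimensions must stay within the device's hardware limits.
  cuLaunchKernel(kernel,
                 gridSize, 1, 1,   // grid dimensions
                 blockSize, 1, 1,  // block dimensions
                 0,                // dynamic shared memory bytes
                 stream,
                 params, nullptr);
}
```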
2.) The simpler approach is using the CUDA Runtime API and its kernel launch mechanism with the chevron operator <<<>>>, as in your test code.
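For comparison, a minimal sketch of such a Runtime API launch (kernel and function names are hypothetical) looks like this:

```cpp
#include <cuda_runtime.h>

// Hypothetical native CUDA kernel scaling a buffer in place.
__global__ void scaleKernel(float* data, float factor, unsigned int count)
{
  const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < count)
    data[i] *= factor;
}

void scale(float* d_data, float factor, unsigned int count, cudaStream_t stream)
{
  const unsigned int blockSize = 128;
  const unsigned int gridSize  = (count + blockSize - 1) / blockSize;

  // The chevron syntax requires this file to be compiled by nvcc as a native
  // CUDA translation unit, not translated to OptiX module input code.
  scaleKernel<<<gridSize, blockSize, 0, stream>>>(d_data, factor, count);
}
```

This is exactly why the build system needs to treat such *.cu files differently from the OptiX device code, as described below.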
The main hurdle with that is how to set up a CMakeLists.txt which can handle native CUDA kernels and OptiX device code separately.
That is possible in different ways, and with newer CMake versions it is actually supported via the CMake LANGUAGE CUDA feature as well.
I've posted a CMakeLists.txt which shows that here:
https://forums.developer.nvidia.com/t/why-am-i-getting-optix-dir-notfound/279085/4
Note that this requires CMake 3.27 or newer. Just follow the instructions and you should have a standalone CUDA Runtime API + OptiX device code framework using CMake.
Calling CUDA kernels inside the resulting application then works the same way as in all CUDA Toolkit examples, like this:
https://github.com/NVIDIA/cuda-samples/blob/master/Samples/0_Introduction/vectorAdd/vectorAdd.cu#L147
And OptiX host and device code would work as before.
You would only need to make sure to use the correct CUDA context and CUDA stream for these kernels so that they operate on the same CUDA device data.
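As a hedged sketch of that last point, the native kernel and the optixLaunch() can simply be enqueued on the same stream; work on a stream executes in submission order, so no extra synchronization is needed between them. The clearKernel, render() signature, and the pipeline/sbt/d_params objects are hypothetical names standing in for the usual OptiX host setup:

```cpp
#include <optix.h>
#include <cuda_runtime.h>

// Hypothetical pre-process kernel that zeroes the output buffer.
__global__ void clearKernel(float* data, unsigned int count)
{
  const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < count)
    data[i] = 0.0f;
}

// Enqueue a native CUDA kernel and the OptiX pipeline launch on the SAME
// stream so both operate on the same device data in order.
void render(OptixPipeline pipeline, const OptixShaderBindingTable& sbt,
            CUdeviceptr d_params, size_t paramsSize,
            float* d_buffer, unsigned int width, unsigned int height,
            CUstream stream)
{
  const unsigned int count     = width * height;
  const unsigned int blockSize = 128;
  const unsigned int gridSize  = (count + blockSize - 1) / blockSize;

  clearKernel<<<gridSize, blockSize, 0, stream>>>(d_buffer, count);

  // Stream ordering guarantees the clear kernel has finished on the device
  // before the ray generation program starts reading or writing the buffer.
  optixLaunch(pipeline, stream, d_params, paramsSize, &sbt, width, height, 1);
}
```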