Can CUDA launch an OptiX kernel?

I can use nvcc to compile an OptiX kernel file (a .cu file containing ray tracing programs) into a PTX file. I know that cuModuleLoad can load a PTX file, but it fails on the PTX compiled from the OptiX kernel: cuModuleLoad returns error code 218, which means “a PTX JIT compilation failed”, and I can’t get any other error message.
I want to ask: is it possible to launch an OptiX ray tracing kernel with cuLaunchKernel, cudaLaunchKernel, or some other way instead of optixLaunch?
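As a side note, the JIT error log can usually be retrieved by loading the PTX through cuModuleLoadDataEx with an error-log buffer. A rough sketch (driver API; it will still fail on OptiX PTX, but it prints why):

```cuda
// Sketch: retrieve the PTX JIT error log instead of only the error code.
// Assumes an initialized CUDA context and a null-terminated PTX string.
char errorLog[8192] = {};
CUjit_option options[] = { CU_JIT_ERROR_LOG_BUFFER,
                           CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES };
void *values[] = { errorLog, (void *)(size_t)sizeof(errorLog) };

CUmodule module = nullptr;
CUresult result = cuModuleLoadDataEx(&module, ptxSource, 2, options, values);
if (result != CUDA_SUCCESS)
    printf("PTX JIT failed (%d): %s\n", (int)result, errorLog);
```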

I want to ask: is it possible to launch an OptiX ray tracing kernel with cuLaunchKernel, cudaLaunchKernel, or some other way instead of optixLaunch?

No, it is not possible to compile PTX containing OptiX device functions into CUDA modules.
If you look into the optix_7_device_impl.h header, the inline assembly calls there reference functions which are not defined anywhere.
They are provided by the OptiX internal compiler, which means it’s not possible to compile such code into CUDA modules outside of OptiX.
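For illustration, the pattern in optix_7_device_impl.h looks roughly like this (simplified sketch, not the verbatim header):

```cuda
// Simplified sketch of how optixGetLaunchIndex() is implemented in
// optix_7_device_impl.h; the called symbols have no definition in the PTX.
static __forceinline__ __device__ uint3 optixGetLaunchIndex()
{
    unsigned int u0, u1, u2;
    asm("call (%0), _optix_get_launch_index_x, ();" : "=r"(u0) :);
    asm("call (%0), _optix_get_launch_index_y, ();" : "=r"(u1) :);
    asm("call (%0), _optix_get_launch_index_z, ();" : "=r"(u2) :);
    return make_uint3(u0, u1, u2);
}
// The _optix_get_launch_index_* symbols are resolved by the OptiX internal
// compiler at pipeline creation time, which is why cuModuleLoad's PTX JIT
// rejects such a module.
```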

On the other hand, an optixLaunch is basically an asynchronous CUDA kernel launch, and the whole OptiX 7 API integrates nicely with the rest of the CUDA host runtime and driver API, since you need to implement all resource management in native CUDA host code anyway. There is not much difference between calling optixLaunch and cuLaunchKernel or cudaLaunchKernel, except for the automatic handling of the constant launch parameter block in OptiX and the launch dimension size limit you find inside the OptiX Programming Guide.
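A rough sketch of a typical call site, assuming a Params launch-parameter struct, a built pipeline, and an SBT already set up elsewhere (all names illustrative):

```cuda
// Hedged sketch: optixLaunch alongside ordinary CUDA host calls.
// Params, pipeline, sbt, stream, width, and height are assumed here.
Params params = {};
params.image_width  = width;   // illustrative members
params.image_height = height;

CUdeviceptr d_params = 0;
cudaMalloc(reinterpret_cast<void **>(&d_params), sizeof(Params));
cudaMemcpyAsync(reinterpret_cast<void *>(d_params), &params, sizeof(Params),
                cudaMemcpyHostToDevice, stream);

// Asynchronous on the given stream, like any CUDA kernel launch; the
// launch dimension (width, height, 1) replaces an explicit grid/block.
optixLaunch(pipeline, stream, d_params, sizeof(Params), &sbt,
            width, height, /*depth=*/1);
cudaStreamSynchronize(stream);
```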


Thanks for your reply.

I want to launch the OptiX kernel with cuLaunchKernel or cudaLaunchKernel because I want to modify the grid size and block size. Here is another approach I tried: I guessed that optixLaunch might call cuLaunchKernel or cudaLaunchKernel internally, so I hooked both to modify the grid size and block size of kernels launched by optixLaunch, but it turns out that optixLaunch calls neither of them.
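For reference, the interposition attempt described above can be sketched like this on Linux (LD_PRELOAD style; the clamping policy and details are illustrative). It indeed never fires for work submitted by optixLaunch:

```cuda
// Sketch of the cuLaunchKernel interposer described in the post
// (Linux, LD_PRELOAD; build as a shared library, link with -ldl).
#define _GNU_SOURCE
#include <dlfcn.h>
#include <cuda.h>

extern "C" CUresult cuLaunchKernel(CUfunction f,
    unsigned gridX, unsigned gridY, unsigned gridZ,
    unsigned blockX, unsigned blockY, unsigned blockZ,
    unsigned sharedMemBytes, CUstream stream,
    void **kernelParams, void **extra)
{
    using Fn = CUresult (*)(CUfunction, unsigned, unsigned, unsigned,
                            unsigned, unsigned, unsigned,
                            unsigned, CUstream, void **, void **);
    static Fn real = (Fn)dlsym(RTLD_NEXT, "cuLaunchKernel");
    // Example modification: force a different block size (illustrative).
    blockX = 256; blockY = 1; blockZ = 1;
    return real(f, gridX, gridY, gridZ, blockX, blockY, blockZ,
                sharedMemBytes, stream, kernelParams, extra);
}
// optixLaunch never reaches this hook: OptiX submits its work through
// internal driver paths, not through the public cuLaunchKernel entry point.
```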

There is not much difference between calling optixLaunch and cuLaunchKernel or cudaLaunchKernel, except for the automatic handling of the constant launch parameter block in OptiX and the launch dimension size limit you find inside the OptiX Programming Guide.

So it’s not possible to modify the grid size and block size of OptiX kernels, because the sizes are decided automatically?

There is no way to control that part.
All internal scheduling is abstracted by the underlying raytracing drivers.

The optixLaunch call defines the launch dimension, which determines the number of threads.
The OptiX launch dimension limit is 2^30 (the product of width, height, and depth), which is smaller than what native CUDA kernel grids allow.

Grid and block sizes depend on the underlying GPU and the resources used inside the kernel.
The only other thing which could influence this aside from the launch dimension is the OptixModuleCompileOptions maxRegisterCount value.
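For reference, that option is set when creating the module. A minimal sketch (the explicit cap value is illustrative):

```cuda
// Sketch: maxRegisterCount is the one occupancy-related knob OptiX exposes.
OptixModuleCompileOptions module_compile_options = {};
module_compile_options.maxRegisterCount =
    OPTIX_COMPILE_DEFAULT_MAX_REGISTER_COUNT;  // or an explicit cap, e.g. 64
module_compile_options.optLevel   = OPTIX_COMPILE_OPTIMIZATION_DEFAULT;
module_compile_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_NONE;
```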

What exactly are you trying to achieve?
If you want to improve performance, analyze your device code with Nsight Compute.


I noticed that the OptiX block size is always 128x1x1. I want to check whether 128x1x1 is the best size, so I tried hooking cudaLaunchKernel and cuLaunchKernel to launch with a different block size.