cudaMalloc in rg()


I was wondering if there is a way to allocate memory in rg(), return the pointer and size of that buffer, and then read it out in my main.cpp.

As far as Visual Studio tells me, I am not allowed to use cudaMalloc in rg(). Is there another way I can achieve this? Or can I only allocate memory on the device from my host main.cpp before I start launching rays?

Thank you for your help and stay healthy!

Yes, you need to call cudaMalloc on the host, pass the pointer into your raygen using either launch params or your SBT entry, and then the host can read back the results after the launch is complete.
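A rough sketch of that pattern, assuming a launch-params struct — the names `Params`, `result_buffer`, and `num_elements` are illustrative, not from the thread:

```cpp
// Shared header (host and device):
struct Params
{
    float*                 result_buffer;  // device pointer, allocated on the host
    unsigned int           num_elements;
    OptixTraversableHandle handle;
};

// Device side (raygen program): write into the preallocated buffer.
//   extern "C" __constant__ Params params;
//   ...
//   params.result_buffer[ linear_index ] = value;

// Host side: allocate, fill params, launch, read back.
float* d_results = nullptr;
cudaMalloc( reinterpret_cast<void**>( &d_results ), num_elements * sizeof( float ) );

Params params = {};
params.result_buffer = d_results;
params.num_elements  = num_elements;
params.handle        = gas_handle;

CUdeviceptr d_params;
cudaMalloc( reinterpret_cast<void**>( &d_params ), sizeof( Params ) );
cudaMemcpy( reinterpret_cast<void*>( d_params ), &params, sizeof( Params ),
            cudaMemcpyHostToDevice );

optixLaunch( pipeline, stream, d_params, sizeof( Params ), &sbt, width, height, 1 );
cudaDeviceSynchronize();

// Read the results back after the launch is complete.
std::vector<float> h_results( num_elements );
cudaMemcpy( h_results.data(), d_results, num_elements * sizeof( float ),
            cudaMemcpyDeviceToHost );
```

Passing the pointer through an SBT record works the same way; the launch-params route is just the most common choice for globally visible buffers.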


Hey David!

Thank you for your immediate reply!
I only get to know the size my buffer has to be once I’m in rg().
Is there absolutely no way in which I can allocate memory at the rg() runtime and later access that from my .cpp?
I don’t render, so one launch index will launch many rays in rg().

I guess I could first run a launch without tracing rays just to determine the size of my buffers, and after setting up the buffers in main.cpp do the actual ray tracing part. But I would prefer a solution where I don’t have to launch OptiX twice.

Thanks for your help!

There is currently no way to allocate memory from inside OptiX device code (CUDA's in-kernel malloc is not available in OptiX programs). I do use a 2nd launch for this kind of thing frequently. The other option would be to allocate a buffer of the maximum possible size, and then after your launch re-allocate & copy the buffer to something smaller, if necessary.
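The second option could look roughly like this — a sketch only, where `d_hits`, `d_count`, and `MAX_HITS_PER_INDEX` are assumed names, and rg() is assumed to append results using an atomically incremented counter so the host can learn the final count:

```cpp
// Worst-case allocation up front.
float*       d_hits   = nullptr;
const size_t max_hits = size_t( width ) * height * MAX_HITS_PER_INDEX;  // assumed upper bound
cudaMalloc( reinterpret_cast<void**>( &d_hits ), max_hits * sizeof( float ) );

// In rg(), each result is written at an index obtained via
// atomicAdd() on a device counter (d_count), so after the launch
// the counter holds the number of elements actually produced.
optixLaunch( pipeline, stream, d_params, sizeof( Params ), &sbt, width, height, 1 );
cudaDeviceSynchronize();

unsigned int used = 0;
cudaMemcpy( &used, d_count, sizeof( unsigned int ), cudaMemcpyDeviceToHost );

// Shrink to the size actually used, if it's worth it.
if( used < max_hits )
{
    float* d_small = nullptr;
    cudaMalloc( reinterpret_cast<void**>( &d_small ), used * sizeof( float ) );
    cudaMemcpy( d_small, d_hits, used * sizeof( float ), cudaMemcpyDeviceToDevice );
    cudaFree( d_hits );
    d_hits = d_small;
}
```

The trade-off versus a counting pre-launch is memory headroom: this needs a safe upper bound on the output size, while the two-launch approach only ever allocates exactly what is needed.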



Thanks for the clarification!
I will use a “scouting” launch to determine the needed sizes.

Have a nice evening and stay healthy!