I’m trying to put together a program that relies on GPU computing, and it should be extensible with external DLLs / PTX (cubin) files. Compiling everything into a single executable / kernel is a nightmare and makes development extremely slow (slow compile times, I can’t compile specific modules, etc.).
So all I want to do is define some behavior in a C++ file, compile it to a PTX / cubin file, and load that code into my main kernel. (If possible, I don’t want to generate PTX code myself; I’m not familiar with code generation.)
So the basic structure looks like this
someKernel calls someFunction from file1.ptx. I have tried to extract the function pointer to someFunction (the value I get is actually 128) and pass it to someKernel using the driver API, but that doesn’t work.
So does that mean my function pointers are only valid for the kernel execution?
If yes, is there some easy way to merge these two PTX files using the driver API functions, so that I could use someFunction in someKernel? (It shouldn’t take long, since instant or nearly instant code loading is important. nvcc works great, but with highly complex code the compilation is very slow.)
If not, what happens if I manually merge the two PTX files by copying the someKernel code into the other PTX file, and then get the function pointers from that module? I am afraid that would screw up my registers, right? (I’m using a lot of recursive calls, so I have a lot of stack available to the functions; can that be managed automatically?)