I want to store data from one kernel into global memory, and then read it from another kernel, with both kernels coming from different library functions. Something like:
__global__ void kernel1(…)
    store = save something to global mem
__global__ void kernel2(…)
    read something from store that was written by kernel1
extern "C" DLLEXPORT int libfunc1(…)
extern "C" DLLEXPORT int libfunc2(…)
Is that even possible? If I cudaMalloc a buffer and store its pointer in kernel1's library, how would kernel2 in the other library know about it? When I compile I get the error "error: identifier "params_d" is undefined", and if I declare params_d in the second library I get a separate variable and can't read what kernel1 stored.
My suggestion would be to discover how to do the same thing with an ordinary C++ setup (no CUDA); you’ll be well on your way to figuring out how to do it with CUDA C++.
For the general mechanism, one library has to be able to return a pointer to the calling environment. The calling environment then passes that pointer to the other library as an argument in a call to that library.
Library 1 has to publish a function like:
int * get_allocated_ptr();
Library 2 then accepts a pointer:
void use_allocation(int * allocated_ptr);
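Put together, that mechanism might look like the following CUDA sketch. The function names, kernel bodies, and launch configurations here are placeholders, and error checking is omitted for brevity:

```cuda
#include <cuda_runtime.h>

#ifdef _WIN32
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT
#endif

// "library 1": allocates device memory, writes to it, hands the pointer back
__global__ void kernel1(int *store) { store[threadIdx.x] = threadIdx.x; }

extern "C" DLLEXPORT int *libfunc1(int n)
{
    int *params_d = nullptr;
    cudaMalloc(&params_d, n * sizeof(int)); // allocation outlives this call
    kernel1<<<1, n>>>(params_d);            // kernel1 writes into it
    cudaDeviceSynchronize();
    return params_d;                        // caller now holds the device pointer
}

// "library 2": accepts a device pointer it did not allocate
__global__ void kernel2(const int *store) { /* read store[threadIdx.x] ... */ }

extern "C" DLLEXPORT int libfunc2(int *params_d, int n)
{
    kernel2<<<1, n>>>(params_d);            // same allocation, different library
    return (cudaDeviceSynchronize() == cudaSuccess) ? 0 : -1;
}

// The calling environment glues the two together:
//   int *p = libfunc1(32);
//   libfunc2(p, 32);
//   cudaFree(p);
```

Note that no symbol (like params_d) is shared between the libraries; only the pointer value travels, through ordinary function arguments and return values.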
An additional wrinkle with CUDA is that you need to manage the context correctly. The driver API gives you explicit control over this. With the runtime API, both libraries will automatically pick up the same context (the device's primary context), so nothing special is needed.
Again, I would encourage you to figure this out via a pure C++ 2-library setup first. There is no reason that you should need to share a symbol from one library with the other library.