Dynamically load CUDA code in my application

Hi,
I want to call a CUDA function from my existing application without recompiling the application.
I would like to load the CUDA code dynamically, like this:

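main.cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose

int main()
{
    typedef void (*funcPointer)();
    // Load the shared library built from cudafunc.cu
    void *handle = dlopen("./cudafunc.so", RTLD_LAZY);
    if (!handle) return 1;
    // dlsym returns void*, so C++ needs an explicit cast
    funcPointer fun = reinterpret_cast<funcPointer>(dlsym(handle, "hello"));
    if (!fun) { dlclose(handle); return 1; }
    fun();
    dlclose(handle);
    return 0;
}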

cudafunc.cu (compile to cudafunc.so)

__global__ void kernelfunc()
{

}

// extern "C" keeps the symbol name unmangled so dlsym can find "hello"
extern "C" void hello()
{
    kernelfunc<<<…>>>();
}

Can I do this?

Yes, there are functions in the driver API (cuModuleLoad() and related calls) that let you load and run precompiled .cubin files.

I think he meant loading host code, not just device code.
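
For the host-code route, the usual dlopen/dlsym pattern applies, since a .so built from a .cu file is an ordinary shared library as far as the loader is concerned. Below is a sketch of that pattern with error checking. Because cudafunc.so is not available here, it uses cos from libm.so.6 as a stand-in; the helper name call_unary_from_library is hypothetical. For the case in this thread, the path would be ./cudafunc.so, the symbol "hello" and the function type void (*)(). On older glibc versions you may need to link with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <cassert>
#include <cstdio>

// Hypothetical helper: loads `lib_path`, looks up `symbol` as a
// double(double) function, calls it with `arg` and stores the result in *out.
// Returns false if the library or the symbol cannot be found.
bool call_unary_from_library(const char* lib_path, const char* symbol,
                             double arg, double* out)
{
    assert(lib_path && symbol && out);

    void* handle = dlopen(lib_path, RTLD_LAZY);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }

    dlerror();                              // clear any stale error state
    void* sym = dlsym(handle, symbol);
    const char* err = dlerror();
    if (err != nullptr || sym == nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "null symbol");
        dlclose(handle);
        return false;
    }

    // dlsym returns void*, so the function pointer needs an explicit cast.
    using UnaryFn = double (*)(double);
    UnaryFn fn = reinterpret_cast<UnaryFn>(sym);
    *out = fn(arg);

    dlclose(handle);
    return true;
}
```

The exported function must be declared extern "C" in the library, otherwise its name is mangled and the lookup by plain name fails.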