Driver API Dynamic Parallelism

Hey all,

I’ve been toying around for a few hours, digging into device-side kernel launches and using the cudadevrt library.
My application does not link cudart, and I’m bound to the driver API only.

The closest approach I could find from NVIDIA is to use the .device-link.obj and link cudadevrt. But that file seems rather empty, and I get an error when trying to load it with cuModuleLoadFatBinary().

How would this work?

(This is CUDA 6.5 and a GK110)