Host C++ without compilation attempts?

Is there a way to get nvcc to just spit out a C/C++ source file that has simply been massaged so it can be fed to the host compiler? I am thinking of something like the Qt moc system, which makes no attempt to compile, but simply translates the ‘extensions’ to the language into more standard C/C++ with the necessary supporting code. For CUDA, I imagine this would mean the kernel invocations being translated into a series of Runtime or Driver API calls, and possibly the cubin being encoded as a resource and loaded somewhere in the file.
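To make that concrete, here is a rough sketch of the kind of translation I have in mind. The kernel `scale`, its arguments, and the launch configuration are made up for illustration, and I am only assuming that the Runtime API's `cudaLaunchKernel` is a fair stand-in for what the `<<<...>>>` syntax turns into; the code nvcc really generates may look quite different.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to launch.
__global__ void scale(float* data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    const int n = 256;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    float* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    float factor = 2.0f;
    int count = n;

    // Extended syntax that only nvcc can parse:
    //   scale<<<dim3(2), dim3(128)>>>(dev, factor, count);

    // The same launch written as an ordinary Runtime API call.
    // Each entry in args points at one kernel argument, in order.
    void* args[] = { &dev, &factor, &count };
    cudaError_t err = cudaLaunchKernel((const void*)scale,
                                       dim3(2), dim3(128),
                                       args, 0, nullptr);
    if (err != cudaSuccess) {
        std::printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    std::printf("host[0] = %f\n", host[0]);
    return 0;
}
```

This file still has to go through nvcc because the kernel is defined in it, but the launch itself is plain C++ here, which is the form I was hoping nvcc could emit for me.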

I thought that is what the -cuda flag was supposed to do. I am running into issues with other libraries like Boost and Qt when they go through the nvcc chain that just don’t happen otherwise. Of course, I don’t use structs from those libraries in device code, but even when I use the -cuda flag to generate .cxx files, nvcc chokes while trying to parse some of their headers. I have tried putting device functions in separate files, but I still need nvcc to parse the kernel invocations.

Thanks for any assistance.