How to "wrap" function calls for running in CUDA?

Could anyone help me “wrap” function calls so that they are executed on the GPU?

I have a C++ program that already runs fine, but I want to run some portions of it on the GPU… But as you already know, functions that run on the GPU through CUDA must be compiled by nvcc, and so on…

I want to compile only that portion with nvcc, and the rest with ordinary gcc…

I am using Linux… and I am a newbie. I would like to hear some explanation of the compile steps… I would really appreciate it…

Thank you

PS. Could you provide some simple code for the wrapping, plus compile instructions?

Deja-vu, I just read this post 30 seconds ago…

Just look at the simpleCpp example in the SDK, it has everything. And note that you can call any CUDA runtime function directly from C++ (#include "cuda_runtime_api.h") except for those that bind textures, cudaMemcpyToSymbol, and kernel launches. So you only need to write thin wrappers for those in .cu files compiled via nvcc.
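In case a concrete starting point helps, here is a minimal sketch of such a thin wrapper. It is not taken from the SDK sample; the file names (scale_wrapper.cu, main.cpp), the kernel scale_kernel, and the wrapper scale_on_gpu are all made up for illustration, and the library path in the link command is an assumption that depends on where your CUDA toolkit is installed.

```cuda
// scale_wrapper.cu (hypothetical file name): the only file compiled by nvcc.
//
// Possible build steps on Linux (paths are assumptions, adjust to your install):
//   nvcc -c scale_wrapper.cu -o scale_wrapper.o
//   g++  -c main.cpp -o main.o
//   g++  main.o scale_wrapper.o -o app -L/usr/local/cuda/lib64 -lcudart
//
// In main.cpp (compiled by g++), declare the wrapper and call it like any function:
//   extern "C" int scale_on_gpu(float *host_data, float factor, int n);

#include <cuda_runtime.h>

// Device kernel: multiply each of the n elements by factor.
__global__ void scale_kernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Thin host wrapper with a plain C signature, so the g++ side never sees
// the <<<...>>> launch syntax. Scales host_data in place.
// Returns 0 on success, otherwise the cudaError_t value as an int.
extern "C" int scale_on_gpu(float *host_data, float factor, int n)
{
    if (n <= 0) {
        return 0;  // nothing to do
    }

    float *dev = 0;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dev, bytes);
    if (err != cudaSuccess) {
        return (int)err;
    }

    err = cudaMemcpy(dev, host_data, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;  // round up to cover all n
        scale_kernel<<<blocks, threads>>>(dev, factor, n);
        err = cudaGetLastError();  // reports launch errors
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(host_data, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return (int)err;
}
```

The extern "C" is there only to keep the symbol name simple for the linker; the rest of your program stays untouched and is built with gcc/g++ as before, with the nvcc-built object file added at link time.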