Avoiding cudaSetDevice in different calls to same executable: is it possible to create a persistent c

Hi all,

I’m wondering whether the GPU setup performed by cudaSetDevice can be made persistent between successive calls to the same executable from the system.

In my executable, setting up the GPU takes 2 seconds; at least, that is what I measure for cudaSetDevice. Everything else (data transfer from the CPU and the actual computation) takes just 0.5 s, so the setup is my real bottleneck.
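
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch in plain C++ of timing the setup separately from the rest. The helper name `seconds_taken` is hypothetical, and the CUDA calls appear only in a comment since they need the toolkit to compile. As far as I know the CUDA runtime may create its context lazily, so the cost can land on whichever call first needs the context rather than on cudaSetDevice itself; a cheap call such as `cudaFree(0)` is commonly used to force it, but check that against your toolkit version.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: wall-clock seconds spent running `work`.
double seconds_taken(const std::function<void()>& work) {
    assert(static_cast<bool>(work));  // an empty std::function cannot be called
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Intended use inside a CUDA program (assumes <cuda_runtime.h>, not compiled here):
//   double init = seconds_taken([] { cudaSetDevice(0); cudaFree(0); });
//   double rest = seconds_taken([] { /* transfers and kernel launches */ });
```

Comparing `init` against `rest` over a few runs shows whether the setup really dominates.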

In my actual application I call that executable thousands of times, so I’m wondering whether things can be arranged so that the setup does not have to be performed each time.

Any suggestion is very welcome!

Can you start up the GPU application as a separate persistent process or daemon?

Hi Gregory,

thanks for the quick answer! That’s encouraging.

Now, you mean the CUDA executable? Well, that is exactly my question: I do not know whether it is possible, or even where to find information on this. I’ve been googling for quite a while, but haven’t found concrete guidelines.


So it definitely is possible, although it will probably be very annoying to write the code for it. Essentially, you start your CUDA program once and keep it running, then use something like sockets or an IPC library to send it requests, so that this one process does all the interaction with the GPU. You will probably have to use a real IPC library (Boost has a decent one) to share memory and get decent performance. Another option would be to push the logic that calls your CUDA application into the application itself.
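
A rough sketch of the persistent-process idea, in plain C++ so it stays self-contained. Everything here is an assumption for illustration: `GpuSession` is a hypothetical stand-in for the real GPU setup and work, and the line-per-request protocol is invented. The streams would be backed by a socket, a named pipe, or shared memory in a real setup.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the GPU side. init() models the expensive
// one-time setup (cudaSetDevice, context creation); run() models
// "copy data in, launch kernels, copy results out".
struct GpuSession {
    bool ready = false;
    int init_calls = 0;
    void init() {
        if (!ready) { ready = true; ++init_calls; }
    }
    double run(double x) const {
        assert(ready);  // work is only valid after setup
        return 2.0 * x;
    }
};

// Serve requests until the input ends or a "quit" line arrives.
// One number per line in, one result per line out.
// Returns the number of requests successfully handled.
int serve(GpuSession& gpu, std::istream& in, std::ostream& out) {
    gpu.init();  // paid once per process, not once per request
    int handled = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line == "quit") break;
        std::istringstream fields(line);
        double x;
        if (!(fields >> x)) {
            out << "error\n";  // malformed request, keep serving
            continue;
        }
        out << gpu.run(x) << '\n';
        ++handled;
    }
    return handled;
}
```

The executable that gets called thousands of times then shrinks to a thin client that connects, sends one line, and reads one line back, so it never touches the GPU itself.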

Hi Gregory,

thanks for the hint… I took a look and the IPC option seemed difficult, so I took the other option, which was the lesser evil: rewriting the application.
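
In case it helps someone else, the shape of that rewrite is roughly the following. This is only a sketch in plain C++, not my actual code: `Device` and `run_all` are hypothetical names, and the setup and computation are placeholders for the real CUDA calls.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the GPU: setup() models the slow one-time
// initialization, compute() models one job's transfers and kernels.
struct Device {
    int setups = 0;
    void setup() { ++setups; }
    double compute(double x) const {
        assert(setups > 0);  // never compute before setup
        return x * x;
    }
};

// Process every job inside one process, paying the setup cost once
// instead of once per executable launch.
std::vector<double> run_all(Device& dev, const std::vector<double>& jobs) {
    dev.setup();
    std::vector<double> results;
    results.reserve(jobs.size());
    for (double job : jobs) {
        results.push_back(dev.compute(job));
    }
    return results;
}
```

With the numbers from my first post, thousands of jobs would then pay the 2 seconds once rather than on every call.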

I’ve seen that MATLAB MEX-files can do this, by the way: between calls to the GPU from MATLAB, they can keep the GPU state (or part of it).