Scheduling multiple applications on a single GPU

Dear All,

I have two CUDA-based applications sharing a single GPU in a time-shared manner. When I run the two applications together, both show sub-optimal performance. It would be great if someone could explain how the CUDA driver schedules applications when multiple applications share a single GPU. Is it round-robin scheduling, FIFO scheduling, or something else?

Thanks in Advance,

I’m not sure which it is (round-robin or FIFO), but there is definitely some overhead when the GPU switches between contexts. The switching overhead seems to be lower on Fermi GPUs, but it will be noticeable if both programs execute many small kernels, since the GPU will try to time-slice between the kernels of the two contexts.
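One way to get a rough feel for this overhead yourself is with a tiny benchmark: launch many small kernels in a loop and time the total, then run two copies of the binary at once and compare the per-process times against a solo run. This is only a minimal sketch with a made-up workload (the kernel and launch counts are arbitrary), not a model of your actual applications:

```cuda
// Sketch: time many tiny kernel launches. Run one copy alone, then two
// copies concurrently, and compare the reported times to estimate the
// cost of context switching / time-slicing between the two processes.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void tinyKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.0001f + 1.0f;  // trivial work
}

int main() {
    const int n = 1 << 16;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int k = 0; k < 10000; ++k)                    // many small launches
        tinyKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("10000 tiny kernels took %.1f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Compile with `nvcc`, then run `./a.out & ./a.out` and compare against a single run; if each process takes noticeably more than its solo time, that gap is largely the switching overhead between the two contexts.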