Two tasks on the same GPU?

Hi everybody!!

I need to test something. Let me explain with an example: I have two big 1024x1024 matrices that I want to multiply using CUDA or CUBLAS (this multiplication runs forever, in an infinite while loop). At the same time, I want to visualize a complex scene in a 3D viewer.

Both tasks need to execute with as little latency as possible and without deadlock (if possible). So is it possible to run these two tasks at the same time on the GPU?

Thanks for your help.

There is no time sharing in CUDA at present. Your GPU is either running CUDA or rendering, but not both. If the GPU is running an active display, you won’t be able to run an “infinite” kernel - the operating system display driver will intervene and kill the CUDA kernel after about 5 seconds of uninterrupted running.

Multiplying 1024x1024 matrices using cublas would probably take on the order of 10 ms on a top modern GPU. That's not counting the time to copy data between host and device.

When an app issues many commands concurrently (for example render calls and kernel calls), the device automatically queues them and executes the commands in order, but without timeslicing.

So the option I see is a host-side loop over cublas calls. Looping over kernel launches, instead of looping inside a single kernel, lets the device switch to another task between launches. For minimal latency you could interleave the matrix multiplication routines with rendering calls in that loop.
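A rough sketch of that interleaved loop, assuming the cublasSgemm API and device-resident matrices; the `renderFrame` call is a hypothetical stand-in for whatever your 3D viewer does each frame:

```cpp
// Sketch only: alternate one cuBLAS matrix multiply with one render pass
// per iteration, so the GPU switches between the two workloads instead of
// being locked inside an infinite kernel. Error checking omitted for brevity.
#include <cublas_v2.h>
#include <cuda_runtime.h>

void renderFrame();  // hypothetical: your 3D viewer's draw call

void computeAndRenderLoop(cublasHandle_t handle,
                          const float* dA, const float* dB, float* dC,
                          int n, volatile bool* keepRunning)
{
    const float alpha = 1.0f, beta = 0.0f;
    while (*keepRunning) {
        // One n x n multiply: C = A * B (column-major, no transpose)
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);
        // Wait for the multiply before issuing render work, so each
        // iteration yields the GPU back to the display
        cudaDeviceSynchronize();
        renderFrame();
    }
}
```

Because each launch is short (one multiply), the display driver's watchdog never sees a long-running kernel, and the rendering commands get their turn every iteration.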

Of course, at 10 ms per matrix multiplication alone, your upper bound for rendering speed is 100 FPS. How fast does the visualization need to be? Also, benchmark cublas performance on your own hardware, don't trust me :)
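For that benchmarking, a minimal sketch using CUDA events to time a single cublasSgemm call (assumes the matrices are already on the device):

```cpp
// Sketch: measure the device-side time of one n x n SGEMM with CUDA events.
// CUDA events time GPU work accurately without stalling the whole pipeline
// until cudaEventSynchronize. Error checking omitted for brevity.
#include <cublas_v2.h>
#include <cuda_runtime.h>

float timeSgemmMs(cublasHandle_t handle,
                  const float* dA, const float* dB, float* dC, int n)
{
    const float alpha = 1.0f, beta = 0.0f;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the multiply has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```

Run it a few times and discard the first call, since the first cublas invocation typically includes one-off initialization overhead.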