CUSPARSE multithreaded


Does anybody know if it is possible to start two or more CUSPARSE functions in parallel on one device? Perhaps with MPI?

Thanks all.

CUSPARSE supports streams, so if you have a Fermi card and launch a pair of sparse operations into different streams, there is some possibility for the kernels to run simultaneously on the same device. Otherwise, there is no way that I am aware of. It is possible to establish two contexts on a single device (and two CUSPARSE operations could be run from different threads into those contexts), but that would not produce parallel execution, just two contexts competing for resources on the same device in a non-deterministic way.
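
To make the stream approach concrete, here is a minimal sketch of binding two CUSPARSE handles to two different streams with cusparseSetStream. Using one handle per stream is my own assumption for illustration (a single handle re-bound between calls should also work), the error-checking macros are hypothetical helpers, and the actual sparse calls are left as a placeholder because they depend on your matrices. Whether the kernels really overlap is up to the hardware and how many resources each kernel needs.

```cuda
// Build (assumed command line): nvcc two_streams.cu -lcusparse
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cusparse.h>

// Hypothetical helper macros: abort on the first failing call.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s\n",                       \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

#define CHECK_CUSPARSE(call)                                          \
    do {                                                              \
        cusparseStatus_t st_ = (call);                                \
        if (st_ != CUSPARSE_STATUS_SUCCESS) {                         \
            fprintf(stderr, "CUSPARSE error code %d\n", (int)st_);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    const int kNumStreams = 2;
    cudaStream_t streams[kNumStreams];
    cusparseHandle_t handles[kNumStreams];

    // One stream and one CUSPARSE handle per independent operation.
    for (int i = 0; i < kNumStreams; ++i) {
        CHECK_CUDA(cudaStreamCreate(&streams[i]));
        CHECK_CUSPARSE(cusparseCreate(&handles[i]));
        // Every CUSPARSE call made through handles[i] is now
        // enqueued into streams[i] instead of the default stream.
        CHECK_CUSPARSE(cusparseSetStream(handles[i], streams[i]));
    }

    // Placeholder: issue the two sparse operations here, the first
    // through handles[0] and the second through handles[1]. They must
    // not depend on each other's results.

    // Wait for both streams to drain, then release everything.
    for (int i = 0; i < kNumStreams; ++i) {
        CHECK_CUDA(cudaStreamSynchronize(streams[i]));
        CHECK_CUSPARSE(cusparseDestroy(handles[i]));
        CHECK_CUDA(cudaStreamDestroy(streams[i]));
    }

    printf("Both streams finished.\n");
    return 0;
}
```

Nothing here guarantees concurrency; it only removes the implicit serialization you get when both operations go into the same stream.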

Correct, streams are the way to go with CUBLAS and CUSPARSE. They allow you to run concurrent kernels and also to overlap device/host memcpy with kernel execution if you use pinned memory.
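
As a rough illustration of the copy/kernel overlap with pinned memory, here is a small sketch using plain CUDA runtime calls. The scale kernel, the buffer names, and the sizes are made up for the example, and error checking is omitted to keep it short. The same pattern should apply when the kernel launch is replaced by a CUBLAS or CUSPARSE call bound to the stream.

```cuda
// Build (assumed command line): nvcc overlap.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel: multiply every element by a.
__global__ void scale(float *v, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= a;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Pinned (page-locked) host buffers; asynchronous copies can only
    // overlap with kernels when the host memory is pinned.
    float *hA, *hB;
    cudaMallocHost(&hA, bytes);
    cudaMallocHost(&hB, bytes);
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);

    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    // Each stream gets its own upload, kernel, and download. Work
    // within one stream runs in order; work in different streams is
    // allowed to overlap, for example the copy in s1 with the kernel
    // in s0.
    cudaMemcpyAsync(dA, hA, bytes, cudaMemcpyHostToDevice, s0);
    cudaMemcpyAsync(dB, hB, bytes, cudaMemcpyHostToDevice, s1);
    scale<<<blocks, threads, 0, s0>>>(dA, n, 3.0f);
    scale<<<blocks, threads, 0, s1>>>(dB, n, 3.0f);
    cudaMemcpyAsync(hA, dA, bytes, cudaMemcpyDeviceToHost, s0);
    cudaMemcpyAsync(hB, dB, bytes, cudaMemcpyDeviceToHost, s1);

    // Block until both streams have finished before reading results.
    cudaDeviceSynchronize();
    printf("hA[0] = %f, hB[0] = %f\n", hA[0], hB[0]);

    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(dA);
    cudaFree(dB);
    cudaFreeHost(hA);
    cudaFreeHost(hB);
    return 0;
}
```

A profiler timeline is the easiest way to check whether the copies and kernels actually overlap on your card.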