Easiest way to invoke two different kernels simultaneously ?

Michael_H1 · April 11, 2012, 5:06pm

What is the easiest way to invoke two different kernels simultaneously?
Kernel A + Kernel B

There are two warp schedulers per SM in Fermi.
Is it possible to assign kernel A and kernel B to one scheduler each, so that they execute simultaneously?

Or is the only way to rely on the GPU’s context-switching capability?
To achieve this, should I create two CPU threads and call CUDA_Kernel from both at the same time?
If so, the GPU still executes one kernel at a time, right? (It does context switching…)

Any suggestions?
Thank you

seibert · April 11, 2012, 5:16pm

Fermi supports the launch of multiple kernels on the same GPU using the concept of “streams”. Kernels on different streams can execute at the same time on one GPU, and kernels in the same stream execute in order. See section 3.2.5.5 in the CUDA Programming Guide.
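
A minimal sketch of the stream approach described above. The kernel names `kernelA` and `kernelB`, their bodies, and the buffer size are hypothetical placeholders, not from the original question. Putting the two launches in different streams only permits overlap; it does not force it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-ins for "Kernel A" and "Kernel B".
__global__ void kernelA(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * 2.0f;
}

__global__ void kernelB(float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = y[i] + 1.0f;
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    float *dA = NULL, *dB = NULL;
    cudaMalloc(&dA, n * sizeof(float));
    cudaMalloc(&dB, n * sizeof(float));
    cudaMemset(dA, 0, n * sizeof(float));
    cudaMemset(dB, 0, n * sizeof(float));

    // One stream per kernel: work in different streams may overlap,
    // work within a single stream runs in issue order.
    cudaStream_t streamA, streamB;
    cudaStreamCreate(&streamA);
    cudaStreamCreate(&streamB);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // The fourth launch parameter selects the stream.
    kernelA<<<blocks, threads, 0, streamA>>>(dA, n);
    kernelB<<<blocks, threads, 0, streamB>>>(dB, n);

    // Wait for both streams to finish and report any error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaStreamDestroy(streamA);
    cudaStreamDestroy(streamB);
    cudaFree(dA);
    cudaFree(dB);
    return err == cudaSuccess ? 0 : 1;
}
```

Both launches come from a single CPU thread, so no second host thread is needed. Each kernel launch returns immediately and the host only blocks at `cudaDeviceSynchronize()`.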

Michael_H1 · April 11, 2012, 5:23pm

Thanks. Does this mean that if I use a Fermi GPU, which has more than one warp scheduler per SM, two different kernels will be executed on each SM at the same time?

seibert · April 11, 2012, 6:56pm

There is no relation between the number of warp schedulers per multiprocessor and the number of concurrent kernels. The limit on Fermi is 16 concurrent kernels, I believe, although it is up to the driver how many it actually runs simultaneously. (That’s an important note! Streams tell CUDA which kernels can run concurrently, but they do not guarantee concurrency. Streams still work on pre-Fermi GPUs, but the kernels get run sequentially.)
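
Whether a given device can overlap kernels at all can be checked at run time. The sketch below assumes device 0 is the GPU of interest and reads the `concurrentKernels` field of `cudaDeviceProp`. A nonzero value means the hardware supports concurrent kernels, not that any particular pair of launches will actually overlap.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const int device = 0;  // assumption: query the first GPU
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // concurrentKernels is nonzero when the device can run
    // kernels from different streams at the same time.
    printf("%s (compute capability %d.%d): concurrent kernels %s\n",
           prop.name, prop.major, prop.minor,
           prop.concurrentKernels ? "supported" : "not supported");
    return 0;
}
```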

Gert-Jan · April 12, 2012, 7:57am

According to the Fermi whitepaper (page 18), all SMs are first filled with threads from the first kernel; after that, threads from a second kernel are used. When all SMs are completely filled, threads from the next kernel have to wait until another kernel finishes. So it is concurrent, but I don’t think there is a way to tell the GPU: use 5 SMs for kernel-1, the other SMs for kernel-2. Would be nice though.
