Resource partitioning in Streams

sajshaj94 · November 13, 2019, 6:19am

Hi,
It is known that streams let multiple kernels run concurrently on the device. I want to know how the cores/resources are divided among the streams. Are there options for the user to configure the cores/resources used by each stream?

If the resources are partitioned by the compiler, is there any logic to the partitioning, or are they partitioned equally?

sajshaj94 · November 19, 2019, 5:14pm

Can someone answer my question, please?

Robert_Crovella · November 19, 2019, 8:49pm

The resources are scheduled by the GPU at runtime via the kernel launch system and the GPU block scheduler. None of these mechanisms are specified or documented. The GPU block scheduler is free to choose blocks from any launched kernels, in any order, at any time, for deposit on any GPU SM. There is no formal partitioning.
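To make the "no formal partitioning" point concrete, here is a minimal sketch of two kernels launched into two streams. The kernel name `busyKernel`, the grid sizes, and the iteration count are assumptions chosen for illustration only. Nothing in the launch syntax assigns SMs or cores to a stream; whether the kernels actually overlap is up to the block scheduler and can only be observed with a profiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread does some arithmetic so the kernel
// runs long enough for any overlap to be visible in a profiler.
__global__ void busyKernel(float *out, int iters)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float v = static_cast<float>(idx);
    for (int i = 0; i < iters; ++i)
        v = v * 1.0001f + 0.5f;
    out[idx] = v;
}

int main()
{
    const int kStreams = 2;
    const int kBlocks  = 4;    // small grids, so one kernel alone does not fill the GPU
    const int kThreads = 128;

    cudaStream_t streams[kStreams];
    float *buffers[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&buffers[s], kBlocks * kThreads * sizeof(float));
    }

    // One launch per stream. The fourth launch parameter selects the stream;
    // there is no parameter for choosing SMs or a share of the device.
    for (int s = 0; s < kStreams; ++s)
        busyKernel<<<kBlocks, kThreads, 0, streams[s]>>>(buffers[s], 1 << 20);

    // Wait for all streams and report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    for (int s = 0; s < kStreams; ++s) {
        cudaFree(buffers[s]);
        cudaStreamDestroy(streams[s]);
    }
    return err == cudaSuccess ? 0 : 1;
}
```

Running this under a timeline profiler shows whether the two kernels overlapped on a given GPU; the code itself cannot request or guarantee it.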

The only user control over stream activity is the stream priority mechanism, which is documented in the programming guide. This effectively prioritizes blocks that are waiting to be scheduled by the GPU block scheduler, so that blocks from higher-priority streams receive scheduling priority over blocks from lower-priority streams.
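As a rough illustration of the priority mechanism, the sketch below asks the runtime which priority values the current device accepts, using `cudaDeviceGetStreamPriorityRange`. In the CUDA runtime, a numerically lower value means a higher priority, so the "greatest" priority is the smaller number. The variable names are my own, and the values printed depend on the GPU and CUDA version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int leastPriority = 0;     // numerically largest value, lowest priority
    int greatestPriority = 0;  // numerically smallest value, highest priority

    cudaError_t err =
        cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority);
    if (err != cudaSuccess) {
        printf("query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // If the device does not support stream priorities, both values are 0
    // and this reports a single level.
    int levels = leastPriority - greatestPriority + 1;
    printf("least = %d, greatest = %d, levels = %d\n",
           leastPriority, greatestPriority, levels);
    return 0;
}
```

Querying the range at runtime avoids hardcoding an assumption about how many levels a particular device exposes.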

sajshaj94 · December 3, 2019, 5:07pm

Thanks for suggesting stream priority.
But in my application I have 5 streams. How can I set different priorities for different streams? Reading the documentation, I find only two priorities available.
If there are only two priorities, can I set one of the 5 streams to high priority and leave the rest at the same lower priority?

Robert_Crovella · December 4, 2019, 2:47am

Yes, you can set one to be high priority and the rest to be low priority.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#stream-priorities

The number of priorities is subject to change. Study the above doc. Yes, currently I believe there will only be 2.
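For the five-stream case discussed above, a minimal sketch might look like the following. It assumes stream 0 is the one that should be favored (an arbitrary choice for illustration) and creates the other four at the lowest priority. The priority bounds are queried rather than hardcoded, since the number of levels may change.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int kNumStreams = 5;

    // Lower number = higher priority in the CUDA runtime.
    int leastPriority = 0;
    int greatestPriority = 0;
    cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority);

    cudaStream_t streams[kNumStreams];
    for (int i = 0; i < kNumStreams; ++i) {
        // Assumption for this sketch: stream 0 is the high-priority stream,
        // streams 1..4 all share the lowest priority.
        int priority = (i == 0) ? greatestPriority : leastPriority;
        cudaError_t err = cudaStreamCreateWithPriority(
            &streams[i], cudaStreamNonBlocking, priority);
        if (err != cudaSuccess) {
            printf("stream %d: %s\n", i, cudaGetErrorString(err));
            return 1;
        }
    }

    // Read the priorities back to confirm what the runtime assigned.
    for (int i = 0; i < kNumStreams; ++i) {
        int assigned = 0;
        cudaStreamGetPriority(streams[i], &assigned);
        printf("stream %d priority = %d\n", i, assigned);
    }

    for (int i = 0; i < kNumStreams; ++i)
        cudaStreamDestroy(streams[i]);
    return 0;
}
```

Kernels are then launched into these streams as usual. The priority only affects which waiting blocks the scheduler picks first; it does not reserve any part of the GPU for the high-priority stream.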