SMs can be partitioned using MPS resource provisioning. That requires the work to be split into separate processes. Within a single process, the only methods are the ones already mentioned, the primary one being stream priorities. (Another effective method is probably to use two or more GPUs.) I have made suggestions about how to use stream priorities to give the best progress to the high-priority stream here. I don't have any further suggestions. It's quite possible these suggestions don't address every case.
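For reference, a minimal sketch of the single-process, stream-priority approach might look like the following. The kernel `busy_kernel`, the buffer sizes, and the iteration counts are hypothetical placeholders chosen for illustration; only the runtime calls (`cudaDeviceGetStreamPriorityRange`, `cudaStreamCreateWithPriority`) are the actual CUDA API. Priorities influence which pending thread blocks get scheduled first as resources free up; this sketch does not reserve SMs for either stream.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for real work: each thread spins on some arithmetic.
__global__ void busy_kernel(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k)
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main()
{
    // Query the supported range. Numerically lower values mean higher
    // priority, so "greatest" is the smaller (or equal) number.
    int least = 0, greatest = 0;
    CHECK(cudaDeviceGetStreamPriorityRange(&least, &greatest));
    printf("priority range: least=%d greatest=%d\n", least, greatest);

    cudaStream_t low, high;
    CHECK(cudaStreamCreateWithPriority(&low, cudaStreamNonBlocking, least));
    CHECK(cudaStreamCreateWithPriority(&high, cudaStreamNonBlocking, greatest));

    const int n_low = 1 << 22;   // large background job (assumed size)
    const int n_high = 1 << 16;  // small latency-sensitive job (assumed size)
    float *d_low = nullptr, *d_high = nullptr;
    CHECK(cudaMalloc(&d_low, n_low * sizeof(float)));
    CHECK(cudaMalloc(&d_high, n_high * sizeof(float)));
    CHECK(cudaMemset(d_low, 0, n_low * sizeof(float)));
    CHECK(cudaMemset(d_high, 0, n_high * sizeof(float)));

    const int threads = 256;
    // Launch the background work first, using many short-lived blocks so
    // that SM resources are released frequently.
    busy_kernel<<<(n_low + threads - 1) / threads, threads, 0, low>>>(
        d_low, n_low, 2000);
    CHECK(cudaGetLastError());
    // Then launch the high-priority work; its pending blocks should be
    // preferred whenever a low-priority block retires.
    busy_kernel<<<(n_high + threads - 1) / threads, threads, 0, high>>>(
        d_high, n_high, 2000);
    CHECK(cudaGetLastError());

    CHECK(cudaStreamSynchronize(high));
    CHECK(cudaStreamSynchronize(low));

    CHECK(cudaFree(d_low));
    CHECK(cudaFree(d_high));
    CHECK(cudaStreamDestroy(low));
    CHECK(cudaStreamDestroy(high));
    return 0;
}
```

This should build with something like `nvcc -o prio prio.cu` (the file name is arbitrary). Whether the high-priority kernel actually gets ahead depends on the GPU and on how quickly the low-priority blocks retire, so it is worth checking the timeline in a profiler such as Nsight Systems rather than assuming the ordering.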