How does the GPU decide when to switch out a CUDA context?

howkong · June 18, 2021, 8:57am

I am running an application that uses the GPU, and I have noticed that once the GPU has nothing left to do, the CUDA context still stays on the device for about 1 ms before being switched out.

I observed this phenomenon using Nsight Systems, shown below:
[image: context]

One can see that, after the GPU has nothing left to run, execution moves from the GPU to the DLA. However, the GPU still keeps that CUDA context resident on the device for roughly 800 µs. (My program uses both the GPU and the DLA to compute a neural network.)
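To put a number on that lingering interval, here is a minimal sketch of how it could be computed from timestamps read off the timeline. Everything in it is an assumption for illustration: the function name `idle_residency_us` is hypothetical, the timestamps are made up, and it is not part of any Nsight Systems API. It only subtracts the end of the last GPU kernel from the time the context leaves the device.

```python
def idle_residency_us(kernel_intervals_ns, context_end_ns):
    """Return how long (in microseconds) the context stayed resident
    after the last kernel finished.

    kernel_intervals_ns: list of (start_ns, end_ns) tuples, one per kernel.
    context_end_ns: timestamp at which the context was switched out.
    """
    if not kernel_intervals_ns:
        raise ValueError("need at least one kernel interval")
    # The GPU goes idle when the latest-ending kernel completes.
    last_kernel_end_ns = max(end for _start, end in kernel_intervals_ns)
    return (context_end_ns - last_kernel_end_ns) / 1000.0


# Made-up timestamps (nanoseconds), chosen only to illustrate the arithmetic.
kernels = [(0, 200_000), (250_000, 1_200_000)]
context_end = 2_000_000
print(idle_residency_us(kernels, context_end))
```

Collecting this value over many iterations would show whether the interval is a fixed timeout or varies with the workload.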

So what exactly is the GPU’s context switch strategy?

Robert_Crovella · June 18, 2021, 2:19pm

As far as I know, the details of context switching are not published. You may wish to ask Jetson questions on the Jetson forums.
