What I intended to ask was this: suppose two CUDA applications are running, and each application has a different "kernel". Will the driver submit these kernels to the GPU serially (i.e., submit the second job only after kernel 1 finishes), or can the two kernel executions be interleaved, as on a conventional CPU? In the latter case, when the GPU switches between the kernels, it needs to save the context and state of kernel 1. Is this possible? If so, will the host code have access to that saved context?
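To make the scenario concrete, here is a minimal sketch. It uses two streams within a single process as a stand-in for the two separate applications (the kernel names, sizes, and stream setup here are my own illustration, not taken from any real app):

```cuda
#include <cuda_runtime.h>

// Two unrelated "kernels", standing in for the two applications' work.
__global__ void kernel1(float *a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0f;
}

__global__ void kernel2(float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) b[i] += 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    // Separate streams, so the launches are at least *eligible* to overlap.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Whether these two launches actually run interleaved on the GPU,
    // or get serialized by the driver, is exactly what I am asking about.
    kernel1<<<(n + 255) / 256, 256, 0, s1>>>(a, n);
    kernel2<<<(n + 255) / 256, 256, 0, s2>>>(b, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

The case I actually care about is the harder one, where kernel1 and kernel2 come from two different host processes, each with its own CUDA context, rather than from two streams in one process as sketched here.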