CPU-GPU Parallel programming (Python)

Is there a way to run functions concurrently on the CPU and GPU using Python? I’m already using Numba for thread-level scheduling of compute-intensive functions on the GPU, but I now also need parallelism between the CPU and the GPU. Once the GPU shared memory has all the data it needs to start processing, I want to trigger the GPU work and then, in parallel, run some functions on the host using the CPU. Is there a standard library or approach for this? I'd appreciate any pointers.

Numba's execution characteristics are similar to those of CUDA C++ in this regard.

Launch your Numba kernel. The launch is asynchronous, so control returns to the host right away. From then on, until you do some sort of synchronization (such as copying data from device to host), any CPU code you write after the kernel launch will run concurrently on the CPU with the kernel running on the GPU.
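Here is a minimal sketch of that pattern, assuming a CUDA-capable GPU with Numba and NumPy installed. The names `cpu_work`, `run_overlapped`, and the `scale` kernel are hypothetical and only for illustration. The data is placed on the device explicitly with `cuda.to_device`, because passing a host NumPy array straight to a kernel makes Numba copy it back when the kernel finishes, which would block the host.

```python
def cpu_work(n):
    """Hypothetical host-side task to run while the GPU is busy."""
    return sum(i * i for i in range(n))


def run_overlapped(n=1_000_000):
    # Imports are deferred so cpu_work can be used on a machine without a GPU.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def scale(arr, factor):
        # Hypothetical kernel: multiply each element in place.
        i = cuda.grid(1)
        if i < arr.shape[0]:
            arr[i] *= factor

    # Copy the input to device memory up front.
    d_arr = cuda.to_device(np.ones(n, dtype=np.float32))

    threads = 256
    blocks = (n + threads - 1) // threads

    # The launch queues the kernel and returns without waiting for it.
    scale[blocks, threads](d_arr, 2.0)

    # This host code runs on the CPU while the kernel runs on the GPU.
    host_result = cpu_work(100_000)

    # Explicit synchronization point: wait for the kernel to finish.
    cuda.synchronize()

    # Copy the GPU result back to the host.
    return host_result, d_arr.copy_to_host()
```

Calling `run_overlapped()` returns the CPU result together with the array computed on the GPU. Note that the first launch of a `@cuda.jit` kernel also includes JIT compilation on the host, so the overlap is easier to observe on later launches.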

Thanks Robert. I was thinking along similar lines but wasn’t sure. I checked that until I add some synchronization for task completion, the GPU and CPU effectively run in parallel. Thanks again.