Multithreading for CPU/GPU parallelism

Hello, I am working on deploying a deep learning network on Jetson Xavier.
Besides the TensorRT acceleration, I am trying to parallelize the whole process, including frame preprocessing and model inference. How can I implement the following pipeline?
thread1 cpugpucpugpucpugpu
thread2 cpugpucpugpucpugpu
I ask because the Xavier is a multi-core platform.
Any advice?

thread1 cpugpucpugpucpugpu
thread2 xxxcpugpucpugpucpugpu
That way the GPU is always busy computing.

Hi,

Xavier only has one GPU.
You will need to put the separate CUDA tasks in a single application and run them on different CUDA streams.
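
As a rough illustration of that layout (not taken from the linked comment), here is a minimal Python sketch: one process, two worker threads, each handling every other frame and owning its own stream slot. The names `preprocess`, `infer`, `worker`, and `run_pipeline` are hypothetical, and both stages are only simulated with a sleep. In a real application `infer` would presumably enqueue the TensorRT work on the CUDA stream created by that thread and wait on that stream only.

```python
import threading
import time


def preprocess(frame_id):
    # Hypothetical CPU stage (resize, normalize, ...), simulated with a sleep.
    time.sleep(0.01)
    return frame_id * 2


def infer(stream_id, tensor):
    # Hypothetical GPU stage, simulated with a sleep. stream_id stands in for
    # the per-thread CUDA stream and is not otherwise used in this sketch.
    time.sleep(0.01)
    return tensor + 1


def worker(stream_id, frame_ids, results, lock):
    # Each worker alternates CPU and GPU stages for its own frames.
    for frame_id in frame_ids:
        tensor = preprocess(frame_id)
        output = infer(stream_id, tensor)
        with lock:
            results[frame_id] = output


def run_pipeline(num_frames=6, num_workers=2):
    results = {}
    lock = threading.Lock()
    threads = []
    for stream_id in range(num_workers):
        # Interleave frames: worker 0 gets 0, 2, 4, ...; worker 1 gets 1, 3, 5, ...
        frame_ids = range(stream_id, num_frames, num_workers)
        threads.append(
            threading.Thread(target=worker, args=(stream_id, frame_ids, results, lock))
        )
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return the outputs in frame order.
    return [results[i] for i in range(num_frames)]


if __name__ == "__main__":
    print(run_pipeline())
```

Whether the GPU actually stays busy with this arrangement depends on how long the CPU and GPU stages take relative to each other, so it is worth profiling on the device.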

You can find more detail in this comment:
https://devtalk.nvidia.com/default/topic/1068756/jetson-agx-xavier/best-approaches-for-mps-on-xavier/post/5413979/#5413979

Thanks.