Hi,
The Jetson/ARM platform does not support MPS (Multi-Process Service).
To run separate ML/CUDA tasks concurrently, you will need to put them in one application and launch them on different CUDA streams.
This is because each CPU process creates its own CUDA context.
If the CUDA tasks run in different processes, they will run in different CUDA contexts.
GPU resources are time-sliced between CUDA contexts, so kernels from different contexts can't run in parallel:
https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#multiple-contexts
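
For reference, here is a minimal sketch of the single-process approach, with two tasks sharing one CUDA context and each launched on its own stream. The kernels `taskA` and `taskB` and the buffer size are hypothetical placeholders for your real workloads. Please note that separate streams only allow the kernels to overlap; whether they actually run in parallel depends on how much of the GPU each kernel occupies.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-ins for two independent tasks (not from any real app).
__global__ void taskA(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f;
}

__global__ void taskB(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] + 1.0f;
}

// Print the error and exit main() if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    const int n = 1 << 20;  // arbitrary size chosen for illustration
    float *a = nullptr, *b = nullptr;
    CHECK(cudaMalloc(&a, n * sizeof(float)));
    CHECK(cudaMalloc(&b, n * sizeof(float)));
    CHECK(cudaMemset(a, 0, n * sizeof(float)));
    CHECK(cudaMemset(b, 0, n * sizeof(float)));

    // One process, one CUDA context, two streams.
    cudaStream_t streamA, streamB;
    CHECK(cudaStreamCreate(&streamA));
    CHECK(cudaStreamCreate(&streamB));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Kernels in different streams of the same context may overlap.
    taskA<<<blocks, threads, 0, streamA>>>(a, n);
    taskB<<<blocks, threads, 0, streamB>>>(b, n);
    CHECK(cudaGetLastError());

    // Wait for both tasks to finish.
    CHECK(cudaStreamSynchronize(streamA));
    CHECK(cudaStreamSynchronize(streamB));

    CHECK(cudaStreamDestroy(streamA));
    CHECK(cudaStreamDestroy(streamB));
    CHECK(cudaFree(a));
    CHECK(cudaFree(b));

    printf("both tasks finished\n");
    return 0;
}
```

If each task currently lives in its own process, one option is to run them as separate CPU threads of a single application, with each thread launching work on its own stream.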
Thanks.