Running two neural networks simultaneously


I am relatively new to the Jetson Xavier NX. I understand that the Jetson can run multiple neural networks in parallel as seen from multiple demonstrations.

The issue I’m facing right now is running two programs simultaneously. I have face recognition code written in Python using the face_recognition package, and pose estimation code based on NVIDIA’s trt_pose. I expect that running both together would exhaust the RAM and probably crash my Jetson. My questions are:

  1. In order to run these programs in parallel, do I have to use TensorRT? I.e., instead of using the face_recognition package, should I use a TensorRT-accelerated face recognition program?

  2. Is there a workaround to run both programs in parallel? E.g., using the GPU for one program and a DLA for the other.

That said, I’m not sure whether assigning the GPU and a DLA to separate programs is possible, or whether it would solve the RAM usage problem.


1. TensorRT has been optimized for Jetson.
You can get better performance and lower memory usage with it.

2. Another way is to deploy the models on different CUDA streams but in the same process.
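
To picture the "same process" layout, here is a minimal sketch with one worker thread per model. It only shows the process structure: `face_infer` and `pose_infer` are hypothetical callables standing in for the real models, and the CUDA stream handling itself is not shown. In a real deployment each worker would presumably own its own CUDA stream and enqueue its inference on it.

```python
import queue
import threading


def run_model(infer, frames, outputs):
    """Pull frames from a queue and append this model's results to a list.

    `infer` is any callable here; with TensorRT it would be a wrapper that
    runs the model on a CUDA stream owned by this worker (assumption).
    """
    while True:
        frame = frames.get()
        if frame is None:  # sentinel: no more frames
            break
        outputs.append(infer(frame))


def run_both(frames, face_infer, pose_infer):
    """Feed every frame to both models, each on its own worker thread."""
    models = {"face": face_infer, "pose": pose_infer}
    inputs = {name: queue.Queue() for name in models}
    outputs = {name: [] for name in models}
    workers = [
        threading.Thread(target=run_model,
                         args=(models[name], inputs[name], outputs[name]))
        for name in models
    ]
    for worker in workers:
        worker.start()
    for frame in frames:
        for q in inputs.values():
            q.put(frame)  # both models see the same frame
    for q in inputs.values():
        q.put(None)  # tell each worker to stop
    for worker in workers:
        worker.join()
    return outputs
```

Because both models live in one process, the camera frames and the CUDA context are shared rather than duplicated across two programs.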

Only TensorRT can deploy a model on the DLA.
So if you want to split the app across DLA and GPU, you will need to migrate your use case to TensorRT.
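
As a rough sketch of what the GPU/DLA split could look like at engine-build time, the helper below assembles `trtexec` command lines, one targeting the GPU and one targeting DLA core 0. It assumes both models have already been exported to ONNX (the thread does not say this) and that `trtexec` is on the PATH. The file names and the `trtexec_command` helper are hypothetical.

```python
def trtexec_command(onnx_path, engine_path, dla_core=None):
    """Return a trtexec argument list that builds a TensorRT engine.

    dla_core=None targets the GPU; an integer targets that DLA core.
    """
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--fp16",  # the DLA needs FP16 or INT8 precision
    ]
    if dla_core is not None:
        # Layers the DLA cannot run are allowed to fall back to the GPU.
        cmd += [f"--useDLACore={dla_core}", "--allowGPUFallback"]
    return cmd


# Hypothetical file names: pose estimation on the GPU, face model on DLA 0.
pose_cmd = trtexec_command("pose.onnx", "pose_gpu.engine")
face_cmd = trtexec_command("face.onnx", "face_dla0.engine", dla_core=0)
print(" ".join(pose_cmd))
print(" ".join(face_cmd))
```

Each list can be passed to `subprocess.run` on the Jetson. Whether a given model actually runs well on the DLA depends on which of its layers the DLA supports.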



Thanks for the reply and help! Regarding your second answer, are you referring to the DeepStream SDK? Apologies if I sound selfish but if there are any documents or links that you could share to help me understand the deployment using different CUDA streams, I would greatly appreciate it!


I mean CUDA streams from the CUDA library.
You can find an introduction in this webinar:

Below is a TensorRT example for multi-stream usage for your reference:


