I have two different neural networks that I run inference with on the Jetson TX2, and I need them to run at the same time.
When I run each of them in a dedicated process (written in C++ using TensorRT 6, since this is JetPack 4.3, both in FP16 mode), I get a consistent frame rate in both processes (~10 FPS for the heavy network and ~10 FPS for the smaller network).
But when I take the exact same code and use two threads instead, the smaller network runs at an inconsistent speed, between 6 and 10 FPS.
Both threads/processes create their own runtime (I tried sharing it, which didn’t change anything) and their own context, so I would assume they behave the same way. Is there something shared when you run the code in the same process?
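To put numbers on "inconsistent", each thread could record its per-frame latency and report the slowest and fastest frame as FPS. This is only a minimal sketch: `FpsStats` and `timeFrame` are hypothetical helper names, and `inferOnce` stands in for whatever call runs one inference and waits for it to finish.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: collects per-frame durations and reports the FPS spread.
struct FpsStats {
    std::vector<double> frameMs;  // one entry per frame, in milliseconds

    void add(std::chrono::duration<double, std::milli> d) {
        assert(d.count() >= 0.0);
        frameMs.push_back(d.count());
    }

    // FPS of the slowest frame (0 if no frames were recorded).
    double minFps() const {
        if (frameMs.empty()) return 0.0;
        return 1000.0 / *std::max_element(frameMs.begin(), frameMs.end());
    }

    // FPS of the fastest frame (0 if no frames were recorded).
    double maxFps() const {
        if (frameMs.empty()) return 0.0;
        return 1000.0 / *std::min_element(frameMs.begin(), frameMs.end());
    }
};

// Times a single call of inferOnce and stores the result in stats.
// inferOnce is a placeholder for one blocking inference call.
template <typename Fn>
void timeFrame(FpsStats& stats, Fn&& inferOnce) {
    const auto start = std::chrono::steady_clock::now();
    inferOnce();
    stats.add(std::chrono::steady_clock::now() - start);
}
```

With one `FpsStats` per thread, a frame that takes 100 ms counts as 10 FPS and one that takes about 167 ms counts as 6 FPS, so the gap between `minFps()` and `maxFps()` shows how much the smaller network's frame time varies.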
You need to use a single process with multiple threads for the GPU to run the two networks concurrently.
This is a constraint of the GPU context on Jetson.
So in the two-process scenario, the processes share the GPU in a time-sliced manner, and the performance is stable.
But in the two-thread implementation, the GPU tries to fully occupy its resources, which might cause the inconsistent speed you mentioned.
Does your inference job occupy ~99% of the GPU resources?
If so, you can just stay with the two-process approach, since there is not much performance gain from running them concurrently.
I prefer one process with two threads because of the memory overhead of the CUDA/TensorRT libraries (the way they are built, the CUDA kernels are not part of the shared memory), which is why I am trying to move from two processes to one process with two threads.
A few follow-up questions:
- Who does the time-slicing when running in multi-process mode? Is that the driver?
- If I understand correctly what you are saying about two processes: if I use one thread that receives jobs (inference) from, say, two threads but serializes the execution of the inference, will that give the same time-slicing behavior as two processes?
Another option: if I use one stream in two threads, will it do the same time-slicing as using two processes?
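For reference, the "one thread that serializes inference jobs from two threads" idea could look roughly like the sketch below. It uses only the C++ standard library. `SerialInferenceWorker` and `maxConcurrentJobs` are hypothetical names, and the sleep in `fakeInference` is a stand-in for the real work (for example a TensorRT `enqueue` on the context followed by `cudaStreamSynchronize`). Whether this reproduces the stable timing seen with two processes is not answered in this thread, so treat it as something to measure, not a confirmed fix.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical helper: one worker thread runs submitted jobs strictly one at a time.
class SerialInferenceWorker {
public:
    SerialInferenceWorker() : worker_([this] { run(); }) {}

    ~SerialInferenceWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();  // remaining queued jobs are drained before exit
    }

    // Queue a job; the returned future becomes ready once the job has finished.
    std::future<void> submit(std::function<void()> job) {
        assert(job);  // an empty std::function cannot be run
        std::packaged_task<void()> task(std::move(job));
        std::future<void> done = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(task));
        }
        cv_.notify_one();
        return done;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                task = std::move(jobs_.front());
                jobs_.pop();
            }
            task();  // runs outside the lock so producers can keep queueing
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

// Demo: two producer threads submit jobs and wait for each result. Returns the
// highest number of jobs that were ever running at the same time.
int maxConcurrentJobs(int jobsPerProducer) {
    std::atomic<int> active{0};
    std::atomic<int> peak{0};
    SerialInferenceWorker worker;

    auto fakeInference = [&] {  // placeholder for one blocking inference call
        const int now = ++active;
        int seen = peak.load();
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        --active;
    };
    auto producer = [&] {
        for (int i = 0; i < jobsPerProducer; ++i) worker.submit(fakeInference).get();
    };

    std::thread a(producer), b(producer);
    a.join();
    b.join();
    return peak.load();
}
```

Each network's thread would call `submit(...)` and block on the returned future, so only one inference is ever in flight. This serializes the work on the CPU side; it says nothing about how the driver schedules two separate processes.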