Questions about DeepStream/TensorRT/containers on Nano

For my current scenario, the plan is to run YOLOv4, SORT, and trt_pose on a Jetson Nano or TX2. Each of these has already been run separately in the DeepStream and TensorRT environments, but running YOLOv4 together with trt_pose causes some problems, so I would like to confirm a few things.

  1. How does TensorRT load multiple models at the same time and execute them sequentially? Are there any relevant documents or sample code?
  2. What is the approximate accuracy of PeopleNet? If we need to recognize one or two additional object classes, can we continue training with TLT on top of the PeopleNet pretrained model?
  3. The DeepStream sample code shows that tracking is done with the NvMOT library, but I haven’t figured out the relationship between NvMOT and the NvDCF tracker. If we use this library instead of SORT, is there more detailed documentation or sample code?
  4. NGC provides some containers that already have apps inside. Are these containers intended for multiple input sources, with each container responsible for one input video stream? If instead we use a single input video stream, and the frames are inferred separately by applications in several containers, or the results of one container’s inference are passed on to the next container, how do the containers in a scenario like this communicate with each other directly?
  5. Regarding gesture recognition, the Jetson projects include a sample, but I haven’t tried it yet. On a Jetson Nano or TX2, which is more suitable: OpenPose or trt_pose?

Hi,

1.
You will need one TensorRT engine for each model.
Running multiple models is similar to running the single-model pipeline multiple times, once per engine; see the sketch below.
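As a minimal illustration of the idea (not from this thread), assuming you already have serialized engine files and a TensorRT 7-era Python API with pycuda; the file names and input sizes are placeholders:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(path):
    # Deserialize a pre-built engine file (one engine per model).
    with open(path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

def infer(engine, context, host_input):
    # Allocate device buffers for every binding, copy the input in,
    # run synchronously, and copy the outputs back to the host.
    bindings, outputs = [], []
    for binding in engine:
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        size = abs(trt.volume(engine.get_binding_shape(binding)))
        dev_mem = cuda.mem_alloc(size * np.dtype(dtype).itemsize)
        bindings.append(int(dev_mem))
        if engine.binding_is_input(binding):
            cuda.memcpy_htod(dev_mem, np.ascontiguousarray(host_input, dtype=dtype))
        else:
            outputs.append((np.empty(size, dtype=dtype), dev_mem))
    context.execute_v2(bindings)
    for host_out, dev_mem in outputs:
        cuda.memcpy_dtoh(host_out, dev_mem)
    return [host_out for host_out, _ in outputs]

# One engine and one execution context per model.
yolo = load_engine("yolov4.engine")      # placeholder file names
pose = load_engine("trt_pose.engine")
yolo_ctx = yolo.create_execution_context()
pose_ctx = pose.create_execution_context()

# Run the two models back to back, each with its own preprocessed input.
frame_det = np.zeros((1, 3, 416, 416), dtype=np.float32)   # typical YOLOv4 size
frame_pose = np.zeros((1, 3, 224, 224), dtype=np.float32)  # typical trt_pose size
detections = infer(yolo, yolo_ctx, frame_det)
keypoints = infer(pose, pose_ctx, frame_pose)
```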

2.
Based on NVIDIA’s PeopleNet blog post, the accuracy is around 84%.
And yes, you can continue training the network with TLT directly.

3. You can find more details about NvDCF and NvMOT below:
https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream%20Plugins%20Development%20Guide/deepstream_plugin_details.html#wwpID0E0540HA
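In short, NvMOT is the low-level tracker API that the nvtracker plugin calls into, and NvDCF is one of the libraries that implements it. As an illustration (not from the docs above), this is roughly how NvDCF is selected from the DeepStream Python bindings; the library path assumes a DeepStream 5.x install on Jetson, and the config file name is a placeholder:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# nvtracker is the DeepStream plugin; the low-level tracker it loads
# (NvDCF here) is a shared library implementing the NvMOT API.
tracker = Gst.ElementFactory.make("nvtracker", "tracker")
tracker.set_property(
    "ll-lib-file",
    "/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so",  # path is an assumption
)
tracker.set_property("ll-config-file", "tracker_config.yml")  # NvDCF parameters
tracker.set_property("tracker-width", 640)
tracker.set_property("tracker-height", 384)
```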

4.
If you are using a source that connects directly to the device (e.g. a CSI camera), we recommend using the native DeepStream samples instead.
Communication between containers needs to go through files or the network; a toy sketch follows below.
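As a toy illustration of the network option (everything here, including the container hostname and port, is hypothetical), one container could push its inference results to the next as line-delimited JSON over a TCP socket:

```python
import json
import socket

# --- in the detector container: send results downstream ---
def send_results(results, host="pose-container", port=5000):
    with socket.create_connection((host, port)) as sock:
        sock.sendall((json.dumps(results) + "\n").encode())

# --- in the next container: receive and process results ---
def receive_results(port=5000):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("0.0.0.0", port))
    server.listen(1)
    conn, _ = server.accept()
    with conn, conn.makefile() as stream:
        for line in stream:
            yield json.loads(line)

# Example payload from one stage to the next:
# send_results({"frame": 42, "boxes": [[10, 20, 100, 200]], "labels": ["person"]})
```

In a real deployment you would more likely use a message broker (DeepStream ships the nvmsgbroker plugin with Kafka/MQTT adapters) or a shared volume rather than raw sockets.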

5.
trt_pose uses TensorRT for inference,
so it will give you better performance on these boards; a conversion sketch based on the trt_pose README follows below.
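For reference, the trt_pose repository converts its PyTorch model to a TensorRT engine with torch2trt roughly like this (the topology and weight file names come from that repo; treat this as a sketch):

```python
import json

import torch
import torch2trt
import trt_pose.models

# Topology file from the trt_pose repo: keypoints plus skeleton links.
with open("human_pose.json") as f:
    human_pose = json.load(f)
num_parts = len(human_pose["keypoints"])
num_links = len(human_pose["skeleton"])

# Pretrained weights are downloadable from the trt_pose repo.
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()
model.load_state_dict(torch.load("resnet18_baseline_att_224x224_A_epoch_249.pth"))

# torch2trt builds a TensorRT engine; FP16 mode helps on Nano/TX2.
data = torch.zeros((1, 3, 224, 224)).cuda()
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1 << 25)
```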

Thanks.


Regarding Q1, does this mean that if I use multiple models, I have to use DeepStream?

Hi,

DeepStream does provide an API for running multiple TensorRT engines,
so you don’t need to take care of the low-level implementation.

If you prefer to create the engines on your own, that is also possible; see the sketch below.
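If you go the standalone route, building an engine from an ONNX model looks roughly like this with the TensorRT 7-era Python API (file names are placeholders):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

def build_engine(onnx_path, engine_path):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(EXPLICIT_BATCH) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser, \
         builder.create_builder_config() as config:
        config.max_workspace_size = 1 << 28      # 256 MB; keep it modest on a Nano
        config.set_flag(trt.BuilderFlag.FP16)    # FP16 is usually a win on Jetson
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        engine = builder.build_engine(network, config)
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())          # reload later with trt.Runtime
        return engine

build_engine("yolov4.onnx", "yolov4.engine")  # placeholder file names
```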

Thanks.