Inference on a different GPU


Can .tlt models trained on a GTX 1080 be used for inference on a Triton server with an RTX 3070 or 3080?

Yes, but please note that it must be an .etlt model. You can run tao export and then copy the .etlt model to the Triton server.
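As a rough sketch, the export and copy steps might look like the following. The file paths, hostnames, and `$KEY` are placeholders, and the exact flags can vary by TAO version, so check `tao detectnet_v2 export --help` before running:

```shell
# Export the encrypted .tlt model to a deployable .etlt model.
# $KEY is the encryption key used during training (placeholder).
tao detectnet_v2 export \
    -m /workspace/model.tlt \
    -k $KEY \
    -o /workspace/model.etlt

# Copy the exported model into the Triton host's model repository
# (destination path is illustrative).
scp /workspace/model.etlt triton-host:/models/detectnet_v2/
```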

Thank you!


How do I run a detectnet_v2 model on the Triton server? It is a little confusing.

For running a detectnet_v2 model on the Triton server, you can refer to Integrating TAO CV Models with Triton Inference Server — TAO Toolkit 3.21.11 documentation.

I saw this documentation earlier, but it is not clear…
Is there another client that I have to install for Detectnet_v2?

After you have trained a detectnet_v2 model, you can deploy it to run inference.
For inference, there are usually the following options.

  1. Directly run “tao detectnet_v2 inference xxx”
  2. Run it with DeepStream; refer to DetectNet_v2 — TAO Toolkit 3.21.11 documentation
  3. Run it with the Triton server; refer to NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton. PeopleNet and DashCamNet are actually based on the detectnet_v2 network.
  4. Run it with your own standalone inference code. For example, Run PeopleNet with tensorrt - #21 by carlos.alvarez
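As a sketch of option 1, the command line typically looks like the following. The spec file, input/output paths, and `$KEY` are placeholders; consult the TAO Toolkit documentation for the exact arguments in your version:

```shell
# Run inference with a trained detectnet_v2 model.
# -e: inference spec file (defines the model, thresholds, etc.)
# -i: directory of input images  -o: output directory
# -k: the model's encryption key (placeholder)
tao detectnet_v2 inference \
    -e /workspace/inference_spec.txt \
    -i /workspace/test_images \
    -o /workspace/inference_output \
    -k $KEY
```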

How about the post-processing: is it a file or a folder?
If it is a file, what is the extension?

The DetectNet_v2 inference sample has two components that can be configured:

  1. [Model Repository]
  2. [Post Processor]

For running on the Triton server, please refer to the README.
For example, when running the officially released PeopleNet model with it, see its post-processing config file: tao-toolkit-triton-apps/clustering_config_peoplenet.prototxt at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps
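The post-processing is configured in a single .prototxt text file. As an illustration only, a clustering config for a DetectNet_v2-based model generally has this shape; the field names follow the DetectNet_v2 postprocessing spec, but the class names and values below are placeholders, so refer to the actual clustering_config_peoplenet.prototxt in the repository:

```
target_class_config {
  key: "person"
  value {
    clustering_config {
      coverage_threshold: 0.005
      dbscan_eps: 0.265
      dbscan_min_samples: 0.05
      minimum_bounding_box_height: 4
    }
  }
}
```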

Can I run classification with a standalone inference code?

Yes, you can. You can leverage the standalone inference approach mentioned above, or search the forum for related topics.
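To give an idea of what the post-processing side of standalone classification inference involves (the engine loading and forward pass are omitted here), the sketch below turns raw output logits into a top-1 label. The label list and logit values are made up for illustration:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def top1(logits, labels):
    """Return the (label, probability) pair with the highest score."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]

# Hypothetical logits for three hypothetical classes:
label, prob = top1([0.1, 2.0, 0.5], ["cat", "dog", "car"])
```

The same softmax/argmax step applies regardless of whether the logits come from TensorRT, Triton, or another runtime.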

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.