Training a model (with TAO?) using the DeepStream SDK 6.4 docker

Is there documentation that explains how TAO can be used with the DeepStream SDK 6.4 docker deepstream:6.4-gc-triton-devel?
Based on the description I would expect this docker to contain all the TAO tools, but I can't find them in the container.
The documentation at some point simply reverts to the TAO Toolkit Quick Start Guide - NVIDIA Docs, which describes how to install older dockers.
The whole thing appears overly complicated for my simple use case.

Is there an easy/advised way to tune some YOLO/ResNet-like models for end use on a Jetson Orin (Ubuntu 22.04 on x86 is available for training)?
Getting inference running on the DS 6.4 docker was very easy; I would expect custom model training on that docker to be just as easy to set up, but the TAO 5.2.0 documentation does not help me.

Or maybe TAO is overkill for me and I should try training with lower-level APIs? TensorRT (I used to use Keras)?
Although quantization looks like an important step, and is probably the main added value of TAO for me.

Please redirect me to the correct forum for this question if this is not the place to post it.

TIA

• Hardware (Jetson Orin Nano/Geforce RTX 2060/RTX 4080 Laptop)
• Network Type (Detectnet_v2/Yolo_v4/Classification/etc)

Officially, the way to run inference with TAO models in DeepStream is the GitHub repo GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream. Some examples can be found in the topic Run peoplenet on python virtual environment without jetson - #2 by Morganh. You can use the docker nvcr.io/nvidia/deepstream:6.4-triton-multiarch, for example as sketched below.
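A minimal sketch of getting started with that docker (the flags are typical docker options, not taken verbatim from the repo's README):

```bash
# Start the DeepStream Triton container with GPU access
# (host networking is optional; adjust mounts to your setup).
docker run --gpus all -it --rm --net=host \
    nvcr.io/nvidia/deepstream:6.4-triton-multiarch bash

# Inside the container, fetch the TAO sample apps:
git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
```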

TAO is expected to run training on a dGPU x86_64 system. Currently, TAO does not support training on ARM-based systems. For inference, however, any dGPU machine or Jetson device is supported.

TAO has its own dockers, see TAO Toolkit | NVIDIA NGC; you can run with those. The DS docker is not designed to run TAO training by default.

OK, thanks.

So what is the advised setup?

  1. Train/tune models using the DS 6.4 deepstream:6.4-gc-triton-devel docker, but set up TAO 5.2.0 in ‘Python wheels’ mode inside it?

  2. Ignore the newer DS-version docker and simply follow TAO 5.2.0 ‘containers directly’ mode, based on the older containers specified in the documentation (e.g. nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 for YOLO)?

  3. Use the latest docker from the catalog (TAO Toolkit | NVIDIA NGC), e.g. nvcr.io/nvidia/tao/tao-toolkit:5.2.0.1-pyt1.14.0? It is unclear to me whether it supports YOLO or FaceDetectIR (I don't see these in the catalog). FYI: I have worked in TensorFlow 1/Keras in the past but am willing to switch to PyTorch/something else if that environment is advised/more future-proof.

  4. Wait a bit, because NVIDIA is working on a new (docker-based) TAO environment?

FYI: it's setting up a training environment that I'm struggling with massively; DS inference sets up easily enough.
FYI 2: The fact that TAO training is only available on x86_64 is not a problem for me; I plan to use an RTX 2060 (laptop, Ubuntu 22.04), or an RTX 4080 (laptop) if speed/memory depletion requires it.

Installing TAO with wheels inside the DS 6.4 deepstream:6.4-gc-triton-devel docker should be one way. But we are not sure of the status, as we did not verify this on the DS docker.
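If you try that route, a minimal sketch could look like the following (unverified on the DS docker, as noted above; the wheel names follow the TAO 5.x docs, so verify the exact packages your networks need against the Quick Start Guide):

```bash
# Enter the DeepStream devel container:
docker run --gpus all -it --rm \
    nvcr.io/nvidia/deepstream:6.4-gc-triton-devel bash

# Inside the container, install TAO wheels (names per the TAO 5.x
# docs -- check the Quick Start Guide for your TAO version):
pip3 install nvidia-tao-deploy      # TensorRT-based export/evaluation
pip3 install nvidia-tao-pytorch     # PyTorch training backend (assumed name)
```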

Please note that different networks use different dockers; refer to TAO Toolkit Launcher - NVIDIA Docs. For YOLO, yes, please use the TF1 versions in the catalog (TAO Toolkit | NVIDIA NGC), i.e. nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5.
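A sketch of ‘containers directly’ mode for YOLOv4 under that image (mount path, spec file, and key are placeholders; the in-container task entrypoints follow the TAO documentation):

```bash
# Run YOLOv4 training directly inside the TF1 TAO container.
docker run --gpus all -it --rm \
    -v /home/user/tao-experiments:/workspace \
    nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 \
    yolo_v4 train \
        -e /workspace/specs/yolo_v4_train.txt \
        -r /workspace/results \
        -k <your_key>
```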

TAO contains several dockers; you can choose a docker based on the network.
Officially, we provide the tao-launcher, which auto-selects the right docker.
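A minimal sketch of launcher mode (the sub-command layout is from the TAO 5.x docs; check `tao --help` in your installed version, since earlier releases used `tao yolo_v4 train` without the `model` keyword):

```bash
# Install the launcher on the x86_64 host (a virtualenv is recommended):
pip3 install nvidia-tao

# The launcher pulls the matching docker (TF1 for yolo_v4) automatically.
# Spec path, results dir, and key are placeholders.
tao model yolo_v4 train \
    -e /workspace/specs/yolo_v4_train.txt \
    -r /workspace/results \
    -k <your_key>
```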

You can use the TAO docker to train. After that, copy the output model into the DeepStream docker.
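For example, something like this (container names and paths are placeholders; the .etlt file name is hypothetical):

```bash
# Copy the exported model out of the TAO container...
docker cp tao_container:/workspace/results/export/yolov4_resnet18.etlt .

# ...and into the DeepStream container's model directory.
docker cp yolov4_resnet18.etlt \
    ds_container:/opt/nvidia/deepstream/deepstream/samples/models/
```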

I have set up my environment using the TAO dockers.
It all looks functional.
Thanks for the explanation.
