TAO to Deepstream - How to export classifier?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)

GPU

• DeepStream Version

nvcr.io/nvidia/deepstream:9.0-triton-multiarch

• JetPack Version (valid for Jetson only)

N/A

• TensorRT Version

nvcr.io/nvidia/deepstream:9.0-triton-multiarch

• NVIDIA GPU Driver Version (valid for GPU only)

590.48.01

• Issue Type( questions, new requirements, bugs)

question/bug?

• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

Full repo to reproduce is here:

I am trying to train a classifier for use inside deepstream. My real dataset is much bigger but I can’t share it so I have made the smaller example.

My workflow is:

  1. Train a classifier in TAO, based on: GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC

Train script is here: deepstream-classification-min-example/train/train.sh at main · thebruce87m/deepstream-classification-min-example · GitHub

and settings here: deepstream-classification-min-example/train/train.yaml at main · thebruce87m/deepstream-classification-min-example · GitHub

  2. Export ONNX.

I export to ONNX because the other options listed in Deploying to DeepStream for Classification TF1/TF2/PyTorch — Tao Toolkit are either deprecated (.etlt or TAO Converter) or suggest using TAO Deploy, which produces an engine with a TensorRT version mismatch when you try to run the .engine in the DeepStream Docker container. So none of the official options actually work.

The export script is here: deepstream-classification-min-example/train/export-onnx.sh at main · thebruce87m/deepstream-classification-min-example · GitHub

and settings here: deepstream-classification-min-example/train/export.yaml at main · thebruce87m/deepstream-classification-min-example · GitHub

  3. Attempt to use the classifier in the DeepStream app.

As you can see in the repo, I have modified deepstream_test2_app from the Python examples.

I use the people/vehicle object detector as the PGIE. My dataset was taken from the crops it produced so these should be similar to what is being sent to the SGIE.

Basically I can’t get the classifier to work properly in deepstream. I believe I am following an “official path” that should just work but obviously I am doing something wrong.

I train a simple model with three classes, and validation looks good after training, but in DeepStream every object is classified as the same class (red). I see this with my large dataset too. If I load the model from the TAO training directly in a Python script, it works and everything classifies correctly. Somewhere along the line, though, something goes wrong: either the conversion to ONNX or .engine, or the parameters in the DeepStream config, and the DeepStream results don’t match.

How do I use the classifier in deepstream?

I noticed that here: TAO Deploy Overview — Tao Toolkit

It says you can install nvidia-tao-deploy using pip. Great, I thought: if I do this from within the DeepStream container, it will generate an engine that is compatible with DeepStream and solve the .engine TensorRT incompatibility. But no joy.

pip install nvidia-tao-deploy

But if you do this within nvcr.io/nvidia/deepstream:9.0-triton-multiarch, you eventually get this error:

  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [9 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-xqeus28n/scikit-image_5abbdf7ad39f44c7942341c105e02548/setup.py", line 234, in <module>
          'build_ext': openmp_build_ext(),
                       ^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-xqeus28n/scikit-image_5abbdf7ad39f44c7942341c105e02548/setup.py", line 58, in openmp_build_ext
          from numpy.distutils.command.build_ext import build_ext
      ModuleNotFoundError: No module named 'numpy.distutils'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
root@n-deskmini:/code/deepstream_test2_app# pip install numpy
Requirement already satisfied: numpy in /usr/local/lib/python3.12/dist-packages (1.26.4)
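
For anyone hitting the same traceback: it points at scikit-image's setup.py importing numpy.distutils, and NumPy no longer ships that module for Python 3.12 and later, so reinstalling numpy does not help. A minimal sketch for confirming this from inside the container (the helper name is made up for illustration):

```python
import importlib.util
import sys


def has_numpy_distutils() -> bool:
    """Return True if numpy.distutils can be found in this interpreter."""
    try:
        # find_spec imports the parent package (numpy) and then looks for the
        # submodule; it returns None when the submodule does not exist.
        return importlib.util.find_spec("numpy.distutils") is not None
    except ImportError:
        # numpy itself is missing or cannot be imported.
        return False


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    print("numpy.distutils available:", has_numpy_distutils())
```

If this reports False, any dependency that has to be built from source with a setup.py relying on numpy.distutils will probably fail in the same way.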

After the onnx is generated, then please refer to official guide deepstream_tao_apps/apps at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub to deploy it.

Thanks for the response. Did you send the correct link? It has no instructions on how to deploy an ONNX model successfully.

I am missing critical information. Even the example classification configs in the link you sent, e.g. deepstream_tao_apps/deepstream_app_tao_configs/nvinfer/config_infer_secondary_vehiclemakenet.yml at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub, have a bunch of settings that are no doubt model-specific:

offsets: 124;117;104
infer-dims: 3;224;224
uff-input-blob-name: input_1
model-color-format: 0
output-blob-names: predictions/Softmax:0

But classification has three different training options (TF1, TF2 and PyTorch) and apparently three different preprocessing modes (caffe, torch and tf, according to Deploying to DeepStream for Classification TF1/TF2/PyTorch — Tao Toolkit).

As far as I can see from your link, there is no indication of which method that classifier was trained with, so I have no idea what the correct parameters are for my model.

Can you please review the example code I provided and advise the correct parameters? I have looked at the documentation and I might be missing something but there seems to be a gap between training a model in TAO and getting the correct parameters for running it in deepstream.

Just a reminder of how I trained. My bash script was:

#!/usr/bin/env bash
set -euo pipefail

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
WORKSPACE_DIR="$REPO_ROOT"

DOCKER_REGISTRY="nvcr.io"
DOCKER_NAME="nvidia/tao/tao-toolkit"
DOCKER_TAG="6.26.3-pyt"
DOCKER_CONTAINER="$DOCKER_REGISTRY/$DOCKER_NAME:$DOCKER_TAG"

docker run \
  -it \
  --rm \
  --gpus all \
  --ipc=host \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  --user "$(id -u):$(id -g)" \
  -v "$WORKSPACE_DIR":/workspace \
  "$DOCKER_CONTAINER" \
  classification_pyt train \
  -e /workspace/train/train.yaml

And the train.yaml is:

results_dir: /workspace/train/results/

model:
  backbone:
    type: resnet_18
    pretrained_backbone_path: null
    freeze_backbone: false
  head:
    type: TAOLinearClsHead
    binary: false
    in_channels: 512
    topk: [1]

dataset:
  dataset: CLDataset
  root_dir: /workspace/data/
  num_classes: 3
  img_size: 224
  batch_size: 9
  workers: 4
  augmentation:
    mixup_cutmix: false
    random_flip:
      enable: true
      hflip_probability: 0.5
      vflip_probability: 0.0
    random_aug:
      enable: true
    random_erase:
      enable: true
    random_rotate:
      enable: false
    random_color:
      enable: false
    with_scale_random_crop:
      enable: true
    with_random_crop: false
    with_random_blur: true
  train_dataset:
    images_dir: /workspace/data/train
  val_dataset:
    images_dir: /workspace/data/val
  test_dataset:
    images_dir: /workspace/data/test

train:
  num_gpus: 1
  gpu_ids: [0]
  num_epochs: 30
  validation_interval: 1
  checkpoint_interval: 1
  pretrained_model_path: null
  optim:
    optim: sgd
    lr: 0.01
    momentum: 0.9
    weight_decay: 0.0001

If this isn’t the recommended way of training to deploy to deepstream then please tell me what the recommended way is and I will use that. I’m just trying to find the “happy path” to get something working.

So I found in another answer this link:

Which shows the following snippet in tao-deploy/nvidia_tao_deploy/cv/classification_pyt/scripts/inference.py

    dl = ClassificationLoader(
        trt_infer._input_shape,
        [cfg.dataset.data.test.data_prefix],
        mapping_dict,
        is_inference=True,
        data_format="channels_first",
        mode="torch",
        batch_size=batch_size,
        image_mean=image_mean,
        image_std=img_std,
        dtype=trt_infer.inputs[0].host.dtype)

mode="torch" suggests that the preprocessing mode is torch.

Which according to the documents here: Deploying to DeepStream for Classification TF1/TF2/PyTorch — Tao Toolkit

has the following settings:

net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0

So I think these might be correct for classification_pyt? What about the other settings?
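
As a sanity check on those numbers: nvinfer normalizes each pixel as y = net-scale-factor * (x - offset), with a single scalar scale for all channels, whereas torch-style preprocessing uses a per-channel mean and std on inputs scaled to [0, 1]. The sketch below assumes the standard ImageNet constants (they are not stated anywhere in this thread) and shows that the documented values fall out of them if the green-channel std is used for the single scale. The function names are only for illustration.

```python
# Standard ImageNet statistics for inputs scaled to [0, 1] (an assumption here).
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # R, G, B
IMAGENET_STD = (0.229, 0.224, 0.225)   # R, G, B


def nvinfer_params(mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Map torch-style mean/std to nvinfer offsets and net-scale-factor."""
    offsets = [round(m * 255.0, 3) for m in mean]
    # nvinfer takes one scalar scale, so pick the green-channel std.
    scale = 1.0 / (255.0 * std[1])
    return offsets, scale


def nvinfer_normalize(pixel, offsets, scale):
    """y = net-scale-factor * (x - offset), applied per channel."""
    return [scale * (p - o) for p, o in zip(pixel, offsets)]


def torch_normalize(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """y = (x / 255 - mean) / std, applied per channel."""
    return [(p / 255.0 - m) / s for p, m, s in zip(pixel, mean, std)]


if __name__ == "__main__":
    offsets, scale = nvinfer_params()
    print("offsets=" + ";".join(f"{o:g}" for o in offsets))
    print(f"net-scale-factor={scale:.6f}")
    red = (255, 0, 0)  # pure red pixel in RGB order
    print("nvinfer:", nvinfer_normalize(red, offsets, scale))
    print("torch:  ", torch_normalize(red))
```

The two normalizations agree exactly on the green channel and differ slightly on red and blue, because one scalar cannot reproduce three different std values. That small mismatch is unlikely to explain every crop landing in a single class.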

OK, I think I found the source of the major problem: the label file used for training through TAO and the label file for DeepStream use different formats.

I have three test classes for my classifier - red, green and blue. I want it to detect things that are red, green and blue since this would also help to test whether I had the RGB/BGR setting for the model correct too.

The TAO format for classes.txt is:

red
green
blue

but the deepstream format specified to the sgie, e.g. labelfile-path: /code/deepstream_test2_app/deepstream-labels.txt is:

red;green;blue

So if your model only ever predicts one class then this could be the problem. Basically my model would only see “red” in deepstream.
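
A small sketch of the conversion, together with a simplified model of how a classifier label file is read (one line per classifier output, labels separated by semicolons). The function names are hypothetical and the parser is only an illustration, not nvinfer's actual code. It also assumes that the line order in classes.txt matches the model's class indices.

```python
def tao_to_deepstream_labels(classes_txt: str) -> str:
    """Turn one-label-per-line text into a single semicolon-separated line."""
    labels = [line.strip() for line in classes_txt.splitlines() if line.strip()]
    return ";".join(labels) + "\n"


def labels_for_first_output(label_file_text: str) -> list:
    """Simplified reading of a classifier label file: only the first
    non-empty line is used for the first (and here only) output layer."""
    lines = [line for line in label_file_text.splitlines() if line.strip()]
    first_line = lines[0] if lines else ""
    return [label.strip() for label in first_line.split(";") if label.strip()]


if __name__ == "__main__":
    tao_text = "red\ngreen\nblue\n"
    # Read as-is, the TAO file exposes a single label for the output.
    print(labels_for_first_output(tao_text))
    # After conversion, all three labels sit on the first line.
    print(labels_for_first_output(tao_to_deepstream_labels(tao_text)))
```

Under that reading, pointing labelfile-path at the TAO-style file leaves only “red” available as a label, which matches what I was seeing.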

Yes, it is correct. Refer to TAO 5.0 Classification (PyTorch) deploy error - #47 by Morganh.