TAO 5 UNet: catastrophic changes from TAO 3 UNet

I use tao to train models and export to TensorRT to use in our C++ applications.

As I needed to use some tao5 models, I had to install newer versions of CUDA and TensorRT on all computers.

This made the TAO 3 models unusable, since they were created with the older versions, so we retrained one vgg16 UNet model under TAO 5. To our surprise, there were major differences that caused our previous applications to fail catastrophically.

The major difference we have found so far is the shape of the resulting feature vector. In TAO 3 UNet, the output is rows * columns * number of classes, and each element contains the probability of the pixel belonging to each class: outputDims: (1, 704, 1280, 6) for 6 classes.

In TAO 5 UNet, the feature vector is rows * columns * 1: outputDims: (1, 704, 1280, 1). I assume the value is the class prediction, although at this time all values return 0.

In tao3 unet, we did two things to the input buffer:

First, we normalized the image with

        // Map 8-bit pixel values from [0, 255] to [-1, 1]: (v - 127.5) / 127.5
        cv::subtract(image, cv::Scalar(127.5f, 127.5f, 127.5f), image, cv::noArray(), -1);
        cv::divide  (image, cv::Scalar(127.5f, 127.5f, 127.5f), image, 1, -1);

And second, we did a NHWC to NCHW conversion.
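The two steps above can be sketched together without OpenCV, using plain buffers (a minimal sketch, assuming an 8-bit interleaved input frame; the function name is illustrative):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of the two preprocessing steps described above, assuming an
// 8-bit interleaved (NHWC) frame: values are mapped from [0, 255] to
// [-1, 1] via (v - 127.5) / 127.5, then rearranged to planar (NCHW).
std::vector<float> preprocess(const std::vector<uint8_t>& hwc,
                              int height, int width, int channels) {
    std::vector<float> chw(hwc.size());
    for (int c = 0; c < channels; ++c)
        for (int h = 0; h < height; ++h)
            for (int w = 0; w < width; ++w) {
                float v = hwc[(h * width + w) * channels + c];
                chw[(c * height + h) * width + w] = (v - 127.5f) / 127.5f;
            }
    return chw;
}
```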

My specific questions for tao5 unet are:

  1. What pre-processing do we need to do to each video frame?
  2. What values are returned in the feature vector after inference?
  3. Is there model documentation on the input and output specifications?

Many thanks!


Yes, there are some changes since TAO 4.0.

Currently for UNet,
GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream,

 argmax_1/output : A [batchSize, H, W, 1] tensor containing the class id per pixel location

Previously for Unet,
GitHub - NVIDIA-AI-IOT/deepstream_tao_apps at release/tao3.0_ds6.2ga,

 softmax_1: A [batchSize, H, W, C] tensor containing the scores for each class

Related topic:
DeepStream compatibility issues with UNET output layer change - #28 by adityasingh.


I also asked about pre-processing:

Normalization to [-1, 1], or something else?
BGR to RGB?
Convert NHWC to NCHW?

Right now we’ve tried all kinds of combinations, but the resulting inference in TensorRT is always 0.

Happy Holidays, and Thanks!

Refer to tao_tensorflow1_backend/nvidia_tao_tf1/cv/unet/utils/evaluate_trt.py at main · NVIDIA/tao_tensorflow1_backend · GitHub; it converts to NCHW.
Also, according to deepstream_tao_apps/configs/nvinfer/unet_tao/pgie_unet_tao_config.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub, it is BGR.
The preprocessing is the same as we discussed in Custom TAO unet model classifying only two classes on Deepstream! - #33 by Morganh.
More info can be found in tao_tensorflow1_backend/nvidia_tao_tf1/cv/unet/utils/data_loader.py at c7a3926ddddf3911842e057620bceb45bb5303cc · NVIDIA/tao_tensorflow1_backend · GitHub.
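Putting those references together, the relevant nvinfer preprocessing keys would look roughly like the fragment below. This is a hypothetical excerpt, not the shipped file; the linked pgie_unet_tao_config.txt is authoritative for the actual values:

```
# (x - 127.5) / 127.5 is expressed as net-scale-factor * (x - offsets)
net-scale-factor=0.00784313725490196
offsets=127.5;127.5;127.5
model-color-format=1   # 1 = BGR
network-type=2         # 2 = semantic segmentation
```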

Also, instead of the TensorRT engine, please check whether the .tlt you trained in TAO 3 can run inference well.

I retrained the TAO 3 model in TAO 5, created the TensorRT engine in TAO 5, and ran inference successfully in TAO 5.

But I am unable to get it to work in C++ with TensorRT

The TAO 3 model doesn’t work anymore, since I changed the CUDA and TensorRT versions.

To narrow down, please use the deepstream_tao_apps GitHub repo to run.
You can configure your engine file at line 30, i.e.,
deepstream_tao_apps/configs/nvinfer/unet_tao/pgie_unet_tao_config.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub.

Then run the UNet according to GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream.
For example,

./apps/tao_segmentation/ds-tao-segmentation -c configs/nvinfer/unet_tao/pgie_unet_tao_config.txt -i file:///$DS_SRC_PATH/samples/streams/sample_720p.mp4

@Morganh I won’t be able to do that. After long hours of many installation problems, it turns out deepstream_tao_apps requires DeepStream SDK 6.4 GA, which is not compatible with Ubuntu 20.04 (we couldn’t install the libssl3 requirement, which is only available on Ubuntu 22.04 onwards), and we MUST use 20.04 because we are on ROS.


You can use the way below instead: run inside a DeepStream docker container and then run the inference in it.

$ docker run --runtime=nvidia -it --rm nvcr.io/nvidia/deepstream:6.3-triton-multiarch  /bin/bash
# apt install libeigen3-dev && cd /usr/include && ln -sf eigen3/Eigen Eigen
# cd -
# git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
# cd deepstream_tao_apps
# export CUDA_VER=12.1
# make

@Morganh It turns out that not only the structure of the output tensor changed, but also the data type.

How about a warning about such major changes at the beginning of the notebook?

Agreed. I will sync with the internal team on this. Also, could you share your finding about the data type?

The data type used to be double; now it is int32.
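In practice that means the output buffer must now be read as 32-bit integers rather than floating-point values. A minimal sketch, assuming the device-to-host copy lands in a raw byte buffer (the helper name is illustrative):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// The TAO 5 UNet output is one int32 class id per pixel, so a raw
// output buffer of H * W * sizeof(int32_t) bytes is reinterpreted
// with memcpy (which avoids aliasing issues) instead of being read
// as doubles.
std::vector<int32_t> readClassIds(const std::vector<uint8_t>& raw,
                                  int height, int width) {
    std::vector<int32_t> ids(static_cast<size_t>(height) * width);
    assert(raw.size() == ids.size() * sizeof(int32_t));
    std::memcpy(ids.data(), raw.data(), raw.size());
    return ids;
}
```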


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.