Custom TAO unet model classifying only two classes on Deepstream!

I did; the result is identical in fp32 mode.

I see what you mean about the third class. But the more obvious classifications are not happening, while they do happen with tao unet inference.

tao-converter -k nvidia_tlt -p input_1,1x3x512x512,4x3x512x512,16x3x512x512 -t fp32 -e ./6S001_fp32.engine ./trtfp32.6s01.etlt

model-engine-file=../../models/peopleSemSegNet/6S001_fp32.engine

./apps/tao_segmentation/ds-tao-segmentation -c ./configs/peopleSemSegNet_tao/pgie_peopleSemSegNet_tao_configC.txt -i /home/david/Envs/TAO/6S001/data/images/test/0009.jpg

Could you check the pixel values of the output image? Are there five distinct pixel values in total?
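To make that check concrete, here is a minimal stdlib sketch of counting the distinct pixel values in a mask. The 4x4 `mask` below is a hypothetical example; in practice you would load the grayscale output mask produced by inference and feed its rows in:

```python
from collections import Counter

def count_label_values(mask_rows):
    """Count how many of each distinct pixel value appear in a 2-D mask.

    `mask_rows` is any iterable of rows of integer pixel values
    (e.g. the rows of the grayscale mask image written by inference).
    """
    return dict(Counter(v for row in mask_rows for v in row))

# Illustrative 4x4 mask with three distinct class values (0, 1, 2):
mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
    [0, 0, 1, 1],
]
labels = count_label_values(mask)
print(sorted(labels))  # -> [0, 1, 2]  (the distinct pixel values present)
print(len(labels))     # -> 3          (how many classes the mask contains)
```

If the model was trained with five classes but the printed count is smaller, the mask itself already tells you which classes never appear.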

Also, to narrow this down, please try running “tao unet inference xxx” against the .engine file or .trt file generated by deepstream.


Hi,
The exported model is fp16, but the config file is set for int8
(network-mode=1). You need to change it to the right mode.

Also, it is better to run the ds unet segmentation app instead of peopleseg, as the post-processing may differ (peopleseg is a specific use case).

Thanks

Morganh: There are only three classes in the output image.

As far as running “tao unet inference xxx”, the first image on this post was generated with the tao notebook by:

tao unet inference --gpu_index=$GPU_INDEX -e $SPECS_DIR/unet_retrain_vgg_6S.txt \
                  -m $USER_EXPERIMENT_DIR/export/tao.fp326s01.engine \
                  -o $USER_EXPERIMENT_DIR/export/ \
                  -k $KEY

Please notice that I use

tao.fp326s01.engine for the engine that was converted with tao and can ONLY BE USED in TAO
ds.fp326s01.engine for the engine that was tao-converted within the deepstream docker and can ONLY BE USED IN THAT DOCKER
AI01.fp326s01.engine for the engine that was converted with tao-converter on the “host PC”, which is what I use in C++

Also, as I said, the engine file is generated by tao-converter, so I don't understand what you mean by “.. the .engine file or .trt file generated by deepstream.. ”?

I fed the tao-generated etlt model as

model-engine-file=/home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt

And I got deserialization errors, which I assume are a function of version issues between the dockers involved in deepstream 6.0.1 and the tao docker that generated the model. This type of error, by the way, is extremely frustrating and has resulted in a huge waste of time for my team.

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::29] Error Code 1: Serialization (Serialization assertion magicTagRead == magicTag failed.Magic tag does not match)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::76] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt
0:00:00.793285413 1010 0x562ebb9a2260 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1889> [UID = 1]: deserialize engine from file :/home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt failed
0:00:00.793369210 1010 0x562ebb9a2260 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1996> [UID = 1]: deserialize backend context from engine from file :/home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt failed, try rebuild
0:00:00.793382472 1010 0x562ebb9a2260 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:861 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:799 failed to build network.
0:00:00.793765976 1010 0x562ebb9a2260 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1934> [UID = 1]: build engine file failed
0:00:00.793794617 1010 0x562ebb9a2260 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2020> [UID = 1]: build backend context failed
0:00:00.793809719 1010 0x562ebb9a2260 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1257> [UID = 1]: generate backend failed, check config file settings
0:00:00.793867609 1010 0x562ebb9a2260 WARN nvinfer gstnvinfer.cpp:841:gst_nvinfer_start: error: Failed to create NvDsInferContext instance
0:00:00.793876393 1010 0x562ebb9a2260 WARN nvinfer gstnvinfer.cpp:841:gst_nvinfer_start: error: Config file path: ./configs/peopleSemSegNet_tao/pgie_peopleSemSegNet_tao_configC.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

eenav:

I am running ./apps/tao_segmentation/ds-tao-segmentation with a config file modified from peopleSemSegNet_tao, which was the closest segmentation-related tao apps config.

network-mode was set to 1 because of a previous test, but I have run in all three modes and it makes no difference.

But I think the problem is elsewhere, perhaps in tao-converter, since using the model in C++ with TensorRT produces an identical result.

Because the TensorRT version is different. See more in the UNET — TAO Toolkit 3.22.02 documentation.

If the TAO inference environment has the same versions of TensorRT and CUDA as the environment where you run deepstream inference, the unet engine file (generated via deepstream) can be loaded directly, and vice versa.

Since you already get a similar result between your own c++ inference and deepstream inference, I suggest you run the experiment below. It involves only "tao unet inference xxx" and your c++ inference.

  1. Try to use "tao unet export xxx " to generate .engine file. See UNET — TAO Toolkit 3.22.02 documentation
  2. Then use "tao unet inference xxx " to check if the inference result is expected.
  3. Then run your own c++ inference to check if the .engine file has similar result as step 2.

" … has the same version of TensorRT and CUDA against the environment where you run deepstream inference …"

Not so. If I use such a model, I get deserialization errors from Deepstream. As I explained, what I did was install tao-converter in the deepstream docker and tao-convert the etlt file created by the tao toolkit. I have been trying to do precisely what you suggest, at great expense of time and with no result.

It is exactly the same situation with the models used from C++, which are created by tao-converter from the same tao-toolkit-generated etlt, on the local host environment, which has TensorRT 8.2.4 GA installed.

This is another of the many sources of frustration that took us a very long time to understand, and that is very difficult to overcome.

On the other hand, the way the tao launcher instantiates docker images for command execution from within the notebooks has made it impossible to access the docker (since it is unknown which docker is used) and learn the versions of CUDA, TensorRT, and other tools in use, or to run other tools simultaneously to map the effect of hyperparameters on training (which makes hyperparameter tuning very difficult compared to other options), and so to replicate that setup in the host environment.

As far as the steps you suggest:

  1. Try to use "tao unet export xxx " to generate .engine file. See UNET — TAO Toolkit 3.22.02 documentation
  2. Then use "tao unet inference xxx " to check if the inference result is expected.

As I said, I did that; the result is the first picture in my post, and it is good enough at this stage.

  1. Then run your own c++ inference to check if the .engine file has similar result as step 2.

Loading the same model in the host environment from C++ produces deserialization errors. I need to use tao-converter on the etlt file from the host environment. The same goes for Deepstream.

Update:

By way of a little docker curiosity, I was able to access the tao docker image nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3 and see that the TensorRT version is 8.0.1-1+cuda11.3.

I will attempt to downgrade the version of tensorrt and try the experiment.

thanks

Update 1:

I have CUDA 11.4 installed in my host environment and

sudo apt-get install tensorrt

refused to install. I am presuming I need to install CUDA 11.3, which will mean completely recompiling OpenCV, with its own issues. A one-day project…

Could you run c++ inference inside the same environment where you run “tao unet inference xxx” ? Then there should not be deserialization errors.

One more experiment if you run inference with deepstream.
Please note that deepstream 6 can parse the onnx etlt model as well (the info in deepstream_tao_apps/pgie_unet_tao_config.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub is not updated, sorry for that).

So, you can configure the .etlt model and the key in the ds config file, comment out model-engine-file, and run inference again.
The ds will generate the trt engine automatically.

tlt-encoded-model=your.etlt
tlt-model-key=yourkey
#model-engine-file=…/…/models/peopleSemSegNet/6S001_fp16.engine

Thanks! I tried that and got an error.

I regenerated the engine:


!tao unet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/retrain/weights/model_retrained.tlt \
               -k $KEY \
               -e $SPECS_DIR/unet_retrain_vgg_6S.txt \
               -o $USER_EXPERIMENT_DIR/export/tao.fp32_6s01.etlt \
               --data_type fp32 \
               --engine_file $USER_EXPERIMENT_DIR/export/tao.fp32_6s01.engine \
               --max_batch_size 3

the deepstream config has:

tlt-encoded-model=home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt
tlt-model-key=nvidia_tlt

Inside the docker I’m running:


./apps/tao_segmentation/ds-tao-segmentation -c ./configs/peopleSemSegNet_tao/pgie_peopleSemSegNet_tao_configC.txt -i /home/david/Envs/TAO/6S001/data/images/test/0009.jpg

and the output is

root@AI01:/opt/nvidia/deepstream/deepstream-6.0/deepstream_tao_apps# ./apps/tao_segmentation/ds-tao-segmentation -c ./configs/peopleSemSegNet_tao/pgie_peopleSemSegNet_tao_configC.txt -i /home/david/Envs/TAO/6S001/data/images/test/0009.jpg

Now playing: ./configs/peopleSemSegNet_tao/pgie_peopleSemSegNet_tao_configC.txt
0:00:00.242633444 901 0x564bb6593460 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files

NvDsInferCudaEngineGetFromTltModel: Failed to open TLT encoded model file /opt/nvidia/deepstream/deepstream-6.0/deepstream_tao_apps/configs/peopleSemSegNet_tao/home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:724 Failed to create network using custom network creation function
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:789 Failed to get cuda engine from custom library API
0:00:00.762695569 901 0x564bb6593460 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1934> [UID = 1]: build engine file failed
terminate called after throwing an instance of ‘nvinfer1::InternalError’
what(): Assertion mRefCount > 0 failed.
Aborted (core dumped)

listing of host mapped directory containing the tao models from inside the deepstream docker

root@AI01:/home/david/Envs/TAO/6S001/export# ls -l
total 297824
-rw-rw-r-- 1 1000 1000 37 May 9 14:01 6S001labels.txt
-rw-rw-r-- 1 1000 1000 103645829 May 6 11:40 AI01.fp32_6s01.engine
-rw-r--r-- 1 root root 69246012 May 9 13:57 dsfp32.6s01.engine
drwxr-xr-x 2 1000 1000 4096 May 3 20:04 mask_labels_trt
-rw-r--r-- 1 1000 1000 533 May 10 09:09 results_trt.json
-rw-r--r-- 1 1000 1000 67658529 May 10 09:08 tao.fp32_6s01.engine
-rw-r--r-- 1 1000 1000 64388119 May 10 09:08 tao.fp32_6s01.etlt
-rw-r--r-- 1 1000 1000 80 May 3 14:19 target_class_id_mapping.json
drwxr-xr-x 2 1000 1000 4096 May 3 14:19 vis_overlay_trt

May I know which docker you are running?

The docker is

nvcr.io/nvidia/deepstream 6.0.1-devel

with the deepstream-tao-apps installed

Seems that the path is not correct. Please double check.

As for the path, yes, I saw that, but the config file has

tlt-encoded-model=home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt

so I have no idea how that directory concatenation occurs…

At any rate, I copied the etlt files to /opt/nvidia/deepstream/deepstream-6.0/deepstream_tao_apps/configs/peopleSemSegNet_tao/

and included in the config file

tlt-encoded-model=tao.fp32_6s01.etlt

and that runs, with the same result.

I guess that's a bug in deepstream: when using tlt-encoded-model, it concatenates the config path with the etlt filename.
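For what it is worth, the observed concatenation is consistent with ordinary relative-path handling rather than a random bug: the `tlt-encoded-model` value had no leading `/`, so it appears to be resolved relative to the config file's directory. A stdlib sketch reproducing the exact path from the "Failed to open TLT encoded model file" error above:

```python
import posixpath

# Directory of the config file that ds-tao-segmentation was launched with:
config_dir = "/opt/nvidia/deepstream/deepstream-6.0/deepstream_tao_apps/configs/peopleSemSegNet_tao"

# The value from the config file, missing its leading slash:
relative_value = "home/david/Envs/TAO/6S001/export/tao.fp32_6s01.etlt"

# A relative value is joined onto the config directory, which reproduces
# the concatenated path seen in the error message:
print(posixpath.join(config_dir, relative_value))

# An absolute value (leading "/") is used as-is, because join discards
# everything before an absolute component:
absolute_value = "/" + relative_value
print(posixpath.join(config_dir, absolute_value))
```

In other words, restoring the leading `/` in `tlt-encoded-model=` should also have fixed it, without copying the files next to the config.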

UPDATE:

After downgrading the host computer to the same versions of TensorRT, CUDA, and cuDNN as in the docker images, and rebuilding OpenCV from source… I was able to do the following tests:

  1. Regenerated the trt engine from the tao toolkit notebook
  2. Used the same engine for tao inference within the tao notebook
  3. Used the same engine for deepstream inference from tao apps
  4. Used the same engine for C++ inference

In addition,

  1. Ran inference from deepstream tao apps using the original retrained etlt

Sadly, there is no change. tao inference produces the expected good results; the rest do not.

Is there a way to have the deepstream tao app output the image mask?
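Independently of what the app itself supports, if you can get hold of the per-pixel class-index map from the segmentation output, it can be dumped as a grayscale PGM for inspection in any image viewer. A minimal stdlib sketch; `write_pgm` and the `scale` factor are illustrative assumptions, not part of the tao apps:

```python
def write_pgm(path, mask_rows, scale=60):
    """Save a 2-D class-index mask as a binary PGM.

    Class k is written as gray level k * scale so that the classes are
    distinguishable by eye; scale=60 is an arbitrary illustrative choice.
    """
    width, height = len(mask_rows[0]), len(mask_rows)
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, height))  # PGM header
        for row in mask_rows:
            f.write(bytes(min(v * scale, 255) for v in row))

# Hypothetical 2x3 mask containing three classes:
write_pgm("mask.pgm", [[0, 1, 2], [2, 1, 0]])
```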

Here are the output images:

  1. from tao inference

  2. from deepstream etlt

  3. from deepstream trt engine

  4. from C++

There is an easy solution. You can refer to deepstream_tao_apps/pgie_unet_tao_config.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub

You just need to put your .etlt model to the models folder.

I solved the directory concatenation issue.

Please read my previous post which is more relevant.

If possible, could you share the .etlt model, key for reproducing? We will check further internally.