Deploying .etlt models on Xavier with DeepStream

I have trained a UNet ResNet-18 binary semantic segmentation model using the TAO Toolkit, and I want to deploy it on a Xavier NX using a Python DeepStream app. I haven't found any guide anywhere for doing that, please help me. I am running a Docker container with DeepStream 6.2. I want to ship only the .etlt file and have the engine file generated on the device; for this I read in some threads that I have to generate the calibration cache file, which I haven't been able to do either. I tried converting the .etlt model to an engine with tao-converter on the Xavier machine, but I get the error "Model has dynamic shape but no optimization profile specified." What is the optimization profile for the UNet pretrained ResNet-18 segmentation model?

If you use the .etlt model file generated by TAO directly in DeepStream, you can generate the engine file without the TAO Toolkit: DeepStream builds the engine automatically on first run. That should get you past the dynamic-shape problem.
The above is from my own project experience; the official team may have a better solution.
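If you do still want to pre-build the engine with tao-converter, the "dynamic shape" error usually means the exported model has a dynamic batch dimension, so tao-converter needs an optimization profile via `-p` (min, optimal and max input shapes). A hedged sketch, assuming the input tensor is named input_1 and a single-channel 320x320 input as in your config (verify the actual input name and shape from your export log):

```shell
# Sketch only: the key, paths, input name and shapes are assumptions.
./tao-converter model_isbi.etlt \
  -k nvidia_tlt \
  -p input_1,1x1x320x320,1x1x320x320,1x1x320x320 \
  -t fp16 \
  -e model_isbi.engine
```

With min = opt = max, the profile effectively pins a static batch size of 1; widen the max shape if you later want larger batches.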

Could you refer to the link below to learn how to deploy the models with DeepStream?
https://github.com/NVIDIA-AI-IOT/deepstream_python_apps
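In that repo, apps/deepstream-segmentation is the closest starting point for a Python UNet deployment. A hedged sketch of how it is typically run (argument order taken from the sample's README; verify against the version you clone):

```shell
git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps.git
cd deepstream_python_apps/apps/deepstream-segmentation
# <config file> <jpeg or mjpeg input> <output folder for segmentation masks>
python3 deepstream_segmentation.py dstest_segmentation_config_industrial.txt sample.jpg ./output
```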

I have modified the config dstest_segmentation_config_industrial.txt, but I'm not sure the modifications I made are correct, since they are not working. Here is the modified config:

[property]
gpu-id=3
net-scale-factor=0.003921568627451
tlt-model-key=nvidia_tlt
model-color-format=2
tlt-encoded-model=model_isbi.etlt
infer-dims=1;320;320
uff-input-order=0
uff-input-blob-name=input_1
batch-size=1

network-mode=0
num-detected-classes=1
interval=0
gie-unique-id=1
network-type=2
output-blob-names=conv2d_19/Sigmoid
segmentation-threshold=0.5
#parse-bbox-func-name=NvDsInferParseCustomSSD
#custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
pre-cluster-threshold=0.5
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

# Per class configuration

#[class-attrs-2]
#threshold=0.6
#roi-top-offset=20
#roi-bottom-offset=10
#detected-min-w=40
#detected-min-h=40
#detected-max-w=400
#detected-max-h=800

I have tried that, but I don't know if my modifications to the config file are correct, and I also don't have the calibration cache; even with the calibration options passed when running "tao export", it does not generate the calibration file. I have posted my modified config file above, please do look at it.

Models trained with TAO are compatible with DeepStream, perform well, and support accelerated inference.
What did you base your configuration file modifications on? If the deployment fails, the problem may be in the file you modified; I can send you a working configuration if you need it.

Please do send it if you can, thanks!

This is a configuration file for inference. It needs to be adapted to your environment, but it can be used with simple modifications.
You are using the SSD model, right?

config.txt

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=dssd_labels.txt
model-engine-file=../../models/dssd/dssd.etlt_b1_gpu0_int8.engine
tlt-encoded-model=../../models/dssd/dssd.etlt
int8-calib-file=../../models/dssd/dssd_cal.bin
tlt-model-key=nvidia_tlt
infer-dims=3;544;960
uff-input-order=0
maintain-aspect-ratio=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=5
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
output-blob-names=NMS
parse-bbox-func-name=NvDsInferParseCustomNMSTLT
custom-lib-path=../../post_processor/libnvds_infercustomparser_tao.so

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
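For reference, the dssd_cal.bin referenced above is written at export time. For a UNet model, a hedged sketch of the export call that should emit the INT8 calibration cache (the paths, key and spec file here are assumptions; flag names follow typical TAO 3.x `unet export` usage, so check them against your TAO version's docs):

```shell
tao unet export \
  -m /workspace/experiments/unet/weights/model.tlt \
  -k nvidia_tlt \
  -e /workspace/specs/unet_train_resnet_isbi.txt \
  --data_type int8 \
  --cal_image_dir /workspace/data/images/train \
  --cal_cache_file /workspace/experiments/export/cal.bin \
  --cal_data_file /workspace/experiments/export/cal.tensorfile \
  --batches 10 \
  --batch_size 1
```

If no calibration cache appears, the usual culprits are a missing `--data_type int8` or an empty `--cal_image_dir`.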

I don't have the engine and calibration files though, and I am using the UNet model for segmentation.

[property]
gpu-id=0
net-scale-factor=0.007843
model-color-format=1
offsets=127.5;127.5;127.5
labelfile-path=unet_labels.txt
##Replace following path to your model file
#model-engine-file=../../models/unet/unet_resnet18.etlt_b1_gpu0_fp16.engine
#current DS cannot parse the ONNX-based etlt model, so you need to
#convert the etlt model to a TensorRT engine first using tao-converter
tlt-encoded-model=../../models/unet/unet_resnet18.etlt
tlt-model-key=tlt_encode
infer-dims=3;320;320
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=3
interval=0
gie-unique-id=1
network-type=2
output-blob-names=argmax_1
segmentation-threshold=0.0
##specify the output tensor order, 0(default value) for CHW and 1 for HWC
segmentation-output-order=1

[class-attrs-all]
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
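A note on segmentation-output-order in the config above: the same output buffer is read with the class axis first (CHW, value 0) or last (HWC, value 1), so getting it wrong scrambles the mask. A minimal NumPy sketch of the difference for a model that emits per-class scores (shapes are illustrative only; a UNet exported with an argmax_1 head already emits class ids instead):

```python
import numpy as np

h, w, c = 4, 4, 3
scores = np.arange(h * w * c, dtype=np.float32)

# segmentation-output-order=0: CHW layout, class axis first
chw = scores.reshape(c, h, w)
class_map_chw = np.argmax(chw, axis=0)   # per-pixel class ids, shape (h, w)

# segmentation-output-order=1: HWC layout, class axis last
hwc = scores.reshape(h, w, c)
class_map_hwc = np.argmax(hwc, axis=2)   # per-pixel class ids, shape (h, w)

print(class_map_chw.shape, class_map_hwc.shape)
```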

When I run the deepstream app , I’m getting the following error:
Opening in BLOCKING MODE
ERROR: Failed to set cuda device (3)., cuda err_no:101, err_str:cudaErrorInvalidDevice
0:00:00.514800080 546 0x39b8aa10 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::~NvDsInferContextImpl() <nvdsinfer_context_impl.cpp:2045> [UID = 1]: Failed to set cuda device 3 (cudaErrorInvalidDevice).
0:00:00.514911504 546 0x39b8aa10 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Failed to create NvDsInferContext instance
0:00:00.514946928 546 0x39b8aa10 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Config file path: dstest_segmentation_config_industrial.txt, NvDsInfer Error: NVDSINFER_CUDA_ERROR
Error: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-nvinference-engine:
Config file path: dstest_segmentation_config_industrial.txt, NvDsInfer Error: NVDSINFER_CUDA_ERROR

I know that a CUDA device is available; I just ran a PyTorch script to check for available CUDA devices and the Xavier GPU showed up there.

It's always a step forward.
This is not a very complicated problem; it is related to your pipeline configuration file and environment. I think you should create a new topic to deal with it.
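One concrete thing to check from that log: "Failed to set cuda device (3)" means DeepStream is honoring the gpu-id=3 line in your first config, but Jetson Xavier NX exposes a single integrated GPU at index 0, so any other index raises cudaErrorInvalidDevice. A minimal fix in the [property] section:

```
[property]
gpu-id=0
```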

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Thanks for @autodrive2022's answer.
@sanjay15, you can try referring to our guide first to set up your environment step by step.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Quickstart.html