Poor Result After INT8 Optimization (TLT Getting Started Guide)

Environment

Google GCP spun up with nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
TensorRT Version : 7.0.0-1+cuda10.0
GPU Type : NVIDIA Tesla T4
Nvidia Driver Version : 4.2.2 [L4T 32.2.1]
Operating System + Version : Ubuntu 18.04, Linux kernel 5.3.0-1032-gcp
Python Version (if applicable) : 3.6

Description

I trained DetectNet_v2 (ResNet-18) on KITTI with the docker image nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3, following this guide - Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation

Based on what I can see, steps 1-8 went without problems. Evaluation in step 7 gave 83.72% AP for car, 84.40% for cyclist, and 74.9% for pedestrian. I trained the model, pruned it, and re-trained. When I went through step 8 (Visualize inferences), the output looked good (cars were marked, people were marked, etc.).

Step 9 is where things didn't work out well for me. For 9A, I was able to create the calibration.tensor file using tlt-int8-tensorfile. After that, I was also able to use tlt-export to get the .etlt file. Lastly, using tlt-converter, I was able to create the TensorRT engine file. I am assuming I went through the above steps successfully, because the logs didn't show any errors. Please see the logs for the above steps here (OneDrive link).
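For reference, my 9A cells followed the sample notebook, so they looked roughly like this (a sketch from memory - flag names and paths are the notebook's, so please double-check against your copy; the -m / --batches values shown are the notebook defaults, which I experimented with later):

# 9A-1: dump a calibration tensorfile from the training data
!tlt-int8-tensorfile detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                                  -m 10 \
                                  -o $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor

# 9A-2: export the pruned/retrained model to .etlt with an INT8 calibration cache
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY \
            --data_type int8 \
            --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
            --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
            --batches 10 \
            --verbose

# 9A-3: build the INT8 TensorRT engine from the .etlt + calibration cache
!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k $KEY \
               -c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,384,1248 \
               -i nchw \
               -t int8 \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt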

When I ran tlt-infer, the result was actually really bad. I tried changing the -m flag for tlt-int8-tensorfile to use 20% of the training data, based on what I saw in the tlt-train output (2020-11-03 15:02:27,668 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 6434, number of sources: 1, batch size per gpu: 4, steps: 1609), so 20% of 6434 is about 1286.

Even then, the result was as shown in this result image (OneDrive link).

Question

I would appreciate any pointers, as I can't figure out what I did wrong. My end goal is to create a TensorRT engine for the TX2, but I wanted to try the tutorial first before moving forward (and I do understand that the TX2 TensorRT engine shouldn't be INT8).

Why were you training with nvcr.io/nvidia/tlt-streamanalytics:v1.0_py2? That version is about a year old. Could you try a newer one, for example the 1.0.1 or 2.0_dp docker?

Also, even with nvcr.io/nvidia/tlt-streamanalytics:v1.0_py2, I would not expect the tlt-infer result to be bad. I will find time to check your result.

@Morganh
I apologize - I copied that incorrectly. I am actually using the 2.0_py3 version (nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3).

Please share the command and full log from when you run tlt-infer.
I suggest you save your Jupyter notebook as an HTML file and attach it here. Thanks.

Oh, ignore my request above. I saw some logs in the folder link you shared.
Can you paste your $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt?

@Morganh Please let me know if this is what you were looking for. In addition to the file you requested, I am including the other KITTI spec files. "Attach file" says that I can only attach "select images or files from your device (jpg, jpeg, png, gif, log, doc, docx, txt, cpp, c, rtf, gzip, zip, gz)".

Is there a different way to attach an HTML file? In the meantime, here are the contents of those files. https://1drv.ms/u/s!AjcYy-uvHk09j8ZOE5n642_c2BxGsg?e=WKQKmS

The HTML is not needed now. Please attach your $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt.
Thanks.

@Morganh
Please see the attached files: detectnet_v2_retrain_resnet18_kitti.txt (5.2 KB), detectnet_v2_train_resnet18_kitti (1).txt (5.2 KB), detectnet_v2_inference_kitti_tlt.txt (2.2 KB), detectnet_v2_inference_kitti_etlt.txt (2.2 KB)

Hi @a428tm,
I find your error is similar to this earlier topic: Nvidia TLT.
To narrow it down, could you please run tlt-infer against the tlt model instead of the trt engine?
That is,
please try examples/specs/detectnet_v2_inference_kitti_tlt.txt to confirm your tlt model can get good inference results (section 10).
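In other words, something like the visualize-inferences cell from the notebook (a sketch - adjust the output and image directories to your setup):

# run tlt-infer with the tlt-model spec, writing annotated images to an output folder
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt \
                        -o $USER_EXPERIMENT_DIR/tlt_infer_testing \
                        -i $DATA_DOWNLOAD_DIR/testing/image_2 \
                        -k $KEY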

@Morganh

Thanks for the quick response. I did read that post before posting mine. I'm not sure how it relates, but please do let me know if I made a similar mistake to the original poster.

As for the inference using the trt engine -
If I understood your question correctly, are you asking me to check step 8 again? For step 8, inside detectnet_v2_inference_kitti_tlt.txt, I am using this:

tlt_config {
  model: "/workspace/tlt-experiments/detectnet_v2/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt"
}

The result is as shown in this image (which I thought was good).

Please let me know if this isn’t what you were asking for.

Thank you,
Jae

Yes, that's it. Correct, step 8. Your tlt model runs well, but the trt engine gives bad inference results.
So it seems something went wrong when you generated the int8 trt engine.
To narrow it down, could you please generate an fp32 trt engine instead of an int8 one? Then run tlt-infer against the fp32 trt engine to check whether it works well.
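For example, something like this (a sketch reusing the flags from the notebook's export/convert cells, just with fp32 and no calibration inputs):

# export without any calibration data, fp32
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY \
            --data_type fp32 \
            --verbose

# build an fp32 engine from the etlt
!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k $KEY \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,384,1248 \
               -i nchw \
               -t fp32 \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/fp32_resnet18_detector.trt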

@Morganh
Roger that. I actually tried doing that (FP16, actually) on my own but failed. Would you mind reviewing what I tried, to make sure there was no error in the steps I took?

  1. Since I will be running FP32 (or FP16), I DON'T run step 9A-1:

!tlt-int8-tensorfile detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                                  -m 1286 \
                                  -o $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor

  2. I went straight to the tlt-export step, 9A-2:

!rm -rf $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt
!rm -rf $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY \
            --data_type fp16 \
            --verbose

The output for this step is shown below. The important thing I found was at the very bottom, where it said:

2020-11-09 06:49:21,999 [DEBUG] iva.common.export.base_exporter: Data file doesn’t exist. Pulling input dimensions from the network.

Based on what I saw in this doc, I thought the above arguments were the only things needed; however, it seems I am missing something. I thought about using --cal_data_file as in the original Jupyter notebook; however, that file gets created in step 9A-1. So I wanted to double-check with you before proceeding with FP16. FYI, my goal is to eventually deploy on the TX2, so I wanted to use FP16 rather than FP32.

Using TensorFlow backend.
2020-11-09 06:49:08.385214: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:12.129246: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-09 06:49:12.129449: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:12.130119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-11-09 06:49:12.130154: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:12.130206: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-09 06:49:12.131645: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-09 06:49:12.131732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-09 06:49:12.133776: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-09 06:49:12.135364: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-09 06:49:12.135448: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:49:12.135568: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:12.136271: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:12.136860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-09 06:49:12.136906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:13.051226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:49:13.051287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-11-09 06:49:13.051298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-11-09 06:49:13.051533: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:13.052246: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:13.052912: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:13.053560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13813 MB memory) → physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
2020-11-09 06:49:16.141875: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:16.142599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-11-09 06:49:16.142656: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:16.142711: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-09 06:49:16.142743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-09 06:49:16.142760: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-09 06:49:16.142781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-09 06:49:16.142818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-09 06:49:16.142869: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:49:16.142967: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:16.143608: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:16.144208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-09 06:49:16.144245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:49:16.144257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-11-09 06:49:16.144275: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-11-09 06:49:16.144381: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:16.145033: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:16.145652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13813 MB memory) → physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
2020-11-09 06:49:16,634 [DEBUG] iva.common.export.base_exporter: Saving etlt model file at: /workspace/tlt-experiments/detectnet_v2/experiment_dir_final/resnet18_detector.etlt.
2020-11-09 06:49:17,631 [DEBUG] modulus.export._uff: Patching keras BatchNormalization…
2020-11-09 06:49:17,631 [DEBUG] modulus.export._uff: Patching keras Dropout…
2020-11-09 06:49:17,631 [DEBUG] modulus.export._uff: Patching UFF TensorFlow converter apply_fused_padding…
2020-11-09 06:49:18.610437: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:18.611149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-11-09 06:49:18.611208: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:18.611266: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-09 06:49:18.611315: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-09 06:49:18.611336: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-09 06:49:18.611359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-09 06:49:18.611383: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-09 06:49:18.611405: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:49:18.611501: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:18.612126: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:18.612715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-09 06:49:18.612764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:49:18.612783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-11-09 06:49:18.612805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-11-09 06:49:18.612921: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:18.613493: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:18.614099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13813 MB memory) → physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
2020-11-09 06:49:19,094 [DEBUG] modulus.export._uff: Unpatching keras BatchNormalization layer…
2020-11-09 06:49:19,094 [DEBUG] modulus.export._uff: Unpatching keras Dropout layer…
2020-11-09 06:49:21.153768: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.154474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-11-09 06:49:21.154531: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:21.154584: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-09 06:49:21.154614: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-09 06:49:21.154633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-09 06:49:21.154653: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-09 06:49:21.154672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-09 06:49:21.154693: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:49:21.154796: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.155417: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.156041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-09 06:49:21.156432: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.157048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-11-09 06:49:21.157102: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-09 06:49:21.157155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-09 06:49:21.157181: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-09 06:49:21.157202: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-09 06:49:21.157222: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-09 06:49:21.157242: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-09 06:49:21.157262: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:49:21.157341: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.157997: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.158607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-09 06:49:21.158645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:49:21.158660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-11-09 06:49:21.158668: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-11-09 06:49:21.158775: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.159432: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-09 06:49:21.160025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13813 MB memory) → physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking [‘output_cov/Sigmoid’, ‘output_bbox/BiasAdd’] as outputs
2020-11-09 06:49:21,999 [DEBUG] iva.common.export.base_exporter: Data file doesn’t exist. Pulling input dimensions from the network.
2020-11-09 06:49:21,999 [DEBUG] iva.common.export.base_exporter: Input dims: (3, 384, 1248)
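Separately, I sanity-checked that the export actually wrote the .etlt file (just a listing, nothing fancy):

# expecting resnet18_detector.etlt to show up here with a non-zero size
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_final/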

Your etlt model should be available. Please check it and then run tlt-converter in fp16 mode.

Or you can deploy the etlt model onto the TX2 directly to run inference.

Since I am hoping to get a higher-FPS result, I am working on exporting the file to a TRT engine.
I will try deploying the etlt as you suggested, but I hope to take advantage of TRT as well if possible.

!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt
-k $KEY
-o output_cov/Sigmoid,output_bbox/BiasAdd
-d 3,384,1248
-i nchw
-t fp16
-e $USER_EXPERIMENT_DIR/experiment_dir_final/fp16_resnet18_detector.trt \

and the error output is -

[ERROR] UffParser: Unsupported number of graph 0
[ERROR] Failed to parse the model, please check the encoding key to make sure it’s correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)

I am using the API key that I saved in an env variable, and I am getting this error. Would you happen to know why? Should I re-run everything to re-create the original engine and then create another .tlt?

Thanks for such quick responses and your guidance!

For FPS, the result is the same whether you deploy the etlt model or the trt engine on the TX2, because if you deploy the etlt model, the DeepStream app will convert it into a trt engine before running inference anyway.

For your "Failed to parse the model, please check the encoding key to make sure it's correct" error, please check that $KEY is correct, or set the key explicitly in the tlt-converter command.

Also, please make sure you keep the " \ " line continuations in your tlt-converter command.
For example,

-d 3,384,1248 \
-i nchw \
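So the full fp16 command would look roughly like this (a sketch - if $KEY might not be set in the notebook environment, echo it first and/or paste your actual key string in place of the placeholder):

# verify the key variable is actually populated
!echo $KEY

!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k <your-key-string> \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,384,1248 \
               -i nchw \
               -t fp16 \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/fp16_resnet18_detector.trt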

It seems the "\" characters were dropped when I copied the text, but yes, I do have them in the notebook. I am not sure why I have the KEY issue - I didn't run into any key errors before.

I created another KEY, and now it seems I have to run all the steps above again. Let me check that and get back to you. Training will probably take 7-8 hours, so I will get back to you when I am back at step 9.

Actually, the KEY error is a common question on the TLT forum. Normally it is not necessary to generate a new key; you just need to check that the key is set correctly.
