Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc) Nano
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Unet
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) v3.0-py3
• Training spec file (if you have one, please share it here) experiment_spec.txt (16.9 KB)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
Hello, I am having trouble reproducing the results I see when running tlt-infer on a trained UNet model versus the same model exported and converted to a TRT engine. The segmentation result from tlt-infer is much more accurate, whereas the TRT engine produces more errors and the masks appear to be offset from the objects. I have checked many times that I perform the preprocessing as required, run on the same image, etc. The issue persists with both an fp16 and an fp32 engine. To illustrate the kind of difference, I am attaching the result from tlt-infer and the mask for the lane-marking class produced by the TRT engine. Please let me know if you need extra input from me, such as models or specs.
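For illustration, a minimal sketch of the kind of preprocessing I mean is below. The input resolution, channel order and normalization shown here are assumptions for the example, not necessarily my exact values; they have to match the training spec and whatever tlt-infer applies internally:

```python
import cv2
import numpy as np

def preprocess(image_path, width=960, height=544):
    """Prepare one image for the UNet TRT engine.

    NOTE: the input size, BGR channel order and [-1, 1] normalization below
    are assumptions for illustration -- they must match the training spec
    and what tlt-infer does internally.
    """
    img = cv2.imread(image_path)                       # BGR, HxWxC, uint8
    img = cv2.resize(img, (width, height))             # model input resolution
    img = img.astype(np.float32)
    img = (img / 127.5) - 1.0                          # scale to [-1, 1] (assumption)
    img = img.transpose(2, 0, 1)                       # HWC -> CHW
    return np.ascontiguousarray(img[np.newaxis, ...])  # add batch dim -> NCHW
```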
No, this is the result of a custom script in which I use the TensorRT Python API to run the engine and save a binary mask for each class in the picture separately.
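To give an idea of the script's structure, here is a stripped-down sketch. The binding order, the NCHW softmax output layout, the float32 input binding and the argmax step are assumptions for illustration and are adapted to the actual engine:

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def run_engine(engine_path, input_batch, num_classes, out_h, out_w):
    """Run a serialized TRT engine on one preprocessed NCHW float32 batch
    and return one binary mask per class.

    NOTE: a single input and a single NCHW softmax output binding are
    assumed here; adapt the shapes and binding order to the real engine.
    """
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Allocate device buffers for the input and output bindings.
    d_input = cuda.mem_alloc(input_batch.nbytes)
    output = np.empty((1, num_classes, out_h, out_w), dtype=np.float32)
    d_output = cuda.mem_alloc(output.nbytes)

    stream = cuda.Stream()
    cuda.memcpy_htod_async(d_input, input_batch, stream)
    context.execute_async_v2([int(d_input), int(d_output)], stream.handle)
    cuda.memcpy_dtoh_async(output, d_output, stream)
    stream.synchronize()

    # Per-pixel argmax over the class channel, then one binary mask per class.
    label_map = np.argmax(output[0], axis=0)
    return [(label_map == c).astype(np.uint8) * 255 for c in range(num_classes)]
```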
Also, to narrow down the issue, please generate the TRT engine inside the TLT docker instead of on the Nano.
Then run tlt unet inference against this “.engine” or “.trt” file.
I cannot reproduce this with one of the official purpose-built UNet models. Please follow the steps below to check whether you get the same result as mine.
Then, please double-check your previous result. If possible, please share your tlt model, etlt model, and test image with me.
I ran the test you suggested and I also get absolutely identical results for the tlt and trt inferences.
Then I retried training the model to be 100% sure I did not miss something in the configs. I stopped the training midway, ran inference, and got this:
As you can see, the TRT result is much closer to the tlt one, but still noticeably off. It looks as if the objects’ masks are somehow skewed to the left, especially on the left lane marking. The difference probably grows larger with more training, because my first examples show the same effect to a greater extent.
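To put a number on “noticeably off”, one way to compare the tlt and trt masks is the quick sketch below: IoU plus the horizontal shift that best aligns the two binary masks (assuming both are single-channel arrays of the same size; the shift range is arbitrary):

```python
import numpy as np

def mask_iou(a, b):
    """Intersection-over-union of two binary masks."""
    a, b = a > 0, b > 0
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def best_horizontal_shift(tlt_mask, trt_mask, max_shift=30):
    """Shift the TRT mask left/right and report the offset with the highest IoU.
    A consistently non-zero best shift would support the 'skewed to the left'
    observation."""
    best = (0, mask_iou(tlt_mask, trt_mask))
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(trt_mask, s, axis=1)
        iou = mask_iou(tlt_mask, shifted)
        if iou > best[1]:
            best = (s, iou)
    return best  # (pixel shift, IoU at that shift)
```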
Since your trial used vanilla-unet-dynamic and mine uses resnet18, could you please run a trial with a resnet-based model to see whether that is where the issue lies?