Different results between tlt-infer and TRT engine for UNet segmentation model

It is Mapillary Vistas, just with some classes merged as per the training spec.

Thanks for the info.

May I know which docker image you used to train the above-mentioned TLT model?
TLT 3.0-dp-py3 docker or TLT 3.0-py3 docker?

And could you please share which folder you used for the training images, and which folder you set as the mask folder?

  1. docker_tag: v3.0-py3
  2. Sorry, I didn’t get your question.

For item 2, see the folders of Mapillary:
morganh@dl:~/demo_3.0/unet_mapillary$ ls training/
images v1.2 v2.0
morganh@dl:~/demo_3.0/unet_mapillary$ ls training/v1.2/
instances labels panoptic
morganh@dl:~/demo_3.0/unet_mapillary$ ls training/v2.0/
instances labels panoptic polygons

In your spec file,
train_images_path: "/new/media/hdd/datasets/mapillary/vistas/images/train"
which folder is it?

And for
train_masks_path: "/new/media/hdd/datasets/mapillary/vistas/masks/train"
which folder is it as well?


train_images_path: "/new/media/hdd/datasets/mapillary/vistas/images/train" is a folder where I have all the images from ~/demo_3.0/unet_mapillary/training/images but resized to 640x640 with padding.

train_masks_path: "/new/media/hdd/datasets/mapillary/vistas/masks/train" is ~/demo_3.0/unet_mapillary/training/v2.0/labels but converted to the format required by TLT and also resized to 640x640 with padding.
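
As a rough illustration of the "resized to 640x640 with padding" step, the sketch below computes the letterbox geometry for one image. The function name `letterbox_geometry` and the choice of centered padding are assumptions made for this example; the thread does not say where the padding was placed or which tool did the resizing.

```python
def letterbox_geometry(width, height, target=640):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    Hypothetical helper: the image is scaled so its longer side equals
    `target`, and the remaining space is split evenly as padding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the smaller ratio so both sides fit inside the target square.
    scale = min(target / width, target / height)
    new_w = max(1, min(target, round(width * scale)))
    new_h = max(1, min(target, round(height * scale)))
    # Centered padding is an assumption; top-left placement is also common.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top
```

Whatever geometry is chosen, the image and its mask need the same scale and offsets, and the mask should be resized with nearest-neighbor interpolation so that class ids are not blended into values that match no class.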

For “converted to format required by tlt”, do you mean converting to gray images?

I mean that the labels of Mapillary are represented by different colors, while “UNet expects the images and corresponding masks encoded as images. Each mask image is a single-channel image, where every pixel is assigned an integer value that represents the segmentation class.”
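
A minimal sketch of that color-to-class conversion, including the class merging mentioned earlier, could look like the following. The palette entries and the name `color_mask_to_class_ids` are placeholders for illustration only; they are not the real Mapillary Vistas colors or the mapping from the training spec. The mask is represented as nested lists of RGB tuples to keep the example dependency-free; in practice an array library would do the same lookup much faster.

```python
# Hypothetical palette: RGB color -> training class id.
# Two colors sharing one id is how "merged" classes are expressed here.
PALETTE_TO_CLASS = {
    (128, 64, 128): 0,  # placeholder color for class 0
    (244, 35, 232): 0,  # a second color merged into class 0
    (70, 70, 70): 1,    # placeholder color for class 1
}


def color_mask_to_class_ids(rgb_rows, palette):
    """Convert rows of (R, G, B) pixels into rows of integer class ids."""
    out = []
    for y, row in enumerate(rgb_rows):
        out_row = []
        for x, pixel in enumerate(row):
            color = tuple(pixel)
            if color not in palette:
                # Fail loudly rather than silently assigning a wrong class.
                raise ValueError(f"unknown color {color} at ({x}, {y})")
            out_row.append(palette[color])
        out.append(out_row)
    return out
```

The result is one integer per pixel, which is the single-channel layout the quoted UNet requirement describes.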

OK, got it.

I’m still pushing the internal team to check the difference between tlt-infer and the TRT engine.
BTW, may I know which dGPU you used to run the UNet training?