Error while using tlt-infer with mask_rcnn on a custom dataset

pejc · January 15, 2021, 5:22pm

Hi, I used the tlt_instance_segmentation Docker container with the notebook to train a model on my custom dataset, which contains only 1 class (2 with background). Everything works fine until I try to run inference on my test set:

    # Running inference for detection on n images
    !tlt-infer mask_rcnn -i $RAW_DATA_DIR/images/test \
                     -o $USER_EXPERIMENT_DIR/maskrcnn_annotated_images \
                     -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                     -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-25000.tlt \
                     -l $SPECS_DIR/coco_labels.txt \
                     -t 0.5 \
                     -b 2 \
                     -k $KEY \
                     --include_mask

I get this error:

Using TensorFlow backend.
2021-01-15 16:57:52.117876: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Traceback (most recent call last):
  File "/usr/local/bin/tlt-infer", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_infer.py", line 60, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 288, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 230, in infer
AssertionError: Batch size should not be greater than the number of samples.

Any clues?
Obviously, the folder does contain more than 2 images. The same error is also thrown with a batch of 1.
Thanks

Morganh · January 16, 2021, 12:59pm

What kind of images are in your $RAW_DATA_DIR/images/test?
Are they jpg or another format?

pejc · January 16, 2021, 2:57pm

They are .png files.

Morganh · January 16, 2021, 3:04pm

Could you try converting the png files to jpg files and retry? I found that “tlt-infer mask_rcnn” only supports jpg images.

    $ for i in *.png ; do convert "$i" "${i%.*}.jpg" ; done
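
One way to check the result of the conversion is to count the files per extension in the test folder. The assertion in the traceback fires when the batch size exceeds the number of images the tool finds, so the .jpg count should be at least the value passed with -b. This is only a minimal sketch: `count_ext` is a hypothetical helper name, not part of TLT, and whether tlt-infer also accepts .jpeg or upper-case extensions is not confirmed in this thread.

```shell
# count_ext DIR EXT
# Print how many regular files directly inside DIR end in .EXT
# (case-insensitive match). Hypothetical helper, not part of TLT.
count_ext() {
    find "$1" -maxdepth 1 -type f -iname "*.$2" | wc -l | tr -d ' '
}

# Example (path taken from the notebook command above):
#   count_ext "$RAW_DATA_DIR/images/test" png
#   count_ext "$RAW_DATA_DIR/images/test" jpg
```

If the jpg count is 0 while the png count is not, that would be consistent with the "Batch size should not be greater than the number of samples" error appearing even with a batch size of 1.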

pejc · January 16, 2021, 4:46pm

Alright, that solved it. It wasn’t specified in the TLT docs, so I wasn’t sure.

But now I get the same error as in this thread: Error running MaskRCNN inference after custom training - #6 by hyperlight

Any ETA on the fix?

Morganh · January 17, 2021, 2:38am

It will be fixed in the next release. Please stay tuned for it.