We have successfully trained a model using this example (https://developer.nvidia.com/blog/training-instance-segmentation-models-using-maskrcnn-on-the-transfer-learning-toolkit/). I ran into quite a few hiccups along the way, since there are some differences between the official documentation and this example. I was able to get around all of them with some trial and error, but I have hit a blocker at the inference step.
No matter what I try, I can't get the label file parameter to work properly to capture the COCO JSON output of the inference, and then the script crashes due to a misconfiguration. I have tried changing some of the settings in the configuration file, but I always get the same error message. I have put the initial values back so as not to break the default configuration provided with the examples.
Here is the command line I am using. I have placed 3 images in the input directory and set the batch size to 3, but no matter what batch size I use, I always get the same error.
tlt-infer mask_rcnn -i /data/testModel/input_dir -o /data/testModel/output_dir -e /workspace/examples/maskrcnn/specs/maskrcnn_train_resnet50.txt -m /data/tlt_instance_segmentation_vresnet50/model.step-5000.tlt -b 3 -l /data/testModel/label.txt --include_mask
Label file does not exist. Skipping…
Traceback (most recent call last):
File "/usr/local/bin/tlt-infer", line 8, in
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_infer.py", line 60, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 351, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 261, in infer
AssertionError: Batch size should not be greater than the number of samples.
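
Since the assertion compares the batch size with the number of samples the tool found, one way to narrow this down is to check what is actually visible at those paths from inside the container where tlt-infer runs. The Python sketch below is only a sanity check under stated assumptions: it assumes samples are image files sitting directly in the input directory, which is not confirmed to be how tlt-infer enumerates them. The function name preflight and the extension list are hypothetical and are not part of TLT.

```python
import os

# Assumption: these extensions are what counts as a "sample".
# The real tool may accept a different set.
IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def preflight(input_dir, label_file, batch_size):
    """Return a list of readable problems; an empty list means nothing obvious is wrong."""
    problems = []

    # Collect image files directly under input_dir (no recursion).
    if os.path.isdir(input_dir):
        images = sorted(
            f for f in os.listdir(input_dir) if f.lower().endswith(IMAGE_EXTS)
        )
    else:
        problems.append("input dir not found: " + input_dir)
        images = []

    # Mirrors the wording of the assertion in the traceback.
    if batch_size > len(images):
        problems.append(
            "batch size %d > %d images found" % (batch_size, len(images))
        )

    # Mirrors the "Label file does not exist" message.
    if not os.path.isfile(label_file):
        problems.append("label file not found: " + label_file)

    return problems


if __name__ == "__main__":
    # Paths and batch size taken from the command line in this post.
    for problem in preflight(
        "/data/testModel/input_dir", "/data/testModel/label.txt", 3
    ):
        print(problem)
```

If this reports zero images or a missing directory when run inside the container, the paths are probably not mounted where the command expects them, which would explain why changing the batch size makes no difference.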