tlt-train error when deploying mobilenet_v2 with DetectNet

m.billson16 · November 26, 2019, 10:20am

Okay Morganh, thank you for your big help. I will try narrowing down the resolution and run some experiments again.

Morganh · November 26, 2019, 10:27am

Hi m.billson16,
Glad to know you solved the issue.
mAP (mean average precision) is a separate topic. Please create a new topic if needed.

But first I want to point out that low mAP is expected at the beginning of training. It depends on batch size, number of epochs, etc.
For detectnet_v2, I suggest setting batch-size=4 and epoch=120 in your spec file, then checking the mAP at the end of training.
Thanks very much for using TLT!

Morganh · November 26, 2019, 10:29am

I mean you can reduce the number of images/labels. It is not related to resolution.

m.billson16 · November 26, 2019, 10:35am

Okay, thank you very much again, Morganh.
I want to ask: are batch-size=4 and epoch=120 the default values for training?

m.billson16 · November 26, 2019, 10:35am

Okay Morganh, Thank you very much.

m.billson16 · November 26, 2019, 10:47am

Hello Morganh.
I have narrowed my images/labels down to 50 images, with num_shards = 2 and val_split = 20, and I still get the same error.

Traceback (most recent call last):
  File "/usr/local/bin/tlt-dataset-convert", line 10, in <module>
    sys.exit(main())
  File "./detectnet_v2/scripts/dataset_convert.py", line 64, in main
  File "./detectnet_v2/dataio/dataset_converter_lib.py", line 74, in convert
  File "./detectnet_v2/dataio/dataset_converter_lib.py", line 108, in _write_partitions
  File "./detectnet_v2/dataio/dataset_converter_lib.py", line 149, in _write_shard
  File "./detectnet_v2/dataio/kitti_converter_lib.py", line 169, in _create_example_proto
  File "./detectnet_v2/dataio/kitti_converter_lib.py", line 272, in _add_targets
TypeError: object of type 'int' has no len()

I really don’t understand why this is happening again. Do you have any idea? I’m very sorry for troubling you.

Morganh · November 26, 2019, 12:15pm

I suggest you add a few images/labels at a time on top of the previous 47 images. If you hit the error, that means something is wrong in the images/labels you newly added.
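
To narrow down which newly added labels are suspect before re-running the converter, a quick format check can help. Below is a minimal sketch, assuming the labels are standard KITTI text files (one object per line, 15 whitespace-separated fields: a class name followed by 14 numeric values). The function names and the directory path are hypothetical, and this only checks the file format; it is not confirmed to be the cause of the TypeError above.

```python
from pathlib import Path

# Standard KITTI object label: class name plus 14 numeric values
# (truncation, occlusion, alpha, 4 bbox values, 3 dimensions, 3 location, rotation_y).
KITTI_FIELD_COUNT = 15


def check_label_line(line):
    """Return a problem description for one label line, or None if it looks fine."""
    fields = line.split()
    if len(fields) != KITTI_FIELD_COUNT:
        return f"expected {KITTI_FIELD_COUNT} fields, found {len(fields)}"
    # Every field after the class name should parse as a number.
    for position, value in enumerate(fields[1:], start=2):
        try:
            float(value)
        except ValueError:
            return f"field {position} is not numeric: {value!r}"
    return None


def check_label_dir(label_dir):
    """Scan every .txt file in label_dir; return (file name, line number, problem) tuples."""
    problems = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if not line.strip():
                continue  # skip blank lines
            problem = check_label_line(line)
            if problem:
                problems.append((path.name, number, problem))
    return problems


if __name__ == "__main__":
    # Hypothetical path: point this at the folder holding the newly added labels.
    for name, number, problem in check_label_dir("labels"):
        print(f"{name}:{number}: {problem}")
```

Any file this reports is a good first candidate to fix or leave out when adding images/labels back in small batches.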

m.billson16 · November 27, 2019, 6:45am

Okay Morganh, thank you very much. Thanks for the big help.
