Yes. The OOM is resolved by resizing the images as you suggested. Please find the updated spec file ‘detectnet_config.txt’ in question #1.
Have you generated new tfrecords files from the resized images and labels?
Also, can you paste the AP results for all the classes?
Yes, I have generated the tfrecords from the resized images and labels.
Epoch 101/120
=========================
Validation cost: -0.000010
Mean average_precision (in %): 0.0000
class name                  average precision (in %)
--------------------------  --------------------------
Closed-column-tip           0
Flare-tip                   0
Ladders                     0
Platforms                   0
Stack-diameter-change-zone  0
Stack-shells                0
Stack-tip                   0
Can you tell me the total number of images and the number of images for each class?
I have 1952 images.
Using TensorFlow backend.
2019-12-05 07:54:06,902 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2019-12-05 07:54:06,908 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 1562 Val: 390
2019-12-05 07:54:06,908 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2019-12-05 07:54:06,909 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
/usr/local/lib/python2.7/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:266: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2019-12-05 07:54:06,964 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2019-12-05 07:54:07,005 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2019-12-05 07:54:07,045 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
2019-12-05 07:54:07,091 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
2019-12-05 07:54:07,135 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
2019-12-05 07:54:07,176 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
2019-12-05 07:54:07,226 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
2019-12-05 07:54:07,280 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
2019-12-05 07:54:07,323 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
2019-12-05 07:54:07,366 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
closed-column-tip: 61
stack-diameter-change-zone: 42
platforms: 992
stack-shells: 752
ladders: 637
stack-tip: 130
flare-tip: 27
2019-12-05 07:54:07,366 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2019-12-05 07:54:07,542 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2019-12-05 07:54:07,714 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
2019-12-05 07:54:07,880 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
2019-12-05 07:54:08,053 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
2019-12-05 07:54:08,223 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
2019-12-05 07:54:08,395 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
2019-12-05 07:54:08,563 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
2019-12-05 07:54:08,744 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
2019-12-05 07:54:08,925 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
2019-12-05 07:54:09,100 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
closed-column-tip: 177
stack-diameter-change-zone: 154
stack-tip: 555
stack-shells: 2952
ladders: 2404
platforms: 2852
flare-tip: 59
2019-12-05 07:54:09,100 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2019-12-05 07:54:09,100 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO -
Wrote the following numbers of objects:
closed-column-tip: 238
stack-diameter-change-zone: 196
platforms: 3844
stack-shells: 3704
ladders: 3041
stack-tip: 685
flare-tip: 86
2019-12-05 07:54:09,100 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map.
Label in GT: Label in tfrecords file
Closed-column-tip: closed-column-tip
Stack-diameter-change-zone: stack-diameter-change-zone
Platforms: platforms
Stack-shells: stack-shells
Ladders: ladders
Stack-tip: stack-tip
Flare-tip: flare-tip
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2019-12-05 07:54:09,100 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.
Hi samjith888,
Please see line 67
“For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.”
And also https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#dataloader
The class names key in the target_class_mapping must be identical to the one shown in the dataset converter log, so that the correct classes are picked up for training.
Can you modify all the classes in your spec file? For example,
target_class_mapping {
key: "Platforms"
value: "Platforms"
}
change to
target_class_mapping {
key: "platforms"
value: "platforms"
}
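The same rename applies to each of the seven classes. A minimal sketch, assuming the class names listed in the converter log above; `build_target_class_mapping` is a hypothetical helper (not part of TLT) that only prints lowercase `target_class_mapping` entries to paste into the spec:

```python
# Class names as they appear in the label files (first letter capitalized).
LABEL_CLASSES = [
    "Closed-column-tip",
    "Stack-diameter-change-zone",
    "Platforms",
    "Stack-shells",
    "Ladders",
    "Stack-tip",
    "Flare-tip",
]


def build_target_class_mapping(class_names):
    """Return spec text with one lowercase target_class_mapping block per class."""
    blocks = []
    for name in class_names:
        lowered = name.lower()  # tfrecords labels are written in lowercase
        blocks.append(
            'target_class_mapping {\n'
            '  key: "%s"\n'
            '  value: "%s"\n'
            '}' % (lowered, lowered)
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(build_target_class_mapping(LABEL_CLASSES))
```

Generating the blocks from one list avoids leaving a single class with the old capitalized spelling.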
I have done it for target_class_mapping, but I got the following error…
Traceback (most recent call last):
File "/usr/local/bin/tlt-train-g1", line 10, in <module>
sys.exit(main())
File "./common/magnet_train.py", line 37, in main
File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
File "./detectnet_v2/scripts/train.py", line 632, in main
File "./detectnet_v2/scripts/train.py", line 556, in run_experiment
File "./detectnet_v2/scripts/train.py", line 490, in train_gridbox
File "./detectnet_v2/scripts/train.py", line 136, in run_training_loop
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 676, in run
run_metadata=run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1270, in run
raise six.reraise(*original_exc_info)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1327, in run
run_metadata=run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1091, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: bboxes class ID out of range [0, 7[, got-1
[[node BboxRasterizer_1/RasterizeBbox (defined at <string>:159) ]]
[[node resnet18_nopool_bn_detectnet_v2/block_4b_bn_1/AssignMovingAvg (defined at /opt/nvidia/third_party/keras/tensorflow_backend.py:186) ]]
Caused by op u'BboxRasterizer_1/RasterizeBbox', defined at:
File "/usr/local/bin/tlt-train-g1", line 10, in <module>
sys.exit(main())
File "./common/magnet_train.py", line 37, in main
File "</usr/local/lib/python2.7/dist-packages/decorator.pyc:decorator-gen-2>", line 2, in main
File "./detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
File "./detectnet_v2/scripts/train.py", line 632, in main
File "./detectnet_v2/scripts/train.py", line 556, in run_experiment
File "./detectnet_v2/scripts/train.py", line 466, in train_gridbox
File "./detectnet_v2/scripts/train.py", line 308, in build_training_graph
File "./detectnet_v2/scripts/train.py", line 215, in rasterize_tensors
File "./detectnet_v2/model/detectnet_model.py", line 557, in generate_ground_truth_tensors
File "./detectnet_v2/objectives/objective_set.py", line 256, in generate_ground_truth_tensors
File "./detectnet_v2/rasterizers/bbox_rasterizer.py", line 377, in rasterize_labels
File "./modulus/processors/processors.py", line 227, in __call__
File "./modulus/processors/bbox_rasterizer.py", line 190, in call
File "<string>", line 159, in rasterize_bbox
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): bboxes class ID out of range [0, 7[, got-1
[[node BboxRasterizer_1/RasterizeBbox (defined at <string>:159) ]]
[[node resnet18_nopool_bn_detectnet_v2/block_4b_bn_1/AssignMovingAvg (defined at /opt/nvidia/third_party/keras/tensorflow_backend.py:186) ]]
The first letter of every class name is capitalized in my label.txt file.
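The `got-1` in the error is consistent with at least one class name failing to resolve to an ID in the range 0 to 6. That is an assumption about the cause; the traceback itself does not confirm it. The sketch below is not TLT code: `class_id` and `unresolved` are hypothetical helpers that only illustrate how a case-sensitive lookup with a -1 fallback behaves when the spelling differs between two places.

```python
# Class names as written by the dataset converter (all lowercase).
TFRECORD_CLASSES = [
    "closed-column-tip",
    "flare-tip",
    "ladders",
    "platforms",
    "stack-diameter-change-zone",
    "stack-shells",
    "stack-tip",
]


def class_id(name, known_classes):
    """Return the index of name in known_classes, or -1 if it is absent.

    The comparison is case-sensitive, so "Platforms" does not match "platforms".
    """
    try:
        return known_classes.index(name)
    except ValueError:
        return -1


def unresolved(names, known_classes):
    """Return the names that do not resolve to a valid class ID."""
    return [n for n in names if class_id(n, known_classes) == -1]


if __name__ == "__main__":
    print(class_id("platforms", TFRECORD_CLASSES))
    print(class_id("Platforms", TFRECORD_CLASSES))
    print(unresolved(["Platforms", "ladders"], TFRECORD_CLASSES))
```

Running `unresolved` over every class name used in the spec would list the entries that still carry a spelling the tfrecords do not contain.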
Hi samjith888,
Please paste your latest spec here.
Hi samjith888,
Is your issue fixed? Can we close this topic? Thanks.
The issue was not solved, so I moved to Faster_rcnn.
Please paste your latest spec here.
The first letter of every class name in your labels is capitalized. In the tfrecords, all class names are written in lowercase.
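One way to check a spec for leftover capitalized names is to scan its quoted strings. This is a rough sketch under stated assumptions: `find_capitalized_class_names` is a hypothetical helper, the sample spec text is made up for illustration, and the scan simply looks at every double-quoted string without knowing which spec sections exist.

```python
import re


def find_capitalized_class_names(spec_text, tfrecord_classes):
    """Return quoted strings that match a tfrecords class only when case is ignored."""
    known = set(tfrecord_classes)
    hits = []
    for quoted in re.findall(r'"([^"]*)"', spec_text):
        # Flag names such as "Ladders" whose lowercase form is a known class.
        if quoted not in known and quoted.lower() in known and quoted not in hits:
            hits.append(quoted)
    return hits


# Illustrative spec fragment (an assumption, not the poster's actual spec).
SAMPLE_SPEC = '''
target_class_mapping {
  key: "platforms"
  value: "platforms"
}
target_class_mapping {
  key: "Ladders"
  value: "Ladders"
}
'''

if __name__ == "__main__":
    print(find_capitalized_class_names(SAMPLE_SPEC, ["platforms", "ladders"]))
```

An empty result means every quoted class name in the text already uses the lowercase spelling from the converter log.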