Error while training DetectNet_v2 with TAO Toolkit on the default notebook

I’m trying to run this cell:

!tao model detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

But I get the following error:

2024-03-08 17:12:34,417 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2024-03-08 17:12:34,467 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2024-03-08 17:12:34,479 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/franciscorocha/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2024-03-08 17:12:34,479 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
2024-03-08 23:12:36.068612: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2024-03-08 23:12:36,099 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2024-03-08 23:12:36,963 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:36,981 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:36,984 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:37,902 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:39,111 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:39,131 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:39,133 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-03-08 23:12:40,674 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.logging.logging 197: Log file already exists at /workspace/tao-experiments/detectnet_v2/experiment_dir_unpruned/status.json
2024-03-08 23:12:40,674 [TAO Toolkit] [INFO] root 2102: Starting DetectNet_v2 Training job
2024-03-08 23:12:40,675 [TAO Toolkit] [INFO] main 817: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt.
2024-03-08 23:12:40,675 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.spec_handler.spec_loader 113: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt
2024-03-08 23:12:40,676 [TAO Toolkit] [INFO] root 2102: 61:29 : ' dbscan_min_samples: 0.0500000007451': Couldn't parse integer: 0.0500000007451
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1702, in _ParseAbstractInteger
    return int(text, 0)
ValueError: invalid literal for int() with base 0: '0.0500000007451'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1652, in _ConsumeInteger
    result = ParseInteger(tokenizer.token, is_signed=is_signed, is_long=is_long)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1674, in ParseInteger
    result = _ParseAbstractInteger(text)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1704, in _ParseAbstractInteger
    raise ValueError('Couldn\'t parse integer: %s' % orig_text)
ValueError: Couldn't parse integer: 0.0500000007451

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1067, in <module>
    raise e
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1046, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
    return_args = fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1024, in main
    run_experiment(
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 821, in run_experiment
    experiment_spec = load_experiment_spec(
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/spec_handler/spec_loader.py", line 136, in load_experiment_spec
    experiment_spec = load_proto(spec_path, experiment_spec, default_spec_path,
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/spec_handler/spec_loader.py", line 114, in load_proto
    _load_from_file(spec_path, proto_buffer)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/spec_handler/spec_loader.py", line 100, in _load_from_file
    merge_text_proto(f.read(), pb2)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 719, in Merge
    return MergeLines(
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 793, in MergeLines
    return parser.MergeLines(lines, message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 818, in MergeLines
    self._ParseOrMerge(lines, message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 837, in _ParseOrMerge
    self._MergeField(tokenizer, message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField
    merger(tokenizer, message, field)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField
    merger(tokenizer, message, field)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField
    merger(tokenizer, message, field)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField
    merger(tokenizer, message, field)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField
    merger(tokenizer, message, field)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1076, in _MergeScalarField
    value = _ConsumeInt32(tokenizer)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1573, in _ConsumeInt32
    return _ConsumeInteger(tokenizer, is_signed=True, is_long=False)
  File "/usr/local/lib/python3.8/dist-packages/google/protobuf/text_format.py", line 1654, in _ConsumeInteger
    raise tokenizer.ParseError(str(e))
google.protobuf.text_format.ParseError: 61:29 : ' dbscan_min_samples: 0.0500000007451': Couldn't parse integer: 0.0500000007451
Execution status: FAIL
2024-03-08 17:12:42,667 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

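Please set dbscan_min_samples: 1 in the spec file. The traceback ends in _ConsumeInt32, so this field is parsed as an integer, and the value 0.0500000007451 at line 61 of detectnet_v2_train_resnet18_kitti.txt is rejected.
You can refer to https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/detectnet_v2.html#creating-a-configuration-file and to the reference spec tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt in the NVIDIA/tao_tutorials repository on GitHub.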
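
If the spec has a separate clustering block per target class, the same value may appear more than once. As a rough sketch (not part of TAO Toolkit), the snippet below scans spec text for dbscan_min_samples lines whose value fails the same int(text, 0) check shown in the traceback and rewrites them to 1. The helper names and the sample fragment, including its dbscan_eps value, are made up for illustration; only the field name and the failing value come from the log above.

```python
import re

# Matches a line such as "        dbscan_min_samples: 0.0500000007451".
# Group 1 keeps the indentation and field name, group 2 is the value.
FIELD = re.compile(r"^(\s*dbscan_min_samples\s*:\s*)(\S+)\s*$")


def is_int_literal(text):
    """Return True if int(text, 0) accepts the text, as in the traceback."""
    try:
        int(text, 0)
        return True
    except ValueError:
        return False


def fix_dbscan_min_samples(spec_text, replacement=1):
    """Hypothetical helper: replace non-integer dbscan_min_samples values.

    Returns the rewritten text and a list of (line_number, old_value) pairs.
    """
    fixed_lines = []
    changed = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        match = FIELD.match(line)
        if match and not is_int_literal(match.group(2)):
            changed.append((number, match.group(2)))
            line = f"{match.group(1)}{replacement}"
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n", changed


# Illustrative fragment only; the real spec file has more fields.
sample = """postprocessing_config {
  target_class_config {
    key: "car"
    value {
      clustering_config {
        dbscan_eps: 0.15
        dbscan_min_samples: 0.0500000007451
      }
    }
  }
}
"""

fixed, changed = fix_dbscan_min_samples(sample)
print(changed)
print(fixed)
```

To apply it to a real file, read your local copy of detectnet_v2_train_resnet18_kitti.txt into a string, pass it to fix_dbscan_min_samples, check the reported lines, and write the result back before rerunning the training cell.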
