Please provide the following information when requesting support.
• Hardware (T4)
• Network Type (Detectnet_v2)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
Configuration of the TAO Toolkit Instance
task_group:
    model:
        dockers:
            nvidia/tao/tao-toolkit:
                5.0.0-tf2.11.0:
                    docker_registry: nvcr.io
                    tasks:
                        1. classification_tf2
                        2. efficientdet_tf2
                5.0.0-tf1.15.5:
                    docker_registry: nvcr.io
                    tasks:
                        1. bpnet
                        2. classification_tf1
                        3. converter
                        4. detectnet_v2
                        5. dssd
                        6. efficientdet_tf1
                        7. faster_rcnn
                        8. fpenet
                        9. lprnet
                        10. mask_rcnn
                        11. multitask_classification
                        12. retinanet
                        13. ssd
                        14. unet
                        15. yolo_v3
                        16. yolo_v4
                        17. yolo_v4_tiny
                5.2.0-pyt2.1.0:
                    docker_registry: nvcr.io
                    tasks:
                        1. action_recognition
                        2. centerpose
                        3. deformable_detr
                        4. dino
                        5. mal
                        6. ml_recog
                        7. ocdnet
                        8. ocrnet
                        9. optical_inspection
                        10. pointpillars
                        11. pose_classification
                        12. re_identification
                        13. visual_changenet
                5.2.0-pyt1.14.0:
                    docker_registry: nvcr.io
                    tasks:
                        1. classification_pyt
                        2. segformer
    dataset:
        dockers:
            nvidia/tao/tao-toolkit:
                5.2.0-data-services:
                    docker_registry: nvcr.io
                    tasks:
                        1. augmentation
                        2. auto_label
                        3. annotations
                        4. analytics
    deploy:
        dockers:
            nvidia/tao/tao-toolkit:
                5.2.0-deploy:
                    docker_registry: nvcr.io
                    tasks:
                        1. visual_changenet
                        2. centerpose
                        3. classification_pyt
                        4. classification_tf1
                        5. classification_tf2
                        6. deformable_detr
                        7. detectnet_v2
                        8. dino
                        9. dssd
                        10. efficientdet_tf1
                        11. efficientdet_tf2
                        12. faster_rcnn
                        13. lprnet
                        14. mask_rcnn
                        15. ml_recog
                        16. multitask_classification
                        17. ocdnet
                        18. ocrnet
                        19. optical_inspection
                        20. retinanet
                        21. segformer
                        22. ssd
                        23. trtexec
                        24. unet
                        25. yolo_v3
                        26. yolo_v4
                        27. yolo_v4_tiny
format_version: 3.0
toolkit_version: 5.2.0
published_date: 12/06/2023
• Training spec file
detectnet_v2_tfrecords_kitti_trainval.txt (302 Bytes)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
!tao model detectnet_v2 dataset_convert \
    -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
    -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval \
    -r $USER_EXPERIMENT_DIR/
2024-01-13 11:48:42,411 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2024-01-13 11:48:42,485 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2024-01-13 11:48:42,499 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
2024-01-13 03:48:43.188617: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2024-01-13 03:48:43,242 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2024-01-13 03:48:44,965 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:45,008 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:45,012 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:46,917 [TAO Toolkit] [WARNING] matplotlib 500: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-oq_eer2u because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2024-01-13 03:48:47,220 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:49,173 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:49,214 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:49,218 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-01-13 03:48:49,829 [TAO Toolkit] [INFO] root 2102: Starting Object Detection Dataset Convert.
2024-01-13 03:48:49,830 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.build_converter 87: Instantiating a kitti converter
2024-01-13 03:48:49,830 [TAO Toolkit] [INFO] root 2102: Instantiating a kitti converter
2024-01-13 03:48:49,830 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 71: Creating output directory /workspace/tao-experiments/Data/tfrecords/kitti_trainval
2024-01-13 03:48:49,830 [TAO Toolkit] [INFO] root 2102: Generating partitions
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.kitti_converter_lib 176: Num images in
Train: 26 Val: 4
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] root 2102: Num images in
Train: 26 Val: 4
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.kitti_converter_lib 197: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] root 2102: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 0
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 0
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2024-01-13 03:48:49,831 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2024-01-13 03:48:49,831 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 1
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 1
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 2
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 2
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 3
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 3
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 4
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 4
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 5
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 5
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 6
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 6
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 7
2024-01-13 03:48:49,832 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 7
2024-01-13 03:48:49,833 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 8
2024-01-13 03:48:49,833 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 8
2024-01-13 03:48:49,833 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 9
2024-01-13 03:48:49,833 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 9
2024-01-13 03:48:49,838 [TAO Toolkit] [INFO] root 2102: [Errno 2] No such file or directory: '/workspace/tao-experiments/Data/Train/Labels/00002.txt'
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/dataset_convert.py", line 168, in <module>
    raise e
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/dataset_convert.py", line 137, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/dataset_convert.py", line 132, in main
    converter.convert()
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py", line 86, in convert
    object_count = self._write_partitions(partitions)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py", line 141, in _write_partitions
    shard_object_count = self._write_shard(shard, p, s)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py", line 185, in _write_shard
    example = self._create_example_proto(frame_id)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/kitti_converter_lib.py", line 223, in _create_example_proto
    self._add_targets(example, frame_id)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/kitti_converter_lib.py", line 329, in _add_targets
    with open(label_file) as lf:
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/tao-experiments/Data/Train/Labels/00002.txt'
Execution status: FAIL
2024-01-13 11:48:57,468 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
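The traceback shows the converter opening a label file named after an image (00002.txt) and not finding it under Data/Train/Labels. Since the command runs inside the container, that path is a container path, so the check has to be made against whichever host directory is mounted there. A minimal sketch for listing images that have no matching label file is below. It assumes that images and labels share the same base name, that the images sit in a sibling folder (the name "Images" and the extensions are hypothetical, they do not appear in the log), and that the paths in the example are placeholders for the real host-side directories.

```python
from pathlib import Path

# Assumption: an image and its KITTI label file share a base name,
# e.g. 00002.jpg <-> 00002.txt. The extensions are illustrative only.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_missing_labels(image_dir, label_dir):
    """Return sorted base names of images that have no matching .txt label."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    label_stems = {p.stem for p in label_dir.glob("*.txt")}
    image_stems = {
        p.stem
        for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    }
    return sorted(image_stems - label_stems)


if __name__ == "__main__":
    # Hypothetical host-side paths; replace them with the directories that
    # are mounted to /workspace/tao-experiments/Data/Train in the container.
    images = Path("Data/Train/Images")
    labels = Path("Data/Train/Labels")
    if images.is_dir() and labels.is_dir():
        missing = find_missing_labels(images, labels)
        for stem in missing:
            print(f"no label file for image: {stem}")
        print(f"{len(missing)} image(s) without labels")
    else:
        print("image or label directory not found; adjust the paths above")
```

If the script reports 00002 (or other names), the label files are absent or named differently from the images. If it reports nothing, the files exist on the host and the mismatch is more likely between the host directory and the path the container sees.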