TAO Toolkit - GazeNet dataset_convert error - OSError: Unable to find a data source

• Hardware (RTX 3080)
• Network Type (GazeNet)
• TAO Version

(nvidiaVenv) sam@MSI:~$ tao info --verbose
Configuration of the TAO Toolkit Instance

dockers:
        nvidia/tao/tao-toolkit-tf:
                v3.21.11-tf1.15.5-py3:
                        docker_registry: nvcr.io
                        tasks:
                                1. augment
                                2. bpnet
                                3. classification
                                4. dssd
                                5. emotionnet
                                6. efficientdet
                                7. fpenet
                                8. gazenet
                                9. gesturenet
                                10. heartratenet
                                11. lprnet
                                12. mask_rcnn
                                13. multitask_classification
                                14. retinanet
                                15. ssd
                                16. unet
                                17. yolo_v3
                                18. yolo_v4
                                19. yolo_v4_tiny
                                20. converter
                v3.21.11-tf1.15.4-py3:
                        docker_registry: nvcr.io
                        tasks:
                                1. detectnet_v2
                                2. faster_rcnn
        nvidia/tao/tao-toolkit-pyt:
                v3.21.11-py3:
                        docker_registry: nvcr.io
                        tasks:
                                1. speech_to_text
                                2. speech_to_text_citrinet
                                3. text_classification
                                4. question_answering
                                5. token_classification
                                6. intent_slot_classification
                                7. punctuation_and_capitalization
                                8. spectro_gen
                                9. vocoder
                                10. action_recognition
        nvidia/tao/tao-toolkit-lm:
                v3.21.08-py3:
                        docker_registry: nvcr.io
                        tasks:
                                1. n_gram
format_version: 2.0
toolkit_version: 3.21.11
published_date: 11/08/2021

• Training spec file (I am using the exact instructions/training specs from the GazeNet Jupyter Notebook)
• How to reproduce the issue ?
To reproduce the error, go through the GazeNet Jupyter Notebook and follow the instructions exactly. It is on this command that I get an error:

tao gazenet dataset_convert -folder-suffix pipeline \
                             -norm_folder_name Norm_Data \
                             -sets p01-day03 \
                             -data_root_path /home/sam/cv_samples_v1.3.0/gazenet/MPIIFaceGaze/sample-dataset

Please note that I have adjusted the data_root_path, and I believe that is the only change I made while following the Jupyter notebook.
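
One general TAO launcher detail that may be relevant (this is an assumption about the setup, not something shown in the output below): the tao command runs the task inside a Docker container, so data_root_path has to be a path that is mounted into that container via ~/.tao_mounts.json. A minimal sketch of how to check this, assuming the host project directory is mapped to the same path inside the container:

# Sketch only: inspect the launcher's mount configuration and confirm that
# data_root_path falls under one of the source/destination pairs.
cat ~/.tao_mounts.json
# Example of a minimal mapping (the actual paths in your setup may differ):
# {
#   "Mounts": [
#     {
#       "source": "/home/sam/cv_samples_v1.3.0/gazenet",
#       "destination": "/home/sam/cv_samples_v1.3.0/gazenet"
#     }
#   ]
# }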

When running the command, quite a few TensorFlow warnings appear, with the error traceback at the end of the output:

2022-01-18 19:15:09.075541: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-h6t_1sup because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/augmentation_helper.py:22: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

2022-01-18 19:15:10,957 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/augmentation_helper.py:22: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/augmentation_helper.py:22: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

2022-01-18 19:15:10,957 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/augmentation_helper.py:22: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/gazenet_dataloader_augmentation_V2.py:38: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

2022-01-18 19:15:10,958 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/dataloader/gazenet_dataloader_augmentation_V2.py:38: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.
2022-01-18 19:15:11,039 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/data_converter.py:159: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/data_converter.py:162: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/scripts/dataset_convert.py", line 85, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/scripts/dataset_convert.py", line 78, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/tfrecord_manager.py", line 60, in __init__
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/tfrecord_manager.py", line 102, in _extract_cosmos_paths
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/set_strategy_generator.py", line 50, in __init__
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/gaze_strategy.py", line 57, in __init__
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/dataio/gaze_strategy.py", line 198, in _set_source_paths
OSError: Unable to find a data source
Traceback (most recent call last):
  File "/usr/local/bin/gazenet", line 8, in <module>
    sys.exit(main())
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/mitgazenet/entrypoint/gazenet.py", line 13, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/common/entrypoint/entrypoint.py", line 300, in launch_job
AssertionError: Process run failed.
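
A quick host-side check can also confirm whether the set passed to -sets actually contains converted data under data_root_path. This is only a sketch: the Data and json_datafactory_v2 subfolder names are assumptions about what the notebook's MPIIFaceGaze conversion step produces, not something confirmed by the logs above.

# Sketch: verify the assumed per-set layout exists before running dataset_convert
# (the Data and json_datafactory_v2 folder names are assumptions).
ROOT=/home/sam/cv_samples_v1.3.0/gazenet/MPIIFaceGaze/sample-dataset
for s in p01-day03; do
  ls -d "$ROOT/$s" "$ROOT/$s/Data" "$ROOT/$s/json_datafactory_v2" \
    || echo "missing data under $s"
done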

I believe this developer forum post describes a similar issue for the GazeNet model in the comments near the bottom, so I am unsure whether this is the same underlying issue or not.

Please note that I am using WSL2, Ubuntu 20.04, and CUDA 11.6.

The developer forum post is different from yours.
In that topic, the end user is using their own images.

For your topic, did you follow the notebook without any change? If yes, I believe there is no issue. I just ran it yesterday.

I reverted the Jupyter notebook to its original form, redownloaded the dataset, and reran the commands, and it worked fine this time. I definitely did have the issue for a while, so I'm not sure exactly what caused it initially.

I super appreciate your response and help on this!

Thanks for the info. Glad to see the issue is gone.
