TFRecords are not generated, but the output shows TFRecords generated

Please provide the following information when requesting support.

• Configuration of the TAO Toolkit Instance

dockers:
nvidia/tao/tao-toolkit-tf:
v3.21.11-tf1.15.5-py3:
nvidia/tao/tao-toolkit-lm:
format_version: 2.0
toolkit_version: 3.21.11
published_date: 11/08/2021

• When I train the model, it shows the error on the right side of the screenshot, and the tfrecords folder that is created is empty.

os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_train.txt -o " + self.data_download + "/training/tfrecords/train")
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_val.txt -o " + self.data_download + "/val/tfrecords/val")
I am using these commands for tfrecords generation.
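One pitfall with the `os.system` calls above is that they return the shell exit status instead of raising on failure, so a failed `dataset_convert` run can pass silently. A minimal sketch of a stricter wrapper (the `run_tao` helper name is my own, not part of TAO):

```python
import subprocess

def run_tao(cmd):
    # subprocess.run with check=True raises CalledProcessError on a
    # non-zero exit status, instead of failing silently like os.system.
    return subprocess.run(cmd, shell=True, check=True)

# A harmless command standing in for the real tao call:
result = run_tao("echo dataset_convert finished")
print(result.returncode)  # 0 on success
```

With this wrapper, a conversion that exits non-zero stops the pipeline immediately instead of leaving an empty tfrecords folder to be discovered later.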

Please share the log from when you generate the tfrecord files.

The logs look like this, and the tfrecords folder is created as shown in the other screenshot, but the folder is empty.


Folder created.

As the output below says, no images were found.

So, please double check the path to your images.

Can you share the spec file you used when generating the tfrecords?

Can you share the command you used to generate the tfrecords?

This is the configuration file:
[PATH]
USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4_tiny
DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
LOCAL_PROJECT_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline
LOCAL_DATA_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline/data
LOCAL_EXPERIMENT_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline/yolo_v4_tiny
SPECS_DIR=/workspace/tao-experiments/yolo_v4_tiny/specs
KEY=nvidia_tlt
LOCAL_SPECS=/home/vlabs-ashu/Desktop/TAO_pipeline/specs

Reading the paths:
self.user_experiment = str(cfg['PATH']['USER_EXPERIMENT_DIR'])
self.data_download = str(cfg['PATH']['DATA_DOWNLOAD_DIR'])
self.local_project = str(cfg['PATH']['LOCAL_PROJECT_DIR'])
self.local_data = str(cfg['PATH']['LOCAL_DATA_DIR'])
self.local_experiment = str(cfg['PATH']['LOCAL_EXPERIMENT_DIR'])
self.specs_dir = str(cfg['PATH']['SPECS_DIR'])
self.local_specs = str(cfg['PATH']['LOCAL_SPECS'])
self.key = str(cfg['PATH']['KEY'])
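For context, the `cfg` object above is presumably a `configparser.ConfigParser` loaded from the `[PATH]` file shown earlier. A self-contained sketch of that reading pattern (with only a few of the keys, for illustration):

```python
import configparser

# Parse an INI-style config like the one posted above.
cfg = configparser.ConfigParser()
cfg.read_string("""
[PATH]
USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4_tiny
DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
KEY=nvidia_tlt
""")

# Option lookups are case-insensitive; section names are not.
data_download = cfg["PATH"]["DATA_DOWNLOAD_DIR"]
print(data_download)  # /workspace/tao-experiments/data
```

Note that `DATA_DOWNLOAD_DIR` is a path inside the docker container, while the `LOCAL_*` entries are host paths; mixing the two up is exactly the kind of mistake the mount file `~/.tao_mounts.json` is meant to resolve.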

Generating the tfrecords:
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_train.txt -o " + self.data_download + "/training/tfrecords/train")
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_val.txt -o " + self.data_download + "/val/tfrecords/val")
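Since the convert step can create the output folder and still write nothing into it (which is what happened in this thread), a simple sanity check right after the calls catches the empty-folder case. A sketch, assuming the output directories follow the commands above:

```python
import os

def tfrecords_present(directory):
    # True only if the directory exists and contains at least one file,
    # i.e. dataset_convert actually produced output.
    return os.path.isdir(directory) and len(os.listdir(directory)) > 0

# e.g. after the dataset_convert calls:
# if not tfrecords_present(self.data_download + "/training/tfrecords"):
#     raise RuntimeError("dataset_convert produced no tfrecord files")
```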

Usually this kind of issue results from:

  • The path is not correct. Please make sure the images are available at the correct path.
  • Make sure the extensions of the images are correct.
  • Please note that the path should be the path inside the docker, so double-check ~/.tao_mounts.json.
  • For debugging, you can run "tao detectnet_v2" to trigger an interactive session to check.

Thanks, resolved. The extensions of the images were not correct.

The tfrecords are generated, but I am still getting an error during training.


This is the error, but when I debug, the tfrecords are there.

Please double-check that you set the correct tfrecords file path in the training spec file.
Similar topic is ValueError: No dataset tfrecords file found at path - #7 by Morganh

I am able to train the model on my system, but when I try to train on another system, I get this type of error.

So the original issue is fixed on your side, right? Could you share your experience and the root cause?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.