Please provide the following information when requesting support.
• Configuration of the TAO Toolkit Instance
dockers:
    nvidia/tao/tao-toolkit-tf:
        v3.21.11-tf1.15.5-py3:
    nvidia/tao/tao-toolkit-lm:
format_version: 2.0
toolkit_version: 3.21.11
published_date: 11/08/2021
• When I train the model, it shows the error on the right side of the screenshot, and the tfRecord folder that was created is empty.
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_train.txt -o " + self.data_download + "/training/tfrecords/train")
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_val.txt -o " + self.data_download + "/val/tfrecords/val")
I am using these commands to generate the tfRecords.
Morganh
#2
Please share the log from when you generate the tfrecord files.
The logs look like this. The tfRecords folder is created, as shown in the other screenshot, but it is empty.
Morganh
#5
As the message below says, no images were found.
So, please double-check the path to your images.
Can you share the spec file you used to generate the tfrecords?
Morganh
#8
Can you share the command you used to generate the tfrecords?
This is the configuration file:
[PATH]
USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4_tiny
DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
LOCAL_PROJECT_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline
LOCAL_DATA_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline/data
LOCAL_EXPERIMENT_DIR=/home/vlabs-ashu/Desktop/TAO_pipeline/yolo_v4_tiny
SPECS_DIR=/workspace/tao-experiments/yolo_v4_tiny/specs
KEY=nvidia_tlt
LOCAL_SPECS=/home/vlabs-ashu/Desktop/TAO_pipeline/specs
Reading paths:
self.user_experiment = str(cfg['PATH']['USER_EXPERIMENT_DIR'])
self.data_download = str(cfg['PATH']['DATA_DOWNLOAD_DIR'])
self.local_project = str(cfg['PATH']['LOCAL_PROJECT_DIR'])
self.local_data = str(cfg['PATH']['LOCAL_DATA_DIR'])
self.local_experiment = str(cfg['PATH']['LOCAL_EXPERIMENT_DIR'])
self.specs_dir = str(cfg['PATH']['SPECS_DIR'])
self.local_specs = str(cfg['PATH']['LOCAL_SPECS'])
self.key = str(cfg['PATH']['KEY'])
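The thread does not show how `cfg` is created. A minimal sketch, assuming it is a standard-library `configparser.ConfigParser` loaded from the `[PATH]` section above, might look like the following. The function name `load_paths` and the `REQUIRED_KEYS` tuple are hypothetical and only illustrate failing early when a key is missing.

```python
import configparser

# Hypothetical subset of the keys used by the conversion commands.
REQUIRED_KEYS = ("SPECS_DIR", "DATA_DOWNLOAD_DIR")

SAMPLE_CONFIG = """
[PATH]
DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
SPECS_DIR=/workspace/tao-experiments/yolo_v4_tiny/specs
"""


def load_paths(config_text):
    """Parse the [PATH] section and return the required entries as a dict."""
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    section = cfg["PATH"]
    # Option lookups are case-insensitive by default in configparser.
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError(f"missing keys in [PATH]: {missing}")
    return {key: section[key] for key in REQUIRED_KEYS}


paths = load_paths(SAMPLE_CONFIG)
print(paths["SPECS_DIR"])
```

Values read this way are already strings, so the `str(...)` wrappers in the original snippet are harmless but not required under this assumption.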
Generating tfRecords
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_train.txt -o " + self.data_download + "/training/tfrecords/train")
os.system("tao yolo_v4_tiny dataset_convert -d " + self.specs_dir + "/yolo_v4_tiny_tfrecords_kitti_val.txt -o " + self.data_download + "/val/tfrecords/val")
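`os.system` discards the converter's output and makes a failed run easy to miss. As a sketch only, the same command could be issued through `subprocess.run` so that the exit code and log are captured. The helper names `build_convert_command` and `run_convert` are hypothetical; the `tao` arguments are the ones already used in this thread.

```python
import subprocess


def build_convert_command(spec_file, output_prefix):
    """Build the argument list for the dataset_convert call used in this thread."""
    return [
        "tao", "yolo_v4_tiny", "dataset_convert",
        "-d", spec_file,
        "-o", output_prefix,
    ]


def run_convert(spec_file, output_prefix):
    """Run the converter and raise if it exits with a non-zero status."""
    cmd = build_convert_command(spec_file, output_prefix)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"dataset_convert failed ({result.returncode}):\n{result.stderr}"
        )
    return result.stdout


cmd = build_convert_command(
    "/workspace/tao-experiments/yolo_v4_tiny/specs/yolo_v4_tiny_tfrecords_kitti_train.txt",
    "/workspace/tao-experiments/data/training/tfrecords/train",
)
print(" ".join(cmd))
```

A zero exit code may not guarantee that any records were written, so it is still worth reading the captured log and checking the output folder.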
Morganh
#10
Usually this kind of issue results from one of the following:
- The path is not correct. Please make sure the images are available at the correct path.
- The image file extensions are not correct.
- The path must be the path inside the docker container, so double-check ~/.tao_mounts.json.
- For debugging, you can run "tao detectnet_v2" to start an interactive session and check.
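The extension check in the list above can be sketched in a few lines of Python. This assumes you know which extension your conversion spec expects. The function name `find_mismatched_images` and the example directory are hypothetical. The comparison is deliberately case-sensitive, so `.PNG` is reported when `.png` is expected.

```python
from pathlib import Path


def find_mismatched_images(image_dir, expected_ext):
    """Return names of files in image_dir whose suffix differs from expected_ext.

    The comparison is case-sensitive, so "a.PNG" is reported for ".png".
    """
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix != expected_ext
    )


if __name__ == "__main__":
    # Hypothetical host-side image folder; adjust to your own layout.
    image_dir = Path("/home/user/TAO_pipeline/data/training/image_2")
    if image_dir.is_dir():
        print(find_mismatched_images(image_dir, ".png"))
```

Run it against the directory as seen from wherever the script executes: the host path on the host, or the mounted path inside the container.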
Thanks, resolved. The image extensions were not correct.
The tfrecords are now generated, but I am still getting an error during training.
This is the error, but when I debug, the tfRecords are there.
Morganh
#14
Please double-check that the tfrecords file path is set correctly in the training spec file.
A similar topic is "ValueError: No dataset tfrecords file found at path" (#7 by Morganh).
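One way to confirm that the path in the training spec actually matches files on disk is to list everything that starts with the output prefix passed to `-o`. This is a sketch under the assumption that the converter writes one or more files whose names begin with that prefix. The function name `list_tfrecords` is hypothetical.

```python
import glob
import os


def list_tfrecords(output_prefix):
    """Return (path, size_in_bytes) for each file whose path starts with output_prefix."""
    return sorted(
        (path, os.path.getsize(path))
        for path in glob.glob(output_prefix + "*")
        if os.path.isfile(path)
    )


if __name__ == "__main__":
    # Prefix taken from the commands in this thread (container path).
    for path, size in list_tfrecords("/workspace/tao-experiments/data/training/tfrecords/train"):
        print(f"{size:>12d}  {path}")
```

An empty list, or files of size 0, suggests the training spec points at a location where no records were written. When the check runs on the host, the container path has to be translated to the corresponding host path from ~/.tao_mounts.json first.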
I am able to train the model on my system, but when I try to train on another system, I get this type of error.
So the original issue is fixed on your side, right? Could you share your experience and what the root cause was?