Hello. When I execute tao model yolo_v3 dataset_convert -d /workspace/tao-experiments/kitti_config -o /workspace/tao-experiments/output_tfrecords, I encounter an error: tensorflow.python.framework.errors_impl.PermissionDeniedError: /workspace/tao-experiments/output_tfrecords-fold-000-of-002-shard-00000-of-00010; Permission denied. I have checked the UID and GID for Docker, and I have confirmed the write permissions for the mounted output_tfrecords directory, but I cannot resolve this error.
Could you share the ~/.tao_mounts.json file?
.tao_mounts.zip (500 Bytes)
Could you remove "user": "1000:1000", and retry?
Succeeded. Thank you.
I successfully converted the dataset, but when I tried to run tao model yolo_v3 train, I encountered the error “tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0”. It seems like the TFRecord files were not properly generated. Some of the files turned out to be empty. Do you know what could be the cause?
Do you have only a few label files?
Please check val_split and num_shards in the spec file when you run dataset_convert. If there are 15 label files, num_shards=10 and val_split=20, then only 15*20%=3 labels go to the 0-index fold. Those 3 labels cannot be split across 10 shards, so empty tfrecords files are generated.
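The arithmetic above can be checked with a small sketch. This is not TAO's sharding code; the helper name shard_estimate is hypothetical, and it assumes the validation count is rounded down. It only computes a lower bound: if a fold has fewer labels than shards, at least the difference must be empty, however the labels are distributed.

```python
def shard_estimate(num_labels: int, val_split_percent: float, num_shards: int) -> dict:
    """Estimate fold sizes and the minimum number of empty shards per fold.

    Assumption: the validation fold gets floor(num_labels * val_split / 100)
    labels and the remainder goes to training. The real converter may round
    differently.
    """
    val_labels = int(num_labels * val_split_percent / 100)
    train_labels = num_labels - val_labels
    return {
        "val_labels": val_labels,
        "train_labels": train_labels,
        # Fewer labels than shards means some shards cannot receive any record.
        "min_empty_val_shards": max(0, num_shards - val_labels),
        "min_empty_train_shards": max(0, num_shards - train_labels),
    }


if __name__ == "__main__":
    # The numbers from the reply above: 15 labels, val_split=20, num_shards=10.
    print(shard_estimate(15, 20, 10))
```

Lowering num_shards until it is no larger than the smallest fold avoids the guaranteed-empty case in this estimate.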
The issue with empty files being generated has been resolved, but the error “tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0” is still occurring.
Could you upload the full log? You can upload a txt file using
Where can I get the log file? Is it okay to just paste the execution results from the terminal?
Please copy the log when you run tao model yolo_v3 train.
log.txt (8.8 KB)
Attached.
Could you upload the training spec file?
Attached.
spec.txt (2.2 KB)
Can you run the commands below and share the results?
tao model yolo_v3 run ls -rltsh /workspace/tao-experiments/data/output_tfrecords/*
and
tao model yolo_v3 run ls /workspace/tao-experiments/data/train |wc -l
log.txt (2.2 KB)
So, there are no tfrecords files under /workspace/tao-experiments/data/output_tfrecords/. This is not expected.
Please make sure the path is available.
Please note that the path is defined in the ~/.tao_mounts.json file.
To debug, you can open a terminal and then debug inside the docker:
$ tao model yolo_v3 run /bin/bash
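Since the container paths come from ~/.tao_mounts.json, it can help to translate a container path back to the host path and look there directly. A minimal sketch, assuming the mounts file has a "Mounts" list whose entries carry "source" (host) and "destination" (container) keys; the host directory in the example is made up, and container_to_host is a hypothetical helper, not part of the TAO launcher.

```python
from pathlib import PurePosixPath
from typing import Optional


def container_to_host(container_path: str, mounts: dict) -> Optional[str]:
    """Map a path inside the container to the matching host path.

    Returns None when no mount destination is a parent of container_path.
    Assumes the "Mounts" / "source" / "destination" layout described above.
    """
    target = PurePosixPath(container_path)
    for mount in mounts.get("Mounts", []):
        destination = PurePosixPath(mount["destination"])
        try:
            relative = target.relative_to(destination)
        except ValueError:
            continue  # this mount does not cover the requested path
        return str(PurePosixPath(mount["source"]) / relative)
    return None


if __name__ == "__main__":
    # Hypothetical mapping; in practice load it with
    # json.load(open(os.path.expanduser("~/.tao_mounts.json"))).
    example = {
        "Mounts": [
            {"source": "/home/user/tao-experiments",
             "destination": "/workspace/tao-experiments"}
        ]
    }
    print(container_to_host(
        "/workspace/tao-experiments/data/output_tfrecords", example))
```

If the function returns None, the path is not covered by any mount and will not be visible inside the docker.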
I have confirmed that the file path is correct. Does this mean that the tfrecords file has not been generated correctly?
According to the log above, the path is not available. Please double check. You can debug inside the docker.
In your dataset_convert command line, the output is /workspace/tao-experiments/output_tfrecords instead of /workspace/tao-experiments/data/output_tfrecords.
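To see which of the two locations actually holds the shards, note that the file name in the first error message (output_tfrecords-fold-000-of-002-shard-00000-of-00010) suggests the -o value is used as a file name prefix. A minimal sketch under that assumption, with hypothetical helper names, that lists every file starting with a given prefix and flags the zero-byte ones; run it inside the docker or adjust the paths to the host side.

```python
import glob
import os


def list_shards(output_prefix: str) -> list:
    """Return sorted (path, size_in_bytes) pairs for files starting with output_prefix."""
    pattern = glob.escape(output_prefix) + "*"
    return sorted(
        (path, os.path.getsize(path))
        for path in glob.glob(pattern)
        if os.path.isfile(path)
    )


def empty_shards(output_prefix: str) -> list:
    """Return the paths of zero-byte files among the shards for output_prefix."""
    return [path for path, size in list_shards(output_prefix) if size == 0]


if __name__ == "__main__":
    # The two candidate prefixes discussed in this thread.
    for prefix in (
        "/workspace/tao-experiments/output_tfrecords",
        "/workspace/tao-experiments/data/output_tfrecords",
    ):
        shards = list_shards(prefix)
        print(prefix, "->", len(shards), "files,",
              len(empty_shards(prefix)), "empty")
```

The prefix that reports files is the one the training spec should point to.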