No such file or directory: 'data/stapler/sub-train-annotations-bbox.csv'

Hello, I am trying to follow Dusty’s Jetson Nano training guide on GitHub. Once I had my dataset annotated and ready to train, I ran the suggested command and got an error about a .csv file that I don’t have.

Here is the console output:

root@jb-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --data=data/stapler --model=models/stapler --batch-size=1 --epochs=2
2023-02-21 01:34:03 - Using CUDA…
2023-02-21 01:34:03 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=1, checkpoint_folder=‘models/stapler’, dataset_type=‘open_images’, datasets=[‘data/stapler’], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones=‘80,100’, momentum=0.9, net=‘mb1-ssd’, num_epochs=2, num_workers=2, pretrained_ssd=‘models/mobilenet-v1-ssd-mp-0_675.pth’, resume=None, scheduler=‘cosine’, t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2023-02-21 01:34:03 - Prepare training datasets.
2023-02-21 01:34:03 - loading annotations from: data/stapler/sub-train-annotations-bbox.csv
Traceback (most recent call last):
File “train_ssd.py”, line 221, in <module>
dataset_type=“train”, balance_data=args.balance_data)
File “/jetson-inference/python/training/detection/ssd/vision/datasets/open_images.py”, line 19, in __init__
self.data, self.class_names, self.class_dict = self._read_data()
File “/jetson-inference/python/training/detection/ssd/vision/datasets/open_images.py”, line 65, in _read_data
annotations = pd.read_csv(annotation_file)
File “/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py”, line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File “/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py”, line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File “/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py”, line 948, in __init__
self._make_engine(self.engine)
File “/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py”, line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File “/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py”, line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File “pandas/_libs/parsers.pyx”, line 382, in pandas._libs.parsers.TextReader.__cinit__
File “pandas/_libs/parsers.pyx”, line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: ‘data/stapler/sub-train-annotations-bbox.csv’

Thanks for any suggestions

I figured out that I was missing
--dataset-type=voc
in my command.
I then got an error about a missing “test.txt” file, so I just created a blank one in the ImageSets/Main folder.
Now it seems to be training.
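For anyone hitting the same missing-file errors, here is a minimal sketch of a pre-flight check you could run before training with --dataset-type=voc. The list of expected entries is an assumption based on the standard Pascal VOC layout plus the two split files mentioned in this thread; `missing_voc_entries` is a hypothetical helper, not part of jetson-inference.

```python
from pathlib import Path

# Assumed layout: standard Pascal VOC folders plus the split files
# named in this thread. Adjust if your dataset is organized differently.
EXPECTED_ENTRIES = [
    "Annotations",
    "JPEGImages",
    "ImageSets/Main/trainval.txt",
    "ImageSets/Main/test.txt",
]


def missing_voc_entries(dataset_root):
    """Return the expected entries that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # "data/stapler" is the dataset path used in the command above.
    print(missing_voc_entries("data/stapler"))
```

An empty list means every expected entry was found. It does not check that the files have any content.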

A new error popped up. I can open another topic if needed. Here is what I get:

File “train_ssd.py”, line 346, in <module>
val_loss, val_regression_loss, val_classification_loss = test(val_loader, net, criterion, DEVICE)
File “train_ssd.py”, line 165, in test
return running_loss / num, running_regression_loss / num, running_classification_loss / num
ZeroDivisionError: float division by zero

Hi @jpbaehr, I think this error is because the test.txt file you created is blank. Try copying your trainval.txt file to test.txt instead.
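The copy suggested above can be done by hand, or with a short script like the sketch below. `copy_trainval_to_test` is a hypothetical helper name, and the ImageSets/Main location is taken from this thread.

```python
import shutil
from pathlib import Path


def copy_trainval_to_test(dataset_root):
    """Copy ImageSets/Main/trainval.txt to ImageSets/Main/test.txt.

    Overwrites an existing test.txt and returns the destination path.
    Raises FileNotFoundError if trainval.txt does not exist.
    """
    main_dir = Path(dataset_root) / "ImageSets" / "Main"
    src = main_dir / "trainval.txt"
    dst = main_dir / "test.txt"
    shutil.copyfile(src, dst)
    return dst


# Example (dataset path from the command in this thread):
# copy_trainval_to_test("data/stapler")
```

Keep in mind that with this workaround the validation set is the same as the training set, so the validation loss is not an independent measure of how well the model generalizes.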

Thanks for the suggestion. Unfortunately, I am still experiencing the issue. I cannot find similar issues online, so anything helps, as I am also new to machine learning!

Here is the full console:
root@jb-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --dataset-type=voc --data=data/stapler --model-dir=models/stapler --batch-size=4 --workers=2 --epochs=3
2023-02-21 20:28:30 - Using CUDA…
2023-02-21 20:28:30 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=4, checkpoint_folder=‘models/stapler’, dataset_type=‘voc’, datasets=[‘data/stapler’], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones=‘80,100’, momentum=0.9, net=‘mb1-ssd’, num_epochs=3, num_workers=2, pretrained_ssd=‘models/mobilenet-v1-ssd-mp-0_675.pth’, resume=None, scheduler=‘cosine’, t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2023-02-21 20:28:30 - Prepare training datasets.
2023-02-21 20:28:31 - VOC Labels read from file: (‘BACKGROUND’, ‘Stapler’)
2023-02-21 20:28:31 - Stored labels into file models/stapler/labels.txt.
2023-02-21 20:28:31 - Train dataset size: 61
2023-02-21 20:28:31 - Prepare Validation datasets.
2023-02-21 20:28:31 - VOC Labels read from file: (‘BACKGROUND’, ‘Stapler’)
2023-02-21 20:28:31 - Validation dataset size: 0
2023-02-21 20:28:31 - Build network.
2023-02-21 20:28:31 - Init from pretrained ssd models/mobilenet-v1-ssd-mp-0_675.pth
2023-02-21 20:28:31 - Took 0.54 seconds to load the model.
2023-02-21 20:29:10 - Learning rate: 0.01, Base net learning rate: 0.001, Extra Layers learning rate: 0.01.
2023-02-21 20:29:10 - Uses CosineAnnealingLR scheduler.
2023-02-21 20:29:10 - Start training from epoch 0.
/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
“https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate”, UserWarning)
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction=‘sum’ instead.
warnings.warn(warning.format(ret))
2023-02-21 20:31:04 - Epoch: 0, Step: 10/16, Avg Loss: 8.6959, Avg Regression Loss 3.0986, Avg Classification Loss: 5.5973
Traceback (most recent call last):
File “train_ssd.py”, line 346, in <module>
val_loss, val_regression_loss, val_classification_loss = test(val_loader, net, criterion, DEVICE)
File “train_ssd.py”, line 165, in test
return running_loss / num, running_regression_loss / num, running_classification_loss / num
ZeroDivisionError: float division by zero

I think your ImageSets/Main/test.txt file may still be blank? It’s not loading any samples for the validation set (the log shows “Validation dataset size: 0”).
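A quick way to confirm this is to count the image IDs in the split file of the dataset you are actually passing to --data. This is a minimal sketch, assuming one image ID per line; `count_image_ids` is a hypothetical helper, not something from train_ssd.py.

```python
from pathlib import Path


def count_image_ids(split_file):
    """Count the non-blank lines in a split file (assumed: one image ID per line).

    Returns 0 if the file is missing or contains only whitespace.
    """
    path = Path(split_file)
    if not path.is_file():
        return 0
    with path.open() as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Paths taken from this thread; change them to match your dataset.
    for name in ("trainval.txt", "test.txt"):
        split = Path("data/stapler") / "ImageSets" / "Main" / name
        print(split, count_image_ids(split))
```

A count of 0 for test.txt would match the empty validation set in the log. With nothing to validate on, the averages returned by test() divide by zero, which is the ZeroDivisionError in the traceback.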

Silly me, I edited the ImageSets folder of the wrong dataset, not the one I was targeting for training.
So far it seems to be working, as it trained and exited the program! Thanks Dusty, you da man.

No worries, glad you got it working!
You’ll probably want to train it for more than 3 epochs to get a usable model. I typically train for around 30 epochs, but YMMV.
