train_ssd.py doesn't work with Pascal VOC dataset

Hi everyone, I am trying to train an SSD-Mobilenet object detector on a dataset annotated with CVAT.
When I run the command:

python3 train_ssd.py --dataset-type=voc --data=data/training_test --model-dir=models/model_test

I get the following error:

Traceback (most recent call last):
  File "train_ssd.py", line 343, in <module>
    device=DEVICE, debug_steps=args.debug_steps, epoch=epoch)
  File "train_ssd.py", line 113, in train
    for i, data in enumerate(loader):
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/user/.local/lib/python3.6/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 308, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 81, in __getitem__
    image, boxes, labels = self.transform(image, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/ssd/data_preprocessing.py", line 34, in __call__
    return self.augment(img, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/transforms/transforms.py", line 55, in __call__
    img, boxes, labels = t(img, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/transforms/transforms.py", line 214, in __call__
    return torch.from_numpy(cvimage.astype(np.float32)).permute(2, 0, 1), boxes, labels
RuntimeError: Numpy is not available

Searching around online, I found a debugging approach recommended by @dusty_nv.

This is what I get:

__getitem__ image_id=IMG_20220110_212704
boxes=[[ 1163.79003906 504.25 1299.86999512 625.76000977]]
labels=[1]
__getitem__ image_id=IMG_20220110_213146
boxes=[[ 1299.88000488 1777.64001465 1654.67004395 1928.31005859]]
labels=[1]
__getitem__ image_id=IMG_20220110_211451
boxes=[[ 547.98999023 2064.38989258 2317.12988281 2496.95996094]]
labels=[1]
Traceback (most recent call last):
  File "train_ssd.py", line 343, in <module>
    device=DEVICE, debug_steps=args.debug_steps, epoch=epoch)
  File "train_ssd.py", line 113, in train
    for i, data in enumerate(loader):
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/user/.local/lib/python3.6/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 308, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 81, in __getitem__
    image, boxes, labels = self.transform(image, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/ssd/data_preprocessing.py", line 34, in __call__
    return self.augment(img, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/transforms/transforms.py", line 55, in __call__
    img, boxes, labels = t(img, boxes, labels)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/transforms/transforms.py", line 214, in __call__
    return torch.from_numpy(cvimage.astype(np.float32)).permute(2, 0, 1), boxes, labels
RuntimeError: Numpy is not available

I tried to delete the photo and the annotation of the last image as suggested (IMG_20220110_211451).

However I get an error again:

Traceback (most recent call last):
  File "train_ssd.py", line 214, in <module>
    target_transform=target_transform)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 36, in __init__
    self.ids = self._read_image_ids(image_sets_file)
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 111, in _read_image_ids
    if self._get_num_annotations(image_id) > 0:
  File "/home/user/Downloads/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 123, in _get_num_annotations
    objects = ET.parse(annotation_file).findall("object")
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 1196, in parse
    tree.parse(source, parser)
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 586, in parse
    source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'data/training_test/Annotations/IMG_20220110_211451.xml'
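
The FileNotFoundError above suggests the image ID is still listed in an image-set file even though its annotation was deleted. Below is a minimal sketch for bringing the two back in sync. It assumes the usual Pascal VOC layout (Annotations/<id>.xml, and ImageSets/Main/*.txt with one image ID per line). The function name prune_image_sets is hypothetical and not part of jetson-inference.

```python
from pathlib import Path


def prune_image_sets(dataset_root):
    """Drop image IDs whose annotation XML no longer exists.

    Assumption: <root>/Annotations/<id>.xml and <root>/ImageSets/Main/*.txt
    with one image ID per line (standard Pascal VOC layout).
    Returns a dict mapping image-set file name -> list of removed IDs.
    """
    root = Path(dataset_root)
    annotations = root / "Annotations"
    removed = {}
    for set_file in sorted((root / "ImageSets" / "Main").glob("*.txt")):
        kept, dropped = [], []
        for line in set_file.read_text().splitlines():
            image_id = line.strip()
            if not image_id:
                continue  # skip blank lines
            if (annotations / (image_id + ".xml")).is_file():
                kept.append(image_id)
            else:
                dropped.append(image_id)
        if dropped:
            # Rewrite the file only when something was actually removed.
            set_file.write_text("".join(i + "\n" for i in kept))
            removed[set_file.name] = dropped
    return removed


if __name__ == "__main__":
    # Dataset path taken from the training command earlier in this post.
    print(prune_image_sets("data/training_test"))
```

The script rewrites the image-set files in place, so it is worth backing up ImageSets/Main before running it.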

I have tried debugging multiple times and it reports a different image each time, so I'm not even sure this is the right approach.
If anyone could help me, I would be grateful.

Hi,

RuntimeError: Numpy is not available

The error indicates that the NumPy library is missing.
Could you install it and try again?

$ sudo apt-get install python3-pip
$ pip3 install numpy==1.19.4

Thanks.
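
One way to confirm that the interpreter used for training actually sees NumPy, and that PyTorch can use it, is to run each check in a separate child process so that a hard crash does not take down the checker itself. This is a sketch only; the probe helper and the CHECKS list are illustrative names, not part of any library.

```python
import subprocess
import sys


def probe(snippet):
    """Run a Python snippet in a child interpreter.

    Returns (exit code, last line of stderr or "").
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6
    )
    lines = result.stderr.strip().splitlines()
    return result.returncode, (lines[-1] if lines else "")


CHECKS = [
    ("numpy import", "import numpy; print(numpy.__version__)"),
    ("torch import", "import torch; print(torch.__version__)"),
    ("torch.from_numpy",
     "import numpy, torch; "
     "torch.from_numpy(numpy.zeros((2, 2), dtype=numpy.float32))"),
]

if __name__ == "__main__":
    for name, snippet in CHECKS:
        code, err = probe(snippet)
        # On POSIX, a negative exit code means the child was killed by a
        # signal (for example -4 for SIGILL, -11 for SIGSEGV on Linux).
        if code == 0:
            print("{}: ok".format(name))
        else:
            print("{}: failed (exit code {}) {}".format(name, code, err))
```

Run it with the same python3 that runs train_ssd.py, since a package installed for a different interpreter or user would not be visible to the training script.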

Hi, I have updated NumPy.

However, I now get a different error:

Illegal instruction (core dumped)

I followed the instructions on this page:

Now I get an error similar to the earlier one:

RuntimeError: DataLoader worker (pid 11661) is killed by signal: Segmentation fault.
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_ssd.py", line 343, in <module>
    device=DEVICE, debug_steps=args.debug_steps, epoch=epoch)
  File "train_ssd.py", line 113, in train
    for i, data in enumerate(loader):
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
    success, data = self._try_get_data()
  File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1003, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 11661) exited unexpectedly

Thanks in advance

Hi @utente1480, have you tried using the jetson-inference docker container to rule out any issues with the dependencies (like PyTorch and numpy)?

Also, if you want, you can send me your dataset and I'll take a look at it and try training it. You can upload it to Google Drive or somewhere similar and share the link.

Hi, I think I understand the problem: I could not download PyTorch because it was unable to find the qt5base package, and in turn I can't download qt5base because it can't find the libqt5sql5-sqlite package, and so on.
I will try the Docker container, hoping to get all the packages that way. Thanks to both of you for your help.