onnx_export.py fails on a dataset captured with my own camera

The following error occurs on a Jetson Nano 2GB while exporting the ONNX model. Training was successful.
Traceback (most recent call last):
  File "onnx_export.py", line 86, in <module>
    net.load(args.input)
  File "/jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py", line 135, in load
    self.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 571, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 229, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 210, in __init__
    super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: 'models/face/'

I tried the solution provided in this thread: https://forums.developer.nvidia.com/t/onnx-export-py-outputs-size-mismatch-for-classification

There is no trailing newline/endline character in either the labels.txt I created or the one PyTorch created. The only difference is that the labels.txt generated alongside the PyTorch model has one extra label, 'BACKGROUND'.
But I don't expect it to be necessary, since in the video tutorial "Jetson AI Fundamentals - S3E5 - Training Object Detection Models - YouTube" no class for background was created.
Can anyone help?
Thanks & Regards,
Dipankar Sil

I tried adding the --input argument, and the problem then turned into a size mismatch similar to the one described in "Onnx_export.py outputs size mismatch for classification_headers.0.weight / bias errors".
But I ran diff on the labels.txt I created and the one created by PyTorch, and this is the output:
diff data/face/labels.txt models/face/labels.txt
0a1
> BACKGROUND
As we can see, the two files do not differ in their newline characters.
The current error, after passing the --input argument, is this:
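The same comparison can be scripted so that the trailing-newline check and the label differences are reported together. This is only a rough sketch using the Python standard library; the helper name compare_label_files is made up for illustration and is not part of jetson-inference.

```python
import difflib
from pathlib import Path


def compare_label_files(dataset_labels, model_labels):
    """Return (diff_lines, ends_with_newline) for two labels.txt files.

    Hypothetical helper for illustration only.
    """
    texts = [Path(p).read_text() for p in (dataset_labels, model_labels)]
    # Record whether each file ends with a newline character.
    ends_with_newline = [t.endswith("\n") for t in texts]
    # Compare the label names themselves, ignoring line endings.
    first, second = (t.splitlines() for t in texts)
    diff = list(
        difflib.unified_diff(first, second, "dataset", "model", lineterm="", n=0)
    )
    return diff, ends_with_newline
```

Calling compare_label_files("data/face/labels.txt", "models/face/labels.txt") would list "+BACKGROUND" among the diff lines if the only difference is the extra BACKGROUND entry in the model's copy.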
Traceback (most recent call last):
  File "onnx_export.py", line 86, in <module>
    net.load(args.input)
  File "/jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py", line 135, in load
    self.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SSD:
    size mismatch for classification_headers.0.weight: copying a param with shape torch.Size([126, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 512, 3, 3]).
    size mismatch for classification_headers.0.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
    size mismatch for classification_headers.1.weight: copying a param with shape torch.Size([126, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 1024, 3, 3]).
    size mismatch for classification_headers.1.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
    size mismatch for classification_headers.2.weight: copying a param with shape torch.Size([126, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 512, 3, 3]).
    size mismatch for classification_headers.2.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
    size mismatch for classification_headers.3.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 256, 3, 3]).
    size mismatch for classification_headers.3.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
    size mismatch for classification_headers.4.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 256, 3, 3]).
    size mismatch for classification_headers.4.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
    size mismatch for classification_headers.5.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([18, 256, 3, 3]).
    size mismatch for classification_headers.5.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([18]).
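The shapes in this error can be read as class counts. Here is a small sketch of the arithmetic, assuming each classification header outputs 6 prior boxes per location multiplied by the number of classes. The factor 6 is an assumption about the SSD-Mobilenet header layout; the error message itself does not state it.

```python
# Assumption: every classification header predicts 6 prior boxes per
# location, so out_channels = 6 * num_classes (BACKGROUND included).
PRIORS_PER_LOCATION = 6


def num_classes_from_header(out_channels, priors=PRIORS_PER_LOCATION):
    """Recover the class count implied by a header's output channels."""
    if out_channels % priors != 0:
        raise ValueError(f"{out_channels} is not a multiple of {priors}")
    return out_channels // priors


# First dimension of the header weights reported in the error message.
checkpoint_classes = num_classes_from_header(126)  # 126 / 6 = 21
current_classes = num_classes_from_header(18)      # 18 / 6 = 3
```

Under that assumption the checkpoint holds 21 classes while the model being built expects 3, so the labels file used to construct the model likely lists a different number of classes than the checkpoint was trained with.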
Please help.
Thanks & Regards,
Dipankar Sil

Hi,

Could you also add the --labels argument to specify your output classes?

Thanks.


Thanks, --labels worked for creating the .onnx file with onnx_export.py, but the --input argument remains mandatory; otherwise the error from the first post persists.
So at present we need to manually note which epoch has the lowest loss and point the script at that checkpoint when creating the ONNX file.
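That manual step could be scripted. The sketch below assumes the checkpoints are named like "mb1-ssd-Epoch-29-Loss-1.2345.pth"; that pattern is an assumption, so adjust the regular expression to whatever your models directory actually contains. The function name find_lowest_loss_checkpoint is made up for illustration.

```python
import re
from pathlib import Path

# Assumed filename pattern, e.g. "mb1-ssd-Epoch-29-Loss-1.2345.pth".
LOSS_PATTERN = re.compile(r"-Loss-(\d+(?:\.\d+)?)\.pth$")


def find_lowest_loss_checkpoint(model_dir):
    """Return the path of the .pth file with the smallest loss in its name."""
    best_path, best_loss = None, float("inf")
    for path in Path(model_dir).glob("*.pth"):
        match = LOSS_PATTERN.search(path.name)
        if match is None:
            continue  # skip files that do not encode a loss value
        loss = float(match.group(1))
        if loss < best_loss:
            best_path, best_loss = path, loss
    if best_path is None:
        raise FileNotFoundError(f"no '*-Loss-*.pth' checkpoint in {model_dir}")
    return best_path
```

The returned path is a file rather than a directory, so passing it to --input also avoids the IsADirectoryError from the first post.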
Thanks & Regards,
Dipankar Sil

PyTorch automatically adds the BACKGROUND class during training, so it is expected to find this line in the labels.txt that gets saved along with the model. That labels.txt should be used when exporting to ONNX. Normally the labels.txt with BACKGROUND is used automatically when exporting to ONNX, except, I suppose, if the dataset directory and the model directory were the same or the files were inadvertently copied.
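The relationship between the two labels.txt files can be sketched roughly like this. The helper labels_with_background is hypothetical and only illustrates the idea; it is not the training script's actual code.

```python
def labels_with_background(dataset_labels):
    """Return the class list as saved with the model: BACKGROUND first.

    Hypothetical helper for illustration only.
    """
    # Drop blank lines and surrounding whitespace from the dataset labels.
    labels = [name.strip() for name in dataset_labels if name.strip()]
    if labels and labels[0] == "BACKGROUND":
        return labels  # already in the model's format
    return ["BACKGROUND"] + labels


# A dataset with one class yields a two-class model (BACKGROUND + face).
model_labels = labels_with_background(["face"])
num_classes = len(model_labels)
```

So the model's class count is one more than the number of labels in the dataset, and the exporter needs the model's copy of labels.txt to rebuild a network of matching size.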
