Data corruption when running train_ssd script

Hi, I'm going through Jetson AI Fundamentals - S3E5 - Training Object Detection Models and trying to run the train_ssd.py script on my Jetson Nano 2GB board, but I'm seeing data corruption. I tried re-downloading the images, and also downloading a smaller number of images, but no luck. Any idea what could be going on here?

root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --data=data/fruit --model-dir=models/fruit --batch-size=2 --epochs=1 --workers=1

2022-08-28 07:05:00 - Using CUDA…

2022-08-28 07:05:00 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=2, checkpoint_folder='models/fruit', dataset_type='open_images', datasets=['data/fruit'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=1, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)

2022-08-28 07:05:00 - Prepare training datasets.

2022-08-28 07:05:00 - loading annotations from: data/fruit/sub-train-annotations-bbox.csv

2022-08-28 07:05:00 - annotations loaded from: data/fruit/sub-train-annotations-bbox.csv

num images: 2022

2022-08-28 07:05:06 - Dataset Summary:Number of Images: 2022

Minimum Number of Images for a Class: -1

Label Distribution:

Apple: 1212

Banana: 585

Grape: 1004

Orange: 2167

Pear: 353

Pineapple: 249

Strawberry: 3121

Watermelon: 320

2022-08-28 07:05:06 - Stored labels into file models/fruit/labels.txt.

2022-08-28 07:05:06 - Train dataset size: 2022

2022-08-28 07:05:06 - Prepare Validation datasets.

2022-08-28 07:05:06 - loading annotations from: data/fruit/sub-test-annotations-bbox.csv

2022-08-28 07:05:06 - annotations loaded from: data/fruit/sub-test-annotations-bbox.csv

num images: 365

2022-08-28 07:05:07 - Dataset Summary:Number of Images: 365

Minimum Number of Images for a Class: -1

Label Distribution:

Apple: 103

Banana: 48

Grape: 181

Orange: 223

Pear: 52

Pineapple: 51

Strawberry: 310

Watermelon: 68

2022-08-28 07:05:07 - Validation dataset size: 365

2022-08-28 07:05:07 - Build network.

2022-08-28 07:05:07 - Init from pretrained ssd models/mobilenet-v1-ssd-mp-0_675.pth

Traceback (most recent call last):

File "train_ssd.py", line 309, in <module>

net.init_from_pretrained_ssd(args.pretrained_ssd)

File "/jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py", line 119, in init_from_pretrained_ssd

state_dict = torch.load(model, map_location=lambda storage, loc: storage)

File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 608, in load

return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 794, in _legacy_load

deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)

RuntimeError: unexpected EOF, expected 708562 more bytes. The file might be corrupted.

Hi,

Could you try downloading the mobilenet-v1-ssd-mp-0_675.pth file again to see if that helps?
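
If it is useful, here is a minimal sketch to confirm the re-downloaded checkpoint deserializes cleanly before starting another training run (it just assumes the default models/mobilenet-v1-ssd-mp-0_675.pth path that train_ssd.py uses):

import torch

# Try to deserialize the pretrained checkpoint on CPU. A truncated or corrupted
# download raises the same "unexpected EOF" RuntimeError seen in the traceback above.
ckpt_path = 'models/mobilenet-v1-ssd-mp-0_675.pth'  # adjust if your path differs
try:
    state_dict = torch.load(ckpt_path, map_location='cpu')
    print('OK: loaded {} tensors from {}'.format(len(state_dict), ckpt_path))
except RuntimeError as err:
    print('Checkpoint looks corrupted, please re-download it: {}'.format(err))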

Thanks.

Hi AastaLLL, thanks for the response. Re-installation helped, but now I see the error message below when trying to export the trained model to ONNX. I was able to train the model successfully and saved it to:
root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd/models/cups# ls
labels.txt mb1-ssd-Epoch-0-Loss-nan.pth

root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd# python3 onnx_export.py --model-dir=models/cups
Namespace(batch_size=1, height=300, input='', labels='labels.txt', model_dir='models/cups', net='ssd-mobilenet', output='', width=300)
running on device cuda:0
found best checkpoint with loss 10000.000000 ()
creating network: ssd-mobilenet
num classes: 5
loading checkpoint: models/cups/
Traceback (most recent call last):
File "onnx_export.py", line 86, in <module>
net.load(args.input)
File "/jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py", line 135, in load
self.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: 'models/cups/'

Hi @gvc.nitw, you weren't able to train the model successfully because it has NaN loss. Typically this is related to the contents of the training dataset. However, if you still want to export it, you can do so like this:

python3 onnx_export.py --input=models/cups/mb1-ssd-Epoch-0-Loss-nan.pth --labels=models/cups/labels.txt
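
Since the checkpoint filename already records a NaN loss, it can also be worth checking whether the saved weights themselves contain NaNs, or only the logged validation loss does. A rough sketch (using the mb1-ssd-Epoch-0-Loss-nan.pth path from your listing; adjust as needed):

import torch

# Load the saved checkpoint on CPU and report any tensors that contain NaNs.
state_dict = torch.load('models/cups/mb1-ssd-Epoch-0-Loss-nan.pth', map_location='cpu')
bad = [name for name, t in state_dict.items()
       if torch.is_tensor(t) and torch.isnan(t.float()).any()]
print('{} of {} tensors contain NaNs: {}'.format(len(bad), len(state_dict), bad[:5]))

If that prints 0, the weights themselves are fine and only the validation loss computation produced NaN.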

Hi Dusty, thanks for the inputs. Here is the log from training on my samples. Any idea what's going wrong here?
root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --dataset-type=voc --data=data/cups --model-dir=models/cups --batch-size=2 --workers=1 --epochs=1

2022-09-04 18:39:08 - Using CUDA…

2022-09-04 18:39:08 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=2, checkpoint_folder='models/cups', dataset_type='voc', datasets=['data/cups'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=1, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)

2022-09-04 18:39:08 - Prepare training datasets.

2022-09-04 18:39:09 - VOC Labels read from file: ('BACKGROUND', 'Pink', 'Orange', 'Yellow', 'white')

2022-09-04 18:39:09 - Stored labels into file models/cups/labels.txt.

2022-09-04 18:39:09 - Train dataset size: 670

2022-09-04 18:39:09 - Prepare Validation datasets.

2022-09-04 18:39:09 - VOC Labels read from file: ('BACKGROUND', 'Pink', 'Orange', 'Yellow', 'white')

2022-09-04 18:39:09 - Validation dataset size: 670

2022-09-04 18:39:09 - Build network.

2022-09-04 18:39:09 - Init from pretrained ssd models/mobilenet-v1-ssd-mp-0_675.pth

2022-09-04 18:39:10 - Took 0.51 seconds to load the model.

2022-09-04 18:39:41 - Learning rate: 0.01, Base net learning rate: 0.001, Extra Layers learning rate: 0.01.

2022-09-04 18:39:41 - Uses CosineAnnealingLR scheduler.

2022-09-04 18:39:41 - Start training from epoch 0.

/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)

/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.

warnings.warn(warning.format(ret))

2022-09-04 18:41:32 - Epoch: 0, Step: 10/335, Avg Loss: 13.1701, Avg Regression Loss 3.8224, Avg Classification Loss: 9.3477

2022-09-04 18:41:38 - Epoch: 0, Step: 20/335, Avg Loss: 9.2941, Avg Regression Loss 3.2296, Avg Classification Loss: 6.0645

2022-09-04 18:41:44 - Epoch: 0, Step: 30/335, Avg Loss: 7.3793, Avg Regression Loss 2.6704, Avg Classification Loss: 4.7089

2022-09-04 18:41:53 - Epoch: 0, Step: 40/335, Avg Loss: 6.4903, Avg Regression Loss 1.9795, Avg Classification Loss: 4.5108

2022-09-04 18:41:57 - Epoch: 0, Step: 50/335, Avg Loss: 6.3240, Avg Regression Loss 2.1447, Avg Classification Loss: 4.1793

2022-09-04 18:42:02 - Epoch: 0, Step: 60/335, Avg Loss: 5.6483, Avg Regression Loss 1.9217, Avg Classification Loss: 3.7266

2022-09-04 18:42:20 - Epoch: 0, Step: 70/335, Avg Loss: 7.3445, Avg Regression Loss 2.8529, Avg Classification Loss: 4.4916

2022-09-04 18:42:26 - Epoch: 0, Step: 80/335, Avg Loss: 6.4288, Avg Regression Loss 2.1637, Avg Classification Loss: 4.2651

2022-09-04 18:42:32 - Epoch: 0, Step: 90/335, Avg Loss: 5.8095, Avg Regression Loss 1.7508, Avg Classification Loss: 4.0587

2022-09-04 18:42:38 - Epoch: 0, Step: 100/335, Avg Loss: 6.0344, Avg Regression Loss 2.0333, Avg Classification Loss: 4.0011

2022-09-04 18:42:42 - Epoch: 0, Step: 110/335, Avg Loss: 5.2173, Avg Regression Loss 1.4284, Avg Classification Loss: 3.7889

2022-09-04 18:42:49 - Epoch: 0, Step: 120/335, Avg Loss: 5.9220, Avg Regression Loss 1.9940, Avg Classification Loss: 3.9280

2022-09-04 18:42:53 - Epoch: 0, Step: 130/335, Avg Loss: 4.7990, Avg Regression Loss 1.2666, Avg Classification Loss: 3.5324

2022-09-04 18:42:58 - Epoch: 0, Step: 140/335, Avg Loss: 5.1643, Avg Regression Loss 1.4534, Avg Classification Loss: 3.7109

2022-09-04 18:43:10 - Epoch: 0, Step: 150/335, Avg Loss: 4.9472, Avg Regression Loss 1.4633, Avg Classification Loss: 3.4839

2022-09-04 18:43:14 - Epoch: 0, Step: 160/335, Avg Loss: 5.7760, Avg Regression Loss 1.7511, Avg Classification Loss: 4.0249

2022-09-04 18:43:18 - Epoch: 0, Step: 170/335, Avg Loss: 4.6594, Avg Regression Loss 1.1498, Avg Classification Loss: 3.5096

2022-09-04 18:43:23 - Epoch: 0, Step: 180/335, Avg Loss: 4.6275, Avg Regression Loss 1.0692, Avg Classification Loss: 3.5582

2022-09-04 18:43:28 - Epoch: 0, Step: 190/335, Avg Loss: 5.4802, Avg Regression Loss 1.5011, Avg Classification Loss: 3.9791

2022-09-04 18:43:32 - Epoch: 0, Step: 200/335, Avg Loss: 4.3423, Avg Regression Loss 1.1471, Avg Classification Loss: 3.1952

2022-09-04 18:43:36 - Epoch: 0, Step: 210/335, Avg Loss: 4.0183, Avg Regression Loss 0.8555, Avg Classification Loss: 3.1628

2022-09-04 18:43:42 - Epoch: 0, Step: 220/335, Avg Loss: 5.1966, Avg Regression Loss 1.3432, Avg Classification Loss: 3.8534

2022-09-04 18:43:54 - Epoch: 0, Step: 230/335, Avg Loss: 5.4519, Avg Regression Loss 1.4244, Avg Classification Loss: 4.0275

2022-09-04 18:43:59 - Epoch: 0, Step: 240/335, Avg Loss: 4.0353, Avg Regression Loss 0.9766, Avg Classification Loss: 3.0587

2022-09-04 18:44:05 - Epoch: 0, Step: 250/335, Avg Loss: 5.1421, Avg Regression Loss 1.6244, Avg Classification Loss: 3.5177

2022-09-04 18:44:09 - Epoch: 0, Step: 260/335, Avg Loss: 4.2774, Avg Regression Loss 1.0469, Avg Classification Loss: 3.2305

2022-09-04 18:44:15 - Epoch: 0, Step: 270/335, Avg Loss: 4.7652, Avg Regression Loss 1.2696, Avg Classification Loss: 3.4956

2022-09-04 18:44:19 - Epoch: 0, Step: 280/335, Avg Loss: 3.9492, Avg Regression Loss 1.0722, Avg Classification Loss: 2.8770

2022-09-04 18:44:24 - Epoch: 0, Step: 290/335, Avg Loss: 4.5804, Avg Regression Loss 1.3029, Avg Classification Loss: 3.2775

2022-09-04 18:44:29 - Epoch: 0, Step: 300/335, Avg Loss: 4.0685, Avg Regression Loss 1.0787, Avg Classification Loss: 2.9898

2022-09-04 18:44:33 - Epoch: 0, Step: 310/335, Avg Loss: 3.5197, Avg Regression Loss 0.8993, Avg Classification Loss: 2.6204

2022-09-04 18:44:37 - Epoch: 0, Step: 320/335, Avg Loss: 3.8622, Avg Regression Loss 1.1079, Avg Classification Loss: 2.7543

2022-09-04 18:44:42 - Epoch: 0, Step: 330/335, Avg Loss: 4.4249, Avg Regression Loss 1.0579, Avg Classification Loss: 3.3670

2022-09-04 18:45:24 - Epoch: 0, Validation Loss: nan, Validation Regression Loss nan, Validation Classification Loss: 2.6631

2022-09-04 18:45:25 - Saved model models/cups/mb1-ssd-Epoch-0-Loss-nan.pth

2022-09-04 18:45:25 - Task done, exiting program.

Hmm, since the NaN happens in the validation step but training itself is okay, I would recommend training for more epochs and seeing what happens.

If the NaN happens during training, normally you would track down which image in the dataset is causing it, and either fix that image or remove it.

If the training loss continues to decrease each epoch but the validation loss continues to be NaN, then you can run onnx_export.py manually as shown above, or you could send me your dataset and I will take a deeper look at what is happening.
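
For the "track down which image is causing it" step, one quick check is to scan the Pascal VOC annotation XMLs for degenerate or out-of-range bounding boxes, which are a common cause of NaN losses with this trainer. A rough sketch, assuming the usual data/cups/Annotations layout from this thread:

import glob
import xml.etree.ElementTree as ET

# Flag boxes with zero/negative size or coordinates outside the image bounds.
for xml_file in sorted(glob.glob('data/cups/Annotations/*.xml')):
    root = ET.parse(xml_file).getroot()
    size = root.find('size')
    img_w = float(size.find('width').text)
    img_h = float(size.find('height').text)
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        xmin = float(box.find('xmin').text)
        ymin = float(box.find('ymin').text)
        xmax = float(box.find('xmax').text)
        ymax = float(box.find('ymax').text)
        if xmax <= xmin or ymax <= ymin or xmin < 0 or ymin < 0 or xmax > img_w or ymax > img_h:
            print('suspect box in {}: ({}, {}, {}, {})'.format(xml_file, xmin, ymin, xmax, ymax))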

Hi Dusty, I did try more epochs and still see the NaN model being generated. Attaching the log and my dataset for reference. Please take a look and let me know what could be wrong here.
I'm attaching the dataset in two parts, as the files are larger than 100 MB.

cups_part1.zip (57.9 MB)

root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --dataset-type=voc --data=data/cups --model-dir=models/cups --batch-size=2 --workers=1 --epochs=3
2022-09-11 02:36:26 - Using CUDA…
2022-09-11 02:36:26 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=2, checkpoint_folder='models/cups', dataset_type='voc', datasets=['data/cups'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=3, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2022-09-11 02:36:26 - Prepare training datasets.
2022-09-11 02:36:27 - VOC Labels read from file: ('BACKGROUND', 'Pink', 'Orange', 'Yellow', 'white')
2022-09-11 02:36:27 - Stored labels into file models/cups/labels.txt.
2022-09-11 02:36:27 - Train dataset size: 670
2022-09-11 02:36:27 - Prepare Validation datasets.
2022-09-11 02:36:27 - VOC Labels read from file: ('BACKGROUND', 'Pink', 'Orange', 'Yellow', 'white')
2022-09-11 02:36:27 - Validation dataset size: 670
2022-09-11 02:36:27 - Build network.
2022-09-11 02:36:27 - Init from pretrained ssd models/mobilenet-v1-ssd-mp-0_675.pth
2022-09-11 02:36:28 - Took 0.53 seconds to load the model.
2022-09-11 02:37:13 - Learning rate: 0.01, Base net learning rate: 0.001, Extra Layers learning rate: 0.01.
2022-09-11 02:37:14 - Uses CosineAnnealingLR scheduler.
2022-09-11 02:37:14 - Start training from epoch 0.
/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
2022-09-11 02:39:12 - Epoch: 0, Step: 10/335, Avg Loss: 11.3822, Avg Regression Loss 3.0875, Avg Classification Loss: 8.2947
2022-09-11 02:39:24 - Epoch: 0, Step: 20/335, Avg Loss: 9.2260, Avg Regression Loss 3.5546, Avg Classification Loss: 5.6714
2022-09-11 02:39:29 - Epoch: 0, Step: 30/335, Avg Loss: 7.7749, Avg Regression Loss 2.5897, Avg Classification Loss: 5.1852
2022-09-11 02:39:34 - Epoch: 0, Step: 40/335, Avg Loss: 7.0474, Avg Regression Loss 2.6217, Avg Classification Loss: 4.4257
2022-09-11 02:39:38 - Epoch: 0, Step: 50/335, Avg Loss: 8.0076, Avg Regression Loss 2.1141, Avg Classification Loss: 5.8935
2022-09-11 02:39:45 - Epoch: 0, Step: 60/335, Avg Loss: 6.0375, Avg Regression Loss 1.6928, Avg Classification Loss: 4.3446
2022-09-11 02:39:50 - Epoch: 0, Step: 70/335, Avg Loss: 5.3666, Avg Regression Loss 1.5696, Avg Classification Loss: 3.7970
2022-09-11 02:39:58 - Epoch: 0, Step: 80/335, Avg Loss: 6.3329, Avg Regression Loss 2.2350, Avg Classification Loss: 4.0980
2022-09-11 02:40:02 - Epoch: 0, Step: 90/335, Avg Loss: 5.9979, Avg Regression Loss 2.0423, Avg Classification Loss: 3.9556
2022-09-11 02:40:06 - Epoch: 0, Step: 100/335, Avg Loss: 5.0405, Avg Regression Loss 1.2298, Avg Classification Loss: 3.8107
2022-09-11 02:40:10 - Epoch: 0, Step: 110/335, Avg Loss: 4.8876, Avg Regression Loss 1.2977, Avg Classification Loss: 3.5899
2022-09-11 02:40:15 - Epoch: 0, Step: 120/335, Avg Loss: 5.0144, Avg Regression Loss 1.1843, Avg Classification Loss: 3.8300
2022-09-11 02:40:20 - Epoch: 0, Step: 130/335, Avg Loss: 5.3371, Avg Regression Loss 1.4427, Avg Classification Loss: 3.8944
2022-09-11 02:40:24 - Epoch: 0, Step: 140/335, Avg Loss: 5.2135, Avg Regression Loss 1.4089, Avg Classification Loss: 3.8046
2022-09-11 02:40:28 - Epoch: 0, Step: 150/335, Avg Loss: 5.5782, Avg Regression Loss 1.5915, Avg Classification Loss: 3.9866
2022-09-11 02:40:34 - Epoch: 0, Step: 160/335, Avg Loss: 5.3204, Avg Regression Loss 1.5334, Avg Classification Loss: 3.7870
2022-09-11 02:40:38 - Epoch: 0, Step: 170/335, Avg Loss: 5.0619, Avg Regression Loss 1.6004, Avg Classification Loss: 3.4615
2022-09-11 02:40:42 - Epoch: 0, Step: 180/335, Avg Loss: 3.9702, Avg Regression Loss 0.7339, Avg Classification Loss: 3.2363
2022-09-11 02:40:46 - Epoch: 0, Step: 190/335, Avg Loss: 4.2404, Avg Regression Loss 1.0805, Avg Classification Loss: 3.1599
2022-09-11 02:41:02 - Epoch: 0, Step: 200/335, Avg Loss: 5.2811, Avg Regression Loss 1.7144, Avg Classification Loss: 3.5667
2022-09-11 02:41:10 - Epoch: 0, Step: 210/335, Avg Loss: 5.0336, Avg Regression Loss 0.8910, Avg Classification Loss: 4.1426
2022-09-11 02:41:15 - Epoch: 0, Step: 220/335, Avg Loss: 5.2741, Avg Regression Loss 1.8023, Avg Classification Loss: 3.4719
2022-09-11 02:41:20 - Epoch: 0, Step: 230/335, Avg Loss: 4.2335, Avg Regression Loss 1.0515, Avg Classification Loss: 3.1820
2022-09-11 02:41:24 - Epoch: 0, Step: 240/335, Avg Loss: 4.9408, Avg Regression Loss 1.3504, Avg Classification Loss: 3.5904
2022-09-11 02:41:30 - Epoch: 0, Step: 250/335, Avg Loss: 4.7384, Avg Regression Loss 1.0783, Avg Classification Loss: 3.6601
2022-09-11 02:41:43 - Epoch: 0, Step: 260/335, Avg Loss: 4.2065, Avg Regression Loss 1.1114, Avg Classification Loss: 3.0951
2022-09-11 02:41:49 - Epoch: 0, Step: 270/335, Avg Loss: 5.8756, Avg Regression Loss 1.6882, Avg Classification Loss: 4.1874
2022-09-11 02:41:54 - Epoch: 0, Step: 280/335, Avg Loss: 3.9843, Avg Regression Loss 0.9590, Avg Classification Loss: 3.0253
2022-09-11 02:41:58 - Epoch: 0, Step: 290/335, Avg Loss: 4.0933, Avg Regression Loss 1.0170, Avg Classification Loss: 3.0762
2022-09-11 02:42:02 - Epoch: 0, Step: 300/335, Avg Loss: 3.8073, Avg Regression Loss 1.0837, Avg Classification Loss: 2.7236
2022-09-11 02:42:06 - Epoch: 0, Step: 310/335, Avg Loss: 3.5797, Avg Regression Loss 0.8343, Avg Classification Loss: 2.7455
2022-09-11 02:42:13 - Epoch: 0, Step: 320/335, Avg Loss: 4.0300, Avg Regression Loss 1.0609, Avg Classification Loss: 2.9691
2022-09-11 02:42:18 - Epoch: 0, Step: 330/335, Avg Loss: 4.2227, Avg Regression Loss 1.0146, Avg Classification Loss: 3.2081
2022-09-11 02:42:56 - Epoch: 0, Validation Loss: nan, Validation Regression Loss nan, Validation Classification Loss: 2.6859
2022-09-11 02:42:57 - Saved model models/cups/mb1-ssd-Epoch-0-Loss-nan.pth
2022-09-11 02:43:09 - Epoch: 1, Step: 10/335, Avg Loss: 4.7528, Avg Regression Loss 1.4569, Avg Classification Loss: 3.2959
2022-09-11 02:43:14 - Epoch: 1, Step: 20/335, Avg Loss: 4.5625, Avg Regression Loss 1.3868, Avg Classification Loss: 3.1757
2022-09-11 02:43:18 - Epoch: 1, Step: 30/335, Avg Loss: 4.1727, Avg Regression Loss 0.9965, Avg Classification Loss: 3.1762
2022-09-11 02:43:22 - Epoch: 1, Step: 40/335, Avg Loss: 3.9218, Avg Regression Loss 1.0655, Avg Classification Loss: 2.8563
2022-09-11 02:43:34 - Epoch: 1, Step: 50/335, Avg Loss: 4.2946, Avg Regression Loss 1.0416, Avg Classification Loss: 3.2531
2022-09-11 02:43:39 - Epoch: 1, Step: 60/335, Avg Loss: 4.1086, Avg Regression Loss 1.1720, Avg Classification Loss: 2.9366
2022-09-11 02:43:44 - Epoch: 1, Step: 70/335, Avg Loss: 3.8122, Avg Regression Loss 0.9168, Avg Classification Loss: 2.8954
2022-09-11 02:43:48 - Epoch: 1, Step: 80/335, Avg Loss: 3.7327, Avg Regression Loss 0.8699, Avg Classification Loss: 2.8627
2022-09-11 02:43:53 - Epoch: 1, Step: 90/335, Avg Loss: 3.8755, Avg Regression Loss 0.9999, Avg Classification Loss: 2.8756
2022-09-11 02:43:57 - Epoch: 1, Step: 100/335, Avg Loss: 4.2911, Avg Regression Loss 1.1613, Avg Classification Loss: 3.1298
2022-09-11 02:44:01 - Epoch: 1, Step: 110/335, Avg Loss: 4.0683, Avg Regression Loss 0.9485, Avg Classification Loss: 3.1198
2022-09-11 02:44:08 - Epoch: 1, Step: 120/335, Avg Loss: 4.8175, Avg Regression Loss 1.3730, Avg Classification Loss: 3.4445
2022-09-11 02:44:13 - Epoch: 1, Step: 130/335, Avg Loss: 3.9986, Avg Regression Loss 0.8723, Avg Classification Loss: 3.1263
2022-09-11 02:44:22 - Epoch: 1, Step: 140/335, Avg Loss: 4.1537, Avg Regression Loss 0.9875, Avg Classification Loss: 3.1663
2022-09-11 02:44:26 - Epoch: 1, Step: 150/335, Avg Loss: 3.2616, Avg Regression Loss 0.6619, Avg Classification Loss: 2.5997
2022-09-11 02:44:31 - Epoch: 1, Step: 160/335, Avg Loss: 3.9602, Avg Regression Loss 0.9796, Avg Classification Loss: 2.9806
2022-09-11 02:44:35 - Epoch: 1, Step: 170/335, Avg Loss: 3.6410, Avg Regression Loss 0.9303, Avg Classification Loss: 2.7107
2022-09-11 02:44:39 - Epoch: 1, Step: 180/335, Avg Loss: 3.2275, Avg Regression Loss 0.6541, Avg Classification Loss: 2.5734
2022-09-11 02:44:43 - Epoch: 1, Step: 190/335, Avg Loss: 3.2455, Avg Regression Loss 0.7881, Avg Classification Loss: 2.4574
2022-09-11 02:44:51 - Epoch: 1, Step: 200/335, Avg Loss: 3.9106, Avg Regression Loss 0.8192, Avg Classification Loss: 3.0913
2022-09-11 02:44:55 - Epoch: 1, Step: 210/335, Avg Loss: 3.6226, Avg Regression Loss 0.7729, Avg Classification Loss: 2.8497
2022-09-11 02:45:00 - Epoch: 1, Step: 220/335, Avg Loss: 3.6930, Avg Regression Loss 0.8867, Avg Classification Loss: 2.8063
2022-09-11 02:45:06 - Epoch: 1, Step: 230/335, Avg Loss: 3.2695, Avg Regression Loss 0.6764, Avg Classification Loss: 2.5931
2022-09-11 02:45:10 - Epoch: 1, Step: 240/335, Avg Loss: 4.4048, Avg Regression Loss 1.0668, Avg Classification Loss: 3.3380
2022-09-11 02:45:15 - Epoch: 1, Step: 250/335, Avg Loss: 4.0364, Avg Regression Loss 0.9123, Avg Classification Loss: 3.1241
2022-09-11 02:45:20 - Epoch: 1, Step: 260/335, Avg Loss: 3.6870, Avg Regression Loss 0.8954, Avg Classification Loss: 2.7916
2022-09-11 02:45:25 - Epoch: 1, Step: 270/335, Avg Loss: 4.1630, Avg Regression Loss 1.0255, Avg Classification Loss: 3.1375
2022-09-11 02:45:29 - Epoch: 1, Step: 280/335, Avg Loss: 3.8322, Avg Regression Loss 1.0760, Avg Classification Loss: 2.7562
2022-09-11 02:45:34 - Epoch: 1, Step: 290/335, Avg Loss: 3.5010, Avg Regression Loss 0.9813, Avg Classification Loss: 2.5197
2022-09-11 02:45:45 - Epoch: 1, Step: 300/335, Avg Loss: 3.2187, Avg Regression Loss 0.7922, Avg Classification Loss: 2.4264
2022-09-11 02:45:50 - Epoch: 1, Step: 310/335, Avg Loss: 3.4653, Avg Regression Loss 0.8551, Avg Classification Loss: 2.6102
2022-09-11 02:45:56 - Epoch: 1, Step: 320/335, Avg Loss: 3.5922, Avg Regression Loss 0.9663, Avg Classification Loss: 2.6259
2022-09-11 02:46:02 - Epoch: 1, Step: 330/335, Avg Loss: 3.7291, Avg Regression Loss 0.8820, Avg Classification Loss: 2.8471
2022-09-11 02:46:40 - Epoch: 1, Validation Loss: nan, Validation Regression Loss nan, Validation Classification Loss: 2.0872
2022-09-11 02:46:41 - Saved model models/cups/mb1-ssd-Epoch-1-Loss-nan.pth
2022-09-11 02:46:46 - Epoch: 2, Step: 10/335, Avg Loss: 4.5928, Avg Regression Loss 1.3204, Avg Classification Loss: 3.2724
2022-09-11 02:46:57 - Epoch: 2, Step: 20/335, Avg Loss: 3.6015, Avg Regression Loss 1.1403, Avg Classification Loss: 2.4612
2022-09-11 02:47:02 - Epoch: 2, Step: 30/335, Avg Loss: 4.5021, Avg Regression Loss 1.2308, Avg Classification Loss: 3.2713
2022-09-11 02:47:06 - Epoch: 2, Step: 40/335, Avg Loss: 3.6274, Avg Regression Loss 0.8591, Avg Classification Loss: 2.7683
2022-09-11 02:47:10 - Epoch: 2, Step: 50/335, Avg Loss: 3.7444, Avg Regression Loss 0.9220, Avg Classification Loss: 2.8223
2022-09-11 02:47:16 - Epoch: 2, Step: 60/335, Avg Loss: 4.8939, Avg Regression Loss 1.6893, Avg Classification Loss: 3.2046
2022-09-11 02:47:21 - Epoch: 2, Step: 70/335, Avg Loss: 4.2902, Avg Regression Loss 1.1281, Avg Classification Loss: 3.1621
2022-09-11 02:47:25 - Epoch: 2, Step: 80/335, Avg Loss: 3.4214, Avg Regression Loss 1.0068, Avg Classification Loss: 2.4146
2022-09-11 02:47:30 - Epoch: 2, Step: 90/335, Avg Loss: 4.2179, Avg Regression Loss 1.1306, Avg Classification Loss: 3.0873
2022-09-11 02:47:35 - Epoch: 2, Step: 100/335, Avg Loss: 3.7106, Avg Regression Loss 0.9373, Avg Classification Loss: 2.7733
2022-09-11 02:47:39 - Epoch: 2, Step: 110/335, Avg Loss: 3.0466, Avg Regression Loss 0.6821, Avg Classification Loss: 2.3644
2022-09-11 02:47:45 - Epoch: 2, Step: 120/335, Avg Loss: 4.7590, Avg Regression Loss 1.4710, Avg Classification Loss: 3.2879
2022-09-11 02:47:51 - Epoch: 2, Step: 130/335, Avg Loss: 3.8756, Avg Regression Loss 0.9354, Avg Classification Loss: 2.9402
2022-09-11 02:47:55 - Epoch: 2, Step: 140/335, Avg Loss: 3.0483, Avg Regression Loss 0.6718, Avg Classification Loss: 2.3764
2022-09-11 02:48:01 - Epoch: 2, Step: 150/335, Avg Loss: 3.3646, Avg Regression Loss 0.8444, Avg Classification Loss: 2.5202
2022-09-11 02:48:05 - Epoch: 2, Step: 160/335, Avg Loss: 3.3813, Avg Regression Loss 0.8697, Avg Classification Loss: 2.5116
2022-09-11 02:48:09 - Epoch: 2, Step: 170/335, Avg Loss: 2.9238, Avg Regression Loss 0.7592, Avg Classification Loss: 2.1646
2022-09-11 02:48:14 - Epoch: 2, Step: 180/335, Avg Loss: 2.9975, Avg Regression Loss 0.8017, Avg Classification Loss: 2.1959
2022-09-11 02:48:19 - Epoch: 2, Step: 190/335, Avg Loss: 2.6866, Avg Regression Loss 0.6453, Avg Classification Loss: 2.0413
2022-09-11 02:48:23 - Epoch: 2, Step: 200/335, Avg Loss: 2.9810, Avg Regression Loss 0.8142, Avg Classification Loss: 2.1668
2022-09-11 02:48:28 - Epoch: 2, Step: 210/335, Avg Loss: 2.8310, Avg Regression Loss 0.4889, Avg Classification Loss: 2.3421
2022-09-11 02:48:33 - Epoch: 2, Step: 220/335, Avg Loss: 2.9918, Avg Regression Loss 0.7754, Avg Classification Loss: 2.2164
2022-09-11 02:48:37 - Epoch: 2, Step: 230/335, Avg Loss: 3.3247, Avg Regression Loss 0.9024, Avg Classification Loss: 2.4223
2022-09-11 02:48:41 - Epoch: 2, Step: 240/335, Avg Loss: 3.6176, Avg Regression Loss 0.9046, Avg Classification Loss: 2.7131
2022-09-11 02:48:45 - Epoch: 2, Step: 250/335, Avg Loss: 3.6077, Avg Regression Loss 0.8150, Avg Classification Loss: 2.7926
2022-09-11 02:48:50 - Epoch: 2, Step: 260/335, Avg Loss: 3.1581, Avg Regression Loss 0.7109, Avg Classification Loss: 2.4472
2022-09-11 02:48:56 - Epoch: 2, Step: 270/335, Avg Loss: 3.9875, Avg Regression Loss 1.0965, Avg Classification Loss: 2.8911
2022-09-11 02:49:00 - Epoch: 2, Step: 280/335, Avg Loss: 2.9721, Avg Regression Loss 0.6941, Avg Classification Loss: 2.2780
2022-09-11 02:49:05 - Epoch: 2, Step: 290/335, Avg Loss: 2.9225, Avg Regression Loss 0.7209, Avg Classification Loss: 2.2016
2022-09-11 02:49:09 - Epoch: 2, Step: 300/335, Avg Loss: 2.5306, Avg Regression Loss 0.5464, Avg Classification Loss: 1.9843
2022-09-11 02:49:14 - Epoch: 2, Step: 310/335, Avg Loss: 2.9787, Avg Regression Loss 0.8526, Avg Classification Loss: 2.1262
2022-09-11 02:49:19 - Epoch: 2, Step: 320/335, Avg Loss: 3.0368, Avg Regression Loss 0.7097, Avg Classification Loss: 2.3271
2022-09-11 02:49:23 - Epoch: 2, Step: 330/335, Avg Loss: 2.8131, Avg Regression Loss 0.6010, Avg Classification Loss: 2.2120
2022-09-11 02:49:58 - Epoch: 2, Validation Loss: nan, Validation Regression Loss nan, Validation Classification Loss: 1.6630
2022-09-11 02:49:58 - Saved model models/cups/mb1-ssd-Epoch-2-Loss-nan.pth
2022-09-11 02:49:58 - Task done, exiting program.
root@gvcjn-desktop:/jetson-inference/python/training/detection/ssd# python3 onnx_export.py --input=models/cups/mb1-ssd-Epoch-0-Loss-nan.pth --labels=models/cups/labels.txt
Namespace(batch_size=1, height=300, input='models/cups/mb1-ssd-Epoch-0-Loss-nan.pth', labels='models/cups/labels.txt', model_dir='', net='ssd-mobilenet', output='', width=300)
running on device cuda:0
creating network: ssd-mobilenet
num classes: 5
loading checkpoint: models/cups/mb1-ssd-Epoch-0-Loss-nan.pth
exporting model to ONNX…
graph(%input_0 : Float(1, 3, 300, 300, strides=[270000, 90000, 300, 1], requires_grad=0, device=cuda:0),
%extras.0.0.weight : Float(256, 1024, 1, 1, strides=[1024, 1, 1, 1], requires_grad=1, device=cuda:0),
%extras.0.0.bias : Float(256, strides=[1], requires_grad=1, device=cuda:0),
%extras.0.2.weight : Float(512, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%extras.0.2.bias : Float(512, strides=[1], requires_grad=1, device=cuda:0),
%extras.1.0.weight : Float(128, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=1, device=cuda:0),
%extras.1.0.bias : Float(128, strides=[1], requires_grad=1, device=cuda:0),
%extras.1.2.weight : Float(256, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=1, device=cuda:0),
%extras.1.2.bias : Float(256, strides=[1], requires_grad=1, device=cuda:0),
%extras.2.0.weight : Float(128, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=1, device=cuda:0),
%extras.2.0.bias : Float(128, strides=[1], requires_grad=1, device=cuda:0),
%extras.2.2.weight : Float(256, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=1, device=cuda:0),
%extras.2.2.bias : Float(256, strides=[1], requires_grad=1, device=cuda:0),
%extras.3.0.weight : Float(128, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=1, device=cuda:0),
%extras.3.0.bias : Float(128, strides=[1], requires_grad=1, device=cuda:0),
%extras.3.2.weight : Float(256, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=1, device=cuda:0),
%extras.3.2.bias : Float(256, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.0.weight : Float(30, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.0.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.1.weight : Float(30, 1024, 3, 3, strides=[9216, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.1.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.2.weight : Float(30, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.2.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.3.weight : Float(30, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.3.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.4.weight : Float(30, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.4.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%classification_headers.5.weight : Float(30, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%classification_headers.5.bias : Float(30, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.0.weight : Float(24, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.0.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.1.weight : Float(24, 1024, 3, 3, strides=[9216, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.1.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.2.weight : Float(24, 512, 3, 3, strides=[4608, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.2.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.3.weight : Float(24, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.3.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.4.weight : Float(24, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.4.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%regression_headers.5.weight : Float(24, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0),
%regression_headers.5.bias : Float(24, strides=[1], requires_grad=1, device=cuda:0),
%451 : Float(32, 3, 3, 3, strides=[27, 9, 3, 1], requires_grad=0, device=cuda:0),
%452 : Float(32, strides=[1], requires_grad=0, device=cuda:0),
%454 : Float(32, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%455 : Float(32, strides=[1], requires_grad=0, device=cuda:0),
%457 : Float(64, 32, 1, 1, strides=[32, 1, 1, 1], requires_grad=0, device=cuda:0),
%458 : Float(64, strides=[1], requires_grad=0, device=cuda:0),
%460 : Float(64, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%461 : Float(64, strides=[1], requires_grad=0, device=cuda:0),
%463 : Float(128, 64, 1, 1, strides=[64, 1, 1, 1], requires_grad=0, device=cuda:0),
%464 : Float(128, strides=[1], requires_grad=0, device=cuda:0),
%466 : Float(128, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%467 : Float(128, strides=[1], requires_grad=0, device=cuda:0),
%469 : Float(128, 128, 1, 1, strides=[128, 1, 1, 1], requires_grad=0, device=cuda:0),
%470 : Float(128, strides=[1], requires_grad=0, device=cuda:0),
%472 : Float(128, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%473 : Float(128, strides=[1], requires_grad=0, device=cuda:0),
%475 : Float(256, 128, 1, 1, strides=[128, 1, 1, 1], requires_grad=0, device=cuda:0),
%476 : Float(256, strides=[1], requires_grad=0, device=cuda:0),
%478 : Float(256, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%479 : Float(256, strides=[1], requires_grad=0, device=cuda:0),
%481 : Float(256, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=0, device=cuda:0),
%482 : Float(256, strides=[1], requires_grad=0, device=cuda:0),
%484 : Float(256, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%485 : Float(256, strides=[1], requires_grad=0, device=cuda:0),
%487 : Float(512, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=0, device=cuda:0),
%488 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%490 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%491 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%493 : Float(512, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%494 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%496 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%497 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%499 : Float(512, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%500 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%502 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%503 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%505 : Float(512, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%506 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%508 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%509 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%511 : Float(512, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%512 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%514 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%515 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%517 : Float(512, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%518 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%520 : Float(512, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%521 : Float(512, strides=[1], requires_grad=0, device=cuda:0),
%523 : Float(1024, 512, 1, 1, strides=[512, 1, 1, 1], requires_grad=0, device=cuda:0),
%524 : Float(1024, strides=[1], requires_grad=0, device=cuda:0),
%526 : Float(1024, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=0, device=cuda:0),
%527 : Float(1024, strides=[1], requires_grad=0, device=cuda:0),
%529 : Float(1024, 1024, 1, 1, strides=[1024, 1, 1, 1], requires_grad=0, device=cuda:0),
%530 : Float(1024, strides=[1], requires_grad=0, device=cuda:0),
%534 : Long(3, strides=[1], requires_grad=0, device=cpu),
%538 : Long(3, strides=[1], requires_grad=0, device=cpu),
%542 : Long(3, strides=[1], requires_grad=0, device=cpu),
%546 : Long(3, strides=[1], requires_grad=0, device=cpu),
%550 : Long(3, strides=[1], requires_grad=0, device=cpu),
%554 : Long(3, strides=[1], requires_grad=0, device=cpu),
%558 : Long(3, strides=[1], requires_grad=0, device=cpu),
%562 : Long(3, strides=[1], requires_grad=0, device=cpu),
%566 : Long(3, strides=[1], requires_grad=0, device=cpu),
%570 : Long(3, strides=[1], requires_grad=0, device=cpu),
%574 : Long(3, strides=[1], requires_grad=0, device=cpu),
%578 : Long(3, strides=[1], requires_grad=0, device=cpu),
%579 : Float(requires_grad=0, device=cpu),
%580 : Float(requires_grad=0, device=cpu)):
%450 : Float(1, 32, 150, 150, strides=[720000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%input_0, %451, %452)
%205 : Float(1, 32, 150, 150, strides=[720000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%450) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%453 : Float(1, 32, 150, 150, strides=[720000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=32, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%205, %454, %455)
%208 : Float(1, 32, 150, 150, strides=[720000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%453) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%456 : Float(1, 64, 150, 150, strides=[1440000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%208, %457, %458)
%211 : Float(1, 64, 150, 150, strides=[1440000, 22500, 150, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%456) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%459 : Float(1, 64, 75, 75, strides=[360000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=64, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%211, %460, %461)
%214 : Float(1, 64, 75, 75, strides=[360000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%459) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%462 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%214, %463, %464)
%217 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%462) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%465 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=128, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%217, %466, %467)
%220 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%465) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%468 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%220, %469, %470)
%223 : Float(1, 128, 75, 75, strides=[720000, 5625, 75, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%468) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%471 : Float(1, 128, 38, 38, strides=[184832, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=128, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%223, %472, %473)
%226 : Float(1, 128, 38, 38, strides=[184832, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%471) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%474 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%226, %475, %476)
%229 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%474) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%477 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=256, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%229, %478, %479)
%232 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%477) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%480 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%232, %481, %482)
%235 : Float(1, 256, 38, 38, strides=[369664, 1444, 38, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%480) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%483 : Float(1, 256, 19, 19, strides=[92416, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=256, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%235, %484, %485)
%238 : Float(1, 256, 19, 19, strides=[92416, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%483) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%486 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%238, %487, %488)
%241 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%486) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%489 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%241, %490, %491)
%244 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%489) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%492 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%244, %493, %494)
%247 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%492) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%495 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%247, %496, %497)
%250 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%495) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%498 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%250, %499, %500)
%253 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%498) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%501 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%253, %502, %503)
%256 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%501) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%504 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%256, %505, %506)
%259 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%504) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%507 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%259, %508, %509)
%262 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%507) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%510 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%262, %511, %512)
%265 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%510) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%513 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%265, %514, %515)
%268 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%513) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%516 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%268, %517, %518)
%271 : Float(1, 512, 19, 19, strides=[184832, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%516) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%272 : Float(1, 30, 19, 19, strides=[10830, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%271, %classification_headers.0.weight, %classification_headers.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%273 : Float(1, 19, 19, 30, strides=[10830, 570, 30, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%272) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%281 : Float(1, 2166, 5, strides=[10830, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%273, %534) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%282 : Float(1, 24, 19, 19, strides=[8664, 361, 19, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%271, %regression_headers.0.weight, %regression_headers.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%283 : Float(1, 19, 19, 24, strides=[8664, 456, 24, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%282) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%291 : Float(1, 2166, 4, strides=[8664, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%283, %538) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%519 : Float(1, 512, 10, 10, strides=[51200, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=512, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%271, %520, %521)
%294 : Float(1, 512, 10, 10, strides=[51200, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%519) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%522 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%294, %523, %524)
%297 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%522) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%525 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1024, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%297, %526, %527)
%300 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%525) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%528 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%300, %529, %530)
%303 : Float(1, 1024, 10, 10, strides=[102400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%528) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1297:0
%304 : Float(1, 30, 10, 10, strides=[3000, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%303, %classification_headers.1.weight, %classification_headers.1.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%305 : Float(1, 10, 10, 30, strides=[3000, 300, 30, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%304) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%313 : Float(1, 600, 5, strides=[3000, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%305, %542) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%314 : Float(1, 24, 10, 10, strides=[2400, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%303, %regression_headers.1.weight, %regression_headers.1.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%315 : Float(1, 10, 10, 24, strides=[2400, 240, 24, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%314) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%323 : Float(1, 600, 4, strides=[2400, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%315, %546) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%324 : Float(1, 256, 10, 10, strides=[25600, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%303, %extras.0.0.weight, %extras.0.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%325 : Float(1, 256, 10, 10, strides=[25600, 100, 10, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%324) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%326 : Float(1, 512, 5, 5, strides=[12800, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%325, %extras.0.2.weight, %extras.0.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%327 : Float(1, 512, 5, 5, strides=[12800, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%326) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%328 : Float(1, 30, 5, 5, strides=[750, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%327, %classification_headers.2.weight, %classification_headers.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%329 : Float(1, 5, 5, 30, strides=[750, 150, 30, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%328) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%337 : Float(1, 150, 5, strides=[750, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%329, %550) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%338 : Float(1, 24, 5, 5, strides=[600, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%327, %regression_headers.2.weight, %regression_headers.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%339 : Float(1, 5, 5, 24, strides=[600, 120, 24, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%338) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%347 : Float(1, 150, 4, strides=[600, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%339, %554) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%348 : Float(1, 128, 5, 5, strides=[3200, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%327, %extras.1.0.weight, %extras.1.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%349 : Float(1, 128, 5, 5, strides=[3200, 25, 5, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%348) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%350 : Float(1, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%349, %extras.1.2.weight, %extras.1.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%351 : Float(1, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%350) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%352 : Float(1, 30, 3, 3, strides=[270, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%351, %classification_headers.3.weight, %classification_headers.3.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%353 : Float(1, 3, 3, 30, strides=[270, 90, 30, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%352) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%361 : Float(1, 54, 5, strides=[270, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%353, %558) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%362 : Float(1, 24, 3, 3, strides=[216, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%351, %regression_headers.3.weight, %regression_headers.3.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%363 : Float(1, 3, 3, 24, strides=[216, 72, 24, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%362) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%371 : Float(1, 54, 4, strides=[216, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%363, %562) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%372 : Float(1, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%351, %extras.2.0.weight, %extras.2.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%373 : Float(1, 128, 3, 3, strides=[1152, 9, 3, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%372) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%374 : Float(1, 256, 2, 2, strides=[1024, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%373, %extras.2.2.weight, %extras.2.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%375 : Float(1, 256, 2, 2, strides=[1024, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%374) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%376 : Float(1, 30, 2, 2, strides=[120, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%375, %classification_headers.4.weight, %classification_headers.4.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%377 : Float(1, 2, 2, 30, strides=[120, 60, 30, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%376) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%385 : Float(1, 24, 5, strides=[120, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%377, %566) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%386 : Float(1, 24, 2, 2, strides=[96, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%375, %regression_headers.4.weight, %regression_headers.4.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%387 : Float(1, 2, 2, 24, strides=[96, 48, 24, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%386) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%395 : Float(1, 24, 4, strides=[96, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%387, %570) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%396 : Float(1, 128, 2, 2, strides=[512, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[1, 1], pads=[0, 0, 0, 0], strides=[1, 1]](%375, %extras.3.0.weight, %extras.3.0.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%397 : Float(1, 128, 2, 2, strides=[512, 4, 2, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%396) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%398 : Float(1, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[2, 2]](%397, %extras.3.2.weight, %extras.3.2.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%399 : Float(1, 256, 1, 1, strides=[256, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Relu(%398) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1299:0
%400 : Float(1, 30, 1, 1, strides=[30, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%399, %classification_headers.5.weight, %classification_headers.5.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%401 : Float(1, 1, 1, 30, strides=[30, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%400) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:102:0
%409 : Float(1, 6, 5, strides=[30, 5, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%401, %574) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:103:0
%410 : Float(1, 24, 1, 1, strides=[24, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[1, 1, 1, 1], strides=[1, 1]](%399, %regression_headers.5.weight, %regression_headers.5.bias) # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:443:0
%411 : Float(1, 1, 1, 24, strides=[24, 1, 1, 1], requires_grad=1, device=cuda:0) = onnx::Transpose[perm=[0, 2, 3, 1]](%410) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:106:0
%419 : Float(1, 6, 4, strides=[24, 4, 1], requires_grad=1, device=cuda:0) = onnx::Reshape(%411, %578) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:107:0
%420 : Float(1, 3000, 5, strides=[15000, 5, 1], requires_grad=1, device=cuda:0) = onnx::Concat[axis=1](%281, %313, %337, %361, %385, %409) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:87:0
%421 : Float(1, 3000, 4, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Concat[axis=1](%291, %323, %347, %371, %395, %419) # /jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py:88:0
%scores : Float(1, 3000, 5, strides=[15000, 5, 1], requires_grad=1, device=cuda:0) = onnx::Softmax[axis=2](%420) # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1680:0
%423 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[2], starts=[0]](%421) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:104:0
%424 : Float(requires_grad=0, device=cpu) = onnx::Constant[value={0.1}]()
%425 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Mul(%423, %424)
%426 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=0, device=cuda:0) = onnx::Constant[value=<Tensor>]()
%427 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Mul(%425, %426) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:104:0
%428 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=0, device=cuda:0) = onnx::Constant[value=<Tensor>]()
%429 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Add(%427, %428) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:104:0
%430 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[9223372036854775807], starts=[2]](%421) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:105:0
%431 : Float(requires_grad=0, device=cpu) = onnx::Constant[value={0.2}]()
%432 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Mul(%430, %431)
%433 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Exp(%432) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:105:0
%434 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=0, device=cuda:0) = onnx::Constant[value=<Tensor>]()
%435 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Mul(%433, %434) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:105:0
%436 : Float(1, 3000, 4, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Concat[axis=2](%429, %435) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:106:0
%437 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[2], starts=[0]](%436) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:208:0
%438 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[9223372036854775807], starts=[2]](%436) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:208:0
%441 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Div(%438, %579)
%442 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Sub(%437, %441) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:208:0
%443 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[2], starts=[0]](%436) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:209:0
%444 : Float(1, 3000, 2, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Slice[axes=[2], ends=[9223372036854775807], starts=[2]](%436) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:209:0
%447 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Div(%444, %580)
%448 : Float(1, 3000, 2, strides=[6000, 2, 1], requires_grad=1, device=cuda:0) = onnx::Add(%443, %447) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:209:0
%boxes : Float(1, 3000, 4, strides=[12000, 4, 1], requires_grad=1, device=cuda:0) = onnx::Concat[axis=2](%442, %448) # /jetson-inference/python/training/detection/ssd/vision/utils/box_utils.py:209:0
return (%scores, %boxes)

Continued from above. Here is the remainder of the dataset:
cups_part2.zip (63.8 MB)

If you export models/cups/mb1-ssd-Epoch-2-Loss-nan.pth to ONNX, does it work with detectnet? You may need to train for more epochs; I normally do at least 30 epochs for object detection. When I have the time, I will try downloading and training with your dataset.
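
For reference, the export-and-test step being suggested here would look roughly like the following (a sketch based on the standard jetson-inference SSD workflow; the model directory and the camera source csi://0 are just examples, and onnx_export.py will pick up a checkpoint from --model-dir, or you can point it at a specific .pth file):

python3 onnx_export.py --model-dir=models/cups
detectnet --model=models/cups/ssd-mobilenet.onnx --labels=models/cups/labels.txt \
          --input-blob=input_0 --output-cvg=scores --output-bbox=boxes \
          csi://0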

Hi @gvc.nitw, upon digging into your dataset and testing it, I found that the 20220829-080201.xml file has an invalid/malformed annotation:

<annotation>
    <filename>20220829-080201.jpg</filename>
    <folder>cups</folder>
    <source>
        <database>cups</database>
        <annotation>custom</annotation>
        <image>custom</image>
    </source>
    <size>
        <width>1280</width>
        <height>720</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>Orange</name>
        <pose>unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>417</xmin>
            <ymin>165</ymin>
            <xmax>757</xmax>
            <ymax>427</ymax>
        </bndbox>
    </object>
    <object>
        <name>Pink</name>
        <pose>unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>127</xmin>
            <ymin>9999</ymin>
            <xmax>254</xmax>
            <ymax>-827705472</ymax>
        </bndbox>
    </object>

(See the second annotation above, for the Pink cup, with a large positive Y-min and a large negative Y-max coordinate - this is what was causing the NaNs.)
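
A quick way to catch this kind of bad box before training is to scan the annotation XMLs for coordinates that fall outside the image or that are inverted. Here is a minimal sketch, assuming a Pascal VOC-style layout; the data/cups/Annotations path and the check_annotations helper name are just examples:

import glob
import os
import xml.etree.ElementTree as ET

def check_annotations(ann_dir):
    # Flag any bounding box whose coordinates fall outside the image,
    # or whose min/max are inverted (as in the Pink cup box above).
    for xml_path in sorted(glob.glob(os.path.join(ann_dir, "*.xml"))):
        root = ET.parse(xml_path).getroot()
        width = int(root.find("size/width").text)
        height = int(root.find("size/height").text)
        for obj in root.findall("object"):
            name = obj.find("name").text
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (int(float(box.find(tag).text))
                                      for tag in ("xmin", "ymin", "xmax", "ymax"))
            if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
                print("%s: bad box (%d, %d, %d, %d) for '%s'"
                      % (os.path.basename(xml_path), xmin, ymin, xmax, ymax, name))

check_annotations("data/cups/Annotations")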

After removing the 20220829-080201 line from your trainval.txt and test.txt files, the model now trains without NaNs. Removing that entry from the ImageSets files means the image isn't used during the training process.
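
If you'd rather do that removal with a script than by hand, something along these lines should work; the ImageSets/Main paths are an assumption based on the standard VOC layout used by train_ssd.py:

# Drop a given image ID from the ImageSet files so it is skipped during training.
bad_id = "20220829-080201"
for imageset in ("data/cups/ImageSets/Main/trainval.txt",
                 "data/cups/ImageSets/Main/test.txt"):
    with open(imageset) as f:
        ids = [line.strip() for line in f if line.strip() and line.strip() != bad_id]
    with open(imageset, "w") as f:
        f.write("\n".join(ids) + "\n")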
