Pickling error while training

I was running the training code for a new model and got stuck on this error:
root@manpreet-desktop:/jetson-inference/python/training/detection/ssd# python3 train_ssd.py --dataset-type=voc --data=data/rice --model-dir=models/rice5 --batch-size=2 --workers=1 --epochs=1
2022-07-20 18:10:10 - Using CUDA...
2022-07-20 18:10:10 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=2, checkpoint_folder='models/rice5', dataset_type='voc', datasets=['data/rice'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=1, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2022-07-20 18:10:10 - Prepare training datasets.
2022-07-20 18:10:10 - VOC Labels read from file: ('BACKGROUND', 'Node')
2022-07-20 18:10:10 - Stored labels into file models/rice5/labels.txt.
2022-07-20 18:10:10 - Train dataset size: 401
2022-07-20 18:10:10 - Prepare Validation datasets.
2022-07-20 18:10:10 - VOC Labels read from file: ('BACKGROUND', 'Node')
2022-07-20 18:10:10 - Validation dataset size: 401
2022-07-20 18:10:10 - Build network.
2022-07-20 18:10:11 - Init from pretrained ssd models/mobilenet-v1-ssd-mp-0_675.pth
Traceback (most recent call last):
  File "train_ssd.py", line 309, in <module>
    net.init_from_pretrained_ssd(args.pretrained_ssd)
  File "/jetson-inference/python/training/detection/ssd/vision/ssd/ssd.py", line 119, in init_from_pretrained_ssd
    state_dict = torch.load(model, map_location=lambda storage, loc: storage)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
Please suggest a fix as soon as possible.

Hi @jino.joy.m, can you run these commands to re-download the base model? Perhaps it is corrupted.

$ cd jetson-inference/python/training/detection/ssd
$ wget https://nvidia.box.com/shared/static/djf5w54rjvpqocsiztzaandq1m3avr7c.pth -O models/mobilenet-v1-ssd-mp-0_675.pth
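
Before or after re-downloading, it may help to check what the file on disk actually contains. The load key '<' in the error is the first byte that pickle read, which often (though not always) means an HTML page was saved in place of the binary checkpoint. The following is only a rough sketch of such a check; the helper names looks_like_html and describe_checkpoint are made up for illustration and are not part of jetson-inference or PyTorch.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=64):
    """Return True if the file begins with '<' (ignoring leading whitespace)."""
    with open(str(path), "rb") as f:
        head = f.read(probe_bytes)
    return head.lstrip().startswith(b"<")


def describe_checkpoint(path):
    """Classify a file as 'missing', 'empty', 'html', or 'binary'.

    Hypothetical helper: 'binary' only means the file does not start with '<';
    it does not prove that the checkpoint is valid.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    if looks_like_html(p):
        return "html"
    return "binary"


# Path taken from the training log above; run this from the ssd directory.
print(describe_checkpoint("models/mobilenet-v1-ssd-mp-0_675.pth"))
```

If this prints html, the download most likely returned a web page rather than the model, and fetching the file again as shown above is the thing to try. A result of binary only rules out that one failure mode.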